nealrichardson commented on issue #13803:
URL: https://github.com/apache/arrow/issues/13803#issuecomment-1209245626

   > Thanks for your reply. My intended uses are
   > 
   > 1. Recreate the schema from json files.
   > 2. Simple validation of the table by checking json files.
   > 
   > I believe Python currently allows us to create a schema from a JSON file, like this:
   > 
   > ```json
   > {
   >     "col1": "int32"
   > }
   > ```
   > 
   > ```python
   > >>> import pyarrow as pa
   > >>> import json
   > >>> with open("test.json") as f:
   > ...     d = json.load(f)
   > ... 
   > >>> pa.schema(d)
   > col1: int32
   > ```
   > 
   
   Making a schema from a list like that nearly works in R; it would just need support for accepting string type names instead of type instances:
   
   ```
   > schema(list(col1 = "int32"))
   Error: a must be a DataType, not character
   > schema(list(col1 = int32()))
   Schema
   col1: int32
   ```
   
   But you still have the problem with nested/list/dictionary types, which would require string parsing. Those don't work in pyarrow either:
   
   ```
   >>> import pyarrow as pa
   >>> pa.schema({"col1": "int32"})
   col1: int32
   >>> pa.schema({"col1": "dictionary<values=string, indices=int32>"})
   Traceback (most recent call last):
     File "pyarrow/types.pxi", line 3171, in pyarrow.lib.type_for_alias
   KeyError: 'dictionary<values=string, indices=int32>'
   ```
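
To illustrate the kind of string parsing that nested types would need, here is a minimal Python sketch. `parse_type` and `split_top_level` are hypothetical helpers, not pyarrow or arrow R functions, and the `name<key=type, ...>` grammar is an assumption based only on the dictionary string above. It builds a plain nested structure and stops short of constructing real Arrow types.

```python
def split_top_level(text):
    """Split on commas that are not nested inside angle brackets."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts


def parse_type(text):
    """Parse 'name' or 'name<key=type, ...>' (assumed grammar).

    Leaf types come back as plain strings; parameterized types come
    back as (name, {key: parsed_type}) tuples.
    """
    text = text.strip()
    if "<" not in text:
        return text  # simple alias such as "int32"
    if not text.endswith(">"):
        raise ValueError(f"unbalanced type string: {text!r}")
    # Drop the final ">" and split at the first "<".
    name, _, inner = text[:-1].partition("<")
    params = {}
    for part in split_top_level(inner):
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"expected key=type, got {part!r}")
        params[key.strip()] = parse_type(value)
    return (name.strip(), params)


parsed = parse_type("dictionary<values=string, indices=int32>")
```

A real implementation would still have to map each parsed name to the matching type constructor and cover whatever other spellings the type printer produces, which is the part that does not exist today.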
   
   > > On the R side, there is this nice code generating utility that Romain added recently:
   > 
   > I did not know this method existed because it was not mentioned in the R documentation. I think this is pretty close to what I want.
   
   Would you like to submit a PR to improve those docs? 😉  🙏 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

