westonpace commented on issue #15220:
URL: https://github.com/apache/arrow/issues/15220#issuecomment-1376594167
Reading JSON and writing Parquet, purely in pyarrow, should look something
like:
```
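import pyarrow.json
import pyarrow.parquet

# read_json expects newline-delimited JSON (one object per line)
table = pyarrow.json.read_json('/tmp/foo.jsonl')
# Encodes the table as Parquet and writes it to disk in a single call
pyarrow.parquet.write_table(table, '/tmp/foo.parquet')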
```
Separating the disk write from the encoding would be trickier (there is an
in-memory filesystem in C++, but I don't think we expose it in Python).