[GitHub] [arrow] westonpace commented on issue #15220: Speed up Parquet Writing?

GitBox Mon, 09 Jan 2023 17:26:12 -0800


westonpace commented on issue #15220:
URL: https://github.com/apache/arrow/issues/15220#issuecomment-1376594167


   Reading JSON and writing Parquet purely in pyarrow should look something 
like:
   
   ```
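   import pyarrow.json
   import pyarrow.parquet
   # Read a line-delimited JSON file into an in-memory Arrow table.
   table = pyarrow.json.read_json('/tmp/foo.jsonl')
   # Encode the table as Parquet and write it to disk in one call.
   pyarrow.parquet.write_table(table, '/tmp/foo.parquet')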
   ```
   
   Separating the disk write from the encoding would be trickier (there is an 
in-memory filesystem in C++, but I don't think we expose it in Python).

