[GitHub] [arrow] moria97 commented on issue #14606: [C++] memory consumption question

GitBox Tue, 15 Nov 2022 23:58:14 -0800


moria97 commented on issue #14606:
URL: https://github.com/apache/arrow/issues/14606#issuecomment-1316549270


   @westonpace Thanks for the explanation! Yes, I'm using write_table to write 
ndjson input to a Parquet file. I call the native code through PInvoke from 
.NET, passing the original multi-line JSON file from the C# code. In the native 
C++ code, we call TableReader to construct the table. Specifically, the 
implementation is 
[here](https://github.com/microsoft/FHIR-Analytics-Pipelines/blob/main/FhirToDataLake/native/parquet/cpp/src/ParquetWriter.cpp)
   
   The requests are made repeatedly and in parallel from multiple threads, and 
in total they consume about 10 GB of JSON data every hour (each file is about 
6000 lines of JSON).
   
   


