complyue opened a new issue #295:
URL: https://github.com/apache/arrow-julia/issues/295


   
https://github.com/apache/arrow-julia/blob/614fce0a5d7db8fee078be32690c5220848538e2/src/table.jl#L276-L293
   
   I see from the code above that record batches are parsed in parallel when 
the Julia runtime has multithreading enabled, which is great, since parsing 
(especially decompression) can be a rather intensive computational workload. 
   
   But with the current implementation, the order in which the batches were 
originally written is not guaranteed to be preserved, which I think is not 
ideal. I'm not sure what the Arrow spec says about this, but I'm dealing with 
time series data recorded batch by batch, where the order matters a lot.
   
   I'd like to draft a PR that preserves batch order to address this concern. 
As I start tinkering with the codebase, I'm filing this issue to ask for your 
opinions about it.
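
   For illustration, here is a minimal sketch of one way to keep batch order 
while still doing the work in parallel. It is not the code in `table.jl`; 
`process_batch` and `process_in_order` are hypothetical names, and summing a 
vector stands in for decompressing and parsing a record batch. The idea is to 
spawn one task per batch and then fetch the results in spawn order, so the 
output order does not depend on which task finishes first.

```julia
# Hypothetical stand-in for the per-batch work (e.g. decompression + parsing).
process_batch(batch) = sum(batch)

function process_in_order(batches)
    # Spawn one task per batch; the tasks may finish in any order.
    tasks = [Threads.@spawn process_batch(b) for b in batches]
    # Fetching in spawn order returns results in the original batch order.
    return [fetch(t) for t in tasks]
end

batches = [[1, 2], [3, 4], [5, 6]]
results = process_in_order(batches)
```

   With this pattern the parallelism comes from the spawned tasks, while the 
ordering comes only from the order in which the results are collected.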
   
   (Btw, I'm also working on a PR for #293, which is orthogonal in 
functionality but seems closely related in implementation details. I think 
two separate PRs would be clearer for review and release purposes, but if you 
can accept a single PR addressing both, it would be a lot easier for me, 
since I'm not fluent in git rebasing and related skills.)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


