emkornfield commented on pull request #9862:
URL: https://github.com/apache/arrow/pull/9862#issuecomment-813064288


   > Is there a test case that exists in the integration archery tests which 
has the metadata say to use compression but passes uncompressed buffers? The 
code currently correctly handles 0 length and nil buffers, but I didn't see 
anything in the spec or config (or the C++ implementation) for having non-zero 
length buffers which are uncompressed being sent alongside compressed ones. I 
guess I'm missing something?
   
   The spec reference is here:
https://github.com/apache/arrow/blob/master/format/Message.fbs#L59
Looking at the C++ code, it doesn't look like it is in conformance with the 
specification here. I opened https://issues.apache.org/jira/browse/ARROW-12196
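
   To make the case in question concrete, here is a rough sketch of the per-buffer framing as I understand it from the Arrow format documentation: each body buffer starts with an 8-byte little-endian signed length, and a length of -1 marks the bytes that follow as uncompressed. This is illustrative only. `zlib` stands in for the LZ4_FRAME/ZSTD codecs Arrow actually uses, the function names are hypothetical, and writing empty buffers with no prefix is an assumption.

```python
import struct
import zlib

# Sentinel in the 8-byte length prefix: payload is stored as-is (assumed from the format docs).
UNCOMPRESSED = -1


def encode_buffer(raw: bytes, compress=zlib.compress) -> bytes:
    """Frame one body buffer: int64 little-endian length prefix, then the payload."""
    if len(raw) == 0:
        # Assumption: empty buffers are written with no prefix at all.
        return b""
    packed = compress(raw)
    if len(packed) >= len(raw):
        # Compression did not help, so send the raw bytes and flag them with -1.
        return struct.pack("<q", UNCOMPRESSED) + raw
    return struct.pack("<q", len(raw)) + packed


def decode_buffer(body: bytes, decompress=zlib.decompress) -> bytes:
    """Reverse encode_buffer, accepting both compressed and -1 (raw) buffers."""
    if len(body) == 0:
        return b""
    (length,) = struct.unpack_from("<q", body, 0)
    payload = body[8:]
    if length == UNCOMPRESSED:
        return payload
    out = decompress(payload)
    if len(out) != length:
        raise ValueError("decompressed size does not match the length prefix")
    return out
```

   A reader that only handles the compressed branch would misinterpret a non-empty buffer sent with the -1 prefix, which is the situation the question above describes.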
   
   > The one optimization I can see here is that I could implement 
parallelization for the compression to attempt to compress the body buffers in 
parallel rather than serially like I'm doing right now.
   
   In C++ at least, threading is configurable. I don't have a strong preference 
here; profiling for specific use cases is probably what matters. For 
small batches with narrow schemas I can imagine multithreading being slower 
than single-threaded. So feel free to leave it as is; if someone runs into 
performance issues using these features, we can explore the options.
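
   For illustration, a configurable switch between serial and parallel compression could look roughly like the sketch below. It is not the C++ or Go implementation: `compress_buffers`, `use_threads`, and `max_workers` are hypothetical names, and `zlib` again stands in for the real codecs.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_buffers(buffers, use_threads=False, max_workers=None):
    """Compress each body buffer, serially by default or on a thread pool.

    The output order always matches the input order.
    """
    if not use_threads or len(buffers) < 2:
        # Serial path: no pool start-up cost, likely the better choice for small batches.
        return [zlib.compress(buf) for buf in buffers]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so buffers stay aligned.
        return list(pool.map(zlib.compress, buffers))
```

   Both paths produce the same output, so the flag can default to off and be enabled only where profiling shows a benefit.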
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

