the80srobot opened a new issue, #38679:
URL: https://github.com/apache/arrow/issues/38679

   ### Describe the enhancement requested
   
   Hello, I'm trying to add parquet output support to 
github.com/wowsignal-io/pedro and github.com/google/santa, and it's proving to 
be a real pain. In Pedro's case, the build adds ~15 MB to the binary size, and 
I'd like to reduce that.
   
   Unfortunately, it seems like cpp/parquet/ has grown a lot of dependencies on 
cpp/arrow/ over the years and it's almost impossible not to build 90% of arrow. 
I gave up when I realized `column_writer.cc` depends on stuff in 
cpp/arrow/compute/.
   
   Would it be possible to introduce some kind of layering and modularization 
to the codebase that'd enable a small libparquet-only build?
   
   If that's not something you're planning on doing, I'd be willing to maintain 
a reasonably sized patchset as part of Pedro, if it helped me just build stuff 
in cpp/parquet/, but I can't see any obvious module or layer boundaries within 
arrow, where I could start reducing some of the inter-dependencies. Would you 
be able to point me in the right direction?
   
   The alternatives I'm considering are using an ancient version of Parquet, or 
building a separate library that just does Parquet output, but the former 
sounds like it'll eventually have a CVE and the latter like a fair amount of 
work.
   
   Thank you.
   
   ### Component(s)
   
   C++

