Hello,

I could take a shot at the Java one if you like?

I'm actually working in the codebase at the moment on something related that I 
was going to offer as a PR once it's ready. We use the Java Arrow library as 
the core of our data service: the VSR (VectorSchemaRoot) is our intermediate 
representation, and we translate to/from various formats and across various 
storage backends. We really need non-blocking data reads to make that efficient 
and scalable, so I've made alternate implementations of the Readers where you 
can feed in data as a series of ByteBuffer objects instead of calling 
loadNextBatch(). For streams, this means feeding in bytes and buffering until a 
batch is available; for files, we read the block info from the footer and then 
feed in buffers (slices) for each block. I was able to reuse all the same 
serialization helpers, etc.
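
For illustration, here is a rough sketch of the feed-and-buffer idea for 
streams. It is not the actual implementation and not part of the Arrow Java 
API: PushFramer, feed() and poll() are hypothetical names, and the plain 4-byte 
little-endian length prefix is a simplifying assumption standing in for the 
real Arrow IPC message framing.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical push-style framer; NOT an Arrow class. */
public class PushFramer {

    // Bytes received so far that do not yet form a complete frame (read mode).
    private ByteBuffer pending = ByteBuffer.allocate(0);

    // Complete frames, ready to be decoded by the caller.
    private final Queue<ByteBuffer> ready = new ArrayDeque<>();

    /** Feed the next chunk of bytes. Never blocks and never reads from a channel. */
    public void feed(ByteBuffer chunk) {
        // Merge leftover bytes with the new chunk (copying keeps the sketch simple).
        ByteBuffer merged = ByteBuffer.allocate(pending.remaining() + chunk.remaining());
        merged.order(ByteOrder.LITTLE_ENDIAN);
        merged.put(pending).put(chunk);
        merged.flip();

        // Assumed framing: 4-byte little-endian length, then that many payload bytes.
        while (merged.remaining() >= 4) {
            int length = merged.getInt(merged.position());
            if (length < 0) {
                throw new IllegalStateException("Negative frame length: " + length);
            }
            if (merged.remaining() - 4 < length) {
                break; // frame incomplete, wait for more data
            }
            merged.position(merged.position() + 4);
            ByteBuffer frame = merged.slice(); // zero-copy view of the payload
            frame.limit(length);
            ready.add(frame);
            merged.position(merged.position() + length);
        }
        pending = merged; // whatever is left is carried over to the next feed()
    }

    /** Returns the next complete frame, or null if none is available yet. */
    public ByteBuffer poll() {
        return ready.poll();
    }
}
```

A caller would invoke feed() from whatever I/O callback delivers bytes, then 
drain poll() until it returns null; in a real reader each frame would be 
handed to the existing deserialization helpers rather than returned raw.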

Does this sound useful? If it does, I can raise a PR for Arrow when it's 
done. No worries if not; we'll just keep the non-blocking readers in our own 
codebase. They're not a lot of code either way.

Happy to take a shot at the row counts after that, weekend time probably. If I 
sketched out a draft PR would you be happy to take a look and tell me if I'm on 
the right lines?

Kind regards,

Martin Traverse
Technical Architect
UKI Risk
Tel: +44 7305 120 791
Email: martin.trave...@accenture.com

My regular office hours are 10:00 - 18:30 UK time, Monday - Thursday

-----Original Message-----
From: Weston Pace <weston.p...@gmail.com>
Sent: 28 March 2023 17:35
To: dev@arrow.apache.org
Subject: [External] Re: row counts in footer of IPC file format

I suspect the next step will be to create two implementations and create test 
files for the integration test suite.  These will be required before we can 
vote on this.

Are either of you interested in contributing an implementation (C++, Rust, 
Java, and Go have been the usual suspects in the past but JS or C# should be 
viable too)?  In the past, once an implementation & test files have been 
created for one language, it has been easier to drum up a volunteer to create a 
second implementation.
