I don't think we should couple this discussion with the implementation of LargeList, etc., since those two concepts are independent.
I've asked some others on my team for their opinions on the risk here. I think we should probably review some of our more complex vector interactions and see how the JVM's generated assembly changes with this kind of change. Using microbenchmarking is good, but I think we also need to see whether we're constantly inserting additional instructions or whether, in most cases, this actually doesn't impact instruction count.

On Wed, Aug 21, 2019 at 12:18 PM Micah Kornfield <[email protected]> wrote:

> With regards to the reference implementation point. It is a good point. I'm on vacation this week. Unless you're pushing hard on this, can we pick this up and discuss more next week?

Hi Jacques, I hope you had a good rest. Any more thoughts on the reference implementation aspect of this?

> To copy the sentiments from the 0.15.0 release thread, I think it would be best to decouple this discussion from the release timeline given how many people we have relying on regular releases coming out. We can keep making major 0.x releases until we're ready to release 1.0.0.

I'm OK with it as long as other stakeholders are. Timed releases are the way to go. As stated on the release thread [1], we need a better mechanism to avoid this type of issue arising again. The release thread also had some more discussion on compatibility.

Thanks,
Micah

[1] https://lists.apache.org/thread.html/d70feeceaf2570906ade117030b29887af7c77ca5c4a976e6d555920@%3Cdev.arrow.apache.org%3E

On Wed, Aug 14, 2019 at 3:23 PM Wes McKinney <[email protected]> wrote:

> On Sun, Aug 11, 2019 at 9:40 PM Micah Kornfield <[email protected]> wrote:
>
> Hi Wes and Jacques,
> See responses below.
>
> > With regards to the reference implementation point. It is a good point. I'm on vacation this week. Unless you're pushing hard on this, can we pick this up and discuss more next week?
>
> Sure thing, enjoy your vacation. I think the only practical implication is that it delays choices around implementing LargeList, LargeBinary, and LargeString in Java, which in turn might push out the 0.15.0 release.

To copy the sentiments from the 0.15.0 release thread, I think it would be best to decouple this discussion from the release timeline given how many people we have relying on regular releases coming out. We can keep making major 0.x releases until we're ready to release 1.0.0.

> > My stance on this is that I don't know how important it is for Java to support vectors over INT32_MAX elements. The use cases enabled by having very large arrays seem to be concentrated in the native code world (e.g. C/C++/Rust) -- that could just be implementation-centrism on my part, though.
>
> A data point against this view is that Spark has done work to eliminate 2GB memory limits on its block sizes [1]. I don't claim to understand the implications of this. Bryan, might you have any thoughts here? I'm OK with INT32_MAX as well; I think we should think about what this means for adding Large types to Java and the implications for reference implementations.
>
> Thanks,
> Micah
>
> [1] https://issues.apache.org/jira/browse/SPARK-6235

On Sun, Aug 11, 2019 at 6:31 PM Jacques Nadeau <[email protected]> wrote:

Hey Micah,

Appreciate the offer on the compiling.
The reality is I'm more concerned about the unknowns than the compiling issue itself. Any time you've been tuning for a while, changing something like this could be totally fine or cause a couple of major issues. For example, we've done a very large amount of work reducing the heap memory footprint of the vectors. Our target is to actually get it down to 24 bytes per ArrowBuf and 24 bytes of heap per vector (not including arrow bufs).

With regards to the reference implementation point. It is a good point. I'm on vacation this week. Unless you're pushing hard on this, can we pick this up and discuss more next week?

thanks,
Jacques

On Sat, Aug 10, 2019 at 7:39 PM Micah Kornfield <[email protected]> wrote:

Hi Jacques,
I definitely understand these concerns, and this change is risky because it is so large. Perhaps creating a new hierarchy might be the cleanest way of dealing with this. This could have other benefits, like cleaning up some cruft around dictionary encode and "orphaned" methods. Per past e-mail threads I agree it is beneficial to have 2 separate reference implementations that can communicate fully, and my intent here was to close that gap.

> Trying to determine the ramifications of these changes would be challenging and time consuming against all the different ways we interact with the Arrow Java library.

Understood. I took a quick look at Dremio-OSS; it seems like it has a simple Java build system? If it is helpful, I can try to get a fork running that at least compiles against this PR. My plan would be to cast any place that was changed to return a long back to an int, so in essence the Dremio algorithms would remain 32-bit implementations.

I don't have the infrastructure to test this change properly from a distributed systems perspective, so it would still take some time from Dremio to validate for regressions.

> I'm not saying I'm against this but want to make sure we've explored all less disruptive options before considering changing something this fundamental (especially when I generally hold the view that large cell counts against massive contiguous memory is an anti pattern to scalable analytical processing--purely subjective of course).

I'm open to other ideas here, as well. I don't think it is out of the question to leave the Java implementation as 32-bit, but if we do, then I think we should consider a different strategy for reference implementations.

Thanks,
Micah

On Sat, Aug 10, 2019 at 5:09 PM Jacques Nadeau <[email protected]> wrote:

Hey Micah, I didn't have a particular path in mind. Was thinking more along the lines of extra methods as opposed to separate classes.

Arrow hasn't historically been a place where we're writing algorithms in Java, so the fact that they aren't there doesn't mean they don't exist.
We have a large amount of code that depends on the current behavior that is deployed in hundreds of customer clusters (you can peruse our dremio repo to see how extensively we leverage Arrow if interested). Trying to determine the ramifications of these changes would be challenging and time consuming against all the different ways we interact with the Arrow Java library. I'm not saying I'm against this but want to make sure we've explored all less disruptive options before considering changing something this fundamental (especially when I generally hold the view that large cell counts against massive contiguous memory is an anti pattern to scalable analytical processing--purely subjective of course).

On Sat, Aug 10, 2019, 4:17 PM Micah Kornfield <[email protected]> wrote:

Hi Jacques,
What avenue were you thinking of for supporting both paths? I didn't want to pursue a different class hierarchy, because I felt like that would effectively fork the code base, but that is potentially an option that would allow us to have a complete reference implementation in Java that can fully interact with C++, without major changes to this code.

For supporting both APIs on the same classes/interfaces, I think the changes roughly fall into three categories: changes to input parameters, changes to output parameters, and algorithm changes.

For inputs, changing from int to long is essentially a no-op from the compiler perspective. From the limited micro-benchmarking, this also doesn't seem to have a performance impact. So we could keep two versions of the methods that only differ on inputs, but it is not clear what the value of that would be.

For outputs, we can't support methods "long getLength()" and "int getLength()" in the same class, so we would be forced into something like "long getLength(boolean dummy)", which I think is less desirable.

For algorithm changes, there did not appear to be too many places where we actually loop over all elements (it is quite possible I missed something here); for the ones that I did find, I was able to mitigate the performance penalties as noted above. Some of the current implementations will get a lot slower for "large arrays", but we can likely fix those later or in this PR with a nested while loop instead of 2 for loops.

Thanks,
Micah

On Saturday, August 10, 2019, Jacques Nadeau <[email protected]> wrote:

This is a pretty massive change to the APIs. I wonder how nasty it would be to just support both paths. Have you evaluated how complex that would be?
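A minimal sketch of the output-parameter problem described above; the interface and method names are illustrative only, not the actual Arrow Java API. It just shows why the two return types cannot coexist and what the "dummy parameter" workaround would look like.

```java
// Illustrative only -- not the real Arrow Java ValueVector interface.
interface HypotheticalVector {

  int getValueCount();            // existing 32-bit API

  // long getValueCount();        // does not compile: Java cannot overload on return type alone

  // Workaround 1: a "dummy" parameter to force a distinct signature.
  long getValueCount(boolean dummy);

  // Workaround 2: a differently named method, keeping both widths callable.
  long getValueCountLong();
}
```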
On Wed, Aug 7, 2019 at 11:08 PM Micah Kornfield <[email protected]> wrote:

After more investigation, it looks like the Float8Benchmarks, at least on my machine, are within the range of noise.

For BitVectorHelper I pushed a new commit [1], which seems to bring the BitVectorHelper benchmarks back in line (and even with some improvement for getNullCountBenchmark).

Benchmark                                         Mode  Cnt   Score   Error  Units
BitVectorHelperBenchmarks.allBitsNullBenchmark    avgt    5   3.821 ± 0.031  ns/op
BitVectorHelperBenchmarks.getNullCountBenchmark   avgt    5  14.884 ± 0.141  ns/op

I applied the same pattern to other loops that I could find, and for any "for (long" loop on the critical path, I broke it up into two loops: the first loop does iteration by integer, and the second finishes off any long values. As a side note, it seems like optimization of loops using long counters has a semi-recent open bug in the JVM [2].

Thanks,
Micah

[1] https://github.com/apache/arrow/pull/5020/commits/2ea2c1ae83e3baa7b9a99a6d06276d968df41797
[2] https://bugs.openjdk.java.net/browse/JDK-8223051

On Wed, Aug 7, 2019 at 8:11 PM Micah Kornfield <[email protected]> wrote:

Indeed, the BoundChecking and CheckNullForGet variables can make a big difference. I didn't initially run the benchmarks with these turned on (you can see the result from above with Float8Benchmarks). Here are new numbers including with the flags enabled. It looks like using longs might be a little bit slower; I'll see what I can do to mitigate this.

Ravindra also volunteered to try to benchmark the changes with Dremio's code on today's sync call.
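A minimal sketch of the two-loop pattern Micah describes above for keeping an int counter on the hot path; the vector interface and method names are illustrative, not the code from the PR.

```java
// Illustrative only: iterate a logically 64-bit-length structure while keeping
// the hot loop on an int counter, which C2 optimizes more readily.
class SplitLoopExample {

  static long countSetValues(HypotheticalVector64 vector) {
    long valueCount = vector.getValueCount();
    long result = 0;

    // First loop: everything that fits in an int, using a 32-bit counter.
    int intCount = (int) Math.min(valueCount, Integer.MAX_VALUE);
    for (int i = 0; i < intCount; i++) {
      result += vector.isSet(i);
    }

    // Second loop: the (rare) remainder beyond Integer.MAX_VALUE, using a long counter.
    for (long i = intCount; i < valueCount; i++) {
      result += vector.isSet(i);
    }
    return result;
  }

  interface HypotheticalVector64 {
    long getValueCount();
    int isSet(long index);   // 1 if the slot is non-null, 0 otherwise
  }
}
```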
New:

Benchmark                                         Mode  Cnt   Score   Error  Units
BitVectorHelperBenchmarks.allBitsNullBenchmark    avgt    5   4.176 ± 1.292  ns/op
BitVectorHelperBenchmarks.getNullCountBenchmark   avgt    5  26.102 ± 0.700  ns/op
Float8Benchmarks.copyFromBenchmark                avgt    5   7.398 ± 0.084  us/op
Float8Benchmarks.readWriteBenchmark               avgt    5   2.711 ± 0.057  us/op

Old:

Benchmark                                         Mode  Cnt   Score   Error  Units
BitVectorHelperBenchmarks.allBitsNullBenchmark    avgt    5   3.828 ± 0.030  ns/op
BitVectorHelperBenchmarks.getNullCountBenchmark   avgt    5  20.611 ± 0.188  ns/op
Float8Benchmarks.copyFromBenchmark                avgt    5   6.597 ± 0.462  us/op
Float8Benchmarks.readWriteBenchmark               avgt    5   2.615 ± 0.027  us/op

On Wed, Aug 7, 2019 at 7:13 PM Fan Liya <[email protected]> wrote:

Hi Gonzalo,

Thanks for sharing the performance results.
I am wondering if you have turned off the flag BoundsChecking#BOUNDS_CHECKING_ENABLED.
If not, the lower throughput should be expected.

Best,
Liya Fan

On Wed, Aug 7, 2019 at 10:23 PM Micah Kornfield <[email protected]> wrote:

Hi Gonzalo,
Thank you for the feedback. I wasn't aware of the JIT implications. At least on the benchmark run they don't seem to have an impact.

If there are other benchmarks that people have that can validate whether this change will be problematic, I would appreciate trying to run them with the PR. I will try to run the ones for zeroing/popcnt tonight to see if there is a change in those.

-Micah

On Wednesday, August 7, 2019, Gonzalo Ortiz Jaureguizar <[email protected]> wrote:

I would recommend taking care with this kind of change.

I didn't try Arrow in more than one year, but by then the performance was quite bad in comparison with plain byte buffer access (see http://git.net/apache-arrow-development/msg02353.html *), and there are several optimizations that the JVM (specifically, C2) does not apply when dealing with longs instead of ints. Among the most common are loop unrolling and vectorization.

* It doesn't seem the best way to reference an old email on the list, but it is the only result shown by Google.
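One way to check Gonzalo's point (and Jacques's earlier question about instruction counts) is to compare the assembly C2 emits for an int-counted and a long-counted loop using JMH's perfasm profiler. A minimal, self-contained sketch, not tied to the Arrow benchmarks quoted in this thread:

```java
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

// Run with something like:
//   java -jar benchmarks.jar LoopCounterBench -prof perfasm
// and compare whether the long-counted loop is still unrolled/vectorized.
@State(Scope.Thread)
public class LoopCounterBench {

  private final double[] values = new double[4096];

  @Benchmark
  public double intCounter() {
    double sum = 0;
    for (int i = 0; i < values.length; i++) {
      sum += values[i];
    }
    return sum;
  }

  @Benchmark
  public double longCounter() {
    double sum = 0;
    for (long i = 0; i < values.length; i++) {
      sum += values[(int) i];   // the extra narrowing is part of what shows up in the generated code
    }
    return sum;
  }
}
```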
On Wed, Aug 7, 2019 at 11:42, Fan Liya (<[email protected]>) wrote:

Hi Micah,

Thanks for your effort. The performance result looks good.

As you indicated, ArrowBuf will take an additional 12 bytes (4 bytes for each of the length, write index, and read index).
Similar overheads also exist for vectors like BaseFixedWidthVector, BaseVariableWidthVector, etc.

IMO, such overheads are small enough to justify the change.
Let's check if there are other overheads.

Best,
Liya Fan

On Wed, Aug 7, 2019 at 3:30 PM Micah Kornfield <[email protected]> wrote:

Hi Liya Fan,
Based on the Float8Benchmark there does not seem to be any meaningful performance difference on my machine. At least for me, the benchmarks are not stable enough to say one is faster than the other (I've pasted results below). That being said, my machine isn't necessarily the most reliable for benchmarking.

On an intuitive level, this makes sense to me; for the most part it seems like the change just moves casting from "int" to "long" further up the stack for "PlatformDependent" operations. If there are other benchmarks that you think are worth running, let me know.

One performance downside of this change, I think, is that it increases the size of ArrowBuf objects, which I suppose could influence cache misses at some level or increase the size of call-stacks, but this doesn't seem to show up in the benchmark.

Thanks,
Micah
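A schematic illustration of the 12-byte figure discussed above: widening the three per-buffer bookkeeping fields from int to long costs 4 extra bytes each, before any object-alignment padding. The classes below are simplified stand-ins, not the real ArrowBuf.

```java
// Simplified stand-ins for the per-ArrowBuf bookkeeping fields.
class Buf32 {               // before the change
  int length;               // 4 bytes
  int readerIndex;          // 4 bytes
  int writerIndex;          // 4 bytes
}

class Buf64 {               // after the change
  long length;              // 8 bytes
  long readerIndex;         // 8 bytes
  long writerIndex;         // 8 bytes
}
// Per instance: 3 * (8 - 4) = 12 additional bytes of heap, before padding.
```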
Sample benchmark numbers:

[New Code]
Benchmark                             Mode  Cnt   Score   Error  Units
Float8Benchmarks.copyFromBenchmark    avgt    5  15.441 ± 0.469  us/op
Float8Benchmarks.readWriteBenchmark   avgt    5  14.057 ± 0.115  us/op

[Old code]
Benchmark                             Mode  Cnt   Score   Error  Units
Float8Benchmarks.copyFromBenchmark    avgt    5  16.248 ± 1.409  us/op
Float8Benchmarks.readWriteBenchmark   avgt    5  14.150 ± 0.084  us/op

On Tue, Aug 6, 2019 at 1:18 AM Fan Liya <[email protected]> wrote:

Hi Micah,

Thanks a lot for doing this.

I am a little concerned about whether there is any negative performance impact on the current 32-bit-length based applications.
Can we do some performance comparison on our existing benchmarks?

Best,
Liya Fan

On Tue, Aug 6, 2019 at 3:35 PM Micah Kornfield <[email protected]> wrote:

There have been some previous discussions on the mailing list about supporting 64-bit lengths for Java ValueVectors (this is what the IPC specification and C++ support). I created a PR [1] that changes all APIs that I could find that take an index to take a "long" instead of an "int" (and similarly changes "size/rowcount" APIs).

It is a big change, so I think it is worth discussing whether it is something we still want to move forward with. It would be nice to come to a conclusion quickly, ideally in the next few days, to avoid a lot of merge conflicts.

The reason I did this work now is that the C++ implementation has added support for LargeList, LargeBinary and LargeString arrays, and based on prior discussions we need to have similar support in Java before our next release.
Supporting 64-bit indexes means we can have full compatibility and make the most use of the types in Java.

Look forward to hearing feedback.

Thanks,
Micah

[1] https://github.com/apache/arrow/pull/5020
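For readers skimming the proposal above, the shape of the change is roughly the following; the interfaces are illustrative, not the exact Arrow Java API, and the Math.toIntExact call shows the kind of shim Micah suggests earlier in the thread for keeping existing 32-bit call sites (e.g. Dremio's) unchanged.

```java
// Illustrative before/after of the index- and count-related signatures.
interface Vector32 {                       // roughly the pre-PR shape
  int getValueCount();
  void setValueCount(int valueCount);
  Object getObject(int index);
}

interface Vector64 {                       // roughly the post-PR shape
  long getValueCount();
  void setValueCount(long valueCount);
  Object getObject(long index);
}

class CallSiteShim {
  // Existing 32-bit callers can narrow the result back; Math.toIntExact
  // throws ArithmeticException if the count no longer fits in an int.
  static int rowCount(Vector64 vector) {
    return Math.toIntExact(vector.getValueCount());
  }
}
```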
