Re: [Discuss][Java] 64-bit lengths for ValueVectors

Jacques Nadeau Thu, 22 Aug 2019 20:57:05 -0700

>
>  Hi Jacques, I hope you had a good rest.


I did, thanks!

On Fri, Aug 23, 2019 at 9:25 AM Jacques Nadeau <[email protected]> wrote:

> I don't think we should couple this discussion with the implementation of
> large list, etc since I think those two concepts are independent.
>
> I've asked some others on my team their opinions on the risk here. I think
> we should probably review some of our more complex vector interactions and see
> how the JVM's assembly changes with this kind of change. Using
> microbenchmarking is good but I think we also need to see whether we're
> constantly inserting additional instructions or if, in most cases, this
> actually doesn't impact instruction count.
>
>
>
> On Wed, Aug 21, 2019 at 12:18 PM Micah Kornfield <[email protected]>
> wrote:
>
>>
>>> With regards to the reference implementation point. It is a good point.
>>> I'm on vacation this week. Unless you're pushing hard on this, can we pick
>>> this up and discuss more next week?
>>
>>
>> Hi Jacques, I hope you had a good rest.  Any more thoughts on the
>> reference implementation aspect of this?
>>
>>
>>> To copy the sentiments from the 0.15.0 release thread, I think it
>>> would be best to decouple this discussion from the release timeline
>>> given how many people we have relying on regular releases coming out.
>>> We can keep making major 0.x releases until we're ready to
>>> release 1.0.0.
>>
>>
>> I'm OK with it as long as other stakeholders are. Timed releases are the
>> way to go.  As stated on the release thread [1] we need a better mechanism
>> to avoid this type of issue arising again.  The release thread also had
>> some more discussion on compatibility.
>>
>> Thanks,
>> Micah
>>
>> [1]
>> https://lists.apache.org/thread.html/d70feeceaf2570906ade117030b29887af7c77ca5c4a976e6d555920@%3Cdev.arrow.apache.org%3E
>>
>>
>> On Wed, Aug 14, 2019 at 3:23 PM Wes McKinney <[email protected]> wrote:
>>
>>> On Sun, Aug 11, 2019 at 9:40 PM Micah Kornfield <[email protected]>
>>> wrote:
>>> >
>>> > Hi Wes and Jacques,
>>> > See responses below.
>>> >
>>> > With regards to the reference implementation point. It is a good
>>> point. I'm
>>> > > on vacation this week. Unless you're pushing hard on this, can we
>>> pick this
>>> > > up and discuss more next week?
>>> >
>>> >
>>> > Sure thing, enjoy your vacation.  I think the only practical implications
>>> > are it delays choices around implementing LargeList, LargeBinary,
>>> > LargeString in Java, which in turn might push out the 0.15.0 release.
>>> >
>>>
>>> To copy the sentiments from the 0.15.0 release thread, I think it
>>> would be best to decouple this discussion from the release timeline
>>> given how many people we have relying on regular releases coming out.
>>> We can keep making major 0.x releases until we're ready to
>>> release 1.0.0.
>>>
>>> > My stance on this is that I don't know how important it is for Java to
>>> > > support vectors over INT32_MAX elements. The use cases enabled by
>>> > > having very large arrays seem to be concentrated in the native code
>>> > > world (e.g. C/C++/Rust) -- that could just be implementation-centrism
>>> > > on my part, though.
>>> >
>>> >
>>> > A data point against this view is that Spark has done work to eliminate
>>> > 2GB memory limits on its block sizes [1].  I don't claim to understand the
>>> > implications of this. Bryan, might you have any thoughts here?  I'm OK with
>>> > INT32_MAX as well; I think we should think about what this means for
>>> > adding Large types to Java and implications for reference implementations.
>>> >
>>> > Thanks,
>>> > Micah
>>> >
>>> > [1] https://issues.apache.org/jira/browse/SPARK-6235
>>> >
>>> > On Sun, Aug 11, 2019 at 6:31 PM Jacques Nadeau <[email protected]>
>>> wrote:
>>> >
>>> > > Hey Micah,
>>> > >
>>> > > Appreciate the offer on the compiling. The reality is I'm more concerned
>>> > > about the unknowns than the compiling issue itself. Any time you've been
>>> > > tuning for a while, changing something like this could be totally fine or
>>> > > cause a couple of major issues. For example, we've done a very large
>>> > > amount of work reducing heap memory footprint of the vectors. Our target
>>> > > is to actually get it down to 24 bytes per ArrowBuf and 24 bytes heap per
>>> > > vector (not including arrow bufs).
>>> > >
>>> > > With regards to the reference implementation point. It is a good
>>> point.
>>> > > I'm on vacation this week. Unless you're pushing hard on this, can
>>> we pick
>>> > > this up and discuss more next week?
>>> > >
>>> > > thanks,
>>> > > Jacques
>>> > >
>>> > > On Sat, Aug 10, 2019 at 7:39 PM Micah Kornfield <
>>> [email protected]>
>>> > > wrote:
>>> > >
>>> > >> Hi Jacques,
>>> > >> I definitely understand these concerns and this change is risky
>>> because it
>>> > >> is so large.  Perhaps, creating a new hierarchy, might be the
>>> cleanest way
>>> > >> of dealing with this.  This could have other benefits like cleaning
>>> up
>>> > >> some
>>> > >> cruft around dictionary encode and "orphaned" method.   Per past
>>> e-mail
>>> > >> threads I agree it is beneficial to have 2 separate reference
>>> > >> implementations that can communicate fully, and my intent here was
>>> to
>>> > >> close
>>> > >> that gap.
>>> > >>
>>> > >> Trying to
>>> > >> > determine the ramifications of these changes would be challenging
>>> and
>>> > >> time
>>> > >> > consuming against all the different ways we interact with the
>>> Arrow Java
>>> > >> > library.
>>> > >>
>>> > >>
>>> > >> Understood.  I took a quick look at Dremio-OSS; it seems like it has a
>>> > >> simple Java build system?  If it is helpful, I can try to get a fork
>>> > >> running that at least compiles against this PR.  My plan would be to cast
>>> > >> any place that was changed to return a long back to an int, so in essence
>>> > >> the Dremio algorithms would remain 32-bit implementations.
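
A minimal sketch of what "casting a long back to an int" means in Java, assuming nothing about the actual Dremio or Arrow code (the class and method names below are hypothetical). A plain `(int)` cast keeps only the low 32 bits and wraps silently for values above Integer.MAX_VALUE, while `Math.toIntExact` fails loudly, which may be the safer choice where a 32-bit caller must never see a large vector.

```java
final class NarrowingCastSketch {
  /** Plain cast: keeps the low 32 bits, so out-of-range values wrap silently. */
  static int truncatingCast(long value) {
    return (int) value;
  }

  /** Checked narrowing: throws ArithmeticException when the value does not fit in an int. */
  static int checkedCast(long value) {
    return Math.toIntExact(value);
  }

  public static void main(String[] args) {
    long small = 1_000L;
    long large = 3_000_000_000L; // above Integer.MAX_VALUE (2_147_483_647)

    System.out.println(truncatingCast(small)); // fits, value is unchanged
    System.out.println(truncatingCast(large)); // wraps around to a negative int
    try {
      checkedCast(large);
    } catch (ArithmeticException e) {
      System.out.println("overflow detected: " + e.getMessage());
    }
  }
}
```

With the plain cast, callers stay 32-bit but get no warning if a vector ever exceeds the int range; the checked variant trades a small cost for an explicit failure.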
>>> > >>
>>> > >> I don't  have the infrastructure to test this change properly from a
>>> > >> distributed systems perspective, so it would still take some time
>>> from
>>> > >> Dremio to validate for regressions.
>>> > >>
>>> > >> I'm not saying I'm against this but want to make sure we've
>>> > >> > explored all less disruptive options before considering changing
>>> > >> something
>>> > >> > this fundamental (especially when I generally hold the view that
>>> large
>>> > >> cell
>>> > >> > counts against massive contiguous memory is an anti pattern to
>>> scalable
>>> > >> > analytical processing--purely subjective of course).
>>> > >>
>>> > >>
>>> > >> I'm open to other ideas here, as well. I don't think it is out of
>>> the
>>> > >> question to leave the Java implementation as 32-bit, but if we do,
>>> then I
>>> > >> think we should consider a different strategy for reference
>>> > >> implementations.
>>> > >>
>>> > >> Thanks,
>>> > >> Micah
>>> > >>
>>> > >> On Sat, Aug 10, 2019 at 5:09 PM Jacques Nadeau <[email protected]>
>>> > >> wrote:
>>> > >>
>>> > >> > Hey Micah, I didn't have a particular path in mind. Was thinking
>>> more
>>> > >> along
>>> > >> > the lines of extra methods as opposed to separate classes.
>>> > >> >
>>> > >> > Arrow hasn't historically been a place where we're writing
>>> algorithms in
>>> > >> > Java so the fact that they aren't there doesn't mean they don't
>>> exist.
>>> > >> We
>>> > >> > have a large amount of code that depends on the current behavior
>>> that is
>>> > >> > deployed in hundreds of customer clusters (you can peruse our
>>> dremio
>>> > >> repo
>>> > >> > to see how extensively we leverage Arrow if interested). Trying to
>>> > >> > determine the ramifications of these changes would be challenging
>>> and
>>> > >> time
>>> > >> > consuming against all the different ways we interact with the
>>> Arrow Java
>>> > >> > library. I'm not saying I'm against this but want to make sure
>>> we've
>>> > >> > explored all less disruptive options before considering changing
>>> > >> something
>>> > >> > this fundamental (especially when I generally hold the view that
>>> large
>>> > >> cell
>>> > >> > counts against massive contiguous memory is an anti pattern to
>>> scalable
>>> > >> > analytical processing--purely subjective of course).
>>> > >> >
>>> > >> > On Sat, Aug 10, 2019, 4:17 PM Micah Kornfield <
>>> [email protected]>
>>> > >> > wrote:
>>> > >> >
>>> > >> > > Hi Jacques,
>>> > >> > > What avenue were you thinking for supporting both paths?   I
>>> didn't
>>> > >> want
>>> > >> > > to pursue a different class hierarchy, because I felt like that
>>> would
>>> > >> > > effectively fork the code base, but that is potentially an
>>> option that
>>> > >> > > would allow us to have a complete reference implementation in
>>> Java
>>> > >> that
>>> > >> > can
>>> > >> > > fully interact with C++, without major changes to this code.
>>> > >> > >
>>> > >> > > For supporting both APIs on the same classes/interfaces, I
>>> think they
>>> > >> > > roughly fall into three categories, changes to input parameters,
>>> > >> changes
>>> > >> > to
>>> > >> > > output parameters and algorithm changes.
>>> > >> > >
>>> > >> > > For inputs, changing from int to long is essentially a no-op
>>> from the
>>> > >> > > compiler perspective.  From the limited micro-benchmarking this
>>> also
>>> > >> > > doesn't seem to have a performance impact.  So we could keep two
>>> > >> versions
>>> > >> > > of the methods that only differ on inputs, but it is not clear
>>> what
>>> > >> the
>>> > >> > > value of that would be.
>>> > >> > >
>>> > >> > > For outputs, we can't support methods "long getLength()" and "int
>>> > >> > > getLength()" in the same class, so we would be forced into something
>>> > >> > > like "long getLength(boolean dummy)", which I think is less desirable.
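
A small sketch of the return-type problem described above, using a hypothetical class rather than any real Arrow type. Java resolves overloads on the method name and parameter types only, so two methods that differ solely in return type cannot be declared side by side; the dummy-parameter form is the workaround named in the text.

```java
final class LengthOverloadSketch {
  private final long length;

  LengthOverloadSketch(long length) {
    this.length = length;
  }

  // Declaring both `int getLength()` and `long getLength()` is a compile-time
  // error: a method signature is its name plus parameter types, so the return
  // type alone cannot distinguish two overloads.

  /** Existing 32-bit accessor; throws ArithmeticException if the length exceeds the int range. */
  int getLength() {
    return Math.toIntExact(length);
  }

  /** 64-bit accessor made distinct only by an otherwise unused parameter. */
  long getLength(boolean dummy) {
    return length;
  }

  public static void main(String[] args) {
    LengthOverloadSketch v = new LengthOverloadSketch(3_000_000_000L);
    System.out.println(v.getLength(true)); // full 64-bit length
    try {
      v.getLength();
    } catch (ArithmeticException e) {
      System.out.println("length does not fit in an int");
    }
  }
}
```

A differently named 64-bit accessor would avoid the dummy argument, at the cost of growing the API surface.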
>>> > >> > >
>>> > >> > > For algorithm changes, there did not appear to be too many
>>> places
>>> > >> where
>>> > >> > we
>>> > >> > > actually loop over all elements (it is quite possible I missed
>>> > >> something
>>> > >> > > here), the ones that I did find I was able to mitigate
>>> performance
>>> > >> > > penalties as noted above.  Some of the current implementation
>>> will
>>> > >> get a
>>> > >> > > lot slower for "large arrays", but we can likely fix those
>>> later or in
>>> > >> > this
>>> > >> > > PR with a nested while loop instead of 2 for loops.
>>> > >> > >
>>> > >> > > Thanks,
>>> > >> > > Micah
>>> > >> > >
>>> > >> > >
>>> > >> > > On Saturday, August 10, 2019, Jacques Nadeau <
>>> [email protected]>
>>> > >> wrote:
>>> > >> > >
>>> > >> > >> This is a pretty massive change to the apis. I wonder how
>>> nasty it
>>> > >> would
>>> > >> > >> be to just support both paths. Have you evaluated how complex
>>> that
>>> > >> > would be?
>>> > >> > >>
>>> > >> > >> On Wed, Aug 7, 2019 at 11:08 PM Micah Kornfield <
>>> > >> [email protected]>
>>> > >> > >> wrote:
>>> > >> > >>
>>> > >> > >>> After more investigation, it looks like Float8Benchmarks at
>>> least
>>> > >> on my
>>> > >> > >>> machine are within the range of noise.
>>> > >> > >>>
>>> > >> > >>> For BitVectorHelper I pushed a new commit [1], seems to bring
>>> the
>>> > >> > >>> BitVectorHelper benchmarks back inline (and even with some
>>> > >> improvement
>>> > >> > >>> for
>>> > >> > >>> getNullCountBenchmark).
>>> > >> > >>>
>>> > >> > >>> Benchmark                                        Mode  Cnt   Score   Error  Units
>>> > >> > >>> BitVectorHelperBenchmarks.allBitsNullBenchmark   avgt    5   3.821 ± 0.031  ns/op
>>> > >> > >>> BitVectorHelperBenchmarks.getNullCountBenchmark  avgt    5  14.884 ± 0.141  ns/op
>>> > >> > >>>
>>> > >> > >>> I applied the same pattern to other loops that I could find, and for
>>> > >> > >>> any "for (long" loop on the critical path, I broke it up into two
>>> > >> > >>> loops.  The first loop does iteration by integer, the second finishes
>>> > >> > >>> off for any long values.  As a side note, it seems like optimization
>>> > >> > >>> for loops using long counters at least has a semi-recent open bug for
>>> > >> > >>> the JVM [2].
>>> > >> > >>>
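
The two-loop pattern described above might look roughly like the following. This is only a sketch under stated assumptions: `LongIndexedBytes` and `countNonZero` are hypothetical names, not the code in the referenced commit. The first loop uses an int counter for as many elements as fit in the int range, and the second loop uses a long counter only for whatever remains.

```java
final class SplitLoopSketch {
  /** Hypothetical accessor over a store that is addressed with 64-bit indexes. */
  interface LongIndexedBytes {
    byte get(long index);
  }

  static long countNonZero(LongIndexedBytes data, long length) {
    return countNonZero(data, length, Integer.MAX_VALUE);
  }

  /** `intLimit` is exposed only so the long-counter tail can be exercised on small inputs. */
  static long countNonZero(LongIndexedBytes data, long length, int intLimit) {
    long count = 0;
    int head = (int) Math.min(length, (long) intLimit);
    // Loop 1: int counter, the common case for vectors below the int range.
    for (int i = 0; i < head; i++) {
      if (data.get(i) != 0) {
        count++;
      }
    }
    // Loop 2: long counter, runs only for elements beyond the int-counted head.
    for (long i = head; i < length; i++) {
      if (data.get(i) != 0) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    LongIndexedBytes oddIndexesSet = i -> (byte) (i % 2);
    System.out.println(countNonZero(oddIndexesSet, 10));
  }
}
```

For any length that fits in an int, the second loop body never executes, so existing 32-bit workloads only ever run the int-counter loop.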
>>> > >> > >>> Thanks,
>>> > >> > >>> Micah
>>> > >> > >>>
>>> > >> > >>> [1] https://github.com/apache/arrow/pull/5020/commits/2ea2c1ae83e3baa7b9a99a6d06276d968df41797
>>> > >> > >>> [2] https://bugs.openjdk.java.net/browse/JDK-8223051
>>> > >> > >>>
>>> > >> > >>> On Wed, Aug 7, 2019 at 8:11 PM Micah Kornfield <
>>> > >> [email protected]>
>>> > >> > >>> wrote:
>>> > >> > >>>
>>> > >> > >>> > Indeed, the BoundChecking and CheckNullForGet variables can
>>> make a
>>> > >> > big
>>> > >> > >>> > difference.  I didn't initially run the benchmarks with
>>> these
>>> > >> turned
>>> > >> > on
>>> > >> > >>> > (you can see the result from above with Float8Benchmarks).
>>> Here
>>> > >> are
>>> > >> > >>> new
>>> > >> > >>> > numbers including with the flags enabled.  It looks like
>>> using
>>> > >> longs
>>> > >> > >>> might
>>> > >> > >>> > be a little bit slower, I'll see what I can do to mitigate
>>> this.
>>> > >> > >>> >
>>> > >> > >>> > Ravindra also volunteered to try to benchmark the changes
>>> with
>>> > >> > Dremio's
>>> > >> > >>> > code on today's sync call.
>>> > >> > >>> >
>>> > >> > >>> > New
>>> > >> > >>> >
>>> > >> > >>> > Benchmark                                        Mode  Cnt   Score   Error  Units
>>> > >> > >>> > BitVectorHelperBenchmarks.allBitsNullBenchmark   avgt    5   4.176 ± 1.292  ns/op
>>> > >> > >>> > BitVectorHelperBenchmarks.getNullCountBenchmark  avgt    5  26.102 ± 0.700  ns/op
>>> > >> > >>> > Float8Benchmarks.copyFromBenchmark               avgt    5   7.398 ± 0.084  us/op
>>> > >> > >>> > Float8Benchmarks.readWriteBenchmark              avgt    5   2.711 ± 0.057  us/op
>>> > >> > >>> >
>>> > >> > >>> > old
>>> > >> > >>> >
>>> > >> > >>> > BitVectorHelperBenchmarks.allBitsNullBenchmark   avgt    5   3.828 ± 0.030  ns/op
>>> > >> > >>> > BitVectorHelperBenchmarks.getNullCountBenchmark  avgt    5  20.611 ± 0.188  ns/op
>>> > >> > >>> > Float8Benchmarks.copyFromBenchmark               avgt    5   6.597 ± 0.462  us/op
>>> > >> > >>> > Float8Benchmarks.readWriteBenchmark              avgt    5   2.615 ± 0.027  us/op
>>> > >> > >>> >
>>> > >> > >>> > On Wed, Aug 7, 2019 at 7:13 PM Fan Liya <
>>> [email protected]>
>>> > >> > wrote:
>>> > >> > >>> >
>>> > >> > >>> >> Hi Gonzalo,
>>> > >> > >>> >>
>>> > >> > >>> >> Thanks for sharing the performance results.
>>> > >> > >>> >> I am wondering if you have turned off the flag
>>> > >> > >>> >> BoundsChecking#BOUNDS_CHECKING_ENABLED.
>>> > >> > >>> >> If not, the lower throughput should be expected.
>>> > >> > >>> >>
>>> > >> > >>> >> Best,
>>> > >> > >>> >> Liya Fan
>>> > >> > >>> >>
>>> > >> > >>> >> On Wed, Aug 7, 2019 at 10:23 PM Micah Kornfield <
>>> > >> > >>> [email protected]>
>>> > >> > >>> >> wrote:
>>> > >> > >>> >>
>>> > >> > >>> >>> Hi Gonzalo,
>>> > >> > >>> >>> Thank you for the feedback.  I wasn't aware of the JIT
>>> > >> > >>> implications.   At
>>> > >> > >>> >>> least on the benchmark run they don't seem to have an
>>> impact.
>>> > >> > >>> >>>
>>> > >> > >>> >>> If there are other benchmarks that people have that can
>>> > >> validate if
>>> > >> > >>> this
>>> > >> > >>> >>> change will be problematic I would appreciate trying to
>>> run them
>>> > >> > >>> with the
>>> > >> > >>> >>> PR.  I will try to run the ones for zeroing/popcnt
>>> tonight to
>>> > >> see
>>> > >> > if
>>> > >> > >>> >>> there
>>> > >> > >>> >>> is a change in those.
>>> > >> > >>> >>>
>>> > >> > >>> >>> -Micah
>>> > >> > >>> >>>
>>> > >> > >>> >>>
>>> > >> > >>> >>>
>>> > >> > >>> >>> On Wednesday, August 7, 2019, Gonzalo Ortiz Jaureguizar <
>>> > >> > >>> >>> [email protected]> wrote:
>>> > >> > >>> >>>
>>> > >> > >>> >>> > I would recommend taking care with this kind of change.
>>> > >> > >>> >>> >
>>> > >> > >>> >>> > I haven't tried Arrow in more than a year, but back then the
>>> > >> > >>> >>> > performance was quite bad in comparison with plain byte buffer
>>> > >> > >>> >>> > access (see
>>> > >> > >>> >>> > http://git.net/apache-arrow-development/msg02353.html *), and
>>> > >> > >>> >>> > there are several optimizations that the JVM (specifically, C2)
>>> > >> > >>> >>> > does not apply when dealing with longs instead of ints. One of
>>> > >> > >>> >>> > the most common is loop unrolling and vectorization.
>>> > >> > >>> >>> >
>>> > >> > >>> >>> > * It doesn't seem the best way to reference an old email on the
>>> > >> > >>> >>> > list, but it is the only result shown by Google
>>> > >> > >>> >>> >
>>> > >> > >>> >>> > El mié., 7 ago. 2019 a las 11:42, Fan Liya (<
>>> > >> > [email protected]
>>> > >> > >>> >)
>>> > >> > >>> >>> > escribió:
>>> > >> > >>> >>> >
>>> > >> > >>> >>> >> Hi Micah,
>>> > >> > >>> >>> >>
>>> > >> > >>> >>> >> Thanks for your effort. The performance result looks
>>> good.
>>> > >> > >>> >>> >>
>>> > >> > >>> >>> >> As you indicated, ArrowBuf will take an additional 12 bytes (4
>>> > >> > >>> >>> >> bytes for each of length, write index, and read index).
>>> > >> > >>> >>> >> Similar overheads also exist for vectors like
>>> > >> > >>> >>> >> BaseFixedWidthVector, BaseVariableWidthVector, etc.
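
The 12-byte figure follows from widening three fields from int to long. A minimal sketch of that arithmetic, with a hypothetical helper name; it counts raw field bytes only and assumes nothing about the real ArrowBuf layout. The actual per-object heap growth also depends on JVM object alignment and padding, so it may differ from the raw field delta.

```java
final class FieldWidthSketch {
  /** Extra raw field bytes when `fieldCount` fields are widened from int to long. */
  static int extraBytes(int fieldCount) {
    return fieldCount * (Long.BYTES - Integer.BYTES);
  }

  public static void main(String[] args) {
    // Three widened fields: length, write index, read index.
    System.out.println(extraBytes(3));
  }
}
```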
>>> > >> > >>> >>> >>
>>> > >> > >>> >>> >> IMO, such overheads are small enough to justify the
>>> change.
>>> > >> > >>> >>> >> Let's check if there are other overheads.
>>> > >> > >>> >>> >>
>>> > >> > >>> >>> >> Best,
>>> > >> > >>> >>> >> Liya Fan
>>> > >> > >>> >>> >>
>>> > >> > >>> >>> >> On Wed, Aug 7, 2019 at 3:30 PM Micah Kornfield <
>>> > >> > >>> [email protected]
>>> > >> > >>> >>> >
>>> > >> > >>> >>> >> wrote:
>>> > >> > >>> >>> >>
>>> > >> > >>> >>> >> > Hi Liya Fan,
>>> > >> > >>> >>> >> > Based on the Float8Benchmark there does not seem to
>>> be any
>>> > >> > >>> >>> meaningful
>>> > >> > >>> >>> >> > performance difference on my machine.  At least for
>>> me, the
>>> > >> > >>> >>> benchmarks
>>> > >> > >>> >>> >> are
>>> > >> > >>> >>> >> > not stable enough to say one is faster than the
>>> other (I've
>>> > >> > >>> pasted
>>> > >> > >>> >>> >> results
>>> > >> > >>> >>> >> > below).  That being said my machine isn't
>>> necessarily the
>>> > >> most
>>> > >> > >>> >>> reliable
>>> > >> > >>> >>> >> for
>>> > >> > >>> >>> >> > benchmarking.
>>> > >> > >>> >>> >> >
>>> > >> > >>> >>> >> > On an intuitive level, this makes sense to me: for the most
>>> > >> > >>> >>> >> > part it seems like the change just moves casting from "int"
>>> > >> > >>> >>> >> > to "long" further up the stack for "PlatformDependent"
>>> > >> > >>> >>> >> > operations.  If there are other benchmarks that you think
>>> > >> > >>> >>> >> > are worth running, let me know.
>>> > >> > >>> >>> >> >
>>> > >> > >>> >>> >> > One downside performance-wise, I think, for this change is
>>> > >> > >>> >>> >> > that it increases the size of ArrowBuf objects, which I
>>> > >> > >>> >>> >> > suppose could influence cache misses at some level or
>>> > >> > >>> >>> >> > increase the size of call-stacks, but this doesn't seem to
>>> > >> > >>> >>> >> > show up in the benchmark.
>>> > >> > >>> >>> >> >
>>> > >> > >>> >>> >> > Thanks,
>>> > >> > >>> >>> >> > Micah
>>> > >> > >>> >>> >> >
>>> > >> > >>> >>> >> > Sample benchmark numbers:
>>> > >> > >>> >>> >> >
>>> > >> > >>> >>> >> > [New Code]
>>> > >> > >>> >>> >> > Benchmark                            Mode  Cnt   Score   Error  Units
>>> > >> > >>> >>> >> > Float8Benchmarks.copyFromBenchmark   avgt    5  15.441 ± 0.469  us/op
>>> > >> > >>> >>> >> > Float8Benchmarks.readWriteBenchmark  avgt    5  14.057 ± 0.115  us/op
>>> > >> > >>> >>> >> >
>>> > >> > >>> >>> >> >
>>> > >> > >>> >>> >> > [Old code]
>>> > >> > >>> >>> >> > Benchmark                            Mode  Cnt   Score   Error  Units
>>> > >> > >>> >>> >> > Float8Benchmarks.copyFromBenchmark   avgt    5  16.248 ± 1.409  us/op
>>> > >> > >>> >>> >> > Float8Benchmarks.readWriteBenchmark  avgt    5  14.150 ± 0.084  us/op
>>> > >> > >>> >>> >> >
>>> > >> > >>> >>> >> > On Tue, Aug 6, 2019 at 1:18 AM Fan Liya <
>>> > >> [email protected]
>>> > >> > >
>>> > >> > >>> >>> wrote:
>>> > >> > >>> >>> >> >
>>> > >> > >>> >>> >> >> Hi Micah,
>>> > >> > >>> >>> >> >>
>>> > >> > >>> >>> >> >> Thanks a lot for doing this.
>>> > >> > >>> >>> >> >>
>>> > >> > >>> >>> >> >> I am a little concerned about if there is any
>>> negative
>>> > >> > >>> performance
>>> > >> > >>> >>> >> impact
>>> > >> > >>> >>> >> >> on the current 32-bit-length based applications.
>>> > >> > >>> >>> >> >> Can we do some performance comparison on our
>>> existing
>>> > >> > >>> benchmarks?
>>> > >> > >>> >>> >> >>
>>> > >> > >>> >>> >> >> Best,
>>> > >> > >>> >>> >> >> Liya Fan
>>> > >> > >>> >>> >> >>
>>> > >> > >>> >>> >> >>
>>> > >> > >>> >>> >> >> On Tue, Aug 6, 2019 at 3:35 PM Micah Kornfield <
>>> > >> > >>> >>> [email protected]>
>>> > >> > >>> >>> >> >> wrote:
>>> > >> > >>> >>> >> >>
>>> > >> > >>> >>> >> >>> There have been some previous discussions on the mailing
>>> > >> > >>> >>> >> >>> list about supporting 64-bit lengths for Java ValueVectors
>>> > >> > >>> >>> >> >>> (this is what the IPC specification and C++ support).  I
>>> > >> > >>> >>> >> >>> created a PR [1] that changes all APIs that I could find
>>> > >> > >>> >>> >> >>> that take an index to take a "long" instead of an "int"
>>> > >> > >>> >>> >> >>> (and similarly change "size/rowcount" APIs).
>>> > >> > >>> >>> >> >>>
>>> > >> > >>> >>> >> >>> It is a big change, so I think it is worth discussing if
>>> > >> > >>> >>> >> >>> it is something we still want to move forward with.  It
>>> > >> > >>> >>> >> >>> would be nice to come to a conclusion quickly, ideally in
>>> > >> > >>> >>> >> >>> the next few days, to avoid a lot of merge conflicts.
>>> > >> > >>> >>> >> >>>
>>> > >> > >>> >>> >> >>> The reason I did this work now is the C++ implementation
>>> > >> > >>> >>> >> >>> has added support for LargeList, LargeBinary and
>>> > >> > >>> >>> >> >>> LargeString arrays and based on prior discussions we need
>>> > >> > >>> >>> >> >>> to have similar support in Java before our next release.
>>> > >> > >>> >>> >> >>> Supporting 64-bit indexes means we can have full
>>> > >> > >>> >>> >> >>> compatibility and make the most use of the types in Java.
>>> > >> > >>> >>> >> >>>
>>> > >> > >>> >>> >> >>> Look forward to hearing feedback.
>>> > >> > >>> >>> >> >>>
>>> > >> > >>> >>> >> >>> Thanks,
>>> > >> > >>> >>> >> >>> Micah
>>> > >> > >>> >>> >> >>>
>>> > >> > >>> >>> >> >>> [1] https://github.com/apache/arrow/pull/5020
>>> > >> > >>> >>> >> >>>
>>> > >> > >>> >>> >> >>
>>> > >> > >>> >>> >>
>>> > >> > >>> >>> >
>>> > >> > >>> >>>
>>> > >> > >>> >>
>>> > >> > >>>
>>> > >> > >>
>>> > >> >
>>> > >>
>>> > >
>>>
>>
