Looks like there are 2 PRs for this work -- https://github.com/apache/arrow/pull/4186 this PR adds new get<type>Unsafe type APIs to ArrowBuf that don't do checkIndex() before calling PlatformDependent.get(memory address). So the access will go through vector.get() -> buffer.get() -> PlatformDependent.get() -> UNSAFE.get which is what we do today but without doing any bounds checking
I believe the proposal suggested here and the WIP PR -- https://github.com/apache/arrow/pull/4212 adds new versions of vectors where the call to vector.get() bypasses the call to ArrowBuf and directly invokes PlatformDependent with absolute address at which we want to read/write. Correct? Looks like the call to arrowbuf is still needed to get the starting address of buffer before computing the absolute address I am wondering if much of the overhead is coming from conditions and branches inside bound checking or just the chain of method calls? If it is bounds checking then I think the first PR would suffice probably. On Tue, Apr 30, 2019 at 9:46 AM Parth Chandra <par...@apache.org> wrote: > FWIW, in Drill's Value Vector code, we found that bounds checking was a > major performance bottleneck in operators that wrote to vectors. Scans, as > a result, we particularly affected. Another bottleneck was the zeroing of > vectors. > There were many unnecessary bounds checks. For example in a varchar vector, > there is one check while writing the data, one while writing the validity > bit, one more in the buffer allocator for the data buffer, one more in the > buffer allocator for the validity bit buffer, one more each in the > underlying ByteBuf implementation. It gets worse with repeated/array types. > Some code paths in Drill were optimized to get rid of these bounds checks > (eventually I suppose all of them will be updated). The approach was to > bypass the ValueVector API and write directly to the Drill(/Arrow)Buf. > Writing to the memory address directly, as is being proposed by Liya Fan, > was initially tried but did not have any measurable performance > improvements. BTW, writing to the memory address would also conflict with > ARROW-3191. > Note that the performance tests were for Drill queries, not Vectors, so > writing to memory directly may still have a noticeable performance benefit > for different use cases. > Sorry, I don't have actual numbers with me to share and I'm not sure how > much Arrow has diverged from the original Drill implementation, but the > Drill experience would suggest that this proposal certainly has merit. > > Parth > > On Mon, Apr 29, 2019 at 11:18 AM Wes McKinney <wesmck...@gmail.com> wrote: > > > I'm also curious which APIs are particularly problematic for > > performance. In ARROW-1833 [1] and some related discussions there was > > the suggestion of adding methods like getUnsafe, so this would be like > > get(i) [2] but without checking the validity bitmap > > > > [1] : https://issues.apache.org/jira/browse/ARROW-1833 > > [2]: > > > https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/Float8Vector.java#L99 > > > > On Mon, Apr 29, 2019 at 1:05 PM Micah Kornfield <emkornfi...@gmail.com> > > wrote: > > > > > > Thanks for the design. Personally, I'm not a huge fan of creating a > > > parallel classes for every vector type, this ends up being confusing > for > > > developers and adds a lot of boiler plate. I wonder if you could use a > > > similar approach that the memory module uses for turning bounds > checking > > > on/off [1]. > > > > > > Also, I think there was a comment on the JIRA, but are there any > > benchmarks > > > to show the expected improvements? My limited understanding is that > for > > > small methods the JVM's JIT should inline them anyways [2] , so it is > not > > > clear how much this will improve performance. > > > > > > > > > Thanks, > > > Micah > > > > > > > > > [1] > > > > > > https://github.com/apache/arrow/blob/master/java/memory/src/main/java/org/apache/arrow/memory/BoundsChecking.java > > > [2] > > > > > > https://stackoverflow.com/questions/24923040/do-modern-java-compilers-jvm-inline-functions-methods-which-are-called-exactly-f > > > > > > On Sun, Apr 28, 2019 at 2:50 AM Fan Liya <liya.fa...@gmail.com> wrote: > > > > > > > Hi all, > > > > > > > > We are proposing a new set of APIs in Arrow - unsafe vector APIs. The > > > > general ideas is attached below, and also accessible from our online > > > > document > > > > < > > > https://docs.google.com/document/d/13oZFVS1EnNedZd_7udx-h10G2tRTjfgHe2ngp2ZWJ70/edit?usp=sharing > > >. > > > > Please give your valuable comments by directly commenting in our > online > > > > document > > > > < > > > https://docs.google.com/document/d/13oZFVS1EnNedZd_7udx-h10G2tRTjfgHe2ngp2ZWJ70/edit?usp=sharing > > >, > > > > or relaying this email thread. > > > > > > > > Thank you so much in advance. > > > > > > > > Best, > > > > Liya Fan > > > > > > > > Support Fast/Unsafe Vector APIs for Arrow Background > > > > > > > > In our effort to support columnar data format in Apache Flink, we > chose > > > > Apache Arrow as the basic data structure. Arrow greatly simplifies > the > > > > support of the columnar data format. However, for many scenarios, we > > find > > > > the performance unacceptable. Our investigation shows the reason is > > that, > > > > there are too many redundant checks and computations in current Arrow > > API. > > > > > > > > > > > > > > > > For example, the following figures shows that in a single call to > > > > Float8Vector.get(int) method (this is one of the most frequently used > > APIs > > > > in Flink computation), there are 20+ method invocations. > > > > > > > > > > > > [image: image.png] > > > > > > > > > > > > > > > > > > > > > > > > There are many other APIs with similar problems. The redundant checks > > and > > > > computations impact performance severely. According to our > evaluation, > > the > > > > performance may degrade by two or three orders of magnitude. > > > > Our Proposal > > > > > > > > For many scenarios, the checks can be avoided, if the application > > > > developers can guarantee that all checks will pass. So our proposal > is > > to > > > > provide some light-weight APIs. The APIs are also named *unsafe > APIs*, > > in > > > > the sense that that skip most of the checks (not safe) to improve the > > > > performance. > > > > > > > > > > > > > > > > In the light-weight APIs, we only provide minimum checks, or avoid > > checks > > > > at all. The application owner can still develop and debug their code > > using > > > > the original safe APIs. Once all bugs have been fixed, they can > switch > > to > > > > unsafe APIs in the final version of their products and enjoy the high > > > > performance. > > > > Our Design > > > > > > > > Our goal is to include unsafe vector APIs in Arrow code base, and > allow > > > > our customers switching to the new unsafe APIs, without being aware > of > > it, > > > > except for the high performance. To achieve this goal, we make the > > > > following design choices: > > > > Vector Class Hierarchy > > > > > > > > Each unsafe vector is the subclass of the safe vector. For example, > the > > > > unsafe Float8Vector is a subclass of > > org.apache.arrow.vector.Float8Vector: > > > > > > > > > > > > > > > > package org.apache.arrow.vector.unsafe; > > > > > > > > > > > > > > > > public class Float8Vector extends > org.apache.arrow.vector.Float8Vector > > > > > > > > > > > > > > > > So the safe vector acts as a façade of the unsafe vector, and through > > > > polymorphism, the users may not be aware of which type of vector > > he/she is > > > > working with. In addition, the common logics can be reused in the > > unsafe > > > > vectors, and we only need to override get/set related methods. > > > > Vector Creation > > > > > > > > We use factory methods to create each type of vectors. Compared with > > > > vector constructors, the factory methods take one more parameter, the > > > > vectorType: > > > > > > > > > > > > > > > > public class VectorFactory { > > > > > > > > public static Float8Vector createFloat8Vector(VectorType > vectorType, > > > > String name, BufferAllocator allocator); > > > > > > > > } > > > > > > > > > > > > > > > > VectorType is an enum to separate safe vectors from unsafe ones: > > > > > > > > > > > > > > > > public enum VectorType { > > > > > > > > SAFE, > > > > > > > > UNSAFE > > > > > > > > } > > > > > > > > > > > > > > > > With the factory methods, the old way of creating vectors by > > constructors > > > > can be gradually depreciated. > > > > Vector Implementation > > > > > > > > As discussed above, unsafe vectors mainly override get/set methods. > For > > > > get methods, we directly operate on the off-heap memory, without any > > check: > > > > > > > > > > > > > > > > public double get(int index) { > > > > > > > > return > > > > > > > Double.longBitsToDouble(PlatformDependent.getLong(valueBuffer.memoryAddress() > > > > + (index << TYPE_LOG2_WIDTH))); > > > > > > > > } > > > > > > > > > > > > > > > > Note that the PlatformDependent API is only 2 stack layers above the > > > > underlying UNSAFE method call. > > > > > > > > > > > > > > > > For set methods, we still need to set the validity bit. However, this > > is > > > > through an unsafe method that directly sets the bits without > checking: > > > > > > > > > > > > > > > > public void set(int index, double value) { > > > > > > > > UnsafeBitVectorHelper.setValidityBitToOne(validityBuffer, > index); > > > > > > > > PlatformDependent.putLong( > > > > > > > > valueBuffer.memoryAddress() + (index << TYPE_LOG2_WIDTH), > > > > Double.doubleToRawLongBits(value)); > > > > > > > > } > > > > > > > > > > > > > > > > Method UnsafeBitVectorHelper.setValidityBitToOne is the unsafe > version > > of > > > > BitVectorHelper.setValidityBitToOne that avoids checks. > > > > > > > > > > > > Test Cases > > > > > > > > We can reuse existing test cases by employing parameterized test > > classes > > > > to test both safe and unsafe vectors. > > > > Current Progress > > > > > > > > We have opened a JIRA for this work item FlINK-5200 > > > > <https://issues.apache.org/jira/browse/ARROW-5200>, and a PR > > > > <https://github.com/apache/arrow/pull/4212> with initial > > implementations > > > > have been opened. We would appreciate if you could give some > comments. > > > > > > >