Thanks for the explanation, Jacques. You're correct, the issue must be in the appender. If I take that out and read the list data from Flight directly, the data is in the VectorSchemaRoot.
Should a bug be opened for this, or is it possible I'm invoking the appender incorrectly? Based on your explanation, I'm wondering whether I need to pre-allocate the target VectorSchemaRoot so that it has enough space to hold the data of every VectorSchemaRoot I want to append. That appears to be the case in the unit test for the individual vector appender:
https://github.com/apache/arrow/blob/master/java/vector/src/test/java/org/apache/arrow/vector/util/TestVectorAppender.java#L149

On Tue, Feb 16, 2021 at 12:31 PM Jacques Nadeau wrote:

> Hey John, a brief review of your code makes me wonder if the problem may
> be associated with VectorSchemaRootAppender. Can you try your test without
> that? Basically, once you get a batch of data back, inspect it to see that
> you have your values. VectorSchemaRootAppender is new code that I haven't
> reviewed and I'm wondering if it isn't handling reference counting
> correctly.
>
> The exception you're seeing is most frequently associated with what could
> be thought of as an NPE for the memory backing a vector. Vectors are
> like a container. The design was built so a vector has batches stream
> through it. When no buffer is available, rather than setting the buffer to
> null, we set it to the empty buffer (which is of zero length). If you try
> to do something with the vector when it is empty, the access fails. In
> this case, my guess is you are trying to read the start offset for the
> first item in a list, e.g. the first four bytes [0..4) of the vector, but
> the vector is only 0 bytes in length (thus the exception).
>
> On Mon, Feb 15, 2021 at 7:21 PM John Peterson wrote:
>
>> Appreciate the help, Jacques. Unfortunately, calling setPosition(0) on the
>> writer for the list did not solve it.
>>
>> I put the entirety of the code up on pastebin so it should be an easy
>> copy/paste if anybody else wants to try to reproduce it. I suppose it could
>> also be a bug in VectorAppender, but again I'm not sure if the error is in
>> my code or in Arrow.
>>
>> https://pastebin.com/vwvnYY40
>>
>> Thanks in advance.
>>
>>
>> On Mon, Feb 15, 2021 at 1:33 PM Jacques Nadeau wrote:
>>
>>> I think you need to call setPosition(0) before you start writing the
>>> list. (This is from memory when I wrote the code 6-7 years ago so I may be
>>> off.)
>>>
>>> On Sun, Feb 14, 2021 at 6:20 PM John Peterson wrote:
>>>
>>>> Hi Bryan,
>>>>
>>>> This is the stacktrace I get:
>>>>
>>>> java.lang.IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))
>>>> at org.apache.arrow.memory.ArrowBuf.checkIndexD(ArrowBuf.java:318)
>>>> at org.apache.arrow.memory.ArrowBuf.chk(ArrowBuf.java:305)
>>>> at org.apache.arrow.memory.ArrowBuf.getInt(ArrowBuf.java:424)
>>>> at org.apache.arrow.vector.util.VectorAppender.visit(VectorAppender.java:97)
>>>> at org.apache.arrow.vector.util.VectorAppender.visit(VectorAppender.java:45)
>>>> at org.apache.arrow.vector.BaseVariableWidthVector.accept(BaseVariableWidthVector.java:1402)
>>>> at org.apache.arrow.vector.util.VectorAppender.visit(VectorAppender.java:233)
>>>> at org.apache.arrow.vector.util.VectorAppender.visit(VectorAppender.java:45)
>>>> at org.apache.arrow.vector.complex.ListVector.accept(ListVector.java:449)
>>>> at org.apache.arrow.vector.util.VectorSchemaRootAppender.append(VectorSchemaRootAppender.java:67)
>>>> at org.apache.arrow.vector.util.VectorSchemaRootAppender.append(VectorSchemaRootAppender.java:81)
>>>>
>>>> Thanks for your help.
>>>>
>>>> On Thu, Jan 14, 2021 at 2:23 PM Bryan Cutler wrote:
>>>>
>>>>> Hi John, could you include the error with stacktrace?
>>>>>
>>>>> On Sat, Jan 9, 2021 at 9:34 PM John Peterson wrote:
>>>>>
>>>>>> I believe I'm running into a bug with Flight but I'd like to confirm
>>>>>> and get some advice on a potential fix. I'm not sure where to look or
>>>>>> what could be causing it.
>>>>>>
>>>>>> The code in question simply uploads a one-element List<Integer> to
>>>>>> the example server, fetches it from the server, and attempts to append
>>>>>> the data from the server to a new VectorSchemaRoot. It fails in the
>>>>>> same way regardless of whether or not I construct a VectorSchemaRoot
>>>>>> instance.
>>>>>>
>>>>>> Likewise, the data from the server can't be written out with the JSON
>>>>>> writer, it'll fail in the same way. However, changing the data from a
>>>>>> ListVector to an IntVector causes it to succeed.
>>>>>>
>>>>>> Any help would be appreciated.
>>>>>>
>>>>>> Thanks,
>>>>>> John
>>>>>>
>>>>>> Code in question:
>>>>>> // Set up the server and client
>>>>>> BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE);
>>>>>> Location l = Location.forGrpcInsecure(FlightTestUtil.LOCALHOST, 12233);
>>>>>> ExampleFlightServer server = new ExampleFlightServer(allocator, l);
>>>>>> server.start();
>>>>>> FlightClient client = FlightClient.builder(allocator, l).build();
>>>>>>
>>>>>> // Write a one-element List<Integer>
>>>>>> ListVector listVector = ListVector.empty("list", allocator);
>>>>>> UnionListWriter writer = listVector.getWriter();
>>>>>> writer.startList();
>>>>>> writer.integer().writeInt(1);
>>>>>> writer.endList();
>>>>>> writer.setValueCount(1);
>>>>>>
>>>>>> // Send that data to the server
>>>>>> VectorSchemaRoot root = VectorSchemaRoot.of(listVector);
>>>>>> ClientStreamListener listener =
>>>>>>     client.startPut(FlightDescriptor.path("test"), root, new AsyncPutListener());
>>>>>> root.setRowCount(1);
>>>>>> listener.putNext();
>>>>>> root.clear();
>>>>>> listener.completed();
>>>>>>
>>>>>> // wait for ack to avoid memory leaks.
>>>>>> listener.getResult();
>>>>>>
>>>>>> // Attempt to read it back
>>>>>> FlightInfo info = client.getInfo(FlightDescriptor.path("test"));
>>>>>> try (final FlightStream stream =
>>>>>>     client.getStream(info.getEndpoints().get(0).getTicket())) {
>>>>>>   VectorSchemaRoot newRoot = stream.getRoot();
>>>>>>   while (stream.next()) {
>>>>>>     // Copying into an entirely new VectorSchemaRoot fails
>>>>>>     try {
>>>>>>       ListVector newList = ListVector.empty("list", allocator);
>>>>>>       newList.addOrGetVector(FieldType.nullable(Types.MinorType.INT.getType()));
>>>>>>       VectorSchemaRoot copyRoot = VectorSchemaRoot.of(newList);
>>>>>>       VectorSchemaRootAppender.append(copyRoot, newRoot);
>>>>>>     } catch (IndexOutOfBoundsException e) {
>>>>>>       System.err.println("Expected IOOBE caught");
>>>>>>     }
>>>>>>
>>>>>>     // The same is true if we try to copy the data from the server to
>>>>>>     // our VectorSchemaRoot
>>>>>>     try {
>>>>>>       VectorSchemaRootAppender.append(root, newRoot);
>>>>>>     } catch (IndexOutOfBoundsException e) {
>>>>>>       System.err.println("Expected IOOBE caught again");
>>>>>>       throw e;
>>>>>>     }
>>>>>>
>>>>>>     root.clear();
>>>>>>     newRoot.clear();
>>>>>>   }
>>>>>> }
>>>>>>
>>>>>
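As an aside on the failure mode discussed in the thread (reading the four-byte start offset of the first list item from a zero-length buffer), the following is a minimal sketch of that mechanism in isolation. It is an illustration only: it uses java.nio.ByteBuffer as a stand-in for Arrow's ArrowBuf, and the class name EmptyOffsetBufferDemo and the helper startOffset are hypothetical names that do not exist in Arrow. It assumes the usual list layout in which list i starts at the 32-bit offset stored at byte position 4 * i.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; ByteBuffer stands in for ArrowBuf (an assumption, not Arrow's API).
final class EmptyOffsetBufferDemo {

    // Reads the start offset of list element `index` from an int32 offsets buffer.
    // The absolute getInt throws IndexOutOfBoundsException when fewer than
    // four bytes are available at the requested position.
    public static int startOffset(ByteBuffer offsets, int index) {
        return offsets.getInt(index * Integer.BYTES);
    }

    public static void main(String[] args) {
        // Offsets for a vector holding the single list [1]: element 0 spans child values [0, 1).
        ByteBuffer populated = ByteBuffer.allocate(2 * Integer.BYTES).order(ByteOrder.LITTLE_ENDIAN);
        populated.putInt(0, 0);
        populated.putInt(Integer.BYTES, 1);
        System.out.println("start offset of element 0: " + startOffset(populated, 0));

        // A zero-length buffer plays the role of the "empty buffer" that replaces null.
        ByteBuffer empty = ByteBuffer.allocate(0);
        try {
            startOffset(empty, 0);
            System.out.println("unexpected: read succeeded");
        } catch (IndexOutOfBoundsException e) {
            System.out.println("reading bytes [0..4) of a 0-byte buffer failed as expected");
        }
    }
}
```

The point of the sketch is only that an empty (not null) buffer turns a would-be null dereference into a bounds error at the first read, which matches the shape of the "index: 0, length: 4 (expected: range(0, 0))" message in the stacktrace. It says nothing about where in the appender the empty buffer comes from.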
