[ 
https://issues.apache.org/jira/browse/ARROW-17107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17568056#comment-17568056
 ] 

James Henderson commented on ARROW-17107:
-----------------------------------------

> All vectors that use offsets must have at least one offset (or more 
> specifically: the number of offsets is always the number of values + 1, see 
> [the 
> spec|https://arrow.apache.org/docs/format/Columnar.html#variable-size-binary-layout])

Hmm, although this isn't the case for dense unions, which have one offset per 
value and no trailing entry. I guess their 'offsets' are conceptually 
different from the offsets in the variable-width vectors?
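
For the variable-width case, the "number of values + 1" rule can be sketched 
in plain Java. This is only an illustration, not Arrow code: `OffsetsSketch` 
and `buildOffsets` are hypothetical names, and a plain `int[]` stands in for 
the offset buffer. Note that an empty input still yields a single 0 offset.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Arrow.
final class OffsetsSketch {

    // Builds the offsets for a list of variable-width (UTF-8) values.
    // The result always has values.size() + 1 entries: offsets[i] is where
    // value i starts and offsets[i + 1] is where it ends.
    static int[] buildOffsets(List<String> values) {
        int[] offsets = new int[values.size() + 1];
        for (int i = 0; i < values.size(); i++) {
            int length = values.get(i).getBytes(StandardCharsets.UTF_8).length;
            offsets[i + 1] = offsets[i] + length;
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Three values -> four offsets.
        System.out.println(Arrays.toString(buildOffsets(List.of("ab", "", "cde")))); // [0, 2, 2, 5]
        // Zero values -> still one offset.
        System.out.println(Arrays.toString(buildOffsets(List.of()))); // [0]
    }
}
```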

> possibly empty vectors may not have allocated any memory as a 
> micro-optimization?

that was my assumption too, yeah :)
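
If that assumption holds, a writer has to tolerate a zero-length offset 
buffer. Below is a rough sketch of the kind of guard suggested in the issue 
quoted below, written against `java.nio.ByteBuffer` rather than Arrow's 
`ArrowBuf`. `EmptyOffsetGuardSketch` and `offsetsToWrite` are hypothetical 
names, not the actual `JsonFileWriter` code, and the dense union case is 
deliberately left out.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not the JsonFileWriter implementation.
final class EmptyOffsetGuardSketch {

    static final int OFFSET_WIDTH = 4; // bytes per 32-bit offset

    // Returns the offsets a writer would emit for a vector with valueCount values.
    // An empty vector yields a single 0 without touching the buffer, because the
    // buffer may never have been allocated.
    static List<Integer> offsetsToWrite(ByteBuffer offsetBuffer, int valueCount) {
        List<Integer> out = new ArrayList<>();
        if (valueCount == 0) {
            out.add(0);
            return out;
        }
        // Non-empty vector: valueCount + 1 offsets are expected to be present.
        for (int i = 0; i <= valueCount; i++) {
            out.add(offsetBuffer.getInt(i * OFFSET_WIDTH));
        }
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer populated = ByteBuffer.allocate(12).putInt(0, 0).putInt(4, 2).putInt(8, 5);
        System.out.println(offsetsToWrite(populated, 2)); // [0, 2, 5]

        // A zero-capacity buffer: reading index 0 directly would throw
        // IndexOutOfBoundsException, so the guard returns early instead.
        System.out.println(offsetsToWrite(ByteBuffer.allocate(0), 0)); // [0]
    }
}
```

Checking `valueCount == 0` rather than the buffer's capacity keeps the output 
the same whether or not an empty vector happened to allocate its offset 
buffer.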

> [Java] JSONFileWriter throws IOOBE writing an empty list
> --------------------------------------------------------
>
>                 Key: ARROW-17107
>                 URL: https://issues.apache.org/jira/browse/ARROW-17107
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: 8.0.0
>            Reporter: James Henderson
>            Priority: Minor
>
> Hey folks,
> I'm trying to write an empty ListVector out through the `JsonFileWriter`, and 
> am getting an `IndexOutOfBoundsException` (IOOBE). Stack trace is as follows:
>  
> ```
> java.lang.IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))
>  at org.apache.arrow.memory.ArrowBuf.checkIndexD (ArrowBuf.java:318)
>     org.apache.arrow.memory.ArrowBuf.chk (ArrowBuf.java:305)
>     org.apache.arrow.memory.ArrowBuf.getInt (ArrowBuf.java:424)
>     org.apache.arrow.vector.ipc.JsonFileWriter.writeValueToGenerator (JsonFileWriter.java:270)
>     org.apache.arrow.vector.ipc.JsonFileWriter.writeFromVectorIntoJson (JsonFileWriter.java:237)
>     org.apache.arrow.vector.ipc.JsonFileWriter.writeFromVectorIntoJson (JsonFileWriter.java:253)
>     org.apache.arrow.vector.ipc.JsonFileWriter.writeFromVectorIntoJson (JsonFileWriter.java:253)
>     org.apache.arrow.vector.ipc.JsonFileWriter.writeFromVectorIntoJson (JsonFileWriter.java:253)
>     org.apache.arrow.vector.ipc.JsonFileWriter.writeBatch (JsonFileWriter.java:200)
>     org.apache.arrow.vector.ipc.JsonFileWriter.write (JsonFileWriter.java:190)
> ```
> It's trying to write the offset buffer of the list, which is empty. L224 of 
> JsonFileWriter.java sets `bufferValueCount` to 1 (because we're not a 
> DenseUnionVector), so we enter the `for` loop. We don't hit the 
> `valueCount=0` condition in L230 (because we're not a varbinary or a varchar 
> vector). So we fall into the `else`, which tries to read the 0th element of 
> the offset buffer, and that throws the IOOBE.
> Could we include 'list' in either the L224 or the L230 checks?
> Admittedly, I'm not aware of the history of this section, but it seems that, 
> by the time we hit L230 (i.e. excluding DenseUnionVector), any empty vector 
> should yield a single 0?
> Let me know if there's any more info I can provide!
> Cheers,
> James



