[
https://issues.apache.org/jira/browse/ARROW-10508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aleksandr Kopilov updated ARROW-10508:
--------------------------------------
Description:
h3. General problem
FixedSizeListVector was designed to hold a list of non-empty lists (a filled
rectangular matrix), but real data can also contain a list of empty lists (a
zero-width matrix). The Java implementation of FixedSizeListVector does not
allow creating or reading such a list. However, the
[specification|http://arrow.apache.org/docs/format/Columnar.html#fixed-size-list-layout]
places no restriction that listSize cannot be zero, and the C++ implementation
creates a zero-width matrix successfully. Java fails when reading it. The main
problem is that the entire table (VectorSchemaRoot) cannot be read if it
contains at least one field with FixedSizeListVector.listSize==0.
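To illustrate why a zero listSize is still a well-defined layout, here is a minimal stand-alone sketch. It assumes the usual fixed-size list addressing, where list i occupies child slots from i * listSize up to (i + 1) * listSize. The class and method names are hypothetical and do not come from the Arrow code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model of the fixed-size list layout; not Arrow code.
final class FixedSizeListLayoutSketch {

    // Number of child slots needed to store valueCount lists of listSize elements.
    static int childLength(int valueCount, int listSize) {
        return valueCount * listSize;
    }

    // List `index` occupies child slots [index * listSize, (index + 1) * listSize).
    static List<Integer> slice(int[] child, int index, int listSize) {
        List<Integer> out = new ArrayList<>();
        for (int j = index * listSize; j < (index + 1) * listSize; j++) {
            out.add(child[j]);
        }
        return out;
    }

    public static void main(String[] args) {
        // Filled 3x2 matrix: print the second row.
        int[] filled = {1, 2, 3, 4, 5, 6};
        System.out.println(slice(filled, 1, 2));

        // Zero-width 3x0 matrix: the child needs no slots at all.
        int[] empty = new int[childLength(3, 0)];
        System.out.println(empty.length);

        // Every row of the zero-width matrix is an empty list.
        System.out.println(slice(empty, 2, 0));
    }
}
```

With listSize==0 the arithmetic degenerates cleanly: the child array is empty and each of the rows is an empty list, so nothing in the addressing scheme itself requires a positive size.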
h3. Expected behavior
* Read a zero-width matrix (FixedSizeListVector with listSize==0) written by
C++ as a zero-width matrix.
* Read any field from a valid table (one written and readable by C++)
h3. Actual behavior
* java.lang.IllegalArgumentException: list size must be positive at
org.apache.arrow.util.Preconditions.checkArgument(Preconditions.java:136)
* Filled vectors cannot be read if at least one field has a fixed list size of 0
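The exception message suggests a strictly-positive argument check on the list size. The sketch below is an assumption about the shape of such a check, not Arrow's actual source: the helper and class names are hypothetical, and only the message "list size must be positive" is taken from the report. It contrasts the strict check with a relaxed one that would admit zero-width lists.

```java
// Hypothetical sketch; helper and class names are illustrative, not Arrow's source.
final class ListSizeCheckSketch {

    // Minimal stand-in for a Preconditions-style argument check.
    static void checkArgument(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    // Strict check: rejects listSize == 0 with the message seen in the report.
    static int strictListSize(int listSize) {
        checkArgument(listSize > 0, "list size must be positive");
        return listSize;
    }

    // Relaxed check: accepts zero-width lists and still rejects negative sizes.
    static int relaxedListSize(int listSize) {
        checkArgument(listSize >= 0, "list size must be non-negative");
        return listSize;
    }

    public static void main(String[] args) {
        // The relaxed check lets a zero-width list through.
        System.out.println(relaxedListSize(0));

        // The strict check throws for the same input.
        try {
            strictListSize(0);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether relaxing the check is sufficient on its own depends on the rest of the vector implementation, which this sketch does not model.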
h3. Examples to reproduce
h4. Reproduce on creating
Add the line
{{FixedSizeListVector.empty("ZeroWidthMatrix", 0, new RootAllocator());}}
anywhere in Java code.
h4. Reproduce on reading
The [https://github.com/Kopilov/Arrow_FixedSizeListVector_ZeroWidth] repository
contains C++ and Java code with a GitHub Actions workflow that reproduces the
actual behavior.
Logs from one run:
[https://github.com/Kopilov/Arrow_FixedSizeListVector_ZeroWidth/runs/1363206662]
(see the "Run C++ example" and "Build and run Java example" steps)
was:
h3. General problem
FixedSizeListVector was designed to hold a list of non-empty lists (a filled
rectangular matrix), but real data can also contain a list of empty lists (a
zero-width matrix). The Java implementation of FixedSizeListVector does not
allow creating or reading such a list. However, the
[specification|http://arrow.apache.org/docs/format/Columnar.html#fixed-size-list-layout]
places no restriction that listSize cannot be zero, and the C++ implementation
creates a zero-width matrix successfully. Java fails when reading it. The main
problem is that the entire table (VectorSchemaRoot) cannot be read if it
contains at least one field with FixedSizeListVector.listSize==0.
h3. Expected behavior
* Read a zero-width matrix (FixedSizeListVector with listSize==0) written by
C++ as a zero-width matrix.
* Read any field from a valid table (one written and readable by C++)
h3. Actual behavior
* java.lang.IllegalArgumentException: list size must be positive at
org.apache.arrow.util.Preconditions.checkArgument(Preconditions.java:136)
* Filled vectors cannot be read if at least one field has a fixed list size of 0
h3. Example to reproduce
The [https://github.com/Kopilov/Arrow_FixedSizeListVector_ZeroWidth] repository
contains C++ and Java code with a GitHub Actions workflow that reproduces the
actual behavior.
Logs from one run:
[https://github.com/Kopilov/Arrow_FixedSizeListVector_ZeroWidth/runs/1363206662]
(see the "Run C++ example" and "Build and run Java example" steps)
> [Java] Allow FixedSizeListVector to have empty children
> -------------------------------------------------------
>
> Key: ARROW-10508
> URL: https://issues.apache.org/jira/browse/ARROW-10508
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 1.0.0, 2.0.0
> Reporter: Aleksandr Kopilov
> Priority: Major
>