awildturtok edited a comment on pull request #7214:
URL: https://github.com/apache/arrow/pull/7214#issuecomment-852158288
Ahoi,
we are experiencing issues that we feel are related to this PR or to a workaround we use.
We are trying to do basically the following:
```{java}
private static RowConsumer listVectorFiller(ListVector vector, int rowNumber, List<String> values) {
    // values is a vertical list
    int start = vector.startNewValue(rowNumber);
    final FieldVector innerVector = vector.getDataVector();
    for (int i = 0; i < values.size(); i++) {
        String value = values.get(i);
        innerVector.setSafe(start + i, new Text(value));
    }
    // Workaround for https://issues.apache.org/jira/browse/ARROW-8842
    int valueCount = innerVector.getValueCount();
    // i.e. grow the innerVector by the number of inner values
    innerVector.setValueCount(valueCount + values.size());
    vector.endValue(rowNumber, values.size());
}
```
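To make the intent of the workaround concrete, here is a minimal plain-Java model of the bookkeeping a list column needs. It is only a sketch and does not use the Arrow API: `ListOffsetsSketch`, `appendRow`, `childValueCount` and `row` are hypothetical names made up for illustration. It shows the invariant we are trying to maintain by hand, namely that the child value count always equals the last offset, i.e. the sum of all list lengths written so far. It does not explain the file size we are seeing.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of a list column (hypothetical, NOT the Arrow API). */
final class ListOffsetsSketch {
    // offsets.get(i) .. offsets.get(i + 1) delimit row i inside the flat child values.
    private final List<Integer> offsets = new ArrayList<>(List.of(0));
    private final List<String> child = new ArrayList<>();

    /** Appends one row and returns the child index at which its values start. */
    int appendRow(List<String> values) {
        int start = child.size();
        child.addAll(values);
        // The new end offset is exactly the grown child value count.
        offsets.add(child.size());
        return start;
    }

    int rowCount() {
        return offsets.size() - 1;
    }

    int childValueCount() {
        return child.size();
    }

    /** Returns a view of the values belonging to row i. */
    List<String> row(int i) {
        return child.subList(offsets.get(i), offsets.get(i + 1));
    }

    public static void main(String[] args) {
        ListOffsetsSketch column = new ListOffsetsSketch();
        column.appendRow(List.of("a", "b"));
        column.appendRow(List.of());
        column.appendRow(List.of("c"));
        System.out.println(column.rowCount() + " rows, "
                + column.childValueCount() + " child values");
        System.out.println(column.row(2));
    }
}
```

In this model the child count grows by exactly `values.size()` per row, which is what the `setValueCount` call in our workaround is meant to achieve on the real inner vector.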
We are generating an Arrow file from data that is 34 MB when rendered as CSV,
but as arrs/arrf it comes out at about 1 GB and also takes quite long to generate.
Since our code also contains a crude workaround for the issue this PR addresses,
we wanted to make sure this is the proper way of filling a ListVector.
I've attached a flamegraph of the generation:
[arrow-download.svg.zip](https://github.com/apache/arrow/files/6576795/arrow-download.svg.zip)
After reading the file and writing it again using Python, the file is only
11 MB.