awildturtok edited a comment on pull request #7214:
URL: https://github.com/apache/arrow/pull/7214#issuecomment-852158288


   Ahoi,
   
   we are experiencing issues that we think are related to this PR, or to our 
workaround for it. We are doing essentially the following:
   
   ```{java}
   private static RowConsumer listVectorFiller(ListVector vector, int rowNumber, List<String> values){
       // Values is a vertical list
   
       int start = vector.startNewValue(rowNumber);
       final FieldVector innerVector = vector.getDataVector();
   
       for (int i = 0; i < values.size(); i++) {
           String value = values.get(i);
           innerVector.setSafe(start + i, new Text(value));        
       }
   
       // Workaround for https://issues.apache.org/jira/browse/ARROW-8842
       int valueCount = innerVector.getValueCount();
       // i.e. grow the innerVector by the inner values
       innerVector.setValueCount(valueCount + values.size());
   
       vector.endValue(rowNumber,values.size());
   }
   ```
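
The workaround above sets the inner vector's value count by hand. As a rough illustration of the invariant it is trying to maintain, the sketch below models a list column in plain Java, without Arrow itself: one offsets array plus one flat child array, where the child's value count always equals the last offset. `ListLayoutSketch` and its methods are hypothetical names made up for this example, not part of the Arrow API, and the sketch assumes rows are appended in order.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of a list column: an offsets array plus one flat child array. */
class ListLayoutSketch {
    // Row r occupies child[offsets.get(r) .. offsets.get(r + 1)).
    private final List<Integer> offsets = new ArrayList<>(List.of(0));
    private final List<String> child = new ArrayList<>();

    /** Appends one row; assumes rows are written strictly in order. */
    void appendRow(List<String> values) {
        child.addAll(values);
        offsets.add(child.size()); // the new end offset is the child length so far
    }

    /** Number of rows (lists) written so far. */
    int rowCount() {
        return offsets.size() - 1;
    }

    /** The value count the child must report: the last offset. */
    int childValueCount() {
        return offsets.get(offsets.size() - 1);
    }

    /** View of the values that belong to row r. */
    List<String> row(int r) {
        return child.subList(offsets.get(r), offsets.get(r + 1));
    }

    public static void main(String[] args) {
        ListLayoutSketch column = new ListLayoutSketch();
        column.appendRow(List.of("a", "b"));
        column.appendRow(List.of());      // an empty list still adds an offset
        column.appendRow(List.of("c"));
        System.out.println(column.rowCount());        // 3
        System.out.println(column.childValueCount()); // 3
        System.out.println(column.row(2));            // [c]
    }
}
```

In this model, adding `values.size()` to the previous child count, as the workaround does, gives the same result as the last offset only as long as every row is written in order and the count is kept up to date after each row.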
   
   We are generating an Arrow file that is 34MB when rendered as CSV, but 
comes out at about 1GB as arrs/arrf and also takes quite long to generate. 
Since our code also contains a crude workaround related to this PR, we wanted 
to check whether this is the proper way of populating a ListVector.
   
   I've attached a flamegraph of the generation: 
[arrow-download.svg.zip](https://github.com/apache/arrow/files/6576795/arrow-download.svg.zip)
   
   After reading the file and writing it out again using Python, it is only 
11MB.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

