Github user BryanCutler commented on the issue:

    https://github.com/apache/spark/pull/21312
  
    @viirya I looked into it a bit more. Calling `clear()` won't cause any 
problems, but it does trigger a reallocation of the vector buffers the next 
time the vector is written to.  What do you think about changing this to a 
manual "reset" so that the buffers can be reused?  It just needs to zero out 
the buffers and set the value count to 0, so something like this:
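    ```
    // Get the underlying buffers without clearing the vector (clear = false)
    val buffers = repeatedValueVector.getBuffers(false)
    // Zero out each buffer in place so the allocated memory can be reused
    buffers.foreach(buf => buf.setZero(0, buf.capacity()))
    // Mark the vector as empty
    repeatedValueVector.setValueCount(0)
    ```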
    
    Once we upgrade to Arrow 0.10.0, this can be cleaned up because `reset()` 
is available there through a common interface.  I think we should definitely 
get this backported to the 2.3 branch too.

