Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/19601
  
    This approach also works for nested arrays. I have that implementation on my 
machine. For ease of review, I committed only the version that supports arrays 
of primitive types. If you like it, I can commit the version that supports 
nested arrays.  
    Yes, I created a wrapper `ColumnarArray` corresponding to an `UnsafeArrayData` 
for the array at each nesting level. It can **avoid expensive data copies**, 
which is very important for arrays and complex types because these data 
structures tend to be large.
    
    If the new design can still avoid expensive data copies, by pointing to a 
part of one large array rather than copying data out of it (you may be thinking 
of [such a 
format](https://github.com/apache/arrow/blob/master/format/Layout.md#example-layout-listlistbyte)),
 I am happy to revisit the format with you. This is an internal implementation 
issue. The redesign would not affect the external interfaces 
`ColumnVector` and `ColumnarArray`.
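
    To make the "point at a part of one large array" idea concrete, here is a 
minimal sketch loosely modeled on the linked `List<List<byte>>` layout. All class 
and method names (`NestedByteColumn`, `OuterView`, `InnerView`) are hypothetical 
and are not the actual `ColumnVector` / `ColumnarArray` code in this PR. Null 
handling is omitted for brevity.

```java
// Hypothetical sketch: names are illustrative, not Spark's or Arrow's API.
// One flat values buffer plus two offset arrays describe a list<list<byte>> column.
final class NestedByteColumn {
    final byte[] values;       // all leaf bytes, stored exactly once
    final int[] innerOffsets;  // inner list j covers values[innerOffsets[j], innerOffsets[j + 1])
    final int[] outerOffsets;  // row i covers inner lists [outerOffsets[i], outerOffsets[i + 1])

    NestedByteColumn(byte[] values, int[] innerOffsets, int[] outerOffsets) {
        this.values = values;
        this.innerOffsets = innerOffsets;
        this.outerOffsets = outerOffsets;
    }

    // Returns a view of row i: only (start, length) is computed, no bytes are copied.
    OuterView getArray(int row) {
        int start = outerOffsets[row];
        return new OuterView(this, start, outerOffsets[row + 1] - start);
    }
}

// View over a contiguous run of inner lists (one wrapper per nesting level).
final class OuterView {
    private final NestedByteColumn column;
    private final int start;
    private final int length;

    OuterView(NestedByteColumn column, int start, int length) {
        this.column = column;
        this.start = start;
        this.length = length;
    }

    int numElements() {
        return length;
    }

    // Returns a view of the i-th inner list, again without copying.
    InnerView getArray(int i) {
        int j = start + i;
        int begin = column.innerOffsets[j];
        return new InnerView(column.values, begin, column.innerOffsets[j + 1] - begin);
    }
}

// View over a contiguous slice of the shared values buffer.
final class InnerView {
    private final byte[] values;
    private final int start;
    private final int length;

    InnerView(byte[] values, int start, int length) {
        this.values = values;
        this.start = start;
        this.length = length;
    }

    int numElements() {
        return length;
    }

    byte getByte(int i) {
        return values[start + i];
    }
}
```

    With `values = 1..10`, `innerOffsets = {0, 2, 4, 7, 8, 10}` and 
`outerOffsets = {0, 2, 4, 5}`, the column represents 
`[[[1, 2], [3, 4]], [[5, 6, 7], [8]], [[9, 10]]]`, and 
`column.getArray(1).getArray(0)` is a three-element view over the shared buffer. 
Each access allocates only a small wrapper object, so the cost does not depend on 
the size of the underlying data.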

