[GitHub] spark pull request: [SPARK-13043][SQL] Implement remaining catalys...

nongli Fri, 29 Jan 2016 11:42:55 -0800

Github user nongli commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10961#discussion_r51305930
  
    --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVector.java ---
    @@ -126,12 +147,25 @@ public ArrayData copy() {
                 list[i] = data.getLong(offset + i);
               }
             }
    +      } else if (dt instanceof DecimalType) {
    +        DecimalType decType = (DecimalType)dt;
    +        for (int i = 0; i < length; i++) {
    +          if (!data.getIsNull(offset + i)) {
    +            list[i] = getDecimal(i, decType.precision(), decType.scale());
    +          }
    +        }
           } else if (dt instanceof StringType) {
             for (int i = 0; i < length; i++) {
               if (!data.getIsNull(offset + i)) {
                 list[i] = ColumnVectorUtils.toString(data.getByteArray(offset + i));
               }
             }
    +      } else if (dt instanceof CalendarIntervalType) {
    --- End diff --
    
    This exposes a missing implementation. Our tests currently never generate
    arrays or other complex types, but implementing this is quite a bit more
    work. The issue is that to materialize the list of nested types inside the
    array, we need a deep copy of everything.
    
    For example, array<struct> means we need to copy the row for the struct.
    
    This patch is big enough that I'd prefer to defer this. Also, the deep copy
    is extremely expensive, so we might want to rethink how we do this, for
    example by having getArray return an iterator instead of a list.
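
    To make the iterator idea concrete, here is a minimal sketch in plain Java.
    It is not Spark code: LazyArraySketch, its fields, and its constructor are
    hypothetical stand-ins for a column slice, assumed only for illustration.
    copy() mirrors the eager approach in the diff, while iterator() reads each
    element from the backing storage on demand and allocates no list.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for an array slice of a column; not Spark's ColumnVector. */
final class LazyArraySketch implements Iterable<Long> {
  private final long[] data;      // backing column values (assumed layout)
  private final boolean[] isNull; // per-slot null flags (assumed layout)
  private final int offset;       // first slot belonging to this array
  private final int length;       // number of elements in this array

  LazyArraySketch(long[] data, boolean[] isNull, int offset, int length) {
    this.data = data;
    this.isNull = isNull;
    this.offset = offset;
    this.length = length;
  }

  /** Eager approach: materialize every element into a freshly allocated list. */
  Object[] copy() {
    Object[] list = new Object[length];
    for (int i = 0; i < length; i++) {
      if (!isNull[offset + i]) {
        list[i] = data[offset + i]; // boxed copy of each non-null element
      }
    }
    return list;
  }

  /** Lazy approach: elements are read from the backing arrays only when requested. */
  @Override
  public Iterator<Long> iterator() {
    return new Iterator<Long>() {
      private int i = 0;

      @Override
      public boolean hasNext() {
        return i < length;
      }

      @Override
      public Long next() {
        if (i >= length) {
          throw new NoSuchElementException();
        }
        int slot = offset + i++;
        if (isNull[slot]) {
          return null; // null elements are reported as null, as in copy()
        }
        return data[slot];
      }
    };
  }
}
```

    With nested types the saving would be larger, since a caller that stops
    early or only scans the elements never pays for copying the rows it does
    not touch. The iterator is only valid while the backing storage is
    unchanged, which is the trade-off against a deep copy.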


