[ https://issues.apache.org/jira/browse/DRILL-7429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17076111#comment-17076111 ]

ASF GitHub Bot commented on DRILL-7429:
---------------------------------------

vvysotskyi commented on pull request #2048: DRILL-7429: Wrong column order when selecting complex data using Hive storage plugin
URL: https://github.com/apache/drill/pull/2048#discussion_r403732045
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/project/ProjectRecordBatch.java
 ##########
 @@ -339,6 +331,14 @@ protected IterOutcome handleNullInput() {
     return IterOutcome.OK_NEW_SCHEMA;
   }
 
+  private boolean isNotSchemalessInput() {
+    RecordBatch incomingBatch = incoming instanceof IteratorValidatorBatchIterator
+        ? ((IteratorValidatorBatchIterator) incoming).getIncoming()
+        : incoming;
 
 Review comment:
   Is it necessary to check that the batch wrapped by `IteratorValidatorBatchIterator` is a `SchemalessBatch`?
   
   It looks like `SchemalessBatch.getSchema()` always returns null, so the second check appears to cover this `instanceof` check already...
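   
   A minimal sketch of the simplification suggested above. It uses hypothetical stand-in types (`Batch`, `SchemalessStub`, `RegularStub`), not Drill's real classes, and it assumes that the "second check" is a null test on the schema, which the visible part of the diff does not confirm. If the schemaless batch always reports a null schema, the type check adds nothing to the result:

```java
// Illustration only: every type here is a hypothetical stand-in, not a Drill class.
final class SchemaChecks {

    /** Stand-in for a record batch that may or may not carry a schema. */
    interface Batch {
        Object getSchema();
    }

    /** Mirrors the observation that the schemaless batch always returns a null schema. */
    static final class SchemalessStub implements Batch {
        @Override
        public Object getSchema() {
            return null;
        }
    }

    /** An ordinary batch that reports a non-null schema. */
    static final class RegularStub implements Batch {
        @Override
        public Object getSchema() {
            return "schema";
        }
    }

    /** Two-step variant: explicit type check followed by the null-schema check. */
    static boolean isNotSchemalessWithTypeCheck(Batch batch) {
        return !(batch instanceof SchemalessStub) && batch.getSchema() != null;
    }

    /** Simplified variant: the null-schema check alone. */
    static boolean isNotSchemaless(Batch batch) {
        return batch.getSchema() != null;
    }

    private SchemaChecks() {
    }
}
```

   Under that assumption both variants agree for every input, so the shorter one could replace the longer one. Whether this holds in the pull request depends on what the actual second check does.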
 


> Wrong column order when selecting complex data using Hive storage plugin.
> -------------------------------------------------------------------------
>
>                 Key: DRILL-7429
>                 URL: https://issues.apache.org/jira/browse/DRILL-7429
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Hive
>    Affects Versions: 1.16.0
>            Reporter: Anton Gozhiy
>            Assignee: Igor Guzenko
>            Priority: Major
>             Fix For: 1.18.0
>
>         Attachments: customer_complex.zip
>
>
> *Data:*
> customer_complex.zip attached
> *Query:*
> {code:sql}
> select t3.a, t3.b
> from (select t2.a, t2.a.o_lineitems[1].l_part.p_name b
>       from (select t1.c_orders[0] a from hive.customer_complex t1) t2) t3
> limit 1
> {code}
> *Expected result:*
> Column order: a, b
> *Actual result:*
> Column order: b, a
> *Physical plan:*
> {noformat}
> 00-00    Screen
> 00-01      Project(a=[ROW($0, $1, $2, $3, $4, $5, $6, $7)], b=[$8])
> 00-02        Project(a=[ITEM($0, 0).o_orderstatus], a1=[ITEM($0, 0).o_totalprice], a2=[ITEM($0, 0).o_orderdate], a3=[ITEM($0, 0).o_orderpriority], a4=[ITEM($0, 0).o_clerk], a5=[ITEM($0, 0).o_shippriority], a6=[ITEM($0, 0).o_comment], a7=[ITEM($0, 0).o_lineitems], b=[ITEM(ITEM(ITEM(ITEM($0, 0).o_lineitems, 1), 'l_part'), 'p_name')])
> 00-03          Project(c_orders=[$0])
> 00-04            SelectionVectorRemover
> 00-05              Limit(fetch=[10])
> 00-06                Scan(table=[[hive, customer_complex]], groupscan=[HiveDrillNativeParquetScan [entries=[ReadEntryWithPath [path=/drill/customer_complex/000000_0]], numFiles=1, numRowGroups=1, columns=[`c_orders`[0].`o_orderstatus`, `c_orders`[0].`o_totalprice`, `c_orders`[0].`o_orderdate`, `c_orders`[0].`o_orderpriority`, `c_orders`[0].`o_clerk`, `c_orders`[0].`o_shippriority`, `c_orders`[0].`o_comment`, `c_orders`[0].`o_lineitems`, `c_orders`[0].`o_lineitems`[1].`l_part`.`p_name`]]])
> {noformat}
> *Note:* Reproducible with both the Hive and Native readers. Not reproducible with the Parquet reader.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
