[ 
https://issues.apache.org/jira/browse/DRILL-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401193#comment-16401193
 ] 

ASF GitHub Bot commented on DRILL-6223:
---------------------------------------

GitHub user sachouche opened a pull request:

    https://github.com/apache/drill/pull/1170

    DRILL-6223: Fixed several Drillbit failures due to schema changes

    Fixed several issues caused by schema changes:
    1) Changes in complex data types
    A Drill query fails when selecting all columns from a set of complex nested
    data files (Parquet) whose schemas differ across files:
    
    - The Parquet files differ both at the first level and within nested data
      types
    - A select * does not cause an exception, but adding a limit clause does
    - The issue seems to occur only when multiple Drillbit minor fragments are
      involved (concurrency higher than one)
    
    2) Dangling columns (both simple and complex)
    This situation can be easily reproduced when:
    - A SELECT * query reads input data with different schemas
    - LIMIT and/or PROJECT operators are used
    - The data is read by more than one minor fragment
    - Individual readers have logic to handle such cases, but downstream
      operators do not
    - So if reader-1 sends one batch with F1, F2, and F3
    - and reader-2 sends a batch with F2 and F3
    - then the LIMIT and PROJECT operators fail to clean up the dangling
      column F1, which causes failures when a downstream operator's copy logic
      attempts to copy the stale column F1
    - This pull request adds logic to detect and eliminate dangling columns
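
The reader-1 / reader-2 scenario above can be illustrated with a minimal,
self-contained sketch. It is not the code from this pull request and does not
use Drill's classes: the class name DanglingColumns and the method findDangling
are hypothetical, and columns are modeled as plain strings. It only shows the
idea of comparing the columns an operator still holds against the columns of
the incoming batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; these are not Drill's actual classes. */
public class DanglingColumns {

    /**
     * Returns the columns still held by an operator's outgoing container
     * that are absent from the incoming batch, in container order.
     */
    public static List<String> findDangling(List<String> outgoing, List<String> incoming) {
        Set<String> present = new HashSet<>(incoming);
        List<String> dangling = new ArrayList<>();
        for (String column : outgoing) {
            if (!present.contains(column)) {
                dangling.add(column); // stale: no longer backed by incoming data
            }
        }
        return dangling;
    }

    public static void main(String[] args) {
        // The first batch (from reader-1) populated F1, F2 and F3 in the operator.
        List<String> outgoing = Arrays.asList("F1", "F2", "F3");
        // The next batch (from reader-2) carries only F2 and F3.
        List<String> incoming = Arrays.asList("F2", "F3");
        System.out.println(findDangling(outgoing, incoming)); // prints [F1]
    }
}
```

In this sketch, an operator would drop every column returned by findDangling
before copying the batch downstream, so that no stale column is read.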

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sachouche/drill DRILL-6223

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1170.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1170
    
----
commit d986b6c7588c107bb7e49d2fc8eb3f25a60e1214
Author: Salim Achouche <sachouche2@...>
Date:   2018-02-21T02:17:14Z

    DRILL-6223: Fixed several Drillbit failures due to schema changes

----


> Drill fails on Schema changes 
> ------------------------------
>
>                 Key: DRILL-6223
>                 URL: https://issues.apache.org/jira/browse/DRILL-6223
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Relational Operators
>    Affects Versions: 1.10.0, 1.12.0
>            Reporter: salim achouche
>            Assignee: salim achouche
>            Priority: Major
>             Fix For: 1.14.0
>
>
> A Drill query fails when selecting all columns from a set of complex nested 
> data files (Parquet). The schemas differ among the files:
>  * The Parquet files exhibit differences both at the first level and within 
> nested data types
>  * A select * will not cause an exception but using a limit clause will
>  * Note also this issue seems to happen only when multiple Drillbit minor 
> fragments are involved (concurrency higher than one)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
