[ 
https://issues.apache.org/jira/browse/DRILL-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16677656#comment-16677656
 ] 

Aman Sinha commented on DRILL-6829:
-----------------------------------

[~Paul.Rogers] I want to clarify: by 'union' of two incompatible schemas I 
did not mean using the union type.  I meant the union-like operation that we 
normally do for record batches.  Step #7 in my first comment is about doing 
this cross-schema-union.  Suppose there are 3 record batches, each with a 
different schema for the sort key.  These will sit in separate internal 
queues of the blocking operator, and each will be individually sorted.  The 
cross-schema-union will traverse these queues in a certain order (e.g., all 
Numeric types appear first, followed by all String types, followed by Date 
types), consuming all batches from the first queue and emitting them, then 
moving on to the second queue, and so on.
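
To make the traversal concrete, here is a minimal sketch in Python rather than 
Drill's own Java.  It is not Drill code: the names TYPE_ORDER, type_family and 
cross_schema_union are hypothetical, a "batch" is modeled as a plain list of 
sort-key values that all share one type, and merging the sorted batches within 
a queue is an assumption about how each queue would be kept sorted.

```python
import heapq
from collections import defaultdict
from datetime import date

# Hypothetical traversal order of the per-type queues, following the
# example in the comment: Numeric first, then String, then Date.
TYPE_ORDER = ["numeric", "string", "date"]


def type_family(value):
    """Map a sort-key value to the queue (type family) it belongs to."""
    if isinstance(value, (int, float)):
        return "numeric"
    if isinstance(value, str):
        return "string"
    if isinstance(value, date):
        return "date"
    raise TypeError(f"unsupported sort key type: {type(value).__name__}")


def cross_schema_union(batches):
    """Yield sort keys queue by queue; each queue is sorted on its own.

    Assumes every batch is homogeneous, i.e. all of its values share
    one type family, so the first value decides the queue.
    """
    queues = defaultdict(list)
    for batch in batches:
        if not batch:
            continue  # nothing to route for an empty batch
        queues[type_family(batch[0])].append(sorted(batch))

    # Drain the queues in the fixed order. Values are only ever compared
    # with values of the same family, so no cross-type comparison occurs.
    for family in TYPE_ORDER:
        yield from heapq.merge(*queues[family])
```

Given the three kinds of batches from the example above, a call such as 
list(cross_schema_union(batches)) would return every numeric key in sorted 
order, then every string key, then every date key.  The ordering between type 
families comes entirely from TYPE_ORDER, not from comparing values.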

> Handle schema change in ExternalSort
> ------------------------------------
>
>                 Key: DRILL-6829
>                 URL: https://issues.apache.org/jira/browse/DRILL-6829
>             Project: Apache Drill
>          Issue Type: New Feature
>            Reporter: Aman Sinha
>            Priority: Major
>
> While we continue to enhance the schema provision and metastore aspects in 
> Drill, we also should explore what it means to be truly schema-less such that 
> we can better handle {semi, un}structured data, data sitting in DBs that 
> store JSON documents (e.g., Mongo, MapR-DB). 
>  
> The blocking operators are the main hurdles to this goal (other operators 
> also need to be smarter about this, but the problem is harder for the blocking 
> operators).  This Jira is specifically about ExternalSort. 
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)