[ 
https://issues.apache.org/jira/browse/DRILL-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Phillips updated DRILL-1781:
-----------------------------------
    Attachment: DRILL-1781.patch

> For complex functions, don't return until schema is known
> ---------------------------------------------------------
>
>                 Key: DRILL-1781
>                 URL: https://issues.apache.org/jira/browse/DRILL-1781
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Steven Phillips
>            Priority: Blocker
>             Fix For: 0.7.0
>
>         Attachments: DRILL-1781.patch, DRILL-1781.patch
>
>
> In the case of complex output functions, it is impossible to determine the 
> output schema until the actual data is consumed. For example, with 
> convert_from(VARCHAR, 'json'), unlike most other functions, it is not 
> sufficient to know that the incoming data type is VARCHAR; we actually need 
> to decode the contents of the record before we can determine whether the 
> output type is a map, a list, or a primitive type.
> For fast schema return, we worked around this problem by simply assuming the 
> type was Map, and if it happened to be different, there would be a schema 
> change. This solution is not satisfactory, as it ends up breaking other 
> functions, like flatten.
> The solution is to continue returning a schema whenever possible, but when it 
> is not possible, Drill will wait until it is.
> For non-blocking operators, Drill will immediately consume the incoming 
> batch, and thus will not return empty schema batches if there is data to 
> consume. Blocking operators will return an empty schema batch. If a flatten 
> function occurs downstream from a blocking operator, it will not be able to 
> return a schema, and thus fast schema return will not happen in this case.
> In the cases where the complex function is not downstream from a blocking 
> operator, fast schema return should continue to work.
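
A minimal sketch of the idea described above, in Python rather than Drill's
own Java. It is not Drill's implementation: the names complex_type and
first_known_schema are hypothetical, and a "batch" is modelled as a plain
list of JSON-encoded strings. It only illustrates that the output type
depends on the decoded data, and that an operator can keep consuming
incoming batches until a schema can be determined.

```python
import json


def complex_type(value: str) -> str:
    """Classify a JSON-encoded string as 'map', 'list', or 'primitive'.

    Knowing that the input is a string is not enough; the value has to be
    decoded before its output type is known.
    """
    decoded = json.loads(value)
    if isinstance(decoded, dict):
        return "map"
    if isinstance(decoded, list):
        return "list"
    return "primitive"


def first_known_schema(batches):
    """Consume batches until one contains data, then report the output type.

    Hypothetical helper: empty batches carry no information about the
    output type, so they are skipped instead of being answered with a
    guessed schema. Returns None if no batch ever contains data.
    """
    for batch in batches:
        if batch:
            return complex_type(batch[0])
    return None
```

For example, under these assumptions first_known_schema([[], ['[1, 2]']])
skips the empty batch and reports "list", where a fixed assumption of a map
type would later have to be corrected by a schema change.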



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
