[ 
https://issues.apache.org/jira/browse/DRILL-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka resolved DRILL-5612.
------------------------------------
    Resolution: Fixed

> Random failure in TestMergeJoinWithSchemaChanges
> ------------------------------------------------
>
>                 Key: DRILL-5612
>                 URL: https://issues.apache.org/jira/browse/DRILL-5612
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.11.0
>            Reporter: Paul Rogers
>            Assignee: Vitalii Diravka
>            Priority: Major
>         Attachments: image-2021-11-16-02-35-25-690.png
>
>
> The unit test 
> {{org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges#testMissingAndNewColumns}}
>  is subject to random failures, perhaps due to changes in file order in 
> readers.
> The test builds a number of input files, then executes queries against them. 
> On most runs, the output is fine:
> {code}
> Running org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges#testMissingAndNewColumns
> /home/.../target/1498606483211-0/mergejoin-schemachanges-left
> /home/.../target/1498606483211-1/mergejoin-schemachanges-right
> {code}
> But, on occasion, the query fails:
> {code}
> org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges
> testMissingAndNewColumns(org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges)  Time elapsed: 0.569 sec  <<< ERROR!
> ...: UNSUPPORTED_OPERATION ERROR: Sort doesn't currently support sorts with changing schemas
> Fragment 0:0
>   (org.apache.drill.exec.exception.SchemaChangeException) Sort currently only supports a single schema.
>     org.apache.drill.exec.physical.impl.sort.SortRecordBatchBuilder.build():152
>     org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext():476
> ...
> {code}
> The exception in the stack trace above is thrown by this check in 
> SortRecordBatchBuilder.build():
> {code}
>   public void build(VectorContainer outputContainer) throws SchemaChangeException {
>     outputContainer.clear();
>     if (batches.keySet().size() > 1) {
>       throw new SchemaChangeException("Sort currently only supports a single schema.");
>     }
> {code}
> The above code has not changed in quite some time. The failure is in the 
> "legacy" external sort.
> Although the external sort does support schema changes, it only does so in 
> the form of a union vector, which must be enabled. (Other tests validate that 
> schema changes work.)
> What is likely happening here is that the sort sometimes sees two files with 
> differing schemas, while at other times multiple threads run so that a single 
> sort sees only one file. This speculation can be verified by looking at a log 
> file (not available in the test run that failed) to see whether the scan under 
> the sort read more than one file.
> Or, perhaps the order of the JSON files matters. File order may vary across 
> machines, since listing a directory on Linux does not guarantee any 
> particular order.
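
The file-order hypothesis above can be illustrated with a minimal Java sketch. It relies on one documented fact: java.io.File.listFiles() makes no guarantee about the order of the returned array. Everything else is an assumption made for illustration: the class name SortedListing, the helper listSorted, and the sample file names are hypothetical, and nothing here is taken from Drill's test or reader code. Whether the test inputs are actually enumerated this way would still need to be confirmed from the logs.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of Drill: shows one way to make the order
// in which input files are visited independent of the file system.
public class SortedListing {

  // Returns the regular files in dir ordered by name. File.listFiles() gives
  // no ordering guarantee, so the explicit sort is what makes this stable.
  static File[] listSorted(File dir) {
    File[] files = dir.listFiles(File::isFile);
    if (files == null) {
      throw new IllegalArgumentException("Not a readable directory: " + dir);
    }
    Arrays.sort(files, Comparator.comparing(File::getName));
    return files;
  }

  public static void main(String[] args) throws IOException {
    // Create the files in a deliberately unsorted order (names are made up).
    Path dir = Files.createTempDirectory("listing-demo");
    for (String name : new String[] {"c.json", "a.json", "b.json"}) {
      Files.createFile(dir.resolve(name));
    }
    for (File f : listSorted(dir.toFile())) {
      System.out.println(f.getName());
    }
  }
}
```

If file order does turn out to be the cause, fixing the order in this way would make the failure either always or never reproduce on a given setup, which is easier to diagnose than a random one.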



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
