[ https://issues.apache.org/jira/browse/DRILL-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vitalii Diravka updated DRILL-5612:
-----------------------------------
    Attachment: image-2021-11-16-02-35-25-690.png

> Random failure in TestMergeJoinWithSchemaChanges
> ------------------------------------------------
>
>                 Key: DRILL-5612
>                 URL: https://issues.apache.org/jira/browse/DRILL-5612
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.11.0
>            Reporter: Paul Rogers
>            Priority: Major
>         Attachments: image-2021-11-16-02-35-25-690.png
>
>
> The unit test
> {{org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges#testMissingAndNewColumns}}
> is subject to random failures, perhaps due to changes in the order in which
> the readers see the input files.
> The test builds a number of input files, then executes queries against them.
> On most runs, the output is fine:
> {code}
> Running org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges#testMissingAndNewColumns
> /home/.../target/1498606483211-0/mergejoin-schemachanges-left
> /home/.../target/1498606483211-1/mergejoin-schemachanges-right
> {code}
> But, on occasion, the query fails:
> {code}
> org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges
> testMissingAndNewColumns(org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges)  Time elapsed: 0.569 sec  <<< ERROR!
> ...: UNSUPPORTED_OPERATION ERROR: Sort doesn't currently support sorts with changing schemas
> Fragment 0:0
>   (org.apache.drill.exec.exception.SchemaChangeException) Sort currently only supports a single schema.
>     org.apache.drill.exec.physical.impl.sort.SortRecordBatchBuilder.build():152
>     org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext():476
> ...
> {code}
> The code at the line named in the exception above:
> {code}
>   public void build(VectorContainer outputContainer) throws SchemaChangeException {
>     outputContainer.clear();
>     if (batches.keySet().size() > 1) {
>       throw new SchemaChangeException("Sort currently only supports a single schema.");
>     }
> {code}
> The above code has not changed in quite some time. The failure is in the
> "legacy" external sort.
> Although the external sort does support schema changes, it only does so in
> the form of a union vector, which must be enabled. (Other tests validate that
> schema changes work.)
> What is likely happening here is that the sort sometimes sees two files with
> differing schemas, while at other times multiple threads run so that a single
> sort sees only one file. This speculation can be verified by looking at a log
> file (not available in the test run that failed) to see if the scan under the
> sort read more than one file.
> Or, perhaps the order of the JSON files matters. Perhaps file order varies
> across machines (since listing a directory on Linux does not guarantee any
> particular order).
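
One way to probe the file-order hypothesis is to make the listing order explicit when experimenting outside Drill. The sketch below is not Drill code: the class `DeterministicListing` and the helper `listSorted` are hypothetical names introduced only for illustration. It relies on the documented fact that Java directory listings come back in no specific order, and sorts the entries by file name so that every run sees the files in the same sequence. It would not make the legacy sort accept more than one schema; at most it removes one source of run-to-run variation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DeterministicListing {

    // Hypothetical helper (not part of Drill): returns the regular files
    // directly under dir, sorted by file name, so the result does not
    // depend on the order in which the file system returns entries.
    public static List<Path> listSorted(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                .filter(Files::isRegularFile)
                .sorted((a, b) -> a.getFileName().toString()
                        .compareTo(b.getFileName().toString()))
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Create a scratch directory with files added out of name order.
        Path dir = Files.createTempDirectory("listing-demo");
        Files.createFile(dir.resolve("c.json"));
        Files.createFile(dir.resolve("a.json"));
        Files.createFile(dir.resolve("b.json"));

        // Print the file names in the sorted, reproducible order.
        for (Path p : listSorted(dir)) {
            System.out.println(p.getFileName());
        }
    }
}
```

Comparing a run that reads files in sorted order against a run that reads them in reverse order would show whether file order alone is enough to trigger the failure, which is still only a guess at this point.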