[ https://issues.apache.org/jira/browse/DRILL-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879786#comment-16879786 ]
ASF GitHub Bot commented on DRILL-7306: --------------------------------------- paul-rogers commented on issue #1813: DRILL-7306: Disable schema-only batch for new scan framework URL: https://github.com/apache/drill/pull/1813#issuecomment-508966464 @arina-ielchiieva, very sorry for the trouble that PR has caused. Thank you for the SF1 data. It allowed me to find the offending change and fix the issue. This is a case of "no good deed goes unpunished." I tried to work around the bad code explained in DRILL-7308 by setting precision only if non-zero. For reasons I did not track down, this change caused some part of the TPC-H queries to fail. The failure was in the "tuple/column metadata" classes. I had not realized that these classes are now used outside of just the new scan and schema work. (I did not track down the usage.) This exercise shows that Drill requires the following rules regarding precision: * Precision must be set for all types that need it (including VarChar and Decimal.) * Precision must be set for these types even if the value is zero. * Code that wants to know if the precision is non-zero should check the precision value itself. Code should not use an "is set" check as a substitute for "value != 0". (This is the flaw in the REST code described in DRILL-7308.) Filed DRILL-7318 to suggest we clean up and standardize our type-to-string implementations so that they build type strings based on the above rules. Reverted the precision-related change in {{PrimitiveColumnMetadata}}. The TPC-H union03 query now passes locally. I presume the others will also. The reversion caused a test to fail. It seems that the `EXPLAIN PLAN` for the new schema provisioning stuff uses the `toString()` method to get the type string shown in the plan for a provided schema. Not sure this is the best idea because `toString()` is for debugging and uses internal type names. It also shows the precision and scale for all types, resulting in the output `INT(0, 0):OPTIONAL`. (The type names and cardinality notation should be fixed as part of DRILL-7318.) The reverted change had removed the unwanted precision and scale. By reverting the change, the unwanted precision and scale returned. So, for now, I modified `MaterializedField.toString()` to not emit the precision if the value is zero (unless the type is Decimal.) `EXPLAIN PLAN` now produces `INT:OPTIONAL` (without the precision and scale.) Left the changes as a new commit so you can review just the new changes. Please try this PR again against the functional tests. Let's hope this time things work this time. If so, I'll squash commits. We talked about creating a merge branch with the three open PRs. Since you need to run the functional tests again to verify the fix, do you want to then just commit this PR? I can create a merge branch for the other two. Otherwise, I'm happy to create a merge branch with all three. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Disable "fast schema" batch for new scan framework > -------------------------------------------------- > > Key: DRILL-7306 > URL: https://issues.apache.org/jira/browse/DRILL-7306 > Project: Apache Drill > Issue Type: Bug > Affects Versions: 1.16.0 > Reporter: Paul Rogers > Assignee: Paul Rogers > Priority: Major > Fix For: 1.17.0 > > > The EVF framework is set up to return a "fast schema" empty batch with only > schema as its first batch because, when the code was written, it seemed > that's how we wanted operators to work. However, DRILL-7305 notes that many > operators cannot handle empty batches. > Since the empty-batch bugs show that Drill does not, in fact, provide a "fast > schema" batch, this ticket asks to disable the feature in the new scan > framework. The feature is disabled with a config option; it can be re-enabled > if ever it is needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)