[
https://issues.apache.org/jira/browse/DRILL-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878917#comment-16878917
]
ASF GitHub Bot commented on DRILL-7306:
---------------------------------------
paul-rogers commented on issue #1813: DRILL-7306: Disable schema-only batch for
new scan framework
URL: https://github.com/apache/drill/pull/1813#issuecomment-508583884
Thanks, @arina-ielchiieva, for pointing me to the Parquet data sources. As
it turns out, these failures are quite a mystery.
First, I don't think the files you mentioned are those used by the tests
that failed. The set stored on GitHub is for scale factor (SF) 0.1 which has
1500 customers in the customer table with IDs from 0 to 1499. The tests seem to
use SF1 which, perhaps, is generated by the test framework during its setup. If
we look at the union03 query, the expected results include customer IDs in the
six-digit range.
That said, I did recreate the union03 query locally using the SF0.1 files
and got 3 result rows. To verify, I wrote a test that scanned the entire table
(just a `SELECT * FROM ...`) and "manually" applied the WHERE clause. Three
rows matched. So it looks like, at least locally, that particular query works OK
against the SF0.1 data set.
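
The manual check described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual test: the rows, the column names (`c_custkey`, `c_nationkey`) and the predicate are hypothetical stand-ins, since the real union03 WHERE clause and data are not reproduced here.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


def cross_check(
    full_scan: Iterable[Row],
    query_result: Iterable[Row],
    predicate: Callable[[Row], bool],
    key: str,
) -> Tuple[bool, List[Any], List[Any]]:
    """Compare a filtered query's rows with a full scan filtered by hand."""
    # Apply the WHERE clause "manually" to every row of the full scan.
    expected = sorted(row[key] for row in full_scan if predicate(row))
    # Keys actually returned by the query under test.
    actual = sorted(row[key] for row in query_result)
    return expected == actual, expected, actual


# Hypothetical stand-in for the output of `SELECT * FROM ...`.
full_scan = [{"c_custkey": k, "c_nationkey": k % 5} for k in range(20)]


# Hypothetical stand-in for the query's WHERE clause.
def predicate(row: Row) -> bool:
    return row["c_nationkey"] == 3 and row["c_custkey"] < 15


# Hypothetical stand-in for the rows the filtered query returned.
query_result = [{"c_custkey": 3}, {"c_custkey": 8}, {"c_custkey": 13}]

ok, expected, actual = cross_check(full_scan, query_result, predicate, "c_custkey")
print("match" if ok else "mismatch", expected, actual)
```

If the two key lists differ, the difference shows which rows the query dropped or added relative to the hand-applied filter.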
Unfortunately, I can't check the contents of the `customer.parquet` file
because, after several hours of fighting one thing after another, I still
can't get Parquet tools to work. I seem to recall we discussed bundling that tool with
Drill. Doing so would be very handy. Building by hand requires far more steps
than are documented on the Parquet and Hortonworks web sites: 1) install gcc, 2)
download and compile thrift, 3) build Parquet-tools, 4) figure out the set of
dependent jars that must be on the class path, 5)... not sure, here is where I
gave up in frustration...
Taking a step back, I'm actually completely mystified at how my changes
could impact Parquet (and only Parquet). The only source files this PR changed are for the
"new" scan, which Parquet does not use. Oddly, none of the text file queries
fail, and text files are the one area I *did* change.
Were the parquet files used in the tests rebuilt recently? Might there be a
problem with the data itself?
Just to make sure I'm tracking down the correct issue: does the master
branch pass these same tests using the same data files (that is, on the
same cluster, without rebuilding the functional tests)? Perhaps also try testing the
log regex or mock PRs. They are rebased on the same master version as this PR
but include a distinct set of changes. If those PRs pass, then the
problem is somewhere in this PR. If those PRs have failures, then perhaps we
want to double-check the test framework data.
While that is done, I will continue to try to find a way to track down the
issue (without access to the test framework or the SF1 data...)
> Disable "fast schema" batch for new scan framework
> --------------------------------------------------
>
> Key: DRILL-7306
> URL: https://issues.apache.org/jira/browse/DRILL-7306
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.16.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Major
> Fix For: 1.17.0
>
>
> The EVF framework is set up to return a "fast schema" empty batch with only
> schema as its first batch because, when the code was written, it seemed
> that's how we wanted operators to work. However, DRILL-7305 notes that many
> operators cannot handle empty batches.
> Since the empty-batch bugs show that Drill does not, in fact, provide a "fast
> schema" batch, this ticket asks to disable the feature in the new scan
> framework. The feature is disabled with a config option; it can be re-enabled
> if ever it is needed.