Jason Altekruse created DRILL-4437:
--------------------------------------
Summary: Implement framework for testing operators in isolation
Key: DRILL-4437
URL: https://issues.apache.org/jira/browse/DRILL-4437
Project: Apache Drill
Issue Type: Test
Components: Tools, Build & Test
Reporter: Jason Altekruse
Assignee: Jason Altekruse
Fix For: 1.6.0
Most of the tests written for Drill are end-to-end. We spin up a full instance
of the server, submit one or more SQL queries and check the results.
While integration tests like this are useful for ensuring that features do not
break end-user functionality, overuse of this approach has caused a number of
pain points.
Overall, the tests end up running much of the same code, parsing and planning
many similar queries.
Creating consistent reproductions of issues, especially edge cases found in
clustered environments, can be extremely difficult. Even the simpler case of
testing that operators can handle a particular series of incoming record
batches has required hacks such as generating files large enough that the
scanners happen to break them up into separate batches. These tests are brittle
because they make assumptions about how the scanners will work in the future.
For example, a performance evaluation might show that we should be producing
larger batches in some cases. Existing tests that try to exercise multiple
batches by producing a few more records than the current batch size threshold
would then no longer be testing the same code paths.
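To make the brittleness concrete, here is a small sketch of the arithmetic such
a test silently depends on. The record count and both thresholds are made-up
numbers for illustration, not Drill's actual settings, and batchCount is a
hypothetical helper that assumes a scanner simply fills each batch to the
threshold.

```java
public class BatchBoundarySketch {

    /**
     * Number of batches produced if a scanner fills every batch up to
     * maxBatchSize records (an assumption made for this sketch).
     */
    static int batchCount(int recordCount, int maxBatchSize) {
        // Ceiling division: a partially filled last batch still counts.
        return (recordCount + maxBatchSize - 1) / maxBatchSize;
    }

    public static void main(String[] args) {
        int records = 4100; // hypothetical: "a few more than the threshold"

        // With a hypothetical threshold of 4096 the test sees two batches.
        System.out.println(batchCount(records, 4096));

        // If the threshold is later raised to 8192, the same file yields a
        // single batch and the multi-batch code path is no longer exercised.
        System.out.println(batchCount(records, 8192));
    }
}
```

The test would still pass after such a change; it would just stop covering the
case it was written for.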
We need to make more parts of the system testable without initializing the
entire Drill server, and to make the different internal settings and
state of the server configurable for tests.
This is a first effort to enable testing the physical operators in Drill by
mocking the components of the system necessary to enable operators to
initialize and execute.
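
As a rough illustration of the idea (not the framework proposed in this issue,
and not Drill's operator API), the sketch below feeds a toy operator from a
mock upstream that replays hand-written batches, so the test controls every
batch boundary directly. BatchSource, MockBatchSource, LimitOperator and drain
are all hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class OperatorIsolationSketch {

    /** Hypothetical stand-in for anything that produces batches of records. */
    interface BatchSource {
        /** Returns the next batch, or null once the input is exhausted. */
        List<Integer> next();
    }

    /** Mock upstream: replays exactly the batches the test wrote by hand. */
    static final class MockBatchSource implements BatchSource {
        private final Iterator<List<Integer>> batches;

        MockBatchSource(List<List<Integer>> batches) {
            this.batches = batches.iterator();
        }

        @Override
        public List<Integer> next() {
            return batches.hasNext() ? batches.next() : null;
        }
    }

    /** Toy operator under test: passes through at most 'limit' records. */
    static final class LimitOperator implements BatchSource {
        private final BatchSource upstream;
        private int remaining;

        LimitOperator(BatchSource upstream, int limit) {
            this.upstream = upstream;
            this.remaining = limit;
        }

        @Override
        public List<Integer> next() {
            if (remaining <= 0) {
                return null; // limit reached, stop pulling from upstream
            }
            List<Integer> in = upstream.next();
            if (in == null) {
                return null; // upstream exhausted
            }
            // Truncate the batch that straddles the limit.
            int take = Math.min(remaining, in.size());
            List<Integer> out = new ArrayList<>(in.subList(0, take));
            remaining -= take;
            return out;
        }
    }

    /** Pulls batches from the operator until it reports end of input. */
    static List<List<Integer>> drain(BatchSource op) {
        List<List<Integer>> result = new ArrayList<>();
        for (List<Integer> b = op.next(); b != null; b = op.next()) {
            result.add(b);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three explicit batches; the limit of 4 falls inside the second one.
        BatchSource input = new MockBatchSource(Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 5),
                Arrays.asList(6)));
        System.out.println(drain(new LimitOperator(input, 4)));
    }
}
```

Because the batches are spelled out in the test itself, no file generation is
needed and the case of a limit falling in the middle of a batch keeps being
exercised regardless of how scanners size their batches.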