[
https://issues.apache.org/jira/browse/DRILL-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258036#comment-15258036
]
ASF GitHub Bot commented on DRILL-4437:
---------------------------------------
Github user magpierre commented on the pull request:
https://github.com/apache/drill/pull/451#issuecomment-214733518
I recently got a tip from the Drill dev team to use the union type, which is
working great and greatly simplifies the "format" that needs to be produced in
order to store XML in JSON format. It needs some code changes, though, so
possibly this pull request should be closed and reopened at a later stage.
However, the DRILL-4437 commit closes off the possibility of using the union
type and of reading numbers as double when embedding JSON, both of which have
proved essential to supporting XML in JSON. Here's an example of the format
created by the XML parser when doing select * on the Drill root pom file
(simplified thanks to the union type being available, which is far cleaner and
leaner):
[test.json.zip](https://github.com/apache/drill/files/236604/test.json.zip)
> Implement framework for testing operators in isolation
> ------------------------------------------------------
>
> Key: DRILL-4437
> URL: https://issues.apache.org/jira/browse/DRILL-4437
> Project: Apache Drill
> Issue Type: Test
> Components: Tools, Build & Test
> Reporter: Jason Altekruse
> Assignee: Jason Altekruse
> Fix For: 1.7.0
>
>
> Most of the tests written for Drill are end-to-end. We spin up a full
> instance of the server, submit one or more SQL queries and check the results.
> While integration tests like this are useful for ensuring that features
> do not break end-user functionality, overuse of this approach has caused
> a number of pain points.
> Overall, the tests end up running a lot of the exact same code, parsing and
> planning many similar queries.
> Creating consistent reproductions of issues, especially edge cases found in
> clustered environments, can be extremely difficult. Even the simpler case of
> testing whether operators can handle a particular series of incoming
> batches of records has required hacks like generating files large enough
> that the scanners happen to break them up into separate batches.
> These tests are brittle, as they make assumptions about how the scanners will
> work in the future. For example, a performance evaluation might show that we
> should be producing larger batches in some cases. Existing tests that try to
> exercise multiple batches by producing a few more records than the current
> batch-size threshold would then no longer be testing the same code paths.
> We need to make more parts of the system testable without initializing the
> entire Drill server, and to make the different internal settings and
> state of the server configurable for tests.
> This is a first effort to enable testing the physical operators in Drill by
> mocking the components of the system necessary to enable operators to
> initialize and execute.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)