Paul Rogers created DRILL-5204:
----------------------------------

             Summary: Extend mock data source to use table specs from SQL
                 Key: DRILL-5204
                 URL: https://issues.apache.org/jira/browse/DRILL-5204
             Project: Apache Drill
          Issue Type: Improvement
          Components: Tools, Build & Test
    Affects Versions: 1.9.0
            Reporter: Paul Rogers
            Assignee: Paul Rogers
            Priority: Minor


DRILL-5152 provided a simple way to generate mock data from SQL:

{code}
SELECT colName_type FROM `mock`.`tableName_size` ...
{code}

That fix encodes column types and record counts directly in the SQL, which is 
very handy for many simple cases.
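
As a rough illustration of that naming convention, a query might look like the 
sketch below. The column and table names are made up for this example, and the 
suffix codes shown (i for an integer, s50 for a 50-character string, 100 for 
the row count) are assumptions; see DRILL-5152 for the codes actually 
supported.

{code}
-- Hypothetical names; the suffix meanings are assumed, not confirmed here.
-- id_i         : column "id", suffix "i" taken to mean INT
-- name_s50     : column "name", suffix "s50" taken to mean VARCHAR(50)
-- employee_100 : table "employee", suffix "100" taken to mean 100 rows
SELECT id_i, name_s50 FROM `mock`.`employee_100`
{code}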

The original mock data source has another feature: it lets you create multiple 
mock blocks of data that can be read in multiple threads. Later additions made 
it easy to repeat a column definition (to generate, say, a table with 1000 
columns), to choose the data generator class, and so on. Until now, all of this 
was available only when writing physical plans by hand and encoding the 
definition in the sub-scan for the mock data source.

This enhancement extends the SQL feature to allow the definitions to appear in 
a JSON file that is easily referenced from SQL. The JSON file must be somewhere 
on the class path (typically in a resources directory). Then:

{code}
SELECT red, blue, green FROM `mock`.`foo/colors.json` ...
{code}

is interpreted to mean: "The file colors.json defines a mock data source, 
perhaps with repeated columns, perhaps with multiple fragments. From that mock 
data source, select the three columns red, blue and green."

With this change, tests can include quite sophisticated mock data sources, 
simplifying debugging of plans with multiple fragments and/or more complex 
table structures.


