On Fri, Jul 26, 2013 at 6:40 PM, Sree V <[email protected]> wrote:
> Hi J,
>
> - The goal was to come up with manually generated tpc-h logical
> queries.  We'll use these to validate the output of sql parser.
>
> I am doing the latter: feeding the tpc-h queries to the sql parser to come up
> with a logical plan, then verifying it manually or by feeding it into the
> execution engine.

See how it goes.  I don't think the SQL parser will actually parse all
the queries without substantial modification.  Plus, our hope was to
start using the logical queries for the execution engine while working
on the sql parser in tandem.
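
One small example of the kind of preparation the query text may need: the Q1 text quoted further down still contains ':1', which looks like an unsubstituted TPC-H template parameter rather than a literal. A minimal sketch of binding such placeholders before handing the text to any parser follows. The class name TpchParams and the method bind are hypothetical and not part of Drill, and the value 90 is only an example day count.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TpchParams {

    // Matches qgen-style positional placeholders such as :1, :2, ...
    private static final Pattern PLACEHOLDER = Pattern.compile(":(\\d+)");

    /**
     * Replaces each :N in the template with values[N - 1].
     * Hypothetical helper for illustration; it does not understand SQL
     * quoting, so a literal such as '12:30' would also be matched.
     */
    static String bind(String template, String... values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int index = Integer.parseInt(m.group(1));
            if (index < 1 || index > values.length) {
                throw new IllegalArgumentException("no value for :" + index);
            }
            // quoteReplacement keeps '$' and '\' in the value literal
            m.appendReplacement(out, Matcher.quoteReplacement(values[index - 1]));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String where = "l_shipdate <= date '1998-12-01' - interval ':1' day (3)";
        System.out.println(bind(where, "90"));
    }
}
```

Binding in a single regex pass avoids the problem of ":1" also matching the start of ":10" when a template has ten or more parameters.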

>
> - DrQL parser is not currently being used.
> I realized it later.
>
> - Why are you creating pojos for anything?
> TPC-H data set is in PSV files.
> It is easy with POJOs for this work flow.
> From PSV -> pojos -> JSON for now and any other format later.
>

Got it.
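
For anyone following along, the PSV -> pojos -> JSON flow described above could look roughly like the sketch below. Everything in it is an assumption made for illustration: the class PsvToJson, the three-field LineItem pojo, the column positions (taken from the usual lineitem column order), and the sample row, which is made up. It uses no JSON library and escapes only backslashes and double quotes.

```java
import java.math.BigDecimal;

public class PsvToJson {

    /** Hypothetical pojo carrying only three lineitem columns. */
    static final class LineItem {
        final String returnFlag;
        final String lineStatus;
        final BigDecimal quantity;

        LineItem(String returnFlag, String lineStatus, BigDecimal quantity) {
            this.returnFlag = returnFlag;
            this.lineStatus = lineStatus;
            this.quantity = quantity;
        }
    }

    /** Parses one pipe-separated row; the column positions are assumed. */
    static LineItem parse(String psvLine) {
        // limit -1 keeps trailing empty fields (rows may end with '|')
        String[] f = psvLine.split("\\|", -1);
        if (f.length < 10) {
            throw new IllegalArgumentException("expected at least 10 fields: " + psvLine);
        }
        return new LineItem(f[8], f[9], new BigDecimal(f[4]));
    }

    /** Minimal JSON string quoting: backslash and double quote only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Serializes the pojo as a single-line JSON object. */
    static String toJson(LineItem li) {
        return "{\"l_returnflag\":" + quote(li.returnFlag)
            + ",\"l_linestatus\":" + quote(li.lineStatus)
            + ",\"l_quantity\":" + li.quantity.toPlainString() + "}";
    }

    public static void main(String[] args) {
        // Made-up sample row in lineitem column order
        String row = "1|155190|7706|1|17|21168.23|0.04|0.02|N|O|1996-03-13|"
            + "1996-02-12|1996-03-22|DELIVER IN PERSON|TRUCK|sample comment|";
        // prints {"l_returnflag":"N","l_linestatus":"O","l_quantity":17}
        System.out.println(toJson(parse(row)));
    }
}
```

Keeping the pojo between the two formats means a later output format only needs another serializer method, not another PSV reader.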

> At the end, we can give out a data set and sqls, respective logical plan and 
> physical plan, for drill users to play with and refer to.
>
> V
>
>
>
> ________________________________
>  From: Jacques Nadeau <[email protected]>
> To: [email protected]; Sree V <[email protected]>
> Sent: Friday, July 26, 2013 10:59 AM
> Subject: Re: [jira] [Work started] (DRILL-47) Generate Logical Plans for 
> TPC-H Queries
>
>
> Some thoughts (not in any particular order):
>
> - The goal was to come up with manually generated tpc-h logical
> queries.  We'll use these to validate the output of sql parser.
> - DrQL parser is not currently being used.
> - Why are you creating pojos for anything?
>
> J
>
>
>
>> [Sree Vaddi:] Seems, I should be using 'sqlparser' project.  Any 
>> sample/thought ?
>>
>>
>> 3.
>> How to apply the parsed sql from 2. above to the data in 1. above, to output
>> the Logical Plan ?
>>
>>
>> Please advise.
>>
>>
>> Thanking you.
>> With Regards
>> Sree
>>
>>
>>
>> Supporting code for 2. above and debug info:
>>
>>     @Test
>>     public void testTPCHSql1() {
>>         String drqlQueryText = "select " +
>>             "l_returnflag, l_linestatus, " +
>>             "sum(l_quantity) as sum_qty, " +
>>             "sum(l_extendedprice) as sum_base_price, " +
>>             "sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, " +
>>             "sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, " +
>>             "avg(l_quantity) as avg_qty, " +
>>             "avg(l_extendedprice) as avg_price, " +
>>             "avg(l_discount) as avg_disc, " +
>>             "count(*) as count_order " +
>>         "from " +
>>             "lineitem " +
>>         "where " +
>>             "l_shipdate <= date '1998-12-01' - interval ':1' day (3) " +
>>         "group by " +
>>             "l_returnflag, " +
>>             "l_linestatus " +
>>         "order by " +
>>             "l_returnflag, " +
>>             "l_linestatus;";
>>
>>         DrqlParser parser = new AntlrParser();
>>         SemanticModelReader query = parser.parse(drqlQueryText);
>>
>>         System.out.println(query.getFromClause());
>>         System.out.println(query.getGroupByClause());
>>         System.out.println(query.getJoinOnClause());
>>         System.out.println(query.getjustATable());
>>         System.out.println(query.getLimitClause());
>>         System.out.println(query.getOrderByClause());
>>         System.out.println(query.getResultColumnList().size());
>>         System.out.println(query.getWhereClause());
>>         /*
>> setup debug info:
>> line#2299 DrqlAntlrParser
>> 2320
>> 3682
>> 4884
>> 5363
>>
>> 392
>> 6664
>>
>> #1207 DrqlAntlrLexer.mDiv()
>> part of the sql parsing:
>> // l_shipdate <= date '1998-12-01' - interval ':1' day (3)
>> variable value: (parsing location in sql, i.e. the location of letter 'd' in date)
>> [@125,378:379='<=',<52>,1:378]
>>
>> looks like the 'date' is interpreted as 'div' ?!
>>
>> test method console output:
>> line 1:382 mismatched character 'A' expecting ' '
>> line 1:416 mismatched character 'A' expecting ' '
>>
>> [org.apache.drill.parsers.impl.drqlantlr.SemanticModel@3a86edfe]
>> []
>> null
>> null
>> null
>> []
>> 10
>> null
>>
>>          */
>>     }
>>
>> ________________________________
>>  From: Sree Vaddi (JIRA) <[email protected]>
>> To: [email protected]
>> Sent: Thursday, July 25, 2013 7:07 AM
>> Subject: [jira] [Work started] (DRILL-47) Generate Logical Plans for TPC-H 
>> Queries
>>
>>
>>
>>      [ https://issues.apache.org/jira/browse/DRILL-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>
>> Work on DRILL-47 started by Sree Vaddi.
>>
>>> Generate Logical Plans for TPC-H Queries
>>> ----------------------------------------
>>>
>>>                 Key: DRILL-47
>>>                 URL: https://issues.apache.org/jira/browse/DRILL-47
>>>             Project: Apache Drill
>>>          Issue Type: New Feature
>>>            Reporter: Jacques Nadeau
>>>            Assignee: Sree Vaddi
>>>
>>> Creating example logical plans should help in many ways.  Among those are 
>>> validation cases for the sql parser, logical plan completeness, execution 
>>> engine performance, etc.  It would be great if someone could generate 
>>> logical plans for each of the TPC-H queries.
>>> The data is in PSV (pipe separated value) files.
>>> Converting these to JSON files.
>>> There are 20 sql files and 5 variant sql files.
>>> Creating one sub-task jira for each of the sql files.  Everything related 
>>> to that sql will be in it, i.e. sql parser, logical plan, execution stats 
>>> ...
>>
>> --
>> This message is automatically generated by JIRA.
>> If you think it was sent incorrectly, please contact your JIRA administrators
>> For more information on JIRA, see: http://www.atlassian.com/software/jira
