-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44386/#review124407
-----------------------------------------------------------
I'm still having trouble understanding the query `select customer_id,
customer_interests, unit_sales from sales where time_range_in(order_time,
'2015-04-11-00', '2015-04-13-00') and customer_interests not in ('Food')`. The
name `customer_interests` seems like a collection field. Which it is, looking
at the earlier query mentioned in testing done section. The filter in the query
is `customer_interests not in ('Food')` while the results make it seem like the
filter that's finally applied is `'Food' not in customer_interests`.
Secondly, to me, the query `select customer_id, customer_interests, unit_sales
from sales where time_range_in(order_time, '2015-04-11-00', '2015-04-13-00')
and 'Food' not in customer_interests` seems intuitive to understand, since
looking at field named `customer_interests`, there's an automatic assumption
involded that it's an array field. And if it's an array field, `element in
array` is the widely recognized syntax than `array in element`.
Would like to know other people's thoughts too.
Will do the code review separately.
- Rajat Khandelwal
On March 17, 2016, 5:52 a.m., Amareshwari Sriramadasu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44386/
> -----------------------------------------------------------
>
> (Updated March 17, 2016, 5:52 a.m.)
>
>
> Review request for lens.
>
>
> Bugs: LENS-788
> https://issues.apache.org/jira/browse/LENS-788
>
>
> Repository: lens
>
>
> Description
> -------
>
> Changes include :
> - Adds option to flatten early, if required. Default value is false. So,
> aggregation are run over expressions if value is false.
> - IN, NOT IN, = and != filters are converted to array_contains filters. Added
> config to override for filter funciton to use.
> - Did some refactoring to update fromString in QueryAST
>
>
> Diffs
> -----
>
> lens-cube/src/main/java/org/apache/lens/cube/parse/CandidateFact.java
> 82ca4f4
> lens-cube/src/main/java/org/apache/lens/cube/parse/CubeQueryConfUtil.java
> d96b567
> lens-cube/src/main/java/org/apache/lens/cube/parse/CubeQueryContext.java
> b8b6db9
> lens-cube/src/main/java/org/apache/lens/cube/parse/DefaultAliasDecider.java
> dadbfa0
> lens-cube/src/main/java/org/apache/lens/cube/parse/DefaultQueryAST.java
> a403e36
> lens-cube/src/main/java/org/apache/lens/cube/parse/DimHQLContext.java
> 318c82a
> lens-cube/src/main/java/org/apache/lens/cube/parse/DimOnlyHQLContext.java
> d22287b
> lens-cube/src/main/java/org/apache/lens/cube/parse/HQLParser.java 8d6105f
> lens-cube/src/main/java/org/apache/lens/cube/parse/JoinResolver.java
> b861bb6
> lens-cube/src/main/java/org/apache/lens/cube/parse/MultiFactHQLContext.java
> 9c18b7e
> lens-cube/src/main/java/org/apache/lens/cube/parse/QueryAST.java 31680ca
>
> lens-cube/src/main/java/org/apache/lens/cube/parse/SingleFactMultiStorageHQLContext.java
> 63cb388
>
> lens-cube/src/main/java/org/apache/lens/cube/parse/SingleFactSingleStorageHQLContext.java
> b1a3b3f
>
> lens-cube/src/main/java/org/apache/lens/cube/parse/join/AutoJoinContext.java
> e14a898
>
> lens-cube/src/main/java/org/apache/lens/cube/parse/join/BridgeTableJoinContext.java
> PRE-CREATION
> lens-cube/src/main/resources/olap-query-conf.xml 1436cd1
> lens-cube/src/test/java/org/apache/lens/cube/parse/CubeTestSetup.java
> 42decc6
>
> lens-cube/src/test/java/org/apache/lens/cube/parse/TestBridgeTableQueries.java
> e8ad8ff
>
> lens-cube/src/test/java/org/apache/lens/cube/parse/TestDefaultAliasDecider.java
> PRE-CREATION
> lens-cube/src/test/java/org/apache/lens/cube/parse/TestHQLParser.java
> f9d7457
> lens-cube/src/test/java/org/apache/lens/cube/parse/TestQuery.java cd20fef
>
> lens-cube/src/test/java/org/apache/lens/cube/parse/join/TestBridgeTableJoinCtx.java
> PRE-CREATION
> lens-dist/src/main/assembly/bin-dist.xml dede085
>
> lens-examples/src/main/java/org/apache/lens/examples/PopulateSampleMetastore.java
> cfe3465
> lens-examples/src/main/java/org/apache/lens/examples/SampleMetastore.java
> 0c2301b
> lens-examples/src/main/resources/cube-queries.sql 9f4a353
> lens-examples/src/main/resources/customer-interests-local-part.xml
> PRE-CREATION
> lens-examples/src/main/resources/customer-interests-local.data PRE-CREATION
> lens-examples/src/main/resources/customer-interests.xml PRE-CREATION
> lens-examples/src/main/resources/customer_interests_table.xml PRE-CREATION
> lens-examples/src/main/resources/db-storage-schema.sql 0aceeab
> lens-examples/src/main/resources/interests-local-part.xml PRE-CREATION
> lens-examples/src/main/resources/interests-local.data PRE-CREATION
> lens-examples/src/main/resources/interests.xml PRE-CREATION
> lens-examples/src/main/resources/interests_table.xml PRE-CREATION
> lens-examples/src/main/resources/sales-cube.xml e944821
> src/site/apt/user/olap-cube.apt 4bed623
> src/site/apt/user/olap-query-conf.apt 6f84869
> tools/conf/client/lens-client-site.xml 706e356
>
> Diff: https://reviews.apache.org/r/44386/diff/
>
>
> Testing
> -------
>
> [INFO]
> ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Lens Checkstyle Rules ............................. SUCCESS [1.906s]
> [INFO] Lens .............................................. SUCCESS [3.122s]
> [INFO] Lens API .......................................... SUCCESS [28.599s]
> [INFO] Lens API for server and extensions ................ SUCCESS [19.594s]
> [INFO] Lens Cube ......................................... SUCCESS
> [11:12.234s]
> [INFO] Lens DB storage ................................... SUCCESS [19.512s]
> [INFO] Lens Query Library ................................ SUCCESS [17.144s]
> [INFO] Lens Hive Driver .................................. SUCCESS [2:52.782s]
> [INFO] Lens Driver for JDBC .............................. SUCCESS [36.328s]
> [INFO] Lens Elastic Search Driver ........................ SUCCESS [17.957s]
> [INFO] Lens Server ....................................... SUCCESS
> [17:40.805s]
> [INFO] Lens client ....................................... SUCCESS [34.762s]
> [INFO] Lens CLI .......................................... SUCCESS [2:53.130s]
> [INFO] Lens Examples ..................................... SUCCESS [10.135s]
> [INFO] Lens Ship Jars to Distributed Cache ............... SUCCESS [0.737s]
> [INFO] Lens Distribution ................................. SUCCESS [9.731s]
> [INFO] Lens ML Lib ....................................... FAILURE
> [40:05.058s]
> [INFO] Lens ML Ext Distribution .......................... SKIPPED
> [INFO] Lens Regression ................................... SKIPPED
> [INFO] Lens UI ........................................... SKIPPED
> [INFO]
> ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO]
> ------------------------------------------------------------------------
> [INFO] Total time: 1:18:04.413s
> [INFO] Finished at: Wed Mar 16 14:47:06 UTC 2016
> [INFO] Final Memory: 196M/1834M
> [INFO]
> ------------------------------------------------------------------------
>
> Updated examples with bridge tables and added queries.
>
> ----
> lens-shell>select customer_id, customer_interests, unit_sales from sales
> where time_range_in(order_time, '2015-04-11-00', '2015-04-13-00')
> 16 Mar 2016 13:24:43 [Spring Shell] INFO cliLogger - Driver query: 'SELECT
> ( sales . customer_id ), ( customer_interests_chain . balias0 ), sum(( sales
> . unit_sales )) FROM ex1.local_sales_aggr_fact1 sales left outer join
> ex1.local_customer_table customer on sales.customer_id = customer.id and
> (customer.dt = 'latest') left outer join (select
> customer_interests.customer_id as customer_id,collect_set((
> customer_interests_chain . name )) as balias0 from
> ex1.local_customer_interests_table customer_interests join
> ex1.local_interests_table customer_interests_chain on
> customer_interests.interest_id = customer_interests_chain.id and
> (customer_interests_chain.dt = 'latest') and (customer_interests.dt =
> 'latest') group by customer_interests.customer_id) customer_interests_chain
> on customer.id = customer_interests_chain.customer_id WHERE (((( sales . ot )
> = '2015-04-11' ) or (( sales . ot ) = '2015-04-12' ))) GROUP BY ( sales .
> customer_id ), ( customer_interests_chain . balias0 )' and Dri
ver handle: OperationHandle [opType=EXECUTE_STATEMENT,
getHandleIdentifier()=359ad260-7d03-4ea0-9c7e-254391dd52ed]
> 16 Mar 2016 13:24:53 [Spring Shell] INFO cliLogger - Query Status:Progress:
> 0.1818181872367859
> Status: RUNNING
> Status Message: Query is running in HiveServer!
> Is Result Set Available: false
> Progress Message:
> [{"taskId":"Stage-12","type":"MAPREDLOCAL","externalHandle":"Stage-12","taskState":"FINISHED_STATE"},{"taskId":"Stage-2","type":"MAPRED","externalHandle":null,"taskState":"FINISHED_STATE"},{"taskId":"Stage-9","type":"CONDITIONAL","externalHandle":"Stage-9","taskState":"RUNNING_STATE"},{"taskId":"Stage-11","type":"MAPREDLOCAL","externalHandle":"Stage-11","taskState":"RUNNING_STATE"},{"taskId":"Stage-11","type":"MAPREDLOCAL","externalHandle":"Stage-11","taskState":"RUNNING_STATE"},{"taskId":"Stage-3","type":"MAPRED","externalHandle":null,"taskState":"UNKNOWN_STATE"},{"taskId":"Stage-8","type":"MAPRED","externalHandle":null,"taskState":"UNKNOWN_STATE"},{"taskId":"Stage-8","type":"MAPRED","externalHandle":null,"taskState":"UNKNOWN_STATE"},{"taskId":"Stage-4","type":"MAPRED","externalHandle":null,"taskState":"UNKNOWN_STATE"},{"taskId":"Stage-4","type":"MAPRED","externalHandle":null,"taskState":"UNKNOWN_STATE"},{"taskId":"Stage-4","type":"MAPRED","externalHandle":null,"
taskState":"UNKNOWN_STATE"}]
>
>
> sales.customer_id customer_interests_chain.balias0 _c2
> Result available in memory, attaching here:
>
> 1 ["Fashion","Food"] 2
> 2 null 2
> 2 rows processed in (27) seconds.
>
> lens-shell>select customer_id, customer_interests, unit_sales from sales
> where time_range_in(order_time, '2015-04-11-00', '2015-04-13-00') and
> customer_interests not in ('Food')
> 16 Mar 2016 13:25:17 [Spring Shell] INFO cliLogger - Driver query: 'SELECT
> ( sales . customer_id ), ( customer_interests_chain . balias0 ), sum(( sales
> . unit_sales )) FROM ex1.local_sales_aggr_fact1 sales left outer join
> ex1.local_customer_table customer on sales.customer_id = customer.id and
> (customer.dt = 'latest') left outer join (select
> customer_interests.customer_id as customer_id,collect_set((
> customer_interests_chain . name )) as balias0 from
> ex1.local_customer_interests_table customer_interests join
> ex1.local_interests_table customer_interests_chain on
> customer_interests.interest_id = customer_interests_chain.id and
> (customer_interests_chain.dt = 'latest') and (customer_interests.dt =
> 'latest') group by customer_interests.customer_id) customer_interests_chain
> on customer.id = customer_interests_chain.customer_id WHERE ((((( sales . ot
> ) = '2015-04-11' ) or (( sales . ot ) = '2015-04-12' )) and not
> array_contains(( customer_interests_chain . balias0 ), 'Food' ))) GROUP B
Y ( sales . customer_id ), ( customer_interests_chain . balias0 )' and Driver
handle: OperationHandle [opType=EXECUTE_STATEMENT,
getHandleIdentifier()=f945a7a7-7f2d-4cb2-b893-5886bd3753cd]
>
> 16 Mar 2016 13:25:37 [Spring Shell] INFO cliLogger - Query Status:Progress:
> 0.6363636255264282
> Status: RUNNING
> Status Message: Query is running in HiveServer!
> Is Result Set Available: false
> Progress Message:
> [{"taskId":"Stage-12","type":"MAPREDLOCAL","externalHandle":"Stage-12","taskState":"FINISHED_STATE"},{"taskId":"Stage-2","type":"MAPRED","externalHandle":null,"taskState":"FINISHED_STATE"},{"taskId":"Stage-9","type":"CONDITIONAL","externalHandle":"Stage-9","taskState":"FINISHED_STATE"},{"taskId":"Stage-11","type":"MAPREDLOCAL","externalHandle":"Stage-11","taskState":"FINISHED_STATE"},{"taskId":"Stage-11","type":"MAPREDLOCAL","externalHandle":"Stage-11","taskState":"FINISHED_STATE"},{"taskId":"Stage-3","type":"MAPRED","externalHandle":null,"taskState":"UNKNOWN_STATE"},{"taskId":"Stage-8","type":"MAPRED","externalHandle":null,"taskState":"FINISHED_STATE"},{"taskId":"Stage-8","type":"MAPRED","externalHandle":null,"taskState":"FINISHED_STATE"},{"taskId":"Stage-4","type":"MAPRED","externalHandle":null,"taskState":"RUNNING_STATE"},{"taskId":"Stage-4","type":"MAPRED","externalHandle":null,"taskState":"RUNNING_STATE"},{"taskId":"Stage-4","type":"MAPRED","externalHandle":n
ull,"taskState":"RUNNING_STATE"}]
>
> sales.customer_id customer_interests_chain.balias0 _c2
> Result available in memory, attaching here:
>
> 2 null 2
> 1 rows processed in (27) seconds.
>
> lens-shell>select customer_id, customer_interests, unit_sales from sales
> where time_range_in(order_time, '2015-04-11-00', '2015-04-13-00') and
> customer_interests in ('Food')
> 16 Mar 2016 13:26:31 [Spring Shell] INFO cliLogger - Driver query: 'SELECT
> ( sales . customer_id ), ( customer_interests_chain . balias0 ), sum(( sales
> . unit_sales )) FROM ex1.local_sales_aggr_fact1 sales left outer join
> ex1.local_customer_table customer on sales.customer_id = customer.id and
> (customer.dt = 'latest') left outer join (select
> customer_interests.customer_id as customer_id,collect_set((
> customer_interests_chain . name )) as balias0 from
> ex1.local_customer_interests_table customer_interests join
> ex1.local_interests_table customer_interests_chain on
> customer_interests.interest_id = customer_interests_chain.id and
> (customer_interests_chain.dt = 'latest') and (customer_interests.dt =
> 'latest') group by customer_interests.customer_id) customer_interests_chain
> on customer.id = customer_interests_chain.customer_id WHERE ((((( sales . ot
> ) = '2015-04-11' ) or (( sales . ot ) = '2015-04-12' )) and array_contains((
> customer_interests_chain . balias0 ), 'Food' ))) GROUP BY (
sales . customer_id ), ( customer_interests_chain . balias0 )' and Driver
handle: OperationHandle [opType=EXECUTE_STATEMENT,
getHandleIdentifier()=82df2138-8f33-4454-bd12-160cf38d7fba]
> 16 Mar 2016 13:26:51 [Spring Shell] INFO cliLogger - Query Status:Progress:
> 0.6363636255264282
> Status: RUNNING
> Status Message: Query is running in HiveServer!
> Is Result Set Available: false
> Progress Message:
> [{"taskId":"Stage-12","type":"MAPREDLOCAL","externalHandle":"Stage-12","taskState":"FINISHED_STATE"},{"taskId":"Stage-2","type":"MAPRED","externalHandle":null,"taskState":"FINISHED_STATE"},{"taskId":"Stage-9","type":"CONDITIONAL","externalHandle":"Stage-9","taskState":"FINISHED_STATE"},{"taskId":"Stage-11","type":"MAPREDLOCAL","externalHandle":"Stage-11","taskState":"FINISHED_STATE"},{"taskId":"Stage-11","type":"MAPREDLOCAL","externalHandle":"Stage-11","taskState":"FINISHED_STATE"},{"taskId":"Stage-3","type":"MAPRED","externalHandle":null,"taskState":"UNKNOWN_STATE"},{"taskId":"Stage-8","type":"MAPRED","externalHandle":null,"taskState":"FINISHED_STATE"},{"taskId":"Stage-8","type":"MAPRED","externalHandle":null,"taskState":"FINISHED_STATE"},{"taskId":"Stage-4","type":"MAPRED","externalHandle":null,"taskState":"RUNNING_STATE"},{"taskId":"Stage-4","type":"MAPRED","externalHandle":null,"taskState":"RUNNING_STATE"},{"taskId":"Stage-4","type":"MAPRED","externalHandle":n
ull,"taskState":"RUNNING_STATE"}]
>
> sales.customer_id customer_interests_chain.balias0 _c2
> Result available in memory, attaching here:
>
> 1 ["Fashion","Food"] 2
> 1 rows processed in (25) seconds.
> ----
>
>
> Thanks,
>
> Amareshwari Sriramadasu
>
>