[ https://issues.apache.org/jira/browse/HIVE-23030?focusedWorklogId=412389&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-412389 ]
ASF GitHub Bot logged work on HIVE-23030:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 30/Mar/20 15:32
Start Date: 30/Mar/20 15:32
Worklog Time Spent: 10m
Work Description: b-slim commented on pull request #960: HIVE-23030 ds rollup union
URL: https://github.com/apache/hive/pull/960#discussion_r400286922

##########
File path: ql/src/test/results/clientpositive/llap/sketches_materialized_view_rollup.q.out
##########
@@ -0,0 +1,262 @@
+PREHOOK: query: create table sketch_input (id int, category char(1))
+STORED AS ORC
+TBLPROPERTIES ('transactional'='true')
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@sketch_input
+POSTHOOK: query: create table sketch_input (id int, category char(1))
+STORED AS ORC
+TBLPROPERTIES ('transactional'='true')
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@sketch_input
+PREHOOK: query: insert into table sketch_input values
+ (1,'a'),(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (5, 'a'), (6, 'a'), (7, 'a'), (8, 'a'), (9, 'a'), (10, 'a'),
+ (6,'b'),(6, 'b'), (7, 'b'), (8, 'b'), (9, 'b'), (10, 'b'), (11, 'b'), (12, 'b'), (13, 'b'), (14, 'b'), (15, 'b')
+PREHOOK: type: QUERY
+PREHOOK: Input: _dummy_database@_dummy_table
+PREHOOK: Output: default@sketch_input
+POSTHOOK: query: insert into table sketch_input values
+ (1,'a'),(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (5, 'a'), (6, 'a'), (7, 'a'), (8, 'a'), (9, 'a'), (10, 'a'),
+ (6,'b'),(6, 'b'), (7, 'b'), (8, 'b'), (9, 'b'), (10, 'b'), (11, 'b'), (12, 'b'), (13, 'b'), (14, 'b'), (15, 'b')
+POSTHOOK: type: QUERY
+POSTHOOK: Input: _dummy_database@_dummy_table
+POSTHOOK: Output: default@sketch_input
+POSTHOOK: Lineage: sketch_input.category SCRIPT []
+POSTHOOK: Lineage: sketch_input.id SCRIPT []
+PREHOOK: query: create materialized view mv_1 as
+ select category, ds_hll_sketch(id),count(id) from sketch_input group by category
+PREHOOK: type: CREATE_MATERIALIZED_VIEW
+PREHOOK: Input: default@sketch_input
+PREHOOK: Output: database:default
+PREHOOK: Output: default@mv_1
+POSTHOOK: query: create materialized view mv_1 as
+ select category, ds_hll_sketch(id),count(id) from sketch_input group by category
+POSTHOOK: type: CREATE_MATERIALIZED_VIEW
+POSTHOOK: Input: default@sketch_input
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@mv_1
+PREHOOK: query: explain
+select category, ds_hll_estimate(ds_hll_sketch(id)) from sketch_input group by category
+PREHOOK: type: QUERY
+PREHOOK: Input: default@mv_1
+PREHOOK: Input: default@sketch_input
+#### A masked pattern was here ####
+POSTHOOK: query: explain
+select category, ds_hll_estimate(ds_hll_sketch(id)) from sketch_input group by category
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@mv_1
+POSTHOOK: Input: default@sketch_input
+#### A masked pattern was here ####
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Tez
+#### A masked pattern was here ####
+      Vertices:
+        Map 1
+            Map Operator Tree:
+                TableScan
+                  alias: default.mv_1
+                  Statistics: Num rows: 2 Data size: 362 Basic stats: COMPLETE Column stats: COMPLETE
+                  Select Operator
+                    expressions: category (type: char(1)), ds_hll_estimate(_c1) (type: double)
+                    outputColumnNames: _col0, _col1
+                    Statistics: Num rows: 2 Data size: 186 Basic stats: COMPLETE Column stats: COMPLETE
+                    File Output Operator
+                      compressed: false
+                      Statistics: Num rows: 2 Data size: 186 Basic stats: COMPLETE Column stats: COMPLETE
+                      table:
+                          input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+                          output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+                          serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+            Execution mode: vectorized, llap
+            LLAP IO: all inputs
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: select category, ds_hll_estimate(ds_hll_sketch(id)) from sketch_input group by category
+PREHOOK: type: QUERY
+PREHOOK: Input: default@mv_1
+PREHOOK: Input: default@sketch_input
+#### A masked pattern was here ####
+POSTHOOK: query: select category, ds_hll_estimate(ds_hll_sketch(id)) from sketch_input group by category
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@mv_1
+POSTHOOK: Input: default@sketch_input
+#### A masked pattern was here ####
+a 10.000000223517425

Review comment:
FYI, I think you are still at the mercy of rounding. For instance, round(1.00000000001) is different from round(0.999999999999999), and an error of that size is acceptable from the sketching side. I recommend adding something like a comparison against an upper or lower bound, or using a CASE statement, and/or adding a comment line to explain why this test can be flaky.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 412389)
Time Spent: 4h 40m (was: 4.5h)

> Enable sketch union-s to be rolled up
> -------------------------------------
>
> Key: HIVE-23030
> URL: https://issues.apache.org/jira/browse/HIVE-23030
> Project: Hive
> Issue Type: Sub-task
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-23030.01.patch, HIVE-23030.02.patch,
> HIVE-23030.03.patch, HIVE-23030.04.patch, HIVE-23030.04.patch,
> HIVE-23030.05.patch
>
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> Enabling rolling up sketch aggregates could enable the matching of
> materialized views created for higher dimensions to be applied for lower
> dimension cases.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
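
Illustration of the review comment above (not part of the PR): a minimal Python sketch of why pinning the exact estimate in the expected output can be fragile, and how a bound check avoids that. The value 10.000000223517425 comes from the q.out; the second value, the 5% tolerance, and the helper name `within_tolerance` are assumptions chosen only for this example and say nothing about the actual error bounds of the HLL sketch.

```python
import math

# Category 'a' in the test data holds the distinct ids 1..10.
EXPECTED_DISTINCT = 10


def within_tolerance(estimate: float, expected: float, rel_tol: float = 0.05) -> bool:
    """Return True when estimate lies within rel_tol of expected.

    rel_tol is a hypothetical tolerance for this illustration, not a
    documented bound of the sketch implementation.
    """
    return math.isclose(estimate, expected, rel_tol=rel_tol)


observed = 10.000000223517425     # value recorded in the q.out
hypothetical = 9.999999776482582  # assumed value a different run might produce

# An exact comparison treats the two runs as different results.
print(observed == hypothetical)

# A bound check accepts both, since both are close to the true count.
print(within_tolerance(observed, EXPECTED_DISTINCT))
print(within_tolerance(hypothetical, EXPECTED_DISTINCT))
```

The same idea carries over to the test query itself: emit a pass/fail flag derived from a bound comparison instead of the raw double, so the expected output stays stable across small numeric differences.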