[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838386#comment-15838386 ]

ASF GitHub Bot commented on HIVE-15277:
---

Github user b-slim closed the pull request at:

    https://github.com/apache/hive/pull/120

> Teach Hive how to create/delete Druid segments
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
> Issue Type: Sub-task
> Components: Druid integration
> Affects Versions: 2.2.0
> Reporter: slim bouguerra
> Assignee: slim bouguerra
> Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: file.patch, HIVE-15277.2.patch, HIVE-15277.patch,
> HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch,
> HIVE-15277.patch, HIVE-15277.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation, Hive will generate Druid segment files and insert the
> metadata to signal the handoff to Druid.
> The syntax will be as follows:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS `metric2`>;
> {code}
> This statement stores the results of the query in a Druid
> datasource named 'datasourcename'. One of the columns of the query needs to
> be the time dimension, which is mandatory in Druid. In particular, we use the
> same convention that is used for Druid: there needs to be a column
> named '__time' in the result of the executed query, which will act as the
> time dimension column in Druid. Currently, the time dimension column needs to
> be a 'timestamp' type column.
> Metrics can be of type long, double, or float, while dimensions are strings.
> Keep in mind that Druid has a clear separation between dimensions and
> metrics; therefore, if you have a column in Hive that contains numbers and needs
> to be presented as a dimension, use the cast operator to cast it to string.
> This initial implementation interacts with the Druid metadata storage to
> add/remove the table in Druid. The user needs to supply the metadata config as
> --hiveconf hive.druid.metadata.password=XXX --hiveconf
> hive.druid.metadata.username=druid --hiveconf
> hive.druid.metadata.uri=jdbc:mysql://host/druid

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
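The pieces named in the issue description above could be combined roughly as follows. This is only a sketch under stated assumptions, not code from the patch: the source table `events` and its columns `event_ts`, `country`, `user_id`, and `clicks` are hypothetical, as is the query file location. The storage handler class, the `druid.datasource` property, the `__time` convention, the cast-to-string advice, and the three `hive.druid.metadata.*` settings come from the description. The script writes the query to a file and prints the `hive` command line without running it.

```shell
#!/bin/sh
set -eu

# Hypothetical location for the generated query file.
query_file="${TMPDIR:-/tmp}/druid_ctas.sql"

# CTAS statement: `events` and its columns are made-up names for illustration.
cat > "$query_file" <<'SQL'
CREATE TABLE druid_table_1
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "datasourcename")
AS
SELECT
  CAST(event_ts AS timestamp) AS `__time`,  -- mandatory time column
  country,                                  -- string column, treated as a dimension
  CAST(user_id AS string) AS user_id,       -- numeric column cast to string to act as a dimension
  clicks                                    -- numeric column, treated as a metric
FROM events;
SQL

# Metadata storage settings as given in the issue description (XXX is a placeholder).
hive_cmd="hive \
--hiveconf hive.druid.metadata.password=XXX \
--hiveconf hive.druid.metadata.username=druid \
--hiveconf hive.druid.metadata.uri=jdbc:mysql://host/druid \
-f $query_file"

# Dry run: show the command instead of executing it.
echo "$hive_cmd"
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without Hive or Druid; dropping the `echo` and invoking the command directly would be the obvious next step on a configured cluster.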
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15753038#comment-15753038 ]

Lefty Leverenz commented on HIVE-15277:
---

Also document the new configuration parameters:
* *hive.druid.indexer.segments.granularity*
* *hive.druid.indexer.partition.size.max*
* *hive.druid.indexer.memory.rownum.max*
* *hive.druid.basePersistDirectory*
* *hive.druid.storage.storageDirectory*
* *hive.druid.metadata.base*
* *hive.druid.metadata.db.type*
* *hive.druid.metadata.username*
* *hive.druid.metadata.password*
* *hive.druid.metadata.uri*
* *hive.druid.working.directory*

At this point there are enough Druid configuration parameters for a separate subsection in the Configuration Properties doc. (Also see HIVE-14217 and HIVE-15273.)
* [Hive Configuration Properties | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveConfigurationProperties]

Added a TODOC2.2 label.
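Until these parameters are documented, one way to pass a handful of them on the command line might look like the sketch below. The property names are taken from the list in this comment; every value shown is a made-up placeholder for illustration, not a documented default or a recommendation, and the script only prints the resulting command.

```shell
#!/bin/sh
set -eu

# Property names are from the list above; the values are placeholders only.
settings="
hive.druid.indexer.segments.granularity=DAY
hive.druid.metadata.db.type=mysql
hive.druid.working.directory=/tmp/druid-workdir
"

# Turn each key=value pair into a --hiveconf flag.
hiveconf_args=""
for kv in $settings; do
  hiveconf_args="$hiveconf_args --hiveconf $kv"
done
hiveconf_args="${hiveconf_args# }"   # drop the leading space

# Dry run: show the command instead of executing it.
echo "hive $hiveconf_args"
```

Keeping the settings in one list makes it easy to see which properties a session overrides; values that should apply to every session would more naturally live in the Hive site configuration.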
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752490#comment-15752490 ]

Jesus Camacho Rodriguez commented on HIVE-15277:
---

We should update the Druid integration wiki with information about new features introduced in this patch.
https://cwiki.apache.org/confluence/display/Hive/Druid+Integration
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752461#comment-15752461 ]

slim bouguerra commented on HIVE-15277:
---

Ran some of the tests locally and they passed:
https://gist.github.com/b-slim/dfa29b07ee901b5f0c8437975488436f
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752412#comment-15752412 ]

Hive QA commented on HIVE-15277:
---

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12843449/HIVE-15277.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 10817 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
TestVectorizedColumnReaderBase - did not produce a TEST-*.xml file (likely timed out) (batchId=251)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_sort_array] (batchId=59)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadataonly1] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] (batchId=92)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] (batchId=93)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema[5] (batchId=173)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag[0] (batchId=173)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2594/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2594/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2594/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12843449 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15751777#comment-15751777 ]

Jesus Camacho Rodriguez commented on HIVE-15277:
---

[~bslim], some of the test failures (age=1) seem related to this patch. Could you take a look?

Change in _LineageLogger_ causes changes in lineage golden files. It is better to tackle that in a follow-up and remove it from this patch, as it is not part of this issue.
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15751741#comment-15751741 ]

Hive QA commented on HIVE-15277:
---

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12843418/HIVE-15277.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 10814 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
TestVectorizedColumnReaderBase - did not produce a TEST-*.xml file (likely timed out) (batchId=251)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_sort_array] (batchId=59)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_lineage2] (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage2] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage3] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadataonly1] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=93)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2591/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2591/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2591/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12843418 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15751417#comment-15751417 ]

Jesus Camacho Rodriguez commented on HIVE-15277:
---

+1
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749770#comment-15749770 ]

Hive QA commented on HIVE-15277:
---

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12843292/HIVE-15277.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 10813 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
TestVectorizedColumnReaderBase - did not produce a TEST-*.xml file (likely timed out) (batchId=251)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_lineage2] (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage2] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage3] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadataonly1] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=151)
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery (batchId=216)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2580/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2580/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2580/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12843292 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749650#comment-15749650 ] Hive QA commented on HIVE-15277: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12843292/HIVE-15277.patch {color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 10813 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) TestVectorizedColumnReaderBase - did not produce a TEST-*.xml file (likely timed out) (batchId=251) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_lineage2] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage2] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage3] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadataonly1] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=151) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=93) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2579/testReport Console output: 
https://builds.apache.org/job/PreCommit-HIVE-Build/2579/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2579/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12843292 - PreCommit-HIVE-Build > Teach Hive how to create/delete Druid segments > --- > > Key: HIVE-15277 > URL: https://issues.apache.org/jira/browse/HIVE-15277 > Project: Hive > Issue Type: Sub-task > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: slim bouguerra >Assignee: slim bouguerra > Attachments: HIVE-15277.2.patch, HIVE-15277.patch, HIVE-15277.patch, > HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, file.patch > > > We want to extend the DruidStorageHandler to support CTAS queries. > In this implementation Hive will generate druid segment files and insert the > metadata to signal the handoff to druid. > The syntax will be as follows: > {code:sql} > CREATE TABLE druid_table_1 > STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler' > TBLPROPERTIES ("druid.datasource" = "datasourcename") > AS `metric2`>; > {code} > This statement stores the results of query in a Druid > datasource named 'datasourcename'. One of the columns of the query needs to > be the time dimension, which is mandatory in Druid. In particular, we use the > same convention that it is used for Druid: there needs to be a the column > named '__time' in the result of the executed query, which will act as the > time dimension column in Druid. Currently, the time column dimension needs to > be a 'timestamp' type column. > metrics can be of type long, double and float while dimensions are strings. 
> Keep in mind that Druid has a clear separation between dimensions and > metrics; therefore, if you have a column in Hive that contains numbers and needs > to be presented as a dimension, use the cast operator to cast it to string. > This initial implementation interacts with the Druid metadata storage to > add/remove the table in Druid; the user needs to supply the metadata config as > --hiveconf hive.druid.metadata.password=XXX --hiveconf > hive.druid.metadata.username=druid --hiveconf > hive.druid.metadata.uri=jdbc:mysql://host/druid -- This message was sent by Atlassian JIRA (v6.3.4#6332)
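For readers following the description quoted above, the pieces can be put together roughly as follows. This is only a sketch under stated assumptions, not the patch author's own invocation: the source table src_events and its columns (event_time, country, user_id, clicks) are hypothetical, the password and JDBC URI are the placeholders from the description, and the script prints the command for review rather than running Hive. The three --hiveconf keys, the storage handler class, the "druid.datasource" property and the '__time' convention are taken from the description; everything else is illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: assembles (and prints) a Hive CLI call for a Druid CTAS.
set -euo pipefail

# Metadata-store settings named in the issue description; values are placeholders.
hive_args=(
  --hiveconf "hive.druid.metadata.username=druid"
  --hiveconf "hive.druid.metadata.password=XXX"
  --hiveconf "hive.druid.metadata.uri=jdbc:mysql://host/druid"
)

# Hypothetical source table and column names, used for illustration only.
ctas_sql=$(cat <<'SQL'
CREATE TABLE druid_table_1
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "datasourcename")
AS
SELECT CAST(event_time AS timestamp) AS `__time`,  -- mandatory time column
       country,                                   -- string column, treated as a dimension
       CAST(user_id AS string) AS user_id,        -- numeric column exposed as a dimension
       clicks                                     -- long/double/float column, treated as a metric
FROM src_events;
SQL
)

# Print the shell-quoted command instead of executing it.
printf 'hive'
printf ' %q' "${hive_args[@]}" -e "$ctas_sql"
printf '\n'
```

The cast on user_id reflects the note about numeric columns that should be dimensions; whether a given Hive build accepts this exact statement depends on the patch version under review.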
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749407#comment-15749407 ] Hive QA commented on HIVE-15277: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12843276/HIVE-15277.patch {color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 10782 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=144) [vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,cbo_rp_subq_not_in.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q] TestVectorizedColumnReaderBase - did not produce a TEST-*.xml file (likely timed out) (batchId=251) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_lineage2] (batchId=139) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage2] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage3] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadataonly1] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=151) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=93) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[druid_location] (batchId=85) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2577/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2577/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2577/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 16 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12843276 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747890#comment-15747890 ] Jesus Camacho Rodriguez commented on HIVE-15277: [~bslim], there are some failures related to this patch. We need to fix them before checking it in.
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747102#comment-15747102 ] Hive QA commented on HIVE-15277: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12843141/HIVE-15277.patch {color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 10814 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) TestVectorizedColumnReaderBase - did not produce a TEST-*.xml file (likely timed out) (batchId=251) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic2] (batchId=10) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_lineage2] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage2] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage3] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadataonly1] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=151) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=93) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[druid_external] (batchId=85) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[druid_location] 
(batchId=85) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2567/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2567/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2567/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 17 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12843141 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15704721#comment-15704721 ] Jesus Camacho Rodriguez commented on HIVE-15277: [~bslim], I created HIVE-15303 to track the upgrade to 0.9.2. Do you have any estimate on when the new version will be released?
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703282#comment-15703282 ] Hive QA commented on HIVE-15277: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12840702/HIVE-15277.2.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2305/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2305/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2305/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2016-11-28 22:02:14.114 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-2305/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2016-11-28 22:02:14.117 + cd apache-github-source-source + git fetch origin From https://github.com/apache/hive abab282..78ab72e branch-2.1 -> origin/branch-2.1 + git reset --hard HEAD HEAD is now at 63bdfa6 HIVE-15284: Add junit test to test replication scenarios (Sushanth Sowmyan reviewed by Vaibhav Gumashta) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 63bdfa6 HIVE-15284: Add junit test to test replication scenarios (Sushanth Sowmyan reviewed by Vaibhav Gumashta) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2016-11-28 22:02:16.126 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch Going to apply patch with: patch -p0 patching file common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java patching file common/src/java/org/apache/hadoop/hive/conf/Constants.java patching file common/src/java/org/apache/hadoop/hive/conf/HiveConf.java patching file druid-handler/README.md patching file druid-handler/pom.xml patching file druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandlerUtils.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidOutputFormat.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidQueryBasedInputFormat.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidSplit.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidOutputFormat.java
patching file druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidQueryBasedInputFormat.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidRecordWriter.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/io/HiveDruidSplit.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidGroupByQueryRecordReader.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidQueryRecordReader.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSelectQueryRecordReader.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSerDe.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSerDeUtils.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidTimeseriesQueryRecordReader.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidTopNQueryRecordReader.java patching file druid-handler/src/test/org/apache/hadoop/hive/druid/DruidStorageHandlerTest.java patching file druid-handler/src/test/org/apache/hadoop/hive/druid/QTestDruidSerDe.java patching file
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703206#comment-15703206 ] slim bouguerra commented on HIVE-15277: --- This will fail until Druid 0.9.2 is released, but it can be reviewed.
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703139#comment-15703139 ] Hive QA commented on HIVE-15277: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12840702/HIVE-15277.2.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2303/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2303/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2303/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2016-11-28 21:15:28.263 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-2303/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2016-11-28 21:15:28.265 + cd apache-github-source-source + git fetch origin From https://github.com/apache/hive 1aebe9d..63bdfa6 master -> origin/master + git reset --hard HEAD HEAD is now at 1aebe9d HIVE-15168: Flaky test: TestSparkClient.testJobSubmission (still flaky) (Barna Zsombor Klara via Rui Li, reviewed by Xuefu Zhang and Rui Li) + git clean -f -d + git checkout master Already on 'master' Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded. (use "git pull" to update your local branch) + git reset --hard origin/master HEAD is now at 63bdfa6 HIVE-15284: Add junit test to test replication scenarios (Sushanth Sowmyan reviewed by Vaibhav Gumashta) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2016-11-28 21:15:29.546 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch Going to apply patch with: patch -p0 patching file common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java patching file common/src/java/org/apache/hadoop/hive/conf/Constants.java patching file common/src/java/org/apache/hadoop/hive/conf/HiveConf.java patching file druid-handler/README.md patching file druid-handler/pom.xml patching file druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandlerUtils.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidOutputFormat.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidQueryBasedInputFormat.java patching file
druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidSplit.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidOutputFormat.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidQueryBasedInputFormat.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidRecordWriter.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/io/HiveDruidSplit.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidGroupByQueryRecordReader.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidQueryRecordReader.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSelectQueryRecordReader.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSerDe.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSerDeUtils.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidTimeseriesQueryRecordReader.java patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidTopNQueryRecordReader.java patching file druid-handler/src/test/org/apache/hadoop/hive/druid/DruidStorageHandlerTest.java
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702905#comment-15702905 ] ASF GitHub Bot commented on HIVE-15277: --- GitHub user b-slim opened a pull request: https://github.com/apache/hive/pull/120 HIVE-15277 Druid stograge handler You can merge this pull request into a Git repository by running: $ git pull https://github.com/b-slim/hive rebase_druid_record_writer Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/120.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #120 commit 9025d4a33348faa007c17f2c7ff5dee4f3a87318 Author: Slim Bouguerra Date: 2016-10-26T23:55:34Z adding druid record writer bump guava version to 16.0.1 moving out the injector commit be2e29dcba5617db478eefa75a5478a77512e090 Author: Jesus Camacho Rodriguez Date: 2016-11-02T03:21:59Z Druid time granularity partitioning, serializer and necessary extensions commit df4036f7f76294dc5599d29cdb760336b0ee9a4f Author: Jesus Camacho Rodriguez Date: 2016-11-02T19:59:52Z Recognition of dimensions and metrics patch 1 commit ea76f0ddfa33990d92e061676123c45920ed6dce Author: Slim Bouguerra Date: 2016-11-02T21:18:00Z adding file schema support commit 010701be7cf939f6854c9ee113ccf40b20aed32a Author: Jesus Camacho Rodriguez Date: 2016-11-04T19:48:43Z native storage new fixes commit 3d8496299d1d151da59bb6f547ebbc475c329197 Author: Slim Bouguerra Date: 2016-11-09T17:57:03Z using segment output path commit 2b10b26eb7a5d9a6058c9e1f206c599e54ec88b2 Author: Slim Bouguerra Date: 2016-11-16T00:16:10Z adding check for existing datasource and implement drop table commit e18b716a438e8b38155d4ab31b7070ae1945f1e4 Author: Slim Bouguerra Date: 2016-11-19T00:53:10Z adding UTs and refactor some code commit 3b31d16dcb9fd5cdb9eb6d1c994cb3f0c8cd8a33 Author: Slim Bouguerra Date: 2016-11-23T23:49:28Z fix druid version
commit 4b447e56389aab1f45e9b48192068d1a0257a14c Author: Slim Bouguerra Date: 2016-11-28T19:32:02Z ignore record writer test commit a7b4f792a5e28b0772addbc0d5ea52d5b44d9d91 Author: Slim Bouguerra Date: 2016-11-28T19:38:25Z format code
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702078#comment-15702078 ] Jesus Camacho Rodriguez commented on HIVE-15277: [~bslim], it seems you need to rebase the patch as it did not apply cleanly on master. In addition, could you create a GitHub PR or [RB post|https://cwiki.apache.org/confluence/display/Hive/Review+Board] so it is easier to review the patch? Thanks > Teach Hive how to create/delete Druid segments > --- > > Key: HIVE-15277 > URL: https://issues.apache.org/jira/browse/HIVE-15277 > Project: Hive > Issue Type: Sub-task > Components: Druid integration >Affects Versions: 2.2.0 >Reporter: slim bouguerra >Assignee: slim bouguerra > Attachments: file.patch > > > We want to extend the DruidStorageHandler to support CTAS queries. > In this implementation Hive will generate druid segment files and insert the > metadata to signal the handoff to druid. > The syntax will be as follows: > {code:sql} > CREATE TABLE druid_table_1 > STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler' > TBLPROPERTIES ("druid.datasource" = "datasourcename") > AS `metric2`>; > {code} > This statement stores the results of query in a Druid > datasource named 'datasourcename'. One of the columns of the query needs to > be the time dimension, which is mandatory in Druid. In particular, we use the > same convention that it is used for Druid: there needs to be a the column > named '__time' in the result of the executed query, which will act as the > time dimension column in Druid. Currently, the time column dimension needs to > be a 'timestamp' type column. > metrics can be of type long, double and float while dimensions are strings. > Keep in mind that druid has a clear separation between dimensions and > metrics, therefore if you have a column in hive that contains number and need > to be presented as dimension use the cast operator to cast as string. 
> This initial implementation interacts with the Druid metadata storage to
> add/remove the table in Druid; the user needs to supply the metadata config as
> --hiveconf hive.druid.metadata.password=XXX --hiveconf
> hive.druid.metadata.username=druid --hiveconf
> hive.druid.metadata.uri=jdbc:mysql://host/druid

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15692034#comment-15692034 ]

Hive QA commented on HIVE-15277:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840348/file.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2275/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2275/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2275/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-11-24 03:06:11.758
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-2275/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-11-24 03:06:11.760
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 3dd28fb HIVE-15180: Extend JSONMessageFactory to store additional information about metadata objects on different table events (Sushanth Sowmyan, Vaibhav Gumashta reviewed by Thejas Nair)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 3dd28fb HIVE-15180: Extend JSONMessageFactory to store additional information about metadata objects on different table events (Sushanth Sowmyan, Vaibhav Gumashta reviewed by Thejas Nair)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-11-24 03:06:12.703
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
Going to apply patch with: patch -p1
patching file common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java
patching file common/src/java/org/apache/hadoop/hive/conf/Constants.java
patching file common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Hunk #1 succeeded at 1929 (offset 4 lines).
Hunk #2 succeeded at 1943 (offset 4 lines).
patching file druid-handler/README.md
patching file druid-handler/pom.xml
patching file druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java
patching file druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandlerUtils.java
patching file druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidOutputFormat.java
patching file druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidQueryBasedInputFormat.java
patching file druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidSplit.java
patching file druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidOutputFormat.java
patching file druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidQueryBasedInputFormat.java
patching file druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidRecordWriter.java
patching file druid-handler/src/java/org/apache/hadoop/hive/druid/io/HiveDruidSplit.java
patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidQueryRecordReader.java
patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSerDe.java
patching file druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSerDeUtils.java
patching file druid-handler/src/test/org/apache/hadoop/hive/druid/DruidStorageHandlerTest.java
patching file druid-handler/src/test/org/apache/hadoop/hive/druid/TestDerbyConnector.java
patching file druid-handler/src/test/org/apache/hadoop/hive/druid/TestHiveDruidQueryBasedInputFormat.java
patching file druid-handler/src/test/org/apache/hadoop/hive/ql/io/DruidRecordWriterTest.java
patching file llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java
patching
[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments
[ https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691812#comment-15691812 ]

Ashutosh Chauhan commented on HIVE-15277:

Need to name patch file per: https://cwiki.apache.org/confluence/display/Hive/Hive+PreCommit+Patch+Testing for automated QA to run tests.

> Teach Hive how to create/delete Druid segments
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
> Issue Type: Bug
> Components: Druid integration
> Affects Versions: 2.2.0
> Reporter: slim bouguerra
> Assignee: slim bouguerra
> Attachments: file.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation, Hive will generate Druid segment files and insert the
> metadata to signal the handoff to Druid.
> The syntax will be as follows:
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS `metric2`>;
> This statement stores the results of the query in a Druid
> datasource named 'datasourcename'. One of the columns of the query needs to
> be the time dimension, which is mandatory in Druid. In particular, we use the
> same convention that is used for Druid: there needs to be a column
> named '__time' in the result of the executed query, which will act as the
> time dimension column in Druid. Currently, the time dimension column needs to
> be a 'timestamp' type column.
> Metrics can be of type long, double, or float, while dimensions are strings.
> Keep in mind that Druid has a clear separation between dimensions and
> metrics; therefore, if you have a column in Hive that contains numbers and needs
> to be presented as a dimension, use the cast operator to cast it to string.
> This initial implementation interacts with the Druid metadata storage to
> add/remove the table in Druid; the user needs to supply the metadata config as
> --hiveconf hive.druid.metadata.password=XXX --hiveconf
> hive.druid.metadata.username=druid --hiveconf
> hive.druid.metadata.uri=jdbc:mysql://host/druid