[
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15702905#comment-15702905
]
ASF GitHub Bot commented on HIVE-15277:
---------------------------------------
GitHub user b-slim opened a pull request:
https://github.com/apache/hive/pull/120
HIVE-15277 Druid storage handler
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/b-slim/hive rebase_druid_record_writer
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/hive/pull/120.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #120
----
commit 9025d4a33348faa007c17f2c7ff5dee4f3a87318
Author: Slim Bouguerra <[email protected]>
Date: 2016-10-26T23:55:34Z
adding druid record writer
bump guava version to 16.0.1
moving out the injector
commit be2e29dcba5617db478eefa75a5478a77512e090
Author: Jesus Camacho Rodriguez <[email protected]>
Date: 2016-11-02T03:21:59Z
Druid time granularity partitioning, serializer and necessary extensions
commit df4036f7f76294dc5599d29cdb760336b0ee9a4f
Author: Jesus Camacho Rodriguez <[email protected]>
Date: 2016-11-02T19:59:52Z
Recognition of dimensions and metrics
patch 1
commit ea76f0ddfa33990d92e061676123c45920ed6dce
Author: Slim Bouguerra <[email protected]>
Date: 2016-11-02T21:18:00Z
adding file schema support
commit 010701be7cf939f6854c9ee113ccf40b20aed32a
Author: Jesus Camacho Rodriguez <[email protected]>
Date: 2016-11-04T19:48:43Z
native storage
new fixes
commit 3d8496299d1d151da59bb6f547ebbc475c329197
Author: Slim Bouguerra <[email protected]>
Date: 2016-11-09T17:57:03Z
using segment output path
commit 2b10b26eb7a5d9a6058c9e1f206c599e54ec88b2
Author: Slim Bouguerra <[email protected]>
Date: 2016-11-16T00:16:10Z
adding check for existing datasource and implement drop table
commit e18b716a438e8b38155d4ab31b7070ae1945f1e4
Author: Slim Bouguerra <[email protected]>
Date: 2016-11-19T00:53:10Z
adding UTs and refactor some code
commit 3b31d16dcb9fd5cdb9eb6d1c994cb3f0c8cd8a33
Author: Slim Bouguerra <[email protected]>
Date: 2016-11-23T23:49:28Z
fix druid version
commit 4b447e56389aab1f45e9b48192068d1a0257a14c
Author: Slim Bouguerra <[email protected]>
Date: 2016-11-28T19:32:02Z
ignore record writer test
commit a7b4f792a5e28b0772addbc0d5ea52d5b44d9d91
Author: Slim Bouguerra <[email protected]>
Date: 2016-11-28T19:38:25Z
format code
----
> Teach Hive how to create/delete Druid segments
> -----------------------------------------------
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
> Issue Type: Sub-task
> Components: Druid integration
> Affects Versions: 2.2.0
> Reporter: slim bouguerra
> Assignee: slim bouguerra
> Attachments: HIVE-15277.2.patch, HIVE-15277.patch, file.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation Hive will generate druid segment files and insert the
> metadata to signal the handoff to druid.
> The syntax will be as follows:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS <select `timecolumn` as `__time`, `dimension1`,`dimension2`, `metric1`,
> `metric2`....>;
> {code}
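> For example, assuming a source Hive table (here hypothetically called 'wiki_edits') with a
> timestamp column and a mix of string and numeric columns, the statement could look as
> follows; the column names are illustrative only:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS
> SELECT `event_time` AS `__time`,  -- timestamp column aliased to the mandatory __time
>        `page`, `language`,        -- string columns become Druid dimensions
>        `added`, `deleted`         -- numeric columns become Druid metrics
> FROM wiki_edits;
> {code}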
> This statement stores the results of the input query in a Druid
> datasource named 'datasourcename'. One of the columns of the query needs to
> be the time dimension, which is mandatory in Druid. In particular, we follow the
> same convention that is used by Druid: there needs to be a column
> named '__time' in the result of the executed query, which will act as the
> time dimension column in Druid. Currently, the time dimension column needs to
> be of 'timestamp' type.
> Metrics can be of type long, double, or float, while dimensions are strings.
> Keep in mind that Druid has a clear separation between dimensions and
> metrics; therefore, if you have a column in Hive that contains numbers and needs
> to be treated as a dimension, use the cast operator to cast it to string.
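> As a sketch of that rule (hypothetical column names), the cast below forces a numeric
> column to be stored as a dimension instead of a metric:
> {code:sql}
> CREATE TABLE druid_table_2
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename_2")
> AS
> SELECT `event_time` AS `__time`,
>        CAST(`zip_code` AS string) AS `zip_code`,  -- numeric column cast to string, so it is a dimension
>        `revenue`                                  -- left numeric, so it is a metric
> FROM sales_events;
> {code}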
> This initial implementation interacts with the Druid metadata storage to
> add/remove the table in Druid; the user needs to supply the metadata config as
> --hiveconf hive.druid.metadata.password=XXX --hiveconf
> hive.druid.metadata.username=druid --hiveconf
> hive.druid.metadata.uri=jdbc:mysql://host/druid
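> For instance, a client session could be started with these settings passed on the command
> line (values are deployment specific and shown only for illustration):
> {code}
> hive --hiveconf hive.druid.metadata.username=druid \
>      --hiveconf hive.druid.metadata.password=XXX \
>      --hiveconf hive.druid.metadata.uri=jdbc:mysql://host/druid
> {code}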
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)