[jira] [Work started] (HIVE-15634) Hive/Druid integration: Timestamp column inconsistent w/o Fetch optimization

2017-02-01 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-15634 started by slim bouguerra.
-
> Hive/Druid integration: Timestamp column inconsistent w/o Fetch optimization
> 
>
> Key: HIVE-15634
> URL: https://issues.apache.org/jira/browse/HIVE-15634
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: slim bouguerra
>Priority: Critical
>
> {{SET hive.tez.java.opts=-Duser.timezone="UTC";}} can be used to change 
> timezone for Tez tasks. However, when Fetch optimizer kicks in because we can 
> push the full query to Druid, we obtain different values for the timestamp 
> than when jobs are executed. This probably has to do with the timezone on the 
> client side. How should we handle this issue?
> For instance, this can be observed with the following query:
> {code:sql}
> set hive.fetch.task.conversion=more;
> SELECT DISTINCT `__time`
> FROM store_sales_sold_time_subset
> WHERE `__time` < '1999-11-10 00:00:00';
> OK
> 1999-10-31 19:00:00
> 1999-11-01 19:00:00
> 1999-11-02 19:00:00
> 1999-11-03 19:00:00
> 1999-11-04 19:00:00
> 1999-11-05 19:00:00
> 1999-11-06 19:00:00
> 1999-11-07 19:00:00
> 1999-11-08 19:00:00
> set hive.fetch.task.conversion=none;
> SELECT DISTINCT `__time`
> FROM store_sales_sold_time_subset
> WHERE `__time` < '1999-11-10 00:00:00';
> OK
> 1999-11-01 00:00:00
> 1999-11-02 00:00:00
> 1999-11-03 00:00:00
> 1999-11-04 00:00:00
> 1999-11-05 00:00:00
> 1999-11-06 00:00:00
> 1999-11-07 00:00:00
> 1999-11-08 00:00:00
> 1999-11-09 00:00:00
> {code}
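
The five-hour gap between the two result sets above is consistent with one path rendering the same instants in UTC and the other rendering them in a UTC-05:00 zone. A minimal Java sketch of that effect follows. It assumes (the report does not say so) that the client JVM sits at a fixed offset of -05:00; the class and method names are illustrative and are not taken from Hive.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Illustrative only: renders one stored instant as wall-clock text in a given zone.
class TimestampRendering {
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    static String render(Instant instant, ZoneId zone) {
        // The formatter needs a zone to turn an Instant into calendar fields.
        return FORMAT.withZone(zone).format(instant);
    }

    public static void main(String[] args) {
        Instant stored = Instant.parse("1999-11-01T00:00:00Z");
        // Path where tasks run with -Duser.timezone="UTC".
        System.out.println(render(stored, ZoneOffset.UTC));
        // Path that uses the client zone (assumed here to be UTC-05:00).
        System.out.println(render(stored, ZoneOffset.ofHours(-5)));
    }
}
```

The stored value is identical in both calls; only the zone used for rendering differs.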



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15634) Hive/Druid integration: Timestamp column inconsistent w/o Fetch optimization

2017-02-01 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849116#comment-15849116
 ] 

slim bouguerra commented on HIVE-15634:
---

Making sure that the timezone is the same as that of the Hive server will solve 
the issue.
Hence, avoid doing {code} SET hive.tez.java.opts=-Duser.timezone="UTC" {code}

> Hive/Druid integration: Timestamp column inconsistent w/o Fetch optimization
> 
>
> Key: HIVE-15634
> URL: https://issues.apache.org/jira/browse/HIVE-15634
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>
> {{SET hive.tez.java.opts=-Duser.timezone="UTC";}} can be used to change 
> timezone for Tez tasks. However, when Fetch optimizer kicks in because we can 
> push the full query to Druid, we obtain different values for the timestamp 
> than when jobs are executed. This probably has to do with the timezone on the 
> client side. How should we handle this issue?
> For instance, this can be observed with the following query:
> {code:sql}
> set hive.fetch.task.conversion=more;
> SELECT DISTINCT `__time`
> FROM store_sales_sold_time_subset
> WHERE `__time` < '1999-11-10 00:00:00';
> OK
> 1999-10-31 19:00:00
> 1999-11-01 19:00:00
> 1999-11-02 19:00:00
> 1999-11-03 19:00:00
> 1999-11-04 19:00:00
> 1999-11-05 19:00:00
> 1999-11-06 19:00:00
> 1999-11-07 19:00:00
> 1999-11-08 19:00:00
> set hive.fetch.task.conversion=none;
> SELECT DISTINCT `__time`
> FROM store_sales_sold_time_subset
> WHERE `__time` < '1999-11-10 00:00:00';
> OK
> 1999-11-01 00:00:00
> 1999-11-02 00:00:00
> 1999-11-03 00:00:00
> 1999-11-04 00:00:00
> 1999-11-05 00:00:00
> 1999-11-06 00:00:00
> 1999-11-07 00:00:00
> 1999-11-08 00:00:00
> 1999-11-09 00:00:00
> {code}





[jira] [Assigned] (HIVE-15634) Hive/Druid integration: Timestamp column inconsistent w/o Fetch optimization

2017-02-01 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra reassigned HIVE-15634:
-

Assignee: slim bouguerra  (was: Jesus Camacho Rodriguez)

> Hive/Druid integration: Timestamp column inconsistent w/o Fetch optimization
> 
>
> Key: HIVE-15634
> URL: https://issues.apache.org/jira/browse/HIVE-15634
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: slim bouguerra
>Priority: Critical
>
> {{SET hive.tez.java.opts=-Duser.timezone="UTC";}} can be used to change 
> timezone for Tez tasks. However, when Fetch optimizer kicks in because we can 
> push the full query to Druid, we obtain different values for the timestamp 
> than when jobs are executed. This probably has to do with the timezone on the 
> client side. How should we handle this issue?
> For instance, this can be observed with the following query:
> {code:sql}
> set hive.fetch.task.conversion=more;
> SELECT DISTINCT `__time`
> FROM store_sales_sold_time_subset
> WHERE `__time` < '1999-11-10 00:00:00';
> OK
> 1999-10-31 19:00:00
> 1999-11-01 19:00:00
> 1999-11-02 19:00:00
> 1999-11-03 19:00:00
> 1999-11-04 19:00:00
> 1999-11-05 19:00:00
> 1999-11-06 19:00:00
> 1999-11-07 19:00:00
> 1999-11-08 19:00:00
> set hive.fetch.task.conversion=none;
> SELECT DISTINCT `__time`
> FROM store_sales_sold_time_subset
> WHERE `__time` < '1999-11-10 00:00:00';
> OK
> 1999-11-01 00:00:00
> 1999-11-02 00:00:00
> 1999-11-03 00:00:00
> 1999-11-04 00:00:00
> 1999-11-05 00:00:00
> 1999-11-06 00:00:00
> 1999-11-07 00:00:00
> 1999-11-08 00:00:00
> 1999-11-09 00:00:00
> {code}





[jira] [Commented] (HIVE-15632) Hive/Druid integration: Incorrect result - Limit on timestamp disappears

2017-02-01 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849115#comment-15849115
 ] 

slim bouguerra commented on HIVE-15632:
---

There is no limit on a timeseries query.

> Hive/Druid integration: Incorrect result - Limit on timestamp disappears
> 
>
> Key: HIVE-15632
> URL: https://issues.apache.org/jira/browse/HIVE-15632
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>
> This can be observed with the following query:
> {code:sql}
> SELECT DISTINCT `__time`
> FROM store_sales_sold_time_subset_hive
> ORDER BY `__time` ASC
> LIMIT 10;
> {code}
> Query is translated correctly to Druid _timeseries_, but _limit_ operator 
> disappears.
> {code}
> OK
> Plan optimized by CBO.
> Stage-0
>   Fetch Operator
>     limit:-1
>     Select Operator [SEL_1]
>       Output:["_col0"]
>       TableScan [TS_0]
>         Output:["__time"],properties:{"druid.query.json":"{\"queryType\":\"timeseries\",\"dataSource\":\"druid_tpcds_ss_sold_time_subset\",\"descending\":false,\"granularity\":\"NONE\",\"aggregations\":[],\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"]}","druid.query.type":"timeseries"}
> {code}
> Thus, result has more than 10 rows.
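
Because the generated timeseries query carries no limit, one option is to keep enforcing the limit on the Hive side, over the rows that Druid returns. The sketch below only illustrates that idea; `applyLimit` and the row type are hypothetical and do not correspond to Hive's operator code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: enforces a row limit after the scan has produced rows.
class LimitAfterScan {
    // A negative limit means "no limit", like the limit:-1 shown in the plan above.
    static <T> List<T> applyLimit(List<T> rows, int limit) {
        if (limit < 0) {
            return rows;
        }
        return rows.stream().limit(limit).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("1999-11-01", "1999-11-02", "1999-11-03");
        System.out.println(applyLimit(rows, 2));
    }
}
```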





[jira] [Updated] (HIVE-15727) Add pre insert work to give storage handler the possibility to perform pre insert checking

2017-02-03 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15727:
--
Attachment: HIVE-15727.3.patch

> Add pre insert work to give storage handler the possibility to perform pre 
> insert checking
> --
>
> Key: HIVE-15727
> URL: https://issues.apache.org/jira/browse/HIVE-15727
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Fix For: 2.2.0
>
> Attachments: HIVE-15727.2.patch, HIVE-15727.3.patch, HIVE-15727.patch
>
>
> Add a pre-insert work stage to give the storage handler the possibility to 
> perform pre-insert checking. For instance, for the Druid storage handler this 
> will block the INSERT INTO statement.
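
The idea of a pre-insert check can be sketched as a hook that runs before any insert work is scheduled and rejects statements the storage handler cannot serve. The interface and class names below are hypothetical and are not the ones introduced by the patch. The sketch also assumes that overwrite stays allowed, as HIVE-15439 suggests.

```java
// Hypothetical hook interface; not the interface added by the patch.
interface PreInsertCheck {
    void preInsert(String tableName, boolean overwrite);
}

// Hypothetical handler-side check that blocks plain INSERT INTO.
class DruidLikePreInsertCheck implements PreInsertCheck {
    @Override
    public void preInsert(String tableName, boolean overwrite) {
        // Appending is rejected up front; overwrite is assumed to be supported.
        if (!overwrite) {
            throw new UnsupportedOperationException(
                "INSERT INTO is not supported for table " + tableName);
        }
    }
}
```

Failing in a dedicated stage before the insert starts means no data is written for a statement that would be rejected anyway.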





[jira] [Commented] (HIVE-15727) Add pre insert work to give storage handler the possibility to perform pre insert checking

2017-02-03 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852195#comment-15852195
 ] 

slim bouguerra commented on HIVE-15727:
---

[~ashutoshc] you are right, I haven't rebased my branch.

> Add pre insert work to give storage handler the possibility to perform pre 
> insert checking
> --
>
> Key: HIVE-15727
> URL: https://issues.apache.org/jira/browse/HIVE-15727
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Fix For: 2.2.0
>
> Attachments: HIVE-15727.2.patch, HIVE-15727.3.patch, 
> HIVE-15727.4.patch, HIVE-15727.patch
>
>
> Add a pre-insert work stage to give the storage handler the possibility to 
> perform pre-insert checking. For instance, for the Druid storage handler this 
> will block the INSERT INTO statement.





[jira] [Assigned] (HIVE-15809) Typo in the PostgreSQL database name for druid service

2017-02-03 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra reassigned HIVE-15809:
-


> Typo in the PostgreSQL database name for druid service
> --
>
> Key: HIVE-15809
> URL: https://issues.apache.org/jira/browse/HIVE-15809
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Trivial
> Fix For: 2.2.0
>
>






[jira] [Updated] (HIVE-15727) Add pre insert work to give storage handler the possibility to perform pre insert checking

2017-02-03 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15727:
--
Attachment: HIVE-15727.4.patch

> Add pre insert work to give storage handler the possibility to perform pre 
> insert checking
> --
>
> Key: HIVE-15727
> URL: https://issues.apache.org/jira/browse/HIVE-15727
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Fix For: 2.2.0
>
> Attachments: HIVE-15727.2.patch, HIVE-15727.3.patch, 
> HIVE-15727.4.patch, HIVE-15727.patch
>
>
> Add a pre-insert work stage to give the storage handler the possibility to 
> perform pre-insert checking. For instance, for the Druid storage handler this 
> will block the INSERT INTO statement.





[jira] [Updated] (HIVE-15809) Typo in the PostgreSQL database name for druid service

2017-02-03 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15809:
--
Attachment: HIVE-15809.patch

> Typo in the PostgreSQL database name for druid service
> --
>
> Key: HIVE-15809
> URL: https://issues.apache.org/jira/browse/HIVE-15809
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Trivial
> Fix For: 2.2.0
>
> Attachments: HIVE-15809.patch
>
>






[jira] [Updated] (HIVE-15809) Typo in the PostgreSQL database name for druid service

2017-02-03 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15809:
--
Status: Patch Available  (was: Open)

> Typo in the PostgreSQL database name for druid service
> --
>
> Key: HIVE-15809
> URL: https://issues.apache.org/jira/browse/HIVE-15809
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Trivial
> Fix For: 2.2.0
>
> Attachments: HIVE-15809.patch
>
>






[jira] [Commented] (HIVE-15850) Proper handling of timezone in Druid storage handler

2017-02-08 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858171#comment-15858171
 ] 

slim bouguerra commented on HIVE-15850:
---

@jcamachor most of the change in this PR is rewriting the implementation from 
`long` to `DateTime`. I might be missing something, but I am not sure what 
exactly it is fixing. It would be nice if you laid out a test scenario that 
breaks the old code and gets fixed by this patch.

> Proper handling of timezone in Druid storage handler
> 
>
> Key: HIVE-15850
> URL: https://issues.apache.org/jira/browse/HIVE-15850
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Attachments: HIVE-15850.patch
>
>
> We need to make sure that filters on timestamp are represented with a timezone 
> when we go into Calcite, and that they are converted again when we go back from 
> Calcite to Hive. That would help us to 1) push the correct filters to Druid, 
> and 2) if filters are not pushed at all (they remain in the Calcite plan), 
> they will be correctly represented in Hive. I have checked and AFAIK this is 
> currently done correctly (ASTBuilder.java, ExprNodeConverter.java, and 
> RexNodeConverter.java).
> Secondly, we need to make sure we read/write timestamp data correctly from/to 
> Druid.
> - When we write timestamp to Druid, we should include the timezone, which 
> would allow Druid to handle them properly. We do that already.
> - When we read timestamp from Druid, we should transform the timestamp to be 
> based on Hive timezone. This will give us a consistent behavior of 
> Druid-on-Hive vs Hive-standalone, since timestamp in Hive is represented to 
> the user using Hive client timezone. Currently we do not do that.
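
The read-side conversion described above can be sketched in Java as follows. It assumes Druid returns timestamps as ISO-8601 strings in UTC, the same shape as the interval strings in the generated queries. `DruidTimestampReader`, `toHiveLocal`, and the zone used in `main` are hypothetical and chosen only for illustration.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical helper: converts a UTC timestamp string into Hive-side wall-clock time.
class DruidTimestampReader {
    static LocalDateTime toHiveLocal(String isoUtc, ZoneId hiveZone) {
        // Parse the absolute instant first, then view it in the target zone.
        return Instant.parse(isoUtc).atZone(hiveZone).toLocalDateTime();
    }

    public static void main(String[] args) {
        // The zone is an assumption; a real reader would take it from the Hive session.
        ZoneId hiveZone = ZoneOffset.ofHours(-5);
        System.out.println(toHiveLocal("1999-11-01T00:00:00.000Z", hiveZone));
    }
}
```

Doing the conversion in one place on the read path keeps the displayed value independent of which execution path produced the rows.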





[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.

2017-01-22 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15439:
--
Attachment: HIVE-15439.6.patch

> Support INSERT OVERWRITE for internal druid datasources.
> 
>
> Key: HIVE-15439
> URL: https://issues.apache.org/jira/browse/HIVE-15439
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15439.3.patch, HIVE-15439.4.patch, 
> HIVE-15439.5.patch, HIVE-15439.6.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch
>
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table.
> To add this support, we will need to add a new post-insert hook to update the 
> Druid metadata. Creation of the segment will be the same as for CTAS.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.

2017-01-25 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837866#comment-15837866
 ] 

slim bouguerra commented on HIVE-15439:
---

[~leftylev] Yes, it needs to be added, I guess.

> Support INSERT OVERWRITE for internal druid datasources.
> 
>
> Key: HIVE-15439
> URL: https://issues.apache.org/jira/browse/HIVE-15439
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Fix For: 2.2.0
>
> Attachments: HIVE-15439.3.patch, HIVE-15439.4.patch, 
> HIVE-15439.5.patch, HIVE-15439.6.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch
>
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table.
> To add this support, we will need to add a new post-insert hook to update the 
> Druid metadata. Creation of the segment will be the same as for CTAS.
>  





[jira] [Updated] (HIVE-15727) Add pre insert work to give storage handler the possibility to perform pre insert checking

2017-01-25 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15727:
--
Attachment: HIVE-15727.patch

> Add pre insert work to give storage handler the possibility to perform pre 
> insert checking
> --
>
> Key: HIVE-15727
> URL: https://issues.apache.org/jira/browse/HIVE-15727
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Fix For: 2.2.0
>
> Attachments: HIVE-15727.patch
>
>
> Add a pre-insert work stage to give the storage handler the possibility to 
> perform pre-insert checking. For instance, for the Druid storage handler this 
> will block the INSERT INTO statement.





[jira] [Updated] (HIVE-15727) Add pre insert work to give storage handler the possibility to perform pre insert checking

2017-01-25 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15727:
--
Status: Patch Available  (was: Open)

> Add pre insert work to give storage handler the possibility to perform pre 
> insert checking
> --
>
> Key: HIVE-15727
> URL: https://issues.apache.org/jira/browse/HIVE-15727
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Fix For: 2.2.0
>
> Attachments: HIVE-15727.patch
>
>
> Add a pre-insert work stage to give the storage handler the possibility to 
> perform pre-insert checking. For instance, for the Druid storage handler this 
> will block the INSERT INTO statement.





[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.

2017-01-20 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15439:
--
Attachment: HIVE-15439.5.patch

> Support INSERT OVERWRITE for internal druid datasources.
> 
>
> Key: HIVE-15439
> URL: https://issues.apache.org/jira/browse/HIVE-15439
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15439.3.patch, HIVE-15439.4.patch, 
> HIVE-15439.5.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch
>
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table.
> To add this support, we will need to add a new post-insert hook to update the 
> Druid metadata. Creation of the segment will be the same as for CTAS.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.

2017-01-20 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15439:
--
Attachment: HIVE-15439.4.patch

> Support INSERT OVERWRITE for internal druid datasources.
> 
>
> Key: HIVE-15439
> URL: https://issues.apache.org/jira/browse/HIVE-15439
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15439.3.patch, HIVE-15439.4.patch, 
> HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch, HIVE-15439.patch
>
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table.
> To add this support, we will need to add a new post-insert hook to update the 
> Druid metadata. Creation of the segment will be the same as for CTAS.
>  





[jira] [Commented] (HIVE-15586) Make Insert and Create statement Transactional

2017-01-20 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832647#comment-15832647
 ] 

slim bouguerra commented on HIVE-15586:
---

[~ashutoshc] thanks for the review.

> Make Insert and Create statement Transactional
> --
>
> Key: HIVE-15586
> URL: https://issues.apache.org/jira/browse/HIVE-15586
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Fix For: 2.2.0
>
> Attachments: HIVE-15586.2.patch, HIVE-15586.patch, HIVE-15586.patch, 
> HIVE-15586.patch
>
>
> Currently, insert/create returns the handle to the user without waiting for 
> the data to be loaded by the Druid cluster. To avoid that, we will add a 
> passive wait until the segments are loaded by the historical nodes, in case 
> the coordinator is UP.
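
A passive wait of this kind is typically a bounded polling loop. The Java sketch below shows only the control flow. The `loaded` check stands in for whatever call reports segment status, and all names and parameters are hypothetical rather than taken from the patch.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: polls a status check until it succeeds or the tries run out.
class SegmentLoadWait {
    static boolean waitUntilLoaded(BooleanSupplier loaded, int maxTries, long sleepMillis) {
        for (int attempt = 0; attempt < maxTries; attempt++) {
            if (loaded.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] polls = {0};
        // Stand-in status check that reports success on the third poll.
        boolean ok = waitUntilLoaded(() -> ++polls[0] >= 3, 5, 10L);
        System.out.println(ok + " after " + polls[0] + " polls");
    }
}
```

Bounding the number of tries keeps the statement from hanging forever if the segments are never loaded.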





[jira] [Updated] (HIVE-15727) Add pre insert work to give storage handler the possibility to perform pre insert checking

2017-01-26 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15727:
--
Attachment: HIVE-15727.2.patch

> Add pre insert work to give storage handler the possibility to perform pre 
> insert checking
> --
>
> Key: HIVE-15727
> URL: https://issues.apache.org/jira/browse/HIVE-15727
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Fix For: 2.2.0
>
> Attachments: HIVE-15727.2.patch, HIVE-15727.patch
>
>
> Add a pre-insert work stage to give the storage handler the possibility to 
> perform pre-insert checking. For instance, for the Druid storage handler this 
> will block the INSERT INTO statement.





[jira] [Commented] (HIVE-15727) Add pre insert work to give storage handler the possibility to perform pre insert checking

2017-01-25 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838388#comment-15838388
 ] 

slim bouguerra commented on HIVE-15727:
---

https://github.com/apache/hive/pull/139

> Add pre insert work to give storage handler the possibility to perform pre 
> insert checking
> --
>
> Key: HIVE-15727
> URL: https://issues.apache.org/jira/browse/HIVE-15727
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Fix For: 2.2.0
>
>
> Add a pre-insert work stage to give the storage handler the possibility to 
> perform pre-insert checking. For instance, for the Druid storage handler this 
> will block the INSERT INTO statement.





[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.

2017-01-25 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15439:
--
Labels: TODOC2.2  (was: )

> Support INSERT OVERWRITE for internal druid datasources.
> 
>
> Key: HIVE-15439
> URL: https://issues.apache.org/jira/browse/HIVE-15439
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-15439.3.patch, HIVE-15439.4.patch, 
> HIVE-15439.5.patch, HIVE-15439.6.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch
>
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table.
> To add this support, we will need to add a new post-insert hook to update the 
> Druid metadata. Creation of the segment will be the same as for CTAS.
>  





[jira] [Commented] (HIVE-15990) Always initialize connection properties in DruidSerDe

2017-02-21 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876669#comment-15876669
 ] 

slim bouguerra commented on HIVE-15990:
---

I have tested this and it works. Thanks

> Always initialize connection properties in DruidSerDe
> -
>
> Key: HIVE-15990
> URL: https://issues.apache.org/jira/browse/HIVE-15990
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Attachments: HIVE-15990.patch
>
>






[jira] [Updated] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-22 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15951:
--
Attachment: HIVE-15951.2.patch

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.2.patch, HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or be shared 
> between reducers in the same physical VM.
> That will lead to job failures until the directory is cleaned.
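
One way to get a directory that is both unique and cleaned up is to create it with a random suffix and delete it when the writer is closed. The Java sketch below illustrates that pattern with the standard `java.nio.file` API. `UniquePersistDir` and the prefix are hypothetical and are not the code from the patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical wrapper: owns a unique directory and removes it on close.
class UniquePersistDir implements AutoCloseable {
    private final Path dir;

    UniquePersistDir(String prefix) {
        this.dir = create(prefix);
    }

    private static Path create(String prefix) {
        try {
            // createTempDirectory appends a random suffix, so two writers never share a path.
            return Files.createTempDirectory(prefix);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Path path() {
        return dir;
    }

    @Override
    public void close() {
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order visits children before their parent directory.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.deleteIfExists(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Used in a try-with-resources block, the directory is removed whether or not the task finishes successfully.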





[jira] [Commented] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-22 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878627#comment-15878627
 ] 

slim bouguerra commented on HIVE-15951:
---

[~ashutoshc] please check out the new fix. The delete is done in the close call.

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.2.patch, HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or shared 
> between reducer in the same physical VM.
> That will lead to the failure of the job till that the directory is cleaned.





[jira] [Commented] (HIVE-15969) Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer

2017-02-17 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872386#comment-15872386
 ] 

slim bouguerra commented on HIVE-15969:
---

After talking with [~thejas] offline, we decided it is better to catch the 
exception from the finally block and bubble up the exception from the create 
table function.

> Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer
> ---
>
> Key: HIVE-15969
> URL: https://issues.apache.org/jira/browse/HIVE-15969
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.2.0
>Reporter: Thejas M Nair
>Assignee: slim bouguerra
> Attachments: HIVE-15969.patch
>
>
> Looks like the additional failures in TestRemoteHiveMetaStore, 
> TestSetUGIOnOnlyServer
>  are related to this patch in HIVE-15877. 
> I don't think that change was intended.
> {code}
> --- 
> metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> +++ 
> metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> @@ -739,7 +739,10 @@ public void createTable(Table tbl, EnvironmentContext 
> envContext) throws Already
>  hook.commitCreateTable(tbl);
>}
>success = true;
> -} finally {
> +} catch (Exception e){
> +  LOG.error("Got exception from createTable", e);
> +}
> +finally {
>if (!success && (hook != null)) {
>  hook.rollbackCreateTable(tbl);
>}
> {code}
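
The behavioral difference introduced by the added catch can be reproduced in isolation. In the Java sketch below, `createSwallowing` mirrors the shape of the diff above: it logs and does not rethrow, so the caller never sees the failure. `createPropagating` shows one possible reading of the alternative, where a failure during rollback is caught and the original exception still reaches the caller. All names are hypothetical; this is not the HiveMetaStoreClient code.

```java
// Hypothetical stand-in for the create/commit/rollback flow in the diff.
class CreateTableFlow {
    // Logs the failure but returns normally, hiding it from the caller.
    static void createSwallowing(Runnable create, Runnable rollback) {
        boolean success = false;
        try {
            create.run();
            success = true;
        } catch (RuntimeException e) {
            System.err.println("Got exception from createTable: " + e);
        } finally {
            if (!success) {
                rollback.run();
            }
        }
    }

    // Lets the create failure propagate; only rollback failures are contained.
    static void createPropagating(Runnable create, Runnable rollback) {
        boolean success = false;
        try {
            create.run();
            success = true;
        } finally {
            if (!success) {
                try {
                    rollback.run();
                } catch (RuntimeException rollbackError) {
                    System.err.println("Rollback failed: " + rollbackError);
                }
            }
        }
    }
}
```

A test that expects an exception from a failed create passes with the second variant and fails with the first, which would match the kind of unit test failures reported here.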





[jira] [Updated] (HIVE-15969) Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer

2017-02-17 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15969:
--
Attachment: HIVE-15969.2.patch

> Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer
> ---
>
> Key: HIVE-15969
> URL: https://issues.apache.org/jira/browse/HIVE-15969
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.2.0
>Reporter: Thejas M Nair
>Assignee: slim bouguerra
> Attachments: HIVE-15969.2.patch, HIVE-15969.patch
>
>
> Looks like the additional failures in TestRemoteHiveMetaStore, 
> TestSetUGIOnOnlyServer
>  are related to this patch in HIVE-15877. 
> I don't think that change was intended.
> {code}
> --- 
> metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> +++ 
> metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> @@ -739,7 +739,10 @@ public void createTable(Table tbl, EnvironmentContext 
> envContext) throws Already
>  hook.commitCreateTable(tbl);
>}
>success = true;
> -} finally {
> +} catch (Exception e){
> +  LOG.error("Got exception from createTable", e);
> +}
> +finally {
>if (!success && (hook != null)) {
>  hook.rollbackCreateTable(tbl);
>}
> {code}





[jira] [Updated] (HIVE-15969) Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer

2017-02-17 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15969:
--
Status: Patch Available  (was: Open)

> Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer
> ---
>
> Key: HIVE-15969
> URL: https://issues.apache.org/jira/browse/HIVE-15969
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.2.0
>Reporter: Thejas M Nair
>Assignee: slim bouguerra
>
> Looks like the additional failures in TestRemoteHiveMetaStore, 
> TestSetUGIOnOnlyServer
>  are related to this patch in HIVE-15877. 
> I don't think that change was intended.
> {code}
> --- metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> +++ metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> @@ -739,7 +739,10 @@ public void createTable(Table tbl, EnvironmentContext envContext) throws Already
>  hook.commitCreateTable(tbl);
>}
>success = true;
> -} finally {
> +} catch (Exception e){
> +  LOG.error("Got exception from createTable", e);
> +}
> +finally {
>if (!success && (hook != null)) {
>  hook.rollbackCreateTable(tbl);
>}
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15969) Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer

2017-02-17 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872321#comment-15872321
 ] 

slim bouguerra commented on HIVE-15969:
---

[~thejas] That change only logs the exception, so I am not sure why it is 
failing the UTs. I will take a look.
If you want to remove it you can, but I would like to have it in the future. 

> Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer
> ---
>
> Key: HIVE-15969
> URL: https://issues.apache.org/jira/browse/HIVE-15969
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.2.0
>Reporter: Thejas M Nair
>
> Looks like the additional failures in TestRemoteHiveMetaStore, 
> TestSetUGIOnOnlyServer
>  are related to this patch in HIVE-15877. 
> I don't think that change was intended.
> {code}
> --- metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> +++ metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> @@ -739,7 +739,10 @@ public void createTable(Table tbl, EnvironmentContext envContext) throws Already
>  hook.commitCreateTable(tbl);
>}
>success = true;
> -} finally {
> +} catch (Exception e){
> +  LOG.error("Got exception from createTable", e);
> +}
> +finally {
>if (!success && (hook != null)) {
>  hook.rollbackCreateTable(tbl);
>}
> {code}
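
A note on the diff above, offered as a sketch rather than a confirmed diagnosis of the test failures: a catch block that logs without rethrowing also stops the exception from reaching the caller, so any caller (or test) that expects createTable to throw would no longer see the failure. The class and method names below are hypothetical stand-ins, not Hive code.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SwallowedExceptionSketch {

    // Hypothetical stand-in for a metastore call that fails.
    static void failingCreate() throws Exception {
        throw new Exception("simulated createTable failure");
    }

    // Shape before the change: try/finally only, so the exception propagates.
    static void createWithFinally(AtomicBoolean rolledBack) throws Exception {
        boolean success = false;
        try {
            failingCreate();
            success = true;
        } finally {
            if (!success) {
                rolledBack.set(true); // stands in for hook.rollbackCreateTable(tbl)
            }
        }
    }

    // Shape after the change: the catch logs and does not rethrow,
    // so the method returns normally even though the call failed.
    static void createWithCatch(AtomicBoolean rolledBack) {
        boolean success = false;
        try {
            failingCreate();
            success = true;
        } catch (Exception e) {
            System.err.println("Got exception from createTable: " + e.getMessage());
        } finally {
            if (!success) {
                rolledBack.set(true);
            }
        }
    }

    public static void main(String[] args) {
        try {
            createWithFinally(new AtomicBoolean(false));
            System.out.println("try/finally: no exception reached the caller");
        } catch (Exception e) {
            System.out.println("try/finally: caller saw " + e.getMessage());
        }
        createWithCatch(new AtomicBoolean(false));
        System.out.println("try/catch/finally: caller saw nothing");
    }
}
```

In both shapes the rollback still runs. The difference is only whether the caller learns about the failure; rethrowing from the catch block after logging would keep both the log line and the original behavior.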



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15969) Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer

2017-02-17 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15969:
--
Attachment: HIVE-15969.patch

> Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer
> ---
>
> Key: HIVE-15969
> URL: https://issues.apache.org/jira/browse/HIVE-15969
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.2.0
>Reporter: Thejas M Nair
>Assignee: slim bouguerra
> Attachments: HIVE-15969.patch
>
>
> Looks like the additional failures in TestRemoteHiveMetaStore, 
> TestSetUGIOnOnlyServer
>  are related to this patch in HIVE-15877. 
> I don't think that change was intended.
> {code}
> --- metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> +++ metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> @@ -739,7 +739,10 @@ public void createTable(Table tbl, EnvironmentContext envContext) throws Already
>  hook.commitCreateTable(tbl);
>}
>success = true;
> -} finally {
> +} catch (Exception e){
> +  LOG.error("Got exception from createTable", e);
> +}
> +finally {
>if (!success && (hook != null)) {
>  hook.rollbackCreateTable(tbl);
>}
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-15969) Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer

2017-02-17 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra reassigned HIVE-15969:
-

Assignee: slim bouguerra

> Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer
> ---
>
> Key: HIVE-15969
> URL: https://issues.apache.org/jira/browse/HIVE-15969
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.2.0
>Reporter: Thejas M Nair
>Assignee: slim bouguerra
>
> Looks like the additional failures in TestRemoteHiveMetaStore, 
> TestSetUGIOnOnlyServer
>  are related to this patch in HIVE-15877. 
> I don't think that change was intended.
> {code}
> --- metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> +++ metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> @@ -739,7 +739,10 @@ public void createTable(Table tbl, EnvironmentContext envContext) throws Already
>  hook.commitCreateTable(tbl);
>}
>success = true;
> -} finally {
> +} catch (Exception e){
> +  LOG.error("Got exception from createTable", e);
> +}
> +finally {
>if (!success && (hook != null)) {
>  hook.rollbackCreateTable(tbl);
>}
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-16 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra reassigned HIVE-15951:
-

Assignee: slim bouguerra

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
>
> In some cases the base persist directory will contain old data or be shared 
> between reducers in the same physical VM.
> That will cause the job to fail until the directory is cleaned.
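
The idea in the description can be sketched as follows. This is a minimal illustration under stated assumptions, not the attached patch: `PersistDirSketch`, `createUniqueDir`, and `deleteRecursively` are hypothetical names, and only standard `java.nio.file` calls are used. Each task gets its own freshly created directory under a common parent, and the directory is removed explicitly once the task is done.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

public class PersistDirSketch {

    // Creates a new, uniquely named directory under the given parent.
    static Path createUniqueDir(Path parent, String prefix) throws IOException {
        Files.createDirectories(parent);
        return Files.createTempDirectory(parent, prefix);
    }

    // Deletes a directory tree, children before parents.
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical parent location, chosen only for the demo.
        Path parent = Paths.get(System.getProperty("java.io.tmpdir"), "persist-sketch");
        Path taskDir = createUniqueDir(parent, "task-");
        try {
            Files.write(taskDir.resolve("segment.bin"), new byte[] {1, 2, 3});
            System.out.println("working in " + taskDir);
        } finally {
            deleteRecursively(taskDir); // clean up even if the task fails
        }
    }
}
```

Deleting in a `finally` block does not rely on JVM shutdown. As a general Java fact, `File.deleteOnExit()` only runs on normal JVM termination and cannot remove a directory that still has contents.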



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-16 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870792#comment-15870792
 ] 

slim bouguerra commented on HIVE-15951:
---

[~ashutoshc] can you please check out this bug?

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or be shared 
> between reducers in the same physical VM.
> That will cause the job to fail until the directory is cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-16 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15951:
--
Attachment: HIVE-15951.patch

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or be shared 
> between reducers in the same physical VM.
> That will cause the job to fail until the directory is cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Work started] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-16 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-15951 started by slim bouguerra.
-
> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or be shared 
> between reducers in the same physical VM.
> That will cause the job to fail until the directory is cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-17 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872796#comment-15872796
 ] 

slim bouguerra commented on HIVE-15951:
---

[~ashutoshc] I am not sure I follow. Is it the delete-on-exit that 
won't work?

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or be shared 
> between reducers in the same physical VM.
> That will cause the job to fail until the directory is cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-22 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879355#comment-15879355
 ] 

slim bouguerra commented on HIVE-15951:
---

[~ashutoshc] Valid point, but Hive common uses that method as well, so I 
think it is OK to use it.
https://github.com/b-slim/hive/blob/38ad77929980dc155dcc4a5d009a9a855eb5b017/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L755-L755


> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.2.patch, HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or be shared 
> between reducers in the same physical VM.
> That will cause the job to fail until the directory is cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-23 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15951:
--
Status: Patch Available  (was: In Progress)

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.2.patch, HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or be shared 
> between reducers in the same physical VM.
> That will cause the job to fail until the directory is cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15702) Test timeout : TestDerbyConnector

2017-02-25 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884251#comment-15884251
 ] 

slim bouguerra commented on HIVE-15702:
---

[~ashutoshc] I have checked that all renamed tests are run now. Thanks.

> Test timeout : TestDerbyConnector 
> --
>
> Key: HIVE-15702
> URL: https://issues.apache.org/jira/browse/HIVE-15702
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Thejas M Nair
>Assignee: slim bouguerra
> Attachments: HIVE-15702.patch
>
>
> TestDerbyConnector seems to time out quite frequently (based on a search 
> of the hive-issues mailing list test output).
> This was seen with HIVE-15579 - 
> bq. TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
> https://builds.apache.org/job/PreCommit-HIVE-Build/3108/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15702) Test timeout : TestDerbyConnector

2017-02-24 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15702:
--
Status: Patch Available  (was: Open)

> Test timeout : TestDerbyConnector 
> --
>
> Key: HIVE-15702
> URL: https://issues.apache.org/jira/browse/HIVE-15702
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Thejas M Nair
>Assignee: slim bouguerra
>
> TestDerbyConnector seems to time out quite frequently (based on a search 
> of the hive-issues mailing list test output).
> This was seen with HIVE-15579 - 
> bq. TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
> https://builds.apache.org/job/PreCommit-HIVE-Build/3108/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15702) Test timeout : TestDerbyConnector

2017-02-24 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883753#comment-15883753
 ] 

slim bouguerra commented on HIVE-15702:
---

@all Sorry, not sure how I missed this. I confirm what Ashutosh Chauhan said: 
both need to be renamed.


> Test timeout : TestDerbyConnector 
> --
>
> Key: HIVE-15702
> URL: https://issues.apache.org/jira/browse/HIVE-15702
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Thejas M Nair
>Assignee: slim bouguerra
>
> TestDerbyConnector seems to time out quite frequently (based on a search 
> of the hive-issues mailing list test output).
> This was seen with HIVE-15579 - 
> bq. TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
> https://builds.apache.org/job/PreCommit-HIVE-Build/3108/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15702) Test timeout : TestDerbyConnector

2017-02-24 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15702:
--
Attachment: HIVE-15702.patch

> Test timeout : TestDerbyConnector 
> --
>
> Key: HIVE-15702
> URL: https://issues.apache.org/jira/browse/HIVE-15702
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Thejas M Nair
>Assignee: slim bouguerra
> Attachments: HIVE-15702.patch
>
>
> TestDerbyConnector seems to time out quite frequently (based on a search 
> of the hive-issues mailing list test output).
> This was seen with HIVE-15579 - 
> bq. TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
> https://builds.apache.org/job/PreCommit-HIVE-Build/3108/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-22 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15951:
--
Attachment: HIVE-15951.2.patch

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.2.patch, HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or be shared 
> between reducers in the same physical VM.
> That will cause the job to fail until the directory is cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-22 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15951:
--
Attachment: (was: HIVE-15951.2.patch)

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or be shared 
> between reducers in the same physical VM.
> That will cause the job to fail until the directory is cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15877) Upload dependency jars for druid storage handler

2017-02-10 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15877:
--
Component/s: Druid integration

> Upload dependency jars for druid storage handler
> 
>
> Key: HIVE-15877
> URL: https://issues.apache.org/jira/browse/HIVE-15877
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>
> Upload dependency jars for druid storage handler



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15877) Upload dependency jars for druid storage handler

2017-02-10 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15877:
--
Affects Version/s: 2.2.0

> Upload dependency jars for druid storage handler
> 
>
> Key: HIVE-15877
> URL: https://issues.apache.org/jira/browse/HIVE-15877
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>
> Upload dependency jars for druid storage handler



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-15877) Upload dependency jars for druid storage handler

2017-02-10 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra reassigned HIVE-15877:
-

Assignee: slim bouguerra

> Upload dependency jars for druid storage handler
> 
>
> Key: HIVE-15877
> URL: https://issues.apache.org/jira/browse/HIVE-15877
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>
> Upload dependency jars for druid storage handler



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15877) Upload dependency jars for druid storage handler

2017-02-10 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15861722#comment-15861722
 ] 

slim bouguerra commented on HIVE-15877:
---

[~ashutoshc] can you please review this?


> Upload dependency jars for druid storage handler
> 
>
> Key: HIVE-15877
> URL: https://issues.apache.org/jira/browse/HIVE-15877
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15877.patch
>
>
> Upload dependency jars for druid storage handler



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15877) Upload dependency jars for druid storage handler

2017-02-10 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15877:
--
Attachment: HIVE-15877.patch

> Upload dependency jars for druid storage handler
> 
>
> Key: HIVE-15877
> URL: https://issues.apache.org/jira/browse/HIVE-15877
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15877.patch
>
>
> Upload dependency jars for druid storage handler



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15877) Upload dependency jars for druid storage handler

2017-02-10 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15877:
--
Status: Patch Available  (was: Open)

> Upload dependency jars for druid storage handler
> 
>
> Key: HIVE-15877
> URL: https://issues.apache.org/jira/browse/HIVE-15877
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15877.patch
>
>
> Upload dependency jars for druid storage handler



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15914) Fix issues with druid-handler pom file

2017-02-14 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866206#comment-15866206
 ] 

slim bouguerra commented on HIVE-15914:
---

[~jcamachorodriguez] thanks LGTM 

> Fix issues with druid-handler pom file
> --
>
> Key: HIVE-15914
> URL: https://issues.apache.org/jira/browse/HIVE-15914
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-15914.patch
>
>
> Patch fixes multiple issues, including warnings when Hive is compiled due to 
> multiple definitions of the same dependency (joda-time).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15850) Proper handling of timezone in Druid storage handler

2017-02-09 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859821#comment-15859821
 ] 

slim bouguerra commented on HIVE-15850:
---

[~jcamachorodriguez] thanks, LGTM.
One more question: can you confirm that UDFs such as 
to_utc_timestamp(timestamp, string timezone) still work with this change?

> Proper handling of timezone in Druid storage handler
> 
>
> Key: HIVE-15850
> URL: https://issues.apache.org/jira/browse/HIVE-15850
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Attachments: HIVE-15850.patch
>
>
> We need to make sure that filters on timestamp are passed to Druid with 
> correct timezone.
> After CALCITE-1617, Calcite will generate a Druid query with intervals 
> without timezone specification. In Druid, these intervals will be assumed to 
> be in UTC (if Druid is running in UTC, which is currently the 
> recommendation). However, in Hive, those intervals should be assumed to be in 
> the user timezone. Thus, we should respect Hive semantics and include the 
> user timezone in the intervals passed to Druid.
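
To make the point about interval semantics concrete, here is a small sketch using only `java.time`. It is an illustration under assumptions, not the code in the attached patch: `IntervalTimezoneSketch`, `withUserZone`, and `interval` are hypothetical names, and the zone `America/New_York` is an arbitrary example of a user timezone. A timestamp literal without a zone is interpreted in the user's zone and emitted with an explicit offset, so the receiving side does not fall back to UTC.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class IntervalTimezoneSketch {

    // Interprets a zone-less ISO timestamp in the given zone and
    // renders it with an explicit UTC offset.
    static String withUserZone(String localTimestamp, String zoneId) {
        LocalDateTime local = LocalDateTime.parse(localTimestamp);
        return local.atZone(ZoneId.of(zoneId))
                .format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }

    // Builds an ISO-8601 "start/end" interval with explicit offsets.
    static String interval(String start, String end, String zoneId) {
        return withUserZone(start, zoneId) + "/" + withUserZone(end, zoneId);
    }

    public static void main(String[] args) {
        // Same wall-clock bounds, two different user timezones.
        System.out.println(interval("1999-11-01T00:00:00", "1999-11-10T00:00:00", "UTC"));
        System.out.println(interval("1999-11-01T00:00:00", "1999-11-10T00:00:00", "America/New_York"));
    }
}
```

The two printed intervals cover different absolute time ranges even though the wall-clock bounds are identical, which is the discrepancy the issue describes when the zone is dropped.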



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15877) Upload dependency jars for druid storage handler

2017-02-16 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15877:
--
Attachment: HIVE-15877.2.patch

> Upload dependency jars for druid storage handler
> 
>
> Key: HIVE-15877
> URL: https://issues.apache.org/jira/browse/HIVE-15877
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15877.2.patch, HIVE-15877.patch
>
>
> Upload dependency jars for druid storage handler



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15877) Upload dependency jars for druid storage handler

2017-02-16 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870623#comment-15870623
 ] 

slim bouguerra commented on HIVE-15877:
---

[~ashutoshc] Thanks, I have used the suggested method.

> Upload dependency jars for druid storage handler
> 
>
> Key: HIVE-15877
> URL: https://issues.apache.org/jira/browse/HIVE-15877
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15877.2.patch, HIVE-15877.patch
>
>
> Upload dependency jars for druid storage handler



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15612) Include Calcite dependency in Druid storage handler jar

2017-01-18 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829072#comment-15829072
 ] 

slim bouguerra commented on HIVE-15612:
---

+1

> Include Calcite dependency in Druid storage handler jar
> ---
>
> Key: HIVE-15612
> URL: https://issues.apache.org/jira/browse/HIVE-15612
> Project: Hive
>  Issue Type: Improvement
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-15612.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15586) Make Insert and Create statement Transactional

2017-01-18 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15586:
--
Attachment: HIVE-15586.patch

> Make Insert and Create statement Transactional
> --
>
> Key: HIVE-15586
> URL: https://issues.apache.org/jira/browse/HIVE-15586
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15586.patch, HIVE-15586.patch, HIVE-15586.patch
>
>
> Currently insert/create returns the handle to the user without waiting for 
> the data to be loaded by the Druid cluster. To avoid that, we will add a 
> passive wait until the segments are loaded by the historicals, in case the 
> coordinator is up.
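
A passive wait of the kind described can be sketched as a bounded polling loop. This is an assumption-laden illustration, not the attached patch: `PassiveWaitSketch` and `waitUntil` are hypothetical names, and the `BooleanSupplier` stands in for whatever check would ask the coordinator whether the segments are loaded.

```java
import java.util.function.BooleanSupplier;

public class PassiveWaitSketch {

    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition became true in time, false otherwise.
    static boolean waitUntil(BooleanSupplier loaded, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (loaded.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical check that succeeds on the third poll.
        int[] calls = {0};
        boolean ok = waitUntil(() -> ++calls[0] >= 3, 5_000, 10);
        System.out.println("segments loaded: " + ok + " after " + calls[0] + " polls");
    }
}
```

The timeout keeps the statement from blocking forever if the cluster never reports the segments as loaded; what the caller does on a `false` result is a separate design decision not covered by the issue text.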



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15036) Druid code recently included in Hive pulls in GPL jar

2017-01-18 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828979#comment-15828979
 ] 

slim bouguerra commented on HIVE-15036:
---

This one can be excluded. We don't need `io.airlift:airline`; it is used to run 
the Druid CLI, which is not used here.


> Druid code recently included in Hive pulls in GPL jar
> -
>
> Key: HIVE-15036
> URL: https://issues.apache.org/jira/browse/HIVE-15036
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Alan Gates
>Assignee: Jesus Camacho Rodriguez
>Priority: Blocker
>
> Druid pulls in a jar annotation-2.3.jar.  According to its pom file it is 
> licensed under GPL.  We cannot ship a binary distribution that includes this 
> jar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.

2017-01-16 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15439:
--
Attachment: HIVE-15439.patch

> Support INSERT OVERWRITE for internal druid datasources.
> 
>
> Key: HIVE-15439
> URL: https://issues.apache.org/jira/browse/HIVE-15439
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch
>
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table.
> To add this support, we will need a new post-insert hook to update 
> the Druid metadata. Segment creation will be the same as for CTAS.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.

2017-01-16 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15439:
--
Attachment: HIVE-15439.patch

Same file, with a fake change added.

> Support INSERT OVERWRITE for internal druid datasources.
> 
>
> Key: HIVE-15439
> URL: https://issues.apache.org/jira/browse/HIVE-15439
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch
>
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table.
> To add this support, we will need a new post-insert hook to update 
> the Druid metadata. Segment creation will be the same as for CTAS.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15586) Make Insert and Create statement Transactional

2017-01-18 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15586:
--
Attachment: HIVE-15586.patch

> Make Insert and Create statement Transactional
> --
>
> Key: HIVE-15586
> URL: https://issues.apache.org/jira/browse/HIVE-15586
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15586.patch
>
>
> Currently insert/create returns the handle to the user without waiting for 
> the data to be loaded by the Druid cluster. To avoid that, we will add a 
> passive wait until the segments are loaded by the historicals, in case the 
> coordinator is up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15586) Make Insert and Create statement Transactional

2017-01-18 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15586:
--
Status: Patch Available  (was: Open)

> Make Insert and Create statement Transactional
> --
>
> Key: HIVE-15586
> URL: https://issues.apache.org/jira/browse/HIVE-15586
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>
> Currently insert/create returns the handle to the user without waiting for 
> the data to be loaded by the Druid cluster. To avoid that, we will add a 
> passive wait until the segments are loaded by the historicals, in case the 
> coordinator is up.





[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.

2017-01-18 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15439:
--
Attachment: HIVE-15439.patch

> Support INSERT OVERWRITE for internal druid datasources.
> 
>
> Key: HIVE-15439
> URL: https://issues.apache.org/jira/browse/HIVE-15439
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch
>
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table.
> To support this, we will need to add a new post-insert hook to update 
> the Druid metadata. Creation of the segments will be the same as for CTAS.
>  





[jira] [Updated] (HIVE-15586) Make Insert and Create statement Transactional

2017-01-18 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15586:
--
Attachment: HIVE-15586.patch

> Make Insert and Create statement Transactional
> --
>
> Key: HIVE-15586
> URL: https://issues.apache.org/jira/browse/HIVE-15586
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15586.patch, HIVE-15586.patch
>
>
> Currently, insert/create returns the handle to the user without waiting for 
> the data to be loaded by the Druid cluster. To avoid that, we will add a 
> passive wait until the segments are loaded by the historical nodes, provided 
> the coordinator is up.





[jira] [Commented] (HIVE-15586) Make Insert and Create statement Transactional

2017-01-18 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828835#comment-15828835
 ] 

slim bouguerra commented on HIVE-15586:
---

[~jcamachorodriguez] and [~ashutoshc], please take a look at this one.

> Make Insert and Create statement Transactional
> --
>
> Key: HIVE-15586
> URL: https://issues.apache.org/jira/browse/HIVE-15586
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15586.patch, HIVE-15586.patch
>
>
> Currently, insert/create returns the handle to the user without waiting for 
> the data to be loaded by the Druid cluster. To avoid that, we will add a 
> passive wait until the segments are loaded by the historical nodes, provided 
> the coordinator is up.





[jira] [Assigned] (HIVE-15036) Druid code recently included in Hive pulls in GPL jar

2017-01-19 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra reassigned HIVE-15036:
-

Assignee: slim bouguerra  (was: Jesus Camacho Rodriguez)

> Druid code recently included in Hive pulls in GPL jar
> -
>
> Key: HIVE-15036
> URL: https://issues.apache.org/jira/browse/HIVE-15036
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Alan Gates
>Assignee: slim bouguerra
>Priority: Blocker
>
> Druid pulls in a jar annotation-2.3.jar.  According to its pom file it is 
> licensed under GPL.  We cannot ship a binary distribution that includes this 
> jar.





[jira] [Updated] (HIVE-15036) Druid code recently included in Hive pulls in GPL jar

2017-01-19 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15036:
--
Status: Patch Available  (was: Open)

> Druid code recently included in Hive pulls in GPL jar
> -
>
> Key: HIVE-15036
> URL: https://issues.apache.org/jira/browse/HIVE-15036
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Alan Gates
>Assignee: slim bouguerra
>Priority: Blocker
>
> Druid pulls in a jar annotation-2.3.jar.  According to its pom file it is 
> licensed under GPL.  We cannot ship a binary distribution that includes this 
> jar.





[jira] [Updated] (HIVE-15036) Druid code recently included in Hive pulls in GPL jar

2017-01-19 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15036:
--
Attachment: HIVE-15036.patch

Excluded the offending jar and added a banning plugin to enforce the rule even 
on transitive dependencies.

> Druid code recently included in Hive pulls in GPL jar
> -
>
> Key: HIVE-15036
> URL: https://issues.apache.org/jira/browse/HIVE-15036
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Alan Gates
>Assignee: slim bouguerra
>Priority: Blocker
> Attachments: HIVE-15036.patch
>
>
> Druid pulls in a jar annotation-2.3.jar.  According to its pom file it is 
> licensed under GPL.  We cannot ship a binary distribution that includes this 
> jar.





[jira] [Commented] (HIVE-15036) Druid code recently included in Hive pulls in GPL jar

2017-01-19 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830510#comment-15830510
 ] 

slim bouguerra commented on HIVE-15036:
---

[~alangates] thanks very much, please check this patch. As an add-on, I have 
added a banning plugin to enforce the rule over the entire project; this will 
allow us to catch transitive dependencies.
http://maven.apache.org/enforcer/enforcer-rules/bannedDependencies.html

> Druid code recently included in Hive pulls in GPL jar
> -
>
> Key: HIVE-15036
> URL: https://issues.apache.org/jira/browse/HIVE-15036
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Alan Gates
>Assignee: slim bouguerra
>Priority: Blocker
> Attachments: HIVE-15036.patch
>
>
> Druid pulls in a jar annotation-2.3.jar.  According to its pom file it is 
> licensed under GPL.  We cannot ship a binary distribution that includes this 
> jar.





[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.

2017-01-19 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15439:
--
Attachment: HIVE-15439.2.patch

> Support INSERT OVERWRITE for internal druid datasources.
> 
>
> Key: HIVE-15439
> URL: https://issues.apache.org/jira/browse/HIVE-15439
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15439.2.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch, HIVE-15439.patch
>
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table.
> To support this, we will need to add a new post-insert hook to update 
> the Druid metadata. Creation of the segments will be the same as for CTAS.
>  





[jira] [Updated] (HIVE-15586) Make Insert and Create statement Transactional

2017-01-19 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15586:
--
Attachment: HIVE-15586.2.patch

> Make Insert and Create statement Transactional
> --
>
> Key: HIVE-15586
> URL: https://issues.apache.org/jira/browse/HIVE-15586
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15586.2.patch, HIVE-15586.patch, HIVE-15586.patch, 
> HIVE-15586.patch
>
>
> Currently, insert/create returns the handle to the user without waiting for 
> the data to be loaded by the Druid cluster. To avoid that, we will add a 
> passive wait until the segments are loaded by the historical nodes, provided 
> the coordinator is up.





[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.

2017-01-19 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15439:
--
Attachment: (was: HIVE-15439.patch)

> Support INSERT OVERWRITE for internal druid datasources.
> 
>
> Key: HIVE-15439
> URL: https://issues.apache.org/jira/browse/HIVE-15439
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15439.3.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch
>
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table.
> To support this, we will need to add a new post-insert hook to update 
> the Druid metadata. Creation of the segments will be the same as for CTAS.
>  





[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.

2017-01-19 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15439:
--
Attachment: HIVE-15439.3.patch

> Support INSERT OVERWRITE for internal druid datasources.
> 
>
> Key: HIVE-15439
> URL: https://issues.apache.org/jira/browse/HIVE-15439
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15439.3.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch
>
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table.
> To support this, we will need to add a new post-insert hook to update 
> the Druid metadata. Creation of the segments will be the same as for CTAS.
>  





[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.

2017-01-19 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15439:
--
Attachment: (was: HIVE-15439.patch)

> Support INSERT OVERWRITE for internal druid datasources.
> 
>
> Key: HIVE-15439
> URL: https://issues.apache.org/jira/browse/HIVE-15439
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15439.3.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch
>
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table.
> To support this, we will need to add a new post-insert hook to update 
> the Druid metadata. Creation of the segments will be the same as for CTAS.
>  





[jira] [Updated] (HIVE-15439) Support INSERT OVERWRITE for internal druid datasources.

2017-01-19 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15439:
--
Attachment: (was: HIVE-15439.2.patch)

> Support INSERT OVERWRITE for internal druid datasources.
> 
>
> Key: HIVE-15439
> URL: https://issues.apache.org/jira/browse/HIVE-15439
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, HIVE-15439.patch, 
> HIVE-15439.patch
>
>
> Add support for the SQL statement INSERT OVERWRITE TABLE druid_internal_table.
> To support this, we will need to add a new post-insert hook to update 
> the Druid metadata. Creation of the segments will be the same as for CTAS.
>  





[jira] [Commented] (HIVE-16034) Hive/Druid integration: Fix type inference for Decimal DruidOutputFormat

2017-02-27 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886049#comment-15886049
 ] 

slim bouguerra commented on HIVE-16034:
---

[~jcamachorodriguez] can we add a unit test for this case?

> Hive/Druid integration: Fix type inference for Decimal DruidOutputFormat
> 
>
> Key: HIVE-16034
> URL: https://issues.apache.org/jira/browse/HIVE-16034
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16034.patch
>
>
> We are extracting the type name by String, which might cause issues, e.g., 
> for Decimal, where type includes precision and scale. Instead, we should 
> check the PrimitiveCategory enum.
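
To illustrate why comparing type names as strings is fragile for parameterized types, here is a small sketch. The `PrimitiveCategory` class below is only a hypothetical stand-in for Hive's enum of the same name, and `category_of` is an assumed helper, not code from the patch.

```python
from enum import Enum


class PrimitiveCategory(Enum):
    """Hypothetical stand-in for Hive's PrimitiveCategory enum."""
    LONG = "bigint"
    DOUBLE = "double"
    DECIMAL = "decimal"


def category_of(type_name: str) -> PrimitiveCategory:
    # Drop any "(precision,scale)" suffix before resolving the category.
    base = type_name.split("(", 1)[0].strip().lower()
    return PrimitiveCategory(base)


def is_decimal_by_name(type_name: str) -> bool:
    # Fragile: the full type name of a decimal carries precision and scale.
    return type_name == "decimal"


def is_decimal_by_category(type_name: str) -> bool:
    # Robust: the category is the same for every precision and scale.
    return category_of(type_name) is PrimitiveCategory.DECIMAL
```

With a type name such as `decimal(10,2)`, the string comparison misses the column while the category check recognizes it.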



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16034) Hive/Druid integration: Fix type inference for Decimal DruidOutputFormat

2017-02-27 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886309#comment-15886309
 ] 

slim bouguerra commented on HIVE-16034:
---

[~jcamachorodriguez] it seems the contract of {{getHiveRecordWriter}} is to 
parse the {{tableProperties}} object and build things like aggregators.
One thought would be to mock {{tableProperties}} and make sure that we get the 
right aggregators/dimension specs, or that we get an exception in the 
timestamp case. If you think it is overkill, that's OK, we can skip it.
Overall I am +1.

> Hive/Druid integration: Fix type inference for Decimal DruidOutputFormat
> 
>
> Key: HIVE-16034
> URL: https://issues.apache.org/jira/browse/HIVE-16034
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16034.01.patch, HIVE-16034.patch
>
>
> We are extracting the type name by String, which might cause issues, e.g., 
> for Decimal, where type includes precision and scale. Instead, we should 
> check the PrimitiveCategory enum.





[jira] [Commented] (HIVE-14543) Create Druid table without specifying data source

2016-10-20 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593340#comment-15593340
 ] 

slim bouguerra commented on HIVE-14543:
---

How can we get the Hive table name?

> Create Druid table without specifying data source
> -
>
> Key: HIVE-14543
> URL: https://issues.apache.org/jira/browse/HIVE-14543
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>
> We should be able to omit the Druid datasource from the TBLPROPERTIES. In 
> that case, the Druid datasource name should match the Hive table name.
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.address" = "localhost");
> {code}
> For instance, the statement above creates a table that references the Druid 
> datasource "druid_table_1".
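
The fallback described above can be sketched in a few lines. `resolve_datasource` is a hypothetical helper written for illustration; only the `druid.datasource` property name is taken from the related issues, and the real storage handler may resolve the name differently.

```python
def resolve_datasource(table_name: str, tbl_properties: dict) -> str:
    """Return the Druid datasource for a Hive table.

    Uses the explicit "druid.datasource" table property when present and
    non-empty; otherwise falls back to the Hive table name.
    """
    return tbl_properties.get("druid.datasource") or table_name
```

For the statement in the description, `resolve_datasource("druid_table_1", {"druid.address": "localhost"})` would yield `druid_table_1`.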





[jira] [Commented] (HIVE-15273) Http Client not configured correctly

2016-11-23 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691629#comment-15691629
 ] 

slim bouguerra commented on HIVE-15273:
---

FYI, the way the lifecycle is created needs to be fixed; I will send that in a follow-up.

> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: 0001-adding-confing-to-http-client.patch
>
>
> The HTTP client currently used by the druid-hive record reader is constructed 
> with default values. The default values of numConnection and ReadTimeout are 
> very small, which can lead to the following exception: "ERROR 
> [2ee34a2b-c8a5-4748-ab91-db3621d2aa5c main] CliDriver: Failed with exception 
> java.io.IOException:java.io.IOException: java.io.IOException: 
> org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Channel disconnected"
> The full stack trace can be found here: 
> https://gist.github.com/b-slim/384ca6a96698f5b51ad9b171cff556a2





[jira] [Updated] (HIVE-15273) Http Client not configured correctly

2016-11-23 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15273:
--
Attachment: 0001-adding-confing-to-http-client.patch

> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: 0001-adding-confing-to-http-client.patch
>
>
> The HTTP client currently used by the druid-hive record reader is constructed 
> with default values. The default values of numConnection and ReadTimeout are 
> very small, which can lead to the following exception: "ERROR 
> [2ee34a2b-c8a5-4748-ab91-db3621d2aa5c main] CliDriver: Failed with exception 
> java.io.IOException:java.io.IOException: java.io.IOException: 
> org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Channel disconnected"
> The full stack trace can be found here: 
> https://gist.github.com/b-slim/384ca6a96698f5b51ad9b171cff556a2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14474) Create datasource in Druid from Hive

2016-11-23 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691733#comment-15691733
 ] 

slim bouguerra commented on HIVE-14474:
---

I have added a new patch that creates/deletes Druid segments:
https://issues.apache.org/jira/browse/HIVE-15277

> Create datasource in Druid from Hive
> 
>
> Key: HIVE-14474
> URL: https://issues.apache.org/jira/browse/HIVE-14474
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14474.01.patch, HIVE-14474.02.patch, 
> HIVE-14474.03.patch, HIVE-14474.04.patch, HIVE-14474.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In the initial implementation proposed in this issue, we will write the 
> results of the query to HDFS (or the location specified in the CTAS 
> statement), and submit a HadoopIndexing task to the Druid overlord. The task 
> will contain the path where the data was stored; it will read the data and 
> create the segments in Druid. Once this is done, the results are removed from Hive.
> The syntax will be as follows:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "my_query_based_datasource")
> AS ;
> {code}
> This statement stores the results of query  in a Druid 
> datasource named 'my_query_based_datasource'. One of the columns of the query 
> needs to be the time dimension, which is mandatory in Druid. In particular, 
> we use the same convention that is used for Druid: there needs to be a 
> column named '\_\_time' in the result of the executed query, which will act 
> as the time dimension column in Druid. Currently, the time dimension column 
> needs to be a 'timestamp' type column.
> This initial implementation interacts with the Druid API as it is currently 
> exposed to the user. In a follow-up issue, we should propose an 
> implementation that integrates more tightly with Druid. In particular, we 
> would like to store segments directly in Druid from Hive, thus avoiding the 
> overhead of writing Hive results to HDFS and then launching an MR job that 
> basically reads them again to create the segments.





[jira] [Updated] (HIVE-15273) Http Client not configured correctly

2016-11-23 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15273:
--
Status: Patch Available  (was: Open)

> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
>
> The HTTP client currently used by the druid-hive record reader is constructed 
> with default values. The default values of numConnection and ReadTimeout are 
> very small, which can lead to the following exception: "ERROR 
> [2ee34a2b-c8a5-4748-ab91-db3621d2aa5c main] CliDriver: Failed with exception 
> java.io.IOException:java.io.IOException: java.io.IOException: 
> org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Channel disconnected"
> The full stack trace can be found here: 
> https://gist.github.com/b-slim/384ca6a96698f5b51ad9b171cff556a2





[jira] [Updated] (HIVE-15273) Http Client not configured correctly

2016-11-23 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15273:
--
Status: Open  (was: Patch Available)

> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
>
> The HTTP client currently used by the druid-hive record reader is constructed 
> with default values. The default values of numConnection and ReadTimeout are 
> very small, which can lead to the following exception: "ERROR 
> [2ee34a2b-c8a5-4748-ab91-db3621d2aa5c main] CliDriver: Failed with exception 
> java.io.IOException:java.io.IOException: java.io.IOException: 
> org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Channel disconnected"
> The full stack trace can be found here: 
> https://gist.github.com/b-slim/384ca6a96698f5b51ad9b171cff556a2





[jira] [Updated] (HIVE-15273) Http Client not configured correctly

2016-11-23 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15273:
--
Status: Patch Available  (was: Open)

> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: 0001-adding-confing-to-http-client.patch
>
>
> The HTTP client currently used by the druid-hive record reader is constructed 
> with default values. The default values of numConnection and ReadTimeout are 
> very small, which can lead to the following exception: "ERROR 
> [2ee34a2b-c8a5-4748-ab91-db3621d2aa5c main] CliDriver: Failed with exception 
> java.io.IOException:java.io.IOException: java.io.IOException: 
> org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Channel disconnected"
> The full stack trace can be found here: 
> https://gist.github.com/b-slim/384ca6a96698f5b51ad9b171cff556a2





[jira] [Updated] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-11-23 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15277:
--
Attachment: file.patch

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: file.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation Hive will generate druid segment files and insert the 
> metadata to signal the handoff to druid.
> The syntax will be as follows:
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS  `metric2`>;
> This statement stores the results of query  in a Druid 
> datasource named 'datasourcename'. One of the columns of the query needs to 
> be the time dimension, which is mandatory in Druid. In particular, we use the 
> same convention that is used for Druid: there needs to be a column 
> named '__time' in the result of the executed query, which will act as the 
> time dimension column in Druid. Currently, the time dimension column needs to 
> be a 'timestamp' type column.
> Metrics can be of type long, double, or float, while dimensions are strings. 
> Keep in mind that Druid has a clear separation between dimensions and 
> metrics; therefore, if you have a column in Hive that contains numbers and 
> needs to be presented as a dimension, use the cast operator to cast it to string. 
> This initial implementation interacts with the Druid metadata storage to 
> add/remove the table in Druid. The user needs to supply the metadata config as 
> --hiveconf hive.druid.metadata.password=XXX --hiveconf 
> hive.druid.metadata.username=druid --hiveconf 
> hive.druid.metadata.uri=jdbc:mysql://host/druid
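
The column rules in the description (a mandatory '__time' timestamp column, string dimensions, and long/double/float metrics) can be summarized in a short sketch. `classify_columns` and the type-name strings are illustrative assumptions that follow the wording above; they are not the handler's actual code.

```python
# Type names follow the wording of the description; the mapping itself is an
# illustrative assumption, not Hive's actual implementation.
METRIC_TYPES = {"long", "double", "float"}
TIME_COLUMN = "__time"


def classify_columns(schema):
    """Split (name, type) pairs into dimension names and metric names."""
    if dict(schema).get(TIME_COLUMN) != "timestamp":
        raise ValueError("a '__time' column of type 'timestamp' is required")
    dimensions, metrics = [], []
    for name, col_type in schema:
        if name == TIME_COLUMN:
            continue
        if col_type == "string":
            dimensions.append(name)
        elif col_type in METRIC_TYPES:
            metrics.append(name)
        else:
            raise ValueError(f"unsupported type {col_type!r} for column {name!r}")
    return dimensions, metrics
```

Under these rules a numeric column becomes a metric unless the query casts it to string first, which is the case the description warns about.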





[jira] [Updated] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-11-23 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15277:
--
Status: Patch Available  (was: Open)

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: file.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation Hive will generate druid segment files and insert the 
> metadata to signal the handoff to druid.
> The syntax will be as follows:
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS  `metric2`>;
> This statement stores the results of query  in a Druid 
> datasource named 'datasourcename'. One of the columns of the query needs to 
> be the time dimension, which is mandatory in Druid. In particular, we use the 
> same convention that is used for Druid: there needs to be a column 
> named '__time' in the result of the executed query, which will act as the 
> time dimension column in Druid. Currently, the time dimension column needs to 
> be a 'timestamp' type column.
> Metrics can be of type long, double, or float, while dimensions are strings. 
> Keep in mind that Druid has a clear separation between dimensions and 
> metrics; therefore, if you have a column in Hive that contains numbers and 
> needs to be presented as a dimension, use the cast operator to cast it to string. 
> This initial implementation interacts with the Druid metadata storage to 
> add/remove the table in Druid. The user needs to supply the metadata config as 
> --hiveconf hive.druid.metadata.password=XXX --hiveconf 
> hive.druid.metadata.username=druid --hiveconf 
> hive.druid.metadata.uri=jdbc:mysql://host/druid





[jira] [Commented] (HIVE-15274) wrong results on the column __time

2016-11-23 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690640#comment-15690640
 ] 

slim bouguerra commented on HIVE-15274:
---

I will verify that. I was thinking maybe we can add a unit test to catch 
this?

> wrong results on the column __time
> --
>
> Key: HIVE-15274
> URL: https://issues.apache.org/jira/browse/HIVE-15274
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-15274.patch
>
>
> issuing select * from table will return wrong time column.
> expected results
> ┌─────────────────────────────────────────┬────────────┬─────────┐
> │ __time                                  │ dimension1 │ metric1 │
> ├─────────────────────────────────────────┼────────────┼─────────┤
> │ Wed Dec 31 2014 16:00:00 GMT-0800 (PST) │ value1     │ 1       │
> │ Wed Dec 31 2014 16:00:00 GMT-0800 (PST) │ value1.1   │ 1       │
> │ Sun May 31 2015 19:00:00 GMT-0700 (PDT) │ value2     │ 20.5    │
> │ Sun May 31 2015 19:00:00 GMT-0700 (PDT) │ value2.1   │ 32      │
> └─────────────────────────────────────────┴────────────┴─────────┘
> returned result
> 2014-12-31 19:00:00   value1     1.0
> 2014-12-31 19:00:00   value1.1   1.0
> 2014-12-31 19:00:00   value2     20.5
> 2014-12-31 19:00:00   value2.1   32.0
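
As a hedged illustration of the time zone side of this report, the sketch below renders one instant under different fixed UTC offsets. The `render` helper and the chosen instant (2015-01-01T00:00:00Z, which matches the first expected row) are assumptions for illustration; this is one possible reading of the 16:00 versus 19:00 difference, not a confirmed diagnosis, and it does not explain why every returned row carries the same timestamp.

```python
from datetime import datetime, timedelta, timezone


def render(epoch_ms: int, utc_offset_hours: int) -> str:
    """Format an epoch-millisecond instant in a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch_ms / 1000, tz).strftime("%Y-%m-%d %H:%M:%S")


# 2015-01-01T00:00:00Z expressed as epoch milliseconds.
INSTANT_MS = 1420070400000

pst_view = render(INSTANT_MS, -8)  # "2014-12-31 16:00:00"
est_view = render(INSTANT_MS, -5)  # "2014-12-31 19:00:00"
utc_view = render(INSTANT_MS, 0)   # "2015-01-01 00:00:00"
```

The same stored instant therefore prints as 16:00 or 19:00 depending only on the offset applied when formatting.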





[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-11-28 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703206#comment-15703206
 ] 

slim bouguerra commented on HIVE-15277:
---

This will fail until Druid 0.9.2 is released, but it can be reviewed.

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15277.2.patch, HIVE-15277.patch, file.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation, Hive will generate Druid segment files and insert the 
> metadata to signal the handoff to Druid.
> The syntax will be as follows:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS  `metric2`>;
> {code}
> This statement stores the results of the query in a Druid 
> datasource named 'datasourcename'. One of the columns of the query needs to 
> be the time dimension, which is mandatory in Druid. In particular, we use the 
> same convention that is used in Druid: there needs to be a column 
> named '__time' in the result of the executed query, which will act as the 
> time dimension column in Druid. Currently, the time dimension column needs to 
> be a 'timestamp' type column.
> Metrics can be of type long, double, or float, while dimensions are strings. 
> Keep in mind that Druid has a clear separation between dimensions and 
> metrics; therefore, if you have a column in Hive that contains numbers and needs 
> to be presented as a dimension, use the cast operator to cast it to string. 
> This initial implementation interacts with the Druid metadata storage to 
> add/remove the table in Druid. The user needs to supply the metadata config as 
> --hiveconf hive.druid.metadata.password=XXX --hiveconf 
> hive.druid.metadata.username=druid --hiveconf 
> hive.druid.metadata.uri=jdbc:mysql://host/druid
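
The conventions described above (a `__time` column of type timestamp, dimensions cast to string, and metadata settings passed as `--hiveconf` flags) can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `build_druid_ctas` and `hiveconf_args`, the source table `store_sales`, and its column names are hypothetical and not part of the patch. Only the storage handler class, the `druid.datasource` table property, and the three `hive.druid.metadata.*` keys come from the issue text.

```python
def build_druid_ctas(table, datasource, source_table, time_column,
                     dimensions, metrics):
    """Assemble a CTAS statement that follows the conventions in the issue.

    The time column is exposed as `__time` with type timestamp, and every
    dimension is cast to string so that numeric columns are not treated as
    metrics. Metrics are passed through unchanged.
    """
    select_items = [f"CAST(`{time_column}` AS timestamp) AS `__time`"]
    select_items += [f"CAST(`{d}` AS string) AS `{d}`" for d in dimensions]
    select_items += [f"`{m}`" for m in metrics]
    select_list = ",\n       ".join(select_items)
    return (
        f"CREATE TABLE {table}\n"
        "STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'\n"
        f'TBLPROPERTIES ("druid.datasource" = "{datasource}")\n'
        "AS\n"
        f"SELECT {select_list}\n"
        f"FROM {source_table}"
    )


def hiveconf_args(settings):
    """Turn a dict of Hive settings into a flat list of --hiveconf flags."""
    args = []
    for key, value in settings.items():
        args += ["--hiveconf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Hypothetical source table and columns, for illustration only.
    print(build_druid_ctas(
        table="druid_table_1",
        datasource="datasourcename",
        source_table="store_sales",
        time_column="sold_ts",
        dimensions=["store_id"],
        metrics=["net_paid"],
    ))
    print(hiveconf_args({
        "hive.druid.metadata.username": "druid",
        "hive.druid.metadata.uri": "jdbc:mysql://host/druid",
    }))
```

Generating the statement from lists keeps the dimension and metric split explicit, which is the distinction the description asks users to keep in mind.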





[jira] [Updated] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-11-28 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15277:
--
Attachment: HIVE-15277.2.patch

formatting 

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15277.2.patch, HIVE-15277.patch, file.patch
>
>


[jira] [Updated] (HIVE-15273) Http Client not configured correctly

2016-11-28 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15273:
--
Attachment: HIVE-15273.patch

> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: 0001-adding-confing-to-http-client.patch, 
> HIVE-15273.patch
>
>
> The HTTP client currently used by the druid-hive record reader is constructed with 
> default values. The default values of numConnection and ReadTimeout are very 
> small, which can lead to the following exception: "ERROR 
> [2ee34a2b-c8a5-4748-ab91-db3621d2aa5c main] CliDriver: Failed with exception 
> java.io.IOException:java.io.IOException: java.io.IOException: 
> org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Channel disconnected"
> The full stack trace can be found 
> here: https://gist.github.com/b-slim/384ca6a96698f5b51ad9b171cff556a2





[jira] [Commented] (HIVE-15273) Http Client not configured correctly

2016-11-28 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702627#comment-15702627
 ] 

slim bouguerra commented on HIVE-15273:
---

[~leftylev] thanks for the comments! I have uploaded a new patch.
[~jcamachorodriguez] thanks for testing it; please check out the new patch.

> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: 0001-adding-confing-to-http-client.patch, 
> HIVE-15273.patch
>
>


[jira] [Commented] (HIVE-15273) Http Client not configured correctly

2016-11-28 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702667#comment-15702667
 ] 

slim bouguerra commented on HIVE-15273:
---

Added PR https://github.com/apache/hive/pull/119

> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: 0001-adding-confing-to-http-client.patch, 
> HIVE-15273.patch
>
>


[jira] [Updated] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-11-28 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15277:
--
Attachment: HIVE-15277.patch

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15277.patch, file.patch
>
>


[jira] [Updated] (HIVE-15393) Update Guava version

2016-12-08 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15393:
--
Status: Patch Available  (was: Open)

> Update Guava version
> 
>
> Key: HIVE-15393
> URL: https://issues.apache.org/jira/browse/HIVE-15393
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Priority: Blocker
> Attachments: HIVE-15393.patch
>
>
> The Druid code base uses a newer version of Guava (16.0.1) that is not compatible 
> with the version currently used by Hive.
> FYI, the Hadoop project is moving to Guava 18; I am not sure whether it is better 
> to move to Guava 18 or even 19.
> https://issues.apache.org/jira/browse/HADOOP-10101





[jira] [Updated] (HIVE-15393) Update Guava version

2016-12-08 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15393:
--
Attachment: HIVE-15393.patch

> Update Guava version
> 
>
> Key: HIVE-15393
> URL: https://issues.apache.org/jira/browse/HIVE-15393
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Priority: Blocker
> Attachments: HIVE-15393.patch
>
>


[jira] [Updated] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-12-13 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15277:
--
Attachment: HIVE-15277.patch

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15277.2.patch, HIVE-15277.patch, HIVE-15277.patch, 
> file.patch
>
>


[jira] [Updated] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-12-15 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15277:
--
Attachment: HIVE-15277.patch

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15277.2.patch, HIVE-15277.patch, HIVE-15277.patch, 
> HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, 
> file.patch
>
>


[jira] [Updated] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-12-15 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15277:
--
Attachment: HIVE-15277.patch

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15277.2.patch, HIVE-15277.patch, HIVE-15277.patch, 
> HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, 
> HIVE-15277.patch, file.patch
>
>


[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-12-15 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752461#comment-15752461
 ] 

slim bouguerra commented on HIVE-15277:
---

Ran some of the tests locally and they passed:
https://gist.github.com/b-slim/dfa29b07ee901b5f0c8437975488436f


> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15277.2.patch, HIVE-15277.patch, HIVE-15277.patch, 
> HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, 
> HIVE-15277.patch, file.patch
>
>


[jira] [Updated] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-12-14 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15277:
--
Attachment: HIVE-15277.patch

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15277.2.patch, HIVE-15277.patch, HIVE-15277.patch, 
> HIVE-15277.patch, HIVE-15277.patch, HIVE-15277.patch, file.patch
>
>


[jira] [Updated] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-12-14 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15277:
--
Attachment: HIVE-15277.patch

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15277.2.patch, HIVE-15277.patch, HIVE-15277.patch, 
> HIVE-15277.patch, HIVE-15277.patch, file.patch
>
>


[jira] [Updated] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-12-14 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15277:
--
Attachment: HIVE-15277.patch

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15277.2.patch, HIVE-15277.patch, HIVE-15277.patch, 
> HIVE-15277.patch, file.patch
>
>

