[jira] [Comment Edited] (HIVE-15273) Http Client not configured correctly

2016-11-28 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702016#comment-15702016
 ] 

Jesus Camacho Rodriguez edited comment on HIVE-15273 at 11/28/16 2:01 PM:
--

[~bslim], I have just checked the patch and overall it looks good.

In addition to addressing [~leftylev]'s comments, could you remove the 
initialization of _numConnection_ and _readTimeout_ with hardcoded values in 
DruidSerDe? Since we will pass through the _initialization_ method and default 
values already exist in HiveConf.java, we do not need to hardcode the values 
for these properties. Thanks


was (Author: jcamachorodriguez):
[~bslim], I have just checked the patch and overall it looks good.

In addition to addressing [~leftylev]'s comments, could you remove the 
initialization of _ numConnection_ and _readTimeout_ with hardcoded values in 
DruidSerDe? Since we will pass through the _initialization_ method and default 
values already exist in HiveConf.java, we do not need to hardcode the values 
for these properties. Thanks

> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: 0001-adding-confing-to-http-client.patch
>
>
> The HTTP client currently used by the druid-hive record reader is constructed 
> with default values. The default values of numConnection and ReadTimeout are 
> very small, which can lead to the following exception: "ERROR 
> [2ee34a2b-c8a5-4748-ab91-db3621d2aa5c main] CliDriver: Failed with exception 
> java.io.IOException:java.io.IOException: java.io.IOException: 
> org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Channel 
> disconnected"
> The full stack can be found here: 
> https://gist.github.com/b-slim/384ca6a96698f5b51ad9b171cff556a2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15273) Http Client not configured correctly

2016-11-28 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702016#comment-15702016
 ] 

Jesus Camacho Rodriguez commented on HIVE-15273:


[~bslim], I have just checked the patch and overall it looks good.

In addition to addressing [~leftylev]'s comments, could you remove the 
initialization of _numConnection_ and _readTimeout_ with hardcoded values in 
DruidSerDe? Since we will pass through the _initialization_ method and default 
values already exist in HiveConf.java, we do not need to hardcode the values 
for these properties. Thanks
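
For illustration, a minimal sketch of what reading these values from HiveConf 
instead of hardcoding them could look like; the ConfVars constants named below 
are assumptions for illustration, not necessarily the names the patch uses:

{code:java}
// Hedged sketch, not the actual patch: pull the HTTP client settings from
// HiveConf, whose defaults live in HiveConf.java, instead of hardcoding them
// in DruidSerDe. The ConfVars constants below are assumed names.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.conf.HiveConf;

public class DruidHttpClientSettings {

  // Number of connections; falls back to the HiveConf.java default when unset.
  public static int numConnection(Configuration conf) {
    return HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVE_DRUID_NUM_HTTP_CONNECTION);
  }

  // Read timeout, returned as the raw configured string (e.g. an ISO8601 period).
  public static String readTimeout(Configuration conf) {
    return HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_DRUID_HTTP_READ_TIMEOUT);
  }
}
{code}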

> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: 0001-adding-confing-to-http-client.patch
>
>
> The HTTP client currently used by the druid-hive record reader is constructed 
> with default values. The default values of numConnection and ReadTimeout are 
> very small, which can lead to the following exception: "ERROR 
> [2ee34a2b-c8a5-4748-ab91-db3621d2aa5c main] CliDriver: Failed with exception 
> java.io.IOException:java.io.IOException: java.io.IOException: 
> org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Channel 
> disconnected"
> The full stack can be found here: 
> https://gist.github.com/b-slim/384ca6a96698f5b51ad9b171cff556a2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-11-28 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15277:
---
Issue Type: Sub-task  (was: Bug)
Parent: HIVE-14473

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: file.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation, Hive will generate Druid segment files and insert the 
> metadata to signal the handoff to Druid.
> The syntax will be as follows:
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS <select `timecolumn` as `__time`, `dimension1`, `dimension2`, `metric1`, `metric2`>;
> This statement stores the results of query <select query> in a Druid 
> datasource named 'datasourcename'. One of the columns of the query needs to 
> be the time dimension, which is mandatory in Druid. In particular, we use the 
> same convention used by Druid: there needs to be a column named '__time' in 
> the result of the executed query, which will act as the time dimension column 
> in Druid. Currently, the time dimension column needs to be of 'timestamp' type.
> Metrics can be of type long, double, or float, while dimensions are strings. 
> Keep in mind that Druid has a clear separation between dimensions and 
> metrics; therefore, if you have a numeric column in Hive that needs to be 
> presented as a dimension, use the cast operator to cast it to string. 
> This initial implementation interacts with the Druid metadata storage to 
> add/remove the table in Druid; users need to supply the metadata config as 
> --hiveconf hive.druid.metadata.password=XXX --hiveconf 
> hive.druid.metadata.username=druid --hiveconf 
> hive.druid.metadata.uri=jdbc:mysql://host/druid



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15157) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null

2016-11-28 Thread thauvin damien (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

thauvin damien updated HIVE-15157:
--
Priority: Major  (was: Critical)

> Partition Table With timestamp type on S3 storage --> Error in getting fields 
> from serde.Invalid Field null
> ---
>
> Key: HIVE-15157
> URL: https://issues.apache.org/jira/browse/HIVE-15157
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 2.1.0
> Environment: JDK 1.8 101 
>Reporter: thauvin damien
>
> Hello, 
> I get the error above when I try to perform:
> hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00');
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from 
> serde.Invalid Field null
> Here is the description of the issue.
> -- External Hive table with dynamic partitioning enabled on AWS S3 storage.
> -- Table partitioned on a timestamp-type column.
> When I perform "show partitions table;" everything is fine:
> hive>  show partitions table;
> OK
> tsbucket=2016-10-01 11%3A00%3A00
> tsbucket=2016-10-28 16%3A00%3A00
> And when I perform "describe FORMATTED table;" everything is fine.
> Is this a bug? 
> The stack trace from hive.log:
> 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 
> main([])]: exec.DDLTask (DDLTask.java:failed(574)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields 
> from serde.Invalid Field null
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: MetaException(message:Invalid Field null)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getFieldsFromDeserializer(MetaStoreUtils.java:1336)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3409)
> ... 21 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-11-28 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15277:
---
Description: 
We want to extend the DruidStorageHandler to support CTAS queries.
In this implementation, Hive will generate Druid segment files and insert the 
metadata to signal the handoff to Druid.

The syntax will be as follows:
{code:sql}
CREATE TABLE druid_table_1
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "datasourcename")
AS <select `timecolumn` as `__time`, `dimension1`, `dimension2`, `metric1`, `metric2`>;
{code}

This statement stores the results of query <select query> in a Druid datasource 
named 'datasourcename'. One of the columns of the query needs to be the time 
dimension, which is mandatory in Druid. In particular, we use the same 
convention used by Druid: there needs to be a column named '__time' in the 
result of the executed query, which will act as the time dimension column in 
Druid. Currently, the time dimension column needs to be of 'timestamp' type.
Metrics can be of type long, double, or float, while dimensions are strings. 
Keep in mind that Druid has a clear separation between dimensions and metrics; 
therefore, if you have a numeric column in Hive that needs to be presented as a 
dimension, use the cast operator to cast it to string. 
This initial implementation interacts with the Druid metadata storage to 
add/remove the table in Druid; users need to supply the metadata config as 
--hiveconf hive.druid.metadata.password=XXX --hiveconf 
hive.druid.metadata.username=druid --hiveconf 
hive.druid.metadata.uri=jdbc:mysql://host/druid

  was:


We want to extend the DruidStorageHandler to support CTAS queries.
In this implementation, Hive will generate Druid segment files and insert the 
metadata to signal the handoff to Druid.

The syntax will be as follows:

CREATE TABLE druid_table_1
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "datasourcename")
AS <select `timecolumn` as `__time`, `dimension1`, `dimension2`, `metric1`, `metric2`>;

This statement stores the results of query <select query> in a Druid datasource 
named 'datasourcename'. One of the columns of the query needs to be the time 
dimension, which is mandatory in Druid. In particular, we use the same 
convention used by Druid: there needs to be a column named '__time' in the 
result of the executed query, which will act as the time dimension column in 
Druid. Currently, the time dimension column needs to be of 'timestamp' type.
Metrics can be of type long, double, or float, while dimensions are strings. 
Keep in mind that Druid has a clear separation between dimensions and metrics; 
therefore, if you have a numeric column in Hive that needs to be presented as a 
dimension, use the cast operator to cast it to string. 
This initial implementation interacts with the Druid metadata storage to 
add/remove the table in Druid; users need to supply the metadata config as 
--hiveconf hive.druid.metadata.password=XXX --hiveconf 
hive.druid.metadata.username=druid --hiveconf 
hive.druid.metadata.uri=jdbc:mysql://host/druid


> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: file.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation, Hive will generate Druid segment files and insert the 
> metadata to signal the handoff to Druid.
> The syntax will be as follows:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS <select `timecolumn` as `__time`, `dimension1`, `dimension2`, `metric1`, `metric2`>;
> {code}
> This statement stores the results of query <select query> in a Druid 
> datasource named 'datasourcename'. One of the columns of the query needs to 
> be the time dimension, which is mandatory in Druid. In particular, we use the 
> same convention used by Druid: there needs to be a column named '__time' in 
> the result of the executed query, which will act as the time dimension column 
> in Druid. Currently, the time dimension column needs to be of 'timestamp' type.
> Metrics can be of type long, double, or float, while dimensions are strings. 
> Keep in mind that Druid has a clear separation between dimensions and 
> metrics; therefore, if you have a numeric column in Hive that needs to be 
> presented as a dimension, use the cast operator to cast it to string. 
> This initial implementation interacts with the Druid metadata storage to 
> add/remove the table in Druid; users need to supply the metadata config as 
> --hiveconf hive.druid.metadata.password=XXX --hiveconf 
> hive.druid.metadata.username=druid --hiveconf 
> hive.druid.metadata.uri=jdbc:mysql://host/druid

[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-11-28 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702078#comment-15702078
 ] 

Jesus Camacho Rodriguez commented on HIVE-15277:


[~bslim], it seems you need to rebase the patch as it did not apply cleanly on 
master.

In addition, could you create a GitHub PR or [RB 
post|https://cwiki.apache.org/confluence/display/Hive/Review+Board] so it is 
easier to review the patch? Thanks

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: file.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation, Hive will generate Druid segment files and insert the 
> metadata to signal the handoff to Druid.
> The syntax will be as follows:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS <select `timecolumn` as `__time`, `dimension1`, `dimension2`, `metric1`, `metric2`>;
> {code}
> This statement stores the results of query <select query> in a Druid 
> datasource named 'datasourcename'. One of the columns of the query needs to 
> be the time dimension, which is mandatory in Druid. In particular, we use the 
> same convention used by Druid: there needs to be a column named '__time' in 
> the result of the executed query, which will act as the time dimension column 
> in Druid. Currently, the time dimension column needs to be of 'timestamp' type.
> Metrics can be of type long, double, or float, while dimensions are strings. 
> Keep in mind that Druid has a clear separation between dimensions and 
> metrics; therefore, if you have a numeric column in Hive that needs to be 
> presented as a dimension, use the cast operator to cast it to string. 
> This initial implementation interacts with the Druid metadata storage to 
> add/remove the table in Druid; users need to supply the metadata config as 
> --hiveconf hive.druid.metadata.password=XXX --hiveconf 
> hive.druid.metadata.username=druid --hiveconf 
> hive.druid.metadata.uri=jdbc:mysql://host/druid



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-15115) Flaky test: TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]

2016-11-28 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara reassigned HIVE-15115:
--

Assignee: Barna Zsombor Klara

> Flaky test: TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
> --
>
> Key: HIVE-15115
> URL: https://issues.apache.org/jira/browse/HIVE-15115
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>
> This test was identified as flaky before; it seems it has turned flaky again.
> Earlier JIRA:
> [HIVE-14976|https://issues.apache.org/jira/browse/HIVE-14976]
> New flaky runs:
> https://builds.apache.org/job/PreCommit-HIVE-Build/1931/testReport
> https://builds.apache.org/job/PreCommit-HIVE-Build/1930/testReport
> {code}
> 516c516
> < totalSize   3220
> ---
> > totalSize   3224
> 569c569
> < totalSize   3220
> ---
> > totalSize   3224
> 634c634
> < totalSize   4577
> ---
> > totalSize   4581
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15115) Flaky test: TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]

2016-11-28 Thread Barna Zsombor Klara (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702097#comment-15702097
 ] 

Barna Zsombor Klara commented on HIVE-15115:


I was able to reproduce the test failure by running the test on a CentOS 
machine. It seems the q.out file was generated under OS X, and the stats differ 
depending on the operating system. I tested it with both ORC and Parquet and 
had the same result, namely that the totalSize differs between operating 
systems.

> Flaky test: TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
> --
>
> Key: HIVE-15115
> URL: https://issues.apache.org/jira/browse/HIVE-15115
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Barna Zsombor Klara
>
> This test was identified as flaky before; it seems it has turned flaky again.
> Earlier JIRA:
> [HIVE-14976|https://issues.apache.org/jira/browse/HIVE-14976]
> New flaky runs:
> https://builds.apache.org/job/PreCommit-HIVE-Build/1931/testReport
> https://builds.apache.org/job/PreCommit-HIVE-Build/1930/testReport
> {code}
> 516c516
> < totalSize   3220
> ---
> > totalSize   3224
> 569c569
> < totalSize   3220
> ---
> > totalSize   3224
> 634c634
> < totalSize   4577
> ---
> > totalSize   4581
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15124) Fix OrcInputFormat to use reader's schema for include boolean array

2016-11-28 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-15124:
-
Attachment: HIVE-15124.patch

Updated two more expected output files. The pull request has the changes 
separated out.

> Fix OrcInputFormat to use reader's schema for include boolean array
> ---
>
> Key: HIVE-15124
> URL: https://issues.apache.org/jira/browse/HIVE-15124
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 1.2.1
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-15124.patch, HIVE-15124.patch, HIVE-15124.patch
>
>
> Currently, the OrcInputFormat uses the file's schema rather than the reader's 
> schema. This means that SchemaEvolution fails with an 
> ArrayIndexOutOfBoundsException if a partition has a different schema than the 
> table.
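
As a hedged illustration of that mismatch (a sketch assuming the include flags 
are indexed by ORC column id, not the actual patch):

{code:java}
// Sketch: the include array must be sized by the reader's (table) schema.
// Sizing it by the file's schema, as described above, produces a shorter
// array that schema evolution then indexes past the end of.
import java.util.Arrays;

import org.apache.orc.TypeDescription;

public class IncludeArrayDemo {

  // One boolean flag per column id in the given schema, ids 0..getMaximumId().
  public static boolean[] includeAll(TypeDescription schema) {
    boolean[] include = new boolean[schema.getMaximumId() + 1];
    Arrays.fill(include, true);
    return include;
  }

  public static void main(String[] args) {
    TypeDescription table = TypeDescription.fromString("struct<a:int,b:string,c:double>");
    TypeDescription file = TypeDescription.fromString("struct<a:int,b:string>");
    // The pre-fix behavior sizes the array from the (smaller) file schema:
    System.out.println(includeAll(table).length + " vs " + includeAll(file).length);
  }
}
{code}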



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-15291) Comparison of timestamp fails if only date part is provided.

2016-11-28 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702274#comment-15702274
 ] 

Peter Vary edited comment on HIVE-15291 at 11/28/16 3:50 PM:
-

Thanks for catching this issue!

[~dhiraj.kumar]: Do you mind adding a unit test for this case as well, so it 
will not cause any problems later on?

Thanks,
Peter


was (Author: pvary):
[~dhiraj.kumar]: Do you mind adding a unit test for this case as well?

Thanks,
Peter

> Comparison of timestamp fails if only date part is provided. 
> -
>
> Key: HIVE-15291
> URL: https://issues.apache.org/jira/browse/HIVE-15291
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, UDF
>Affects Versions: 2.1.0
>Reporter: Dhiraj Kumar
> Attachments: HIVE-15291.1.patch
>
>
> Summary: If a query needs to compare two timestamps, with one timestamp 
> provided in "YYYY-MM-DD" format and skipping the time part, it returns an 
> incorrect result. 
> Steps to reproduce: 
> 1. Start a hive-cli. 
> 2. Fire up the query -> select cast("2016-12-31 12:00:00" as timestamp) > 
> "2016-12-30";
> 3. Expected result: true
> 4. Actual result: NULL
> Detailed description: 
> If two primitives of different types need to be compared, a common comparison 
> type is chosen. Prior to 2.1, the common type Text was chosen to compare 
> Timestamp type and Text type. 
> In version 2.1, the common type Timestamp is chosen to compare Timestamp type 
> and Text type. This leads to the Text value (YYYY-MM-DD) being converted into 
> java.sql.Timestamp, which throws an exception saying the input is not in the 
> proper format. The exception is suppressed and a null is returned. 
> The code below is from org.apache.hadoop.hive.ql.exec.FunctionRegistry:
> {code:java}
> if (pgA == PrimitiveGrouping.STRING_GROUP && pgB == 
> PrimitiveGrouping.DATE_GROUP) {
>   return b;
> }
> // date/timestamp is higher precedence than String_GROUP
> if (pgB == PrimitiveGrouping.STRING_GROUP && pgA == 
> PrimitiveGrouping.DATE_GROUP) {
>   return a;
> }
> {code}
> The bug was introduced in  
> [HIVE-13381|https://issues.apache.org/jira/browse/HIVE-13381]
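
A hedged, self-contained demo of the parse failure behind the NULL (plain JDK 
behavior, not Hive code):

{code:java}
// java.sql.Timestamp.valueOf accepts only "yyyy-mm-dd hh:mm:ss[.fffffffff]",
// so a date-only string throws; Hive suppresses that exception into a NULL.
public class TimestampParseDemo {
  public static void main(String[] args) {
    System.out.println(java.sql.Timestamp.valueOf("2016-12-31 12:00:00")); // parses fine
    try {
      java.sql.Timestamp.valueOf("2016-12-30"); // date-only string is rejected
    } catch (IllegalArgumentException e) {
      // This suppressed exception is what surfaces as NULL in the query above.
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
{code}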



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15280) Hive.mvFile() misses the "." char when joining the filename + extension

2016-11-28 Thread Sergio Peña (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-15280:
---
Attachment: HIVE-15280.2.patch

[~stakiar] I added a new test case to the S3 insert_into.q query to validate 
that the compressed filenames are correctly inserted.

I thought about adding this case to insert_compressed.q, but the {{dfs -ls}} 
command output in the .q.out files for HDFS paths is fully masked, and if I 
changed it to be partially masked, then lots of .q.out files would need to be 
updated.

I also thought about adding simple unit tests with mocks, but mvFile and 
copyFiles are private methods, and another parent copyFiles is protected, which 
just complicates unit tests. 

I think insert_compressed.q did not fail on Hive 2.2 because it fixed 
something related to detecting compression on files without looking at the 
file extension. 

Anyway, I have an idea for making blobstore tests available on Hive QA by 
adding a proxy filesystem to it. Once I do that, we will see these tests 
running automatically and validating the mvFile() case.

> Hive.mvFile() misses the "." char when joining the filename + extension
> ---
>
> Key: HIVE-15280
> URL: https://issues.apache.org/jira/browse/HIVE-15280
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Critical
> Attachments: HIVE-15280.1.patch, HIVE-15280.2.patch
>
>
> Hive.mvFile() misses the "." char when joining the filename + extension. This 
> may cause incorrect results when compressed files are copied to a table 
> location.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15291) Comparison of timestamp fails if only date part is provided.

2016-11-28 Thread Dhiraj Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dhiraj Kumar updated HIVE-15291:

Description: 
Summary: If a query needs to compare two timestamps, with one timestamp 
provided in "YYYY-MM-DD" format, skipping the time part, it returns an 
incorrect result. 

Steps to reproduce: 

1. Start a hive-cli. 
2. Fire up the query -> select cast("2016-12-31 12:00:00" as timestamp) > 
"2016-12-30";
3. Expected result: true
4. Actual result: NULL

Detailed description: 
If two primitives of different types need to be compared, a common comparison 
type is chosen. Prior to 2.1, the common type Text was chosen to compare 
Timestamp type and Text type. 

In version 2.1, the common type Timestamp is chosen to compare Timestamp type 
and Text type. This leads to the Text value (YYYY-MM-DD) being converted into 
java.sql.Timestamp, which throws an exception saying the input is not in the 
proper format. The exception is suppressed and a null is returned. 

The code below is from org.apache.hadoop.hive.ql.exec.FunctionRegistry:
{code:java}
if (pgA == PrimitiveGrouping.STRING_GROUP && pgB == 
PrimitiveGrouping.DATE_GROUP) {
  return b;
}
// date/timestamp is higher precedence than String_GROUP
if (pgB == PrimitiveGrouping.STRING_GROUP && pgA == 
PrimitiveGrouping.DATE_GROUP) {
  return a;
}
{code}


The bug was introduced in  
[HIVE-13381|https://issues.apache.org/jira/browse/HIVE-13381]

  was:
Summary: If a query needs to compare two timestamps, with one timestamp 
provided in "YYYY-MM-DD" format and skipping the time part, it returns an 
incorrect result. 

Steps to reproduce: 

1. Start a hive-cli. 
2. Fire up the query -> select cast("2016-12-31 12:00:00" as timestamp) > 
"2016-12-30";
3. Expected result: true
4. Actual result: NULL

Detailed description: 
If two primitives of different types need to be compared, a common comparison 
type is chosen. Prior to 2.1, the common type Text was chosen to compare 
Timestamp type and Text type. 

In version 2.1, the common type Timestamp is chosen to compare Timestamp type 
and Text type. This leads to the Text value (YYYY-MM-DD) being converted into 
java.sql.Timestamp, which throws an exception saying the input is not in the 
proper format. The exception is suppressed and a null is returned. 

The code below is from org.apache.hadoop.hive.ql.exec.FunctionRegistry:
{code:java}
if (pgA == PrimitiveGrouping.STRING_GROUP && pgB == 
PrimitiveGrouping.DATE_GROUP) {
  return b;
}
// date/timestamp is higher precedence than String_GROUP
if (pgB == PrimitiveGrouping.STRING_GROUP && pgA == 
PrimitiveGrouping.DATE_GROUP) {
  return a;
}
{code}


The bug was introduced in  
[HIVE-13381|https://issues.apache.org/jira/browse/HIVE-13381]


> Comparison of timestamp fails if only date part is provided. 
> -
>
> Key: HIVE-15291
> URL: https://issues.apache.org/jira/browse/HIVE-15291
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, UDF
>Affects Versions: 2.1.0
>Reporter: Dhiraj Kumar
> Attachments: HIVE-15291.1.patch
>
>
> Summary: If a query needs to compare two timestamps, with one timestamp 
> provided in "YYYY-MM-DD" format, skipping the time part, it returns an 
> incorrect result. 
> Steps to reproduce: 
> 1. Start a hive-cli. 
> 2. Fire up the query -> select cast("2016-12-31 12:00:00" as timestamp) > 
> "2016-12-30";
> 3. Expected result: true
> 4. Actual result: NULL
> Detailed description: 
> If two primitives of different types need to be compared, a common comparison 
> type is chosen. Prior to 2.1, the common type Text was chosen to compare 
> Timestamp type and Text type. 
> In version 2.1, the common type Timestamp is chosen to compare Timestamp type 
> and Text type. This leads to the Text value (YYYY-MM-DD) being converted into 
> java.sql.Timestamp, which throws an exception saying the input is not in the 
> proper format. The exception is suppressed and a null is returned. 
> The code below is from org.apache.hadoop.hive.ql.exec.FunctionRegistry:
> {code:java}
> if (pgA == PrimitiveGrouping.STRING_GROUP && pgB == 
> PrimitiveGrouping.DATE_GROUP) {
>   return b;
> }
> // date/timestamp is higher precedence than String_GROUP
> if (pgB == PrimitiveGrouping.STRING_GROUP && pgA == 
> PrimitiveGrouping.DATE_GROUP) {
>   return a;
> }
> {code}
> The bug was introduced in  
> [HIVE-13381|https://issues.apache.org/jira/browse/HIVE-13381]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15279) map join dummy operators are not set up correctly in certain cases with merge join

2016-11-28 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702502#comment-15702502
 ] 

Gunther Hagleitner commented on HIVE-15279:
---

+1. The fix is certainly valid, but it also raises other questions: we do not 
try to initialize any dummy ops that the merge work might have, as far as I 
can tell. I don't see any specific reason why mapjoins shouldn't be allowed in 
merge work. Do you know why, [~sershe]?

> map join dummy operators are not set up correctly in certain cases with merge 
> join
> --
>
> Key: HIVE-15279
> URL: https://issues.apache.org/jira/browse/HIVE-15279
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15279.patch
>
>
> As a result, MapJoin is not initialized and there's NPE later.
> Tez-specific.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15291) Comparison of timestamp fails if only date part is provided.

2016-11-28 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702274#comment-15702274
 ] 

Peter Vary commented on HIVE-15291:
---

[~dhiraj.kumar]: Do you mind adding a unit test for this case as well?

Thanks,
Peter

> Comparison of timestamp fails if only date part is provided. 
> -
>
> Key: HIVE-15291
> URL: https://issues.apache.org/jira/browse/HIVE-15291
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, UDF
>Affects Versions: 2.1.0
>Reporter: Dhiraj Kumar
> Attachments: HIVE-15291.1.patch
>
>
> Summary: If a query needs to compare two timestamps, with one timestamp 
> provided in "YYYY-MM-DD" format and skipping the time part, it returns an 
> incorrect result. 
> Steps to reproduce: 
> 1. Start a hive-cli. 
> 2. Fire up the query -> select cast("2016-12-31 12:00:00" as timestamp) > 
> "2016-12-30";
> 3. Expected result: true
> 4. Actual result: NULL
> Detailed description: 
> If two primitives of different types need to be compared, a common comparison 
> type is chosen. Prior to 2.1, the common type Text was chosen to compare 
> Timestamp type and Text type. 
> In version 2.1, the common type Timestamp is chosen to compare Timestamp type 
> and Text type. This leads to the Text value (YYYY-MM-DD) being converted into 
> java.sql.Timestamp, which throws an exception saying the input is not in the 
> proper format. The exception is suppressed and a null is returned. 
> The code below is from org.apache.hadoop.hive.ql.exec.FunctionRegistry:
> {code:java}
> if (pgA == PrimitiveGrouping.STRING_GROUP && pgB == 
> PrimitiveGrouping.DATE_GROUP) {
>   return b;
> }
> // date/timestamp is higher precedence than String_GROUP
> if (pgB == PrimitiveGrouping.STRING_GROUP && pgA == 
> PrimitiveGrouping.DATE_GROUP) {
>   return a;
> }
> {code}
> The bug was introduced in  
> [HIVE-13381|https://issues.apache.org/jira/browse/HIVE-13381]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15076) Improve scalability of LDAP authentication provider group filter

2016-11-28 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702529#comment-15702529
 ] 

Naveen Gangam commented on HIVE-15076:
--

[~yalovyyi] I have been sidetracked with some high-priority items and it was a 
short week, so I did not have a whole lot of time to take a deep look. I just 
took a quick look and commented on Review Board on some cosmetic stuff. I will 
get to the functional side over the next couple of days, so please bear with 
my schedule.

Also, can you share any metrics you may have recorded on performance? Thanks

> Improve scalability of LDAP authentication provider group filter
> 
>
> Key: HIVE-15076
> URL: https://issues.apache.org/jira/browse/HIVE-15076
> Project: Hive
>  Issue Type: Improvement
>  Components: Authentication
>Affects Versions: 2.1.0
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
> Attachments: HIVE-15076.1.patch, HIVE-15076.2.patch
>
>
> The current implementation uses the following algorithm:
> # For a given user, find all groups that the user is a member of (a list of 
> LDAP groups is constructed as a result of that request).
> # Match this list of groups against the provided group filter.
>  
> The time/memory complexity of this approach is O(N) on the client side, where 
> N is the number of groups the user has membership in. On a large directory 
> (800+ groups per user) we can observe up to 2x performance degradation and 
> failures because of the size of the LDAP response (LDAP: error code 4 - 
> Sizelimit Exceeded).
>  
> Some directory services (Microsoft Active Directory, for instance) provide a 
> virtual attribute for the User object that contains the list of groups the 
> user belongs to. This attribute can be used to quickly determine whether a 
> user passes or fails the group filter.
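
A hedged sketch of that approach with plain JNDI; the "memberOf" attribute 
name and the lower-cased DN matching are assumptions for illustration, not the 
actual patch:

{code:java}
// Sketch: test the group filter via a single membership attribute on the user
// entry instead of enumerating all of the user's groups (no O(N) response).
import java.util.Set;

import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.DirContext;

public class GroupFilterCheck {

  public static boolean passesGroupFilter(DirContext ctx, String userDn,
      Set<String> groupFilterDns) throws NamingException {
    // One lookup, asking only for the membership attribute.
    Attribute memberOf = ctx.getAttributes(userDn, new String[] {"memberOf"}).get("memberOf");
    if (memberOf == null) {
      return false;
    }
    NamingEnumeration<?> groups = memberOf.getAll();
    while (groups.hasMore()) {
      // groupFilterDns is assumed to hold lower-cased group DNs.
      if (groupFilterDns.contains(String.valueOf(groups.next()).toLowerCase())) {
        return true;
      }
    }
    return false;
  }
}
{code}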



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15124) Fix OrcInputFormat to use reader's schema for include boolean array

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702531#comment-15702531
 ] 

Hive QA commented on HIVE-15124:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840664/HIVE-15124.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10735 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=91)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] 
(batchId=91)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2297/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2297/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2297/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840664 - PreCommit-HIVE-Build

> Fix OrcInputFormat to use reader's schema for include boolean array
> ---
>
> Key: HIVE-15124
> URL: https://issues.apache.org/jira/browse/HIVE-15124
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 1.2.1
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-15124.patch, HIVE-15124.patch, HIVE-15124.patch
>
>
> Currently, the OrcInputFormat uses the file's schema rather than the reader's 
> schema. This means that SchemaEvolution fails with an 
> ArrayIndexOutOfBoundsException if a partition has a different schema than the 
> table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15278) PTF+MergeJoin = NPE

2016-11-28 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702566#comment-15702566
 ] 

Gunther Hagleitner commented on HIVE-15278:
---

LGTM +1. This does look like it'd be painful to debug. Is it possible to add a 
small test to avoid this debug pain for the next person?

One thing I'm not completely sure of: the bug is that the join operator is 
trying to pump records through its parents after they have been closed. It's 
doing that to finish the last pending group when the first of its parents is 
closed. Your fix finishes the group after the first parent is closed, not the 
last. Do you know for a fact that the join operator won't try to push records 
through that (closed) parent? (I think that's the case because it's the big 
table side and all remaining records should be from other branches.)

> PTF+MergeJoin = NPE
> ---
>
> Key: HIVE-15278
> URL: https://issues.apache.org/jira/browse/HIVE-15278
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15278.patch
>
>
> Manifests as
> {noformat}
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.PTFRowContainer.first(PTFRowContainer.java:115)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFPartition.iterator(PTFPartition.java:114)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:340)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
>   ... 29 more
> {noformat}
> It's actually a somewhat subtle ordering problem in sortmerge - as it stands, 
> it calls different branches of the tree in closeOp after they themselves have 
> already been closed. Other operators that clean stuff up in close may result 
> in different errors. The common pattern is
> {noformat}
>1125 at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
>1126 at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
>1127 at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:404)
> ...
>1131 at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:428)
>1132 at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:388)
>1133 at 
> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:617)
> ...
>1139 at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:294)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13353) SHOW COMPACTIONS should support filtering options

2016-11-28 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13353:
--
Attachment: HIVE-13353.01.patch

> SHOW COMPACTIONS should support filtering options
> -
>
> Key: HIVE-13353
> URL: https://issues.apache.org/jira/browse/HIVE-13353
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-13353.01.patch
>
>
> Since we now have historical information in SHOW COMPACTIONS, the output can 
> easily become unwieldy (e.g., 1000 partitions with 3 lines of history each).
> This is a significant usability issue.
> We need to add the ability to filter by db/table/partition.
> Perhaps it would also be useful to filter by status.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15273) Http Client not configured correctly

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702862#comment-15702862
 ] 

Hive QA commented on HIVE-15273:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840684/HIVE-15273.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10733 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2300/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2300/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2300/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840684 - PreCommit-HIVE-Build

> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: 0001-adding-confing-to-http-client.patch, 
> HIVE-15273.patch
>
>
> The HTTP client currently used by the druid-hive record reader is constructed 
> with default values. The default values of numConnection and ReadTimeout are 
> very small, which can lead to the following exception: "ERROR 
> [2ee34a2b-c8a5-4748-ab91-db3621d2aa5c main] CliDriver: Failed with exception 
> java.io.IOException:java.io.IOException: java.io.IOException: 
> org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Channel 
> disconnected"
> The full stack can be found here: 
> https://gist.github.com/b-slim/384ca6a96698f5b51ad9b171cff556a2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-11-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702905#comment-15702905
 ] 

ASF GitHub Bot commented on HIVE-15277:
---

GitHub user b-slim opened a pull request:

https://github.com/apache/hive/pull/120

HIVE-15277 Druid storage handler



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/b-slim/hive rebase_druid_record_writer

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/120.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #120


commit 9025d4a33348faa007c17f2c7ff5dee4f3a87318
Author: Slim Bouguerra 
Date:   2016-10-26T23:55:34Z

adding druid record writer

bump guava version to 16.0.1

moving out the injector

commit be2e29dcba5617db478eefa75a5478a77512e090
Author: Jesus Camacho Rodriguez 
Date:   2016-11-02T03:21:59Z

Druid time granularity partitioning, serializer and necessary extensions

commit df4036f7f76294dc5599d29cdb760336b0ee9a4f
Author: Jesus Camacho Rodriguez 
Date:   2016-11-02T19:59:52Z

Recognition of dimensions and metrics

patch 1

commit ea76f0ddfa33990d92e061676123c45920ed6dce
Author: Slim Bouguerra 
Date:   2016-11-02T21:18:00Z

adding file schema support

commit 010701be7cf939f6854c9ee113ccf40b20aed32a
Author: Jesus Camacho Rodriguez 
Date:   2016-11-04T19:48:43Z

native storage

new fixes

commit 3d8496299d1d151da59bb6f547ebbc475c329197
Author: Slim Bouguerra 
Date:   2016-11-09T17:57:03Z

using segment output path

commit 2b10b26eb7a5d9a6058c9e1f206c599e54ec88b2
Author: Slim Bouguerra 
Date:   2016-11-16T00:16:10Z

adding check for existing datasource and implement drop table

commit e18b716a438e8b38155d4ab31b7070ae1945f1e4
Author: Slim Bouguerra 
Date:   2016-11-19T00:53:10Z

adding UTs and refactor some code

commit 3b31d16dcb9fd5cdb9eb6d1c994cb3f0c8cd8a33
Author: Slim Bouguerra 
Date:   2016-11-23T23:49:28Z

fix druid version

commit 4b447e56389aab1f45e9b48192068d1a0257a14c
Author: Slim Bouguerra 
Date:   2016-11-28T19:32:02Z

ignore record writer test

commit a7b4f792a5e28b0772addbc0d5ea52d5b44d9d91
Author: Slim Bouguerra 
Date:   2016-11-28T19:38:25Z

format code




> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15277.2.patch, HIVE-15277.patch, file.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation, Hive will generate Druid segment files and insert the 
> metadata to signal the handoff to Druid.
> The syntax will be as follows:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS <select `timecolumn` as `__time`, `dimension1`, `dimension2`, `metric1`, `metric2`>;
> {code}
> This statement stores the results of query <select query> in a Druid 
> datasource named 'datasourcename'. One of the columns of the query needs to 
> be the time dimension, which is mandatory in Druid. In particular, we use the 
> same convention used by Druid: there needs to be a column named '__time' in 
> the result of the executed query, which will act as the time dimension column 
> in Druid. Currently, the time dimension column needs to be of 'timestamp' type.
> Metrics can be of type long, double, or float, while dimensions are strings. 
> Keep in mind that Druid has a clear separation between dimensions and 
> metrics; therefore, if you have a numeric column in Hive that needs to be 
> presented as a dimension, use the cast operator to cast it to string. 
> This initial implementation interacts with the Druid metadata storage to 
> add/remove the table in Druid; users need to supply the metadata config as 
> --hiveconf hive.druid.metadata.password=XXX --hiveconf 
> hive.druid.metadata.username=druid --hiveconf 
> hive.druid.metadata.uri=jdbc:mysql://host/druid



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15278) PTF+MergeJoin = NPE

2016-11-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702903#comment-15702903
 ] 

Sergey Shelukhin commented on HIVE-15278:
-

Yes, we make two assumptions:
1) That it won't try to pump more records through the big table side, which 
wouldn't work in any case; logically, it makes no sense because the big table 
side is the one that's causing the operators to get closed in the first place, 
so it should be done with all records.
2) The main table side is closed first. That is true now; reduceWork vs 
mergeWorkList in ReduceRecordProducer.

I am not sure if we can add a test. The repro we have is too specific (and 
potentially large) for q files, and this code is too much of a mess to repro 
with a unit test.

> PTF+MergeJoin = NPE
> ---
>
> Key: HIVE-15278
> URL: https://issues.apache.org/jira/browse/HIVE-15278
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15278.patch
>
>
> Manifests as
> {noformat}
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.persistence.PTFRowContainer.first(PTFRowContainer.java:115)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFPartition.iterator(PTFPartition.java:114)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:340)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
>   ... 29 more
> {noformat}
> It's actually a somewhat subtle ordering problem in sortmerge - as it stands, 
> it calls different branches of the tree in closeOp after they themselves have 
> already been closed. Other operators that clean stuff up in close may result 
> in different errors. The common pattern is
> {noformat}
>1125 at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
>1126 at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
>1127 at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:404)
> ...
>1131 at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:428)
>1132 at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:388)
>1133 at 
> org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:617)
> ...
>1139 at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:294)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15293) add toString to OpTraits

2016-11-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15293:

Status: Patch Available  (was: Open)

> add toString to OpTraits
> 
>
> Key: HIVE-15293
> URL: https://issues.apache.org/jira/browse/HIVE-15293
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Trivial
> Attachments: HIVE-15293.patch
>
>
> The traits logging is completely pointless right now



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15293) add toString to OpTraits

2016-11-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15293:

Attachment: HIVE-15293.patch

[~ashutoshc] can you take a look? Thanks

> add toString to OpTraits
> 
>
> Key: HIVE-15293
> URL: https://issues.apache.org/jira/browse/HIVE-15293
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Trivial
> Attachments: HIVE-15293.patch
>
>
> The traits logging is completely pointless right now



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15280) Hive.mvFile() misses the "." char when joining the filename + extension

2016-11-28 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703018#comment-15703018
 ] 

Yongzhi Chen commented on HIVE-15280:
-

Yes, before HIVE-15199 the file type had the '.' in it; HIVE-15199 uses 
FilenameUtils.getExtension, which removes that char. Adding the extension 
separator back when building the full file name is the fix.
The fix looks good. +1
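
A hedged demo of that behavior with commons-io (the filename below is made up):

{code:java}
// FilenameUtils.getExtension returns the extension without the '.', so a
// naive name + extension concatenation drops the separator.
import org.apache.commons.io.FilenameUtils;

public class ExtensionJoinDemo {
  public static void main(String[] args) {
    String name = "000000_0.gz";
    String ext = FilenameUtils.getExtension(name);      // "gz", no dot
    String base = FilenameUtils.removeExtension(name);  // "000000_0"
    System.out.println(base + ext);                     // "000000_0gz" (the bug)
    System.out.println(base + "." + ext);               // "000000_0.gz" (the fix)
  }
}
{code}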

> Hive.mvFile() misses the "." char when joining the filename + extension
> ---
>
> Key: HIVE-15280
> URL: https://issues.apache.org/jira/browse/HIVE-15280
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Critical
> Attachments: HIVE-15280.1.patch, HIVE-15280.2.patch
>
>
> Hive.mvFile() misses the "." char when joining the filename + extension. This 
> may cause incorrect results when compressed files are copied to a table 
> location.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15284) Add junit test to test replication scenarios

2016-11-28 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703032#comment-15703032
 ] 

Vaibhav Gumashta commented on HIVE-15284:
-

The failed tests have been reported in HIVE-15058 as flaky. 

> Add junit test to test replication scenarios
> 
>
> Key: HIVE-15284
> URL: https://issues.apache.org/jira/browse/HIVE-15284
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-15284.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15295) Fix HCatalog javadoc with Java 8

2016-11-28 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15295:
---
Priority: Minor  (was: Major)

> Fix HCatalog javadoc with Java 8
> 
>
> Key: HIVE-15295
> URL: https://issues.apache.org/jira/browse/HIVE-15295
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, HCatalog
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>
> Realized while generating artifacts for Hive 2.1.1 release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-15295) Fix HCatalog javadoc generation with Java 8

2016-11-28 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-15295 started by Jesus Camacho Rodriguez.
--
> Fix HCatalog javadoc generation with Java 8
> ---
>
> Key: HIVE-15295
> URL: https://issues.apache.org/jira/browse/HIVE-15295
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, HCatalog
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>
> Realized while generating artifacts for Hive 2.1.1 release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15295) Fix HCatalog javadoc generation with Java 8

2016-11-28 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15295:
---
Target Version/s: 2.2.0, 2.1.1

> Fix HCatalog javadoc generation with Java 8
> ---
>
> Key: HIVE-15295
> URL: https://issues.apache.org/jira/browse/HIVE-15295
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, HCatalog
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-15295.patch
>
>
> Realized while generating artifacts for Hive 2.1.1 release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15295) Fix HCatalog javadoc generation with Java 8

2016-11-28 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15295:
---
Fix Version/s: 2.1.1

> Fix HCatalog javadoc generation with Java 8
> ---
>
> Key: HIVE-15295
> URL: https://issues.apache.org/jira/browse/HIVE-15295
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, HCatalog
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Fix For: 2.1.1
>
> Attachments: HIVE-15295.patch
>
>
> Realized while generating artifacts for Hive 2.1.1 release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15296) AM may lose task failures and not reschedule when scheduling to LLAP

2016-11-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15296:

Description: 
First attempt and failure detection:
{noformat}
2016-11-18 20:20:01,980 [INFO] [TaskSchedulerEventHandlerThread] 
|tezplugins.LlapTaskSchedulerService|: Received allocateRequest. 
task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
capability=memory:4096, vCores:1, hosts=[3n01]
2016-11-18 20:20:01,982 [INFO] [LlapScheduler] 
|tezplugins.LlapTaskSchedulerService|: Assigned task 
TaskInfo{task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
startTime=0, containerId=null, assignedInstance=null, uniqueId=55, 
localityDelayTimeout=9223372036854775807} to container 
container_1_2622_01_56 on node=DynamicServiceInstance [alive=true, 
host=3n01:15001 with resources=memory:59392, vCores:16, 
shufflePort=15551, servicesAddress=http://3n01:15002, mgmtPort=15004]
2016-11-18 20:20:01,982 [INFO] [LlapScheduler] 
|tezplugins.LlapTaskSchedulerService|: ScheduleResult for Task: 
TaskInfo{task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
startTime=10550817928, containerId=container_1_2622_01_56, 
assignedInstance=DynamicServiceInstance [alive=true, host=3n01:15001 with 
resources=memory:59392, vCores:16, shufflePort=15551, 
servicesAddress=http://3n01:15002, mgmtPort=15004], uniqueId=55, 
localityDelayTimeout=9223372036854775807} = SCHEDULED
2016-11-18 20:20:03,427 [INFO] [Dispatcher thread {Central}] 
|impl.TaskAttemptImpl|: TaskAttempt: [attempt_1478967587833_2622_1_06_31_0] 
started. Is using containerId: [container_1_2622_01_56] on NM: 
[3n01:15001]
2016-11-18 20:20:03,427 [INFO] [Dispatcher thread {Central}] 
|history.HistoryEventHandler|: 
[HISTORY][DAG:dag_1478967587833_2622_1][Event:TASK_ATTEMPT_STARTED]: 
vertexName=Map 1, taskAttemptId=attempt_1478967587833_2622_1_06_31_0, 
startTime=1479500403427, containerId=container_1_2622_01_56, 
nodeId=3n01:15001
2016-11-18 20:20:03,430 [INFO] [TaskCommunicator # 1] 
|tezplugins.LlapTaskCommunicator|: Successfully launched task: 
attempt_1478967587833_2622_1_06_31_0
2016-11-18 20:20:03,434 [INFO] [IPC Server handler 11 on 43092] 
|impl.TaskImpl|: TaskAttempt:attempt_1478967587833_2622_1_06_31_0 sent 
events: (0-1).
2016-11-18 20:20:03,434 [INFO] [IPC Server handler 11 on 43092] 
|impl.VertexImpl|: Sending attempt_1478967587833_2622_1_06_31_0 24 events 
[0,24) total 24 vertex_1478967587833_2622_1_06 [Map 1]
2016-11-18 20:25:43,249 [INFO] [Dispatcher thread {Central}] 
|history.HistoryEventHandler|: 
[HISTORY][DAG:dag_1478967587833_2622_1][Event:TASK_ATTEMPT_FINISHED]: 
vertexName=Map 1, taskAttemptId=attempt_1478967587833_2622_1_06_31_0, 
creationTime=1479500401929, allocationTime=1479500403426, 
startTime=1479500403427, finishTime=1479500743249, timeTaken=339822, 
status=FAILED, taskFailureType=NON_FATAL, errorEnum=TASK_HEARTBEAT_ERROR, 
diagnostics=AttemptID:attempt_1478967587833_2622_1_06_31_0 Timed out after 
300 secs, nodeHttpAddress=http://3n01:15002, counters=Counters: 1, 
org.apache.tez.common.counters.DAGCounter, DATA_LOCAL_TASKS=1
2016-11-18 20:25:43,255 [INFO] [TaskSchedulerEventHandlerThread] 
|tezplugins.LlapTaskSchedulerService|: Processing de-allocate request for 
task=attempt_1478967587833_2622_1_06_31_0, state=ASSIGNED, endReason=OTHER
2016-11-18 20:25:43,259 [INFO] [Dispatcher thread {Central}] |node.AMNodeImpl|: 
Attempt failed on node: 3n01:15001 TA: attempt_1478967587833_2622_1_06_31_0 
failed: true container: container_1_2622_01_56 numFailedTAs: 7
2016-11-18 20:25:43,262 [INFO] [Dispatcher thread {Central}] |impl.VertexImpl|: 
Source task attempt completed for vertex: vertex_1478967587833_2622_1_07 
[Reducer 2] attempt: attempt_1478967587833_2622_1_06_31_0 with state: 
FAILED vertexState: RUNNING
{noformat}
Second attempt:
{noformat}
2016-11-18 20:25:43,267 [INFO] [TaskSchedulerEventHandlerThread] 
|tezplugins.LlapTaskSchedulerService|: Received allocateRequest. 
task=attempt_1478967587833_2622_1_06_31_1, priority=64, 
capability=memory:4096, vCores:1, hosts=null
2016-11-18 20:25:43,297 [INFO] [LlapScheduler] 
|tezplugins.LlapTaskSchedulerService|: ScheduleResult for Task: 
TaskInfo{task=attempt_1478967587833_2622_1_06_31_1, priority=64, 
startTime=0, containerId=null, assignedInstance=null, uniqueId=318, 
localityDelayTimeout=9223372036854775807} = DELAYED_RESOURCES
2016-11-18 20:25:43,297 [INFO] [LlapScheduler] 
|tezplugins.LlapTaskSchedulerService|: Preempting for 
task=attempt_1478967587833_2622_1_06_31_1 on any available host
2016-11-18 20:25:43,298 [INFO] [LlapScheduler] 
|tezplugins.LlapTaskSchedulerService|: ScheduleResult for Task: 
TaskInfo{task=attempt_1478967587833_2622_1_06_31_1, priority=64, 
startTime=0, containerId=null, assignedInstance=null, uniqueId=318, 

[jira] [Commented] (HIVE-15270) ExprNode/Sarg changes to support values supplied during query runtime

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702994#comment-15702994
 ] 

Hive QA commented on HIVE-15270:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840693/HIVE-15270.3.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10733 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=90)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2301/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2301/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2301/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840693 - PreCommit-HIVE-Build

> ExprNode/Sarg changes to support values supplied during query runtime
> -
>
> Key: HIVE-15270
> URL: https://issues.apache.org/jira/browse/HIVE-15270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-15270.1.patch, HIVE-15270.2.patch, 
> HIVE-15270.3.patch
>
>
> Infrastructure changes to support retrieval of query-runtime supplied values, 
> needed for dynamic min/max (HIVE-15269) and bloomfilter join optimizations.
> - Some concept of available runtime values that can be retrieved for a 
> MapWork/ReduceWork
> - ExprNode/Sarg changes to pass a Conf during initialization - this allows 
> the expression to retrieve the MapWork at query time (using 
> Utilities.getMapWork(Configuration)) to access runtime-supplied values.
> - Ability to populate the runtime values in Tez mode via incoming Tez edges
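
A minimal, self-contained sketch of the pattern described above (all class 
and method names here are illustrative, not Hive's actual API): an expression 
is initialized with the per-task work object recovered from the Configuration 
and resolves values that only become available at query runtime.
{code:java}
import java.util.HashMap;
import java.util.Map;

// Stand-in for MapWork/ReduceWork carrying values supplied at runtime.
class WorkSketch {
  private final Map<String, Object> runtimeValues = new HashMap<>();
  void putRuntimeValue(String key, Object value) { runtimeValues.put(key, value); }
  Object getRuntimeValue(String key) { return runtimeValues.get(key); }
}

// Stand-in for an ExprNode/Sarg that defers to a runtime-supplied value.
class DynamicValueExprSketch {
  private final String key;
  private WorkSketch work;                    // obtained from the Conf at init
  DynamicValueExprSketch(String key) { this.key = key; }
  void initialize(WorkSketch workFromConf) { this.work = workFromConf; }
  Object evaluate() { return work.getRuntimeValue(key); }
}

public class RuntimeValueDemo {
  public static void main(String[] args) {
    WorkSketch work = new WorkSketch();
    work.putRuntimeValue("min_join_key", 42L); // e.g. populated via a Tez edge
    DynamicValueExprSketch expr = new DynamicValueExprSketch("min_join_key");
    expr.initialize(work);
    System.out.println(expr.evaluate());       // prints 42
  }
}
{code}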



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15279) map join dummy operators are not set up correctly in certain cases with merge join

2016-11-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15279:

Attachment: HIVE-15279.01.patch

Updated the patch

> map join dummy operators are not set up correctly in certain cases with merge 
> join
> --
>
> Key: HIVE-15279
> URL: https://issues.apache.org/jira/browse/HIVE-15279
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15279.01.patch, HIVE-15279.patch
>
>
> As a result, MapJoin is not initialized and there's NPE later.
> Tez-specific.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15295) Fix HCatalog javadoc generation with Java 8

2016-11-28 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15295:
---
Summary: Fix HCatalog javadoc generation with Java 8  (was: Fix HCatalog 
javadoc with Java 8)

> Fix HCatalog javadoc generation with Java 8
> ---
>
> Key: HIVE-15295
> URL: https://issues.apache.org/jira/browse/HIVE-15295
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, HCatalog
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>
> Realized while generating artifacts for Hive 2.1.1 release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15280) Hive.mvFile() misses the "." char when joining the filename + extension

2016-11-28 Thread Sergio Peña (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-15280:
---
Attachment: HIVE-15280.3.patch

Here's a new patch with unit tests added.

> Hive.mvFile() misses the "." char when joining the filename + extension
> ---
>
> Key: HIVE-15280
> URL: https://issues.apache.org/jira/browse/HIVE-15280
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Critical
> Attachments: HIVE-15280.1.patch, HIVE-15280.2.patch, 
> HIVE-15280.3.patch
>
>
> Hive.mvFile() misses the "." char when joining the filename + extension. This 
> may cause incorrect results when compressed files are copied to a table 
> location.
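
A minimal sketch of the described filename + extension join bug (variable 
names are illustrative; the actual code lives in Hive.mvFile()):
{code:java}
public class MvFileJoinSketch {
  public static void main(String[] args) {
    String name = "000000_0";        // destination file name
    String ext = "gz";               // extension of a compressed source file
    String broken = name + ext;      // "000000_0gz"  -- the missing "." bug
    String fixed = name + "." + ext; // "000000_0.gz" -- expected result
    System.out.println(broken + " vs " + fixed);
  }
}
{code}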



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15276) CLIs spell "substitution" as "subsitution" and "auxiliary" as "auxillary"

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703134#comment-15703134
 ] 

Hive QA commented on HIVE-15276:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840699/HIVE-15276.5.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 10718 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=108)

[tez_joins_explain.q,transform2.q,groupby5.q,cbo_semijoin.q,bucketmapjoin13.q,union_remove_6_subq.q,groupby2_map_multi_distinct.q,load_dyn_part9.q,multi_insert_gby2.q,vectorization_11.q,groupby_position.q,avro_compression_enabled_native.q,smb_mapjoin_8.q,join21.q,auto_join16.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_join_without_localtask]
 (batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_sortmerge_join_3]
 (batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[date_join1] 
(batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby2_noskew] 
(batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby6_noskew] 
(batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[mapjoin_test_outer] 
(batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[merge2] (batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[multi_join_union] 
(batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_11] 
(batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union21] 
(batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_9] 
(batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2302/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2302/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2302/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840699 - PreCommit-HIVE-Build

> CLIs spell "substitution" as "subsitution" and "auxiliary" as "auxillary"
> -
>
> Key: HIVE-15276
> URL: https://issues.apache.org/jira/browse/HIVE-15276
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 1.1.0
>Reporter: Grant Sohn
>Assignee: Grant Sohn
>Priority: Trivial
> Attachments: HIVE-15276.1.patch, HIVE-15276.2.patch, 
> HIVE-15276.3.patch, HIVE-15276.4.patch, HIVE-15276.5.patch
>
>
> Found some obvious spelling typos in the CLI help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15284) Add junit test to test replication scenarios

2016-11-28 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-15284:

  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: 2.2.0
Target Version/s: 2.2.0
  Status: Resolved  (was: Patch Available)

Committed to master. Thanks [~sushanth].

> Add junit test to test replication scenarios
> 
>
> Key: HIVE-15284
> URL: https://issues.apache.org/jira/browse/HIVE-15284
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Fix For: 2.2.0
>
> Attachments: HIVE-15284.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15295) Fix HCatalog javadoc generation with Java 8

2016-11-28 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15295:
---
Status: Patch Available  (was: In Progress)

> Fix HCatalog javadoc generation with Java 8
> ---
>
> Key: HIVE-15295
> URL: https://issues.apache.org/jira/browse/HIVE-15295
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, HCatalog
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>
> Realized while generating artifacts for Hive 2.1.1 release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-15296) AM may lose task failures and not reschedule when scheduling to LLAP

2016-11-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703167#comment-15703167
 ] 

Sergey Shelukhin edited comment on HIVE-15296 at 11/28/16 9:24 PM:
---

[~prasanth_j] [~sseth]  [~hagleitn] fyi


was (Author: sershe):
[~prasanth_j] [~sseth] fyi

> AM may lose task failures and not reschedule when scheduling to LLAP
> 
>
> Key: HIVE-15296
> URL: https://issues.apache.org/jira/browse/HIVE-15296
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
>
> First attempt and failure detection:
> {noformat}
> 2016-11-18 20:20:01,980 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Received allocateRequest. 
> task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> capability=memory:4096, vCores:1, hosts=[3n01]
> 2016-11-18 20:20:01,982 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: Assigned task 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> startTime=0, containerId=null, assignedInstance=null, uniqueId=55, 
> localityDelayTimeout=9223372036854775807} to container 
> container_1_2622_01_56 on node=DynamicServiceInstance 
> [alive=true, host=3n01:15001 with resources=memory:59392, vCores:16, 
> shufflePort=15551, servicesAddress=http://3n01:15002, mgmtPort=15004]
> 2016-11-18 20:20:01,982 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: ScheduleResult for Task: 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> startTime=10550817928, containerId=container_1_2622_01_56, 
> assignedInstance=DynamicServiceInstance [alive=true, host=3n01:15001 with 
> resources=memory:59392, vCores:16, shufflePort=15551, 
> servicesAddress=http://3n01:15002, mgmtPort=15004], uniqueId=55, 
> localityDelayTimeout=9223372036854775807} = SCHEDULED
> 2016-11-18 20:20:03,427 [INFO] [Dispatcher thread {Central}] 
> |impl.TaskAttemptImpl|: TaskAttempt: 
> [attempt_1478967587833_2622_1_06_31_0] started. Is using containerId: 
> [container_1_2622_01_56] on NM: [3n01:15001]
> 2016-11-18 20:20:03,427 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1478967587833_2622_1][Event:TASK_ATTEMPT_STARTED]: 
> vertexName=Map 1, taskAttemptId=attempt_1478967587833_2622_1_06_31_0, 
> startTime=1479500403427, containerId=container_1_2622_01_56, 
> nodeId=3n01:15001
> 2016-11-18 20:20:03,430 [INFO] [TaskCommunicator # 1] 
> |tezplugins.LlapTaskCommunicator|: Successfully launched task: 
> attempt_1478967587833_2622_1_06_31_0
> 2016-11-18 20:20:03,434 [INFO] [IPC Server handler 11 on 43092] 
> |impl.TaskImpl|: TaskAttempt:attempt_1478967587833_2622_1_06_31_0 sent 
> events: (0-1).
> 2016-11-18 20:20:03,434 [INFO] [IPC Server handler 11 on 43092] 
> |impl.VertexImpl|: Sending attempt_1478967587833_2622_1_06_31_0 24 events 
> [0,24) total 24 vertex_1478967587833_2622_1_06 [Map 1]
> 2016-11-18 20:25:43,249 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1478967587833_2622_1][Event:TASK_ATTEMPT_FINISHED]: 
> vertexName=Map 1, taskAttemptId=attempt_1478967587833_2622_1_06_31_0, 
> creationTime=1479500401929, allocationTime=1479500403426, 
> startTime=1479500403427, finishTime=1479500743249, timeTaken=339822, 
> status=FAILED, taskFailureType=NON_FATAL, errorEnum=TASK_HEARTBEAT_ERROR, 
> diagnostics=AttemptID:attempt_1478967587833_2622_1_06_31_0 Timed out 
> after 300 secs, nodeHttpAddress=http://3n01:15002, counters=Counters: 1, 
> org.apache.tez.common.counters.DAGCounter, DATA_LOCAL_TASKS=1
> 2016-11-18 20:25:43,255 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Processing de-allocate request for 
> task=attempt_1478967587833_2622_1_06_31_0, state=ASSIGNED, endReason=OTHER
> 2016-11-18 20:25:43,259 [INFO] [Dispatcher thread {Central}] 
> |node.AMNodeImpl|: Attempt failed on node: 3n01:15001 TA: 
> attempt_1478967587833_2622_1_06_31_0 failed: true container: 
> container_1_2622_01_56 numFailedTAs: 7
> 2016-11-18 20:25:43,262 [INFO] [Dispatcher thread {Central}] 
> |impl.VertexImpl|: Source task attempt completed for vertex: 
> vertex_1478967587833_2622_1_07 [Reducer 2] attempt: 
> attempt_1478967587833_2622_1_06_31_0 with state: FAILED vertexState: 
> RUNNING
> {noformat}
> Second attempt:
> {noformat}
> 2016-11-18 20:25:43,267 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Received allocateRequest. 
> task=attempt_1478967587833_2622_1_06_31_1, priority=64, 
> capability=memory:4096, vCores:1, hosts=null
> 2016-11-18 20:25:43,297 [INFO] [LlapScheduler] 
> 

[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703139#comment-15703139
 ] 

Hive QA commented on HIVE-15277:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840702/HIVE-15277.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2303/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2303/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2303/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-11-28 21:15:28.263
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-2303/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-11-28 21:15:28.265
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive

   1aebe9d..63bdfa6  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 1aebe9d HIVE-15168: Flaky test: 
TestSparkClient.testJobSubmission (still flaky) (Barna Zsombor Klara via Rui 
Li, reviewed by Xuefu Zhang and Rui Li)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 63bdfa6 HIVE-15284: Add junit test to test replication scenarios 
(Sushanth Sowmyan reviewed by Vaibhav Gumashta)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-11-28 21:15:29.546
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: patch -p0
patching file common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java
patching file common/src/java/org/apache/hadoop/hive/conf/Constants.java
patching file common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
patching file druid-handler/README.md
patching file druid-handler/pom.xml
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandlerUtils.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidOutputFormat.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidQueryBasedInputFormat.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidSplit.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidOutputFormat.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidQueryBasedInputFormat.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidRecordWriter.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/io/HiveDruidSplit.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidGroupByQueryRecordReader.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidQueryRecordReader.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSelectQueryRecordReader.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSerDe.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSerDeUtils.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidTimeseriesQueryRecordReader.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidTopNQueryRecordReader.java
patching file 
druid-handler/src/test/org/apache/hadoop/hive/druid/DruidStorageHandlerTest.java

[jira] [Commented] (HIVE-15296) AM may lose task failures and not reschedule when scheduling to LLAP

2016-11-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703167#comment-15703167
 ] 

Sergey Shelukhin commented on HIVE-15296:
-

[~prasanth_j] [~sseth] fyi

> AM may lose task failures and not reschedule when scheduling to LLAP
> 
>
> Key: HIVE-15296
> URL: https://issues.apache.org/jira/browse/HIVE-15296
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
>
> First attempt and failure detection:
> {noformat}
> 2016-11-18 20:20:01,980 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Received allocateRequest. 
> task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> capability=memory:4096, vCores:1, hosts=[3n01]
> 2016-11-18 20:20:01,982 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: Assigned task 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> startTime=0, containerId=null, assignedInstance=null, uniqueId=55, 
> localityDelayTimeout=9223372036854775807} to container 
> container_1_2622_01_56 on node=DynamicServiceInstance 
> [alive=true, host=3n01:15001 with resources=memory:59392, vCores:16, 
> shufflePort=15551, servicesAddress=http://3n01:15002, mgmtPort=15004]
> 2016-11-18 20:20:01,982 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: ScheduleResult for Task: 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> startTime=10550817928, containerId=container_1_2622_01_56, 
> assignedInstance=DynamicServiceInstance [alive=true, host=3n01:15001 with 
> resources=memory:59392, vCores:16, shufflePort=15551, 
> servicesAddress=http://3n01:15002, mgmtPort=15004], uniqueId=55, 
> localityDelayTimeout=9223372036854775807} = SCHEDULED
> 2016-11-18 20:20:03,427 [INFO] [Dispatcher thread {Central}] 
> |impl.TaskAttemptImpl|: TaskAttempt: 
> [attempt_1478967587833_2622_1_06_31_0] started. Is using containerId: 
> [container_1_2622_01_56] on NM: [3n01:15001]
> 2016-11-18 20:20:03,427 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1478967587833_2622_1][Event:TASK_ATTEMPT_STARTED]: 
> vertexName=Map 1, taskAttemptId=attempt_1478967587833_2622_1_06_31_0, 
> startTime=1479500403427, containerId=container_1_2622_01_56, 
> nodeId=3n01:15001
> 2016-11-18 20:20:03,430 [INFO] [TaskCommunicator # 1] 
> |tezplugins.LlapTaskCommunicator|: Successfully launched task: 
> attempt_1478967587833_2622_1_06_31_0
> 2016-11-18 20:20:03,434 [INFO] [IPC Server handler 11 on 43092] 
> |impl.TaskImpl|: TaskAttempt:attempt_1478967587833_2622_1_06_31_0 sent 
> events: (0-1).
> 2016-11-18 20:20:03,434 [INFO] [IPC Server handler 11 on 43092] 
> |impl.VertexImpl|: Sending attempt_1478967587833_2622_1_06_31_0 24 events 
> [0,24) total 24 vertex_1478967587833_2622_1_06 [Map 1]
> 2016-11-18 20:25:43,249 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1478967587833_2622_1][Event:TASK_ATTEMPT_FINISHED]: 
> vertexName=Map 1, taskAttemptId=attempt_1478967587833_2622_1_06_31_0, 
> creationTime=1479500401929, allocationTime=1479500403426, 
> startTime=1479500403427, finishTime=1479500743249, timeTaken=339822, 
> status=FAILED, taskFailureType=NON_FATAL, errorEnum=TASK_HEARTBEAT_ERROR, 
> diagnostics=AttemptID:attempt_1478967587833_2622_1_06_31_0 Timed out 
> after 300 secs, nodeHttpAddress=http://3n01:15002, counters=Counters: 1, 
> org.apache.tez.common.counters.DAGCounter, DATA_LOCAL_TASKS=1
> 2016-11-18 20:25:43,255 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Processing de-allocate request for 
> task=attempt_1478967587833_2622_1_06_31_0, state=ASSIGNED, endReason=OTHER
> 2016-11-18 20:25:43,259 [INFO] [Dispatcher thread {Central}] 
> |node.AMNodeImpl|: Attempt failed on node: 3n01:15001 TA: 
> attempt_1478967587833_2622_1_06_31_0 failed: true container: 
> container_1_2622_01_56 numFailedTAs: 7
> 2016-11-18 20:25:43,262 [INFO] [Dispatcher thread {Central}] 
> |impl.VertexImpl|: Source task attempt completed for vertex: 
> vertex_1478967587833_2622_1_07 [Reducer 2] attempt: 
> attempt_1478967587833_2622_1_06_31_0 with state: FAILED vertexState: 
> RUNNING
> {noformat}
> Second attempt:
> {noformat}
> 2016-11-18 20:25:43,267 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Received allocateRequest. 
> task=attempt_1478967587833_2622_1_06_31_1, priority=64, 
> capability=memory:4096, vCores:1, hosts=null
> 2016-11-18 20:25:43,297 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: ScheduleResult for Task: 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_1, 

[jira] [Commented] (HIVE-15293) add toString to OpTraits

2016-11-28 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703194#comment-15703194
 ] 

Ashutosh Chauhan commented on HIVE-15293:
-

+1

> add toString to OpTraits
> 
>
> Key: HIVE-15293
> URL: https://issues.apache.org/jira/browse/HIVE-15293
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Trivial
> Attachments: HIVE-15293.patch
>
>
> The traits logging is completely pointless right now
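
A tiny illustration of why the patch helps (field names are assumptions): 
without an overridden {{toString()}}, logging an {{OpTraits}} prints something 
like {{OpTraits@1a2b3c}}; overriding it makes the trait logging readable.
{code:java}
import java.util.List;

class OpTraitsSketch {
  private final int numBuckets;
  private final List<String> bucketColNames;

  OpTraitsSketch(int numBuckets, List<String> bucketColNames) {
    this.numBuckets = numBuckets;
    this.bucketColNames = bucketColNames;
  }

  @Override
  public String toString() {
    // Logged via LOG.debug("traits: {}", traits) -- now actually informative.
    return "OpTraits{numBuckets=" + numBuckets
        + ", bucketColNames=" + bucketColNames + "}";
  }
}
{code}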



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15280) Hive.mvFile() misses the "." char when joining the filename + extension

2016-11-28 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703202#comment-15703202
 ] 

Sahil Takiar commented on HIVE-15280:
-

+1

Just to clarify the {{Mockito.spy}} usage: the code is basically creating a 
LocalFileSystem object and then forcing the {{FileSystem.getURI()}} method to 
return {{hdfs://}}, in which case {{needToCopy}} will return true?
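
A rough sketch of that setup as I read it (host/port are illustrative; note 
the Hadoop method is {{FileSystem.getUri()}}):
{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.mockito.Mockito;

public class SpyFsSketch {
  // Returns a local FS that claims to live on HDFS, so the scheme comparison
  // in needToCopy() sees two different filesystems and returns true.
  public static FileSystem hdfsLookingLocalFs() throws Exception {
    FileSystem local = FileSystem.getLocal(new Configuration());
    FileSystem spy = Mockito.spy(local);
    Mockito.doReturn(URI.create("hdfs://localhost:8020")).when(spy).getUri();
    return spy;
  }
}
{code}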

> Hive.mvFile() misses the "." char when joining the filename + extension
> ---
>
> Key: HIVE-15280
> URL: https://issues.apache.org/jira/browse/HIVE-15280
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Critical
> Attachments: HIVE-15280.1.patch, HIVE-15280.2.patch, 
> HIVE-15280.3.patch
>
>
> Hive.mvFile() misses the "." char when joining the filename + extension. This 
> may cause incorrect results when compressed files are copied to a table 
> location.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-11-28 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703206#comment-15703206
 ] 

slim bouguerra commented on HIVE-15277:
---

This will fail until Druid 0.9.2 is released, but it can be reviewed.

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15277.2.patch, HIVE-15277.patch, file.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation Hive will generate Druid segment files and insert the 
> metadata to signal the handoff to Druid.
> The syntax will be as follows:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS <select `timecolumn` as `__time`, ..., `metric2`>;
> {code}
> This statement stores the results of the query in a Druid datasource named 
> 'datasourcename'. One of the columns of the query needs to be the time 
> dimension, which is mandatory in Druid. In particular, we use the same 
> convention that is used for Druid: there needs to be a column named '__time' 
> in the result of the executed query, which will act as the time dimension 
> column in Druid. Currently, the time dimension column needs to be a 
> 'timestamp' type column.
> Metrics can be of type long, double and float, while dimensions are strings. 
> Keep in mind that Druid has a clear separation between dimensions and 
> metrics; therefore, if you have a column in Hive that contains numbers and 
> needs to be presented as a dimension, use the cast operator to cast it as 
> string.
> This initial implementation interacts with the Druid metadata storage to 
> add/remove the table in Druid; users need to supply the metadata config as 
> --hiveconf hive.druid.metadata.password=XXX --hiveconf 
> hive.druid.metadata.username=druid --hiveconf 
> hive.druid.metadata.uri=jdbc:mysql://host/druid



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15114) Remove extra MoveTask operators from the ConditionalTask

2016-11-28 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702815#comment-15702815
 ] 

Sahil Takiar commented on HIVE-15114:
-

New tests LGTM.

> Remove extra MoveTask operators from the ConditionalTask
> 
>
> Key: HIVE-15114
> URL: https://issues.apache.org/jira/browse/HIVE-15114
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Sahil Takiar
>Assignee: Sergio Peña
> Attachments: HIVE-15114.3.patch, HIVE-15114.4.patch, 
> HIVE-15114.5.patch, HIVE-15114.WIP.1.patch, HIVE-15114.WIP.2.patch
>
>
> When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES 
> ...}}) an extraneous {{MoveTask}} is created.
> This is problematic when the scratch directory is on S3, since renames 
> require copying the entire dataset.
> For simple queries (like the one above), there are two MoveTasks. The first 
> one moves the output data from one file in the scratch directory to another 
> file in the scratch directory. The second MoveTask moves the data from the 
> scratch directory to its final table location.
> The first MoveTask should not be necessary. The goal of this JIRA is to 
> remove it. This should help improve performance when running on S3.
> It seems that the first Move might be caused by a dependency resolution 
> problem in the optimizer, where a dependent task doesn't get properly removed 
> when the task it depends on is filtered out by a condition resolver.
> A dummy {{MoveTask}} is added in the 
> {{GenMapRedUtils.createMRWorkForMergingFiles}} method. This method creates a 
> conditional task which launches a job to merge files at the end of the query. 
> At the end of the conditional job there is a MoveTask.
> Even though Hive decides that the conditional merge job is not needed, it 
> seems the MoveTask is still added to the plan.
> It seems this extra {{MoveTask}} may have been added intentionally. Not sure 
> why yet. The {{ConditionalResolverMergeFiles}} says that one of three tasks 
> will be returned: move task only, merge task only, or merge task followed by 
> a move task.
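
For reference, an illustrative-only sketch (none of these are Hive's real 
classes) of the three outcomes the {{ConditionalResolverMergeFiles}} contract 
described above allows:
{code:java}
enum MergeOutcome { MOVE_ONLY, MERGE_ONLY, MERGE_THEN_MOVE }

public class MergeResolverSketch {
  // Inputs are illustrative stand-ins for what the real resolver inspects
  // (whether small files need merging, and where the merge job writes).
  static MergeOutcome resolve(boolean needsMerge, boolean mergeWritesToFinal) {
    if (!needsMerge) {
      return MergeOutcome.MOVE_ONLY;       // just move output into place
    }
    return mergeWritesToFinal
        ? MergeOutcome.MERGE_ONLY          // merge job lands at final location
        : MergeOutcome.MERGE_THEN_MOVE;    // merge, then move into place
  }
}
{code}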



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15124) Fix OrcInputFormat to use reader's schema for include boolean array

2016-11-28 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-15124:
-
Attachment: HIVE-15124.patch

*Sigh* reverted previous change.

> Fix OrcInputFormat to use reader's schema for include boolean array
> ---
>
> Key: HIVE-15124
> URL: https://issues.apache.org/jira/browse/HIVE-15124
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 1.2.1
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-15124.patch, HIVE-15124.patch, HIVE-15124.patch, 
> HIVE-15124.patch
>
>
> Currently, the OrcInputFormat uses the file's schema rather than the reader's 
> schema. This means that SchemaEvolution fails with an 
> ArrayIndexOutOfBoundsException if a partition has a different schema than the 
> table.
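
A self-contained toy illustration of that failure mode (plain arrays, not 
ORC's actual classes): sizing the include array from the file schema and 
indexing it with the reader schema's column ids overflows once the table has 
more columns than an older partition file.
{code:java}
public class IncludeArraySketch {
  public static void main(String[] args) {
    // Older partition file has 3 columns; the table (reader) now has 5.
    boolean[] includeFromFileSchema = new boolean[3];
    int readerColumnId = 4;                          // a column added later
    try {
      includeFromFileSchema[readerColumnId] = true;  // what the bug does
    } catch (ArrayIndexOutOfBoundsException e) {
      System.out.println("Fails as described: " + e);
    }
    boolean[] includeFromReaderSchema = new boolean[5]; // fix: reader's size
    includeFromReaderSchema[readerColumnId] = true;     // works
  }
}
{code}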



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-11-28 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15277:
--
Attachment: HIVE-15277.2.patch

formatting 

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15277.2.patch, HIVE-15277.patch, file.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation Hive will generate Druid segment files and insert the 
> metadata to signal the handoff to Druid.
> The syntax will be as follows:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS <select `timecolumn` as `__time`, ..., `metric2`>;
> {code}
> This statement stores the results of the query in a Druid datasource named 
> 'datasourcename'. One of the columns of the query needs to be the time 
> dimension, which is mandatory in Druid. In particular, we use the same 
> convention that is used for Druid: there needs to be a column named '__time' 
> in the result of the executed query, which will act as the time dimension 
> column in Druid. Currently, the time dimension column needs to be a 
> 'timestamp' type column.
> Metrics can be of type long, double and float, while dimensions are strings. 
> Keep in mind that Druid has a clear separation between dimensions and 
> metrics; therefore, if you have a column in Hive that contains numbers and 
> needs to be presented as a dimension, use the cast operator to cast it as 
> string.
> This initial implementation interacts with the Druid metadata storage to 
> add/remove the table in Druid; users need to supply the metadata config as 
> --hiveconf hive.druid.metadata.password=XXX --hiveconf 
> hive.druid.metadata.username=druid --hiveconf 
> hive.druid.metadata.uri=jdbc:mysql://host/druid



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15076) Improve scalability of LDAP authentication provider group filter

2016-11-28 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702942#comment-15702942
 ] 

Illya Yalovyy commented on HIVE-15076:
--

[~ngangam]

Thank you for the feedback on this CR.

Here are some performance results (what I can share):
|| User member of # groups || GroupMembershipKeyFilter || UserMembershipKeyFilter ||
| 200 | 0.118 | 0.103 |
| 400 | 0.135 | 0.106 |
| 600 | 0.171 | 0.113 |
| 800 | 0.230 | 0.119 |
| 1000 | FAIL | 0.129 |

GroupMembershipKeyFilter fails with "javax.naming.SizeLimitExceededException: 
[LDAP: error code 4 - Sizelimit Exceeded]" when the number of groups is 
greater than 800. The particular number of groups at which the default 
implementation fails depends on the record size of each group, so in real 
production it will be much lower.

> Improve scalability of LDAP authentication provider group filter
> 
>
> Key: HIVE-15076
> URL: https://issues.apache.org/jira/browse/HIVE-15076
> Project: Hive
>  Issue Type: Improvement
>  Components: Authentication
>Affects Versions: 2.1.0
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
> Attachments: HIVE-15076.1.patch, HIVE-15076.2.patch
>
>
> The current implementation uses the following algorithm:
> #   For a given user, find all groups that the user is a member of. (A list 
> of LDAP groups is constructed as a result of that request.)
> #  Match this list of groups against the provided group filter.
>  
> The time/memory complexity of this approach is O(N) on the client side, 
> where N is the number of groups the user has membership in. On a large 
> directory (800+ groups per user) we can observe up to 2x performance 
> degradation and failures because of the size of the LDAP response (LDAP: 
> error code 4 - Sizelimit Exceeded).
>  
> Some Directory Services (Microsoft Active Directory, for instance) provide a 
> virtual attribute on the User Object that contains the list of groups the 
> user belongs to. This attribute can be used to quickly determine whether the 
> user passes or fails the group filter.
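
For illustration, a hedged JNDI sketch of the user-side check the description 
suggests (the attribute name and group DNs are assumptions; Active Directory 
exposes membership as "memberOf"):
{code:java}
import java.util.Set;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;

public class UserMembershipCheckSketch {
  // True if the user's own membership attribute lists any allowed group DN.
  public static boolean isMemberOfAny(DirContext ctx, String userDn,
      Set<String> allowedGroupDns) throws NamingException {
    Attributes attrs = ctx.getAttributes(userDn, new String[] {"memberOf"});
    Attribute memberOf = attrs.get("memberOf");
    if (memberOf == null) {
      return false;                 // user belongs to no groups
    }
    NamingEnumeration<?> groups = memberOf.getAll();
    while (groups.hasMore()) {
      if (allowedGroupDns.contains(String.valueOf(groups.next()))) {
        return true;                // one lookup, no per-group LDAP search
      }
    }
    return false;
  }
}
{code}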



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15273) Http Client not configured correctly

2016-11-28 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15273:
--
Attachment: HIVE-15273.patch

> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: 0001-adding-confing-to-http-client.patch, 
> HIVE-15273.patch
>
>
> The http client currently used by the druid-hive record reader is 
> constructed with default values. The default values of numConnection and 
> readTimeout are very small, which can lead to the following exception: 
> "ERROR [2ee34a2b-c8a5-4748-ab91-db3621d2aa5c main] CliDriver: Failed with 
> exception java.io.IOException:java.io.IOException: java.io.IOException: 
> org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Channel 
> disconnected"
> The full stack can be found here: 
> https://gist.github.com/b-slim/384ca6a96698f5b51ad9b171cff556a2
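
A hedged sketch of the direction of the fix, assuming the shaded com.metamx 
http-client builder API the Druid handler uses (method names and values here 
are assumptions, not necessarily what the patch does):
{code:java}
import com.metamx.common.lifecycle.Lifecycle;
import com.metamx.http.client.HttpClient;
import com.metamx.http.client.HttpClientConfig;
import com.metamx.http.client.HttpClientInit;
import org.joda.time.Duration;

public class DruidHttpClientSketch {
  public static HttpClient create(int numConnections, long readTimeoutMillis)
      throws Exception {
    Lifecycle lifecycle = new Lifecycle();
    HttpClientConfig config = HttpClientConfig.builder()
        .withNumConnections(numConnections)               // e.g. 20
        .withReadTimeout(new Duration(readTimeoutMillis)) // e.g. 5 minutes
        .build();
    HttpClient client = HttpClientInit.createClient(config, lifecycle);
    lifecycle.start();
    return client;
  }
}
{code}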



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15273) Http Client not configured correctly

2016-11-28 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702627#comment-15702627
 ] 

slim bouguerra commented on HIVE-15273:
---

[~leftylev] Thanks for the comments! I have uploaded a new patch.
[~jcamachorodriguez] Thanks for testing it; please check out the new patch.

> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: 0001-adding-confing-to-http-client.patch, 
> HIVE-15273.patch
>
>
> The http client currently used by the druid-hive record reader is 
> constructed with default values. The default values of numConnection and 
> readTimeout are very small, which can lead to the following exception: 
> "ERROR [2ee34a2b-c8a5-4748-ab91-db3621d2aa5c main] CliDriver: Failed with 
> exception java.io.IOException:java.io.IOException: java.io.IOException: 
> org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Channel 
> disconnected"
> The full stack can be found here: 
> https://gist.github.com/b-slim/384ca6a96698f5b51ad9b171cff556a2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15270) ExprNode/Sarg changes to support values supplied during query runtime

2016-11-28 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-15270:
--
Attachment: HIVE-15270.3.patch

Re-running same patch

> ExprNode/Sarg changes to support values supplied during query runtime
> -
>
> Key: HIVE-15270
> URL: https://issues.apache.org/jira/browse/HIVE-15270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-15270.1.patch, HIVE-15270.2.patch, 
> HIVE-15270.3.patch
>
>
> Infrastructure changes to support retrieval of query-runtime supplied values, 
> needed for dynamic min/max (HIVE-15269) and bloomfilter join optimizations.
> - Some concept of available runtime values that can be retrieved for a 
> MapWork/ReduceWork
> - ExprNode/Sarg changes to pass a Conf during initialization - this allows 
> the expression to retrieve the MapWork at query time (using 
> Utilities.getMapWork(Configuration)) to access runtime-supplied values.
> - Ability to populate the runtime values in Tez mode via incoming Tez edges



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15276) CLIs spell "substitution" as "subsitution" and "auxiliary" as "auxillary"

2016-11-28 Thread Grant Sohn (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Sohn updated HIVE-15276:
--
Attachment: HIVE-15276.5.patch

Was not using the "--no-prefix" option when generating the patch.

> CLIs spell "substitution" as "subsitution" and "auxiliary" as "auxillary"
> -
>
> Key: HIVE-15276
> URL: https://issues.apache.org/jira/browse/HIVE-15276
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 1.1.0
>Reporter: Grant Sohn
>Assignee: Grant Sohn
>Priority: Trivial
> Attachments: HIVE-15276.1.patch, HIVE-15276.2.patch, 
> HIVE-15276.3.patch, HIVE-15276.4.patch, HIVE-15276.5.patch
>
>
> Found some obvious spelling typos in the CLI help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15273) Http Client not configured correctly

2016-11-28 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702667#comment-15702667
 ] 

slim bouguerra commented on HIVE-15273:
---

added PR https://github.com/apache/hive/pull/119

> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: 0001-adding-confing-to-http-client.patch, 
> HIVE-15273.patch
>
>
> The http client currently used by the druid-hive record reader is 
> constructed with default values. The default values of numConnection and 
> readTimeout are very small, which can lead to the following exception: 
> "ERROR [2ee34a2b-c8a5-4748-ab91-db3621d2aa5c main] CliDriver: Failed with 
> exception java.io.IOException:java.io.IOException: java.io.IOException: 
> org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Channel 
> disconnected"
> The full stack can be found here: 
> https://gist.github.com/b-slim/384ca6a96698f5b51ad9b171cff556a2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15273) Http Client not configured correctly

2016-11-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702666#comment-15702666
 ] 

ASF GitHub Bot commented on HIVE-15273:
---

GitHub user b-slim opened a pull request:

https://github.com/apache/hive/pull/119

HIVE-15273 adding confing to http client



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/b-slim/hive fix_http_client

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/119.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #119


commit ae7c217f6937d8ec818a61df4a1579de0d11d36e
Author: Slim Bouguerra 
Date:   2016-11-23T22:49:07Z

adding confing to http client




> Http Client not configured correctly
> 
>
> Key: HIVE-15273
> URL: https://issues.apache.org/jira/browse/HIVE-15273
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Minor
> Attachments: 0001-adding-confing-to-http-client.patch, 
> HIVE-15273.patch
>
>
> The http client currently used by the druid-hive record reader is 
> constructed with default values. The default values of numConnection and 
> readTimeout are very small, which can lead to the following exception: 
> "ERROR [2ee34a2b-c8a5-4748-ab91-db3621d2aa5c main] CliDriver: Failed with 
> exception java.io.IOException:java.io.IOException: java.io.IOException: 
> org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Channel 
> disconnected"
> The full stack can be found here: 
> https://gist.github.com/b-slim/384ca6a96698f5b51ad9b171cff556a2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15114) Remove extra MoveTask operators from the ConditionalTask

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702763#comment-15702763
 ] 

Hive QA commented on HIVE-15114:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840685/HIVE-15114.5.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 10736 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] 
(batchId=91)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_join_without_localtask]
 (batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_sortmerge_join_3]
 (batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[date_join1] 
(batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby6_noskew] 
(batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[mapjoin_test_outer] 
(batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[merge2] (batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[multi_join_union] 
(batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_11] 
(batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[timestamp_2] 
(batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_9] 
(batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2299/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2299/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2299/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840685 - PreCommit-HIVE-Build

> Remove extra MoveTask operators from the ConditionalTask
> 
>
> Key: HIVE-15114
> URL: https://issues.apache.org/jira/browse/HIVE-15114
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Sahil Takiar
>Assignee: Sergio Peña
> Attachments: HIVE-15114.3.patch, HIVE-15114.4.patch, 
> HIVE-15114.5.patch, HIVE-15114.WIP.1.patch, HIVE-15114.WIP.2.patch
>
>
> When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES 
> ...}}) an extraneous {{MoveTask}} is created.
> This is problematic when the scratch directory is on S3, since renames 
> require copying the entire dataset.
> For simple queries (like the one above), there are two MoveTasks. The first 
> one moves the output data from one file in the scratch directory to another 
> file in the scratch directory. The second MoveTask moves the data from the 
> scratch directory to its final table location.
> The first MoveTask should not be necessary. The goal of this JIRA is to 
> remove it. This should help improve performance when running on S3.
> It seems that the first Move might be caused by a dependency resolution 
> problem in the optimizer, where a dependent task doesn't get properly removed 
> when the task it depends on is filtered out by a condition resolver.
> A dummy {{MoveTask}} is added in the 
> {{GenMapRedUtils.createMRWorkForMergingFiles}} method. This method creates a 
> conditional task which launches a job to merge files at the end of the query. 
> At the end of the conditional job there is a MoveTask.
> Even though Hive decides that the conditional merge job is not needed, it 
> seems the MoveTask is still added to the plan.
> It seems this extra {{MoveTask}} may have been added intentionally. Not sure 
> why yet. The {{ConditionalResolverMergeFiles}} says that one of three tasks 
> will be returned: move task only, merge task only, or merge task followed by 
> a move task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15276) CLIs spell "substitution" as "subsitution" and "auxiliary" as "auxillary"

2016-11-28 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702764#comment-15702764
 ] 

Alan Gates commented on HIVE-15276:
---

+1, this patch looks good. You'll need to regenerate the patch against master 
as it doesn't currently apply cleanly, which is why the tests are failing.  
Once we've had a clean test run the patch can be committed.

> CLIs spell "substitution" as "subsitution" and "auxiliary" as "auxillary"
> -
>
> Key: HIVE-15276
> URL: https://issues.apache.org/jira/browse/HIVE-15276
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 1.1.0
>Reporter: Grant Sohn
>Assignee: Grant Sohn
>Priority: Trivial
> Attachments: HIVE-15276.1.patch, HIVE-15276.2.patch, 
> HIVE-15276.3.patch, HIVE-15276.4.patch
>
>
> Found some obvious spelling typos in the CLI help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-11-28 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15277:
--
Attachment: HIVE-15277.patch

> Teach Hive how to create/delete Druid segments 
> ---
>
> Key: HIVE-15277
> URL: https://issues.apache.org/jira/browse/HIVE-15277
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
> Attachments: HIVE-15277.patch, file.patch
>
>
> We want to extend the DruidStorageHandler to support CTAS queries.
> In this implementation Hive will generate Druid segment files and insert the 
> metadata to signal the handoff to Druid.
> The syntax will be as follows:
> {code:sql}
> CREATE TABLE druid_table_1
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> TBLPROPERTIES ("druid.datasource" = "datasourcename")
> AS <select query>;
> {code}
> This statement stores the results of the <select query> in a Druid 
> datasource named 'datasourcename'. One of the columns of the query needs to 
> be the time dimension, which is mandatory in Druid. In particular, we use the 
> same convention that is used for Druid: there needs to be a column 
> named '__time' in the result of the executed query, which will act as the 
> time dimension column in Druid. Currently, the time dimension column needs to 
> be a 'timestamp' type column.
> Metrics can be of type long, double, or float, while dimensions are strings. 
> Keep in mind that Druid has a clear separation between dimensions and 
> metrics, so if you have a column in Hive that contains numbers and needs to 
> be presented as a dimension, use the cast operator to cast it to a string. 
> This initial implementation interacts with the Druid metadata storage to 
> add/remove the table in Druid; the user needs to supply the metadata config as 
> --hiveconf hive.druid.metadata.password=XXX --hiveconf 
> hive.druid.metadata.username=druid --hiveconf 
> hive.druid.metadata.uri=jdbc:mysql://host/druid
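
For illustration, a minimal CTAS sketch along these lines, assuming a 
hypothetical source table {{page_views}} with a timestamp column and a numeric 
column that should act as a Druid dimension (the cast to string follows the 
convention described above):

{code:sql}
-- Hypothetical example: ts becomes the mandatory Druid time column, and the
-- numeric status code is cast to string so it is treated as a dimension.
CREATE TABLE druid_page_views
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "page_views_ds")
AS
SELECT
  CAST(ts AS timestamp) AS `__time`,            -- time dimension column
  browser,                                      -- string column: a dimension
  CAST(status_code AS string) AS status_code,   -- numeric cast to a dimension
  bytes_sent                                    -- numeric column: a metric
FROM page_views;
{code}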



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15292) set command without v dumps too many settings

2016-11-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15292:

Summary: set command without v dumps too many settings  (was: set command 
without v seems to dump too many settings)

> set command without v dumps too many settings
> -
>
> Key: HIVE-15292
> URL: https://issues.apache.org/jira/browse/HIVE-15292
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> I think the problem might be the "defaults" map that we automatically add to 
> the config. We need to add an option to dump only the settings that differ from 
> Hive defaults, or change the default behavior to that, since that seems to have 
> been the intended effect.
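
As a rough illustration (hedged: the exact output varies by build and site 
configuration), the relevant CLI commands already exist; the issue is the 
volume of what the bare {{set}} dumps:

{code:sql}
-- All three are standard Hive CLI commands; the comments are indicative only.
set;                         -- currently dumps the full resolved configuration
set -v;                      -- additionally includes Hadoop and other settings
set hive.execution.engine;   -- querying a single property stays manageable
{code}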



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15293) add toString to OpTraits

2016-11-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15293:

Priority: Trivial  (was: Major)

> add toString to OpTraits
> 
>
> Key: HIVE-15293
> URL: https://issues.apache.org/jira/browse/HIVE-15293
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Trivial
>
> The traits logging is completely pointless right now



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15280) Hive.mvFile() misses the "." char when joining the filename + extension

2016-11-28 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-15280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702634#comment-15702634
 ] 

Sergio Peña commented on HIVE-15280:


I think the .q test case is not useful either because it shows the timestamp 
when the files were written. Forget about the 2nd patch.

> Hive.mvFile() misses the "." char when joining the filename + extension
> ---
>
> Key: HIVE-15280
> URL: https://issues.apache.org/jira/browse/HIVE-15280
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Critical
> Attachments: HIVE-15280.1.patch, HIVE-15280.2.patch
>
>
> Hive.mvFile() misses the "." char when joining the filename + extension. This 
> may cause incorrect results when compressed files are copied to a table 
> location.
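
To make the failure mode concrete, here is a hypothetical sketch (settings and 
file names are illustrative, not taken from the patch): with compressed output 
enabled, result files carry an extension such as {{.gz}}, and dropping the "." 
when the name and extension are re-joined would turn {{000000_0.gz}} into 
{{000000_0gz}}:

{code:sql}
-- Hypothetical repro sketch: files written into the table location carry an
-- extension (e.g. 000000_0.gz); if the "." is lost when re-joining the name
-- and extension, readers may no longer recognize the codec, which can yield
-- incorrect results.
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec;
CREATE TABLE dst (s string);
INSERT INTO dst VALUES ('a'), ('b');
{code}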



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15280) Hive.mvFile() misses the "." char when joining the filename + extension

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702653#comment-15702653
 ] 

Hive QA commented on HIVE-15280:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840669/HIVE-15280.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10733 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] 
(batchId=90)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2298/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2298/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2298/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840669 - PreCommit-HIVE-Build

> Hive.mvFile() misses the "." char when joining the filename + extension
> ---
>
> Key: HIVE-15280
> URL: https://issues.apache.org/jira/browse/HIVE-15280
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Critical
> Attachments: HIVE-15280.1.patch, HIVE-15280.2.patch
>
>
> Hive.mvFile() misses the "." char when joining the filename + extension. This 
> may cause incorrect results when compressed files are copied to a table 
> location.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15114) Remove extra MoveTask operators from the ConditionalTask

2016-11-28 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-15114:
---
Attachment: HIVE-15114.5.patch

[~stakiar] [~aihuaxu] I added a couple of test cases to verify dynamic 
partitions are working correctly. 

> Remove extra MoveTask operators from the ConditionalTask
> 
>
> Key: HIVE-15114
> URL: https://issues.apache.org/jira/browse/HIVE-15114
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Sahil Takiar
>Assignee: Sergio Peña
> Attachments: HIVE-15114.3.patch, HIVE-15114.4.patch, 
> HIVE-15114.5.patch, HIVE-15114.WIP.1.patch, HIVE-15114.WIP.2.patch
>
>
> When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES 
> ...}}), an extraneous {{MoveTask}} is created.
> This is problematic when the scratch directory is on S3, since renames require 
> copying the entire dataset.
> For simple queries (like the one above), there are two MoveTasks. The first 
> one moves the output data from one file in the scratch directory to another 
> file in the scratch directory. The second MoveTask moves the data from the 
> scratch directory to its final table location.
> The first MoveTask should not be necessary. The goal of this JIRA is to 
> remove it. This should help improve performance when running on S3.
> It seems that the first Move might be caused by a dependency resolution 
> problem in the optimizer, where a dependent task doesn't get properly removed 
> when the task it depends on is filtered out by a condition resolver.
> A dummy {{MoveTask}} is added in the 
> {{GenMapRedUtils.createMRWorkForMergingFiles}} method. This method creates a 
> conditional task that launches a job to merge the output files. At the end of 
> the conditional job there is a MoveTask.
> Even though Hive decides that the conditional merge job is not needed, it 
> seems the MoveTask is still added to the plan.
> Seems this extra {{MoveTask}} may have been added intentionally. Not sure why 
> yet. The {{ConditionalResolverMergeFiles}} says that one of three tasks will 
> be returned: move task only, merge task only, or merge task followed by a move 
> task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15280) Hive.mvFile() misses the "." char when joining the filename + extension

2016-11-28 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702650#comment-15702650
 ] 

Sahil Takiar commented on HIVE-15280:
-

[~spena] in that case, it may be easier to just add unit tests for these 
methods. The unit tests could just be added for {{copyFiles(HiveConf conf, Path 
srcf, Path destf, FileSystem fs, boolean isSrcLocal, boolean isAcid, List<Path> 
newFiles)}}, which is already package-protected.

> Hive.mvFile() misses the "." char when joining the filename + extension
> ---
>
> Key: HIVE-15280
> URL: https://issues.apache.org/jira/browse/HIVE-15280
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Critical
> Attachments: HIVE-15280.1.patch, HIVE-15280.2.patch
>
>
> Hive.mvFile() misses the "." char when joining the filename + extension. This 
> may cause incorrect results when compressed files are copied to a table 
> location.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15279) map join dummy operators are not set up correctly in certain cases with merge join

2016-11-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702910#comment-15702910
 ] 

Sergey Shelukhin commented on HIVE-15279:
-

We can probably also initialize those.

> map join dummy operators are not set up correctly in certain cases with merge 
> join
> --
>
> Key: HIVE-15279
> URL: https://issues.apache.org/jira/browse/HIVE-15279
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15279.patch
>
>
> As a result, MapJoin is not initialized and there's NPE later.
> Tez-specific.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15240) Updating/Altering stats in metastore can be expensive in S3

2016-11-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702924#comment-15702924
 ] 

Sergey Shelukhin commented on HIVE-15240:
-

+1

> Updating/Altering stats in metastore can be expensive in S3
> ---
>
> Key: HIVE-15240
> URL: https://issues.apache.org/jira/browse/HIVE-15240
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-15240.1.patch, HIVE-15240.2.patch, 
> HIVE-15240.3.patch, HIVE-15240.5.patch, HIVE-15240.6.patch
>
>
> https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java#L630
> https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java#L367
> If there are 100 partitions, it iterates over every partition to determine its 
> location, which takes up a significant amount of time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15277) Teach Hive how to create/delete Druid segments

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703282#comment-15703282
 ] 

Hive QA commented on HIVE-15277:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840702/HIVE-15277.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2305/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2305/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2305/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-11-28 22:02:14.114
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-2305/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-11-28 22:02:14.117
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   abab282..78ab72e  branch-2.1 -> origin/branch-2.1
+ git reset --hard HEAD
HEAD is now at 63bdfa6 HIVE-15284: Add junit test to test replication scenarios 
(Sushanth Sowmyan reviewed by Vaibhav Gumashta)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 63bdfa6 HIVE-15284: Add junit test to test replication scenarios 
(Sushanth Sowmyan reviewed by Vaibhav Gumashta)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-11-28 22:02:16.126
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: patch -p0
patching file common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java
patching file common/src/java/org/apache/hadoop/hive/conf/Constants.java
patching file common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
patching file druid-handler/README.md
patching file druid-handler/pom.xml
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandlerUtils.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidOutputFormat.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidQueryBasedInputFormat.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/HiveDruidSplit.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidOutputFormat.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidQueryBasedInputFormat.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/io/DruidRecordWriter.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/io/HiveDruidSplit.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidGroupByQueryRecordReader.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidQueryRecordReader.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSelectQueryRecordReader.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSerDe.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidSerDeUtils.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidTimeseriesQueryRecordReader.java
patching file 
druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidTopNQueryRecordReader.java
patching file 
druid-handler/src/test/org/apache/hadoop/hive/druid/DruidStorageHandlerTest.java
patching file 
druid-handler/src/test/org/apache/hadoop/hive/druid/QTestDruidSerDe.java
patching file 

[jira] [Updated] (HIVE-15074) Schematool provides a way to detect invalid entries in VERSION table

2016-11-28 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15074:
---
Status: Patch Available  (was: Open)

> Schematool provides a way to detect invalid entries in VERSION table
> 
>
> Key: HIVE-15074
> URL: https://issues.apache.org/jira/browse/HIVE-15074
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Yongzhi Chen
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15074.patch
>
>
> For some unknown reason, we have seen that a customer's HMS cannot start because 
> there are multiple entries in their HMS VERSION table. Schematool should provide 
> a way to validate the HMS db and provide warning and fix options for this kind 
> of issue. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15114) Remove extra MoveTask operators from the ConditionalTask

2016-11-28 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-15114:
---
Attachment: HIVE-15114.6.patch

Here's a new patch with many more test cases to validate each scenario of:
INSERT INTO TABLE WITH/WITHOUT DYNAMIC PARTITIONS
INSERT OVERWRITE TABLE WITH/WITHOUT DYNAMIC PARTITIONS
INSERT OVERWRITE DIRECTORY
FROM ... INSERT OVERWRITE DIRECTORY

> Remove extra MoveTask operators from the ConditionalTask
> 
>
> Key: HIVE-15114
> URL: https://issues.apache.org/jira/browse/HIVE-15114
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Sahil Takiar
>Assignee: Sergio Peña
> Attachments: HIVE-15114.3.patch, HIVE-15114.4.patch, 
> HIVE-15114.5.patch, HIVE-15114.6.patch, HIVE-15114.WIP.1.patch, 
> HIVE-15114.WIP.2.patch
>
>
> When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES 
> ...}}), an extraneous {{MoveTask}} is created.
> This is problematic when the scratch directory is on S3, since renames require 
> copying the entire dataset.
> For simple queries (like the one above), there are two MoveTasks. The first 
> one moves the output data from one file in the scratch directory to another 
> file in the scratch directory. The second MoveTask moves the data from the 
> scratch directory to its final table location.
> The first MoveTask should not be necessary. The goal of this JIRA is to 
> remove it. This should help improve performance when running on S3.
> It seems that the first Move might be caused by a dependency resolution 
> problem in the optimizer, where a dependent task doesn't get properly removed 
> when the task it depends on is filtered out by a condition resolver.
> A dummy {{MoveTask}} is added in the 
> {{GenMapRedUtils.createMRWorkForMergingFiles}} method. This method creates a 
> conditional task that launches a job to merge the output files. At the end of 
> the conditional job there is a MoveTask.
> Even though Hive decides that the conditional merge job is not needed, it 
> seems the MoveTask is still added to the plan.
> Seems this extra {{MoveTask}} may have been added intentionally. Not sure why 
> yet. The {{ConditionalResolverMergeFiles}} says that one of three tasks will 
> be returned: move task only, merge task only, or merge task followed by a move 
> task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15280) Hive.mvFile() misses the "." char when joining the filename + extension

2016-11-28 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-15280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703213#comment-15703213
 ] 

Sergio Peña commented on HIVE-15280:


Correct, that's why I did it that way. I did not know how to spy on 
Hive.needCopy(), as it is a static method.

> Hive.mvFile() misses the "." char when joining the filename + extension
> ---
>
> Key: HIVE-15280
> URL: https://issues.apache.org/jira/browse/HIVE-15280
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Critical
> Attachments: HIVE-15280.1.patch, HIVE-15280.2.patch, 
> HIVE-15280.3.patch
>
>
> Hive.mvFile() misses the "." char when joining the filename + extension. This 
> may cause incorrect results when compressed files are copied to a table 
> location.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15297) Hive should not split semicolon within quoted string literals

2016-11-28 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15297:
---
Attachment: HIVE-15297.01.patch

> Hive should not split semicolon within quoted string literals
> -
>
> Key: HIVE-15297
> URL: https://issues.apache.org/jira/browse/HIVE-15297
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15297.01.patch
>
>
> String literals in a query cannot contain reserved symbols. The same set of 
> queries works fine in MySQL and PostgreSQL. 
> {code}
> hive> CREATE TABLE ts(s varchar(550));
> OK
> Time taken: 0.075 seconds
> hive> INSERT INTO ts VALUES ('Mozilla/5.0 (iPhone; CPU iPhone OS 5_0');
> MismatchedTokenException(14!=326)
>   at 
> org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617)
>   at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valueRowConstructor(HiveParser_FromClauseParser.java:7271)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesTableConstructor(HiveParser_FromClauseParser.java:7370)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesClause(HiveParser_FromClauseParser.java:7510)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.valuesClause(HiveParser.java:51854)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:45432)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:44578)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:8)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1694)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1176)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:402)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:326)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1169)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1288)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1095)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1083)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> FAILED: ParseException line 1:31 mismatched input '/' expecting ) near 
> 'Mozilla' in value row constructor
> hive>
> {code}
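
Until this is fixed, a commonly used workaround (hedged: this relies on 
standard Hive CLI escaping and is not part of this patch) is to escape the 
semicolon inside the literal so the CLI does not split the statement there:

{code:sql}
-- Hypothetical workaround sketch: the backslash keeps the CLI from treating
-- the semicolon as a statement terminator inside the quoted literal.
INSERT INTO ts VALUES ('Mozilla/5.0 (iPhone\; CPU iPhone OS 5_0');
{code}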



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15297) Hive should not split semicolon within quoted string literals

2016-11-28 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15297:
---
Status: Patch Available  (was: Open)

> Hive should not split semicolon within quoted string literals
> -
>
> Key: HIVE-15297
> URL: https://issues.apache.org/jira/browse/HIVE-15297
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15297.01.patch
>
>
> String literals in a query cannot contain reserved symbols. The same set of 
> queries works fine in MySQL and PostgreSQL. 
> {code}
> hive> CREATE TABLE ts(s varchar(550));
> OK
> Time taken: 0.075 seconds
> hive> INSERT INTO ts VALUES ('Mozilla/5.0 (iPhone; CPU iPhone OS 5_0');
> MismatchedTokenException(14!=326)
>   at 
> org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617)
>   at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valueRowConstructor(HiveParser_FromClauseParser.java:7271)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesTableConstructor(HiveParser_FromClauseParser.java:7370)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesClause(HiveParser_FromClauseParser.java:7510)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.valuesClause(HiveParser.java:51854)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:45432)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:44578)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:8)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1694)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1176)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:402)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:326)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1169)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1288)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1095)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1083)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> FAILED: ParseException line 1:31 mismatched input '/' expecting ) near 
> 'Mozilla' in value row constructor
> hive>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15124) Fix OrcInputFormat to use reader's schema for include boolean array

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703273#comment-15703273
 ] 

Hive QA commented on HIVE-15124:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840697/HIVE-15124.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10736 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable (batchId=213)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2304/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2304/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2304/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840697 - PreCommit-HIVE-Build

> Fix OrcInputFormat to use reader's schema for include boolean array
> ---
>
> Key: HIVE-15124
> URL: https://issues.apache.org/jira/browse/HIVE-15124
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 1.2.1
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-15124.patch, HIVE-15124.patch, HIVE-15124.patch, 
> HIVE-15124.patch
>
>
> Currently, the OrcInputFormat uses the file's schema rather than the reader's 
> schema. This means that SchemaEvolution fails with an 
> ArrayIndexOutOfBoundsException if a partition has a different schema than the 
> table.
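
A hypothetical repro sketch of the "partition schema differs from table 
schema" case (table and column names are illustrative placeholders):

{code:sql}
-- Hypothetical sketch: the old partition's ORC files lack column b, so the
-- reader's schema (which includes b) differs from the file schema when the
-- partition is read back.
CREATE TABLE t (a int) PARTITIONED BY (p string) STORED AS ORC;
INSERT INTO t PARTITION (p = '1') VALUES (1);
ALTER TABLE t ADD COLUMNS (b string);
SELECT a, b FROM t WHERE p = '1';
{code}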



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15074) Schematool provides a way to detect invalid entries in VERSION table

2016-11-28 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15074:
---
Attachment: (was: HIVE-15074.patch)

> Schematool provides a way to detect invalid entries in VERSION table
> 
>
> Key: HIVE-15074
> URL: https://issues.apache.org/jira/browse/HIVE-15074
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Yongzhi Chen
>Assignee: Chaoyu Tang
>Priority: Minor
>
> For some unknown reason, we have seen that a customer's HMS cannot start because 
> there are multiple entries in their HMS VERSION table. Schematool should provide 
> a way to validate the HMS db and provide warning and fix options for this kind 
> of issue. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15192) Use Calcite to de-correlate and plan subqueries

2016-11-28 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15192:
---
Status: Open  (was: Patch Available)

> Use Calcite to de-correlate and plan subqueries
> ---
>
> Key: HIVE-15192
> URL: https://issues.apache.org/jira/browse/HIVE-15192
> Project: Hive
>  Issue Type: Task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15192.2.patch, HIVE-15192.patch
>
>
> Currently, support for subqueries is limited [Link to original spec | 
> https://issues.apache.org/jira/secure/attachment/12614003/SubQuerySpec.pdf].
> Using Calcite to plan and de-correlate subqueries will help Hive get rid of 
> these limitations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15192) Use Calcite to de-correlate and plan subqueries

2016-11-28 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15192:
---
Status: Patch Available  (was: Open)

> Use Calcite to de-correlate and plan subqueries
> ---
>
> Key: HIVE-15192
> URL: https://issues.apache.org/jira/browse/HIVE-15192
> Project: Hive
>  Issue Type: Task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15192.2.patch, HIVE-15192.patch
>
>
> Currently, support for subqueries is limited [Link to original spec | 
> https://issues.apache.org/jira/secure/attachment/12614003/SubQuerySpec.pdf].
> Using Calcite to plan and de-correlate subqueries will help Hive get rid of 
> these limitations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15192) Use Calcite to de-correlate and plan subqueries

2016-11-28 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15192:
---
Attachment: HIVE-15192.2.patch

> Use Calcite to de-correlate and plan subqueries
> ---
>
> Key: HIVE-15192
> URL: https://issues.apache.org/jira/browse/HIVE-15192
> Project: Hive
>  Issue Type: Task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15192.2.patch, HIVE-15192.patch
>
>
> Currently, support for subqueries is limited [Link to original spec | 
> https://issues.apache.org/jira/secure/attachment/12614003/SubQuerySpec.pdf].
> Using Calcite to plan and de-correlate subqueries will help Hive get rid of 
> these limitations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15192) Use Calcite to de-correlate and plan subqueries

2016-11-28 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15192:
---
Attachment: (was: HIVE-15192.2.patch)

> Use Calcite to de-correlate and plan subqueries
> ---
>
> Key: HIVE-15192
> URL: https://issues.apache.org/jira/browse/HIVE-15192
> Project: Hive
>  Issue Type: Task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15192.2.patch, HIVE-15192.patch
>
>
> Currently, support for subqueries is limited [Link to original spec | 
> https://issues.apache.org/jira/secure/attachment/12614003/SubQuerySpec.pdf].
> Using Calcite to plan and de-correlate subqueries will help Hive get rid of 
> these limitations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15272) "LEFT OUTER JOIN" Is not populating correct records with Hive On Spark

2016-11-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703275#comment-15703275
 ] 

Xuefu Zhang commented on HIVE-15272:


[~VPareek], would you mind providing a repro case (data, DDL, and query)? Thanks.

> "LEFT OUTER JOIN" Is not populating correct records with Hive On Spark
> --
>
> Key: HIVE-15272
> URL: https://issues.apache.org/jira/browse/HIVE-15272
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Spark
>Affects Versions: 1.1.0
> Environment: Hive 1.1.0, CentOS, Cloudera 5.7.4
>Reporter: Vikash Pareek
>
> I ran the following Hive query multiple times with the execution engine set to 
> Hive on Spark and then to Hive on MapReduce.
> {code}
> SELECT COUNT(DISTINCT t1.region, t1.amount)
> FROM my_db.my_table1 t1
> LEFT OUTER
> JOIN my_db.my_table2 t2 ON (t1.id = t2.id
> AND t1.name = t2.name)
> {code}
> With Hive on Spark: the result (count) was different on every execution.
> With Hive on MapReduce: the result (count) was the same on every execution.
> Seems like Hive on Spark behaves differently on each execution and does not 
> produce the correct result.
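
A minimal self-contained repro skeleton along the lines requested above might 
look like this (table definitions and data are hypothetical placeholders):

{code:sql}
-- Hypothetical repro skeleton: run the final query several times under
-- hive.execution.engine=spark and under =mr, then compare the counts.
CREATE TABLE my_table1 (id int, name string, region string, amount double);
CREATE TABLE my_table2 (id int, name string);
INSERT INTO my_table1 VALUES (1, 'a', 'east', 10.0), (2, 'b', 'west', 20.0);
INSERT INTO my_table2 VALUES (1, 'a');
SET hive.execution.engine=spark;
SELECT COUNT(DISTINCT t1.region, t1.amount)
FROM my_table1 t1
LEFT OUTER JOIN my_table2 t2 ON (t1.id = t2.id AND t1.name = t2.name);
{code}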



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15074) Schematool provides a way to detect invalid entries in VERSION table

2016-11-28 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15074:
---
Attachment: HIVE-15074.patch

[~aihuaxu], [~ychena], and [~ngangam], could you help review the patch? 
Thanks.

> Schematool provides a way to detect invalid entries in VERSION table
> 
>
> Key: HIVE-15074
> URL: https://issues.apache.org/jira/browse/HIVE-15074
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Yongzhi Chen
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15074.patch
>
>
> For some unknown reason, we have seen that a customer's HMS cannot start because 
> there are multiple entries in their HMS VERSION table. Schematool should provide 
> a way to validate the HMS db and provide warning and fix options for this kind 
> of issue. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15279) map join dummy operators are not set up correctly in certain cases with merge join

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703496#comment-15703496
 ] 

Hive QA commented on HIVE-15279:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840713/HIVE-15279.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 38 failed/errored test(s), 10174 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapCliDriver
 (batchId=131)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapCliDriver
 (batchId=132)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapCliDriver
 (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapCliDriver
 (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapCliDriver
 (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=136)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=137)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=138)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
 (batchId=90)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
 (batchId=91)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query39] 
(batchId=219)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=95)
org.apache.hadoop.hive.ql.TestAcidOnTez.testMapJoinOnTez (batchId=199)
org.apache.hadoop.hive.ql.TestAcidOnTez.testMergeJoinOnTez (batchId=199)
org.apache.hadoop.hive.ql.TestAcidOnTezWithSplitUpdate.testMapJoinOnTez 
(batchId=203)
org.apache.hadoop.hive.ql.TestAcidOnTezWithSplitUpdate.testMergeJoinOnTez 
(batchId=203)
org.apache.hive.service.cli.operation.TestOperationLoggingAPIWithTez.testFetchResultsOfLogWithExecutionMode
 (batchId=208)
org.apache.hive.service.cli.operation.TestOperationLoggingAPIWithTez.testFetchResultsOfLogWithNoneMode
 (batchId=208)
org.apache.hive.service.cli.operation.TestOperationLoggingAPIWithTez.testFetchResultsOfLogWithPerformanceMode
 (batchId=208)
org.apache.hive.service.cli.operation.TestOperationLoggingAPIWithTez.testFetchResultsOfLogWithVerboseMode
 (batchId=208)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2307/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2307/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2307/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase

[jira] [Updated] (HIVE-15074) Schematool provides a way to detect invalid entries in VERSION table

2016-11-28 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15074:
---
Attachment: HIVE-15074.patch

> Schematool provides a way to detect invalid entries in VERSION table
> 
>
> Key: HIVE-15074
> URL: https://issues.apache.org/jira/browse/HIVE-15074
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Yongzhi Chen
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15074.patch
>
>
> For some unknown reason, we have seen that a customer's HMS cannot start because 
> there are multiple entries in their HMS VERSION table. Schematool should provide 
> a way to validate the HMS db and provide warning and fix options for this kind 
> of issue. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15074) Schematool provides a way to detect invalid entries in VERSION table

2016-11-28 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703362#comment-15703362
 ] 

Naveen Gangam commented on HIVE-15074:
--

Functionally, it looks good to me. 

Just for semantics, to make it consistent with the other verify* calls, do we want 
to make it return a boolean to indicate success/failure instead of throwing an 
exception? Maybe overload {{verifySchemaVersion()}} with a variant that catches 
the exception and returns a boolean? I will defer to [~aihuaxu] for the final call 
on this.

> Schematool provides a way to detect invalid entries in VERSION table
> 
>
> Key: HIVE-15074
> URL: https://issues.apache.org/jira/browse/HIVE-15074
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Yongzhi Chen
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15074.patch
>
>
> For some unknown reason, we have seen that a customer's HMS cannot start because 
> there are multiple entries in their HMS VERSION table. Schematool should provide 
> a way to validate the HMS db and provide warning and fix options for this kind 
> of issue. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15293) add toString to OpTraits

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703392#comment-15703392
 ] 

Hive QA commented on HIVE-15293:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840708/HIVE-15293.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10734 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=91)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2306/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2306/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2306/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840708 - PreCommit-HIVE-Build

> add toString to OpTraits
> 
>
> Key: HIVE-15293
> URL: https://issues.apache.org/jira/browse/HIVE-15293
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Trivial
> Attachments: HIVE-15293.patch
>
>
> The traits logging is completely pointless right now



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15270) ExprNode/Sarg changes to support values supplied during query runtime

2016-11-28 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703437#comment-15703437
 ] 

Jason Dere commented on HIVE-15270:
---

The failures are not related (I created HIVE-15298 for the sample* failures).
[~ashutoshc] [~prasanth_j], can you take a look?

> ExprNode/Sarg changes to support values supplied during query runtime
> -
>
> Key: HIVE-15270
> URL: https://issues.apache.org/jira/browse/HIVE-15270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-15270.1.patch, HIVE-15270.2.patch, 
> HIVE-15270.3.patch
>
>
> Infrastructure changes to support retrieval of query-runtime supplied values, 
> needed for dynamic min/max (HIVE-15269) and bloomfilter join optimizations.
> - Some concept of available runtime values that can be retrieved for a 
> MapWork/ReduceWork
> - ExprNode/Sarg changes to pass a Conf during initialization - this allows 
> the expression to retrieve the MapWork at query time (using 
> Utilities.getMapWork(Configuration)) to access runtime-supplied values.
> - Ability to populate the runtime values in Tez mode via incoming Tez edges



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15114) Remove extra MoveTask operators from the ConditionalTask

2016-11-28 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703657#comment-15703657
 ] 

Sahil Takiar commented on HIVE-15114:
-

In general, new tests LGTM. Should we split the tests for dynamic partitions 
into their own .q files?

> Remove extra MoveTask operators from the ConditionalTask
> 
>
> Key: HIVE-15114
> URL: https://issues.apache.org/jira/browse/HIVE-15114
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Sahil Takiar
>Assignee: Sergio Peña
> Attachments: HIVE-15114.3.patch, HIVE-15114.4.patch, 
> HIVE-15114.5.patch, HIVE-15114.6.patch, HIVE-15114.WIP.1.patch, 
> HIVE-15114.WIP.2.patch
>
>
> When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES 
> ...}}), an extraneous {{MoveTask}} is created.
> This is problematic when the scratch directory is on S3, since renames require 
> copying the entire dataset.
> For simple queries (like the one above), there are two MoveTasks. The first 
> one moves the output data from one file in the scratch directory to another 
> file in the scratch directory. The second MoveTask moves the data from the 
> scratch directory to its final table location.
> The first MoveTask should not be necessary. The goal of this JIRA is to 
> remove it. This should help improve performance when running on S3.
> It seems that the first Move might be caused by a dependency resolution 
> problem in the optimizer, where a dependent task doesn't get properly removed 
> when the task it depends on is filtered out by a condition resolver.
> A dummy {{MoveTask}} is added in the 
> {{GenMapRedUtils.createMRWorkForMergingFiles}} method. This method creates a 
> conditional task that launches a job to merge the output files. At the end of 
> the conditional job there is a MoveTask.
> Even though Hive decides that the conditional merge job is not needed, it 
> seems the MoveTask is still added to the plan.
> Seems this extra {{MoveTask}} may have been added intentionally. Not sure why 
> yet. The {{ConditionalResolverMergeFiles}} says that one of three tasks will 
> be returned: move task only, merge task only, or merge task followed by a move 
> task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15192) Use Calcite to de-correlate and plan subqueries

2016-11-28 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15192:
---
Status: Patch Available  (was: Open)

> Use Calcite to de-correlate and plan subqueries
> ---
>
> Key: HIVE-15192
> URL: https://issues.apache.org/jira/browse/HIVE-15192
> Project: Hive
>  Issue Type: Task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15192.2.patch, HIVE-15192.patch
>
>
> Currently, support for subqueries is limited [Link to original spec | 
> https://issues.apache.org/jira/secure/attachment/12614003/SubQuerySpec.pdf].
> Using Calcite to plan and de-correlate subqueries will help Hive get rid of 
> these limitations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15294) Capture additional metadata to replicate a simple insert at destination

2016-11-28 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-15294:

Summary: Capture additional metadata to replicate a simple insert at 
destination  (was: Capture additional metadata to replicate an insert at 
destination)

> Capture additional metadata to replicate a simple insert at destination
> ---
>
> Key: HIVE-15294
> URL: https://issues.apache.org/jira/browse/HIVE-15294
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>
> For replicating inserts like {{INSERT INTO ... SELECT ... FROM}}, we will 
> need to capture the newly added files in the notification message to be able 
> to replicate the event at the destination. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15296) AM may lose task failures and not reschedule when scheduling to LLAP

2016-11-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15296:

Attachment: HIVE-15279.02.patch

Fixing the NPE in the latest patch.

> AM may lose task failures and not reschedule when scheduling to LLAP
> 
>
> Key: HIVE-15296
> URL: https://issues.apache.org/jira/browse/HIVE-15296
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
> Attachments: HIVE-15279.02.patch
>
>
> First attempt and failure detection:
> {noformat}
> 2016-11-18 20:20:01,980 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Received allocateRequest. 
> task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> capability=memory:4096, vCores:1, hosts=[3n01]
> 2016-11-18 20:20:01,982 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: Assigned task 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> startTime=0, containerId=null, assignedInstance=null, uniqueId=55, 
> localityDelayTimeout=9223372036854775807} to container 
> container_1_2622_01_56 on node=DynamicServiceInstance 
> [alive=true, host=3n01:15001 with resources=memory:59392, vCores:16, 
> shufflePort=15551, servicesAddress=http://3n01:15002, mgmtPort=15004]
> 2016-11-18 20:20:01,982 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: ScheduleResult for Task: 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> startTime=10550817928, containerId=container_1_2622_01_56, 
> assignedInstance=DynamicServiceInstance [alive=true, host=3n01:15001 with 
> resources=memory:59392, vCores:16, shufflePort=15551, 
> servicesAddress=http://3n01:15002, mgmtPort=15004], uniqueId=55, 
> localityDelayTimeout=9223372036854775807} = SCHEDULED
> 2016-11-18 20:20:03,427 [INFO] [Dispatcher thread {Central}] 
> |impl.TaskAttemptImpl|: TaskAttempt: 
> [attempt_1478967587833_2622_1_06_31_0] started. Is using containerId: 
> [container_1_2622_01_56] on NM: [3n01:15001]
> 2016-11-18 20:20:03,427 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1478967587833_2622_1][Event:TASK_ATTEMPT_STARTED]: 
> vertexName=Map 1, taskAttemptId=attempt_1478967587833_2622_1_06_31_0, 
> startTime=1479500403427, containerId=container_1_2622_01_56, 
> nodeId=3n01:15001
> 2016-11-18 20:20:03,430 [INFO] [TaskCommunicator # 1] 
> |tezplugins.LlapTaskCommunicator|: Successfully launched task: 
> attempt_1478967587833_2622_1_06_31_0
> 2016-11-18 20:20:03,434 [INFO] [IPC Server handler 11 on 43092] 
> |impl.TaskImpl|: TaskAttempt:attempt_1478967587833_2622_1_06_31_0 sent 
> events: (0-1).
> 2016-11-18 20:20:03,434 [INFO] [IPC Server handler 11 on 43092] 
> |impl.VertexImpl|: Sending attempt_1478967587833_2622_1_06_31_0 24 events 
> [0,24) total 24 vertex_1478967587833_2622_1_06 [Map 1]
> 2016-11-18 20:25:43,249 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1478967587833_2622_1][Event:TASK_ATTEMPT_FINISHED]: 
> vertexName=Map 1, taskAttemptId=attempt_1478967587833_2622_1_06_31_0, 
> creationTime=1479500401929, allocationTime=1479500403426, 
> startTime=1479500403427, finishTime=1479500743249, timeTaken=339822, 
> status=FAILED, taskFailureType=NON_FATAL, errorEnum=TASK_HEARTBEAT_ERROR, 
> diagnostics=AttemptID:attempt_1478967587833_2622_1_06_31_0 Timed out 
> after 300 secs, nodeHttpAddress=http://3n01:15002, counters=Counters: 1, 
> org.apache.tez.common.counters.DAGCounter, DATA_LOCAL_TASKS=1
> 2016-11-18 20:25:43,255 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Processing de-allocate request for 
> task=attempt_1478967587833_2622_1_06_31_0, state=ASSIGNED, endReason=OTHER
> 2016-11-18 20:25:43,259 [INFO] [Dispatcher thread {Central}] 
> |node.AMNodeImpl|: Attempt failed on node: 3n01:15001 TA: 
> attempt_1478967587833_2622_1_06_31_0 failed: true container: 
> container_1_2622_01_56 numFailedTAs: 7
> 2016-11-18 20:25:43,262 [INFO] [Dispatcher thread {Central}] 
> |impl.VertexImpl|: Source task attempt completed for vertex: 
> vertex_1478967587833_2622_1_07 [Reducer 2] attempt: 
> attempt_1478967587833_2622_1_06_31_0 with state: FAILED vertexState: 
> RUNNING
> {noformat}
> Second attempt:
> {noformat}
> 2016-11-18 20:25:43,267 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Received allocateRequest. 
> task=attempt_1478967587833_2622_1_06_31_1, priority=64, 
> capability=memory:4096, vCores:1, hosts=null
> 2016-11-18 20:25:43,297 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: ScheduleResult for Task: 
> 

[jira] [Updated] (HIVE-15296) AM may lose task failures and not reschedule when scheduling to LLAP

2016-11-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15296:

Attachment: (was: HIVE-15279.02.patch)

> AM may lose task failures and not reschedule when scheduling to LLAP
> 
>
> Key: HIVE-15296
> URL: https://issues.apache.org/jira/browse/HIVE-15296
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
>
> First attempt and failure detection:
> {noformat}
> 2016-11-18 20:20:01,980 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Received allocateRequest. 
> task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> capability=memory:4096, vCores:1, hosts=[3n01]
> 2016-11-18 20:20:01,982 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: Assigned task 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> startTime=0, containerId=null, assignedInstance=null, uniqueId=55, 
> localityDelayTimeout=9223372036854775807} to container 
> container_1_2622_01_56 on node=DynamicServiceInstance 
> [alive=true, host=3n01:15001 with resources=memory:59392, vCores:16, 
> shufflePort=15551, servicesAddress=http://3n01:15002, mgmtPort=15004]
> 2016-11-18 20:20:01,982 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: ScheduleResult for Task: 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> startTime=10550817928, containerId=container_1_2622_01_56, 
> assignedInstance=DynamicServiceInstance [alive=true, host=3n01:15001 with 
> resources=memory:59392, vCores:16, shufflePort=15551, 
> servicesAddress=http://3n01:15002, mgmtPort=15004], uniqueId=55, 
> localityDelayTimeout=9223372036854775807} = SCHEDULED
> 2016-11-18 20:20:03,427 [INFO] [Dispatcher thread {Central}] 
> |impl.TaskAttemptImpl|: TaskAttempt: 
> [attempt_1478967587833_2622_1_06_31_0] started. Is using containerId: 
> [container_1_2622_01_56] on NM: [3n01:15001]
> 2016-11-18 20:20:03,427 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1478967587833_2622_1][Event:TASK_ATTEMPT_STARTED]: 
> vertexName=Map 1, taskAttemptId=attempt_1478967587833_2622_1_06_31_0, 
> startTime=1479500403427, containerId=container_1_2622_01_56, 
> nodeId=3n01:15001
> 2016-11-18 20:20:03,430 [INFO] [TaskCommunicator # 1] 
> |tezplugins.LlapTaskCommunicator|: Successfully launched task: 
> attempt_1478967587833_2622_1_06_31_0
> 2016-11-18 20:20:03,434 [INFO] [IPC Server handler 11 on 43092] 
> |impl.TaskImpl|: TaskAttempt:attempt_1478967587833_2622_1_06_31_0 sent 
> events: (0-1).
> 2016-11-18 20:20:03,434 [INFO] [IPC Server handler 11 on 43092] 
> |impl.VertexImpl|: Sending attempt_1478967587833_2622_1_06_31_0 24 events 
> [0,24) total 24 vertex_1478967587833_2622_1_06 [Map 1]
> 2016-11-18 20:25:43,249 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1478967587833_2622_1][Event:TASK_ATTEMPT_FINISHED]: 
> vertexName=Map 1, taskAttemptId=attempt_1478967587833_2622_1_06_31_0, 
> creationTime=1479500401929, allocationTime=1479500403426, 
> startTime=1479500403427, finishTime=1479500743249, timeTaken=339822, 
> status=FAILED, taskFailureType=NON_FATAL, errorEnum=TASK_HEARTBEAT_ERROR, 
> diagnostics=AttemptID:attempt_1478967587833_2622_1_06_31_0 Timed out 
> after 300 secs, nodeHttpAddress=http://3n01:15002, counters=Counters: 1, 
> org.apache.tez.common.counters.DAGCounter, DATA_LOCAL_TASKS=1
> 2016-11-18 20:25:43,255 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Processing de-allocate request for 
> task=attempt_1478967587833_2622_1_06_31_0, state=ASSIGNED, endReason=OTHER
> 2016-11-18 20:25:43,259 [INFO] [Dispatcher thread {Central}] 
> |node.AMNodeImpl|: Attempt failed on node: 3n01:15001 TA: 
> attempt_1478967587833_2622_1_06_31_0 failed: true container: 
> container_1_2622_01_56 numFailedTAs: 7
> 2016-11-18 20:25:43,262 [INFO] [Dispatcher thread {Central}] 
> |impl.VertexImpl|: Source task attempt completed for vertex: 
> vertex_1478967587833_2622_1_07 [Reducer 2] attempt: 
> attempt_1478967587833_2622_1_06_31_0 with state: FAILED vertexState: 
> RUNNING
> {noformat}
> Second attempt:
> {noformat}
> 2016-11-18 20:25:43,267 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Received allocateRequest. 
> task=attempt_1478967587833_2622_1_06_31_1, priority=64, 
> capability=memory:4096, vCores:1, hosts=null
> 2016-11-18 20:25:43,297 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: ScheduleResult for Task: 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_1, priority=64, 
> 

[jira] [Commented] (HIVE-15296) AM may lose task failures and not reschedule when scheduling to LLAP

2016-11-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703736#comment-15703736
 ] 

Sergey Shelukhin commented on HIVE-15296:
-

er, wrong jira

> AM may lose task failures and not reschedule when scheduling to LLAP
> 
>
> Key: HIVE-15296
> URL: https://issues.apache.org/jira/browse/HIVE-15296
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
>
> First attempt and failure detection:
> {noformat}
> 2016-11-18 20:20:01,980 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Received allocateRequest. 
> task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> capability=memory:4096, vCores:1, hosts=[3n01]
> 2016-11-18 20:20:01,982 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: Assigned task 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> startTime=0, containerId=null, assignedInstance=null, uniqueId=55, 
> localityDelayTimeout=9223372036854775807} to container 
> container_1_2622_01_56 on node=DynamicServiceInstance 
> [alive=true, host=3n01:15001 with resources=memory:59392, vCores:16, 
> shufflePort=15551, servicesAddress=http://3n01:15002, mgmtPort=15004]
> 2016-11-18 20:20:01,982 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: ScheduleResult for Task: 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_0, priority=65, 
> startTime=10550817928, containerId=container_1_2622_01_56, 
> assignedInstance=DynamicServiceInstance [alive=true, host=3n01:15001 with 
> resources=memory:59392, vCores:16, shufflePort=15551, 
> servicesAddress=http://3n01:15002, mgmtPort=15004], uniqueId=55, 
> localityDelayTimeout=9223372036854775807} = SCHEDULED
> 2016-11-18 20:20:03,427 [INFO] [Dispatcher thread {Central}] 
> |impl.TaskAttemptImpl|: TaskAttempt: 
> [attempt_1478967587833_2622_1_06_31_0] started. Is using containerId: 
> [container_1_2622_01_56] on NM: [3n01:15001]
> 2016-11-18 20:20:03,427 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1478967587833_2622_1][Event:TASK_ATTEMPT_STARTED]: 
> vertexName=Map 1, taskAttemptId=attempt_1478967587833_2622_1_06_31_0, 
> startTime=1479500403427, containerId=container_1_2622_01_56, 
> nodeId=3n01:15001
> 2016-11-18 20:20:03,430 [INFO] [TaskCommunicator # 1] 
> |tezplugins.LlapTaskCommunicator|: Successfully launched task: 
> attempt_1478967587833_2622_1_06_31_0
> 2016-11-18 20:20:03,434 [INFO] [IPC Server handler 11 on 43092] 
> |impl.TaskImpl|: TaskAttempt:attempt_1478967587833_2622_1_06_31_0 sent 
> events: (0-1).
> 2016-11-18 20:20:03,434 [INFO] [IPC Server handler 11 on 43092] 
> |impl.VertexImpl|: Sending attempt_1478967587833_2622_1_06_31_0 24 events 
> [0,24) total 24 vertex_1478967587833_2622_1_06 [Map 1]
> 2016-11-18 20:25:43,249 [INFO] [Dispatcher thread {Central}] 
> |history.HistoryEventHandler|: 
> [HISTORY][DAG:dag_1478967587833_2622_1][Event:TASK_ATTEMPT_FINISHED]: 
> vertexName=Map 1, taskAttemptId=attempt_1478967587833_2622_1_06_31_0, 
> creationTime=1479500401929, allocationTime=1479500403426, 
> startTime=1479500403427, finishTime=1479500743249, timeTaken=339822, 
> status=FAILED, taskFailureType=NON_FATAL, errorEnum=TASK_HEARTBEAT_ERROR, 
> diagnostics=AttemptID:attempt_1478967587833_2622_1_06_31_0 Timed out 
> after 300 secs, nodeHttpAddress=http://3n01:15002, counters=Counters: 1, 
> org.apache.tez.common.counters.DAGCounter, DATA_LOCAL_TASKS=1
> 2016-11-18 20:25:43,255 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Processing de-allocate request for 
> task=attempt_1478967587833_2622_1_06_31_0, state=ASSIGNED, endReason=OTHER
> 2016-11-18 20:25:43,259 [INFO] [Dispatcher thread {Central}] 
> |node.AMNodeImpl|: Attempt failed on node: 3n01:15001 TA: 
> attempt_1478967587833_2622_1_06_31_0 failed: true container: 
> container_1_2622_01_56 numFailedTAs: 7
> 2016-11-18 20:25:43,262 [INFO] [Dispatcher thread {Central}] 
> |impl.VertexImpl|: Source task attempt completed for vertex: 
> vertex_1478967587833_2622_1_07 [Reducer 2] attempt: 
> attempt_1478967587833_2622_1_06_31_0 with state: FAILED vertexState: 
> RUNNING
> {noformat}
> Second attempt:
> {noformat}
> 2016-11-18 20:25:43,267 [INFO] [TaskSchedulerEventHandlerThread] 
> |tezplugins.LlapTaskSchedulerService|: Received allocateRequest. 
> task=attempt_1478967587833_2622_1_06_31_1, priority=64, 
> capability=memory:4096, vCores:1, hosts=null
> 2016-11-18 20:25:43,297 [INFO] [LlapScheduler] 
> |tezplugins.LlapTaskSchedulerService|: ScheduleResult for Task: 
> TaskInfo{task=attempt_1478967587833_2622_1_06_31_1, priority=64, 
> 

[jira] [Updated] (HIVE-15279) map join dummy operators are not set up correctly in certain cases with merge join

2016-11-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15279:

Attachment: HIVE-15279.02.patch

fixing the NPE in the latest patch

> map join dummy operators are not set up correctly in certain cases with merge 
> join
> --
>
> Key: HIVE-15279
> URL: https://issues.apache.org/jira/browse/HIVE-15279
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15279.01.patch, HIVE-15279.02.patch, 
> HIVE-15279.patch
>
>
> As a result, MapJoin is not initialized and there's NPE later.
> Tez-specific.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15227) Optimize join + gby into semijoin

2016-11-28 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-15227:

Status: Open  (was: Patch Available)

> Optimize join + gby into semijoin
> -
>
> Key: HIVE-15227
> URL: https://issues.apache.org/jira/browse/HIVE-15227
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-15227.2.patch, HIVE-15227.patch
>
>
> Calcite has a rule which can do this transformation. Let's take advantage of 
> it, since Hive has a native left semi join operator.
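
For readers unfamiliar with the transformation, a hedged before/after sketch; 
the table and column names are illustrative, and the actual Calcite rule 
wiring lives in the patch.

{code:java}
// Illustrative only: the join + group-by pattern below is used purely to
// de-duplicate t2's keys, so the query can be answered by Hive's native
// LEFT SEMI JOIN operator instead.
public final class JoinGbyToSemiJoinSketch {
  static final String BEFORE =
      "SELECT t1.key, t1.value FROM t1 "
      + "JOIN (SELECT key FROM t2 GROUP BY key) d ON t1.key = d.key";

  static final String AFTER =
      "SELECT t1.key, t1.value FROM t1 "
      + "LEFT SEMI JOIN t2 ON (t1.key = t2.key)";
}
{code}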



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15295) Fix HCatalog javadoc generation with Java 8

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703652#comment-15703652
 ] 

Hive QA commented on HIVE-15295:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840714/HIVE-15295.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10720 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_mult_tables_compact]
 (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=91)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=96)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2308/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2308/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2308/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840714 - PreCommit-HIVE-Build

> Fix HCatalog javadoc generation with Java 8
> ---
>
> Key: HIVE-15295
> URL: https://issues.apache.org/jira/browse/HIVE-15295
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, HCatalog
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Fix For: 2.1.1
>
> Attachments: HIVE-15295.patch
>
>
> Realized while generating artifacts for Hive 2.1.1 release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15149) Add additional information to ATSHook for Tez UI

2016-11-28 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-15149:
--
Attachment: HIVE-15149.2.patch

Adding perf logger timings (post-hook)

> Add additional information to ATSHook for Tez UI
> 
>
> Key: HIVE-15149
> URL: https://issues.apache.org/jira/browse/HIVE-15149
> Project: Hive
>  Issue Type: Improvement
>  Components: Hooks
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-15149.1.patch, HIVE-15149.2.patch
>
>
> Additional query details wanted for TEZ-3530. The additional details 
> discussed include the following (a hedged publishing sketch follows the list):
> Publish the following info (in addition to existing bits published today):
> Application Id to which the query was submitted (primary filter)
> DAG Id (primary filter)
> Hive query name (primary filter)
> Hive Configs (everything a set command would provide except for sensitive 
> credential info)
> Potentially publish source of config i.e. set in hive query script vs 
> hive-site.xml, etc.
> Which HiveServer2 the query was submitted to
> Which IP/host the query was submitted from - not sure what filter support 
> will be available.
> Which execution mode the query is running in (primary filter)
> What submission mode was used (cli/beeline/jdbc, etc)
> User info ( running as, actual end user, etc) - not sure if already present
> Perf logger events. The data published should be able to create a timeline 
> view of the query i.e. actual submission time, query compile timestamps, 
> execution timestamps, post-exec data moves, etc.
> Explain plan with enough details for visualizing.
> Databases and tables being queried (primary filter)
> Yarn queue info (primary filter)
> Caller context (primary filter)
> Original source i.e. submitter
> Thread info in HS2 if needed (I believe Vikram may have added this earlier)
> Query time taken (with filter support)
> Additional context info e.g. llap instance name and appId if required.
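
A hedged sketch of how such fields could be published through the YARN 
timeline API that ATSHook already uses; the entity type, filter names, and 
values below are illustrative placeholders, not necessarily the names the 
final patch uses.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.client.api.TimelineClient;

public final class AtsHookPublishSketch {
  public static void publish(Configuration conf) throws Exception {
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(conf);
    client.start();
    try {
      TimelineEntity entity = new TimelineEntity();
      entity.setEntityType("HIVE_QUERY_ID");           // illustrative type
      entity.setEntityId("hive_query_123");            // illustrative id
      entity.setStartTime(System.currentTimeMillis());
      // Primary filters let the Tez UI search on these fields.
      entity.addPrimaryFilter("applicationId", "application_1478967587833_2622");
      entity.addPrimaryFilter("dagId", "dag_1478967587833_2622_1");
      entity.addPrimaryFilter("executionMode", "LLAP");
      // Non-searchable details go into otherInfo.
      entity.addOtherInfo("hiveServer2Address", "hs2-host:10001");
      client.putEntities(entity);
    } finally {
      client.stop();
    }
  }
}
{code}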



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15280) Hive.mvFile() misses the "." char when joining the filename + extension

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703752#comment-15703752
 ] 

Hive QA commented on HIVE-15280:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840716/HIVE-15280.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10742 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=91)
org.apache.hive.service.server.TestHS2HttpServer.testContextRootUrlRewrite 
(batchId=183)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2310/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2310/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2310/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840716 - PreCommit-HIVE-Build

> Hive.mvFile() misses the "." char when joining the filename + extension
> ---
>
> Key: HIVE-15280
> URL: https://issues.apache.org/jira/browse/HIVE-15280
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Critical
> Attachments: HIVE-15280.1.patch, HIVE-15280.2.patch, 
> HIVE-15280.3.patch
>
>
> Hive.mvFile() misses the "." character when joining the filename and 
> extension. This may cause incorrect results when compressed files are copied 
> to a table location.
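
A minimal illustration of the bug shape; the real logic lives in 
Hive.mvFile(), and the helper below is hypothetical, showing only the missing 
separator.

{code:java}
public final class MvFileNameSketch {
  static String buggyDestName(String name, String ext) {
    return name + ext;        // "000000_0" + "gz" -> "000000_0gz"
  }

  static String fixedDestName(String name, String ext) {
    return name + "." + ext;  // "000000_0" + "gz" -> "000000_0.gz"
  }

  public static void main(String[] args) {
    // Without the ".", the copied file loses the extension that
    // identifies its compression codec.
    System.out.println(buggyDestName("000000_0", "gz"));  // 000000_0gz
    System.out.println(fixedDestName("000000_0", "gz"));  // 000000_0.gz
  }
}
{code}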



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15227) Optimize join + gby into semijoin

2016-11-28 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-15227:

Attachment: HIVE-15227.3.patch

> Optimize join + gby into semijoin
> -
>
> Key: HIVE-15227
> URL: https://issues.apache.org/jira/browse/HIVE-15227
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-15227.2.patch, HIVE-15227.3.patch, HIVE-15227.patch
>
>
> Calcite has a rule which can do this transformation. Let's take advantage of 
> it, since Hive has a native left semi join operator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15227) Optimize join + gby into semijoin

2016-11-28 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-15227:

Status: Patch Available  (was: Open)

> Optimize join + gby into semijoin
> -
>
> Key: HIVE-15227
> URL: https://issues.apache.org/jira/browse/HIVE-15227
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-15227.2.patch, HIVE-15227.3.patch, HIVE-15227.patch
>
>
> Calcite has a rule which can do this transformation. Let's take advantage of 
> it, since Hive has a native left semi join operator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15074) Schematool provides a way to detect invalid entries in VERSION table

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703834#comment-15703834
 ] 

Hive QA commented on HIVE-15074:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840735/HIVE-15074.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2312/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2312/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2312/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-11-29 01:44:23.550
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-2312/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-11-29 01:44:23.553
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 63bdfa6 HIVE-15284: Add junit test to test replication scenarios 
(Sushanth Sowmyan reviewed by Vaibhav Gumashta)
+ git clean -f -d
Removing ql/src/test/queries/clientpositive/specialChar.q
Removing ql/src/test/results/clientpositive/specialChar.q.out
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 63bdfa6 HIVE-15284: Add junit test to test replication scenarios 
(Sushanth Sowmyan reviewed by Vaibhav Gumashta)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-11-29 01:44:24.497
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/beeline/src/java/org/apache/hive/beeline/HiveSchemaTool.java: No such 
file or directory
error: 
a/itests/hive-unit/src/test/java/org/apache/hive/beeline/TestSchemaTool.java: 
No such file or directory
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12840735 - PreCommit-HIVE-Build

> Schematool provides a way to detect invalid entries in VERSION table
> 
>
> Key: HIVE-15074
> URL: https://issues.apache.org/jira/browse/HIVE-15074
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Yongzhi Chen
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15074.patch
>
>
> For some unknown reason, we have seen a customer's HMS fail to start because 
> there were multiple entries in their HMS VERSION table. Schematool should 
> provide a way to validate the HMS DB and offer warning and fix options for 
> this kind of issue. 
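
A hedged sketch of the kind of check schematool could run, assuming a plain 
JDBC connection to the metastore database (the URL and credentials below are 
placeholders): HMS expects exactly one row in VERSION, so anything else should 
produce a warning before HMS refuses to start.

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public final class VersionTableCheck {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection(
             "jdbc:mysql://metastore-db/hive", "hive", "password");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM VERSION")) {
      if (rs.next()) {
        int rows = rs.getInt(1);
        if (rows != 1) {
          // Candidate for a schematool warning / fix option.
          System.err.println(
              "Invalid VERSION table: expected 1 row, found " + rows);
        }
      }
    }
  }
}
{code}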



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15297) Hive should not split semicolon within quoted string literals

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703830#comment-15703830
 ] 

Hive QA commented on HIVE-15297:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840728/HIVE-15297.01.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 316 failed/errored test(s), 10735 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[add_part_exist] 
(batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[allcolref_in_udf] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter1] (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter2] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter3] (batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter4] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter5] (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_index] (batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_rename_partition] 
(batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde] 
(batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_table_null_partition]
 (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] 
(batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_filter] 
(batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_groupby2] 
(batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_groupby] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_9] 
(batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_admin_almighty2]
 (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_cli_nonsql]
 (batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_show_grant]
 (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_reordering_values]
 (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin11] 
(batchId=64)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_gby_empty] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_annotate_stats_groupby]
 (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join0] 
(batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_gby_empty] 
(batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_join0] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_simple_select] 
(batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_stats] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_subq_exists] 
(batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_union] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_windowing] 
(batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_simple_select] 
(batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_stats] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_subq_exists] 
(batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_union] (batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_windowing] 
(batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[char_cast] (batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_partlvl_dp] 
(batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[compile_processor] 
(batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[compustat_avro] 
(batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[concat_op] (batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[constprog_partitioner] 
(batchId=65)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[constprog_type] 
(batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[correlationoptimizer5] 
(batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_escape] 
(batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_func1] 
(batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_translate] 
(batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_5] (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_mat_4] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_mat_5] (batchId=2)

[jira] [Commented] (HIVE-15192) Use Calcite to de-correlate and plan subqueries

2016-11-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703838#comment-15703838
 ] 

Hive QA commented on HIVE-15192:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12840767/HIVE-15192.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2313/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2313/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2313/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-11-29 01:44:59.948
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-2313/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-11-29 01:44:59.950
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 63bdfa6 HIVE-15284: Add junit test to test replication scenarios 
(Sushanth Sowmyan reviewed by Vaibhav Gumashta)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 63bdfa6 HIVE-15284: Add junit test to test replication scenarios 
(Sushanth Sowmyan reviewed by Vaibhav Gumashta)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-11-29 01:45:00.874
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: patch -p1
patching file itests/src/test/resources/testconfiguration.properties
patching file pom.xml
patching file ql/src/java/org/apache/hadoop/hive/ql/lib/SubQueryWalker.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelShuttle.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelShuttleImpl.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveFilter.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveJoin.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveProject.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveSubQueryRemoveRule.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java
patching file ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
patching file ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
patching file ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckCtx.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeSubQueryDesc.java
patching file ql/src/test/queries/clientnegative/subquery_restrictions.q
patching file ql/src/test/queries/clientpositive/subquery_in.q
patching file ql/src/test/queries/clientpositive/subquery_notin.q
patching file ql/src/test/results/clientnegative/subquery_restrictions.q.out
patching file ql/src/test/results/clientpositive/llap/subquery_exists.q.out
patching file ql/src/test/results/clientpositive/llap/subquery_in.q.out
patching file ql/src/test/results/clientpositive/llap/subquery_notin.q.out
patching file ql/src/test/results/clientpositive/spark/subquery_exists.q.out
patching file ql/src/test/results/clientpositive/spark/subquery_in.q.out
patching file ql/src/test/results/clientpositive/subquery_exists.q.out
patching file ql/src/test/results/clientpositive/subquery_exists_having.q.out

[jira] [Commented] (HIVE-15258) Enable CBO on queries involving interval literals

2016-11-28 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703598#comment-15703598
 ] 

Pengcheng Xiong commented on HIVE-15258:


I downloaded the patch and applied it to master. I checked several q tests 
and all of them worked well. I think it is good to enable intervals in Hive 
now that CALCITE-1020 (fixed in Calcite 1.6) is in. LGTM +1. By the way, 
please file a Calcite bug for the return-type error for timestamp minus 
timestamp.

> Enable CBO on queries involving interval literals
> -
>
> Key: HIVE-15258
> URL: https://issues.apache.org/jira/browse/HIVE-15258
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-15258.2.patch, HIVE-15258.patch
>
>
> Currently, queries fail and fall back to non-cbo path.
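
For context, hedged examples of the query shapes involved; the syntax follows 
Hive's interval literal support, and the exact set of queries covered by the 
patch may differ. The table and column names are made up.

{code:java}
public final class IntervalLiteralExamples {
  // Interval literal arithmetic against a date column.
  static final String Q1 =
      "SELECT * FROM orders "
      + "WHERE order_date > date '2016-01-01' + interval '30' day";

  // Timestamp minus timestamp yields an interval; the return-type error
  // mentioned in the review comment above is to be filed against Calcite.
  static final String Q2 =
      "SELECT cast(t2 as timestamp) - cast(t1 as timestamp) FROM events";
}
{code}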



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

