[jira] [Commented] (HIVE-12525) Cleanup unused metrics in HMS

2016-09-29 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15535096#comment-15535096
 ] 

Lefty Leverenz commented on HIVE-12525:
---

[~szehon], this also needs to be committed to branch-1.

> Cleanup unused metrics in HMS
> -
>
> Key: HIVE-12525
> URL: https://issues.apache.org/jira/browse/HIVE-12525
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Fix For: 2.0.0
>
> Attachments: HIVE-12525.patch
>
>
> I had added these without much thought when writing the metrics-framework to 
> test out the concept.
> Looking back, these actually need more investigation, as some are actually 
> wrong or at least do not add much value.  Wrong is the active-transaction 
> metric: each ObjectStore is actually thread-local, and an aggregate number is 
> what was meant.  Open/committed/rollback need some investigation into what 
> really helps.
> The goal is to remove these before the release to reduce confusion for users.
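The thread-local vs. aggregate distinction is the crux of the description above. A minimal Python sketch (hypothetical names, not the HMS metrics code) of what an "active transactions" metric presumably should read: one counter shared across all threads, rather than a gauge attached to any single thread-local ObjectStore:

```python
import threading

class AggregateTxnGauge:
    """One process-wide counter, safe to update from any worker thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0

    def open_txn(self) -> None:
        with self._lock:
            self._active += 1

    def close_txn(self) -> None:
        with self._lock:
            self._active -= 1

    def value(self) -> int:
        # The metric reported to users: the aggregate across all threads.
        with self._lock:
            return self._active
```

A per-thread gauge would instead report only the opener thread's count, which is the "wrong" behavior the description calls out.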



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14735) Build Infra: Spark artifacts download takes a long time

2016-09-29 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15535044#comment-15535044
 ] 

Vaibhav Gumashta commented on HIVE-14735:
-

Looks like OSX may not have md5sum installed by default. Should we use md5 on 
OSX?
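One portable option is to probe for whichever tool is present and fall back to hashing in-process. A sketch in Python (the helper name is hypothetical; `md5 -q` is the BSD flag that prints only the digest):

```python
import hashlib
import shutil
import subprocess

def md5_hex(path: str) -> str:
    # GNU coreutils (Linux) ships `md5sum`; macOS ships BSD `md5` instead.
    if shutil.which("md5sum"):
        out = subprocess.run(["md5sum", path], capture_output=True,
                             text=True, check=True)
        return out.stdout.split()[0]
    if shutil.which("md5"):
        out = subprocess.run(["md5", "-q", path], capture_output=True,
                             text=True, check=True)
        return out.stdout.strip()
    # Last resort: no external tool available, hash in-process.
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()
```

The same probe is a few lines of `command -v md5sum || command -v md5` in shell if the build script stays POSIX sh.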

> Build Infra: Spark artifacts download takes a long time
> ---
>
> Key: HIVE-14735
> URL: https://issues.apache.org/jira/browse/HIVE-14735
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Vaibhav Gumashta
>
> In particular this command:
> {{curl -Sso ./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz 
> http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.6.0-bin-hadoop2-without-hive.tgz}}





[jira] [Commented] (HIVE-9423) HiveServer2: Provide the user with different error messages depending on the Thrift client exception code

2016-09-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534883#comment-15534883
 ] 

Hive QA commented on HIVE-9423:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12830914/HIVE-9423.6-branch-2.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 252 failed/errored test(s), 10435 tests 
executed
*Failed tests:*
{noformat}
249_TestHWISessionManager - did not produce a TEST-*.xml file
392_TestMsgBusConnection - did not produce a TEST-*.xml file
792_TestJdbcWithMiniKdcSQLAuthHttp - did not produce a TEST-*.xml file
793_TestJdbcWithMiniKdc - did not produce a TEST-*.xml file
794_TestHs2HooksWithMiniKdc - did not produce a TEST-*.xml file
796_TestJdbcWithDBTokenStore - did not produce a TEST-*.xml file
797_TestJdbcWithMiniKdcCookie - did not produce a TEST-*.xml file
798_TestJdbcNonKrbSASLWithMiniKdc - did not produce a TEST-*.xml file
800_TestJdbcWithMiniKdcSQLAuthBinary - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_table_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_explain
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_outer_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_udf1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnStatsUpdateForStatsOptimizer_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnStatsUpdateForStatsOptimizer_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_describe_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_full
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_partial
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_partial_ndv
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_fouter_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_orig_table_use_metadata
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join33
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join34
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join35
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_map_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_json_serde1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_14
{noformat}

[jira] [Updated] (HIVE-14865) Fix comments after HIVE-14350

2016-09-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14865:
--
Attachment: HIVE-14865.patch

[~alangates], could you review please?  HIVE-14350 moved AcidUtils.isValidBase() 
to ValidTxnList.isValidBase() and made it independent of aborted txns.

This patch just fixes the now-misleading comments.

> Fix comments after HIVE-14350
> -
>
> Key: HIVE-14865
> URL: https://issues.apache.org/jira/browse/HIVE-14865
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.1.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14865.patch
>
>
> there are still some comments in the code that should've been updated in 
> HIVE-14350





[jira] [Updated] (HIVE-14865) Fix comments after HIVE-14350

2016-09-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14865:
--
Component/s: Transactions

> Fix comments after HIVE-14350
> -
>
> Key: HIVE-14865
> URL: https://issues.apache.org/jira/browse/HIVE-14865
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.1.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> there are still some comments in the code that should've been updated in 
> HIVE-14350





[jira] [Commented] (HIVE-11812) datediff sometimes returns incorrect results when called with dates

2016-09-29 Thread Rich Kobylinski (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534688#comment-15534688
 ] 

Rich Kobylinski commented on HIVE-11812:


I am out of the office Friday September 30, returning Monday.

Please contact Claire Spettell 860-273-9791 with issues that require immediate 
response.



> datediff sometimes returns incorrect results when called with dates
> ---
>
> Key: HIVE-11812
> URL: https://issues.apache.org/jira/browse/HIVE-11812
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.0.0
>Reporter: Nicholas Brenwald
>Assignee: Chetna Chaudhari
>Priority: Minor
> Attachments: HIVE-11812.1.patch
>
>
> DATEDIFF returns an incorrect result when one of the arguments is a date 
> type. 
> The Hive Language Manual provides the following signature for datediff:
> {code}
> int datediff(string enddate, string startdate)
> {code}
> I think datediff should either throw an error (if date types are not 
> supported), or return the correct result.
> To reproduce, create a table:
> {code}
> create table t (c1 string, c2 date);
> {code}
> Assuming you have a table x containing some data, populate table t with 1 row:
> {code}
> insert into t select '2015-09-15', '2015-09-15' from x limit 1;
> {code}
> Then run the following 12 test queries:
> {code}
> select datediff(c1, '2015-09-14') from t;
> select datediff(c1, '2015-09-15') from t;
> select datediff(c1, '2015-09-16') from t;
> select datediff('2015-09-14', c1) from t;
> select datediff('2015-09-15', c1) from t;
> select datediff('2015-09-16', c1) from t;
> select datediff(c2, '2015-09-14') from t;
> select datediff(c2, '2015-09-15') from t;
> select datediff(c2, '2015-09-16') from t;
> select datediff('2015-09-14', c2) from t;
> select datediff('2015-09-15', c2) from t;
> select datediff('2015-09-16', c2) from t;
> {code}
> The below table summarises the result. All results for column c1 (which is a 
> string) are correct, but when using c2 (which is a date), two of the results 
> are incorrect.
> || Test || Expected Result || Actual Result || Passed / Failed ||
> |datediff(c1, '2015-09-14')| 1 | 1| Passed |
> |datediff(c1, '2015-09-15')| 0 | 0| Passed |
> |datediff(c1, '2015-09-16') | -1 | -1| Passed |
> |datediff('2015-09-14', c1) | -1 | -1| Passed |
> |datediff('2015-09-15', c1)| 0 | 0| Passed |
> |datediff('2015-09-16', c1)| 1 | 1| Passed |
> |datediff(c2, '2015-09-14')| 1 | 0| {color:red}Failed{color} |
> |datediff(c2, '2015-09-15')| 0 | 0| Passed |
> |datediff(c2, '2015-09-16') | -1 | -1| Passed |
> |datediff('2015-09-14', c2) | -1 | 0 | {color:red}Failed{color} |
> |datediff('2015-09-15', c2)| 0 | 0| Passed |
> |datediff('2015-09-16', c2)| 1 | 1| Passed |





[jira] [Commented] (HIVE-11812) datediff sometimes returns incorrect results when called with dates

2016-09-29 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534684#comment-15534684
 ] 

Jason Dere commented on HIVE-11812:
---

Ok. So there may be some additional investigation required here to get this 
working in all timezones.
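The timezone remark can be made concrete. The Python sketch below is a guess at the mechanism, not a reading of Hive's source: if the DATE operand resolves to UTC midnight while the string operand is parsed at local midnight, the millisecond difference is a fraction of a day, and Java-style truncating division reproduces both failing rows of the results table for a UTC-7 test timezone:

```python
from datetime import date, datetime, timezone, timedelta

MILLIS_PER_DAY = 24 * 60 * 60 * 1000
UTC = timezone.utc
LOCAL = timezone(timedelta(hours=-7))  # assumed local zone for illustration

def midnight_millis(d: date, tz) -> int:
    # Epoch milliseconds of midnight of day `d` in timezone `tz`.
    return int(datetime(d.year, d.month, d.day, tzinfo=tz).timestamp() * 1000)

def buggy_datediff(end_millis: int, start_millis: int) -> int:
    # int() truncates toward zero, like Java integer division, so a
    # 17-hour difference silently becomes 0 days.
    return int((end_millis - start_millis) / MILLIS_PER_DAY)

def correct_datediff(end: date, start: date) -> int:
    # Calendar-day subtraction needs no timezone at all.
    return (end - start).days
```

Under this model, `datediff(c2, '2015-09-14')` compares UTC midnight of the 15th against local midnight of the 14th (a 17-hour gap), yielding 0 instead of 1.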

> datediff sometimes returns incorrect results when called with dates
> ---
>
> Key: HIVE-11812
> URL: https://issues.apache.org/jira/browse/HIVE-11812
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.0.0
>Reporter: Nicholas Brenwald
>Assignee: Chetna Chaudhari
>Priority: Minor
> Attachments: HIVE-11812.1.patch
>
>
> DATEDIFF returns an incorrect result when one of the arguments is a date 
> type. 
> The Hive Language Manual provides the following signature for datediff:
> {code}
> int datediff(string enddate, string startdate)
> {code}
> I think datediff should either throw an error (if date types are not 
> supported), or return the correct result.
> To reproduce, create a table:
> {code}
> create table t (c1 string, c2 date);
> {code}
> Assuming you have a table x containing some data, populate table t with 1 row:
> {code}
> insert into t select '2015-09-15', '2015-09-15' from x limit 1;
> {code}
> Then run the following 12 test queries:
> {code}
> select datediff(c1, '2015-09-14') from t;
> select datediff(c1, '2015-09-15') from t;
> select datediff(c1, '2015-09-16') from t;
> select datediff('2015-09-14', c1) from t;
> select datediff('2015-09-15', c1) from t;
> select datediff('2015-09-16', c1) from t;
> select datediff(c2, '2015-09-14') from t;
> select datediff(c2, '2015-09-15') from t;
> select datediff(c2, '2015-09-16') from t;
> select datediff('2015-09-14', c2) from t;
> select datediff('2015-09-15', c2) from t;
> select datediff('2015-09-16', c2) from t;
> {code}
> The below table summarises the result. All results for column c1 (which is a 
> string) are correct, but when using c2 (which is a date), two of the results 
> are incorrect.
> || Test || Expected Result || Actual Result || Passed / Failed ||
> |datediff(c1, '2015-09-14')| 1 | 1| Passed |
> |datediff(c1, '2015-09-15')| 0 | 0| Passed |
> |datediff(c1, '2015-09-16') | -1 | -1| Passed |
> |datediff('2015-09-14', c1) | -1 | -1| Passed |
> |datediff('2015-09-15', c1)| 0 | 0| Passed |
> |datediff('2015-09-16', c1)| 1 | 1| Passed |
> |datediff(c2, '2015-09-14')| 1 | 0| {color:red}Failed{color} |
> |datediff(c2, '2015-09-15')| 0 | 0| Passed |
> |datediff(c2, '2015-09-16') | -1 | -1| Passed |
> |datediff('2015-09-14', c2) | -1 | 0 | {color:red}Failed{color} |
> |datediff('2015-09-15', c2)| 0 | 0| Passed |
> |datediff('2015-09-16', c2)| 1 | 1| Passed |





[jira] [Updated] (HIVE-14822) Add support for credential provider for jobs launched from Hiveserver2

2016-09-29 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-14822:
---
Attachment: HIVE-14822.02.patch

The first patch didn't work when distcp is used in the MoveTask. The second 
patch adds a fix for that.

> Add support for credential provider for jobs launched from Hiveserver2
> --
>
> Key: HIVE-14822
> URL: https://issues.apache.org/jira/browse/HIVE-14822
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-14822.01.patch, HIVE-14822.02.patch
>
>
> When using encrypted passwords via the Hadoop Credential Provider, 
> HiveServer2 currently does not correctly forward enough information to the 
> job configuration for jobs to read those secrets. If your job needs to access 
> any secrets, like S3 credentials, then there's no convenient and secure way 
> to configure this today.
> You could specify the decryption key in files like mapred-site.xml that 
> HiveServer2 uses, but this would place the encryption password on local disk 
> in plaintext, which can be a security concern.
> To solve this problem, HiveServer2 should modify job configuration to include 
> the environment variable settings needed to decrypt the passwords. 
> Specifically, it will need to modify:
> * For MR2 jobs:
> ** yarn.app.mapreduce.am.admin.user.env
> ** mapreduce.admin.user.env
> * For Spark jobs:
> ** spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD
> ** spark.executorEnv.HADOOP_CREDSTORE_PASSWORD
> HiveServer2 can get the decryption password from its own environment, the 
> same way it does for its own credential provider store today.
> Additionally, it can be desirable for HiveServer2 to have a separate 
> encrypted password file from the one used by the job. HiveServer2 may have 
> secrets that the job should not have, such as the metastore database password 
> or the password to decrypt its private SSL certificate. It is also best 
> practice to keep separate passwords in separate files. To facilitate this, 
> Hive will also accept:
> * A configuration for a path to a credential store to use for jobs. This 
> should already be uploaded in HDFS. (hive.server2.job.keystore.location or a 
> better name) If this is not specified, then HS2 will simply use the value of 
> hadoop.security.credential.provider.path.
> * An environment variable for the password to decrypt the credential store 
> (HIVE_JOB_KEYSTORE_PASSWORD or better). If this is not specified, then HS2 
> will simply use the standard environment variable for decrypting the Hadoop 
> Credential Provider.
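The per-engine properties listed above can be sketched as a small helper. The function name and shape are hypothetical, not HiveServer2 code; only the property keys come from the description:

```python
def job_env_overrides(engine: str, password: str,
                      var: str = "HADOOP_CREDSTORE_PASSWORD") -> dict:
    # Build the job-conf entries that carry the credstore password into
    # the launched job's environment.
    if engine == "mr":
        setting = f"{var}={password}"
        return {
            "yarn.app.mapreduce.am.admin.user.env": setting,
            "mapreduce.admin.user.env": setting,
        }
    if engine == "spark":
        return {
            f"spark.yarn.appMasterEnv.{var}": password,
            f"spark.executorEnv.{var}": password,
        }
    raise ValueError(f"unknown execution engine: {engine}")
```

In the real flow, HiveServer2 would read the password from its own environment rather than take it as an argument.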





[jira] [Updated] (HIVE-14856) create table with select from table limit is failing with NFE if limit exceed than allowed 32bit integer length

2016-09-29 Thread Rajkumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-14856:
--
Attachment: HIVE-14856.1-branch-1.2.patch

> create table with select from table limit is failing with NFE if limit exceed 
> than allowed 32bit integer length
> ---
>
> Key: HIVE-14856
> URL: https://issues.apache.org/jira/browse/HIVE-14856
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
> Environment: centos 6.6
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
> Fix For: 1.2.1
>
> Attachments: HIVE-14856.1-branch-1.2.patch, HIVE-14856.patch
>
>
> A query with LIMIT fails with a NumberFormatException if the limit exceeds 
> the signed 32-bit integer range.
> create table sample1 as select * from sample limit 2248321440;
> FAILED: NumberFormatException For input string: "2248321440"
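The failure is ordinary 32-bit overflow at parse time: 2248321440 is larger than 2^31 - 1. A hedged Python sketch of the check (Java's Integer.parseInt does the equivalent and throws NumberFormatException):

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1  # Java int range

def parse_int32(s: str) -> int:
    # Mirror of Java's Integer.parseInt bounds check: reject anything
    # outside the signed 32-bit range with a similar message.
    v = int(s)
    if not (INT32_MIN <= v <= INT32_MAX):
        raise ValueError(f'For input string: "{s}"')
    return v
```

A fix would parse the limit as a 64-bit long (or cap it) instead of forcing it through an int parser.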





[jira] [Updated] (HIVE-14863) Decimal to int conversion produces incorrect values

2016-09-29 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14863:

Summary: Decimal to int conversion produces incorrect values  (was: Decimal 
to int conversion produces incorrect data)

> Decimal to int conversion produces incorrect values
> ---
>
> Key: HIVE-14863
> URL: https://issues.apache.org/jira/browse/HIVE-14863
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> {noformat}
> > select cast(cast ('111' as decimal(38,0)) as int);
> OK
> 307163591
> {noformat}
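One plausible mechanism (an assumption; Hive's actual cast path may differ) is an unchecked narrowing conversion that keeps only the low 32 bits of the value instead of overflow-checking, which maps a huge decimal onto an unrelated-looking int:

```python
def narrow_to_int32(v: int) -> int:
    # Keep only the low 32 bits, then reinterpret as a signed value;
    # this is what an unchecked (int) narrowing cast does in Java.
    v &= 0xFFFFFFFF
    return v - 2**32 if v >= 2**31 else v
```

A correct cast would instead detect that the value is out of range and either error out or return NULL, depending on strictness settings.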





[jira] [Commented] (HIVE-13098) Add a strict check for when the decimal gets converted to null due to insufficient width

2016-09-29 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534325#comment-15534325
 ] 

Sergey Shelukhin commented on HIVE-13098:
-

I will stop working on this for now because it's a giant, annoying time sink. If 
there are no objections to the threadlocal approach, I will go with that.
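The threadlocal idea can be sketched quickly. Names here are hypothetical Python stand-ins for the proposed compile-time flag, not Hive code: the compiling thread sets the strictness flag per query, and other threads see their own (default) value:

```python
import threading

# Per-thread settings holder; each thread gets an independent copy.
_settings = threading.local()

def set_strict_decimal(enabled: bool) -> None:
    # Called by the compiling thread, once per query.
    _settings.strict_decimal = enabled

def narrow_decimal(value: int, max_digits: int):
    # Value does not fit: either error out (strict) or become NULL (today's
    # silent behavior, modeled as None).
    if len(str(abs(value))) > max_digits:
        if getattr(_settings, "strict_decimal", False):
            raise ValueError(f"{value} does not fit in decimal({max_digits},0)")
        return None
    return value
```

This gives per-query control without a process-wide global, at the cost of the usual thread-local caveats when work crosses thread boundaries.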

> Add a strict check for when the decimal gets converted to null due to 
> insufficient width
> 
>
> Key: HIVE-13098
> URL: https://issues.apache.org/jira/browse/HIVE-13098
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13098.WIP.patch, HIVE-13098.WIP2.patch
>
>
> When e.g. 99 is selected as decimal(5,0), the result is null. This can be 
> problematic, esp. if the data is written to a table and lost without the user 
> realizing it. There should be an option to error out in such cases instead; 
> it should probably be on by default and the error message should instruct the 
> user on how to disable it.





[jira] [Comment Edited] (HIVE-13098) Add a strict check for when the decimal gets converted to null due to insufficient width

2016-09-29 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534285#comment-15534285
 ] 

Sergey Shelukhin edited comment on HIVE-13098 at 9/29/16 10:49 PM:
---

Upon looking further at the OI path, I don't think it's possible to propagate it 
there without major changes; in fact, the OI-related parts of this patch are not 
valid, since OIs are assumed to be stateless and are cached process-wide, ditto 
for TypeInfo-s. There are lots of static method paths accessing those...
I think I might scrap a lot of the patch and add a globally accessible static 
that would have to be initialized on CLI/HS2/task startup. The only exception 
would be the write path that happens outside of Hive services.

This will reduce the size of the patch a lot (but it also makes it a global 
setting, not modifiable per query...)

Update: another alternative would be a (TADA!) threadlocal.
We could set it at compile time and change the patch so that only compile paths 
use it, whereas runtime paths would use the fields in OIs and fns that compile 
populates. As much as I hate threadlocals, I think that's the best approach: it 
will make the patch smaller (right now 700kb of code changes is not even 
everything; the OI changes would be massive), allow setting it per query, and 
remove the requirement to initialize it for everyone using Hive libs, since 
APIs would not use it beyond compilation.

[~ashutoshc] [~hagleitn] [~jdere] opinions?


was (Author: sershe):
Upon looking further on OI path I don't think it's possible to propagate it 
there without major changes, in fact OI-related parts of this patch are not 
valid, since OIs are assumed to be stateless and are cached process-wide, ditto 
for TypeInfo-s. There are lots of static method paths accessing those...
I think I might scrape a lot of the patch and add a globally accessible static 
that would have to be initialize on CLI/HS2/task startup.. The only exception 
would be write path that happens outside of Hive services... 

This will reduce size of the patch a lot (but also make it a global setting not 
modifiable per query...)

Update: another alternative would be a (TADA!) threadlocal.
We could set it at compile time and change the patch to have only compile paths 
use it, whereas runtime paths would use the fields in OIs and fns that compile 
populates.

[~ashutoshc] [~hagleitn] [~jdere] opinions?

> Add a strict check for when the decimal gets converted to null due to 
> insufficient width
> 
>
> Key: HIVE-13098
> URL: https://issues.apache.org/jira/browse/HIVE-13098
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13098.WIP.patch, HIVE-13098.WIP2.patch
>
>
> When e.g. 99 is selected as decimal(5,0), the result is null. This can be 
> problematic, esp. if the data is written to a table and lost without the user 
> realizing it. There should be an option to error out in such cases instead; 
> it should probably be on by default and the error message should instruct the 
> user on how to disable it.





[jira] [Comment Edited] (HIVE-13098) Add a strict check for when the decimal gets converted to null due to insufficient width

2016-09-29 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534285#comment-15534285
 ] 

Sergey Shelukhin edited comment on HIVE-13098 at 9/29/16 10:47 PM:
---

Upon looking further at the OI path, I don't think it's possible to propagate it 
there without major changes; in fact, the OI-related parts of this patch are not 
valid, since OIs are assumed to be stateless and are cached process-wide, ditto 
for TypeInfo-s. There are lots of static method paths accessing those...
I think I might scrap a lot of the patch and add a globally accessible static 
that would have to be initialized on CLI/HS2/task startup. The only exception 
would be the write path that happens outside of Hive services.

This will reduce the size of the patch a lot (but it also makes it a global 
setting, not modifiable per query...)

Update: another alternative would be a (TADA!) threadlocal.
We could set it at compile time and change the patch so that only compile paths 
use it, whereas runtime paths would use the fields in OIs and fns that compile 
populates.

[~ashutoshc] [~hagleitn] [~jdere] opinions?


was (Author: sershe):
Upon looking further on OI path I don't think it's possible to propagate it 
there without major changes, in fact OI-related parts of this patch are not 
valid, since OIs are assumed to be stateless and are cached process-wide, ditto 
for TypeInfo-s. There are lots of static method paths accessing those...
I think I might scrape a lot of the patch and add a globally accessible static 
that would have to be initialize on CLI/HS2/task startup.. The only exception 
would be write path that happens outside of Hive services... 

This will reduce size of the patch a lot (but also make it a global setting not 
modifiable per query...)

[~ashutoshc] [~hagleitn] [~jdere] opinions?

> Add a strict check for when the decimal gets converted to null due to 
> insufficient width
> 
>
> Key: HIVE-13098
> URL: https://issues.apache.org/jira/browse/HIVE-13098
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13098.WIP.patch, HIVE-13098.WIP2.patch
>
>
> When e.g. 99 is selected as decimal(5,0), the result is null. This can be 
> problematic, esp. if the data is written to a table and lost without the user 
> realizing it. There should be an option to error out in such cases instead; 
> it should probably be on by default and the error message should instruct the 
> user on how to disable it.





[jira] [Commented] (HIVE-13098) Add a strict check for when the decimal gets converted to null due to insufficient width

2016-09-29 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534285#comment-15534285
 ] 

Sergey Shelukhin commented on HIVE-13098:
-

Upon looking further at the OI path, I don't think it's possible to propagate it 
there without major changes; in fact, the OI-related parts of this patch are not 
valid, since OIs are assumed to be stateless and are cached process-wide, ditto 
for TypeInfo-s. There are lots of static method paths accessing those...
I think I might scrap a lot of the patch and add a globally accessible static 
that would have to be initialized on CLI/HS2/task startup. The only exception 
would be the write path that happens outside of Hive services.

This will reduce the size of the patch a lot (but it also makes it a global 
setting, not modifiable per query...)

[~ashutoshc] [~hagleitn] [~jdere] opinions?

> Add a strict check for when the decimal gets converted to null due to 
> insufficient width
> 
>
> Key: HIVE-13098
> URL: https://issues.apache.org/jira/browse/HIVE-13098
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13098.WIP.patch, HIVE-13098.WIP2.patch
>
>
> When e.g. 99 is selected as decimal(5,0), the result is null. This can be 
> problematic, esp. if the data is written to a table and lost without the user 
> realizing it. There should be an option to error out in such cases instead; 
> it should probably be on by default and the error message should instruct the 
> user on how to disable it.





[jira] [Updated] (HIVE-14861) Support precedence for set operator using parentheses

2016-09-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14861:
---
Status: Patch Available  (was: Open)

> Support precedence for set operator using parentheses
> -
>
> Key: HIVE-14861
> URL: https://issues.apache.org/jira/browse/HIVE-14861
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14861.01.patch
>
>
> We should support precedence for set operators by using parentheses. For 
> example:
> {code}
> select * from src union all (select * from src union select * from src);
> {code}





[jira] [Updated] (HIVE-14861) Support precedence for set operator using parentheses

2016-09-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14861:
---
Attachment: HIVE-14861.01.patch

> Support precedence for set operator using parentheses
> -
>
> Key: HIVE-14861
> URL: https://issues.apache.org/jira/browse/HIVE-14861
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14861.01.patch
>
>
> We should support precedence for set operators by using parentheses. For 
> example:
> {code}
> select * from src union all (select * from src union select * from src);
> {code}





[jira] [Commented] (HIVE-14358) Add metrics for number of queries executed for each execution engine (mr, spark, tez)

2016-09-29 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534185#comment-15534185
 ] 

Lefty Leverenz commented on HIVE-14358:
---

Good plan, [~zsombor.klara].  (I was mixing up the two web interfaces, thanks 
for setting me straight.)

I just created the child page "Hive Metrics" -- if you have a better title, 
please change it.  I  listed all the metrics in MetricsConstant.java but wasn't 
sure how to deal with the prefixes for HS2 & SQL operations.

* [Hive Metrics | https://cwiki.apache.org/confluence/display/Hive/Hive+Metrics]

Versions and JIRA issues for all the metrics will be added after a bit of 
research.

What about other metrics, such as LLAP metrics created by HIVE-13536?

A Metrics Dump screen shot would be helpful too.

> Add metrics for number of queries executed for each execution engine (mr, 
> spark, tez)
> -
>
> Key: HIVE-14358
> URL: https://issues.apache.org/jira/browse/HIVE-14358
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Lenni Kuff
>Assignee: Barna Zsombor Klara
> Fix For: 2.2.0
>
> Attachments: HIVE-14358.patch
>
>
> HiveServer2 currently has a metric for the total number of queries run since 
> the last restart, but it would be useful to also have metrics for the number 
> of queries run on each execution engine. This would improve supportability by 
> allowing users to get a high-level understanding of what workloads have been 
> running on the server. 





[jira] [Updated] (HIVE-14806) Support UDTF in CBO (AST return path)

2016-09-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14806:
---
Status: Open  (was: Patch Available)

> Support UDTF in CBO (AST return path)
> -
>
> Key: HIVE-14806
> URL: https://issues.apache.org/jira/browse/HIVE-14806
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14806.01.patch, HIVE-14806.02.patch
>
>






[jira] [Updated] (HIVE-14806) Support UDTF in CBO (AST return path)

2016-09-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14806:
---
Attachment: HIVE-14806.02.patch

> Support UDTF in CBO (AST return path)
> -
>
> Key: HIVE-14806
> URL: https://issues.apache.org/jira/browse/HIVE-14806
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14806.01.patch, HIVE-14806.02.patch
>
>






[jira] [Updated] (HIVE-14806) Support UDTF in CBO (AST return path)

2016-09-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14806:
---
Status: Patch Available  (was: Open)

Address [~ashutoshc]'s comments and also update golden files.

> Support UDTF in CBO (AST return path)
> -
>
> Key: HIVE-14806
> URL: https://issues.apache.org/jira/browse/HIVE-14806
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14806.01.patch, HIVE-14806.02.patch
>
>






[jira] [Commented] (HIVE-14858) Analyze command should support custom input formats

2016-09-29 Thread Prasanna Rajaperumal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534102#comment-15534102
 ] 

Prasanna Rajaperumal commented on HIVE-14858:
-

+1 Looks good. 

> Analyze command should support custom input formats
> ---
>
> Key: HIVE-14858
> URL: https://issues.apache.org/jira/browse/HIVE-14858
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Minor
> Attachments: HIVE-14858.1.patch
>
>
> Currently analyze command with partialscan or noscan only applies to 
> OrcInputFormat and MapredParquetInputFormat. However, if custom input formats 
> extend these two they should also be able to use the same command to collect 
> stats.





[jira] [Commented] (HIVE-14735) Build Infra: Spark artifacts download takes a long time

2016-09-29 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534076#comment-15534076
 ] 

Matt McCline commented on HIVE-14735:
-

Oh, on my Mac laptop and usually current master.

> Build Infra: Spark artifacts download takes a long time
> ---
>
> Key: HIVE-14735
> URL: https://issues.apache.org/jira/browse/HIVE-14735
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Vaibhav Gumashta
>
> In particular this command:
> {{curl -Sso ./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz 
> http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.6.0-bin-hadoop2-without-hive.tgz}}





[jira] [Commented] (HIVE-14735) Build Infra: Spark artifacts download takes a long time

2016-09-29 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534060#comment-15534060
 ] 

Sergio Peña commented on HIVE-14735:


Are you running Linux or Mac? If Linux, which distro?

> Build Infra: Spark artifacts download takes a long time
> ---
>
> Key: HIVE-14735
> URL: https://issues.apache.org/jira/browse/HIVE-14735
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Vaibhav Gumashta
>
> In particular this command:
> {{curl -Sso ./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz 
> http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.6.0-bin-hadoop2-without-hive.tgz}}





[jira] [Updated] (HIVE-14819) FunctionInfo for permanent functions shows TEMPORARY FunctionType

2016-09-29 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-14819:
--
   Resolution: Fixed
Fix Version/s: 2.1.0
   Status: Resolved  (was: Patch Available)

Committed to master

> FunctionInfo for permanent functions shows TEMPORARY FunctionType
> -
>
> Key: HIVE-14819
> URL: https://issues.apache.org/jira/browse/HIVE-14819
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.1.0
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 2.1.0
>
> Attachments: HIVE-14819.1.patch, HIVE-14819.2.patch
>
>
> The FunctionInfo has a FunctionType field which describes if the function is 
> a builtin/persistent/temporary function. But for permanent functions, the 
> FunctionInfo being returned by the FunctionRegistry is showing the type to be 
> TEMPORARY.
> This affects things which may be depending on function type, for example 
> LlapDecider, which will allow builtin/persistent UDFs to be used in LLAP but 
> not temporary functions.





[jira] [Updated] (HIVE-14830) Move a majority of the MiniLlapCliDriver tests to use an inline AM

2016-09-29 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14830:
--
Attachment: HIVE-14830.03.patch

Looks like acid_bucket_pruning.q is a flaky test, and fails in some 
combinations (or reveals a real bug with these combinations). Opening a 
separate jira to track this, and uploading a patch with the test possibly 
re-ordered.

> Move a majority of the MiniLlapCliDriver tests to use an inline AM
> --
>
> Key: HIVE-14830
> URL: https://issues.apache.org/jira/browse/HIVE-14830
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14830.01.patch, HIVE-14830.01.patch, 
> HIVE-14830.02.patch, HIVE-14830.02_OnHive14854.txt, HIVE-14830.03.patch
>
>






[jira] [Commented] (HIVE-14830) Move a majority of the MiniLlapCliDriver tests to use an inline AM

2016-09-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533978#comment-15533978
 ] 

Hive QA commented on HIVE-14830:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12830823/HIVE-14830.02.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1346/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1346/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1346/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-09-29 20:36:14.626
+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1346/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-09-29 20:36:14.629
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   a6c6080..474425a  master -> origin/master
+ git reset --hard HEAD
HEAD is now at a6c6080 HIVE-14852. Change qtest logging to not redirect all 
logs to console. (Siddharth Seth, reviewed by Prasanth Jayachandran)
+ git clean -f -d
Removing ql/src/java/org/apache/hadoop/hive/ql/udf/generic/DecimalUdf.java
Removing 
storage-api/src/java/org/apache/hadoop/hive/common/type/HiveDecimalOverflow.java
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 474425a HIVE-14854. Add a core cluster type to QTestUtil. 
(Siddharth Seth, reviewed by Prasanth Jayachandran)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-09-29 20:36:16.205
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
error: patch failed: 
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/AbstractCliConfig.java:409
error: 
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/AbstractCliConfig.java:
 patch does not apply
error: patch failed: 
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCliDriver.java:61
error: 
itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCliDriver.java:
 patch does not apply
error: patch failed: 
itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java:50
error: itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java: 
patch does not apply
error: patch failed: 
llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapDaemon.java:116
error: 
llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapDaemon.java: 
patch does not apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12830823 - PreCommit-HIVE-Build

> Move a majority of the MiniLlapCliDriver tests to use an inline AM
> --
>
> Key: HIVE-14830
> URL: https://issues.apache.org/jira/browse/HIVE-14830
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14830.01.patch, HIVE-14830.01.patch, 
> HIVE-14830.02.patch, HIVE-14830.02_OnHive14854.txt
>
>






[jira] [Commented] (HIVE-14784) Operation logs are disabled automatically if the parent directory does not exist.

2016-09-29 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533942#comment-15533942
 ] 

Yongzhi Chen commented on HIVE-14784:
-

+1

> Operation logs are disabled automatically if the parent directory does not 
> exist.
> -
>
> Key: HIVE-14784
> URL: https://issues.apache.org/jira/browse/HIVE-14784
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-14784.1.patch, HIVE-14784.patch
>
>
> Operation logging is disabled automatically for the query if for some reason 
> the parent directory (named after the hive session id) that gets created when 
> the session is established gets deleted (for any reason). For ex: if the 
> operation logdir is /tmp which automatically can get purged at a configured 
> interval by the OS.
> Running a query from that session leads to
> {code}
> 2016-09-15 15:09:16,723 WARN org.apache.hive.service.cli.operation.Operation: 
> Unable to create operation log file: 
> /tmp/hive/operation_logs/b8809985-6b38-47ec-a49b-6158a67cd9fc/d35414f7-2418-426c-8489-c6f643ca4599
> java.io.IOException: No such file or directory
>   at java.io.UnixFileSystem.createFileExclusively(Native Method)
>   at java.io.File.createNewFile(File.java:1012)
>   at 
> org.apache.hive.service.cli.operation.Operation.createOperationLog(Operation.java:195)
>   at 
> org.apache.hive.service.cli.operation.Operation.beforeRun(Operation.java:237)
>   at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:255)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:398)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:385)
>   at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:490)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> This later leads to errors like (more prominent when using HUE as HUE does 
> not close hive sessions and attempts to retrieve the operations logs days 
> after they were created).
> {code}
> WARN org.apache.hive.service.cli.thrift.ThriftCLIService: Error fetching 
> results: 
> org.apache.hive.service.cli.HiveSQLException: Couldn't find log associated 
> with operation handle: OperationHandle [opType=EXECUTE_STATEMENT, 
> getHandleIdentifier()=d35414f7-2418-426c-8489-c6f643ca4599]
>   at 
> org.apache.hive.service.cli.operation.OperationManager.getOperationLogRowSet(OperationManager.java:259)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:701)
>   at 
> org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:451)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:676)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745) 
> {code}




[jira] [Commented] (HIVE-14858) Analyze command should support custom input formats

2016-09-29 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533940#comment-15533940
 ] 

Prasanth Jayachandran commented on HIVE-14858:
--

lgtm too, +1

> Analyze command should support custom input formats
> ---
>
> Key: HIVE-14858
> URL: https://issues.apache.org/jira/browse/HIVE-14858
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Minor
> Attachments: HIVE-14858.1.patch
>
>
> Currently analyze command with partialscan or noscan only applies to 
> OrcInputFormat and MapredParquetInputFormat. However, if custom input formats 
> extend these two they should also be able to use the same command to collect 
> stats.





[jira] [Commented] (HIVE-14858) Analyze command should support custom input formats

2016-09-29 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533933#comment-15533933
 ] 

Chao Sun commented on HIVE-14858:
-

cc [~prasanth_j] as well since this changes your original code.

> Analyze command should support custom input formats
> ---
>
> Key: HIVE-14858
> URL: https://issues.apache.org/jira/browse/HIVE-14858
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Minor
> Attachments: HIVE-14858.1.patch
>
>
> Currently analyze command with partialscan or noscan only applies to 
> OrcInputFormat and MapredParquetInputFormat. However, if custom input formats 
> extend these two they should also be able to use the same command to collect 
> stats.





[jira] [Commented] (HIVE-14858) Analyze command should support custom input formats

2016-09-29 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533929#comment-15533929
 ] 

Xuefu Zhang commented on HIVE-14858:


+1

> Analyze command should support custom input formats
> ---
>
> Key: HIVE-14858
> URL: https://issues.apache.org/jira/browse/HIVE-14858
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Minor
> Attachments: HIVE-14858.1.patch
>
>
> Currently analyze command with partialscan or noscan only applies to 
> OrcInputFormat and MapredParquetInputFormat. However, if custom input formats 
> extend these two they should also be able to use the same command to collect 
> stats.





[jira] [Updated] (HIVE-14858) Analyze command should support custom input formats

2016-09-29 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-14858:

Status: Patch Available  (was: Open)

> Analyze command should support custom input formats
> ---
>
> Key: HIVE-14858
> URL: https://issues.apache.org/jira/browse/HIVE-14858
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Minor
> Attachments: HIVE-14858.1.patch
>
>
> Currently analyze command with partialscan or noscan only applies to 
> OrcInputFormat and MapredParquetInputFormat. However, if custom input formats 
> extend these two they should also be able to use the same command to collect 
> stats.





[jira] [Updated] (HIVE-14854) Add a core cluster type to QTestUtil

2016-09-29 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14854:
--
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

> Add a core cluster type to QTestUtil
> 
>
> Key: HIVE-14854
> URL: https://issues.apache.org/jira/browse/HIVE-14854
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 2.2.0
>
> Attachments: HIVE-14854.01.patch, HIVE-14854.02.patch, 
> HIVE-14854.03.patch
>
>
> Follow up to HIVE-14824. There's tez, tez_local, llap, llap_local - all of 
> which are of a single type; similarly spark, sparkOnYarn, and none, mr. 
> Introducing a core cluster type to make a bunch of conditional checks simpler.





[jira] [Commented] (HIVE-14768) Add a new UDTF ExplodeByNumber

2016-09-29 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533919#comment-15533919
 ] 

Pengcheng Xiong commented on HIVE-14768:


[~ashutoshc], it actually looks at the arguments. Please see the new test case 
that I have added. Thanks.

> Add a new UDTF ExplodeByNumber
> --
>
> Key: HIVE-14768
> URL: https://issues.apache.org/jira/browse/HIVE-14768
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14768.01.patch, HIVE-14768.02.patch, 
> HIVE-14768.03.patch
>
>
> For INTERSECT ALL and EXCEPT ALL implementation purposes.





[jira] [Updated] (HIVE-14768) Add a new UDTF ExplodeByNumber

2016-09-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14768:
---
Status: Patch Available  (was: Open)

> Add a new UDTF ExplodeByNumber
> --
>
> Key: HIVE-14768
> URL: https://issues.apache.org/jira/browse/HIVE-14768
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14768.01.patch, HIVE-14768.02.patch, 
> HIVE-14768.03.patch
>
>
> For INTERSECT ALL and EXCEPT ALL implementation purposes.





[jira] [Updated] (HIVE-14768) Add a new UDTF ExplodeByNumber

2016-09-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14768:
---
Attachment: HIVE-14768.03.patch

> Add a new UDTF ExplodeByNumber
> --
>
> Key: HIVE-14768
> URL: https://issues.apache.org/jira/browse/HIVE-14768
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14768.01.patch, HIVE-14768.02.patch, 
> HIVE-14768.03.patch
>
>
> For INTERSECT ALL and EXCEPT ALL implementation purposes.





[jira] [Updated] (HIVE-14768) Add a new UDTF ExplodeByNumber

2016-09-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14768:
---
Status: Open  (was: Patch Available)

> Add a new UDTF ExplodeByNumber
> --
>
> Key: HIVE-14768
> URL: https://issues.apache.org/jira/browse/HIVE-14768
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14768.01.patch, HIVE-14768.02.patch, 
> HIVE-14768.03.patch
>
>
> For INTERSECT ALL and EXCEPT ALL implementation purposes.





[jira] [Updated] (HIVE-14721) Fix TestJdbcWithMiniHS2 runtime

2016-09-29 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14721:
--
Status: Patch Available  (was: Open)

> Fix TestJdbcWithMiniHS2 runtime
> ---
>
> Key: HIVE-14721
> URL: https://issues.apache.org/jira/browse/HIVE-14721
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-14721.1.patch
>
>
> Currently 450s





[jira] [Commented] (HIVE-14735) Build Infra: Spark artifacts download takes a long time

2016-09-29 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533893#comment-15533893
 ] 

Matt McCline commented on HIVE-14735:
-


{code}
...
   [exec] arget/spark
 [exec] + [[ ! -f ./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz 
]]
 [exec] + local md5File=spark-1.6.0-bin-hadoop2-without-hive.tgz.md5sum
 [exec] + curl -Sso 
./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz.md5sum 
http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.6.0-bin-hadoop2-without-hive.tgz.md5sum
 [exec] + cd ./../thirdparty
 [exec] + md5sum -c spark-1.6.0-bin-hadoop2-without-hive.tgz.md5sum
 [exec] ../target/download.sh: line 18: md5sum: command not found
 [exec] + curl -Sso 
./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz 
http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.6.0-bin-hadoop2-without-hive.tgz
 [exec] + cd -
 [exec] + tar -zxf ./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz 
-C ./target
 [exec] /Users/mmccline/VecDetail/itests/qtest-spark
 [exec] + mv ./target/spark-1.6.0-bin-hadoop2-without-hive ./target/spark
 [exec] + cp -f ./target/../../..//data/conf/spark/log4j2.properties 
./target/spark/conf/
{code}

After the "./target/download.sh: line 18: md5sum: command not found" line, the 
download of "+ curl -Sso 
./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz " takes a very long 
time and happens every time.  I tried downloading a version of md5sum and that 
seems to make it worse -- the build went off and hung.
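For reference, a minimal portable sketch of the checksum step (the 
{{checksum_ok}} helper name and its fallback behavior are illustrative 
assumptions, not Hive's actual download.sh): prefer GNU {{md5sum}} where it 
exists, fall back to BSD {{md5 -q}} on macOS, and return a distinct status 
when neither tool is present so the caller can simply re-download.

```shell
#!/bin/sh
# Sketch only: checksum_ok is a hypothetical helper, not Hive's download.sh.
# Returns 0 if the file's MD5 matches, 1 on mismatch, 2 if no md5 tool exists.
checksum_ok() {
  file="$1"; expected="$2"
  if command -v md5sum >/dev/null 2>&1; then
    # GNU coreutils (Linux): "md5sum file" prints "<hash>  <file>"
    actual=$(md5sum "$file" | awk '{print $1}')
  elif command -v md5 >/dev/null 2>&1; then
    # BSD/macOS: "md5 -q file" prints just the hash
    actual=$(md5 -q "$file")
  else
    echo "no md5 tool found; skipping verification" >&2
    return 2
  fi
  [ "$actual" = "$expected" ]
}
```

The caller could treat status 2 as "verify impossible" and fall back to 
re-downloading the tarball, instead of failing the build outright.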

> Build Infra: Spark artifacts download takes a long time
> ---
>
> Key: HIVE-14735
> URL: https://issues.apache.org/jira/browse/HIVE-14735
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Vaibhav Gumashta
>
> In particular this command:
> {{curl -Sso ./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz 
> http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.6.0-bin-hadoop2-without-hive.tgz}}





[jira] [Commented] (HIVE-13098) Add a strict check for when the decimal gets converted to null due to insufficient width

2016-09-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533799#comment-15533799
 ] 

Hive QA commented on HIVE-13098:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12830837/HIVE-13098.WIP2.patch

{color:green}SUCCESS:{color} +1 due to 57 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 97 failed/errored test(s), 10645 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_select]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_5]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_precision]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_skewjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_stats]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_ppd_decimal]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_decimal]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_format_number]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_greatest]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_least]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_to_byte]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_to_long]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_to_short]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_aggregate_9]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_between_in]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_cast_constant]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_aggregate]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_precision]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_udf]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_struct_in]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_0]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_13]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_17]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_short_regress]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[tez_union_decimal]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_1]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_aggregate_9]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_between_in]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_cast_constant]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_char_mapjoin1]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_decimal_2]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_decimal_3]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_decimal_aggregate]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_decimal_precision]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_decimal_udf]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_inner_join]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_interval_mapjoin]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_join_filters]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_left_outer_join2]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_left_outer_join]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_leftsemi_mapjoin]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_mapjoin_reduce]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_outer_join0]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_outer_join1]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_outer_join2]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_outer_join3]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_outer_join4]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_outer_join5]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[vector_outer_join6]

[jira] [Updated] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support

2016-09-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-5317:
-
Component/s: Transactions

> Implement insert, update, and delete in Hive with full ACID support
> ---
>
> Key: HIVE-5317
> URL: https://issues.apache.org/jira/browse/HIVE-5317
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.14.0
>
> Attachments: InsertUpdatesinHive.pdf
>
>
> Many customers want to be able to insert, update and delete rows from Hive 
> tables with full ACID support. The use cases are varied, but the form of the 
> queries that should be supported are:
> * INSERT INTO tbl SELECT …
> * INSERT INTO tbl VALUES ...
> * UPDATE tbl SET … WHERE …
> * DELETE FROM tbl WHERE …
> * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN 
> ...
> * SET TRANSACTION LEVEL …
> * BEGIN/END TRANSACTION
> Use Cases
> * Once an hour, a set of inserts and updates (up to 500k rows) for various 
> dimension tables (eg. customer, inventory, stores) needs to be processed. The 
> dimension tables have primary keys and are typically bucketed and sorted on 
> those keys.
> * Once a day a small set (up to 100k rows) of records need to be deleted for 
> regulatory compliance.
> * Once an hour a log of transactions is exported from a RDBS and the fact 
> tables need to be updated (up to 1m rows)  to reflect the new data. The 
> transactions are a combination of inserts, updates, and deletes. The table is 
> partitioned and bucketed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14784) Operation logs are disabled automatically if the parent directory does not exist.

2016-09-29 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533540#comment-15533540
 ] 

Naveen Gangam commented on HIVE-14784:
--

Forgot to comment on this. The failures do not appear to be related; other prior 
builds show the same failures. So +1 from me.

> Operation logs are disabled automatically if the parent directory does not 
> exist.
> -
>
> Key: HIVE-14784
> URL: https://issues.apache.org/jira/browse/HIVE-14784
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-14784.1.patch, HIVE-14784.patch
>
>
> Operation logging is disabled automatically for a query if, for some reason, 
> the parent directory (named after the Hive session id) that is created when 
> the session is established gets deleted. For example, if the operation log 
> dir is under /tmp, the OS may purge it automatically at a configured interval.
> Running a query from that session leads to
> {code}
> 2016-09-15 15:09:16,723 WARN org.apache.hive.service.cli.operation.Operation: 
> Unable to create operation log file: 
> /tmp/hive/operation_logs/b8809985-6b38-47ec-a49b-6158a67cd9fc/d35414f7-2418-426c-8489-c6f643ca4599
> java.io.IOException: No such file or directory
>   at java.io.UnixFileSystem.createFileExclusively(Native Method)
>   at java.io.File.createNewFile(File.java:1012)
>   at 
> org.apache.hive.service.cli.operation.Operation.createOperationLog(Operation.java:195)
>   at 
> org.apache.hive.service.cli.operation.Operation.beforeRun(Operation.java:237)
>   at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:255)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:398)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:385)
>   at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:490)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> This later leads to errors like the following (more prominent when using HUE, 
> since HUE does not close Hive sessions and may attempt to retrieve the 
> operation logs days after they were created).
> {code}
> WARN org.apache.hive.service.cli.thrift.ThriftCLIService: Error fetching 
> results: 
> org.apache.hive.service.cli.HiveSQLException: Couldn't find log associated 
> with operation handle: OperationHandle [opType=EXECUTE_STATEMENT, 
> getHandleIdentifier()=d35414f7-2418-426c-8489-c6f643ca4599]
>   at 
> org.apache.hive.service.cli.operation.OperationManager.getOperationLogRowSet(OperationManager.java:259)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:701)
>   at 
> org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:451)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:676)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> 

[jira] [Commented] (HIVE-14854) Add a core cluster type to QTestUtil

2016-09-29 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533510#comment-15533510
 ] 

Prasanth Jayachandran commented on HIVE-14854:
--

nit: typo in "CoreClusteType".
Should the enum values all be capital case?

Other than that lgtm, +1

> Add a core cluster type to QTestUtil
> 
>
> Key: HIVE-14854
> URL: https://issues.apache.org/jira/browse/HIVE-14854
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14854.01.patch, HIVE-14854.02.patch
>
>
> Follow-up to HIVE-14824. There are tez, tez_local, llap, and llap_local - all 
> of which share a single core type; similarly spark and sparkOnYarn; and none 
> and mr. Introducing a core cluster type makes a bunch of conditional checks 
> simpler.
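The idea can be sketched as a small enum mapping. The names below are illustrative assumptions for the sketch, not the committed Hive code:

```java
// Sketch of the HIVE-14854 idea: collapse the many mini-cluster variants into
// a small "core" type so conditional checks become a single enum comparison.
public class ClusterTypes {
    public enum CoreClusterType { MR, TEZ, SPARK }

    public enum MiniClusterType {
        NONE(CoreClusterType.MR),
        MR(CoreClusterType.MR),
        TEZ(CoreClusterType.TEZ),
        TEZ_LOCAL(CoreClusterType.TEZ),
        LLAP(CoreClusterType.TEZ),
        LLAP_LOCAL(CoreClusterType.TEZ),
        SPARK(CoreClusterType.SPARK),
        SPARK_ON_YARN(CoreClusterType.SPARK);

        private final CoreClusterType core;
        MiniClusterType(CoreClusterType core) { this.core = core; }
        public CoreClusterType getCoreType() { return core; }
    }

    public static void main(String[] args) {
        // A check like "is this any flavor of Tez?" becomes one comparison.
        System.out.println(MiniClusterType.LLAP_LOCAL.getCoreType()); // TEZ
    }
}
```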



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14778) document threading model of Streaming API

2016-09-29 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14778:
--
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

pushed to master 
https://github.com/apache/hive/commit/20304c0705c4ad861b5915dacceaa6d6bdfe91fc
Thanks Alan for the review

> document threading model of Streaming API
> -
>
> Key: HIVE-14778
> URL: https://issues.apache.org/jira/browse/HIVE-14778
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 2.2.0
>
> Attachments: HIVE-14778.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The model is not obvious and needs to be documented properly.
> A StreamingConnection internally maintains 2 MetaStoreClient objects (each 
> has 1 Thrift client for actual RPC). Let's call them "primary" and 
> "heartbeat". Each TransactionBatch created from a given StreamingConnection, 
> gets a reference to both of these MetaStoreClients. 
> So the model is that there is at most 1 outstanding (not closed) 
> TransactionBatch for any given StreamingConnection and for any given 
> TransactionBatch there can be at most 2 threads accessing it concurrently. 1 
> thread calling TransactionBatch.heartbeat() (and nothing else) and the other 
> calling all other methods.
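The documented pattern (one worker thread, one dedicated heartbeat thread per batch) can be sketched as below. `FakeBatch` is a stand-in for illustration, not the real Streaming API:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the threading model above: per batch, at most two threads touch
// it - one doing the real work, one calling only heartbeat().
public class StreamingThreadingSketch {
    static class FakeBatch {
        final AtomicInteger writes = new AtomicInteger();
        final AtomicInteger heartbeats = new AtomicInteger();
        synchronized void heartbeat() { heartbeats.incrementAndGet(); }
        synchronized void write(String record) { writes.incrementAndGet(); }
    }

    public static int writeWithHeartbeats(int records) {
        FakeBatch batch = new FakeBatch();
        ScheduledExecutorService hb = Executors.newSingleThreadScheduledExecutor();
        try {
            // Heartbeat thread: calls heartbeat() and nothing else.
            hb.scheduleAtFixedRate(batch::heartbeat, 0, 10, TimeUnit.MILLISECONDS);
            // Worker thread: all other methods.
            for (int i = 0; i < records; i++) {
                batch.write("row-" + i);
            }
        } finally {
            hb.shutdown();
            try {
                hb.awaitTermination(1, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return batch.writes.get();
    }

    public static void main(String[] args) {
        System.out.println(writeWithHeartbeats(100)); // 100
    }
}
```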



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14412) Add a timezone-aware timestamp

2016-09-29 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533471#comment-15533471
 ] 

Jason Dere commented on HIVE-14412:
---

bq. I thought about this. One difficulty is we need to serialize TimestampTZ 
into BytesWritable in BinarySortableSerDe. The timezone needs to be serialized 
into the BytesWritable. I'm not sure how to keep it from being used for 
comparison.

I'm not sure what to do here - I feel like proper comparison is an important 
detail, and users might be alarmed if the data they collect in various time 
zones cannot be compared to each other. I wonder if it would be worth creating 
a new BinarySortableSerDe that allows the value to specify how many bytes of 
the value need to be compared. Something like that might allow the TimestampTZ 
to specify that only the UTC time portion be used for comparison.

If anything, I think the most important detail for a new Timestamp type in Hive 
would be to make sure that it actually captures the "seconds from UTC" value 
and that all of the various SerDes/UDFs/conversions actually pay attention to 
this detail. The formatting/Timezone could even be done as an operation on the 
UTC time (formatting UDF, session-level timezone, or just use local timezone).
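The suggestion above - order on the UTC instant, treat the zone as display metadata - can be sketched as follows. `TimestampTZ` here is an illustration, not Hive's eventual class:

```java
import java.time.Instant;
import java.time.ZoneId;

// Sketch: the "seconds from UTC" value drives comparison and equality; the
// zone is kept only for formatting.
public class TimestampTZSketch {
    static final class TimestampTZ implements Comparable<TimestampTZ> {
        final Instant utc;    // drives ordering
        final ZoneId zone;    // display/formatting metadata only

        TimestampTZ(Instant utc, ZoneId zone) {
            this.utc = utc;
            this.zone = zone;
        }
        @Override public int compareTo(TimestampTZ other) {
            return utc.compareTo(other.utc);   // zone deliberately ignored
        }
        @Override public boolean equals(Object o) {
            return o instanceof TimestampTZ && utc.equals(((TimestampTZ) o).utc);
        }
        @Override public int hashCode() { return utc.hashCode(); }
    }

    public static int compareAcrossZones() {
        Instant t = Instant.parse("2016-09-29T12:00:00Z");
        TimestampTZ la = new TimestampTZ(t, ZoneId.of("America/Los_Angeles"));
        TimestampTZ tokyo = new TimestampTZ(t, ZoneId.of("Asia/Tokyo"));
        return la.compareTo(tokyo);
    }

    public static void main(String[] args) {
        // Same instant recorded in different zones compares as equal.
        System.out.println(compareAcrossZones()); // 0
    }
}
```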


> Add a timezone-aware timestamp
> --
>
> Key: HIVE-14412
> URL: https://issues.apache.org/jira/browse/HIVE-14412
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-14412.1.patch, HIVE-14412.2.patch, 
> HIVE-14412.3.patch, HIVE-14412.4.patch, HIVE-14412.5.patch, 
> HIVE-14412.6.patch, HIVE-14412.7.patch, HIVE-14412.8.patch
>
>
> Java's Timestamp stores the time elapsed since the epoch. While that is by 
> itself unambiguous, ambiguity arises when we parse a string into a timestamp 
> or convert a timestamp to a string, causing problems like HIVE-14305.
> To solve the issue, I think we should make timestamp aware of the timezone.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14852) Change qtest logging to not redirect all logs to console

2016-09-29 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14852:
--
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review.

> Change qtest logging to not redirect all logs to console
> 
>
> Key: HIVE-14852
> URL: https://issues.apache.org/jira/browse/HIVE-14852
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 2.2.0
>
> Attachments: HIVE-14852.01.patch, HIVE-14852.02.patch
>
>
> A change was made recently to redirect all logs to the console, to make IDE 
> debugging of regular tests easier. That unfortunately makes qtest debugging 
> tougher, since there's a lot of noise along with the diffs in the output 
> file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14100) current_user() returns invalid information

2016-09-29 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533457#comment-15533457
 ] 

Mohit Sabharwal commented on HIVE-14100:


Thanks, [~pvary], LGTM, +1.

Could you fix the jira title to say you're adding a new UDF called 
logged_in_user()? Currently it appears you are fixing current_user().

> current_user() returns invalid information
> --
>
> Key: HIVE-14100
> URL: https://issues.apache.org/jira/browse/HIVE-14100
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Beeline
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-14100.2.patch, HIVE-14100.2.patch, 
> HIVE-14100.2.patch, HIVE-14100.patch
>
>
> Using HadoopDefaultAuthenticator, current_user() returns the username of the 
> unix user running HiveServer2.
> Using SessionStateUserAuthenticator, current_user() returns the username 
> provided when the connection started.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14830) Move a majority of the MiniLlapCliDriver tests to use an inline AM

2016-09-29 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533456#comment-15533456
 ] 

Siddharth Seth commented on HIVE-14830:
---

I don't see why TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning] fails 
with the changes in this patch. I'll try running it locally with the same batch 
that is run in jenkins to see what is going on.

> Move a majority of the MiniLlapCliDriver tests to use an inline AM
> --
>
> Key: HIVE-14830
> URL: https://issues.apache.org/jira/browse/HIVE-14830
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14830.01.patch, HIVE-14830.01.patch, 
> HIVE-14830.02.patch, HIVE-14830.02_OnHive14854.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14412) Add a timezone-aware timestamp

2016-09-29 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533452#comment-15533452
 ] 

Alan Gates commented on HIVE-14412:
---

Agree the TZ <-> non TZ conversions look like they match up with the spec.

Using the GMT+/-HH:MM format for time zone matches the spec, but it does create 
a weird situation for many users where half the year their times will be off by 
an hour.  I think the big question is whether the underlying infrastructure can 
handle it, as we don't want Hive in the business of understanding when DST 
starts and stops all over the world.
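The underlying JDK infrastructure can in fact handle this: `java.time`, backed by the tzdata database, already tracks DST transitions for region-based zone ids, while a fixed GMT offset stays constant year-round. A small demonstration:

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

// Fixed GMT+/-HH:MM offsets never change, but a region zone id like
// America/New_York shifts between -05:00 (standard) and -04:00 (daylight).
// The JDK resolves the transition dates, so Hive would not need to model them.
public class DstOffsetDemo {
    public static String offsetOn(String date) {
        return LocalDateTime.parse(date + "T12:00:00")
                .atZone(ZoneId.of("America/New_York"))
                .getOffset()
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(offsetOn("2016-01-15")); // -05:00 (standard time)
        System.out.println(offsetOn("2016-07-15")); // -04:00 (daylight time)
    }
}
```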


> Add a timezone-aware timestamp
> --
>
> Key: HIVE-14412
> URL: https://issues.apache.org/jira/browse/HIVE-14412
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-14412.1.patch, HIVE-14412.2.patch, 
> HIVE-14412.3.patch, HIVE-14412.4.patch, HIVE-14412.5.patch, 
> HIVE-14412.6.patch, HIVE-14412.7.patch, HIVE-14412.8.patch
>
>
> Java's Timestamp stores the time elapsed since the epoch. While that is by 
> itself unambiguous, ambiguity arises when we parse a string into a timestamp 
> or convert a timestamp to a string, causing problems like HIVE-14305.
> To solve the issue, I think we should make timestamp aware of the timezone.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14784) Operation logs are disabled automatically if the parent directory does not exist.

2016-09-29 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533435#comment-15533435
 ] 

Yongzhi Chen commented on HIVE-14784:
-

Are the failures related to the change?

> Operation logs are disabled automatically if the parent directory does not 
> exist.
> -
>
> Key: HIVE-14784
> URL: https://issues.apache.org/jira/browse/HIVE-14784
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-14784.1.patch, HIVE-14784.patch
>
>
> Operation logging is disabled automatically for a query if, for some reason, 
> the parent directory (named after the Hive session id) that is created when 
> the session is established gets deleted. For example, if the operation log 
> dir is under /tmp, the OS may purge it automatically at a configured interval.
> Running a query from that session leads to
> {code}
> 2016-09-15 15:09:16,723 WARN org.apache.hive.service.cli.operation.Operation: 
> Unable to create operation log file: 
> /tmp/hive/operation_logs/b8809985-6b38-47ec-a49b-6158a67cd9fc/d35414f7-2418-426c-8489-c6f643ca4599
> java.io.IOException: No such file or directory
>   at java.io.UnixFileSystem.createFileExclusively(Native Method)
>   at java.io.File.createNewFile(File.java:1012)
>   at 
> org.apache.hive.service.cli.operation.Operation.createOperationLog(Operation.java:195)
>   at 
> org.apache.hive.service.cli.operation.Operation.beforeRun(Operation.java:237)
>   at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:255)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:398)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:385)
>   at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:490)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> This later leads to errors like the following (more prominent when using HUE, 
> since HUE does not close Hive sessions and may attempt to retrieve the 
> operation logs days after they were created).
> {code}
> WARN org.apache.hive.service.cli.thrift.ThriftCLIService: Error fetching 
> results: 
> org.apache.hive.service.cli.HiveSQLException: Couldn't find log associated 
> with operation handle: OperationHandle [opType=EXECUTE_STATEMENT, 
> getHandleIdentifier()=d35414f7-2418-426c-8489-c6f643ca4599]
>   at 
> org.apache.hive.service.cli.operation.OperationManager.getOperationLogRowSet(OperationManager.java:259)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:701)
>   at 
> org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:451)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:676)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745) 
> {code}
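One possible defensive fix for the failure above (an illustration of the direction, not necessarily the committed patch) is to recreate the session's log directory if an external cleaner has removed it, instead of disabling operation logging for the rest of the session:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Sketch: recreate the per-session directory before creating the per-operation
// log file, so a /tmp purge does not permanently disable operation logging.
public class OperationLogDirFix {
    public static File createOperationLog(File sessionLogDir, String operationId)
            throws IOException {
        // Recreate the session directory if tmp cleanup removed it.
        if (!sessionLogDir.isDirectory() && !sessionLogDir.mkdirs()) {
            throw new IOException("Unable to recreate " + sessionLogDir);
        }
        File logFile = new File(sessionLogDir, operationId);
        if (!logFile.createNewFile() && !logFile.isFile()) {
            throw new IOException("Unable to create " + logFile);
        }
        return logFile;
    }

    public static boolean demo() {
        try {
            File base = Files.createTempDirectory("oplogs").toFile();
            // Simulate the purge: the per-session directory does not exist.
            File sessionDir = new File(base, "b8809985-session");
            return createOperationLog(sessionDir, "op-1").isFile();
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // true
    }
}
```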



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-14778) document threading model of Streaming API

2016-09-29 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533378#comment-15533378
 ] 

Alan Gates commented on HIVE-14778:
---

+1, makes sense.

> document threading model of Streaming API
> -
>
> Key: HIVE-14778
> URL: https://issues.apache.org/jira/browse/HIVE-14778
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14778.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The model is not obvious and needs to be documented properly.
> A StreamingConnection internally maintains 2 MetaStoreClient objects (each 
> has 1 Thrift client for actual RPC). Let's call them "primary" and 
> "heartbeat". Each TransactionBatch created from a given StreamingConnection 
> gets a reference to both of these MetaStoreClients. 
> So the model is that there is at most 1 outstanding (not closed) 
> TransactionBatch for any given StreamingConnection, and for any given 
> TransactionBatch there can be at most 2 threads accessing it concurrently: 1 
> thread calling TransactionBatch.heartbeat() (and nothing else) and the other 
> calling all other methods.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14806) Support UDTF in CBO (AST return path)

2016-09-29 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533347#comment-15533347
 ] 

Ashutosh Chauhan commented on HIVE-14806:
-

Design looks good. Some code level comments on RB

> Support UDTF in CBO (AST return path)
> -
>
> Key: HIVE-14806
> URL: https://issues.apache.org/jira/browse/HIVE-14806
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14806.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata

2016-09-29 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533280#comment-15533280
 ] 

Aihua Xu commented on HIVE-14146:
-

+1. The patch looks good to me.

> Column comments with "\n" character "corrupts" table metadata
> -
>
> Key: HIVE-14146
> URL: https://issues.apache.org/jira/browse/HIVE-14146
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14146.10.patch, HIVE-14146.11.patch, 
> HIVE-14146.2.patch, HIVE-14146.3.patch, HIVE-14146.4.patch, 
> HIVE-14146.5.patch, HIVE-14146.6.patch, HIVE-14146.7.patch, 
> HIVE-14146.8.patch, HIVE-14146.9.patch, HIVE-14146.patch, changes
>
>
> Create a table with the following(noting the \n in the COMMENT):
> {noformat}
> CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an 
> individual');
> {noformat}
> Describe shows that now the metadata is messed up:
> {noformat}
> beeline> describe commtest;
> +-------------------+------------+-----------------------+--+
> |     col_name      | data_type  |        comment        |
> +-------------------+------------+-----------------------+--+
> | first_nm          | string     | Indicates First name  |
> | of an individual  | NULL       | NULL                  |
> +-------------------+------------+-----------------------+--+
> {noformat}
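One way to keep multi-line comments from splitting rows in tabular output is to escape control characters before rendering. This mirrors the direction of the fix as an illustration, not the committed Hive code:

```java
// Sketch: escape control characters in a column comment so it renders as a
// single row in DESCRIBE output instead of spilling onto the next line.
public class CommentEscaper {
    public static String escape(String comment) {
        if (comment == null) {
            return null;
        }
        return comment.replace("\\", "\\\\")  // escape the escape char first
                      .replace("\n", "\\n")
                      .replace("\r", "\\r")
                      .replace("\t", "\\t");
    }

    public static void main(String[] args) {
        // Renders as one row: Indicates First name\nof an individual
        System.out.println(escape("Indicates First name\nof an individual"));
    }
}
```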



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9423) HiveServer2: Provide the user with different error messages depending on the Thrift client exception code

2016-09-29 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-9423:
-
Attachment: HIVE-9423.6-branch-2.1.patch

Updated branch 2.1 patch - pom modification for tests, and an import

> HiveServer2: Provide the user with different error messages depending on the 
> Thrift client exception code
> -
>
> Key: HIVE-9423
> URL: https://issues.apache.org/jira/browse/HIVE-9423
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 0.15.0
>Reporter: Vaibhav Gumashta
>Assignee: Peter Vary
> Attachments: HIVE-9423.2.patch, HIVE-9423.3.patch, HIVE-9423.4.patch, 
> HIVE-9423.5-branch-2.1.patch, HIVE-9423.5.patch, 
> HIVE-9423.6-branch-2.1.patch, HIVE-9423.patch
>
>
> After verifying that the original problem is mostly solved by the Thrift 
> upgrade, I created a patch to provide better error message when possible
> Original description for reference:
> ---
> An example of where it is needed: it has been reported that when the number of 
> client connections is greater than {{hive.server2.thrift.max.worker.threads}}, 
> HiveServer2 stops accepting new connections and ends up having to be 
> restarted. This should be handled more gracefully by the server and the JDBC 
> driver, so that the end user becomes aware of the problem and can take 
> appropriate steps (either close existing connections, bump up the config 
> value, or use multiple server instances with dynamic service discovery 
> enabled). Similarly, we should also review the behavior of the background 
> thread pool to have a well-defined behavior when the pool gets exhausted. 
> Ideally, implementing some form of general admission control would be a better 
> solution, so that we do not accept new work unless sufficient resources are 
> available and degrade gracefully under overload.
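The "graceful degradation" idea can be sketched with a bounded pool whose rejection handler surfaces a clear error to the caller instead of letting the server wedge. Names and numbers below are illustrative:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch: bounded worker pool + explicit rejection, a minimal form of
// admission control for an overloaded server.
public class BoundedWorkerPool {
    public static ThreadPoolExecutor newPool(int maxWorkers, int queueDepth) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxWorkers, maxWorkers, 60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueDepth));
        pool.setRejectedExecutionHandler((task, p) -> {
            // In a real server this would become a Thrift-level error the JDBC
            // driver can show to the user, rather than a silent hang.
            throw new RejectedExecutionException(
                    "Server busy; close idle connections or raise the worker limit");
        });
        return pool;
    }

    public static boolean overloadedRejects() {
        ThreadPoolExecutor pool = newPool(1, 1);
        CountDownLatch release = new CountDownLatch(1);
        boolean rejected = false;
        try {
            pool.execute(() -> {                 // occupies the single worker
                try { release.await(); } catch (InterruptedException ignored) { }
            });
            pool.execute(() -> { });             // fills the queue
            try {
                pool.execute(() -> { });         // no room left: rejected
            } catch (RejectedExecutionException expected) {
                rejected = true;
            }
        } finally {
            release.countDown();
            pool.shutdown();
            try { pool.awaitTermination(1, TimeUnit.SECONDS); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println(overloadedRejects()); // true
    }
}
```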



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12158) Add methods to HCatClient for partition synchronization

2016-09-29 Thread David Maughan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533205#comment-15533205
 ] 

David Maughan commented on HIVE-12158:
--

Hi [~mithun], [~sushanth],

Apologies for the long delay. I've addressed the problem and attached a new 
patch.

> Add methods to HCatClient for partition synchronization
> ---
>
> Key: HIVE-12158
> URL: https://issues.apache.org/jira/browse/HIVE-12158
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Affects Versions: 2.0.0
>Reporter: David Maughan
>Assignee: David Maughan
>Priority: Minor
>  Labels: hcatalog
> Attachments: HIVE-12158.1.patch, HIVE-12158.2.patch
>
>
> We have a use case where we have a list of partitions that are created as a 
> result of a batch job (new or updated) outside of Hive and would like to 
> synchronize them with the Hive MetaStore. We would like to use the HCatalog 
> {{HCatClient}} but it currently does not seem to support this. However it is 
> possible with the {{HiveMetaStoreClient}} directly. I am proposing to add the 
> following method to {{HCatClient}} and {{HCatClientHMSImpl}}:
> A method for altering partitions. The implementation would delegate to 
> {{HiveMetaStoreClient#alter_partitions}}. I've used "update" instead of 
> "alter" in the name so it's consistent with the 
> {{HCatClient#updateTableSchema}} method.
> {code}
> public void updatePartitions(List<HCatPartition> partitions) throws 
> HCatException
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12158) Add methods to HCatClient for partition synchronization

2016-09-29 Thread David Maughan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Maughan updated HIVE-12158:
-
Attachment: HIVE-12158.2.patch

> Add methods to HCatClient for partition synchronization
> ---
>
> Key: HIVE-12158
> URL: https://issues.apache.org/jira/browse/HIVE-12158
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Affects Versions: 2.0.0
>Reporter: David Maughan
>Assignee: David Maughan
>Priority: Minor
>  Labels: hcatalog
> Attachments: HIVE-12158.1.patch, HIVE-12158.2.patch
>
>
> We have a use case where we have a list of partitions that are created as a 
> result of a batch job (new or updated) outside of Hive and would like to 
> synchronize them with the Hive MetaStore. We would like to use the HCatalog 
> {{HCatClient}} but it currently does not seem to support this. However it is 
> possible with the {{HiveMetaStoreClient}} directly. I am proposing to add the 
> following method to {{HCatClient}} and {{HCatClientHMSImpl}}:
> A method for altering partitions. The implementation would delegate to 
> {{HiveMetaStoreClient#alter_partitions}}. I've used "update" instead of 
> "alter" in the name so it's consistent with the 
> {{HCatClient#updateTableSchema}} method.
> {code}
> public void updatePartitions(List<HCatPartition> partitions) throws 
> HCatException
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9423) HiveServer2: Provide the user with different error messages depending on the Thrift client exception code

2016-09-29 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-9423:
-
Attachment: HIVE-9423.5-branch-2.1.patch

Patch for branch 2.1

> HiveServer2: Provide the user with different error messages depending on the 
> Thrift client exception code
> -
>
> Key: HIVE-9423
> URL: https://issues.apache.org/jira/browse/HIVE-9423
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 0.15.0
>Reporter: Vaibhav Gumashta
>Assignee: Peter Vary
> Attachments: HIVE-9423.2.patch, HIVE-9423.3.patch, HIVE-9423.4.patch, 
> HIVE-9423.5-branch-2.1.patch, HIVE-9423.5.patch, HIVE-9423.patch
>
>
> After verifying that the original problem is mostly solved by the Thrift 
> upgrade, I created a patch to provide better error message when possible
> Original description for reference:
> ---
> An example of where it is needed: it has been reported that when the number of 
> client connections is greater than {{hive.server2.thrift.max.worker.threads}}, 
> HiveServer2 stops accepting new connections and ends up having to be 
> restarted. This should be handled more gracefully by the server and the JDBC 
> driver, so that the end user becomes aware of the problem and can take 
> appropriate steps (either close existing connections, bump up the config 
> value, or use multiple server instances with dynamic service discovery 
> enabled). Similarly, we should also review the behavior of the background 
> thread pool to have a well-defined behavior when the pool gets exhausted. 
> Ideally, implementing some form of general admission control would be a better 
> solution, so that we do not accept new work unless sufficient resources are 
> available and degrade gracefully under overload.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14536) Unit test code cleanup

2016-09-29 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14536:
--
Status: Open  (was: Patch Available)

> Unit test code cleanup
> --
>
> Key: HIVE-14536
> URL: https://issues.apache.org/jira/browse/HIVE-14536
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14536.5.patch, HIVE-14536.6.patch, 
> HIVE-14536.7.patch, HIVE-14536.8.patch, HIVE-14536.9.patch, HIVE-14536.patch
>
>
> Clean up the itest infrastructure to create readable, easy-to-understand 
> code



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-14536) Unit test code cleanup

2016-09-29 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary resolved HIVE-14536.
---
Resolution: Later

> Unit test code cleanup
> --
>
> Key: HIVE-14536
> URL: https://issues.apache.org/jira/browse/HIVE-14536
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14536.5.patch, HIVE-14536.6.patch, 
> HIVE-14536.7.patch, HIVE-14536.8.patch, HIVE-14536.9.patch, HIVE-14536.patch
>
>
> Clean up the itest infrastructure to create readable, easy-to-understand 
> code



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14536) Unit test code cleanup

2016-09-29 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533121#comment-15533121
 ] 

Peter Vary commented on HIVE-14536:
---

Abandoning this for now since, as discussed with [~kgyrtkirk], the main thrust 
of the test framework cleanup is going in another direction.

> Unit test code cleanup
> --
>
> Key: HIVE-14536
> URL: https://issues.apache.org/jira/browse/HIVE-14536
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14536.5.patch, HIVE-14536.6.patch, 
> HIVE-14536.7.patch, HIVE-14536.8.patch, HIVE-14536.9.patch, HIVE-14536.patch
>
>
> Clean up the itest infrastructure to create readable, easy-to-understand 
> code



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14768) Add a new UDTF ExplodeByNumber

2016-09-29 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533115#comment-15533115
 ] 

Ashutosh Chauhan commented on HIVE-14768:
-

Another thought: what if we extend the explode UDTF so that it accepts a long 
argument and then calls forward() as many times as that value? In conjunction 
with lateral view, that would give us the desired functionality.
The reason I am suggesting this is that this UDTF gives the impression that it 
can accept any number of expressions as arguments and will output them after 
evaluating them, while the implementation just forwards without even looking 
at the arguments. Lateral view explode matches the semantics better, since 
there explode accepts only one argument and can still match the row schema via 
the lateral view join.
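For illustration only, the proposed semantics (forward the same input row N times, as a UDTF's process()/forward() loop would inside Hive) can be sketched outside of Hive. The class and method names below are hypothetical and do not use Hive's actual GenericUDTF API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of "explode by number": forward the same row N times,
// mimicking what a UDTF's process()/forward() loop would do inside Hive.
public class ExplodeByNumberSketch {
    // Calls the downstream consumer once per requested copy of the row.
    static void explodeByNumber(long n, Object[] row, Consumer<Object[]> forward) {
        for (long i = 0; i < n; i++) {
            forward.accept(row);
        }
    }

    public static void main(String[] args) {
        List<Object[]> out = new ArrayList<>();
        // With a LATERAL VIEW, each input row would be duplicated like this:
        explodeByNumber(3, new Object[]{"a", 1}, out::add);
        System.out.println(out.size()); // prints 3
    }
}
```

Under a lateral view join, the row schema would then be matched exactly as it is for the existing single-argument explode.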

> Add a new UDTF ExplodeByNumber
> --
>
> Key: HIVE-14768
> URL: https://issues.apache.org/jira/browse/HIVE-14768
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14768.01.patch, HIVE-14768.02.patch
>
>
> For intersect all and except all implementation purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14100) current_user() returns invalid information

2016-09-29 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533064#comment-15533064
 ] 

Peter Vary commented on HIVE-14100:
---

That is correct [~mohitsabharwal]!

I know some specific distributions, with some specific configurations, use an 
Authenticator other than SessionStateUserAuthenticator :)

> current_user() returns invalid information
> --
>
> Key: HIVE-14100
> URL: https://issues.apache.org/jira/browse/HIVE-14100
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Beeline
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-14100.2.patch, HIVE-14100.2.patch, 
> HIVE-14100.2.patch, HIVE-14100.patch
>
>
> Using HadoopDefaultAuthenticator, current_user() returns the username of 
> the unix user running hiveserver2.
> Using SessionStateUserAuthenticator, current_user() returns the username 
> provided when the connection was started.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14857) select count(*) fails with tez over cassandra

2016-09-29 Thread jean carlo rivera ura (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jean carlo rivera ura updated HIVE-14857:
-
Affects Version/s: 1.2.1

> select count(*) fails with tez over cassandra
> -
>
> Key: HIVE-14857
> URL: https://issues.apache.org/jira/browse/HIVE-14857
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: jean carlo rivera ura
>
> Hello,
> We have a cluster with nodes having cassandra and hadoop (hortonworks 2.3.2) 
> and we have tez as our engine by default.
> I have a table in cassandra, and I use the driver hive-cassandra to do 
> selects over it. This is the table
> {code:sql}
> CREATE TABLE table1 ( campaign_id text, sid text, name text, ts timestamp, 
> PRIMARY KEY (campaign_id, sid) ) WITH CLUSTERING ORDER BY (sid ASC)
> {code}
> And I have only 3 partitions
> ||campaign_id ||   sid  ||  name  ||  ts||
> |45sqdqs| sqsd |  dea| NULL|
> |QSHJKA | sqsd |  dea| NULL|
> |45s-qs   | sqsd |  dea| NULL|
> When I run a "select count(*)" over the table using hive like this 
> (tez is our engine by default)
> {code} hive -e "select count(*) from table1;" {code}
> I got this error:
> {code}
> Status: Failed
> Vertex failed, vertexName=Map 1, 
> vertexId=vertex_1474275943985_0179_1_00, diagnostics=[Task failed, 
> taskId=task_1474275943985_0179_1_00_01, diagnostics=[TaskAttempt 0 
> failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: 
> org.apache.tez.dag.api.TezUncheckedException: Expected length: 12416 
> actual length: 9223372036854775711
>at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
>at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
>at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>at java.security.AccessController.doPrivileged(Native Method)
>at javax.security.auth.Subject.doAs(Subject.java:422)
>at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.tez.dag.api.TezUncheckedException: Expected length: 
> 12416 actual length: 9223372036854775711
>at 
> org.apache.hadoop.mapred.split.TezGroupedSplit.readFields(TezGroupedSplit.java:128)
>at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
>at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
>at 
> org.apache.tez.mapreduce.hadoop.MRInputHelpers.createOldFormatSplitFromUserPayload(MRInputHelpers.java:177)
>at 
> org.apache.tez.mapreduce.lib.MRInputUtils.getOldSplitDetailsFromEvent(MRInputUtils.java:136)
>at 
> org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:643)
>at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:621)
>at 
> org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:145)
>at 
> org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)
>at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:390)
>at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:128)
>at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
>... 14 more
> {code}
> So far as I understand, in readFields we are getting more data than we are 
> expecting. But considering the size of the table (only 3 records), I don't 
> think the data is the problem. 
> Another thing to add: if I do a "select *", it works perfectly fine with 
> tez. Using the mr engine, select count(*) and select * work fine as well.
> We are using hortonworks version 2.3.2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
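The "Expected length … actual length …" mismatch above is characteristic of a reader consuming a serialized split with a different field layout than the writer produced. A minimal, purely illustrative sketch (not Tez's actual TezGroupedSplit code) of how such a type skew yields an absurd length:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Illustrative only: if the writer emits an int length but the reader
// consumes a long, the reader folds the following bytes into the length
// and sees a huge, nonsensical value instead of the written 12416.
public class SplitSkewDemo {
    static long misreadLength() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(12416); // writer: 4-byte length
            out.writeInt(5);     // writer: next field
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
            return in.readLong(); // reader mistakenly consumes 8 bytes as the length
        } catch (IOException e) {
            throw new RuntimeException(e); // cannot happen with in-memory streams
        }
    }

    public static void main(String[] args) {
        // Prints a value far larger than 12416: (12416L << 32) | 5
        System.out.println(misreadLength());
    }
}
```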


[jira] [Commented] (HIVE-14857) select count(*) fails with tez over cassandra

2016-09-29 Thread jean carlo rivera ura (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532991#comment-15532991
 ] 

jean carlo rivera ura commented on HIVE-14857:
--

That's correct: for Tez the version is 0.7.0 and for Hive it is 1.2.1. 

> select count(*) fails with tez over cassandra
> -
>
> Key: HIVE-14857
> URL: https://issues.apache.org/jira/browse/HIVE-14857
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: jean carlo rivera ura
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-14735) Build Infra: Spark artifacts download takes a long time

2016-09-29 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532978#comment-15532978
 ] 

Sergio Peña commented on HIVE-14735:


We added a .md5sum file for the Spark tarball to detect whether it must be 
downloaded again in the next build. This saves time if you already have an 
exact copy of the spark assembly.
Where is the issue happening? On our Jenkins build?
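Since md5sum is not installed by default on OSX (md5 is the BSD equivalent), one portable option would be to compute the checksum in code instead of shelling out. The sketch below is only illustrative of that idea, not the actual Hive build logic:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Portable MD5 check: compares an expected digest (as read from a .md5sum
// file) against the digest of the downloaded bytes, avoiding any dependence
// on the md5sum (GNU) vs md5 (BSD/OSX) command-line tools.
public class Md5Check {
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always present in the JDK
        }
    }

    // Returns true when the tarball on disk already matches the recorded
    // digest, so the download can be skipped.
    static boolean upToDate(byte[] tarball, String expectedMd5) {
        return md5Hex(tarball).equalsIgnoreCase(expectedMd5.trim());
    }

    public static void main(String[] args) {
        byte[] data = "spark-tarball-bytes".getBytes(StandardCharsets.UTF_8);
        System.out.println(md5Hex(data).length()); // prints 32
    }
}
```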

> Build Infra: Spark artifacts download takes a long time
> ---
>
> Key: HIVE-14735
> URL: https://issues.apache.org/jira/browse/HIVE-14735
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Vaibhav Gumashta
>
> In particular this command:
> {{curl -Sso ./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz 
> http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.6.0-bin-hadoop2-without-hive.tgz}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14857) select count(*) fails with tez over cassandra

2016-09-29 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532983#comment-15532983
 ] 

Hitesh Shah edited comment on HIVE-14857 at 9/29/16 2:48 PM:
-

[~carlo_4002] I just moved this to Hive. Can you please update the affects 
version field to indicate what version of Hive you were running? For Tez, I 
believe you had mentioned 0.7.0?


was (Author: hitesh):
[~carlo_4002] I just moved this to Hive. Can you please update the affects 
version field to indicate what versions of Hive and Tez you were running? 

> select count(*) fails with tez over cassandra
> -
>
> Key: HIVE-14857
> URL: https://issues.apache.org/jira/browse/HIVE-14857
> Project: Hive
>  Issue Type: Bug
>Reporter: jean carlo rivera ura
>

[jira] [Commented] (HIVE-14857) select count(*) fails with tez over cassandra

2016-09-29 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532983#comment-15532983
 ] 

Hitesh Shah commented on HIVE-14857:


[~carlo_4002] I just moved this to Hive. Can you please update the affects 
version field to indicate what versions of Hive and Tez you were running? 

> select count(*) fails with tez over cassandra
> -
>
> Key: HIVE-14857
> URL: https://issues.apache.org/jira/browse/HIVE-14857
> Project: Hive
>  Issue Type: Bug
>Reporter: jean carlo rivera ura
>

[jira] [Moved] (HIVE-14857) select count(*) fails with tez over cassandra

2016-09-29 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah moved TEZ-3451 to HIVE-14857:
-

Affects Version/s: (was: 0.7.0)
  Key: HIVE-14857  (was: TEZ-3451)
  Project: Hive  (was: Apache Tez)

> select count(*) fails with tez over cassandra
> -
>
> Key: HIVE-14857
> URL: https://issues.apache.org/jira/browse/HIVE-14857
> Project: Hive
>  Issue Type: Bug
>Reporter: jean carlo rivera ura
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-14856) create table with select from table limit is failing with NFE if limit exceed than allowed 32bit integer length

2016-09-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532967#comment-15532967
 ] 

Hive QA commented on HIVE-14856:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12830835/HIVE-14856.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1344/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1344/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1344/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-09-29 14:43:41.406
+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1344/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-09-29 14:43:41.408
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 291f3d5 HIVE-14849: Support google-compute-engine provider on 
Hive ptest framework (Sergio Pena, reviewed by Prasanth Jayachandran)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 291f3d5 HIVE-14849: Support google-compute-engine provider on 
Hive ptest framework (Sergio Pena, reviewed by Prasanth Jayachandran)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-09-29 14:43:42.478
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java: No 
such file or directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java: No such 
file or directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java: No 
such file or directory
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12830835 - PreCommit-HIVE-Build

> create table with select from table limit is failing with NFE if limit exceed 
> than allowed 32bit integer length
> ---
>
> Key: HIVE-14856
> URL: https://issues.apache.org/jira/browse/HIVE-14856
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
> Environment: centos 6.6
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
> Fix For: 1.2.1
>
> Attachments: HIVE-14856.patch
>
>
> A query with LIMIT fails with a NumberFormatException if the limit exceeds the 
> 32-bit integer range.
> create table sample1 as select * from sample limit 2248321440;
> FAILED: NumberFormatException For input string: "2248321440"
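The failure above is consistent with the LIMIT value being parsed as a 32-bit int. A minimal standalone sketch (not Hive's actual parser code) showing why 2248321440 overflows:

```java
public class LimitOverflowSketch {
    public static void main(String[] args) {
        String limit = "2248321440";  // exceeds Integer.MAX_VALUE (2147483647)

        // Parsing into a 32-bit int fails, mirroring the reported error.
        try {
            int n = Integer.parseInt(limit);
            System.out.println("parsed as int: " + n);
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException For input string: \"" + limit + "\"");
        }

        // The same literal fits comfortably in a 64-bit long.
        System.out.println(Long.parseLong(limit));  // prints 2248321440
    }
}
```

A fix along these lines would parse the limit as a long (or reject values above the supported maximum with a clear error) rather than calling Integer.parseInt directly.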



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14855) test patch

2016-09-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532939#comment-15532939
 ] 

Hive QA commented on HIVE-14855:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12830830/HIVE-14855.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10660 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.testMetaDataCounts
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testMergeProto
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMergeProto
org.apache.hadoop.hive.ql.parse.TestIUD.testMergeNegative2
org.apache.hadoop.hive.ql.parse.TestIUD.testMergeNegative3
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1343/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1343/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1343/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12830830 - PreCommit-HIVE-Build

> test patch
> --
>
> Key: HIVE-14855
> URL: https://issues.apache.org/jira/browse/HIVE-14855
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14855.patch
>
>






[jira] [Commented] (HIVE-14830) Move a majority of the MiniLlapCliDriver tests to use an inline AM

2016-09-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532673#comment-15532673
 ] 

Hive QA commented on HIVE-14830:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12830823/HIVE-14830.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10647 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.testMetaDataCounts
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1342/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1342/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1342/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12830823 - PreCommit-HIVE-Build

> Move a majority of the MiniLlapCliDriver tests to use an inline AM
> --
>
> Key: HIVE-14830
> URL: https://issues.apache.org/jira/browse/HIVE-14830
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14830.01.patch, HIVE-14830.01.patch, 
> HIVE-14830.02.patch, HIVE-14830.02_OnHive14854.txt
>
>






[jira] [Commented] (HIVE-14830) Move a majority of the MiniLlapCliDriver tests to use an inline AM

2016-09-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532479#comment-15532479
 ] 

Hive QA commented on HIVE-14830:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12830823/HIVE-14830.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10647 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.testMetaDataCounts
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1341/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1341/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1341/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12830823 - PreCommit-HIVE-Build

> Move a majority of the MiniLlapCliDriver tests to use an inline AM
> --
>
> Key: HIVE-14830
> URL: https://issues.apache.org/jira/browse/HIVE-14830
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14830.01.patch, HIVE-14830.01.patch, 
> HIVE-14830.02.patch, HIVE-14830.02_OnHive14854.txt
>
>






[jira] [Commented] (HIVE-14854) Add a core cluster type to QTestUtil

2016-09-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532226#comment-15532226
 ] 

Hive QA commented on HIVE-14854:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12830815/HIVE-14854.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10645 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.testMetaDataCounts
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1340/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1340/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1340/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12830815 - PreCommit-HIVE-Build

> Add a core cluster type to QTestUtil
> 
>
> Key: HIVE-14854
> URL: https://issues.apache.org/jira/browse/HIVE-14854
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14854.01.patch, HIVE-14854.02.patch
>
>
> Follow-up to HIVE-14824. There's tez, tez_local, llap, and llap_local - all of 
> which share a single core type; similarly spark and sparkOnYarn, and none/mr. 
> Introducing a core cluster type would make a number of conditional checks simpler.





[jira] [Commented] (HIVE-14358) Add metrics for number of queries executed for each execution engine (mr, spark, tez)

2016-09-29 Thread Barna Zsombor Klara (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532213#comment-15532213
 ] 

Barna Zsombor Klara commented on HIVE-14358:


The metrics show up on HiveServer2's debug web UI, so we shouldn't mix them up 
with the HWI. Maybe we should have a child page under [HiveServer2 
Overview|https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Overview] 
for this?

> Add metrics for number of queries executed for each execution engine (mr, 
> spark, tez)
> -
>
> Key: HIVE-14358
> URL: https://issues.apache.org/jira/browse/HIVE-14358
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Lenni Kuff
>Assignee: Barna Zsombor Klara
> Fix For: 2.2.0
>
> Attachments: HIVE-14358.patch
>
>
> HiveServer2 currently has a metric for the total number of queries ran since 
> last restart, but it would be useful to also have metrics for number of 
> queries ran for each execution engine. This would improve supportability by 
> allowing users to get a high-level understanding of what workloads had been 
> running on the server. 





[jira] [Updated] (HIVE-14775) Investigate IOException usage in Metrics APIs

2016-09-29 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-14775:
---
Fix Version/s: 2.2.0
 Release Note: Refactored the public metrics API. Calls capturing metrics 
should log warnings instead of throwing IOExceptions.
   Status: Patch Available  (was: Open)

> Investigate IOException usage in Metrics APIs
> -
>
> Key: HIVE-14775
> URL: https://issues.apache.org/jira/browse/HIVE-14775
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2, Metastore
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Fix For: 2.2.0
>
> Attachments: HIVE-14775.patch
>
>
> A large number of metrics APIs seem to declare that they throw IOExceptions 
> needlessly (incrementCounter, decrementCounter, etc.).
> This is not only misleading but also fills the code with unnecessary catch 
> blocks that are never reached.
> We should investigate whether these exceptions are thrown at all, and remove 
> them if they are truly unused.





[jira] [Commented] (HIVE-14852) Change qtest logging to not redirect all logs to console

2016-09-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532080#comment-15532080
 ] 

Hive QA commented on HIVE-14852:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12830816/HIVE-14852.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10645 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.testMetaDataCounts
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1339/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1339/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1339/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12830816 - PreCommit-HIVE-Build

> Change qtest logging to not redirect all logs to console
> 
>
> Key: HIVE-14852
> URL: https://issues.apache.org/jira/browse/HIVE-14852
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14852.01.patch, HIVE-14852.02.patch
>
>
> A change was made recently to redirect all logs to console, to make IDE 
> debugging of regular tests easier. That unfortunately makes qtest debugging 
> tougher - since there's a lot of noise along with the diffs in the output 
> file.





[jira] [Commented] (HIVE-14358) Add metrics for number of queries executed for each execution engine (mr, spark, tez)

2016-09-29 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532075#comment-15532075
 ] 

Lefty Leverenz commented on HIVE-14358:
---

Agreed, the wiki needs to document Hive metrics.  I'm not sure where they 
belong -- perhaps in a new wikidoc, or a section of the Hive Web UI doc -- what 
do you think, [~zsombor.klara]?

* [Hive Web Interface | 
https://cwiki.apache.org/confluence/display/Hive/HiveWebInterface]

> Add metrics for number of queries executed for each execution engine (mr, 
> spark, tez)
> -
>
> Key: HIVE-14358
> URL: https://issues.apache.org/jira/browse/HIVE-14358
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Lenni Kuff
>Assignee: Barna Zsombor Klara
> Fix For: 2.2.0
>
> Attachments: HIVE-14358.patch
>
>
> HiveServer2 currently has a metric for the total number of queries ran since 
> last restart, but it would be useful to also have metrics for number of 
> queries ran for each execution engine. This would improve supportability by 
> allowing users to get a high-level understanding of what workloads had been 
> running on the server. 





[jira] [Updated] (HIVE-14775) Investigate IOException usage in Metrics APIs

2016-09-29 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-14775:
---
Attachment: HIVE-14775.patch

> Investigate IOException usage in Metrics APIs
> -
>
> Key: HIVE-14775
> URL: https://issues.apache.org/jira/browse/HIVE-14775
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2, Metastore
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-14775.patch
>
>
> A large number of metrics APIs seem to declare that they throw IOExceptions 
> needlessly (incrementCounter, decrementCounter, etc.).
> This is not only misleading but also fills the code with unnecessary catch 
> blocks that are never reached.
> We should investigate whether these exceptions are thrown at all, and remove 
> them if they are truly unused.





[jira] [Commented] (HIVE-7224) Set incremental printing to true by default in Beeline

2016-09-29 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532054#comment-15532054
 ] 

Lefty Leverenz commented on HIVE-7224:
--

Done.  Welcome to the Hive wiki team, [~stakiar].

> Set incremental printing to true by default in Beeline
> --
>
> Key: HIVE-7224
> URL: https://issues.apache.org/jira/browse/HIVE-7224
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, Clients, JDBC
>Affects Versions: 0.13.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Sahil Takiar
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-7224.1.patch, HIVE-7224.2.patch, HIVE-7224.2.patch, 
> HIVE-7224.3.patch, HIVE-7224.4.patch, HIVE-7224.5.patch
>
>
> See HIVE-7221.
> By default beeline tries to buffer the entire output relation before printing 
> it on stdout. This can cause OOM when the output relation is large. However, 
> beeline has the option of incremental prints. We should keep that as the 
> default.





[jira] [Commented] (HIVE-11812) datediff sometimes returns incorrect results when called with dates

2016-09-29 Thread Chetna Chaudhari (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532050#comment-15532050
 ] 

Chetna Chaudhari commented on HIVE-11812:
-

[~jdere]: I am running the tests in the India time zone (IST).
Yes, the tests pass in the IST zone but fail in other time zones.
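One way such timezone-dependent results can arise (a sketch only; `naiveDays` is a hypothetical helper, not Hive's actual UDF code): if a date's local-midnight epoch milliseconds are divided by 86,400,000 to get a day number, any zone east of UTC truncates to the previous day, producing an off-by-one difference.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class NaiveDayCountSketch {
    // Hypothetical helper: days-since-epoch computed naively from local midnight.
    static long naiveDays(LocalDate d, ZoneId zone) {
        long millis = d.atStartOfDay(zone).toInstant().toEpochMilli();
        // Integer division; only matches the true epoch day for UTC midnights.
        return millis / 86_400_000L;
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.of(2015, 9, 15);
        // UTC midnight divides evenly, so this matches d.toEpochDay().
        long utcDays = naiveDays(d, ZoneOffset.UTC);
        // IST midnight is 18:30 UTC of the previous day, so this is one day short.
        long istDays = naiveDays(d, ZoneOffset.ofHoursMinutes(5, 30));
        System.out.println(utcDays - istDays);  // prints 1
    }
}
```

This illustrates why a date-to-day-count conversion that goes through local-time milliseconds can pass in one JVM default timezone and fail in another.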


> datediff sometimes returns incorrect results when called with dates
> ---
>
> Key: HIVE-11812
> URL: https://issues.apache.org/jira/browse/HIVE-11812
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.0.0
>Reporter: Nicholas Brenwald
>Assignee: Chetna Chaudhari
>Priority: Minor
> Attachments: HIVE-11812.1.patch
>
>
> DATEDIFF returns an incorrect result when one of the arguments is a date 
> type. 
> The Hive Language Manual provides the following signature for datediff:
> {code}
> int datediff(string enddate, string startdate)
> {code}
> I think datediff should either throw an error (if date types are not 
> supported), or return the correct result.
> To reproduce, create a table:
> {code}
> create table t (c1 string, c2 date);
> {code}
> Assuming you have a table x containing some data, populate table t with 1 row:
> {code}
> insert into t select '2015-09-15', '2015-09-15' from x limit 1;
> {code}
> Then run the following 12 test queries:
> {code}
> select datediff(c1, '2015-09-14') from t;
> select datediff(c1, '2015-09-15') from t;
> select datediff(c1, '2015-09-16') from t;
> select datediff('2015-09-14', c1) from t;
> select datediff('2015-09-15', c1) from t;
> select datediff('2015-09-16', c1) from t;
> select datediff(c2, '2015-09-14') from t;
> select datediff(c2, '2015-09-15') from t;
> select datediff(c2, '2015-09-16') from t;
> select datediff('2015-09-14', c2) from t;
> select datediff('2015-09-15', c2) from t;
> select datediff('2015-09-16', c2) from t;
> {code}
> The below table summarises the result. All results for column c1 (which is a 
> string) are correct, but when using c2 (which is a date), two of the results 
> are incorrect.
> || Test || Expected Result || Actual Result || Passed / Failed ||
> |datediff(c1, '2015-09-14')| 1 | 1| Passed |
> |datediff(c1, '2015-09-15')| 0 | 0| Passed |
> |datediff(c1, '2015-09-16') | -1 | -1| Passed |
> |datediff('2015-09-14', c1) | -1 | -1| Passed |
> |datediff('2015-09-15', c1)| 0 | 0| Passed |
> |datediff('2015-09-16', c1)| 1 | 1| Passed |
> |datediff(c2, '2015-09-14')| 1 | 0| {color:red}Failed{color} |
> |datediff(c2, '2015-09-15')| 0 | 0| Passed |
> |datediff(c2, '2015-09-16') | -1 | -1| Passed |
> |datediff('2015-09-14', c2) | -1 | 0 | {color:red}Failed{color} |
> |datediff('2015-09-15', c2)| 0 | 0| Passed |
> |datediff('2015-09-16', c2)| 1 | 1| Passed |





[jira] [Commented] (HIVE-5867) JDBC driver and beeline should support executing an initial SQL script

2016-09-29 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532041#comment-15532041
 ] 

Lefty Leverenz commented on HIVE-5867:
--

Here's how you can get wiki write access:

* [About This Wiki -- How to get permission to edit | 
https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki#AboutThisWiki-Howtogetpermissiontoedit]

You can put a draft in the wiki and I'll edit it if any changes are needed.

> JDBC driver and beeline should support executing an initial SQL script
> --
>
> Key: HIVE-5867
> URL: https://issues.apache.org/jira/browse/HIVE-5867
> Project: Hive
>  Issue Type: Improvement
>  Components: Clients, JDBC
>Reporter: Prasad Mujumdar
>Assignee: Jianguo Tian
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-5867.1.patch, HIVE-5867.2.patch, HIVE-5867.3 .patch
>
>
> HiveCLI supports the .hiverc script, which is executed at the start of the 
> session. This is helpful for things like registering UDFs, session-specific 
> configs, etc.
> This functionality is missing for Beeline and JDBC clients. It would be 
> useful for the JDBC driver to support an init script with SQL statements that is 
> automatically executed after connection. The script path can be specified via 
> the JDBC connection URL. For example: 
> {noformat}
> jdbc:hive2://localhost:1/default;initScript=/home/user1/scripts/init.sql
> {noformat}
> This can be exposed as a Beeline command-line option, e.g. "-i 
> /home/user1/scripts/init.sql".
> To help the transition from HiveCLI to Beeline, we can keep the default init 
> script as $HOME/.hiverc.





[jira] [Commented] (HIVE-5867) JDBC driver and beeline should support executing an initial SQL script

2016-09-29 Thread Jianguo Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532015#comment-15532015
 ] 

Jianguo Tian commented on HIVE-5867:


Thanks for the reminder. I don't have wiki edit privileges yet, though; could you 
please help me update that section of the wiki? I'll provide you with a draft. 
Or maybe I should request wiki write access myself. What do you think?

> JDBC driver and beeline should support executing an initial SQL script
> --
>
> Key: HIVE-5867
> URL: https://issues.apache.org/jira/browse/HIVE-5867
> Project: Hive
>  Issue Type: Improvement
>  Components: Clients, JDBC
>Reporter: Prasad Mujumdar
>Assignee: Jianguo Tian
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-5867.1.patch, HIVE-5867.2.patch, HIVE-5867.3 .patch
>
>
> HiveCLI supports the .hiverc script, which is executed at the start of the 
> session. This is helpful for things like registering UDFs, session-specific 
> configs, etc.
> This functionality is missing for Beeline and JDBC clients. It would be 
> useful for the JDBC driver to support an init script with SQL statements that is 
> automatically executed after connection. The script path can be specified via 
> the JDBC connection URL. For example: 
> {noformat}
> jdbc:hive2://localhost:1/default;initScript=/home/user1/scripts/init.sql
> {noformat}
> This can be exposed as a Beeline command-line option, e.g. "-i 
> /home/user1/scripts/init.sql".
> To help the transition from HiveCLI to Beeline, we can keep the default init 
> script as $HOME/.hiverc.





[jira] [Commented] (HIVE-14768) Add a new UDTF ExplodeByNumber

2016-09-29 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531993#comment-15531993
 ] 

Pengcheng Xiong commented on HIVE-14768:


[~ashutoshc], could you take another look? Thanks.

> Add a new UDTF ExplodeByNumber
> --
>
> Key: HIVE-14768
> URL: https://issues.apache.org/jira/browse/HIVE-14768
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14768.01.patch, HIVE-14768.02.patch
>
>
> For intersect all and except all implementation purpose.





[jira] [Updated] (HIVE-14768) Add a new UDTF ExplodeByNumber

2016-09-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14768:
---
Attachment: HIVE-14768.02.patch

> Add a new UDTF ExplodeByNumber
> --
>
> Key: HIVE-14768
> URL: https://issues.apache.org/jira/browse/HIVE-14768
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14768.01.patch, HIVE-14768.02.patch
>
>
> For intersect all and except all implementation purpose.





[jira] [Updated] (HIVE-14768) Add a new UDTF ExplodeByNumber

2016-09-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14768:
---
Status: Open  (was: Patch Available)

> Add a new UDTF ExplodeByNumber
> --
>
> Key: HIVE-14768
> URL: https://issues.apache.org/jira/browse/HIVE-14768
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14768.01.patch, HIVE-14768.02.patch
>
>
> For intersect all and except all implementation purpose.





[jira] [Updated] (HIVE-14768) Add a new UDTF ExplodeByNumber

2016-09-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14768:
---
Status: Patch Available  (was: Open)

> Add a new UDTF ExplodeByNumber
> --
>
> Key: HIVE-14768
> URL: https://issues.apache.org/jira/browse/HIVE-14768
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14768.01.patch, HIVE-14768.02.patch
>
>
> For intersect all and except all implementation purpose.





[jira] [Updated] (HIVE-14806) Support UDTF in CBO (AST return path)

2016-09-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14806:
---
Attachment: (was: HIVE-14768.02.patch)

> Support UDTF in CBO (AST return path)
> -
>
> Key: HIVE-14806
> URL: https://issues.apache.org/jira/browse/HIVE-14806
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14806.01.patch
>
>






[jira] [Updated] (HIVE-14806) Support UDTF in CBO (AST return path)

2016-09-29 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14806:
---
Attachment: HIVE-14768.02.patch

> Support UDTF in CBO (AST return path)
> -
>
> Key: HIVE-14806
> URL: https://issues.apache.org/jira/browse/HIVE-14806
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14768.02.patch, HIVE-14806.01.patch
>
>






[jira] [Comment Edited] (HIVE-14772) NPE when MSCK REPAIR

2016-09-29 Thread Per Ullberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531953#comment-15531953
 ] 

Per Ullberg edited comment on HIVE-14772 at 9/29/16 6:36 AM:
-

[~alunarbeach] Not really, the NPE is already worked around with a boolean 
before that commit. Ugly!

But using a set should probably remove the NPE as well, so applying the change 
you linked to should do the trick, I guess.


was (Author: hiverunner@github):
[~alunarbeach] Not really, the NPE is already worked around with a boolean 
before that commit. Ugly!

> NPE when MSCK REPAIR
> 
>
> Key: HIVE-14772
> URL: https://issues.apache.org/jira/browse/HIVE-14772
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.0
> Environment: HiveRunner on OSX Yosemite
>Reporter: Per Ullberg
>
> HiveMetaStoreChecker throws NullPointerException when doing a MSCK REPAIR 
> TABLE.
> The bug is here:
> {code}
> ...
> 18  package org.apache.hadoop.hive.ql.metadata;
> ...
> 58  public class HiveMetaStoreChecker {
> ...
> 408if (!directoryFound) {
> 409 allDirs.put(path, null);
> 410}
> ...
> {code}
> allDirs is a ConcurrentHashMap, which does not allow either keys or values 
> to be null.
> I found the bug while trying to port https://github.com/klarna/HiveRunner to 
> Hive 2.1.0
> Implemented explicit test case that exposes the bug here: 
> https://github.com/klarna/HiveRunner/blob/hive-2.1.0-NPE-at-msck-repair/src/test/java/com/klarna/hiverunner/MSCKRepairNPE.java
> Reproduce by cloning branch 
> https://github.com/klarna/HiveRunner/tree/hive-2.1.0-NPE-at-msck-repair
> and run 
> {code}mvn -Dtest=MSCKRepairNPE clean test{code}
> (Does not work on Windows :( )
> Looks like this email thread talks about the same issue: 
> http://user.hive.apache.narkive.com/ETOpbKk5/msck-repair-table-and-hive-v2-1-0
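As a side note for readers of this archive: the null-value restriction that triggers the NPE is easy to demonstrate, and a concurrent Set (the workaround suggested in the comment above) sidesteps it. A minimal standalone sketch — the path string is illustrative only, not taken from the Hive code:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentNullDemo {
    public static void main(String[] args) {
        Map<String, String> allDirs = new ConcurrentHashMap<>();
        boolean npeThrown = false;
        try {
            // ConcurrentHashMap rejects null values (and null keys) by contract
            allDirs.put("/warehouse/t/part=1", null);
        } catch (NullPointerException e) {
            npeThrown = true;
        }
        System.out.println("NPE on null value: " + npeThrown);

        // Workaround sketch: when the value is always null, the map is really
        // a set of paths, and a concurrent Set has no null-value problem.
        Set<String> dirs = ConcurrentHashMap.newKeySet();
        dirs.add("/warehouse/t/part=1");
        System.out.println("Set size: " + dirs.size());
    }
}
```

Unlike plain HashMap, ConcurrentHashMap forbids nulls so that a missing key is unambiguous under concurrent access, which is why code ported from a HashMap can start throwing NPEs.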





[jira] [Commented] (HIVE-14772) NPE when MSCK REPAIR

2016-09-29 Thread Per Ullberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531953#comment-15531953
 ] 

Per Ullberg commented on HIVE-14772:


[~alunarbeach] Not really, the NPE is already worked around with a boolean 
before that commit. Ugly!

> NPE when MSCK REPAIR
> 
>
> Key: HIVE-14772
> URL: https://issues.apache.org/jira/browse/HIVE-14772
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.0
> Environment: HiveRunner on OSX Yosemite
>Reporter: Per Ullberg
>
> HiveMetaStoreChecker throws NullPointerException when doing a MSCK REPAIR 
> TABLE.
> The bug is here:
> {code}
> ...
> 18  package org.apache.hadoop.hive.ql.metadata;
> ...
> 58  public class HiveMetaStoreChecker {
> ...
> 408if (!directoryFound) {
> 409 allDirs.put(path, null);
> 410}
> ...
> {code}
> allDirs is a ConcurrentHashMap, which does not allow either its keys or its 
> values to be null.
> I found the bug while trying to port https://github.com/klarna/HiveRunner to 
> Hive 2.1.0
> Implemented explicit test case that exposes the bug here: 
> https://github.com/klarna/HiveRunner/blob/hive-2.1.0-NPE-at-msck-repair/src/test/java/com/klarna/hiverunner/MSCKRepairNPE.java
> Reproduce by cloning branch 
> https://github.com/klarna/HiveRunner/tree/hive-2.1.0-NPE-at-msck-repair
> and run 
> {code}mvn -Dtest=MSCKRepairNPE clean test{code}
> (Does not work on Windows :( )
> Looks like this email thread talks about the same issue: 
> http://user.hive.apache.narkive.com/ETOpbKk5/msck-repair-table-and-hive-v2-1-0





[jira] [Commented] (HIVE-14854) Add a core cluster type to QTestUtil

2016-09-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531912#comment-15531912
 ] 

Hive QA commented on HIVE-14854:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12830815/HIVE-14854.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10644 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.testMetaDataCounts
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1338/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1338/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1338/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12830815 - PreCommit-HIVE-Build

> Add a core cluster type to QTestUtil
> 
>
> Key: HIVE-14854
> URL: https://issues.apache.org/jira/browse/HIVE-14854
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14854.01.patch, HIVE-14854.02.patch
>
>
> Follow up to HIVE-14824. There's tez, tez_local, llap, llap_local - all of 
> which are of a single type; similarly spark, sparkOnYarn, and none, mr. 
> Introducing a core cluster type makes a bunch of conditional checks simpler.
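A minimal sketch of what such a grouping could look like: each fine-grained cluster type carries a coarse "core" type, so conditionals test one value instead of enumerating every variant. The enum and constant names here are hypothetical illustrations, not taken from the actual patch:

```java
public class CoreClusterDemo {
    // Coarse execution-engine families (hypothetical names)
    enum CoreClusterType { TEZ, SPARK, MR }

    // Fine-grained cluster types, each mapped to its core family
    enum ClusterType {
        TEZ(CoreClusterType.TEZ), TEZ_LOCAL(CoreClusterType.TEZ),
        LLAP(CoreClusterType.TEZ), LLAP_LOCAL(CoreClusterType.TEZ),
        SPARK(CoreClusterType.SPARK), SPARK_ON_YARN(CoreClusterType.SPARK),
        NONE(CoreClusterType.MR), MR(CoreClusterType.MR);

        final CoreClusterType core;
        ClusterType(CoreClusterType core) { this.core = core; }
    }

    public static void main(String[] args) {
        // One comparison replaces four: "is this any flavor of Tez/LLAP?"
        System.out.println("llap_local core is tez: "
                + (ClusterType.LLAP_LOCAL.core == CoreClusterType.TEZ));
    }
}
```

The design choice is the usual one for enum families: attaching the core type to each constant keeps the mapping in one place, instead of scattering `type == TEZ || type == TEZ_LOCAL || ...` checks through the test harness.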


