[jira] [Updated] (HIVE-13557) Make interval keyword optional while specifying DAY in interval arithmetic

2016-10-17 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-13557:

Attachment: HIVE-13557.1.patch

> Make interval keyword optional while specifying DAY in interval arithmetic
> --
>
> Key: HIVE-13557
> URL: https://issues.apache.org/jira/browse/HIVE-13557
> Project: Hive
>  Issue Type: Sub-task
>  Components: Types
>Reporter: Ashutosh Chauhan
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13557.1.patch, HIVE-13557.1.patch, 
> HIVE-13557.1.patch
>
>
> Currently we support expressions like:
> {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) - INTERVAL '30' DAY) AND DATE('2000-01-31')
> {code}
> We should support:
> {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) + (-30) DAY) AND DATE('2000-01-31')
> {code}
>   
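The two forms in the description are expected to yield the same lower bound. As a rough illustration only, here is a sketch in plain Java using java.time rather than Hive itself. The class and method names (IntervalDayDemo, minusInterval, plusSignedDays) are hypothetical and are not part of the patch.

```java
import java.time.LocalDate;

final class IntervalDayDemo {
    // Mirrors: DATE('2000-01-31') - INTERVAL '30' DAY
    static LocalDate minusInterval(LocalDate date, int days) {
        return date.minusDays(days);
    }

    // Mirrors: DATE('2000-01-31') + (-30) DAY
    static LocalDate plusSignedDays(LocalDate date, int signedDays) {
        return date.plusDays(signedDays);
    }

    public static void main(String[] args) {
        LocalDate end = LocalDate.of(2000, 1, 31);
        System.out.println(minusInterval(end, 30));   // 2000-01-01
        System.out.println(plusSignedDays(end, -30)); // 2000-01-01
    }
}
```

Both calls give the same date, which is the equivalence the new syntax relies on.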



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14913) Add new unit tests

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584522#comment-15584522
 ] 

Hive QA commented on HIVE-14913:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833860/HIVE-14913.5.patch

{color:green}SUCCESS:{color} +1 due to 9 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10592 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats] 
(batchId=46)
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] 
(batchId=89)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=131)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[alter_merge_orc] 
(batchId=119)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=155)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1610/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1610/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1610/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833860 - PreCommit-HIVE-Build

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, 
> HIVE-14913.3.patch, HIVE-14913.4.patch, HIVE-14913.5.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead





[jira] [Resolved] (HIVE-14989) FIELDS TERMINATED BY parsing broken when delimiter is more than 1 byte

2016-10-17 Thread Ruslan Dautkhanov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruslan Dautkhanov resolved HIVE-14989.
--
   Resolution: Duplicate
Fix Version/s: 0.14.1

> FIELDS TERMINATED BY parsing broken when delimiter is more than 1 byte
> --
>
> Key: HIVE-14989
> URL: https://issues.apache.org/jira/browse/HIVE-14989
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Parser, Reader
>Affects Versions: 0.13.0, 0.13.1
>Reporter: Ruslan Dautkhanov
> Fix For: 0.14.1
>
>
> FIELDS TERMINATED BY parsing is broken when the delimiter is more than 1 byte. 
> The delimiter, starting from its 2nd character, becomes part of the returned 
> data. It is not parsed properly.
> Test case:
> {noformat}
> CREATE external TABLE test_muldelim
> (  string1 STRING,
>    string2 STRING,
>    string3 STRING
> )
> ROW FORMAT
>   DELIMITED FIELDS TERMINATED BY '<>'
>             LINES TERMINATED BY '\n'
> STORED AS TEXTFILE
> location '/user/hive/test_muldelim'
> {noformat}
> Create a text file under /user/hive/test_muldelim with the following 2 lines:
> {noformat}
> data1<>data2<>data3
> aa<>bb<>cc
> {noformat}
> Now notice that the two-character delimiter wasn't parsed properly:
> {noformat}
> jdbc:hive2://host.domain.com:1> select * from ruslan_test.test_muldelim ;
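> +------------------------+------------------------+------------------------+--+
> | test_muldelim.string1  | test_muldelim.string2  | test_muldelim.string3  |
> +------------------------+------------------------+------------------------+--+
> | data1                  | >data2                 | >data3                 |
> | aa                     | >bb                    | >cc                    |
> +------------------------+------------------------+------------------------+--+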
> 2 rows selected (0.453 seconds)
> {noformat}
> The delimiter's second character ('>') became part of the columns to the 
> right (`string2` and `string3`).
> Table DDL:
> {noformat}
> 0: jdbc:hive2://host.domain.com:1> show create table dafault.test_muldelim ;
> +-----------------------------------------------------------------+--+
> | createtab_stmt                                                  |
> +-----------------------------------------------------------------+--+
> | CREATE EXTERNAL TABLE `default.test_muldelim`(                  |
> |   `string1` string,                                             |
> |   `string2` string,                                             |
> |   `string3` string)                                             |
> | ROW FORMAT DELIMITED                                            |
> |   FIELDS TERMINATED BY '<>'                                     |
> |   LINES TERMINATED BY '\n'                                      |
> | STORED AS INPUTFORMAT                                           |
> |   'org.apache.hadoop.mapred.TextInputFormat'                    |
> | OUTPUTFORMAT                                                    |
> |   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'  |
> | LOCATION                                                        |
> |   'hdfs://epsdatalake/user/hive/test_muldelim'                  |
> | TBLPROPERTIES (                                                 |
> |   'transient_lastDdlTime'='1476727100')                         |
> +-----------------------------------------------------------------+--+
> 15 rows selected (0.286 seconds)
> {noformat}
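The output above is what one would see if only the first character of the delimiter were treated as the field separator. The sketch below reproduces that symptom in plain Java, outside Hive. That only the first byte of '<>' is honored is an inference from the query result, not a confirmed description of Hive's parser, and the class and method names (DelimiterDemo, splitOnFirstChar, splitOnFullDelimiter) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class DelimiterDemo {
    // Assumed faulty behaviour: only the first character of the delimiter separates fields.
    static List<String> splitOnFirstChar(String line, String delimiter) {
        String first = String.valueOf(delimiter.charAt(0));
        return Arrays.asList(line.split(Pattern.quote(first), -1));
    }

    // Intended behaviour: the whole delimiter string separates fields.
    static List<String> splitOnFullDelimiter(String line, String delimiter) {
        return Arrays.asList(line.split(Pattern.quote(delimiter), -1));
    }

    public static void main(String[] args) {
        String line = "data1<>data2<>data3";
        System.out.println(splitOnFirstChar(line, "<>"));     // [data1, >data2, >data3]
        System.out.println(splitOnFullDelimiter(line, "<>")); // [data1, data2, data3]
    }
}
```

The first call leaves the stray '>' at the start of the second and third fields, matching the rows returned by the SELECT.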





[jira] [Commented] (HIVE-14989) FIELDS TERMINATED BY parsing broken when delimiter is more than 1 byte

2016-10-17 Thread Ruslan Dautkhanov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584443#comment-15584443
 ] 

Ruslan Dautkhanov commented on HIVE-14989:
--

Thank you.

> FIELDS TERMINATED BY parsing broken when delimiter is more than 1 byte
> --
>
> Key: HIVE-14989
> URL: https://issues.apache.org/jira/browse/HIVE-14989
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Parser, Reader
>Affects Versions: 0.13.0, 0.13.1
>Reporter: Ruslan Dautkhanov
>





[jira] [Commented] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1558#comment-1558
 ] 

Hive QA commented on HIVE-14993:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833848/HIVE-14993.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 849 failed/errored test(s), 10594 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[add_part_exist] 
(batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter1] (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter2] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter3] (batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter4] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter5] (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_char1] (batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_char2] (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_file_format] 
(batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge] (batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_2] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_2_orc] 
(batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_3] 
(batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_orc] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_stats] 
(batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table2_h23]
 (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table_h23]
 (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_change_col]
 (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_clusterby_sortby]
 (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_coltype] 
(batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_format_loc]
 (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_rename_partition] 
(batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_rename_partition_authorization]
 (batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_skewed_table] 
(batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_cascade] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_invalidate_column_stats]
 (batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_location] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_not_sorted] 
(batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_partition_drop]
 (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde2] 
(batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde] 
(batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_varchar1] 
(batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_varchar2] 
(batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_view_as_select] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_view_rename] 
(batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_table_null_partition]
 (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_filter] 
(batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_groupby2] 
(batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_groupby] 
(batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join_pkfk]
 (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_limit] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_select] 
(batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_table] 
(batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_union] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[archive_excludeHadoop20] 
(batchId=59)

[jira] [Commented] (HIVE-14989) FIELDS TERMINATED BY parsing broken when delimiter is more than 1 byte

2016-10-17 Thread Niklaus Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584420#comment-15584420
 ] 

Niklaus Xiao commented on HIVE-14989:
-

You should use {{MultiDelimitSerDe}} in this case.

> FIELDS TERMINATED BY parsing broken when delimiter is more than 1 byte
> --
>
> Key: HIVE-14989
> URL: https://issues.apache.org/jira/browse/HIVE-14989
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Parser, Reader
>Affects Versions: 0.13.0, 0.13.1
>Reporter: Ruslan Dautkhanov
>





[jira] [Updated] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14940:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master.

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.2.0
>
> Attachments: HIVE-14940.1.patch, HIVE-14940.1.patch, 
> HIVE-14940.2.patch, HIVE-14940.2.patch, HIVE-14940.3.patch, 
> HIVE-14940.3.patch, HIVE-14940.4.patch, HIVE-14940.4.patch
>
>
> HBase setup for the metastore in MiniTez takes around 3 minutes. The 
> actual runtime of the queries is typically much lower. To avoid the high 
> overhead we should switch back to the SQL metastore as the default and, if 
> required, we can have a dedicated set of tests that run against the HBase 
> metastore.





[jira] [Commented] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-17 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584377#comment-15584377
 ] 

Prasanth Jayachandran commented on HIVE-14940:
--

These tests have been failing consistently in master for a while now, since 
the ptest migration.

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14940.1.patch, HIVE-14940.1.patch, 
> HIVE-14940.2.patch, HIVE-14940.2.patch, HIVE-14940.3.patch, 
> HIVE-14940.3.patch, HIVE-14940.4.patch, HIVE-14940.4.patch
>
>





[jira] [Commented] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584369#comment-15584369
 ] 

Hive QA commented on HIVE-14940:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833845/HIVE-14940.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10592 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats] 
(batchId=46)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=155)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1608/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1608/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1608/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833845 - PreCommit-HIVE-Build

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14940.1.patch, HIVE-14940.1.patch, 
> HIVE-14940.2.patch, HIVE-14940.2.patch, HIVE-14940.3.patch, 
> HIVE-14940.3.patch, HIVE-14940.4.patch, HIVE-14940.4.patch
>
>





[jira] [Commented] (HIVE-14029) Update Spark version to 2.0.0

2016-10-17 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584325#comment-15584325
 ] 

Rui Li commented on HIVE-14029:
---

Hmm, even with a shim layer, it's difficult to support different Spark versions 
if backward compatibility is not maintained between minor releases of Spark.
I'm wondering if the Spark used by Hive can be considered some kind of 
embedded binaries that are used exclusively for HoS. On the Hive side, we just 
need to set spark.home to point to this Spark. Users' other Spark applications, 
e.g. SparkSQL, streaming, can still run against the current Spark they have in 
the cluster. Will this make the upgrade easier?
I think we also need to be more careful about upgrading Spark in the future, if 
the upgrade breaks compatibility. For such an upgrade, we need to first make 
sure there's no obvious regression in functionality and performance.

> Update Spark version to 2.0.0
> -
>
> Key: HIVE-14029
> URL: https://issues.apache.org/jira/browse/HIVE-14029
> Project: Hive
>  Issue Type: Bug
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>  Labels: Incompatible, TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14029.1.patch, HIVE-14029.2.patch, 
> HIVE-14029.3.patch, HIVE-14029.4.patch, HIVE-14029.5.patch, 
> HIVE-14029.6.patch, HIVE-14029.7.patch, HIVE-14029.8.patch, HIVE-14029.patch
>
>
> There are quite a few new optimizations in Spark 2.0.0. We need to bump 
> Spark to 2.0.0 to benefit from those performance improvements.
> To update the Spark version to 2.0.0, the following changes are required:
> * Spark API updates:
> ** SparkShuffler#call returns Iterator instead of Iterable
> ** SparkListener -> JavaSparkListener
> ** InputMetrics constructor doesn’t accept readMethod
> ** Methods remoteBlocksFetched and localBlocksFetched in ShuffleReadMetrics 
> return long instead of integer
> * Dependency upgrade:
> ** Jackson: 2.4.2 -> 2.6.5
> ** Netty version: 4.0.23.Final -> 4.0.29.Final
> ** Scala binary version: 2.10 -> 2.11
> ** Scala version: 2.10.4 -> 2.11.8





[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-17 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584307#comment-15584307
 ] 

Rui Li commented on HIVE-14797:
---

Thanks for the update [~roncenzhao]. I have one more question.
{{ObjectInspectorUtils.getBucketHashCode}} is also used in several places other 
than RS, e.g. in FS. Now if the # of reducers is 31, RS will compute the hash 
code differently from the other places. Wondering if we need to keep some kind 
of consistency among these calling paths. [~xuefuz] do you have any ideas?

> reducer number estimating may lead to data skew
> ---
>
> Key: HIVE-14797
> URL: https://issues.apache.org/jira/browse/HIVE-14797
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: roncenzhao
>Assignee: roncenzhao
> Attachments: HIVE-14797.2.patch, HIVE-14797.3.patch, 
> HIVE-14797.4.patch, HIVE-14797.patch
>
>
> HiveKey's hash code is generated by multiplying by 31, key by key, as 
> implemented in the method `ObjectInspectorUtils.getBucketHashCode()`:
> for (int i = 0; i < bucketFields.length; i++) {
>   int fieldHash = ObjectInspectorUtils.hashCode(bucketFields[i], bucketFieldInspectors[i]);
>   hashCode = 31 * hashCode + fieldHash;
> }
> The following example will lead to data skew:
> I have two tables called tbl1 and tbl2, and they have the same columns: a int, 
> b string. The values of column 'a' in both tables are not skewed, but the 
> values of column 'b' in both tables are skewed.
> When my SQL is "select * from tbl1 join tbl2 on tbl1.a=tbl2.a and 
> tbl1.b=tbl2.b" and the estimated reducer number is 31, it will lead to data 
> skew.
> As we know, the HiveKey's hash code is generated by `hash(a)*31 + hash(b)`. 
> When the reducer number is 31, the reducer No. of each row is `hash(b)%31`, 
> because `hash(a)*31` is a multiple of 31. As a result, the job will be skewed.
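The arithmetic in the description can be checked with a small standalone sketch. It assumes the reducer is chosen as the non-negative hash modulo the reducer count, and it uses small field hashes so that int overflow does not come into play. The class and method names (ReducerSkewDemo, combinedHash, reducerFor) are hypothetical and do not come from the patch.

```java
final class ReducerSkewDemo {
    // Combines two field hashes the way the quoted loop does, starting from 0.
    static int combinedHash(int hashA, int hashB) {
        int hashCode = 0;
        hashCode = 31 * hashCode + hashA;
        hashCode = 31 * hashCode + hashB;
        return hashCode;
    }

    // Assumed partitioning rule: non-negative hash modulo the number of reducers.
    static int reducerFor(int hash, int numReducers) {
        return (hash & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int hashB = 7; // stands in for one heavily repeated value of column b
        for (int hashA = 0; hashA < 5; hashA++) {
            int h = combinedHash(hashA, hashB);
            System.out.println("hash(a)=" + hashA
                    + " -> reducer " + reducerFor(h, 31) + " of 31"
                    + ", reducer " + reducerFor(h, 32) + " of 32");
        }
    }
}
```

With 31 reducers every row in this sketch lands on reducer 7 whatever the value of a, while with 32 reducers the same rows spread over reducers 7, 6, 5, 4 and 3.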





[jira] [Updated] (HIVE-14642) handle insert overwrite for MM tables

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14642:

Fix Version/s: hive-14535
   Status: Patch Available  (was: Open)

> handle insert overwrite for MM tables
> -
>
> Key: HIVE-14642
> URL: https://issues.apache.org/jira/browse/HIVE-14642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14642.patch
>
>






[jira] [Updated] (HIVE-14642) handle insert overwrite for MM tables

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14642:

Attachment: HIVE-14642.patch

Relatively small patch (mostly test changes, one fix for DP).
Seems like non-ORC merge is also broken... need to take a look separately

> handle insert overwrite for MM tables
> -
>
> Key: HIVE-14642
> URL: https://issues.apache.org/jira/browse/HIVE-14642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14642.patch
>
>






[jira] [Commented] (HIVE-14921) Move slow CliDriver tests to MiniLlap - part 2

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584278#comment-15584278
 ] 

Hive QA commented on HIVE-14921:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833847/HIVE-14921.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10553 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=199)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=92)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=157)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=157)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=157)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=206)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1607/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1607/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1607/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833847 - PreCommit-HIVE-Build

> Move slow CliDriver tests to MiniLlap - part 2
> --
>
> Key: HIVE-14921
> URL: https://issues.apache.org/jira/browse/HIVE-14921
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14921.1.patch, HIVE-14921.1.patch, 
> HIVE-14921.2.patch, HIVE-14921.2.patch, HIVE-14921.3.patch, HIVE-14921.3.patch
>
>
> Continuation to HIVE-14877





[jira] [Updated] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-17 Thread roncenzhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

roncenzhao updated HIVE-14797:
--
Attachment: HIVE-14797.4.patch

Resolved the problem with running on Spark/Tez.

> reducer number estimating may lead to data skew
> ---
>
> Key: HIVE-14797
> URL: https://issues.apache.org/jira/browse/HIVE-14797
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: roncenzhao
>Assignee: roncenzhao
> Attachments: HIVE-14797.2.patch, HIVE-14797.3.patch, 
> HIVE-14797.4.patch, HIVE-14797.patch
>
>





[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-17 Thread roncenzhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584262#comment-15584262
 ] 

roncenzhao commented on HIVE-14797:
---

Hi [~lirui], I have resolved this problem in the new patch.
Please check it. Thanks~

> reducer number estimating may lead to data skew
> ---
>
> Key: HIVE-14797
> URL: https://issues.apache.org/jira/browse/HIVE-14797
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: roncenzhao
>Assignee: roncenzhao
> Attachments: HIVE-14797.2.patch, HIVE-14797.3.patch, 
> HIVE-14797.4.patch, HIVE-14797.patch
>
>
> HiveKey's hash code is generated by multiplying by 31, key by key, as 
> implemented in the method `ObjectInspectorUtils.getBucketHashCode()`:
> for (int i = 0; i < bucketFields.length; i++) {
>   int fieldHash = ObjectInspectorUtils.hashCode(bucketFields[i], 
>       bucketFieldInspectors[i]);
>   hashCode = 31 * hashCode + fieldHash;
> }
> The following example leads to data skew:
> I have two tables, tbl1 and tbl2, with the same columns: a int, b string. 
> The values of column 'a' are not skewed in either table, but the values of 
> column 'b' are skewed in both.
> When my SQL is "select * from tbl1 join tbl2 on tbl1.a=tbl2.a and 
> tbl1.b=tbl2.b" and the estimated reducer number is 31, it leads to data 
> skew.
> As we know, the HiveKey's hash code is generated by `hash(a)*31 + hash(b)`. 
> When the reducer number is 31, the reducer No. of each row is `hash(b)%31`, 
> because the `hash(a)*31` term is divisible by 31 (integer overflow aside). 
> As a result, the job will be skewed.





[jira] [Updated] (HIVE-14642) handle insert overwrite for MM tables

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14642:

Summary: handle insert overwrite for MM tables  (was: handle insert 
overwrite, load, import)

> handle insert overwrite for MM tables
> -
>
> Key: HIVE-14642
> URL: https://issues.apache.org/jira/browse/HIVE-14642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>






[jira] [Commented] (HIVE-14887) Reduce the memory requirements for tests

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584186#comment-15584186
 ] 

Hive QA commented on HIVE-14887:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833300/HIVE-14887.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10594 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_partitioned] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats] 
(batchId=46)
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] 
(batchId=89)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=91)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=155)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1606/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1606/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1606/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833300 - PreCommit-HIVE-Build

> Reduce the memory requirements for tests
> 
>
> Key: HIVE-14887
> URL: https://issues.apache.org/jira/browse/HIVE-14887
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14887.01.patch, HIVE-14887.02.patch, 
> HIVE-14887.03.patch
>
>
> The clusters that we spin up end up requiring 16GB at times. Also, the Maven 
> arguments seem a little heavyweight.
> Reducing this will allow for additional ptest drones per box, which should 
> bring down the runtime.





[jira] [Updated] (HIVE-9941) sql std authorization on partitioned table: truncate and insert

2016-10-17 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-9941:
---
Attachment: HIVE-9941.3.patch

Actually, I have a further update, with import and drop ptn as well. I was 
assuming this was tested elsewhere, but apparently not. Added them in.

> sql std authorization on partitioned table: truncate and insert
> ---
>
> Key: HIVE-9941
> URL: https://issues.apache.org/jira/browse/HIVE-9941
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Olaf Flebbe
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9941.2.patch, HIVE-9941.3.patch, HIVE-9941.patch
>
>
> sql std authorization works as expected.
> However, if a table is partitioned, any user can truncate it.
> User foo:
> {code}
> create table bla (a string) partitioned by (b string);
> #.. loading values ...
> {code}
> Admin:
> {code}
> 0: jdbc:hive2://localhost:1/default> set role admin;
> No rows affected (0,074 seconds)
> 0: jdbc:hive2://localhost:1/default> show grant on bla;
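> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  |   grant_time   | grantor  |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | default   | bla    |            |         | foo             | USER            | DELETE     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | INSERT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | SELECT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | UPDATE     | true          | 1426158997000  | foo      |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+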
> {code}
> now user olaf
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from bla;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: Principal [name=olaf, type=USER] does not have following 
> privileges for operation QUERY [[SELECT] on Object [type=TABLE_OR_VIEW, 
> name=default.bla]] (state=42000,code=4)
> {code}
> works as expected.
> _BUT_
> {code}
> 0: jdbc:hive2://localhost:1/default> truncate table bla;
> No rows affected (0,18 seconds)
> {code}
> _And table is empty afterwards_.
> Similarly, {{insert into table}} works, too.





[jira] [Updated] (HIVE-14995) double conversion can corrupt partition column values for insert with dynamic partitions

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14995:

Summary: double conversion can corrupt partition column values for insert 
with dynamic partitions  (was: double conversion can corrupt partition column 
values for insert overwrite with DP)

> double conversion can corrupt partition column values for insert with dynamic 
> partitions
> 
>
> Key: HIVE-14995
> URL: https://issues.apache.org/jira/browse/HIVE-14995
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Critical
>
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.explain.user=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.fetch.task.conversion=none;
> drop table iow1; 
> create table iow1(key int) partitioned by (key2 int);
> select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
> desc limit 1;
> explain
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> {noformat}
> The result of the select query has the column converted to double (because 
> src.key is string). 
> {noformat}
> 498   499.0   499.0
> {noformat}
> When inserting that into table, the value is converted correctly to integer 
> for the regular column, but not for partition column.
> Explain for insert (extracted)
> {noformat}
>   Map Operator Tree:
> ...
>   Select Operator
> expressions: (UDFToDouble(key) + 1.0) (type: double)
> ...
> Reduce Output Operator
>   key expressions: _col0 (type: double)
> ...
>   Reduce Operator Tree:
> Select Operator
>   expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
> (type: double)
> ...
> Select Operator
>   expressions: UDFToInteger(_col0) (type: int), _col1 (type: 
> double)
> ... followed by FSOP and load into table
> {noformat}
> The result of the select from the resulting table is:
> {noformat}
> POSTHOOK: query: select key, key2 from iow1
> ...
> POSTHOOK: Input: default@iow1@key2=499.0
> ...
> 499   NULL
> {noformat}
> Woops!
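
One plausible reading of the NULL above is that the partition value is written in its double form (`499.0`) and later fails to parse as an int. The following is a minimal sketch of that reading only, not Hive's actual code path: `PartitionValueSketch`, `partitionDir` and `readIntPartition` are hypothetical helpers introduced for illustration.

```java
/** Illustrative sketch only; the helpers below are hypothetical, not Hive APIs. */
class PartitionValueSketch {

    /** Builds a partition directory name such as "key2=499.0" from a raw value. */
    static String partitionDir(String column, Object value) {
        return column + "=" + value;
    }

    /** Reads an int partition value back from a directory name; null if it does not parse. */
    static Integer readIntPartition(String dir) {
        String raw = dir.substring(dir.indexOf('=') + 1);
        try {
            return Integer.valueOf(raw);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        double k2 = 498 + 1.0;                          // string key promoted to double, then + 1
        String uncast = partitionDir("key2", k2);       // no cast applied to the partition column
        String cast = partitionDir("key2", (int) k2);   // cast applied, as for the regular column
        System.out.println(uncast + " -> " + readIntPartition(uncast));
        System.out.println(cast + " -> " + readIntPartition(cast));
    }
}
```

Under these assumptions the uncast value yields a directory named `key2=499.0` that reads back as null, while the cast value round-trips as 499, which matches the asymmetry between the regular column and the partition column shown in the explain output.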





[jira] [Resolved] (HIVE-14994) double conversion can corrupt partition column values for insert overwrite with DP

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-14994.
-
Resolution: Duplicate

Dup of HIVE-14995. Resolving this as I edited the description in the other one 
to improve it.

> double conversion can corrupt partition column values for insert overwrite 
> with DP
> --
>
> Key: HIVE-14994
> URL: https://issues.apache.org/jira/browse/HIVE-14994
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Critical
>
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.explain.user=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.fetch.task.conversion=none;
> drop table iow1; 
> create table iow1(key int) partitioned by (key2 int);
> select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
> desc limit 1;
> explain
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> {noformat}
> The result of the select query has the column converted to double (because 
> src.key is string). 
> The value is converted correctly to integer for the regular column, but not 
> for partition column.
> {noformat}
> 498   499.0   499.0
> {noformat}
> Explain for insert (extracted)
> {noformat}
> Map Reduce
>   Map Operator Tree:
> ...
>   Select Operator
> expressions: (UDFToDouble(key) + 1.0) (type: double)
> ...
> Reduce Output Operator
>   key expressions: _col0 (type: double)
>   sort order: -
> ...
>   Reduce Operator Tree:
> Select Operator
>   expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
> (type: double)
> ...
> Select Operator
>   expressions: UDFToInteger(_col0) (type: int), _col1 (type: 
> double)
>   followed by FSOP and load into table
> {noformat}
> The result of the select from the resulting table is:
> {noformat}
> POSTHOOK: query: select key, key2 from iow1
> ...
> POSTHOOK: Input: default@iow1@key2=499.0
> ...
> 499   NULL
> {noformat}
> Woops!





[jira] [Commented] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584092#comment-15584092
 ] 

Hive QA commented on HIVE-14940:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833845/HIVE-14940.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10592 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats] 
(batchId=46)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=155)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1605/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1605/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1605/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833845 - PreCommit-HIVE-Build

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14940.1.patch, HIVE-14940.1.patch, 
> HIVE-14940.2.patch, HIVE-14940.2.patch, HIVE-14940.3.patch, 
> HIVE-14940.3.patch, HIVE-14940.4.patch, HIVE-14940.4.patch
>
>
> HBase setup for the metastore in MiniTez takes around 3 minutes. The 
> actual runtime of the queries is typically much lower. To avoid the high 
> overhead, we should switch back to the SQL metastore as the default and, if 
> required, have a dedicated set of tests that run against the HBase metastore. 





[jira] [Commented] (HIVE-14969) add test cases for ACID

2016-10-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584091#comment-15584091
 ] 

Sergey Shelukhin commented on HIVE-14969:
-

Note: tablesample pruner is super buggy, so it's probably best to disable it 
for ACID tables like it's disabled for some other stuff

> add test cases for ACID
> ---
>
> Key: HIVE-14969
> URL: https://issues.apache.org/jira/browse/HIVE-14969
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>
> I think the following tests should be added:
> 1) CTAS into transactional table must be transactional.
> 2) tablesample with buckets from ACID table - judging by HIVE-14967, 
> selecting buckets with nested directories may have bugs on Tez
> 3) insert with union - same reason, if the test doesn't already exist it 
> would be nice to see that bases and deltas are processed correctly given that 
> union creates 2 directories for the results of the same insert





[jira] [Updated] (HIVE-13873) Column pruning for nested fields

2016-10-17 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-13873:

Attachment: HIVE-13873.4.patch

> Column pruning for nested fields
> 
>
> Key: HIVE-13873
> URL: https://issues.apache.org/jira/browse/HIVE-13873
> Project: Hive
>  Issue Type: New Feature
>  Components: Logical Optimizer
>Reporter: Xuefu Zhang
>Assignee: Ferdinand Xu
> Attachments: HIVE-13873.1.patch, HIVE-13873.2.patch, 
> HIVE-13873.3.patch, HIVE-13873.4.patch, HIVE-13873.patch, HIVE-13873.wip.patch
>
>
> Some columnar file formats such as Parquet store fields of struct type 
> column by column as well, using the encoding described in the Google Dremel 
> paper. It's very common in big data for data to be stored in structs while 
> queries need only a subset of the fields in the structs. However, Hive 
> presently still needs to read the whole struct regardless of whether all 
> fields are selected. Therefore, pruning unwanted sub-fields in struct or 
> nested fields at file reading time would be a big performance boost for such 
> scenarios.
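
To make the idea of pruning sub-fields concrete, here is a rough sketch under stated assumptions. It is not the attached patch or any Hive or Parquet API: the schema is modelled as nested maps (a struct is a map, a leaf is a type-name string), requested fields are dotted paths, and `NestedPruneSketch`, `prune` and `sampleSchema` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch of nested-field pruning; not Hive's or Parquet's implementation. */
class NestedPruneSketch {

    /** Returns a copy of the schema that keeps only the fields named by the dotted paths. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> prune(Map<String, Object> schema, List<String> paths) {
        // Group the remaining sub-paths under each top-level field name.
        Map<String, List<String>> byHead = new LinkedHashMap<>();
        for (String path : paths) {
            int dot = path.indexOf('.');
            String head = dot < 0 ? path : path.substring(0, dot);
            String rest = dot < 0 ? "" : path.substring(dot + 1);
            byHead.computeIfAbsent(head, k -> new ArrayList<>()).add(rest);
        }
        Map<String, Object> pruned = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : schema.entrySet()) {
            List<String> rests = byHead.get(field.getKey());
            if (rests == null) {
                continue; // field not requested at all
            }
            Object node = field.getValue();
            if (node instanceof Map && !rests.contains("")) {
                // Only some sub-fields of this struct were requested: recurse.
                pruned.put(field.getKey(), prune((Map<String, Object>) node, rests));
            } else {
                // Leaf field, or the whole struct was requested.
                pruned.put(field.getKey(), node);
            }
        }
        return pruned;
    }

    /** A small made-up schema: id int, name string, address struct<street, city, zip>. */
    static Map<String, Object> sampleSchema() {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("street", "string");
        address.put("city", "string");
        address.put("zip", "string");
        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("id", "int");
        schema.put("name", "string");
        schema.put("address", address);
        return schema;
    }

    public static void main(String[] args) {
        List<String> wanted = new ArrayList<>();
        wanted.add("id");
        wanted.add("address.city");
        System.out.println(prune(sampleSchema(), wanted));
    }
}
```

A reader that is handed the pruned schema would then need to touch only the `id` and `address.city` columns rather than the whole `address` struct, which is the saving the description is after.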





[jira] [Commented] (HIVE-14995) double conversion can corrupt partition column values for insert overwrite with DP

2016-10-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584061#comment-15584061
 ] 

Sergey Shelukhin commented on HIVE-14995:
-

[~hagleitn] [~ashutoshc] another interesting one... incorrect results

> double conversion can corrupt partition column values for insert overwrite 
> with DP
> --
>
> Key: HIVE-14995
> URL: https://issues.apache.org/jira/browse/HIVE-14995
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Critical
>
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.explain.user=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.fetch.task.conversion=none;
> drop table iow1; 
> create table iow1(key int) partitioned by (key2 int);
> select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
> desc limit 1;
> explain
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> {noformat}
> The result of the select query has the column converted to double (because 
> src.key is string). 
> {noformat}
> 498   499.0   499.0
> {noformat}
> When inserting that into table, the value is converted correctly to integer 
> for the regular column, but not for partition column.
> Explain for insert (extracted)
> {noformat}
>   Map Operator Tree:
> ...
>   Select Operator
> expressions: (UDFToDouble(key) + 1.0) (type: double)
> ...
> Reduce Output Operator
>   key expressions: _col0 (type: double)
> ...
>   Reduce Operator Tree:
> Select Operator
>   expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
> (type: double)
> ...
> Select Operator
>   expressions: UDFToInteger(_col0) (type: int), _col1 (type: 
> double)
> ... followed by FSOP and load into table
> {noformat}
> The result of the select from the resulting table is:
> {noformat}
> POSTHOOK: query: select key, key2 from iow1
> ...
> POSTHOOK: Input: default@iow1@key2=499.0
> ...
> 499   NULL
> {noformat}
> Woops!





[jira] [Updated] (HIVE-14995) double conversion can corrupt partition column values for insert overwrite with DP

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14995:

Description: 
{noformat}
set hive.mapred.mode=nonstrict;
set hive.explain.user=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.fetch.task.conversion=none;

drop table iow1; 
create table iow1(key int) partitioned by (key2 int);

select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
desc limit 1;

explain
insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;

insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;
{noformat}

The result of the select query has the column converted to double (because 
src.key is string). 
{noformat}
498 499.0   499.0
{noformat}

When inserting that into table, the value is converted correctly to integer for 
the regular column, but not for partition column.
Explain for insert (extracted)
{noformat}
  Map Operator Tree:
...
  Select Operator
expressions: (UDFToDouble(key) + 1.0) (type: double)
...
Reduce Output Operator
  key expressions: _col0 (type: double)
...
  Reduce Operator Tree:
Select Operator
  expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
(type: double)
...
Select Operator
  expressions: UDFToInteger(_col0) (type: int), _col1 (type: double)
... followed by FSOP and load into table
{noformat}
The result of the select from the resulting table is:
{noformat}
POSTHOOK: query: select key, key2 from iow1
...
POSTHOOK: Input: default@iow1@key2=499.0
...
499 NULL
{noformat}
Woops!



  was:
{noformat}
set hive.mapred.mode=nonstrict;
set hive.explain.user=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.fetch.task.conversion=none;

drop table iow1; 
create table iow1(key int) partitioned by (key2 int);

select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
desc limit 1;

explain
insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;

insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;
{noformat}

The result of the select query has the column converted to double (because 
src.key is string). 
{noformat}
498 499.0   499.0
{noformat}

When inserting that into table, the value is converted correctly to integer for 
the regular column, but not for partition column.
Explain for insert (extracted)
{noformat}
Map Reduce
  Map Operator Tree:
...
  Select Operator
expressions: (UDFToDouble(key) + 1.0) (type: double)
...
Reduce Output Operator
  key expressions: _col0 (type: double)
...
  Reduce Operator Tree:
Select Operator
  expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
(type: double)
...
Select Operator
  expressions: UDFToInteger(_col0) (type: int), _col1 (type: double)
  followed by FSOP and load into table
{noformat}
The result of the select from the resulting table is:
{noformat}
POSTHOOK: query: select key, key2 from iow1
...
POSTHOOK: Input: default@iow1@key2=499.0
...
499 NULL
{noformat}
Woops!




> double conversion can corrupt partition column values for insert overwrite 
> with DP
> --
>
> Key: HIVE-14995
> URL: https://issues.apache.org/jira/browse/HIVE-14995
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Critical
>
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.explain.user=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.fetch.task.conversion=none;
> drop table iow1; 
> create table iow1(key int) partitioned by (key2 int);
> select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
> desc limit 1;
> explain
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> {noformat}
> The result of the select query has the column converted to double (because 
> src.key is string). 
> {noformat}
> 498   499.0   499.0
> {noformat}
> When inserting that into table, the value is converted correctly to integer 
> for the regular column, but not for partition column.
> Explain for insert (extracted)
> {noformat}
>   Map Operator Tree:
> ...
>   Select 

[jira] [Updated] (HIVE-14995) double conversion can corrupt partition column values for insert overwrite with DP

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14995:

Description: 
{noformat}
set hive.mapred.mode=nonstrict;
set hive.explain.user=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.fetch.task.conversion=none;

drop table iow1; 
create table iow1(key int) partitioned by (key2 int);

select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
desc limit 1;

explain
insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;

insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;
{noformat}

The result of the select query has the column converted to double (because 
src.key is string). 
{noformat}
498 499.0   499.0
{noformat}

When inserting that into table, the value is converted correctly to integer for 
the regular column, but not for partition column.
Explain for insert (extracted)
{noformat}
Map Reduce
  Map Operator Tree:
...
  Select Operator
expressions: (UDFToDouble(key) + 1.0) (type: double)
...
Reduce Output Operator
  key expressions: _col0 (type: double)
...
  Reduce Operator Tree:
Select Operator
  expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
(type: double)
...
Select Operator
  expressions: UDFToInteger(_col0) (type: int), _col1 (type: double)
  followed by FSOP and load into table
{noformat}
The result of the select from the resulting table is:
{noformat}
POSTHOOK: query: select key, key2 from iow1
...
POSTHOOK: Input: default@iow1@key2=499.0
...
499 NULL
{noformat}
Woops!



  was:
{noformat}
set hive.mapred.mode=nonstrict;
set hive.explain.user=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.fetch.task.conversion=none;

drop table iow1; 
create table iow1(key int) partitioned by (key2 int);

select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
desc limit 1;

explain
insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;

insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;
{noformat}

The result of the select query has the column converted to double (because 
src.key is string). 
{noformat}
498 499.0   499.0
{noformat}

When inserting that into table, the value is converted correctly to integer for 
the regular column, but not for partition column.
Explain for insert (extracted)
{noformat}
Map Reduce
  Map Operator Tree:
...
  Select Operator
expressions: (UDFToDouble(key) + 1.0) (type: double)
...
Reduce Output Operator
  key expressions: _col0 (type: double)
  sort order: -
...
  Reduce Operator Tree:
Select Operator
  expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
(type: double)
...
Select Operator
  expressions: UDFToInteger(_col0) (type: int), _col1 (type: double)
  followed by FSOP and load into table
{noformat}
The result of the select from the resulting table is:
{noformat}
POSTHOOK: query: select key, key2 from iow1
...
POSTHOOK: Input: default@iow1@key2=499.0
...
499 NULL
{noformat}
Woops!




> double conversion can corrupt partition column values for insert overwrite 
> with DP
> --
>
> Key: HIVE-14995
> URL: https://issues.apache.org/jira/browse/HIVE-14995
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Critical
>
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.explain.user=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.fetch.task.conversion=none;
> drop table iow1; 
> create table iow1(key int) partitioned by (key2 int);
> select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
> desc limit 1;
> explain
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> {noformat}
> The result of the select query has the column converted to double (because 
> src.key is string). 
> {noformat}
> 498   499.0   499.0
> {noformat}
> When inserting that into table, the value is converted correctly to integer 
> for the regular column, but not for partition column.
> Explain for insert (extracted)
> {noformat}
> Map 

[jira] [Updated] (HIVE-14995) double conversion can corrupt partition column values for insert overwrite with DP

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14995:

Description: 
{noformat}
set hive.mapred.mode=nonstrict;
set hive.explain.user=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.fetch.task.conversion=none;

drop table iow1; 
create table iow1(key int) partitioned by (key2 int);

select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
desc limit 1;

explain
insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;

insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;
{noformat}

The result of the select query has the column converted to double (because 
src.key is string). 
{noformat}
498 499.0   499.0
{noformat}

When inserting that into table, the value is converted correctly to integer for 
the regular column, but not for partition column.
Explain for insert (extracted)
{noformat}
Map Reduce
  Map Operator Tree:
...
  Select Operator
expressions: (UDFToDouble(key) + 1.0) (type: double)
...
Reduce Output Operator
  key expressions: _col0 (type: double)
  sort order: -
...
  Reduce Operator Tree:
Select Operator
  expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
(type: double)
...
Select Operator
  expressions: UDFToInteger(_col0) (type: int), _col1 (type: double)
  followed by FSOP and load into table
{noformat}
The result of the select from the resulting table is:
{noformat}
POSTHOOK: query: select key, key2 from iow1
...
POSTHOOK: Input: default@iow1@key2=499.0
...
499 NULL
{noformat}
Woops!



  was:
{noformat}
set hive.mapred.mode=nonstrict;
set hive.explain.user=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.fetch.task.conversion=none;

drop table iow1; 
create table iow1(key int) partitioned by (key2 int);

select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
desc limit 1;

explain
insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;

insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;
{noformat}

The result of the select query has the column converted to double (because 
src.key is string). 
The value is converted correctly to integer for the regular column, but not for 
partition column.
{noformat}
498 499.0   499.0
{noformat}

Explain for insert (extracted)
{noformat}
Map Reduce
  Map Operator Tree:
...
  Select Operator
expressions: (UDFToDouble(key) + 1.0) (type: double)
...
Reduce Output Operator
  key expressions: _col0 (type: double)
  sort order: -
...
  Reduce Operator Tree:
Select Operator
  expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
(type: double)
...
Select Operator
  expressions: UDFToInteger(_col0) (type: int), _col1 (type: double)
  followed by FSOP and load into table
{noformat}
The result of the select from the resulting table is:
{noformat}
POSTHOOK: query: select key, key2 from iow1
...
POSTHOOK: Input: default@iow1@key2=499.0
...
499 NULL
{noformat}
Woops!




> double conversion can corrupt partition column values for insert overwrite 
> with DP
> --
>
> Key: HIVE-14995
> URL: https://issues.apache.org/jira/browse/HIVE-14995
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Critical
>
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.explain.user=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.fetch.task.conversion=none;
> drop table iow1; 
> create table iow1(key int) partitioned by (key2 int);
> select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
> desc limit 1;
> explain
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> {noformat}
> The result of the select query has the column converted to double (because 
> src.key is string). 
> {noformat}
> 498   499.0   499.0
> {noformat}
> When inserting that into table, the value is converted correctly to integer 
> for the regular column, but not for partition column.
> Explain for insert (extracted)
> {noformat}
> Map 

[jira] [Updated] (HIVE-14969) add test cases for ACID

2016-10-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14969:
--
Component/s: Transactions

> add test cases for ACID
> ---
>
> Key: HIVE-14969
> URL: https://issues.apache.org/jira/browse/HIVE-14969
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>
> I think the following tests are added
> 1) CTAS into transactional table must be transactional.
> 2) tablesample with buckets from ACID table - judging by HIVE-14967, 
> selecting buckets with nested directories may have bugs on Tez
> 3) insert with union - same reason, if the test doesn't already exist it 
> would be nice to see that bases and deltas are processed correctly given that 
> union creates 2 directories for the results of the same insert





[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-17 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Status: Patch Available  (was: Open)

Address review comments

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, 
> HIVE-14913.3.patch, HIVE-14913.4.patch, HIVE-14913.5.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead





[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-17 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Attachment: HIVE-14913.5.patch

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, 
> HIVE-14913.3.patch, HIVE-14913.4.patch, HIVE-14913.5.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead





[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-17 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Status: Open  (was: Patch Available)

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, 
> HIVE-14913.3.patch, HIVE-14913.4.patch, HIVE-14913.5.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead





[jira] [Commented] (HIVE-14921) Move slow CliDriver tests to MiniLlap - part 2

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584003#comment-15584003
 ] 

Hive QA commented on HIVE-14921:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833204/HIVE-14921.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10567 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=199)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_mult_tables_compact]
 (batchId=32)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[pcs] 
(batchId=144)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=157)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=157)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=157)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=206)
org.apache.hive.spark.client.TestSparkClient.testJobSubmission (batchId=265)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1604/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1604/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1604/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833204 - PreCommit-HIVE-Build

> Move slow CliDriver tests to MiniLlap - part 2
> --
>
> Key: HIVE-14921
> URL: https://issues.apache.org/jira/browse/HIVE-14921
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14921.1.patch, HIVE-14921.1.patch, 
> HIVE-14921.2.patch, HIVE-14921.2.patch, HIVE-14921.3.patch, HIVE-14921.3.patch
>
>
> Continuation to HIVE-14877





[jira] [Commented] (HIVE-9941) sql std authorization on partitioned table: truncate and insert

2016-10-17 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583993#comment-15583993
 ] 

Jason Dere commented on HIVE-9941:
--

Actually, I'll hold off on my +1 until we see the ptest run, per the newly 
discussed guidelines for waiting on test results before committing.
But the test cases look good to me.

> sql std authorization on partitioned table: truncate and insert
> ---
>
> Key: HIVE-9941
> URL: https://issues.apache.org/jira/browse/HIVE-9941
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Olaf Flebbe
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9941.2.patch, HIVE-9941.patch
>
>
> sql std authorization works as expected.
> However if a table is partitioned any user can truncate it
> User foo:
> {code}
> create table bla (a string) partitioned by (b string);
> #.. loading values ...
> {code}
> Admin:
> {code}
> 0: jdbc:hive2://localhost:1/default> set role admin;
> No rows affected (0,074 seconds)
> 0: jdbc:hive2://localhost:1/default> show grant on bla;
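> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  |   grant_time   | grantor  |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | default   | bla    |            |         | foo             | USER            | DELETE     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | INSERT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | SELECT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | UPDATE     | true          | 1426158997000  | foo      |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+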
> {code}
> now user olaf
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from bla;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: Principal [name=olaf, type=USER] does not have following 
> privileges for operation QUERY [[SELECT] on Object [type=TABLE_OR_VIEW, 
> name=default.bla]] (state=42000,code=4)
> {code}
> works as expected.
> _BUT_
> {code}
> 0: jdbc:hive2://localhost:1/default> truncate table bla;
> No rows affected (0,18 seconds)
> {code}
> _And table is empty afterwards_.
> Similarly: {{insert into table}} works, too.





[jira] [Commented] (HIVE-9941) sql std authorization on partitioned table: truncate and insert

2016-10-17 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583988#comment-15583988
 ] 

Jason Dere commented on HIVE-9941:
--

+1 if the tests pass

> sql std authorization on partitioned table: truncate and insert
> ---
>
> Key: HIVE-9941
> URL: https://issues.apache.org/jira/browse/HIVE-9941
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Olaf Flebbe
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9941.2.patch, HIVE-9941.patch
>
>
> sql std authorization works as expected.
> However if a table is partitioned any user can truncate it
> User foo:
> {code}
> create table bla (a string) partitioned by (b string);
> #.. loading values ...
> {code}
> Admin:
> {code}
> 0: jdbc:hive2://localhost:1/default> set role admin;
> No rows affected (0,074 seconds)
> 0: jdbc:hive2://localhost:1/default> show grant on bla;
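> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  |   grant_time   | grantor  |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | default   | bla    |            |         | foo             | USER            | DELETE     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | INSERT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | SELECT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | UPDATE     | true          | 1426158997000  | foo      |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+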
> {code}
> now user olaf
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from bla;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: Principal [name=olaf, type=USER] does not have following 
> privileges for operation QUERY [[SELECT] on Object [type=TABLE_OR_VIEW, 
> name=default.bla]] (state=42000,code=4)
> {code}
> works as expected.
> _BUT_
> {code}
> 0: jdbc:hive2://localhost:1/default> truncate table bla;
> No rows affected (0,18 seconds)
> {code}
> _And table is empty afterwards_.
> Similarly: {{insert into table}} works, too.





[jira] [Issue Comment Deleted] (HIVE-9941) sql std authorization on partitioned table: truncate and insert

2016-10-17 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-9941:
-
Comment: was deleted

(was: +1 if the tests pass)

> sql std authorization on partitioned table: truncate and insert
> ---
>
> Key: HIVE-9941
> URL: https://issues.apache.org/jira/browse/HIVE-9941
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Olaf Flebbe
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9941.2.patch, HIVE-9941.patch
>
>
> sql std authorization works as expected.
> However if a table is partitioned any user can truncate it
> User foo:
> {code}
> create table bla (a string) partitioned by (b string);
> #.. loading values ...
> {code}
> Admin:
> {code}
> 0: jdbc:hive2://localhost:1/default> set role admin;
> No rows affected (0,074 seconds)
> 0: jdbc:hive2://localhost:1/default> show grant on bla;
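> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  |   grant_time   | grantor  |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | default   | bla    |            |         | foo             | USER            | DELETE     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | INSERT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | SELECT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | UPDATE     | true          | 1426158997000  | foo      |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+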
> {code}
> now user olaf
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from bla;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: Principal [name=olaf, type=USER] does not have following 
> privileges for operation QUERY [[SELECT] on Object [type=TABLE_OR_VIEW, 
> name=default.bla]] (state=42000,code=4)
> {code}
> works as expected.
> _BUT_
> {code}
> 0: jdbc:hive2://localhost:1/default> truncate table bla;
> No rows affected (0,18 seconds)
> {code}
> _And table is empty afterwards_.
> Similarly: {{insert into table}} works, too.





[jira] [Updated] (HIVE-9941) sql std authorization on partitioned table: truncate and insert

2016-10-17 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-9941:
---
Target Version/s: 2.2.0  (was: 1.3.0, 1.2.2, 2.2.0)
  Status: Patch Available  (was: Open)

> sql std authorization on partitioned table: truncate and insert
> ---
>
> Key: HIVE-9941
> URL: https://issues.apache.org/jira/browse/HIVE-9941
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.0, 1.0.0
>Reporter: Olaf Flebbe
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9941.2.patch, HIVE-9941.patch
>
>
> sql std authorization works as expected.
> However if a table is partitioned any user can truncate it
> User foo:
> {code}
> create table bla (a string) partitioned by (b string);
> #.. loading values ...
> {code}
> Admin:
> {code}
> 0: jdbc:hive2://localhost:1/default> set role admin;
> No rows affected (0,074 seconds)
> 0: jdbc:hive2://localhost:1/default> show grant on bla;
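> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  |   grant_time   | grantor  |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | default   | bla    |            |         | foo             | USER            | DELETE     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | INSERT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | SELECT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | UPDATE     | true          | 1426158997000  | foo      |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+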
> {code}
> now user olaf
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from bla;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: Principal [name=olaf, type=USER] does not have following 
> privileges for operation QUERY [[SELECT] on Object [type=TABLE_OR_VIEW, 
> name=default.bla]] (state=42000,code=4)
> {code}
> works as expected.
> _BUT_
> {code}
> 0: jdbc:hive2://localhost:1/default> truncate table bla;
> No rows affected (0,18 seconds)
> {code}
> _And table is empty afterwards_.
> Similarly: {{insert into table}} works, too.





[jira] [Updated] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14993:
--
Component/s: Transactions

> make WriteEntity distinguish writeType
> --
>
> Key: HIVE-14993
> URL: https://issues.apache.org/jira/browse/HIVE-14993
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14993.patch
>
>






[jira] [Updated] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14993:
--
Attachment: HIVE-14993.patch

> make WriteEntity distinguish writeType
> --
>
> Key: HIVE-14993
> URL: https://issues.apache.org/jira/browse/HIVE-14993
> Project: Hive
>  Issue Type: Bug
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14993.patch
>
>






[jira] [Updated] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14993:
--
Status: Patch Available  (was: Open)

> make WriteEntity distinguish writeType
> --
>
> Key: HIVE-14993
> URL: https://issues.apache.org/jira/browse/HIVE-14993
> Project: Hive
>  Issue Type: Bug
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14993.patch
>
>






[jira] [Updated] (HIVE-14921) Move slow CliDriver tests to MiniLlap - part 2

2016-10-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14921:
-
Attachment: HIVE-14921.3.patch

> Move slow CliDriver tests to MiniLlap - part 2
> --
>
> Key: HIVE-14921
> URL: https://issues.apache.org/jira/browse/HIVE-14921
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14921.1.patch, HIVE-14921.1.patch, 
> HIVE-14921.2.patch, HIVE-14921.2.patch, HIVE-14921.3.patch, HIVE-14921.3.patch
>
>
> Continuation to HIVE-14877





[jira] [Updated] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14940:
-
Attachment: HIVE-14940.4.patch

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14940.1.patch, HIVE-14940.1.patch, 
> HIVE-14940.2.patch, HIVE-14940.2.patch, HIVE-14940.3.patch, 
> HIVE-14940.3.patch, HIVE-14940.4.patch, HIVE-14940.4.patch
>
>
> HBase setup for the metastore in MiniTez takes around 3 minutes. The 
> actual runtime of the queries is typically much lower. To avoid the high 
> overhead we should switch back to the SQL metastore as default and, if required, 
> we can have a dedicated set of tests that run against the HBase metastore. 





[jira] [Updated] (HIVE-12764) Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all) in Hive

2016-10-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12764:
---
Summary: Support Intersect (distinct/all) Except (distinct/all) Minus 
(distinct/all) in Hive  (was: Support set operators in Hive)

> Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all) 
> in Hive
> ---
>
> Key: HIVE-12764
> URL: https://issues.apache.org/jira/browse/HIVE-12764
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> We plan to address union distinct (already done), intersect (all, distinct) 
> and except (all, distinct) by leveraging the power of relational algebra 
> through query rewriting.





[jira] [Commented] (HIVE-14927) Remove code duplication from tests in TestLdapAtnProviderWithMiniDS

2016-10-17 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583744#comment-15583744
 ] 

Chaoyu Tang commented on HIVE-14927:


Yeah, it seems that the precommit build has some issues.

> Remove code duplication from tests in TestLdapAtnProviderWithMiniDS
> ---
>
> Key: HIVE-14927
> URL: https://issues.apache.org/jira/browse/HIVE-14927
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
> Attachments: HIVE-14927.1.patch, HIVE-14927.2.patch, 
> HIVE-14927.3.patch
>
>
> * Extract inner class User and implement a proper builder for it.
> * Extract all common code to LdapAuthenticationTestCase class 
>   ** setting up the test case
>** executing test case
>** result validation





[jira] [Updated] (HIVE-14992) Relocate several common libraries in hive jdbc uber jar

2016-10-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-14992:
--
Attachment: HIVE-14992.1.patch

> Relocate several common libraries in hive jdbc uber jar
> ---
>
> Key: HIVE-14992
> URL: https://issues.apache.org/jira/browse/HIVE-14992
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-14992.1.patch
>
>






[jira] [Commented] (HIVE-14992) Relocate several common libraries in hive jdbc uber jar

2016-10-17 Thread Tao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583711#comment-15583711
 ] 

Tao Li commented on HIVE-14992:
---

This is to avoid dependency version conflicts when users are using some common 
libs along with the JDBC standalone jar.
cc [~gopalv], [~thejas]

> Relocate several common libraries in hive jdbc uber jar
> ---
>
> Key: HIVE-14992
> URL: https://issues.apache.org/jira/browse/HIVE-14992
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
>






[jira] [Commented] (HIVE-14980) Minor compaction when triggered simultaniously on the same table/partition deletes data

2016-10-17 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583692#comment-15583692
 ] 

Eugene Koifman commented on HIVE-14980:
---

Relying on "show compactions" is not atomic so it's not a complete fix.
It should use locks of some kind, but not in the current lock manager.  
MutexAPI.acquireLock(String) was meant to support the kind of locking that this 
needs but it's not quite complete.  If you use  for the 
key, and use this from Worker, it will achieve the proper synchronization 
atomically and the "lock" will be released if the process dies.
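
The per-key locking idea in this comment can be sketched as follows. This is an in-process Python stand-in, not Hive's MutexAPI: the class `KeyedMutex`, the helper `run_workers`, and the key string are all hypothetical, and the key format used in the comment above is not reproduced here. Unlike the lock described in the comment, a plain in-process lock gives no guarantee about release when a process dies.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager


class KeyedMutex:
    """Hands out one lock per key so work on the same key is serialized."""

    def __init__(self):
        self._guard = threading.Lock()            # protects the lock table itself
        self._locks = defaultdict(threading.Lock)

    @contextmanager
    def acquire(self, key):
        with self._guard:
            lock = self._locks[key]
        with lock:
            yield


def run_workers(mutex, key, n_workers):
    """Start n_workers 'compactions' on one key; return the peak number running at once."""
    active = 0
    max_active = 0
    counter_guard = threading.Lock()

    def compact():
        nonlocal active, max_active
        with mutex.acquire(key):
            with counter_guard:
                active += 1
                max_active = max(max_active, active)
            # ... the compaction work for this key would go here ...
            with counter_guard:
                active -= 1

    threads = [threading.Thread(target=compact) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_active


print(run_workers(KeyedMutex(), "example-key", 8))
```

Because acquiring the lock is a single step, there is no gap between checking for a running compaction and starting one, which is the gap that a "show compactions" check leaves open.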


> Minor compaction when triggered simultaniously on the same table/partition 
> deletes data
> ---
>
> Key: HIVE-14980
> URL: https://issues.apache.org/jira/browse/HIVE-14980
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 2.1.0
>Reporter: Mahipal Jupalli
>Assignee: Mahipal Jupalli
>Priority: Critical
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> I have two tables (TABLEA, TABLEB). If I manually trigger compaction after 
> each INSERT into TABLEB from TABLEA, compactions are triggered on random 
> metastore asynchronously and are stepping on each other which is causing the 
> data to be deleted.
> Example here: 
> TABLEA - has 10k rows. 
> insert into mj.tableb select * from mj.tablea;
> alter table mj.tableb compact 'MINOR';
> insert into mj.tableb select * from mj.tablea;
> alter table mj.tableb compact 'MINOR';
> Once all the compactions are complete, I should ideally see 20k rows in 
> TABLEB. But I see only 10k rows (Only the rows INSERTED before the last 
> compaction persist, the old rows are deleted. I believe the old delta files 
> are deleted). 
> To further confirm the bug, if I do only one compaction after two inserts, I 
> see 20k rows in TABLEB.
> Proposed Fix:
> I have identified the bug in the code. It requires an additional check in the 
> org.apache.hadoop.hive.ql.txn.compactor.Worker class for any active 
> compactions on the table/partition. I will share the details of the fix once 
> I test it.





[jira] [Commented] (HIVE-14927) Remove code duplication from tests in TestLdapAtnProviderWithMiniDS

2016-10-17 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583621#comment-15583621
 ] 

Illya Yalovyy commented on HIVE-14927:
--

At the moment I can see many builds are failing with similar symptoms.

> Remove code duplication from tests in TestLdapAtnProviderWithMiniDS
> ---
>
> Key: HIVE-14927
> URL: https://issues.apache.org/jira/browse/HIVE-14927
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
> Attachments: HIVE-14927.1.patch, HIVE-14927.2.patch, 
> HIVE-14927.3.patch
>
>
> * Extract inner class User and implement a proper builder for it.
> * Extract all common code to LdapAuthenticationTestCase class 
>   ** setting up the test case
>** executing test case
>** result validation





[jira] [Commented] (HIVE-14029) Update Spark version to 2.0.0

2016-10-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583487#comment-15583487
 ] 

Xuefu Zhang commented on HIVE-14029:


Spark claims API compatibility within a major release, but it doesn't seem so 
based on our experience. 
https://issues.apache.org/jira/browse/HIVE-9726
https://issues.apache.org/jira/browse/HIVE-10999
https://issues.apache.org/jira/browse/HIVE-11473
https://issues.apache.org/jira/browse/HIVE-12828
In two of the four upgrades, there were incompatible API changes.

Spark is still a young project, so people may have lower expectations for this.

> Update Spark version to 2.0.0
> -
>
> Key: HIVE-14029
> URL: https://issues.apache.org/jira/browse/HIVE-14029
> Project: Hive
>  Issue Type: Bug
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>  Labels: Incompatible, TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14029.1.patch, HIVE-14029.2.patch, 
> HIVE-14029.3.patch, HIVE-14029.4.patch, HIVE-14029.5.patch, 
> HIVE-14029.6.patch, HIVE-14029.7.patch, HIVE-14029.8.patch, HIVE-14029.patch
>
>
> There are quite a few new optimizations in Spark 2.0.0. We need to bump up 
> Spark to 2.0.0 to benefit from those performance improvements.
> To update Spark version to 2.0.0, the following changes are required:
> * Spark API updates:
> ** SparkShuffler#call returns Iterator instead of Iterable
> ** SparkListener -> JavaSparkListener
> ** InputMetrics constructor doesn’t accept readMethod
> ** Method remoteBlocksFetched and localBlocksFetched in ShuffleReadMetrics 
> return long type instead of integer
> * Dependency upgrade:
> ** Jackson: 2.4.2 -> 2.6.5
> ** Netty version: 4.0.23.Final -> 4.0.29.Final
> ** Scala binary version: 2.10 -> 2.11
> ** Scala version: 2.10.4 -> 2.11.8
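
One practical consequence of the Iterable-to-Iterator change listed above is that callers can no longer traverse the result twice. The sketch below is a Python analogue of that general property, not Spark or Hive code; the function names are hypothetical.

```python
def call_returning_iterable(rows):
    # Analogue of the old contract: the result can be traversed repeatedly.
    return list(rows)


def call_returning_iterator(rows):
    # Analogue of the new contract: the result is a single-pass iterator.
    return iter(rows)


def count_twice(result):
    # Traverse the same result two times and report both counts.
    first = sum(1 for _ in result)
    second = sum(1 for _ in result)
    return first, second


print(count_twice(call_returning_iterable([1, 2, 3])))
print(count_twice(call_returning_iterator([1, 2, 3])))
```

Code that needs a second pass over a single-pass result has to buffer it first.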





[jira] [Resolved] (HIVE-14991) JDBC result set iterator has useless DEBUG log

2016-10-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-14991.
--
   Resolution: Fixed
Fix Version/s: 2.2.0

Committed to master.

> JDBC result set iterator has useless DEBUG log
> --
>
> Key: HIVE-14991
> URL: https://issues.apache.org/jira/browse/HIVE-14991
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.2.0
>
> Attachments: HIVE-14991.1.patch
>
>
> Result set iterator prints the following debug lines for every row. The row 
> string is always empty as per code.
> {code}
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> {code}
> NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-14991) JDBC result set iterator has useless DEBUG log

2016-10-17 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583469#comment-15583469
 ] 

Prasanth Jayachandran commented on HIVE-14991:
--

There is no other assignment for rowStr in the code. Since this patch just 
removes the unwanted logs, I don't think we need precommit tests for it.

> JDBC result set iterator has useless DEBUG log
> --
>
> Key: HIVE-14991
> URL: https://issues.apache.org/jira/browse/HIVE-14991
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14991.1.patch
>
>
> Result set iterator prints the following debug lines for every row. The row 
> string is always empty as per code.
> {code}
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> {code}
> NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-14991) JDBC result set iterator has useless DEBUG log

2016-10-17 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583466#comment-15583466
 ] 

Vaibhav Gumashta commented on HIVE-14991:
-

+1

> JDBC result set iterator has useless DEBUG log
> --
>
> Key: HIVE-14991
> URL: https://issues.apache.org/jira/browse/HIVE-14991
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14991.1.patch
>
>
> Result set iterator prints the following debug lines for every row. The row 
> string is always empty as per code.
> {code}
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> {code}
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-14991) JDBC result set iterator has useless DEBUG log

2016-10-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14991:
-
Description: 
Result set iterator prints the following debug lines for every row. The row 
string is always empty as per code.

{code}
2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
string: 
2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
string: 
2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
string: 
2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
string: 
2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
string: 
2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
string: 
2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
string: 
{code}

NO PRECOMMIT TESTS

  was:
Result set iterator prints the following debug lines for every row. The row 
string is always empty as per code.

{code}
2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
string: 
2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
string: 
2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
string: 
2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
string: 
2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
string: 
2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
string: 
2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
string: 
{code}


> JDBC result set iterator has useless DEBUG log
> --
>
> Key: HIVE-14991
> URL: https://issues.apache.org/jira/browse/HIVE-14991
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14991.1.patch
>
>
> Result set iterator prints the following debug lines for every row. The row 
> string is always empty as per code.
> {code}
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> {code}
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-14991) JDBC result set iterator has useless DEBUG log

2016-10-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14991:
-
Attachment: HIVE-14991.1.patch

[~thejas][~vgumashta] can someone please take a look?

> JDBC result set iterator has useless DEBUG log
> --
>
> Key: HIVE-14991
> URL: https://issues.apache.org/jira/browse/HIVE-14991
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14991.1.patch
>
>
> Result set iterator prints the following debug lines for every row. The row 
> string is always empty as per code.
> {code}
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,792 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> 2016-10-17T11:49:52,793 DEBUG [main] jdbc.HiveQueryResultSet: Fetched row 
> string: 
> {code}





[jira] [Updated] (HIVE-14899) MM: support (or disable) alter table concatenate

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14899:

Attachment: HIVE-14899.patch

> MM: support (or disable) alter table concatenate
> 
>
> Key: HIVE-14899
> URL: https://issues.apache.org/jira/browse/HIVE-14899
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14899.patch
>
>
> Doesn't make much sense for a table with lots of separate directories, at the 
> first glance. However, it could concatenate files within separate directories 
> if some insert produces a lot.
> Merges between directories are also possible, as long as they take into 
> account correctness of what is committed.





[jira] [Resolved] (HIVE-14899) MM: support (or disable) alter table concatenate

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-14899.
-
   Resolution: Fixed
Fix Version/s: hive-14535

This has already been done in some other jira. Added a test.

> MM: support (or disable) alter table concatenate
> 
>
> Key: HIVE-14899
> URL: https://issues.apache.org/jira/browse/HIVE-14899
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14899.patch
>
>
> Doesn't make much sense for a table with lots of separate directories, at the 
> first glance. However, it could concatenate files within separate directories 
> if some insert produces a lot.
> Merges between directories are also possible, as long as they take into 
> account correctness of what is committed.





[jira] [Commented] (HIVE-14029) Update Spark version to 2.0.0

2016-10-17 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583393#comment-15583393
 ] 

Sergio Peña commented on HIVE-14029:


Interesting, so even between Spark 1.x versions, Hive wasn't compatible at all 
with them? This is going to be a lot of work as you said. If Spark 2.1 isn't 
compatible with Spark 2.0 for instance, then we will have a shim layer with 
minor changes per Spark version to keep compatibility.

[~xuefuz] Were there users in the community complaining about Spark 1.x 
incompatibilities with Hive in the past? 
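
A minimal sketch of what such a shim layer could look like, written in Python for brevity. Everything here is hypothetical (`SparkShim`, `Spark1Shim`, `Spark2Shim`, `load_shim`); it is not Hive or Spark API. The one difference it models is taken from the issue description: a shuffle call that hands back an Iterable in 1.x and an Iterator in 2.0.

```python
from typing import Dict, Iterable, Iterator, List, Type


class SparkShim:
    """Hypothetical interface that hides per-version differences from callers."""

    def shuffle_result(self, values: List[int]) -> Iterator[int]:
        raise NotImplementedError


class Spark1Shim(SparkShim):
    def shuffle_result(self, values: List[int]) -> Iterator[int]:
        # Assumed 1.x-style behaviour: the call yields an Iterable,
        # so the shim normalises it to an iterator for the caller.
        iterable: Iterable[int] = list(values)
        return iter(iterable)


class Spark2Shim(SparkShim):
    def shuffle_result(self, values: List[int]) -> Iterator[int]:
        # Assumed 2.0-style behaviour: the call already yields an Iterator.
        return iter(values)


# Registry keyed by major version; a finer key (major, minor) would be
# needed if minor releases turn out to be incompatible with each other.
_SHIMS: Dict[int, Type[SparkShim]] = {1: Spark1Shim, 2: Spark2Shim}


def load_shim(version: str) -> SparkShim:
    """Pick the shim for a version string such as '1.6.0' or '2.0.0'."""
    major = int(version.split(".")[0])
    try:
        return _SHIMS[major]()
    except KeyError:
        raise ValueError(f"no shim registered for Spark {version}") from None
```

Callers would only ever talk to `SparkShim`, so the cost of each incompatible release is confined to one new subclass and one registry entry.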

> Update Spark version to 2.0.0
> -
>
> Key: HIVE-14029
> URL: https://issues.apache.org/jira/browse/HIVE-14029
> Project: Hive
>  Issue Type: Bug
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>  Labels: Incompatible, TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14029.1.patch, HIVE-14029.2.patch, 
> HIVE-14029.3.patch, HIVE-14029.4.patch, HIVE-14029.5.patch, 
> HIVE-14029.6.patch, HIVE-14029.7.patch, HIVE-14029.8.patch, HIVE-14029.patch
>
>
> There are quite a few new optimizations in Spark 2.0.0. We need to bump up 
> Spark to 2.0.0 to benefit from those performance improvements.
> To update Spark version to 2.0.0, the following changes are required:
> * Spark API updates:
> ** SparkShuffler#call return Iterator instead of Iterable
> ** SparkListener -> JavaSparkListener
> ** InputMetrics constructor doesn’t accept readMethod
> ** Method remoteBlocksFetched and localBlocksFetched in ShuffleReadMetrics 
> return long type instead of integer
> * Dependency upgrade:
> ** Jackson: 2.4.2 -> 2.6.5
> ** Netty version: 4.0.23.Final -> 4.0.29.Final
> ** Scala binary version: 2.10 -> 2.11
> ** Scala version: 2.10.4 -> 2.11.8





[jira] [Assigned] (HIVE-14899) MM: support (or disable) alter table concatenate

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-14899:
---

Assignee: Sergey Shelukhin

> MM: support (or disable) alter table concatenate
> 
>
> Key: HIVE-14899
> URL: https://issues.apache.org/jira/browse/HIVE-14899
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Doesn't make much sense for a table with lots of separate directories, at the 
> first glance. However, it could concatenate files within separate directories 
> if some insert produces a lot.
> Merges between directories are also possible, as long as they take into 
> account correctness of what is committed.





[jira] [Updated] (HIVE-14932) handle bucketing for MM tables

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14932:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to the feature branch after modifying the test. Seems like bucketing 
results in ten thousand million tasks if one is not careful, making the tests 
extremely slow.

> handle bucketing for MM tables
> --
>
> Key: HIVE-14932
> URL: https://issues.apache.org/jira/browse/HIVE-14932
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14932.patch
>
>






[jira] [Updated] (HIVE-14643) handle ctas for the MM tables

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14643:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to the feature branch.

> handle ctas for the MM tables
> -
>
> Key: HIVE-14643
> URL: https://issues.apache.org/jira/browse/HIVE-14643
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14643.patch
>
>






[jira] [Commented] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583371#comment-15583371
 ] 

Hive QA commented on HIVE-14981:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833777/HIVE-14981.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10564 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=143)

[orc_llap.q,vectorization_pushdown.q,correlationoptimizer2.q,cbo_gby_empty.q,vectorization_short_regress.q,identity_project_remove_skip.q,auto_sortmerge_join_1.q,lineage3.q,cross_product_check_1.q,cbo_join.q,vector_struct_in.q,correlationoptimizer6.q,union_remove_26.q,vectorization_13.q,union2.q,groupby2.q,schema_evol_text_vec_table.q,dynpart_sort_opt_vectorization.q,exchgpartition2lel.q,multiMapJoin1.q,sample10.q,vectorized_timestamp_ints_casts.q,vector_char_simple.q,dynpart_sort_optimization_acid.q,auto_sortmerge_join_2.q,bucketizedhiveinputformat.q,leftsemijoin.q,special_character_in_tabnames_1.q,cte_mat_2.q,vectorization_8.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats] 
(batchId=46)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=155)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=204)
org.apache.hive.spark.client.TestSparkClient.testJobSubmission (batchId=263)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1600/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1600/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1600/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833777 - PreCommit-HIVE-Build

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch, HIVE-14981.02.patch
>
>
> No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 is 
> unnecessary.  It caused the LLAP orc_llap.q test to time out on Hive QA because 
> the regular VectorMapJoinOperator is too slow.





[jira] [Commented] (HIVE-14982) Remove some reserved keywords in 2.2

2016-10-17 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583386#comment-15583386
 ] 

Pengcheng Xiong commented on HIVE-14982:


We are moving towards SQL2011 standard compliance.  Those keywords conflict 
with SQL2011 standard.


> Remove some reserved keywords in 2.2
> 
>
> Key: HIVE-14982
> URL: https://issues.apache.org/jira/browse/HIVE-14982
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> It seems that CACHE, DAYOFWEEK, VIEWS are reserved keywords in master. This 
> conflicts with SQL2011 standard.





[jira] [Commented] (HIVE-13931) Add support for HikariCP and replace BoneCP usage with HikariCP

2016-10-17 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583378#comment-15583378
 ] 

Thejas M Nair commented on HIVE-13931:
--

From some system testing at Hortonworks, it seems like this version of 
HikariCP might not be more robust than BoneCP. So it might not make sense to 
make it the default right now. I think it would still be useful to provide 
HikariCP as an option so that users can choose what works better in their 
environment.

@sushanth will you be able to update the patch?


> Add support for HikariCP and replace BoneCP usage with HikariCP
> ---
>
> Key: HIVE-13931
> URL: https://issues.apache.org/jira/browse/HIVE-13931
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-13931.2.patch, HIVE-13931.patch
>
>
> Currently, we use BoneCP as our primary connection pooling mechanism 
> (overridable by users). However, BoneCP is no longer being actively 
> developed, and is considered deprecated, replaced by HikariCP.
> Thus, we should add support for HikariCP, and try to replace our primary 
> usage of BoneCP with it.





[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization

2016-10-17 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583372#comment-15583372
 ] 

Matt McCline commented on HIVE-11394:
-

Reverted.  Giving up for now.

> Enhance EXPLAIN display for vectorization
> -
>
> Key: HIVE-11394
> URL: https://issues.apache.org/jira/browse/HIVE-11394
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, 
> HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, 
> HIVE-11394.09.patch, HIVE-11394.091.patch, HIVE-11394.092.patch, 
> HIVE-11394.093.patch
>
>
> Add detail to the EXPLAIN output showing why a Map and Reduce work is not 
> vectorized.
> New syntax is: EXPLAIN VECTORIZATION \[ONLY\] 
> \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (is vectorization 
> enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators.  E.g. Filter 
> Vectorization.  It includes all information of SUMMARY, too.
> EXPRESSION shows vectorization information for expressions.  E.g. 
> predicateExpression.  It includes all information of SUMMARY and OPERATOR, 
> too.
> DETAIL shows very vectorization information.
> It includes all information of SUMMARY, OPERATOR, and EXPRESSION too.
> The optional clause defaults are not ONLY and SUMMARY.
> ---
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization 
> sections)
> Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION 
> SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for 
> GROUPBY operator: Data type struct of 
> Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": 
> "false" which says a node has a GROUP BY with an AVG or some other aggregator 
> that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators 
> are row-mode.  I.e. not vector output.
> If "usesVectorUDFAdaptor:": "false" were true, it would say that at least one 
> vectorized expression is using VectorUDFAdaptor.
> And, "allNative:": "false" will be true when all operators are native.  
> Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE SINK are 
> conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Tez
> ...
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: alltypesorc
>   Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
>   Select Operator
> expressions: cint (type: int)
> outputColumnNames: cint
> Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Group By Operator
>   keys: cint (type: int)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Map Vectorization:
> enabled: true
> enabledConditionsMet: 
> hive.vectorized.use.vectorized.input.format IS true
> groupByVectorOutput: true
> inputFileFormats: 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> allNative: false
> usesVectorUDFAdaptor: false
>   

[jira] [Reopened] (HIVE-11394) Enhance EXPLAIN display for vectorization

2016-10-17 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reopened HIVE-11394:
-

> Enhance EXPLAIN display for vectorization
> -
>
> Key: HIVE-11394
> URL: https://issues.apache.org/jira/browse/HIVE-11394
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, 
> HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, 
> HIVE-11394.09.patch, HIVE-11394.091.patch, HIVE-11394.092.patch, 
> HIVE-11394.093.patch
>
>
> Add detail to the EXPLAIN output showing why a Map and Reduce work is not 
> vectorized.
> New syntax is: EXPLAIN VECTORIZATION \[ONLY\] 
> \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (is vectorization 
> enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators.  E.g. Filter 
> Vectorization.  It includes all information of SUMMARY, too.
> EXPRESSION shows vectorization information for expressions.  E.g. 
> predicateExpression.  It includes all information of SUMMARY and OPERATOR, 
> too.
> DETAIL shows very vectorization information.
> It includes all information of SUMMARY, OPERATOR, and EXPRESSION too.
> The optional clause defaults are not ONLY and SUMMARY.
> ---
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization 
> sections)
> Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION 
> SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for 
> GROUPBY operator: Data type struct of 
> Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": 
> "false" which says a node has a GROUP BY with an AVG or some other aggregator 
> that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators 
> are row-mode.  I.e. not vector output.
> If "usesVectorUDFAdaptor:": "false" were true, it would say that at least one 
> vectorized expression is using VectorUDFAdaptor.
> And, "allNative:": "false" will be true when all operators are native.  
> Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE SINK are 
> conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Tez
> ...
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: alltypesorc
>   Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
>   Select Operator
> expressions: cint (type: int)
> outputColumnNames: cint
> Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Group By Operator
>   keys: cint (type: int)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Map Vectorization:
> enabled: true
> enabledConditionsMet: 
> hive.vectorized.use.vectorized.input.format IS true
> groupByVectorOutput: true
> inputFileFormats: 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> allNative: false
> usesVectorUDFAdaptor: false
> vectorized: true
> Reducer 2 
> Execution 

[jira] [Commented] (HIVE-14029) Update Spark version to 2.0.0

2016-10-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583356#comment-15583356
 ] 

Xuefu Zhang commented on HIVE-14029:


[~spena], Keeping backward compatibility (b/c) is a good thing in general. 
Before we take on the effort (which seems like a lot), I think we should 
clearly understand and define what b/c means in this case. Spark is releasing 
rapidly without much b/c in mind. So far, Hive on Spark has depended on Spark 
1.2, 1.3, 1.4, 1.5, and 1.6 in turn. I'm not sure which versions of Spark Hive 
has been released with, but one thing is clear: Spark isn't b/c between these 
releases. Until the Spark community has a good sense of keeping b/c in their 
APIs, it's going to be very hard and burdensome for Hive to maintain support 
for different Spark releases, not to mention the library dependency issues we 
have had.

I'm okay with starting to think about a shim layer to support multiple 
versions of Spark, but it sounds daunting to me given the dynamics of the 
Spark project.

> Update Spark version to 2.0.0
> -
>
> Key: HIVE-14029
> URL: https://issues.apache.org/jira/browse/HIVE-14029
> Project: Hive
>  Issue Type: Bug
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>  Labels: Incompatible, TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14029.1.patch, HIVE-14029.2.patch, 
> HIVE-14029.3.patch, HIVE-14029.4.patch, HIVE-14029.5.patch, 
> HIVE-14029.6.patch, HIVE-14029.7.patch, HIVE-14029.8.patch, HIVE-14029.patch
>
>
> There are quite a few new optimizations in Spark 2.0.0. We need to bump up 
> Spark to 2.0.0 to benefit from those performance improvements.
> To update Spark version to 2.0.0, the following changes are required:
> * Spark API updates:
> ** SparkShuffler#call return Iterator instead of Iterable
> ** SparkListener -> JavaSparkListener
> ** InputMetrics constructor doesn’t accept readMethod
> ** Method remoteBlocksFetched and localBlocksFetched in ShuffleReadMetrics 
> return long type instead of integer
> * Dependency upgrade:
> ** Jackson: 2.4.2 -> 2.6.5
> ** Netty version: 4.0.23.Final -> 4.0.29.Final
> ** Scala binary version: 2.10 -> 2.11
> ** Scala version: 2.10.4 -> 2.11.8





[jira] [Commented] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-17 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583344#comment-15583344
 ] 

Matt McCline commented on HIVE-14981:
-

Still have a problem.

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch, HIVE-14981.02.patch
>
>
> No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 is 
> unnecessary.  It caused the LLAP orc_llap.q test to time out on Hive QA because 
> the regular VectorMapJoinOperator is too slow.





[jira] [Commented] (HIVE-14932) handle bucketing for MM tables

2016-10-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583299#comment-15583299
 ] 

Sergey Shelukhin commented on HIVE-14932:
-

Hmm, for some reason some queries in mm_all2 are very slow. This might be 
specific to bucketing or tablesample.

> handle bucketing for MM tables
> --
>
> Key: HIVE-14932
> URL: https://issues.apache.org/jira/browse/HIVE-14932
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14932.patch
>
>






[jira] [Commented] (HIVE-14029) Update Spark version to 2.0.0

2016-10-17 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583248#comment-15583248
 ] 

Sergio Peña commented on HIVE-14029:


[~xuefuz] [~Ferd] [~lirui] Back to the compatibility discussion, I think we 
should continue keeping Spark 1.x compatibility in the Hive 2.x series (as we 
did in Hive 1.x with Hadoop 1.x/2.x). If there are users on Spark 1.x, then 
they won't be able to upgrade to Hive 2.2, and they do not necessarily need to 
upgrade to Spark 2.0, as it is still a new release and not many people upgrade 
to a 2.0 version immediately.

What do you think about this, guys? Is it important to keep compatibility in 
Hive 2.x until we release Hive 3.0 in the future?

> Update Spark version to 2.0.0
> -
>
> Key: HIVE-14029
> URL: https://issues.apache.org/jira/browse/HIVE-14029
> Project: Hive
>  Issue Type: Bug
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>  Labels: Incompatible, TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14029.1.patch, HIVE-14029.2.patch, 
> HIVE-14029.3.patch, HIVE-14029.4.patch, HIVE-14029.5.patch, 
> HIVE-14029.6.patch, HIVE-14029.7.patch, HIVE-14029.8.patch, HIVE-14029.patch
>
>
> There are quite a few new optimizations in Spark 2.0.0. We need to bump 
> Spark up to 2.0.0 to benefit from those performance improvements.
> To update the Spark version to 2.0.0, the following changes are required:
> * Spark API updates:
> ** SparkShuffler#call returns Iterator instead of Iterable
> ** SparkListener -> JavaSparkListener
> ** InputMetrics constructor doesn’t accept readMethod
> ** Methods remoteBlocksFetched and localBlocksFetched in ShuffleReadMetrics 
> return long instead of integer
> * Dependency upgrade:
> ** Jackson: 2.4.2 -> 2.6.5
> ** Netty version: 4.0.23.Final -> 4.0.29.Final
> ** Scala binary version: 2.10 -> 2.11
> ** Scala version: 2.10.4 -> 2.11.8





[jira] [Commented] (HIVE-14970) repeated insert into is broken for buckets (incorrect results for tablesample)

2016-10-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583230#comment-15583230
 ] 

Sergey Shelukhin commented on HIVE-14970:
-

According to [~gopalv] this is specific to table sample.
Perhaps we should disable or remove SamplePruner entirely, and check other 
places to make sure that no one else makes this assumption (one file per bucket, 
in order).
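
To make the "one file per bucket, in order" assumption concrete, here is a minimal sketch of grouping files by the bucket id encoded in their names, so that `_copy_N` files stay with their bucket. This is not SamplePruner's logic or any Hive API: the class and method names are hypothetical, and the file-name shape is only assumed from the listing in the quoted description (for example `00_0` and `00_0_copy_1`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive.
class BucketFileSketch {
    // Assumed name shape: <bucket digits>_<attempt>[_copy_<n>], e.g. 00_0 or 00_0_copy_1.
    private static final Pattern NAME = Pattern.compile("^(\\d+)_\\d+(?:_copy_\\d+)?$");

    // Returns the bucket id encoded in the file name, or -1 if the name does not match.
    public static int bucketOf(String fileName) {
        Matcher m = NAME.matcher(fileName);
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }

    // Groups file names by bucket id instead of relying on their position in a listing.
    public static Map<Integer, List<String>> groupByBucket(List<String> fileNames) {
        Map<Integer, List<String>> byBucket = new TreeMap<>();
        for (String name : fileNames) {
            byBucket.computeIfAbsent(bucketOf(name), k -> new ArrayList<>()).add(name);
        }
        return byBucket;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("00_0", "00_0_copy_1", "01_0", "01_0_copy_1");
        System.out.println(groupByBucket(files));
    }
}
```

With four files for two buckets, indexing the sorted listing by position would pick a copy of bucket 0 as "bucket 2", whereas grouping by the parsed id keeps both files of each bucket together.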

> repeated insert into is broken for buckets (incorrect results for tablesample)
> --
>
> Key: HIVE-14970
> URL: https://issues.apache.org/jira/browse/HIVE-14970
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Critical
>
> Running on a regular CLI driver
> {noformat}
> CREATE TABLE src_bucket(key STRING, value STRING) CLUSTERED BY (key) SORTED 
> BY (key) INTO 2 BUCKETS;
> insert into table src_bucket select key,value from srcpart limit 10;
> dfs -ls ${hiveconf:hive.metastore.warehouse.dir}/src_bucket/;
> select *, INPUT__FILE__NAME from src_bucket;
> select * from src_bucket tablesample (bucket 1 out of 2) s;
> select * from src_bucket tablesample (bucket 2 out of 2) s;
> insert into table src_bucket select key,value from srcpart limit 10;
> dfs -ls ${hiveconf:hive.metastore.warehouse.dir}/src_bucket/;
> select *, INPUT__FILE__NAME from src_bucket;
> select * from src_bucket tablesample (bucket 1 out of 2) s;
> select * from src_bucket tablesample (bucket 2 out of 2) s;
> {noformat}
> Results in the following (with masking disabled and grepping away the noise).
> Looks like bucket mapping completely breaks due to extra files, which may 
> have implications for all the optimizations that depend on them.
> This should work or at least fail if this is not supported.
> {noformat}
> PREHOOK: query: CREATE TABLE src_bucket(key STRING, value STRING) CLUSTERED 
> BY (key) SORTED BY (key) INTO 2 BUCKETS
> PREHOOK: query: insert into table src_bucket select key,value from srcpart 
> limit 10
> Found 2 items
> -rwxr-xr-x   1 sergey staff 46 2016-10-14 16:09 
> pfile:///Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> -rwxr-xr-x   1 sergey staff 68 2016-10-14 16:09 
> pfile:///Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> PREHOOK: query: select *, INPUT__FILE__NAME from src_bucket
> 165   val_165 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 255   val_255 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 484   val_484 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 86   val_86 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 238   val_238 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> 27   val_27 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> 278   val_278 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> 311   val_311 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> 409   val_409 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> 98   val_98 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> PREHOOK: query: select * from src_bucket tablesample (bucket 1 out of 2) s
> 165   val_165
> 255   val_255
> 484   val_484
> 86   val_86
> PREHOOK: query: select * from src_bucket tablesample (bucket 2 out of 2) s
> 238   val_238
> 27   val_27
> 278   val_278
> 311   val_311
> 409   val_409
> 98   val_98
> {noformat}
> So far so good.
> {noformat}
> PREHOOK: query: insert into table src_bucket select key,value from srcpart 
> limit 10
> Found 4 items
> -rwxr-xr-x   1 sergey staff 46 2016-10-14 16:09 
> pfile:///Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> -rwxr-xr-x   1 sergey staff 46 2016-10-14 16:09 
> pfile:///Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0_copy_1
> -rwxr-xr-x   1 sergey staff 68 2016-10-14 16:09 
> pfile:///Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> -rwxr-xr-x   1 sergey staff 68 2016-10-14 16:09 
> pfile:///Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0_copy_1
> PREHOOK: query: select *, INPUT__FILE__NAME from src_bucket
> 165   val_165 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 255   val_255 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 484   val_484 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 86   val_86 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 165  

[jira] [Updated] (HIVE-14970) repeated insert into is broken for buckets (incorrect results for tablesample)

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14970:

Summary: repeated insert into is broken for buckets (incorrect results for 
tablesample)  (was: repeated insert into is broken for buckets (incorrect 
results))

> repeated insert into is broken for buckets (incorrect results for tablesample)
> --
>
> Key: HIVE-14970
> URL: https://issues.apache.org/jira/browse/HIVE-14970
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Critical
>
> Running on a regular CLI driver
> {noformat}
> CREATE TABLE src_bucket(key STRING, value STRING) CLUSTERED BY (key) SORTED 
> BY (key) INTO 2 BUCKETS;
> insert into table src_bucket select key,value from srcpart limit 10;
> dfs -ls ${hiveconf:hive.metastore.warehouse.dir}/src_bucket/;
> select *, INPUT__FILE__NAME from src_bucket;
> select * from src_bucket tablesample (bucket 1 out of 2) s;
> select * from src_bucket tablesample (bucket 2 out of 2) s;
> insert into table src_bucket select key,value from srcpart limit 10;
> dfs -ls ${hiveconf:hive.metastore.warehouse.dir}/src_bucket/;
> select *, INPUT__FILE__NAME from src_bucket;
> select * from src_bucket tablesample (bucket 1 out of 2) s;
> select * from src_bucket tablesample (bucket 2 out of 2) s;
> {noformat}
> Results in the following (with masking disabled and grepping away the noise).
> Looks like bucket mapping completely breaks due to extra files, which may 
> have implications for all the optimizations that depend on them.
> This should work or at least fail if this is not supported.
> {noformat}
> PREHOOK: query: CREATE TABLE src_bucket(key STRING, value STRING) CLUSTERED 
> BY (key) SORTED BY (key) INTO 2 BUCKETS
> PREHOOK: query: insert into table src_bucket select key,value from srcpart 
> limit 10
> Found 2 items
> -rwxr-xr-x   1 sergey staff 46 2016-10-14 16:09 
> pfile:///Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> -rwxr-xr-x   1 sergey staff 68 2016-10-14 16:09 
> pfile:///Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> PREHOOK: query: select *, INPUT__FILE__NAME from src_bucket
> 165   val_165 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 255   val_255 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 484   val_484 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 86   val_86 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 238   val_238 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> 27   val_27 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> 278   val_278 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> 311   val_311 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> 409   val_409 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> 98   val_98 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> PREHOOK: query: select * from src_bucket tablesample (bucket 1 out of 2) s
> 165   val_165
> 255   val_255
> 484   val_484
> 86   val_86
> PREHOOK: query: select * from src_bucket tablesample (bucket 2 out of 2) s
> 238   val_238
> 27   val_27
> 278   val_278
> 311   val_311
> 409   val_409
> 98   val_98
> {noformat}
> So far so good.
> {noformat}
> PREHOOK: query: insert into table src_bucket select key,value from srcpart 
> limit 10
> Found 4 items
> -rwxr-xr-x   1 sergey staff 46 2016-10-14 16:09 
> pfile:///Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> -rwxr-xr-x   1 sergey staff 46 2016-10-14 16:09 
> pfile:///Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0_copy_1
> -rwxr-xr-x   1 sergey staff 68 2016-10-14 16:09 
> pfile:///Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0
> -rwxr-xr-x   1 sergey staff 68 2016-10-14 16:09 
> pfile:///Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/01_0_copy_1
> PREHOOK: query: select *, INPUT__FILE__NAME from src_bucket
> 165   val_165 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 255   val_255 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 484   val_484 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 86   val_86 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0
> 165   val_165 
> pfile:/Users/sergey/git/hive/itests/qtest/target/warehouse/src_bucket/00_0_copy_1
> 

[jira] [Updated] (HIVE-14959) Fix DISTINCT with windowing when CBO is enabled/disabled

2016-10-17 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14959:
---
       Resolution: Fixed
    Fix Version/s: 2.1.1
                   2.2.0
           Status: Resolved  (was: Patch Available)

Pushed to master, branch-2.1. Thanks [~ashutoshc] for the review!

> Fix DISTINCT with windowing when CBO is enabled/disabled
> 
>
> Key: HIVE-14959
> URL: https://issues.apache.org/jira/browse/HIVE-14959
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14959.01.patch, HIVE-14959.patch
>
>
> For instance, the following query with CBO off:
> {code:sql}
> select distinct last_value(i) over ( partition by si order by i ),
>   first_value(t)  over ( partition by si order by i )
> from over10k limit 50;
> {code}
> will fail, with the following message:
> {noformat}
> SELECT DISTINCT not allowed in the presence of windowing functions when CBO 
> is off
> {noformat}





[jira] [Updated] (HIVE-13557) Make interval keyword optional while specifying DAY in interval arithmetic

2016-10-17 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-13557:

Attachment: HIVE-13557.1.patch

> Make interval keyword optional while specifying DAY in interval arithmetic
> --
>
> Key: HIVE-13557
> URL: https://issues.apache.org/jira/browse/HIVE-13557
> Project: Hive
>  Issue Type: Sub-task
>  Components: Types
>Reporter: Ashutosh Chauhan
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13557.1.patch, HIVE-13557.1.patch
>
>
> Currently we support expressions like: {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31'))  - INTERVAL '30' DAY) AND 
> DATE('2000-01-31')
> {code}
> We should support:
> {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) + (-30) DAY) AND 
> DATE('2000-01-31')
> {code}
>   
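
The two WHERE clauses quoted above are meant to compute the same lower bound. A minimal sketch of that equivalence in plain Java, using `java.time` rather than Hive's own interval types (the class and method names are made up for illustration):

```java
import java.time.LocalDate;

// Hypothetical illustration; this is not Hive code.
class IntervalDaySketch {
    // Mirrors DATE('2000-01-31') - INTERVAL '30' DAY.
    public static LocalDate minusInterval(LocalDate date, long days) {
        return date.minusDays(days);
    }

    // Mirrors DATE('2000-01-31') + (-30) DAY: adding a signed day count.
    public static LocalDate plusSignedDays(LocalDate date, long signedDays) {
        return date.plusDays(signedDays);
    }

    public static void main(String[] args) {
        LocalDate end = LocalDate.of(2000, 1, 31);
        System.out.println(minusInterval(end, 30));
        System.out.println(plusSignedDays(end, -30));
    }
}
```

Both calls yield the same date, which is the behavior the proposed syntax is expected to preserve.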





[jira] [Commented] (HIVE-14893) vectorized execution may convert LongCV to smaller types incorrectly

2016-10-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583221#comment-15583221
 ] 

Sergey Shelukhin commented on HIVE-14893:
-

[~mmccline] [~hagleitn] la la la

> vectorized execution may convert LongCV to smaller types incorrectly
> 
>
> Key: HIVE-14893
> URL: https://issues.apache.org/jira/browse/HIVE-14893
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Matt McCline
>Priority: Critical
>
> See the results for vectorized in decimal_11 test added in HIVE-14863. 
> We cast decimal to various int types; the cast is specialized for each type 
> on non-vectorized side; on vectorized side, it's only specialized for 
> LongColumnVector, so all the decimals get converted to longs. 
> LongColumnVector gets converted to a proper type in some other mysterious 
> place later, and tiny/small/regular ints become truncated at that point.
> Logically, I am not sure if every vectorized expression should be aware of 
> the underlying type for the LongColumnVector (that seems implausible - I am 
> not sure if type information is even available, and if yes it doesn't look 
> like it's used in other places), or if the long-to-smaller-type automatic 
> conversion should be fixed to produce nulls on overflow.
> However it seems like a good idea to do the latter in any case, to have a 
> catch-all for all the vectorized expressions that might treat LongCV as 
> representing longs at all times.
> Update - I see 10s of places in the code where it does something like this: 
> {noformat}(int) ((LongColumnVector) 
> batch.cols[projectionColumnNum]).vector[adjustedIndex]{noformat}
> Also for other types. These might all be problematic.
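
The truncation described above is ordinary Java narrowing behavior. The sketch below contrasts a plain `(int)` cast of a long with a range-checked variant that yields null on overflow, which is the second option the description considers. It uses no Hive classes; the class and method names are hypothetical.

```java
// Hypothetical illustration; LongColumnVector is not used here.
class NarrowingCastSketch {
    // Plain narrowing cast: keeps only the low 32 bits, so out-of-range values wrap silently.
    public static int truncatingCast(long value) {
        return (int) value;
    }

    // Range-checked variant: returns null when the value does not fit in an int.
    public static Integer castOrNull(long value) {
        if (value < Integer.MIN_VALUE || value > Integer.MAX_VALUE) {
            return null;
        }
        return (int) value;
    }

    public static void main(String[] args) {
        long tooBig = 4294967297L; // 2^32 + 1
        System.out.println(truncatingCast(tooBig)); // low 32 bits only
        System.out.println(castOrNull(tooBig));     // null instead of a wrapped value
        System.out.println(castOrNull(42L));        // fits, so it is returned unchanged
    }
}
```

A catch-all of the second kind would sit at the point where the long values are converted to the narrower column type, so individual vectorized expressions would not need to know the underlying type.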





[jira] [Updated] (HIVE-14927) Remove code duplication from tests in TestLdapAtnProviderWithMiniDS

2016-10-17 Thread Illya Yalovyy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Illya Yalovyy updated HIVE-14927:
-
Status: Patch Available  (was: Open)

> Remove code duplication from tests in TestLdapAtnProviderWithMiniDS
> ---
>
> Key: HIVE-14927
> URL: https://issues.apache.org/jira/browse/HIVE-14927
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
> Attachments: HIVE-14927.1.patch, HIVE-14927.2.patch, 
> HIVE-14927.3.patch
>
>
> * Extract inner class User and implement a proper builder for it.
> * Extract all common code to LdapAuthenticationTestCase class 
> ** setting up the test case
> ** executing test case
> ** result validation





[jira] [Updated] (HIVE-14643) handle ctas for the MM tables

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14643:

Summary: handle ctas for the MM tables  (was: handle ctas of the MM tables)

> handle ctas for the MM tables
> -
>
> Key: HIVE-14643
> URL: https://issues.apache.org/jira/browse/HIVE-14643
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14643.patch
>
>






[jira] [Updated] (HIVE-14643) handle ctas of the MM tables

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14643:

Summary: handle ctas of the MM tables  (was: handle ctas)

> handle ctas of the MM tables
> 
>
> Key: HIVE-14643
> URL: https://issues.apache.org/jira/browse/HIVE-14643
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14643.patch
>
>






[jira] [Commented] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583199#comment-15583199
 ] 

Hive QA commented on HIVE-14981:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833623/HIVE-14981.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10563 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=143)

[orc_llap.q,vectorization_pushdown.q,correlationoptimizer2.q,cbo_gby_empty.q,vectorization_short_regress.q,identity_project_remove_skip.q,auto_sortmerge_join_1.q,lineage3.q,cross_product_check_1.q,cbo_join.q,vector_struct_in.q,correlationoptimizer6.q,union_remove_26.q,vectorization_13.q,union2.q,groupby2.q,schema_evol_text_vec_table.q,dynpart_sort_opt_vectorization.q,exchgpartition2lel.q,multiMapJoin1.q,sample10.q,vectorized_timestamp_ints_casts.q,vector_char_simple.q,dynpart_sort_optimization_acid.q,auto_sortmerge_join_2.q,bucketizedhiveinputformat.q,leftsemijoin.q,special_character_in_tabnames_1.q,cte_mat_2.q,vectorization_8.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats] 
(batchId=46)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=155)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=204)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeTokenAuth 
(batchId=217)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1599/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1599/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1599/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833623 - PreCommit-HIVE-Build

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch, HIVE-14981.02.patch
>
>
> The No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 is 
> unnecessary.  It caused the Llap orc_llap.q test to time out on Hive QA because 
> the regular VectorMapJoinOperator is too slow.





[jira] [Commented] (HIVE-14985) Remove UDF-s created during test runs

2016-10-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583173#comment-15583173
 ] 

Sergey Shelukhin commented on HIVE-14985:
-

+1 pending tests

> Remove UDF-s created during test runs
> -
>
> Key: HIVE-14985
> URL: https://issues.apache.org/jira/browse/HIVE-14985
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-14985.patch
>
>
> When I tried to run llap_udf.q repeatedly from my IDE, the first run 
> passed, but the following runs failed. 
> The query file does not remove the functions it creates, which could 
> cause problems for follow-up tests.
> The same problem could happen if a query test fails in the middle of the 
> script, and even though the file contains the removal sql commands, those are 
> not executed.
> It might be a good idea to clean up not just tables and keys, but functions 
> created during the test run.





[jira] [Commented] (HIVE-14979) Removing stale Zookeeper locks at HiveServer2 initialization

2016-10-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583190#comment-15583190
 ] 

Sergey Shelukhin commented on HIVE-14979:
-

Hmm... can't the nodes be made ephemeral in ZK, if we indeed want to release 
them when we crash?

> Removing stale Zookeeper locks at HiveServer2 initialization
> 
>
> Key: HIVE-14979
> URL: https://issues.apache.org/jira/browse/HIVE-14979
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14979.patch
>
>
> HiveServer2 can use Zookeeper to store tokens that indicate that particular 
> tables are locked, by creating persistent Zookeeper objects. 
> A problem can occur when a HiveServer2 instance creates a lock on a table and 
> then crashes ("Out of Memory", for example), so the locks 
> are not released in Zookeeper. Such a lock will then remain until it is 
> manually cleared by an admin.
> There should be a way to remove stale locks at HiveServer2 initialization, 
> making the admin's life easier.





[jira] [Commented] (HIVE-14980) Minor compaction when triggered simultaneously on the same table/partition deletes data

2016-10-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583187#comment-15583187
 ] 

Sergey Shelukhin commented on HIVE-14980:
-

cc [~ekoifman] 

Should the compactor just use locks?

> Minor compaction when triggered simultaneously on the same table/partition 
> deletes data
> ---
>
> Key: HIVE-14980
> URL: https://issues.apache.org/jira/browse/HIVE-14980
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 2.1.0
>Reporter: Mahipal Jupalli
>Assignee: Mahipal Jupalli
>Priority: Critical
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> I have two tables (TABLEA, TABLEB). If I manually trigger compaction after 
> each INSERT into TABLEB from TABLEA, the compactions are triggered on random 
> metastores asynchronously and step on each other, which causes 
> data to be deleted.
> Example here: 
> TABLEA - has 10k rows. 
> insert into mj.tableb select * from mj.tablea;
> alter table mj.tableb compact 'MINOR';
> insert into mj.tableb select * from mj.tablea;
> alter table mj.tableb compact 'MINOR';
> Once all the compactions are complete, I should ideally see 20k rows in 
> TABLEB. But I see only 10k rows (Only the rows INSERTED before the last 
> compaction persist, the old rows are deleted. I believe the old delta files 
> are deleted). 
> To further confirm the bug, if I do only one compaction after two inserts, I 
> see 20k rows in TABLEB.
> Proposed Fix:
> I have identified the bug in the code; it requires an additional check in the 
> org.apache.hadoop.hive.ql.txn.compactor.Worker class for any active 
> compactions on the table/partition. I will share the details of the fix once 
> I test it.
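
The "check for any active compactions" idea can be sketched as a guard that refuses to start a second compaction on the same target. This is only an in-process illustration with hypothetical names, not the reporter's fix and not code from the Worker class. Because the report says compactions run on different metastores, a real check would presumably need shared state (for example the metastore database or locks) rather than an in-memory set.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-process guard; a multi-metastore setup would need shared state instead.
class CompactionGuardSketch {
    // Targets (table or table/partition names) that currently have a compaction running.
    private final Set<String> active = ConcurrentHashMap.newKeySet();

    // Returns true only for the first caller; add() is atomic on a concurrent set.
    public boolean tryStart(String target) {
        return active.add(target);
    }

    // Marks the compaction on this target as finished so a later one may start.
    public void finish(String target) {
        active.remove(target);
    }

    public static void main(String[] args) {
        CompactionGuardSketch guard = new CompactionGuardSketch();
        String target = "mj.tableb";
        if (guard.tryStart(target)) {
            try {
                System.out.println("compacting " + target);
            } finally {
                guard.finish(target); // always release, even if the compaction fails
            }
        } else {
            System.out.println("skipping, compaction already active on " + target);
        }
    }
}
```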





[jira] [Commented] (HIVE-14982) Remove some reserved keywords in 2.2

2016-10-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583183#comment-15583183
 ] 

Sergey Shelukhin commented on HIVE-14982:
-

Hmm. Aren't implementations allowed to have their own reserved keywords? I 
suspect all the RDBMS-es have some.

> Remove some reserved keywords in 2.2
> 
>
> Key: HIVE-14982
> URL: https://issues.apache.org/jira/browse/HIVE-14982
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> It seems that CACHE, DAYOFWEEK, VIEWS are reserved keywords in master. This 
> conflicts with the SQL:2011 standard.





[jira] [Updated] (HIVE-14959) Fix DISTINCT with windowing when CBO is enabled/disabled

2016-10-17 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14959:
---
Summary: Fix DISTINCT with windowing when CBO is enabled/disabled  (was: 
Support distinct with windowing when CBO is disabled)

> Fix DISTINCT with windowing when CBO is enabled/disabled
> 
>
> Key: HIVE-14959
> URL: https://issues.apache.org/jira/browse/HIVE-14959
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14959.01.patch, HIVE-14959.patch
>
>
> For instance, the following query with CBO off:
> {code:sql}
> select distinct last_value(i) over ( partition by si order by i ),
>   first_value(t)  over ( partition by si order by i )
> from over10k limit 50;
> {code}
> will fail, with the following message:
> {noformat}
> SELECT DISTINCT not allowed in the presence of windowing functions when CBO 
> is off
> {noformat}





[jira] [Commented] (HIVE-13316) Upgrade to Calcite 1.10

2016-10-17 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583168#comment-15583168
 ] 

Ashutosh Chauhan commented on HIVE-13316:
-

Another thing to consider is possibly adding an interface in Hive for TS that 
both DruidQuery and HiveTableScan can extend.
The patch looks good otherwise. +1

> Upgrade to Calcite 1.10
> ---
>
> Key: HIVE-13316
> URL: https://issues.apache.org/jira/browse/HIVE-13316
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, 
> HIVE-13316.05.patch, HIVE-13316.07.patch, HIVE-13316.08.patch, 
> HIVE-13316.09.patch, HIVE-13316.10.patch, HIVE-13316.patch
>
>






[jira] [Assigned] (HIVE-14956) Parallelize TestHCatLoader

2016-10-17 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta reassigned HIVE-14956:
---

Assignee: Vaibhav Gumashta

> Parallelize TestHCatLoader
> --
>
> Key: HIVE-14956
> URL: https://issues.apache.org/jira/browse/HIVE-14956
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>






[jira] [Commented] (HIVE-14891) Parallelize TestHCatStorer

2016-10-17 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583144#comment-15583144
 ] 

Vaibhav Gumashta commented on HIVE-14891:
-

For future reference: new formats won't need a new testcase. The class 
TestHCatStorer still works as a parameterized class, with all tests disabled 
for Avro, RC, Text, SequenceFile, Orc and Parquet. In case a new format is 
added, which HCat supports, we'll need to add a new testcase if we wish to 
parallelize it like others. 

> Parallelize TestHCatStorer
> --
>
> Key: HIVE-14891
> URL: https://issues.apache.org/jira/browse/HIVE-14891
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog, Testing Infrastructure
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 2.2.0
>
> Attachments: HIVE-14891.1.patch, HIVE-14891.1.patch, 
> HIVE-14891.1.patch
>
>
> Currently TestHCatStorer runs as a parameterized test, where it runs the same 
> tests for each storage format but within the same junit test case. This 
> prevents it from being parallelized using ptest where parallelism granularity 
> is at a test case level. Instead of using parameterized tests, it makes sense 
> to create a new test case for each storage format.





[jira] [Updated] (HIVE-14891) Parallelize TestHCatStorer

2016-10-17 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-14891:

Fix Version/s: 2.2.0

> Parallelize TestHCatStorer
> --
>
> Key: HIVE-14891
> URL: https://issues.apache.org/jira/browse/HIVE-14891
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog, Testing Infrastructure
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 2.2.0
>
> Attachments: HIVE-14891.1.patch, HIVE-14891.1.patch, 
> HIVE-14891.1.patch
>
>
> Currently TestHCatStorer runs as a parameterized test, where it runs the same 
> tests for each storage format but within the same junit test case. This 
> prevents it from being parallelized using ptest where parallelism granularity 
> is at a test case level. Instead of using parameterized tests, it makes sense 
> to create a new test case for each storage format.





[jira] [Updated] (HIVE-14891) Parallelize TestHCatStorer

2016-10-17 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-14891:

  Resolution: Fixed
Hadoop Flags: Reviewed
      Status: Resolved  (was: Patch Available)

Committed to master. Thanks [~sseth]

> Parallelize TestHCatStorer
> --
>
> Key: HIVE-14891
> URL: https://issues.apache.org/jira/browse/HIVE-14891
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog, Testing Infrastructure
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-14891.1.patch, HIVE-14891.1.patch, 
> HIVE-14891.1.patch
>
>
> Currently TestHCatStorer runs as a parameterized test, where it runs the same 
> tests for each storage format but within the same junit test case. This 
> prevents it from being parallelized using ptest where parallelism granularity 
> is at a test case level. Instead of using parameterized tests, it makes sense 
> to create a new test case for each storage format.





[jira] [Updated] (HIVE-14927) Remove code duplication from tests in TestLdapAtnProviderWithMiniDS

2016-10-17 Thread Illya Yalovyy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Illya Yalovyy updated HIVE-14927:
-
Status: Open  (was: Patch Available)

> Remove code duplication from tests in TestLdapAtnProviderWithMiniDS
> ---
>
> Key: HIVE-14927
> URL: https://issues.apache.org/jira/browse/HIVE-14927
> Project: Hive
> Issue Type: Improvement
> Components: Test
> Reporter: Illya Yalovyy
> Assignee: Illya Yalovyy
> Attachments: HIVE-14927.1.patch, HIVE-14927.2.patch, HIVE-14927.3.patch
>
>
> * Extract inner class User and implement a proper builder for it.
> * Extract all common code to LdapAuthenticationTestCase class:
> ** setting up the test case
> ** executing the test case
> ** result validation
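
For readers unfamiliar with the builder idea in the first bullet, here is a minimal sketch of what an extracted User class with a builder could look like. This is not the code from the attached patches; the field names (dn, id, password) are assumptions chosen only for illustration.

```java
import java.util.Objects;

// Hypothetical immutable test user; the field names are assumptions, not taken from the patch.
final class User {
    private final String dn;
    private final String id;
    private final String password;

    private User(Builder b) {
        // Every field is required, so a forgotten setter fails at build time.
        this.dn = Objects.requireNonNull(b.dn, "dn");
        this.id = Objects.requireNonNull(b.id, "id");
        this.password = Objects.requireNonNull(b.password, "password");
    }

    static Builder builder() { return new Builder(); }

    String getDn() { return dn; }
    String getId() { return id; }
    String getPassword() { return password; }

    static final class Builder {
        private String dn;
        private String id;
        private String password;

        Builder dn(String dn) { this.dn = dn; return this; }
        Builder id(String id) { this.id = id; return this; }
        Builder password(String password) { this.password = password; return this; }

        // Throws NullPointerException if a required field was never set.
        User build() { return new User(this); }
    }
}
```

A test would then create its fixtures with something like User.builder().dn(...).id(...).password(...).build() instead of repeating constructor calls with positional string arguments.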





[jira] [Commented] (HIVE-14837) JDBC: standalone jar is missing hadoop core dependencies

2016-10-17 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583112#comment-15583112
 ] 

Gopal V commented on HIVE-14837:


Run the sample code from
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-RunningtheJDBCSampleCode
with just the hive-jdbc-standalone jar as a dependency, without explicitly 
copying the extra jars from the Hive build dirs (the docs describe the missing 
jars).
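
A quick way to confirm which classes are absent from a given classpath is a small probe like the one below. This is only a sketch: ClasspathCheck is a hypothetical helper, and the two class names are the ones that appear in the stack trace on this issue.

```java
// Minimal sketch: report whether the classes the driver needs can be loaded.
final class ClasspathCheck {
    static boolean isPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "org.apache.hive.jdbc.HiveDriver",
            "org.apache.hadoop.conf.Configuration"
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isPresent(name) ? "found" : "MISSING"));
        }
    }
}
```

Running it with only the standalone jar on the classpath should show whether org.apache.hadoop.conf.Configuration is the only missing piece or just the first one the driver happens to hit.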

> JDBC: standalone jar is missing hadoop core dependencies
> 
>
> Key: HIVE-14837
> URL: https://issues.apache.org/jira/browse/HIVE-14837
> Project: Hive
> Issue Type: Bug
> Components: JDBC
> Affects Versions: 2.2.0
> Reporter: Gopal V
>
> {code}
> 2016/09/24 00:31:57 ERROR - jmeter.threads.JMeterThread: Test failed! 
> java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
> at org.apache.hive.jdbc.HiveConnection.createUnderlyingTransport(HiveConnection.java:418)
> at org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:438)
> at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:225)
> at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:182)
> at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
> {code}





[jira] [Updated] (HIVE-14927) Remove code duplication from tests in TestLdapAtnProviderWithMiniDS

2016-10-17 Thread Illya Yalovyy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Illya Yalovyy updated HIVE-14927:
-
Status: Patch Available  (was: In Progress)

> Remove code duplication from tests in TestLdapAtnProviderWithMiniDS
> ---
>
> Key: HIVE-14927
> URL: https://issues.apache.org/jira/browse/HIVE-14927
> Project: Hive
> Issue Type: Improvement
> Components: Test
> Reporter: Illya Yalovyy
> Assignee: Illya Yalovyy
> Attachments: HIVE-14927.1.patch, HIVE-14927.2.patch, HIVE-14927.3.patch
>
>
> * Extract inner class User and implement a proper builder for it.
> * Extract all common code to LdapAuthenticationTestCase class:
> ** setting up the test case
> ** executing the test case
> ** result validation





[jira] [Commented] (HIVE-14959) Support distinct with windowing when CBO is disabled

2016-10-17 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583107#comment-15583107
 ] 

Ashutosh Chauhan commented on HIVE-14959:
-

+1

> Support distinct with windowing when CBO is disabled
> 
>
> Key: HIVE-14959
> URL: https://issues.apache.org/jira/browse/HIVE-14959
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.1.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14959.01.patch, HIVE-14959.patch
>
>
> For instance, the following query with CBO off:
> {code:sql}
> select distinct last_value(i) over ( partition by si order by i ),
>   first_value(t)  over ( partition by si order by i )
> from over10k limit 50;
> {code}
> will fail, with the following message:
> {noformat}
> SELECT DISTINCT not allowed in the presence of windowing functions when CBO is off
> {noformat}





[jira] [Updated] (HIVE-14927) Remove code duplication from tests in TestLdapAtnProviderWithMiniDS

2016-10-17 Thread Illya Yalovyy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Illya Yalovyy updated HIVE-14927:
-
Attachment: HIVE-14927.3.patch

> Remove code duplication from tests in TestLdapAtnProviderWithMiniDS
> ---
>
> Key: HIVE-14927
> URL: https://issues.apache.org/jira/browse/HIVE-14927
> Project: Hive
> Issue Type: Improvement
> Components: Test
> Reporter: Illya Yalovyy
> Assignee: Illya Yalovyy
> Attachments: HIVE-14927.1.patch, HIVE-14927.2.patch, HIVE-14927.3.patch
>
>
> * Extract inner class User and implement a proper builder for it.
> * Extract all common code to LdapAuthenticationTestCase class:
> ** setting up the test case
> ** executing the test case
> ** result validation





[jira] [Updated] (HIVE-14927) Remove code duplication from tests in TestLdapAtnProviderWithMiniDS

2016-10-17 Thread Illya Yalovyy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Illya Yalovyy updated HIVE-14927:
-
Status: In Progress  (was: Patch Available)

> Remove code duplication from tests in TestLdapAtnProviderWithMiniDS
> ---
>
> Key: HIVE-14927
> URL: https://issues.apache.org/jira/browse/HIVE-14927
> Project: Hive
> Issue Type: Improvement
> Components: Test
> Reporter: Illya Yalovyy
> Assignee: Illya Yalovyy
> Attachments: HIVE-14927.1.patch, HIVE-14927.2.patch
>
>
> * Extract inner class User and implement a proper builder for it.
> * Extract all common code to LdapAuthenticationTestCase class:
> ** setting up the test case
> ** executing the test case
> ** result validation



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13423) Handle the overflow case for decimal datatype for sum()

2016-10-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583055#comment-15583055
 ] 

Xuefu Zhang commented on HIVE-13423:


[~ctang.ma]/[~aihuaxu], sacrificing scale for a bigger integer part doesn't 
seem to be a viable option. The precision/scale of the result type of the sum 
UDF is determined statically as result metadata. That is, the result type of 
sum(decimal(p, s)) is decimal(p+10, s), which is decided before seeing any 
actual data. Thus, at run time, when the data is actually processed, we cannot 
return a result of type decimal(p+10+d, s-d), because the data (result) would 
not conform to the metadata (type decimal(p+10, s)).

Please feel free to check the standards or what other DBs are doing. As far as 
I know, there is no standard that permits this.
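
To make the static typing argument concrete, here is a small sketch in plain Java (no Hive classes). It assumes the declared precision is p+10 capped at 38, the max precision mentioned in the issue description; DecimalSumTypeSketch and its methods are hypothetical names, not Hive APIs.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class DecimalSumTypeSketch {
    // Assumption: 38 is the maximum decimal precision, as the issue description states.
    static final int MAX_PRECISION = 38;

    // Precision declared for sum(decimal(p, s)) before any row is read: p + 10, capped.
    static int resultPrecision(int inputPrecision) {
        return Math.min(inputPrecision + 10, MAX_PRECISION);
    }

    // True when value, rescaled to the declared scale, needs no more digits than declared.
    static boolean fits(BigDecimal value, int precision, int scale) {
        return value.setScale(scale, RoundingMode.HALF_UP).precision() <= precision;
    }

    public static void main(String[] args) {
        int p = 38, s = 18;
        int declared = resultPrecision(p); // stays 38, so there is no headroom left
        BigDecimal row = new BigDecimal("60000000000000000000"); // 20 integer digits
        BigDecimal sum = row.add(row);                           // 21 integer digits
        System.out.println(fits(row, declared, s));     // each input fits decimal(38,18)
        System.out.println(fits(sum, declared, s));     // the sum does not fit decimal(38,18)
        System.out.println(fits(sum, declared, s - 1)); // it would fit only with a smaller scale
    }
}
```

The last call is the "sacrifice scale" idea: the value fits only as decimal(38,17), which is a different type from the decimal(38,18) that was already published as result metadata.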

> Handle the overflow case for decimal datatype for sum()
> ---
>
> Key: HIVE-13423
> URL: https://issues.apache.org/jira/browse/HIVE-13423
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 2.0.0
> Reporter: Aihua Xu
> Assignee: Aihua Xu
> Attachments: HIVE-13423.1.patch
>
>
> When a column col1 is defined as decimal and the sum of the column overflows, 
> we will try to increase the decimal precision by 10. But if the precision 
> reaches 38 (the max precision), the overflow can still happen. Right now, in 
> that case, the following exception is thrown because Hive writes incorrect 
> data.
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
> at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:219) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}





[jira] [Commented] (HIVE-14837) JDBC: standalone jar is missing hadoop core dependencies

2016-10-17 Thread Tao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583053#comment-15583053
 ] 

Tao Li commented on HIVE-14837:
---

[~gopalv] Can you please describe how to reproduce this error? Thanks.

> JDBC: standalone jar is missing hadoop core dependencies
> 
>
> Key: HIVE-14837
> URL: https://issues.apache.org/jira/browse/HIVE-14837
> Project: Hive
> Issue Type: Bug
> Components: JDBC
> Affects Versions: 2.2.0
> Reporter: Gopal V
>
> {code}
> 2016/09/24 00:31:57 ERROR - jmeter.threads.JMeterThread: Test failed! 
> java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
> at org.apache.hive.jdbc.HiveConnection.createUnderlyingTransport(HiveConnection.java:418)
> at org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:438)
> at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:225)
> at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:182)
> at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14891) Parallelize TestHCatStorer

2016-10-17 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583047#comment-15583047
 ] 

Siddharth Seth commented on HIVE-14891:
---

HIVE-14973 to HIVE-14978, along with HIVE-14910, cover the flaky tests. They're 
not related to the patch.
+1 for the patch. There's one downside: any new formats would need to 
explicitly add a test class (earlier these were discovered automatically). I 
think that's acceptable for now.
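
For anyone following along, the trade-off looks roughly like the sketch below. It is plain Java without JUnit, and all class names are hypothetical (they are not the classes in the patch); ORC and Parquet are used only as example formats.

```java
// Shared test logic lives in one abstract base class (hypothetical name).
abstract class AbstractStorerTestSketch {
    // Each concrete class pins exactly one storage format.
    abstract String storageFormat();

    // Stand-in for a shared test body; a real test would write and read a table.
    String runStoreTest() {
        return "stored with " + storageFormat();
    }
}

// One concrete class per format, so a runner that parallelizes per test class
// can schedule each format independently.
class OrcStorerTestSketch extends AbstractStorerTestSketch {
    @Override
    String storageFormat() { return "ORC"; }
}

class ParquetStorerTestSketch extends AbstractStorerTestSketch {
    @Override
    String storageFormat() { return "PARQUET"; }
}
```

The downside mentioned above is visible here: a new format gets no coverage until someone adds another small subclass for it.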

> Parallelize TestHCatStorer
> --
>
> Key: HIVE-14891
> URL: https://issues.apache.org/jira/browse/HIVE-14891
> Project: Hive
> Issue Type: Sub-task
> Components: HCatalog, Testing Infrastructure
> Affects Versions: 2.1.0
> Reporter: Vaibhav Gumashta
> Assignee: Vaibhav Gumashta
> Attachments: HIVE-14891.1.patch, HIVE-14891.1.patch, 
> HIVE-14891.1.patch
>
>
> Currently TestHCatStorer runs as a parameterized test, where it runs the same 
> tests for each storage format but within the same junit test case. This 
> prevents it from being parallelized using ptest where parallelism granularity 
> is at a test case level. Instead of using parameterized tests, it makes sense 
> to create a new test case for each storage format.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-17 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-14981:

Status: Patch Available  (was: In Progress)

Build #1 failed due to infrastructure.

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 2.2.0
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch, HIVE-14981.02.patch
>
>
> The No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 
> is unnecessary. It caused the LLAP orc_llap.q test to time out on Hive QA 
> because the regular VectorMapJoinOperator is too slow.





[jira] [Updated] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-17 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-14981:

Attachment: HIVE-14981.02.patch

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 2.2.0
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch, HIVE-14981.02.patch
>
>
> The No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 
> is unnecessary. It caused the LLAP orc_llap.q test to time out on Hive QA 
> because the regular VectorMapJoinOperator is too slow.





[jira] [Updated] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-17 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-14981:

Status: In Progress  (was: Patch Available)

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 2.2.0
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch
>
>
> The No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 
> is unnecessary. It caused the LLAP orc_llap.q test to time out on Hive QA 
> because the regular VectorMapJoinOperator is too slow.




