[jira] [Updated] (HIVE-17016) ATSHook does not check yarn.timeline-service.enabled

2017-07-03 Thread KWON BYUNGCHANG (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KWON BYUNGCHANG updated HIVE-17016:
---
Description: 
The timeline-enabled flag check within TimelineClientImpl was removed in YARN-2375,
but ATSHook does not check whether the timeline service is enabled.

{code:title=ATSHook.java}
YarnConfiguration yarnConf = new YarnConfiguration();
timelineClient = TimelineClient.createTimelineClient();
timelineClient.init(yarnConf);
timelineClient.start();
{code}



In a cluster that does not use the timeline service,
the Hive CLI and HiveServer2 emit timeline retry logs:

{code}

2017-07-04 13:31:07,835 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 23 more time(s).
Message: java.net.ConnectException: Connection refused
2017-07-04 13:31:08,842 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 22 more time(s).
Message: java.net.ConnectException: Connection refused
2017-07-04 13:31:09,851 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 21 more time(s).
Message: java.net.ConnectException: Connection refused
2017-07-04 13:31:10,858 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 20 more time(s).
Message: java.net.ConnectException: Connection refused
2017-07-04 13:31:11,865 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 19 more time(s).
Message: java.net.ConnectException: Connection refused
2017-07-04 13:31:12,870 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 18 more time(s).
Message: java.net.ConnectException: Connection refused
2017-07-04 13:31:13,872 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 17 more time(s).
Message: java.net.ConnectException: Connection refused
2017-07-04 13:31:14,877 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 16 more time(s).
Message: java.net.ConnectException: Connection refused
{code}
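
One possible guard (a sketch, not a committed patch): skip client creation when 
yarn.timeline-service.enabled is false, using the standard YarnConfiguration 
constants.

{code:title=ATSHook.java (sketch)}
YarnConfiguration yarnConf = new YarnConfiguration();
// Only start the timeline client when the timeline service is enabled;
// otherwise the hook stays inert and no retry logging is triggered.
if (yarnConf.getBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED,
    YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED)) {
  timelineClient = TimelineClient.createTimelineClient();
  timelineClient.init(yarnConf);
  timelineClient.start();
}
{code}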

  was:
timeline enabled flag check with in the TimelineCliemtImpl removed at YARN-2375
but ATSHook does not check timeline enabled.

in the cluster that does not use timeline,
hive cli and hiveserver2 emit timeline retry log. 

{code}

2017-07-04 13:31:07,835 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 23 more time(s).
Message: java.net.ConnectException: Connection refused
2017-07-04 13:31:08,842 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 22 more time(s).
Message: java.net.ConnectException: Connection refused
2017-07-04 13:31:09,851 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 21 more time(s).
Message: java.net.ConnectException: Connection refused
2017-07-04 13:31:10,858 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 20 more time(s).
Message: java.net.ConnectException: Connection refused
2017-07-04 13:31:11,865 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 19 more time(s).
Message: java.net.ConnectException: Connection refused
2017-07-04 13:31:12,870 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 18 more time(s).
Message: java.net.ConnectException: Connection refused
2017-07-04 13:31:13,872 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 17 more time(s).
Message: java.net.ConnectException: Connection refused
2017-07-04 13:31:14,877 INFO  [ATS Logger 0]: impl.TimelineClientImpl 
(TimelineClientImpl.java:logException(213)) - Exception caught by 
TimelineClientConnectionRetry, will try 16 more time(s).
Message: java.net.ConnectException: Connection refused
{code}


> ATSHook does not check yarn.timeline-service.enabled
> 

[jira] [Commented] (HIVE-16770) Concatenate is not working on Table/Partial Partition level

2017-07-03 Thread guojh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073128#comment-16073128
 ] 

guojh commented on HIVE-16770:
--

[~klsnre...@gmail.com] Our Hive version is 1.1.0. The command "ALTER TABLE 
tableName" seems to act only on non-partitioned tables, and "ALTER TABLE tableName 
PARTITION(xx='xx')" acts only on one partition. Have you resolved this 
problem? If so, can you share the solution?

> Concatenate is not working on Table/Partial Partition level 
> 
>
> Key: HIVE-16770
> URL: https://issues.apache.org/jira/browse/HIVE-16770
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 1.2.1
> Environment: centOS7
>Reporter: Kallam Reddy
>
> Not able to CONCATENATE at table/partial-partition level. I have a table test 
> partitioned on year, month, and date. If I concatenate by providing the 
> year, month, and date of a specific partition, it works fine, but when 
> I want to concatenate the ORC files for all sub-partitions corresponding to a 
> year and month, it throws an exception.
> hive> ALTER TABLE test PARTITION (year = '2017', month = '05') CONCATENATE;
> FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: 
> Partition not found {year=2017, month=05}
> hive> ALTER TABLE test CONCATENATE;
> FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: 
> source table test is partitioned but no partition desc found.
> I expect this to trigger concatenation in all available sub-partitions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17015) CRLF of the script causes the service to fail

2017-07-03 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-17015:
-
Description: 
Script files should use LF line endings, not CRLF; otherwise the scripts 
cannot be executed and the services cannot start.
In the following files, CRLF needs to be replaced by LF:
bin/beeline.
bin/hive.
bin/hiveserver2.
bin/hplsql.
bin/metatool.
bin/schematool.

!screenshot-1.png!
!screenshot-2.png!
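
For reference, a minimal conversion sketch (a hypothetical helper, not part of 
this issue; {{dos2unix}} achieves the same from the shell):

{code:title=CrlfToLf.java (sketch)}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CrlfToLf {
  public static void main(String[] args) throws IOException {
    // Each argument is a script file whose CRLF endings should become LF.
    for (String arg : args) {
      Path path = Paths.get(arg);
      String content = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
      // Replace Windows line endings with Unix ones; idempotent for LF-only files.
      Files.write(path, content.replace("\r\n", "\n").getBytes(StandardCharsets.UTF_8));
    }
  }
}
{code}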


  was:
Script file should be LF, should not be CRLF, this will lead to the script can 
not be implemented, the service can not start.
The following documents are required LF to be replaced by CRLF:
bin/beeline.
bin/hive.
bin/hiveserver2.
bin/hplsql.
bin/metatool.
bin/schematool.

!screenshot-1.png!
!screenshot-2.png!



> CRLF of the script causes the service to fail
> -
>
> Key: HIVE-17015
> URL: https://issues.apache.org/jira/browse/HIVE-17015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> Script files should use LF line endings, not CRLF; otherwise the scripts 
> cannot be executed and the services cannot start.
> In the following files, CRLF needs to be replaced by LF:
> bin/beeline.
> bin/hive.
> bin/hiveserver2.
> bin/hplsql.
> bin/metatool.
> bin/schematool.
> !screenshot-1.png!
> !screenshot-2.png!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17015) CRLF of the script causes the service to fail

2017-07-03 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-17015:
-
Description: 
Script files should use LF line endings, not CRLF; otherwise the scripts 
cannot be executed and the services cannot start.
In the following files, CRLF needs to be replaced by LF:
bin/beeline.
bin/hive.
bin/hiveserver2.
bin/hplsql.
bin/metatool.
bin/schematool.

!screenshot-1.png!
!screenshot-2.png!


  was:
Script file should be LF, should not be CRLF, this will lead to the script can 
not be implemented, the service can not start.
The following documents are required CRLFto be replaced by LF:
bin/beeline.
bin/hive.
bin/hiveserver2.
bin/hplsql.
bin/metatool.
bin/schematool.

!screenshot-1.png!
!screenshot-2.png!



> CRLF of the script causes the service to fail
> -
>
> Key: HIVE-17015
> URL: https://issues.apache.org/jira/browse/HIVE-17015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> Script files should use LF line endings, not CRLF; otherwise the scripts 
> cannot be executed and the services cannot start.
> In the following files, CRLF needs to be replaced by LF:
> bin/beeline.
> bin/hive.
> bin/hiveserver2.
> bin/hplsql.
> bin/metatool.
> bin/schematool.
> !screenshot-1.png!
> !screenshot-2.png!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16750) Support change management for rename table/partition.

2017-07-03 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073078#comment-16073078
 ] 

anishek commented on HIVE-16750:


+1 cc [~thejas]/[~sushanth]/[~daijy]

> Support change management for rename table/partition.
> -
>
> Key: HIVE-16750
> URL: https://issues.apache.org/jira/browse/HIVE-16750
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16750.01.patch, HIVE-16750.02.patch, 
> HIVE-16750.03.patch
>
>
> Currently, rename table/partition updates the data location by renaming the 
> directory, which is equivalent to moving the files to a new path and deleting 
> the old path. So, this should trigger a move of the files into $CMROOT.
> Scenario:
> 1. Create a table (T1)
> 2. Insert a record
> 3. Rename the table(T1 -> T2)
> 4. Repl Dump till Insert.
> 5. Repl Load from the dump.
> 6. Target DB should have table T1 with the record.
> Similar scenario with rename partition as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17015) CRLF of the script causes the service to fail

2017-07-03 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-17015:
-
Summary: CRLF of the script causes the service to fail  (was: Script file 
should be LF, should not be CRLF, this will lead to the script can not be 
implemented, the service can not start)

> CRLF of the script causes the service to fail
> -
>
> Key: HIVE-17015
> URL: https://issues.apache.org/jira/browse/HIVE-17015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> Script file should be LF, should not be CRLF, this will lead to the script 
> can not be implemented, the service can not start.
> The following documents are required LF to be replaced by CRLF:
> bin/beeline.
> bin/hive.
> bin/hiveserver2.
> bin/hplsql.
> bin/metatool.
> bin/schematool.
> !screenshot-1.png!
> !screenshot-2.png!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17015) Script file should be LF, should not be CRLF, this will lead to the script can not be implemented, the service can not start

2017-07-03 Thread ZhangBing Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073062#comment-16073062
 ] 

ZhangBing Lin commented on HIVE-17015:
--

Hi [~Ferd], [~xuefuz], could you do a quick review?

> Script file should be LF, should not be CRLF, this will lead to the script 
> can not be implemented, the service can not start
> 
>
> Key: HIVE-17015
> URL: https://issues.apache.org/jira/browse/HIVE-17015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> Script files should use LF line endings, not CRLF; otherwise the scripts 
> cannot be executed and the services cannot start.
> In the following files, LF needs to be replaced by CRLF:
> bin/beeline.
> bin/hive.
> bin/hiveserver2.
> bin/hplsql.
> bin/metatool.
> bin/schematool.
> !screenshot-1.png!
> !screenshot-2.png!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17015) Script file should be LF, should not be CRLF, this will lead to the script can not be implemented, the service can not start

2017-07-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073061#comment-16073061
 ] 

Hive QA commented on HIVE-17015:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875564/screenshot-2.png

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5878/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5878/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5878/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-07-04 03:18:48.681
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-5878/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-07-04 03:18:48.685
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at d68630b HIVE-16958: Setting hive.merge.sparkfiles=true will 
retrun an error when generating parquet databases (Liyun Zhang reviewed by Li 
Rui, Ferdinand Xu)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at d68630b HIVE-16958: Setting hive.merge.sparkfiles=true will 
retrun an error when generating parquet databases (Liyun Zhang reviewed by Li 
Rui, Ferdinand Xu)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-07-04 03:18:53.901
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
patch:  Only garbage was found in the patch input.
fatal: unrecognized input
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875564 - PreCommit-HIVE-Build

> Script file should be LF, should not be CRLF, this will lead to the script 
> can not be implemented, the service can not start
> 
>
> Key: HIVE-17015
> URL: https://issues.apache.org/jira/browse/HIVE-17015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> Script files should use LF line endings, not CRLF; otherwise the scripts 
> cannot be executed and the services cannot start.
> In the following files, LF needs to be replaced by CRLF:
> bin/beeline.
> bin/hive.
> bin/hiveserver2.
> bin/hplsql.
> bin/metatool.
> bin/schematool.
> !screenshot-1.png!
> !screenshot-2.png!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Work started] (HIVE-16950) Dropping hive database/table which was created explicitly in default database location, deletes all databases data from default database location

2017-07-03 Thread Bing Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-16950 started by Bing Li.
--
> Dropping hive database/table which was created explicitly in default database 
> location, deletes all databases data from default database location
> -
>
> Key: HIVE-16950
> URL: https://issues.apache.org/jira/browse/HIVE-16950
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Rahul Kalgunde
>Assignee: Bing Li
>Priority: Minor
>
> When a database/table is created explicitly pointing to the default warehouse 
> location, dropping that database/table deletes all the data associated with 
> all databases/tables.
> Steps to replicate: 
> In the example below, dropping database test_db2 also deletes the data of 
> test_db1, whereas the metastore still contains test_db1.
> hive> create database test_db1;
> OK
> Time taken: 4.858 seconds
> hive> describe database test_db1;
> OK
> test_db1
> hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/test_db1.db root  
>   USER
> Time taken: 0.599 seconds, Fetched: 1 row(s)
> hive> create database test_db2 location '/apps/hive/warehouse' ;
> OK
> Time taken: 1.457 seconds
> hive> describe database test_db2;
> OK
> test_db2
> hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse rootUSER
> Time taken: 0.582 seconds, Fetched: 1 row(s)
> hive> drop database test_db2;
> OK
> Time taken: 1.317 seconds
> hive> dfs -ls /apps/hive/warehouse;
> ls: `/apps/hive/warehouse': No such file or directory
> Command failed with exit code = 1
> Query returned non-zero code: 1, cause: null
> hive> describe database test_db1;
> OK
> test_db1
> hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/test_db1.db root  
>   USER
> Time taken: 0.629 seconds, Fetched: 1 row(s)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16901) Distcp optimization - One distcp per ReplCopyTask

2017-07-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073060#comment-16073060
 ] 

Hive QA commented on HIVE-16901:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875558/HIVE-16901.03.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10830 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout 
(batchId=226)
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery 
(batchId=226)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5877/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5877/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5877/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875558 - PreCommit-HIVE-Build

> Distcp optimization - One distcp per ReplCopyTask 
> --
>
> Key: HIVE-16901
> URL: https://issues.apache.org/jira/browse/HIVE-16901
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16901.01.patch, HIVE-16901.02.patch, 
> HIVE-16901.03.patch
>
>
> Currently, if a ReplCopyTask is created to copy a list of files, then distcp 
> is invoked for each and every file. Instead, we need to pass the list of 
> source files to the distcp tool, which copies the files in parallel and hence 
> yields a large performance gain.
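
A hedged sketch of the batched invocation (assuming the Hadoop 2.x DistCp Java 
API; this is an illustration, not the attached patch):

{code:title=BatchDistCpSketch.java (sketch)}
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.tools.DistCp;
import org.apache.hadoop.tools.DistCpOptions;

public class BatchDistCpSketch {
  // One distcp job over the whole file list instead of one job per file.
  public static boolean copyAll(Configuration conf, List<Path> srcPaths, Path target)
      throws Exception {
    DistCpOptions options = new DistCpOptions(srcPaths, target);
    Job job = new DistCp(conf, options).execute(); // blocks until the copy job finishes
    return job.isSuccessful();
  }
}
{code}

A single job amortizes the MapReduce startup cost across all files and lets the 
copy mappers run in parallel.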



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17015) Script file should be LF, should not be CRLF, this will lead to the script can not be implemented, the service can not start

2017-07-03 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-17015:
-
Attachment: screenshot-2.png

> Script file should be LF, should not be CRLF, this will lead to the script 
> can not be implemented, the service can not start
> 
>
> Key: HIVE-17015
> URL: https://issues.apache.org/jira/browse/HIVE-17015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> Script files should use LF line endings, not CRLF; otherwise the scripts 
> cannot be executed and the services cannot start.
> In the following files, LF needs to be replaced by CRLF:
> bin/beeline.
> bin/hive.
> bin/hiveserver2.
> bin/hplsql.
> bin/metatool.
> bin/schematool.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17015) Script file should be LF, should not be CRLF, this will lead to the script can not be implemented, the service can not start

2017-07-03 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-17015:
-
Description: 
Script files should use LF line endings, not CRLF; otherwise the scripts 
cannot be executed and the services cannot start.
In the following files, LF needs to be replaced by CRLF:
bin/beeline.
bin/hive.
bin/hiveserver2.
bin/hplsql.
bin/metatool.
bin/schematool.

!screenshot-1.png!
!screenshot-2.png!


  was:
Script file should be LF, should not be CRLF, this will lead to the script can 
not be implemented, the service can not start.
The following documents are required LF to be replaced by CRLF:
bin/beeline.
bin/hive.
bin/hiveserver2.
bin/hplsql.
bin/metatool.
bin/schematool.




> Script file should be LF, should not be CRLF, this will lead to the script 
> can not be implemented, the service can not start
> 
>
> Key: HIVE-17015
> URL: https://issues.apache.org/jira/browse/HIVE-17015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> Script files should use LF line endings, not CRLF; otherwise the scripts 
> cannot be executed and the services cannot start.
> In the following files, LF needs to be replaced by CRLF:
> bin/beeline.
> bin/hive.
> bin/hiveserver2.
> bin/hplsql.
> bin/metatool.
> bin/schematool.
> !screenshot-1.png!
> !screenshot-2.png!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17015) Script file should be LF, should not be CRLF, this will lead to the script can not be implemented, the service can not start

2017-07-03 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-17015:
-
Attachment: screenshot-1.png

> Script file should be LF, should not be CRLF, this will lead to the script 
> can not be implemented, the service can not start
> 
>
> Key: HIVE-17015
> URL: https://issues.apache.org/jira/browse/HIVE-17015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
> Attachments: screenshot-1.png
>
>
> Script files should use LF line endings, not CRLF; otherwise the scripts 
> cannot be executed and the services cannot start.
> In the following files, LF needs to be replaced by CRLF:
> bin/beeline.
> bin/hive.
> bin/hiveserver2.
> bin/hplsql.
> bin/metatool.
> bin/schematool.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17015) Script file should be LF, should not be CRLF, this will lead to the script can not be implemented, the service can not start

2017-07-03 Thread ZhangBing Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073059#comment-16073059
 ] 

ZhangBing Lin commented on HIVE-17015:
--

This issue does not need a patch.

> Script file should be LF, should not be CRLF, this will lead to the script 
> can not be implemented, the service can not start
> 
>
> Key: HIVE-17015
> URL: https://issues.apache.org/jira/browse/HIVE-17015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>
> Script files should use LF line endings, not CRLF; otherwise the scripts 
> cannot be executed and the services cannot start.
> In the following files, LF needs to be replaced by CRLF:
> bin/beeline.
> bin/hive.
> bin/hiveserver2.
> bin/hplsql.
> bin/metatool.
> bin/schematool.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17015) Script file should be LF, should not be CRLF, this will lead to the script can not be implemented, the service can not start

2017-07-03 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-17015:
-
Status: Patch Available  (was: Open)

> Script file should be LF, should not be CRLF, this will lead to the script 
> can not be implemented, the service can not start
> 
>
> Key: HIVE-17015
> URL: https://issues.apache.org/jira/browse/HIVE-17015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>
> Script files should use LF line endings, not CRLF; otherwise the scripts 
> cannot be executed and the services cannot start.
> In the following files, LF needs to be replaced by CRLF:
> bin/beeline.
> bin/hive.
> bin/hiveserver2.
> bin/hplsql.
> bin/metatool.
> bin/schematool.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17015) Script file should be LF, should not be CRLF, this will lead to the script can not be implemented, the service can not start

2017-07-03 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-17015:
-
Affects Version/s: 3.0.0

> Script file should be LF, should not be CRLF, this will lead to the script 
> can not be implemented, the service can not start
> 
>
> Key: HIVE-17015
> URL: https://issues.apache.org/jira/browse/HIVE-17015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>
> Script files should use LF line endings, not CRLF; otherwise the scripts 
> cannot be executed and the services cannot start.
> In the following files, LF needs to be replaced by CRLF:
> bin/beeline.
> bin/hive.
> bin/hiveserver2.
> bin/hplsql.
> bin/metatool.
> bin/schematool.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17015) Script file should be LF, should not be CRLF, this will lead to the script can not be implemented, the service can not start

2017-07-03 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-17015:
-
Description: 
Script files should use LF line endings, not CRLF; otherwise the scripts 
cannot be executed and the services cannot start.
In the following files, LF needs to be replaced by CRLF:
bin/beeline.
bin/hive.
bin/hiveserver2.
bin/hplsql.
bin/metatool.
bin/schematool.



  was:
Script file should be LF, should not be CRLF, this will lead to the script can 
not be implemented, the service can not start.
List:
bin/beeline.
bin/hive.
bin/hiveserver2.
bin/hplsql.
bin/metatool.
bin/schematool.




> Script file should be LF, should not be CRLF, this will lead to the script 
> can not be implemented, the service can not start
> 
>
> Key: HIVE-17015
> URL: https://issues.apache.org/jira/browse/HIVE-17015
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>
> Script files should use LF line endings, not CRLF; otherwise the scripts 
> cannot be executed and the services cannot start.
> In the following files, LF needs to be replaced by CRLF:
> bin/beeline.
> bin/hive.
> bin/hiveserver2.
> bin/hplsql.
> bin/metatool.
> bin/schematool.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17015) Script file should be LF, should not be CRLF, this will lead to the script can not be implemented, the service can not start

2017-07-03 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin reassigned HIVE-17015:



> Script file should be LF, should not be CRLF, this will lead to the script 
> can not be implemented, the service can not start
> 
>
> Key: HIVE-17015
> URL: https://issues.apache.org/jira/browse/HIVE-17015
> Project: Hive
>  Issue Type: Bug
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>
> Script files should use LF line endings, not CRLF; otherwise the scripts 
> cannot be executed and the services cannot start.
> List:
> bin/beeline.
> bin/hive.
> bin/hiveserver2.
> bin/hplsql.
> bin/metatool.
> bin/schematool.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism

2017-07-03 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated HIVE-17010:

Attachment: HIVE-17010.1.patch

[~Ferd]: can you help review HIVE-17010.1.patch?
We can use double instead of long to avoid the overflow problem.
{code}
// long max = 9223372036854775807
long a1 = 9223372036854775807L;
long a2 = 1022672;

long res = a1 + a2;
System.out.println(res);   // -9223372036853753137 (wrapped around)

// double max = 1.7976931348623157E308
double d1 = 9223372036854775807L;
double d2 = 1022672;

double dres = d1 + d2;
System.out.println(dres);  // 9.223372036855798E18 (no overflow)
{code}


> Fix the overflow problem of Long type in SetSparkReducerParallelism
> ---
>
> Key: HIVE-17010
> URL: https://issues.apache.org/jira/browse/HIVE-17010
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HIVE-17010.1.patch
>
>
> We use 
> [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129]
>  to collect the numberOfBytes of the siblings of a specified RS. We use the 
> Long type, and it overflows when the data is too big. When that happens, the 
> parallelism is decided by 
> [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]
>  if spark.dynamicAllocation.enabled is true. sparkMemoryAndCores.getSecond is 
> a dynamic value decided by the Spark runtime; for example, it may be 5 or 15 
> at random, and it may even be 1. The main problem here is the overflow of the 
> addition of the Long type. You can reproduce the overflow with the following 
> code:
> {code}
> public static void main(String[] args) {
>   long a1 = 9223372036854775807L; // Long.MAX_VALUE
>   long a2 = 1022672;
>   long res = a1 + a2;
>   System.out.println(res);    // -9223372036853753137 (overflow)
>   BigInteger b1 = BigInteger.valueOf(a1);
>   BigInteger b2 = BigInteger.valueOf(a2);
>   BigInteger bigRes = b1.add(b2);
>   System.out.println(bigRes); // 9223372036855798479
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-16901) Distcp optimization - One distcp per ReplCopyTask

2017-07-03 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073024#comment-16073024
 ] 

Sankar Hariappan edited comment on HIVE-16901 at 7/4/17 2:04 AM:
-

Added 03.patch with fixes for Anishek's comments.

Request [~anishek] to review the updated patch!


was (Author: sankarh):
Added 03.patch with fixes for Anishek's comments.

> Distcp optimization - One distcp per ReplCopyTask 
> --
>
> Key: HIVE-16901
> URL: https://issues.apache.org/jira/browse/HIVE-16901
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16901.01.patch, HIVE-16901.02.patch, 
> HIVE-16901.03.patch
>
>
> Currently, if a ReplCopyTask is created to copy a list of files, then distcp 
> is invoked for each and every file. Instead, we need to pass the list of 
> source files to the distcp tool, which copies the files in parallel and hence 
> yields a large performance gain.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16901) Distcp optimization - One distcp per ReplCopyTask

2017-07-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16901:

Status: Patch Available  (was: Open)

> Distcp optimization - One distcp per ReplCopyTask 
> --
>
> Key: HIVE-16901
> URL: https://issues.apache.org/jira/browse/HIVE-16901
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16901.01.patch, HIVE-16901.02.patch, 
> HIVE-16901.03.patch
>
>
> Currently, if a ReplCopyTask is created to copy a list of files, then distcp 
> is invoked for each and every file. Instead, we need to pass the list of 
> source files to the distcp tool, which copies the files in parallel and hence 
> yields a large performance gain.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16901) Distcp optimization - One distcp per ReplCopyTask

2017-07-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16901:

Status: Open  (was: Patch Available)

> Distcp optimization - One distcp per ReplCopyTask 
> --
>
> Key: HIVE-16901
> URL: https://issues.apache.org/jira/browse/HIVE-16901
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16901.01.patch, HIVE-16901.02.patch
>
>
> Currently, if a ReplCopyTask is created to copy a list of files, then distcp 
> is invoked for each and every file. Instead, we need to pass the list of 
> source files to the distcp tool, which copies the files in parallel and hence 
> yields a large performance gain.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16901) Distcp optimization - One distcp per ReplCopyTask

2017-07-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16901:

Attachment: HIVE-16901.03.patch

Added 03.patch with fixes for Anishek's comments.

> Distcp optimization - One distcp per ReplCopyTask 
> --
>
> Key: HIVE-16901
> URL: https://issues.apache.org/jira/browse/HIVE-16901
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16901.01.patch, HIVE-16901.02.patch, 
> HIVE-16901.03.patch
>
>
> Currently, if a ReplCopyTask is created to copy a list of files, then distcp 
> is invoked for each and every file. Instead, we need to pass the list of 
> source files to the distcp tool, which copies the files in parallel and hence 
> yields a large performance gain.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16750) Support change management for rename table/partition.

2017-07-03 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073014#comment-16073014
 ] 

Sankar Hariappan commented on HIVE-16750:
-

The test failure explainanalyze_2 is unrelated to the change. All the remaining 
failures have an age of more than 1.

> Support change management for rename table/partition.
> -
>
> Key: HIVE-16750
> URL: https://issues.apache.org/jira/browse/HIVE-16750
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16750.01.patch, HIVE-16750.02.patch, 
> HIVE-16750.03.patch
>
>
> Currently, rename table/partition updates the data location by renaming the 
> directory, which is equivalent to moving the files to a new path and deleting 
> the old path. So, this should trigger a move of the files into $CMROOT.
> Scenario:
> 1. Create a table (T1)
> 2. Insert a record
> 3. Rename the table(T1 -> T2)
> 4. Repl Dump till Insert.
> 5. Repl Load from the dump.
> 6. Target DB should have table T1 with the record.
> Similar scenario with rename partition as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-12631) LLAP: support ORC ACID tables

2017-07-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072935#comment-16072935
 ] 

Hive QA commented on HIVE-12631:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875550/HIVE-12631.15.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10830 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[rcfile_buckets] 
(batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_acid] (batchId=76)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.ql.io.orc.TestVectorizedOrcAcidRowBatchReader.testVectorizedOrcAcidRowBatchReader
 (batchId=260)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5876/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5876/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5876/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875550 - PreCommit-HIVE-Build

> LLAP: support ORC ACID tables
> -
>
> Key: HIVE-12631
> URL: https://issues.apache.org/jira/browse/HIVE-12631
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch, 
> HIVE-12631.11.patch, HIVE-12631.11.patch, HIVE-12631.12.patch, 
> HIVE-12631.13.patch, HIVE-12631.15.patch, HIVE-12631.1.patch, 
> HIVE-12631.2.patch, HIVE-12631.3.patch, HIVE-12631.4.patch, 
> HIVE-12631.5.patch, HIVE-12631.6.patch, HIVE-12631.7.patch, 
> HIVE-12631.8.patch, HIVE-12631.8.patch, HIVE-12631.9.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and 
> parallelization of reads and processing. This path does not support ACID. As 
> far as I remember, the ACID logic is embedded inside the ORC format; we need 
> to refactor it to sit on top of some interface, if practical, or just port it 
> to the LLAP read path.
> Another consideration is how the logic will work with the cache. The cache is 
> currently low-level (CB-level in ORC), so we could just use it to read bases 
> and deltas (deltas should be cached with higher priority) and merge as usual. 
> We could also cache the merged representation in the future.
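
(Illustration only: one hypothetical shape of such an interface; these names 
are invented for the sketch, not taken from any patch.)

{code:title=AcidRecordSource.java (hypothetical)}
import java.io.IOException;

// A reader abstraction the ACID merge logic could sit on top of, so base and
// delta records can come from either the plain ORC reader or the LLAP cached
// read path.
public interface AcidRecordSource<T> {
  boolean next(T value) throws IOException; // advance to the next row, false at EOF
  long currentTransactionId();              // used to order base/delta records
  void close() throws IOException;
}
{code}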



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Issue Comment Deleted] (HIVE-12631) LLAP: support ORC ACID tables

2017-07-03 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-12631:
--
Comment: was deleted

(was: This 11th patch has two major changes. The first one is the new ORC ACID 
row batch encoded data consumer. It adds the vectorized ORC ACID row batch 
reader in LLAP, which is very performant for LLAP ACID. The second one is the 
reader generalization in the ORC raw record merger. The ACID logic now can work 
with more readers, rather than ORC reader only.

This patch enables following works in other issues;
# Introducing the LLAP record reader in the ORC raw record merger to minimize 
non-LLAP reads
# Replacing BitSet objects with integer arrays for more performance
# Adding the vectorized ORC ACID row reader in LLAP.)

> LLAP: support ORC ACID tables
> -
>
> Key: HIVE-12631
> URL: https://issues.apache.org/jira/browse/HIVE-12631
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch, 
> HIVE-12631.11.patch, HIVE-12631.11.patch, HIVE-12631.12.patch, 
> HIVE-12631.13.patch, HIVE-12631.15.patch, HIVE-12631.1.patch, 
> HIVE-12631.2.patch, HIVE-12631.3.patch, HIVE-12631.4.patch, 
> HIVE-12631.5.patch, HIVE-12631.6.patch, HIVE-12631.7.patch, 
> HIVE-12631.8.patch, HIVE-12631.8.patch, HIVE-12631.9.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and 
> parallelization of reads and processing. This path does not support ACID. As 
> far as I remember, the ACID logic is embedded inside the ORC format; we need 
> to refactor it to sit on top of some interface, if practical, or just port it 
> to the LLAP read path.
> Another consideration is how the logic will work with the cache. The cache is 
> currently low-level (CB-level in ORC), so we could just use it to read bases 
> and deltas (deltas should be cached with higher priority) and merge as usual. 
> We could also cache the merged representation in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-12631) LLAP: support ORC ACID tables

2017-07-03 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-12631:
--
Attachment: HIVE-12631.15.patch

> LLAP: support ORC ACID tables
> -
>
> Key: HIVE-12631
> URL: https://issues.apache.org/jira/browse/HIVE-12631
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch, 
> HIVE-12631.11.patch, HIVE-12631.11.patch, HIVE-12631.12.patch, 
> HIVE-12631.13.patch, HIVE-12631.15.patch, HIVE-12631.1.patch, 
> HIVE-12631.2.patch, HIVE-12631.3.patch, HIVE-12631.4.patch, 
> HIVE-12631.5.patch, HIVE-12631.6.patch, HIVE-12631.7.patch, 
> HIVE-12631.8.patch, HIVE-12631.8.patch, HIVE-12631.9.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and 
> parallelization of reads and processing. This path does not support ACID. As 
> far as I remember, the ACID logic is embedded inside the ORC format; we need 
> to refactor it to sit on top of some interface, if practical, or just port it 
> to the LLAP read path.
> Another consideration is how the logic will work with the cache. The cache is 
> currently low-level (CB-level in ORC), so we could just use it to read bases 
> and deltas (deltas should be cached with higher priority) and merge as usual. 
> We could also cache the merged representation in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-12631) LLAP: support ORC ACID tables

2017-07-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072855#comment-16072855
 ] 

Hive QA commented on HIVE-12631:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875537/HIVE-12631.13.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 47 failed/errored test(s), 10830 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=237)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_reader] (batchId=7)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters1]
 (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_globallimit]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_all_non_partitioned]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_all_partitioned]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_tmp_table]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_where_no_match]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_where_non_partitioned]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_where_partitioned]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_whole_partition]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_3]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization_acid]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_orig_table]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_update_delete]
 (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_dynamic_partitioned]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_non_partitioned]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_partitioned]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_tmp_table]
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_part]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_table]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_after_multiple_inserts]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_all_non_partitioned]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_all_partitioned]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_all_types]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_tmp_table]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_two_cols]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_where_no_match]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_where_non_partitioned]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_where_partitioned]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_acid3]
 (batchId=148)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testColumnProjectionWithAcid
 (batchId=261)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testNewBase 
(batchId=261)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testOriginalReaderPair 
(batchId=261)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testOriginalReaderPairNoMin
 (batchId=261)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testReaderPair 
(batchId=261)

[jira] [Updated] (HIVE-12631) LLAP: support ORC ACID tables

2017-07-03 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-12631:
--
Attachment: HIVE-12631.13.patch

> LLAP: support ORC ACID tables
> -
>
> Key: HIVE-12631
> URL: https://issues.apache.org/jira/browse/HIVE-12631
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch, 
> HIVE-12631.11.patch, HIVE-12631.11.patch, HIVE-12631.12.patch, 
> HIVE-12631.13.patch, HIVE-12631.1.patch, HIVE-12631.2.patch, 
> HIVE-12631.3.patch, HIVE-12631.4.patch, HIVE-12631.5.patch, 
> HIVE-12631.6.patch, HIVE-12631.7.patch, HIVE-12631.8.patch, 
> HIVE-12631.8.patch, HIVE-12631.9.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and 
> parallelization of reads and processing. This path does not support ACID. As 
> far as I remember, the ACID logic is embedded inside the ORC format; we need 
> to refactor it to sit on top of some interface, if practical, or just port it 
> to the LLAP read path.
> Another consideration is how the logic will work with the cache. The cache is 
> currently low-level (CB-level in ORC), so we could just use it to read bases 
> and deltas (deltas should be cached with higher priority) and merge as usual. 
> We could also cache the merged representation in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16750) Support change management for rename table/partition.

2017-07-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072766#comment-16072766
 ] 

Hive QA commented on HIVE-16750:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875530/HIVE-16750.03.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10832 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5874/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5874/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5874/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875530 - PreCommit-HIVE-Build

> Support change management for rename table/partition.
> -
>
> Key: HIVE-16750
> URL: https://issues.apache.org/jira/browse/HIVE-16750
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16750.01.patch, HIVE-16750.02.patch, 
> HIVE-16750.03.patch
>
>
> Currently, rename table/partition updates the data location by renaming the 
> directory which is equivalent to moving files to new path and delete old 
> path. So, this should trigger move of files into $CMROOT.
> Scenario:
> 1. Create a table (T1)
> 2. Insert a record
> 3. Rename the table(T1 -> T2)
> 4. Repl Dump till Insert.
> 5. Repl Load from the dump.
> 6. Target DB should have table T1 with the record.
> Similar scenario with rename partition as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error

2017-07-03 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072760#comment-16072760
 ] 

Steve Loughran commented on HIVE-16983:
---

good point

Everyone: look at the S3A troubleshooting docs before filing bugreps, thanks: 
[http://hadoop.apache.org/docs/r2.8.0/hadoop-aws/tools/hadoop-aws/index.html#Troubleshooting_S3A]

> getFileStatus on accessible s3a://[bucket-name]/folder: throws 
> com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon 
> S3; Status Code: 403; Error Code: 403 Forbidden;
> -
>
> Key: HIVE-16983
> URL: https://issues.apache.org/jira/browse/HIVE-16983
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.1
> Environment: Hive 2.1.1 on Ubuntu 14.04 AMI in AWS EC2, connecting to 
> S3 using s3a:// protocol
>Reporter: Alex Baretto
>
> I've followed various published documentation on integrating Apache Hive 
> 2.1.1 with AWS S3 using the `s3a://` scheme, configuring `fs.s3a.access.key` 
> and 
> `fs.s3a.secret.key` for `hadoop/etc/hadoop/core-site.xml` and 
> `hive/conf/hive-site.xml`.
> I am at the point where I am able to get `hdfs dfs -ls s3a://[bucket-name]/` 
> to work properly (it returns s3 ls of that bucket). So I know my creds, 
> bucket access, and overall Hadoop setup are valid. 
> hdfs dfs -ls s3a://[bucket-name]/
> 
> drwxrwxrwx   - hdfs hdfs  0 2017-06-27 22:43 
> s3a://[bucket-name]/files
> ...etc. 
> hdfs dfs -ls s3a://[bucket-name]/files
> 
> drwxrwxrwx   - hdfs hdfs  0 2017-06-27 22:43 
> s3a://[bucket-name]/files/my-csv.csv
> However, when I attempt to access the same s3 resources from hive, e.g. run 
> any `CREATE SCHEMA` or `CREATE EXTERNAL TABLE` statements using `LOCATION 
> 's3a://[bucket-name]/files/'`, it fails. 
> for example:
> >CREATE EXTERNAL TABLE IF NOT EXISTS mydb.my_table ( my_table_id string, 
> >my_tstamp timestamp, my_sig bigint ) ROW FORMAT DELIMITED FIELDS TERMINATED 
> >BY ',' LOCATION 's3a://[bucket-name]/files/';
> I keep getting this error:
> >FAILED: Execution Error, return code 1 from 
> >org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: 
> >java.nio.file.AccessDeniedException s3a://[bucket-name]/files: getFileStatus 
> >on s3a://[bucket-name]/files: 
> >com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: 
> >Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: 
> >C9CF3F9C50EF08D1), S3 Extended Request ID: 
> >T2xZ87REKvhkvzf+hdPTOh7CA7paRpIp6IrMWnDqNFfDWerkZuAIgBpvxilv6USD0RSxM9ymM6I=)
> This makes no sense. I have access to the bucket as one can see in the hdfs 
> test. And I've added the proper creds to hive-site.xml. 
> Anyone have any idea what's missing from this equation?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error

2017-07-03 Thread Aleksandr Balitsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072719#comment-16072719
 ] 

Aleksandr Balitsky commented on HIVE-16983:
---

I faced the same issue because of a version mismatch between aws-java-sdk, 
joda-time, and Java. In my case Hive used joda-time 2.5, but that version 
doesn't work with Java 8 and aws-java-sdk 1.7.15. We should use at least 
joda-time version 2.8.1.

> getFileStatus on accessible s3a://[bucket-name]/folder: throws 
> com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon 
> S3; Status Code: 403; Error Code: 403 Forbidden;
> -
>
> Key: HIVE-16983
> URL: https://issues.apache.org/jira/browse/HIVE-16983
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.1
> Environment: Hive 2.1.1 on Ubuntu 14.04 AMI in AWS EC2, connecting to 
> S3 using s3a:// protocol
>Reporter: Alex Baretto
>
> I've followed various published documentation on integrating Apache Hive 
> 2.1.1 with AWS S3 using the `s3a://` scheme, configuring `fs.s3a.access.key` 
> and 
> `fs.s3a.secret.key` for `hadoop/etc/hadoop/core-site.xml` and 
> `hive/conf/hive-site.xml`.
> I am at the point where I am able to get `hdfs dfs -ls s3a://[bucket-name]/` 
> to work properly (it returns s3 ls of that bucket). So I know my creds, 
> bucket access, and overall Hadoop setup are valid. 
> hdfs dfs -ls s3a://[bucket-name]/
> 
> drwxrwxrwx   - hdfs hdfs  0 2017-06-27 22:43 
> s3a://[bucket-name]/files
> ...etc. 
> hdfs dfs -ls s3a://[bucket-name]/files
> 
> drwxrwxrwx   - hdfs hdfs  0 2017-06-27 22:43 
> s3a://[bucket-name]/files/my-csv.csv
> However, when I attempt to access the same s3 resources from hive, e.g. run 
> any `CREATE SCHEMA` or `CREATE EXTERNAL TABLE` statements using `LOCATION 
> 's3a://[bucket-name]/files/'`, it fails. 
> for example:
> >CREATE EXTERNAL TABLE IF NOT EXISTS mydb.my_table ( my_table_id string, 
> >my_tstamp timestamp, my_sig bigint ) ROW FORMAT DELIMITED FIELDS TERMINATED 
> >BY ',' LOCATION 's3a://[bucket-name]/files/';
> I keep getting this error:
> >FAILED: Execution Error, return code 1 from 
> >org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: 
> >java.nio.file.AccessDeniedException s3a://[bucket-name]/files: getFileStatus 
> >on s3a://[bucket-name]/files: 
> >com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: 
> >Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: 
> >C9CF3F9C50EF08D1), S3 Extended Request ID: 
> >T2xZ87REKvhkvzf+hdPTOh7CA7paRpIp6IrMWnDqNFfDWerkZuAIgBpvxilv6USD0RSxM9ymM6I=)
> This makes no sense. I have access to the bucket as one can see in the hdfs 
> test. And I've added the proper creds to hive-site.xml. 
> Anyone have any idea what's missing from this equation?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-16750) Support change management for rename table/partition.

2017-07-03 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072681#comment-16072681
 ] 

Sankar Hariappan edited comment on HIVE-16750 at 7/3/17 4:42 PM:
-

Added 03.patch with a fix for the encryption_move_tbl.q test failure.
The other failures are unrelated to this change.



was (Author: sankarh):
Added 03.patch with a fix for the encryption_move_tbl.q test failure.
The other failures are unrelated to this change.

> Support change management for rename table/partition.
> -
>
> Key: HIVE-16750
> URL: https://issues.apache.org/jira/browse/HIVE-16750
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16750.01.patch, HIVE-16750.02.patch, 
> HIVE-16750.03.patch
>
>
> Currently, rename table/partition updates the data location by renaming the 
> directory which is equivalent to moving files to new path and delete old 
> path. So, this should trigger move of files into $CMROOT.
> Scenario:
> 1. Create a table (T1)
> 2. Insert a record
> 3. Rename the table(T1 -> T2)
> 4. Repl Dump till Insert.
> 5. Repl Load from the dump.
> 6. Target DB should have table T1 with the record.
> Similar scenario with rename partition as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16750) Support change management for rename table/partition.

2017-07-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16750:

Status: Patch Available  (was: Open)

> Support change management for rename table/partition.
> -
>
> Key: HIVE-16750
> URL: https://issues.apache.org/jira/browse/HIVE-16750
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16750.01.patch, HIVE-16750.02.patch, 
> HIVE-16750.03.patch
>
>
> Currently, rename table/partition updates the data location by renaming the 
> directory which is equivalent to moving files to new path and delete old 
> path. So, this should trigger move of files into $CMROOT.
> Scenario:
> 1. Create a table (T1)
> 2. Insert a record
> 3. Rename the table(T1 -> T2)
> 4. Repl Dump till Insert.
> 5. Repl Load from the dump.
> 6. Target DB should have table T1 with the record.
> Similar scenario with rename partition as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16750) Support change management for rename table/partition.

2017-07-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16750:

Attachment: HIVE-16750.03.patch

Added 03.patch with a fix for the encryption_move_tbl.q test failure.
The other failures are unrelated to this change.

> Support change management for rename table/partition.
> -
>
> Key: HIVE-16750
> URL: https://issues.apache.org/jira/browse/HIVE-16750
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16750.01.patch, HIVE-16750.02.patch, 
> HIVE-16750.03.patch
>
>
> Currently, rename table/partition updates the data location by renaming the 
> directory which is equivalent to moving files to new path and delete old 
> path. So, this should trigger move of files into $CMROOT.
> Scenario:
> 1. Create a table (T1)
> 2. Insert a record
> 3. Rename the table(T1 -> T2)
> 4. Repl Dump till Insert.
> 5. Repl Load from the dump.
> 6. Target DB should have table T1 with the record.
> Similar scenario with rename partition as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16750) Support change management for rename table/partition.

2017-07-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16750:

Status: Open  (was: Patch Available)

> Support change management for rename table/partition.
> -
>
> Key: HIVE-16750
> URL: https://issues.apache.org/jira/browse/HIVE-16750
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16750.01.patch, HIVE-16750.02.patch
>
>
> Currently, rename table/partition updates the data location by renaming the 
> directory which is equivalent to moving files to new path and delete old 
> path. So, this should trigger move of files into $CMROOT.
> Scenario:
> 1. Create a table (T1)
> 2. Insert a record
> 3. Rename the table(T1 -> T2)
> 4. Repl Dump till Insert.
> 5. Repl Load from the dump.
> 6. Target DB should have table T1 with the record.
> Similar scenario with rename partition as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16750) Support change management for rename table/partition.

2017-07-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16750:

Status: Patch Available  (was: Open)

> Support change management for rename table/partition.
> -
>
> Key: HIVE-16750
> URL: https://issues.apache.org/jira/browse/HIVE-16750
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16750.01.patch, HIVE-16750.02.patch
>
>
> Currently, rename table/partition updates the data location by renaming the 
> directory which is equivalent to moving files to new path and delete old 
> path. So, this should trigger move of files into $CMROOT.
> Scenario:
> 1. Create a table (T1)
> 2. Insert a record
> 3. Rename the table(T1 -> T2)
> 4. Repl Dump till Insert.
> 5. Repl Load from the dump.
> 6. Target DB should have table T1 with the record.
> Similar scenario with rename partition as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16750) Support change management for rename table/partition.

2017-07-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16750:

Status: Open  (was: Patch Available)

> Support change management for rename table/partition.
> -
>
> Key: HIVE-16750
> URL: https://issues.apache.org/jira/browse/HIVE-16750
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16750.01.patch, HIVE-16750.02.patch
>
>
> Currently, rename table/partition updates the data location by renaming the 
> directory which is equivalent to moving files to new path and delete old 
> path. So, this should trigger move of files into $CMROOT.
> Scenario:
> 1. Create a table (T1)
> 2. Insert a record
> 3. Rename the table(T1 -> T2)
> 4. Repl Dump till Insert.
> 5. Repl Load from the dump.
> 6. Target DB should have table T1 with the record.
> Similar scenario with rename partition as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17011) PlanUtils.getExprList() causes memory pressure when invoked from SharedWorkOptimizer on queries with lots of union operators

2017-07-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072446#comment-16072446
 ] 

Hive QA commented on HIVE-17011:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875497/HIVE-17011.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10830 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5873/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5873/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5873/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875497 - PreCommit-HIVE-Build

> PlanUtils.getExprList() causes memory pressure when invoked from 
> SharedWorkOptimizer on queries with lots of union operators
> 
>
> Key: HIVE-17011
> URL: https://issues.apache.org/jira/browse/HIVE-17011
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: CPU_usage.png, HIVE-17011.1.patch, HIVE-17011.3.patch, 
> Memory_usage.png
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17011) PlanUtils.getExprList() causes memory pressure when invoked from SharedWorkOptimizer on queries with lots of union operators

2017-07-03 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-17011:

Attachment: HIVE-17011.3.patch

Thanks [~jcamachorodriguez]. Attaching the .3 patch with an additional method 
{{getOperatorSignature}}. This patch resets the cached signature in the 
setters.

> PlanUtils.getExprList() causes memory pressure when invoked from 
> SharedWorkOptimizer on queries with lots of union operators
> 
>
> Key: HIVE-17011
> URL: https://issues.apache.org/jira/browse/HIVE-17011
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: CPU_usage.png, HIVE-17011.1.patch, HIVE-17011.3.patch, 
> Memory_usage.png
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17011) PlanUtils.getExprList() causes memory pressure when invoked from SharedWorkOptimizer on queries with lots of union operators

2017-07-03 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072357#comment-16072357
 ] 

Rajesh Balamohan edited comment on HIVE-17011 at 7/3/17 12:37 PM:
--

Thanks [~jcamachorodriguez]. Attaching the .3 patch with an additional method 
{{getOperatorSignature}}. This patch resets the cached signature in the 
setters.


was (Author: rajesh.balamohan):
Thanks [~jcamachorodriguez]. Attaching the .2 patch with an additional method 
{{getOperatorSignature}}. This patch resets the cached signature in the 
setters.

> PlanUtils.getExprList() causes memory pressure when invoked from 
> SharedWorkOptimizer on queries with lots of union operators
> 
>
> Key: HIVE-17011
> URL: https://issues.apache.org/jira/browse/HIVE-17011
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: CPU_usage.png, HIVE-17011.1.patch, HIVE-17011.3.patch, 
> Memory_usage.png
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16750) Support change management for rename table/partition.

2017-07-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072344#comment-16072344
 ] 

Hive QA commented on HIVE-16750:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875481/HIVE-16750.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10832 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=237)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=237)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5872/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5872/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5872/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875481 - PreCommit-HIVE-Build

> Support change management for rename table/partition.
> -
>
> Key: HIVE-16750
> URL: https://issues.apache.org/jira/browse/HIVE-16750
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16750.01.patch, HIVE-16750.02.patch
>
>
> Currently, rename table/partition updates the data location by renaming the 
> directory which is equivalent to moving files to new path and delete old 
> path. So, this should trigger move of files into $CMROOT.
> Scenario:
> 1. Create a table (T1)
> 2. Insert a record
> 3. Rename the table(T1 -> T2)
> 4. Repl Dump till Insert.
> 5. Repl Load from the dump.
> 6. Target DB should have table T1 with the record.
> Similar scenario with rename partition as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16750) Support change management for rename table/partition.

2017-07-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16750:

Attachment: HIVE-16750.02.patch

Added 02.patch with the changes below:
- Addressed Anishek's comments.
- Removed the addFile method, as it duplicates the copy behaviour of recycle.

Requesting [~anishek] to review the updated patch!

> Support change management for rename table/partition.
> -
>
> Key: HIVE-16750
> URL: https://issues.apache.org/jira/browse/HIVE-16750
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16750.01.patch, HIVE-16750.02.patch
>
>
> Currently, rename table/partition updates the data location by renaming the 
> directory which is equivalent to moving files to new path and delete old 
> path. So, this should trigger move of files into $CMROOT.
> Scenario:
> 1. Create a table (T1)
> 2. Insert a record
> 3. Rename the table(T1 -> T2)
> 4. Repl Dump till Insert.
> 5. Repl Load from the dump.
> 6. Target DB should have table T1 with the record.
> Similar scenario with rename partition as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16750) Support change management for rename table/partition.

2017-07-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16750:

Status: Patch Available  (was: Open)

> Support change management for rename table/partition.
> -
>
> Key: HIVE-16750
> URL: https://issues.apache.org/jira/browse/HIVE-16750
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16750.01.patch, HIVE-16750.02.patch
>
>
> Currently, rename table/partition updates the data location by renaming the 
> directory which is equivalent to moving files to new path and delete old 
> path. So, this should trigger move of files into $CMROOT.
> Scenario:
> 1. Create a table (T1)
> 2. Insert a record
> 3. Rename the table(T1 -> T2)
> 4. Repl Dump till Insert.
> 5. Repl Load from the dump.
> 6. Target DB should have table T1 with the record.
> Similar scenario with rename partition as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16750) Support change management for rename table/partition.

2017-07-03 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16750:

Status: Open  (was: Patch Available)

> Support change management for rename table/partition.
> -
>
> Key: HIVE-16750
> URL: https://issues.apache.org/jira/browse/HIVE-16750
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16750.01.patch
>
>
> Currently, rename table/partition updates the data location by renaming the 
> directory which is equivalent to moving files to new path and delete old 
> path. So, this should trigger move of files into $CMROOT.
> Scenario:
> 1. Create a table (T1)
> 2. Insert a record
> 3. Rename the table(T1 -> T2)
> 4. Repl Dump till Insert.
> 5. Repl Load from the dump.
> 6. Target DB should have table T1 with the record.
> Similar scenario with rename partition as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16983) getFileStatus on accessible s3a://[bucket-name]/folder: throws com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error

2017-07-03 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072239#comment-16072239
 ] 

Steve Loughran commented on HIVE-16983:
---

Clearly, somehow, your credentials aren't getting picked up. One problem here 
is that the S3A code can't log what's going on in any detail for security 
reasons (logging secrets is considered harmful), so I'm not sure what can be 
done here.

> getFileStatus on accessible s3a://[bucket-name]/folder: throws 
> com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon 
> S3; Status Code: 403; Error Code: 403 Forbidden;
> -
>
> Key: HIVE-16983
> URL: https://issues.apache.org/jira/browse/HIVE-16983
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.1
> Environment: Hive 2.1.1 on Ubuntu 14.04 AMI in AWS EC2, connecting to 
> S3 using s3a:// protocol
>Reporter: Alex Baretto
>
> I've followed various published documentation on integrating Apache Hive 
> 2.1.1 with AWS S3 using the `s3a://` scheme, configuring `fs.s3a.access.key` 
> and 
> `fs.s3a.secret.key` for `hadoop/etc/hadoop/core-site.xml` and 
> `hive/conf/hive-site.xml`.
> I am at the point where I am able to get `hdfs dfs -ls s3a://[bucket-name]/` 
> to work properly (it returns s3 ls of that bucket). So I know my creds, 
> bucket access, and overall Hadoop setup are valid. 
> hdfs dfs -ls s3a://[bucket-name]/
> 
> drwxrwxrwx   - hdfs hdfs  0 2017-06-27 22:43 
> s3a://[bucket-name]/files
> ...etc. 
> hdfs dfs -ls s3a://[bucket-name]/files
> 
> drwxrwxrwx   - hdfs hdfs  0 2017-06-27 22:43 
> s3a://[bucket-name]/files/my-csv.csv
> However, when I attempt to access the same s3 resources from hive, e.g. run 
> any `CREATE SCHEMA` or `CREATE EXTERNAL TABLE` statements using `LOCATION 
> 's3a://[bucket-name]/files/'`, it fails. 
> for example:
> >CREATE EXTERNAL TABLE IF NOT EXISTS mydb.my_table ( my_table_id string, 
> >my_tstamp timestamp, my_sig bigint ) ROW FORMAT DELIMITED FIELDS TERMINATED 
> >BY ',' LOCATION 's3a://[bucket-name]/files/';
> I keep getting this error:
> >FAILED: Execution Error, return code 1 from 
> >org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: 
> >java.nio.file.AccessDeniedException s3a://[bucket-name]/files: getFileStatus 
> >on s3a://[bucket-name]/files: 
> >com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: 
> >Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: 
> >C9CF3F9C50EF08D1), S3 Extended Request ID: 
> >T2xZ87REKvhkvzf+hdPTOh7CA7paRpIp6IrMWnDqNFfDWerkZuAIgBpvxilv6USD0RSxM9ymM6I=)
> This makes no sense. I have access to the bucket as one can see in the hdfs 
> test. And I've added the proper creds to hive-site.xml. 
> Anyone have any idea what's missing from this equation?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-9012) Not able to move and populate the data fully on to the table when the scratch directory is on S3

2017-07-03 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072238#comment-16072238
 ] 

Steve Loughran commented on HIVE-9012:
--

This is just rename() being emulated in S3 with a copy-and-delete.
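
For readers less familiar with object stores, a rough sketch of what that 
emulation amounts to, using the public Hadoop FileSystem/FileUtil API 
(illustrative only; the real S3A code is considerably more involved):

{code:title=RenameAsCopyDelete.java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class RenameAsCopyDelete {
  // S3 has no server-side directory rename, so a "rename" degrades to
  // copying every object under src to dst and then deleting the source.
  // For ~500GB of output that is a full data copy, not a metadata update,
  // which is why the final move appears to hang.
  public static void rename(FileSystem fs, Path src, Path dst,
                            Configuration conf) throws Exception {
    FileUtil.copy(fs, src, fs, dst, true /* deleteSource */, conf);
  }
}
{code}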

> Not able to move and populate the data fully on to the table when the scratch 
> directory is on S3
> 
>
> Key: HIVE-9012
> URL: https://issues.apache.org/jira/browse/HIVE-9012
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.13.1
> Environment: Amazon AMI and S3 as storage service
>Reporter: Kolluru Som Shekhar Sharma
>Priority: Blocker
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> I have set hive.exec.scratchDir to point to a directory on S3, and the 
> external table is also on S3. 
> I ran a simple query which extracts key-value pairs from a JSON string 
> without any WHERE clause, and the amount of data is ~500GB. The query ran 
> fine, but when it tries to move the data from the scratch directory it 
> doesn't complete, so I need to kill the process and move the data manually.
> The data size in the scratch directory was nearly ~550GB.
> I tried the same scenario with less data and a WHERE clause; it completed 
> successfully and the data was populated in the table. I checked the size in 
> the table and in the scratch directory: the data in the table was showing 
> 2MB and the data in the scratch directory was 48.6GB.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17011) PlanUtils.getExprList() causes memory pressure when invoked from SharedWorkOptimizer on queries with lots of union operators

2017-07-03 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072220#comment-16072220
 ] 

Jesus Camacho Rodriguez commented on HIVE-17011:


[~rajesh.balamohan], thanks for the patch. The idea is good; maybe we could 
add invalidation of the cached expressions when we call the setter methods in 
those descriptors?
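
As a purely illustrative sketch of that suggestion (hypothetical class and 
field names, not the actual descriptor code), the signature could be computed 
lazily and dropped whenever a setter mutates the descriptor:

{code:title=CachingDesc.java}
import java.util.ArrayList;
import java.util.List;

public class CachingDesc {
  private List<String> exprs = new ArrayList<>();
  private String cachedSignature; // null means "needs recomputing"

  public void setExprs(List<String> newExprs) {
    this.exprs = newExprs;
    this.cachedSignature = null; // invalidate the cache on mutation
  }

  public String getOperatorSignature() {
    if (cachedSignature == null) {
      // The expensive computation runs once per mutation, not per lookup.
      cachedSignature = String.join(",", exprs);
    }
    return cachedSignature;
  }
}
{code}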

> PlanUtils.getExprList() causes memory pressure when invoked from 
> SharedWorkOptimizer on queries with lots of union operators
> 
>
> Key: HIVE-17011
> URL: https://issues.apache.org/jira/browse/HIVE-17011
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: CPU_usage.png, HIVE-17011.1.patch, Memory_usage.png
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-13882) When hive.server2.async.exec.async.compile is turned on, from JDBC we will get "The query did not generate a result set"

2017-07-03 Thread Hitoshi Tsuda (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072219#comment-16072219
 ] 

Hitoshi Tsuda commented on HIVE-13882:
--

It seems that this fix hasn't been committed back to 2.1.

> When hive.server2.async.exec.async.compile is turned on, from JDBC we will 
> get "The query did not generate a result set" 
> -
>
> Key: HIVE-13882
> URL: https://issues.apache.org/jira/browse/HIVE-13882
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-13882.1.patch, HIVE-13882.2.patch
>
>
>  The following would fail with  "The query did not generate a result set"
> stmt.execute("SET hive.driver.parallel.compilation=true");
> stmt.execute("SET hive.server2.async.exec.async.compile=true");
> ResultSet res =  stmt.executeQuery("SELECT * FROM " + tableName);
> res.next();
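
For context, a self-contained version of the failing sequence might look like 
the sketch below; the JDBC URL and table name are placeholders.

{code:title=AsyncCompileRepro.java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class AsyncCompileRepro {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement()) {
      stmt.execute("SET hive.driver.parallel.compilation=true");
      stmt.execute("SET hive.server2.async.exec.async.compile=true");
      // Before the fix, this threw "The query did not generate a result set".
      try (ResultSet res = stmt.executeQuery("SELECT * FROM t1")) {
        res.next();
      }
    }
  }
}
{code}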



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15883) HBase mapped table in Hive insert fail for decimal

2017-07-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072211#comment-16072211
 ] 

Hive QA commented on HIVE-15883:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12852182/HIVE-15883.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10830 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5871/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5871/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5871/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12852182 - PreCommit-HIVE-Build

> HBase mapped table in Hive insert fail for decimal
> --
>
> Key: HIVE-15883
> URL: https://issues.apache.org/jira/browse/HIVE-15883
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-15883.patch
>
>
> CREATE TABLE hbase_table (
> id int,
> balance decimal(15,2))
> ROW FORMAT DELIMITED
> COLLECTION ITEMS TERMINATED BY '~'
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> "hbase.columns.mapping"=":key,cf:balance#b");
> insert into hbase_table values (1,1);
> 
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"tmp_values_col1":"1","tmp_values_col2":"1"}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1783)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"tmp_values_col1":"1","tmp_values_col2":"1"}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
> ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.RuntimeException: 
> Hive internal error.
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:733)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:97)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
> ... 9 more
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: 
> java.lang.RuntimeException: Hive internal error.
> at 
> 

[jira] [Commented] (HIVE-15883) HBase mapped table in Hive insert fail for decimal

2017-07-03 Thread Artur Tamazian (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072145#comment-16072145
 ] 

Artur Tamazian commented on HIVE-15883:
---

Possibly related issue: https://issues.apache.org/jira/browse/HIVE-17002

> HBase mapped table in Hive insert fail for decimal
> --
>
> Key: HIVE-15883
> URL: https://issues.apache.org/jira/browse/HIVE-15883
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-15883.patch
>
>
> CREATE TABLE hbase_table (
> id int,
> balance decimal(15,2))
> ROW FORMAT DELIMITED
> COLLECTION ITEMS TERMINATED BY '~'
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> "hbase.columns.mapping"=":key,cf:balance#b");
> insert into hbase_table values (1,1);
> 
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"tmp_values_col1":"1","tmp_values_col2":"1"}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1783)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"tmp_values_col1":"1","tmp_values_col2":"1"}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
> ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.RuntimeException: 
> Hive internal error.
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:733)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:97)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
> ... 9 more
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: 
> java.lang.RuntimeException: Hive internal error.
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:286)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:668)
> ... 15 more
> Caused by: java.lang.RuntimeException: Hive internal error.
> at 
> org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitive(LazyUtils.java:328)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:220)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serializeField(HBaseRowSerializer.java:194)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:118)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:282)
> ... 16 more 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16982) WebUI "Show Query" tab prints "UNKNOWN" instead of explaining configuration option

2017-07-03 Thread Karen Coppage (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072136#comment-16072136
 ] 

Karen Coppage commented on HIVE-16982:
--

[~prongs]
Hi, this patch clarifies to the user that the configuration 
{noformat}hive.log.explain.output{noformat} is now also responsible for 
determining whether the WebUI displays query information, because of your 
patch HIVE-13500, where 
{noformat}
if (conf.getBoolVar(ConfVars.HIVE_LOG_EXPLAIN_OUTPUT) ||
   conf.isWebUiQueryInfoCacheEnabled()) {
{noformat}
became
{noformat}
if (conf.getBoolVar(ConfVars.HIVE_LOG_EXPLAIN_OUTPUT)) {
{noformat}
Would you mind taking a look at this?

> WebUI "Show Query" tab prints "UNKNOWN" instead of explaining configuration 
> option
> --
>
> Key: HIVE-16982
> URL: https://issues.apache.org/jira/browse/HIVE-16982
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration, Web UI
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Minor
>  Labels: newbie, patch
> Attachments: HIVE-16982.3.patch
>
>
> In the Hive WebUI / Drilldown: the Show Query tab always displays "UNKNOWN."
> If the user wants to see the query plan here, they should set configuration 
> hive.log.explain.output to true. The user should be made aware of this option:
> 1) in WebUI / Drilldown / Show Query and
> 2) in HiveConf.java, line 2232.
> This configuration's description reads:
> "Whether to log explain output for every query
> When enabled, will log EXPLAIN EXTENDED output for the query at INFO log4j 
> log level."
> this should be added:
> "...and in the WebUI / Show Query tab."



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17011) PlanUtils.getExprList() causes memory pressure when invoked from SharedWorkOptimizer on queries with lots of union operators

2017-07-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072120#comment-16072120
 ] 

Hive QA commented on HIVE-17011:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875453/HIVE-17011.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10830 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5870/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5870/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5870/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875453 - PreCommit-HIVE-Build

> PlanUtils.getExprList() causes memory pressure when invoked from 
> SharedWorkOptimizer on queries with lots of union operators
> 
>
> Key: HIVE-17011
> URL: https://issues.apache.org/jira/browse/HIVE-17011
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: CPU_usage.png, HIVE-17011.1.patch, Memory_usage.png
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16222) add a setting to disable row.serde for specific formats; enable for others

2017-07-03 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072092#comment-16072092
 ] 

Lefty Leverenz commented on HIVE-16222:
---

[~sershe], I don't see this commit in email or github.

> add a setting to disable row.serde for specific formats; enable for others
> --
>
> Key: HIVE-16222
> URL: https://issues.apache.org/jira/browse/HIVE-16222
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-16222.01.patch, HIVE-16222.02.patch, 
> HIVE-16222.03.patch, HIVE-16222.04.patch, HIVE-16222.05.patch, 
> HIVE-16222.patch
>
>
> Per [~gopalv]
> {quote}
> row.serde = true ... breaks Parquet (they expect to get the same object back, 
> which means you can't buffer 1024 rows).
> {quote}
> We want to enable this and vector.serde for text vectorization. Need to turn 
> it off for specific formats.
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16893) move replication dump related work in semantic analysis phase to execution phase using a task

2017-07-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16072016#comment-16072016
 ] 

Hive QA commented on HIVE-16893:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875447/HIVE-16893.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10830 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=237)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat5]
 (batchId=3)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5869/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5869/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5869/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875447 - PreCommit-HIVE-Build

> move replication dump related work in semantic analysis phase to execution 
> phase using a task
> -
>
> Key: HIVE-16893
> URL: https://issues.apache.org/jira/browse/HIVE-16893
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-16893.2.patch
>
>
> Since we run into the possibility of creating a large number of tasks during 
> a replication bootstrap dump:
> * we may not be able to hold all of them in memory for really large 
> databases (which might no longer hold true once we complete HIVE-16892), and
> * a compile-time lock is taken such that only one query runs in this phase, 
> which in the replication bootstrap scenario is going to be a very 
> long-running task; moving it to the execution phase will limit how long the 
> lock is held during compilation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16950) Dropping hive database/table which was created explicitly in default database location, deletes all databases data from default database location

2017-07-03 Thread Bing Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071999#comment-16071999
 ] 

Bing Li commented on HIVE-16950:


From the description, the requirement is more like an EXTERNAL database, which 
has NOT been supported by Hive yet.

But I think we could add a check during CREATE/DROP DATABASE to avoid this 
issue.
There are two ways to do this:
1. Throw an error when the target location on HDFS already exists (a sketch 
follows below).
An existing empty directory should be invalid as well, because currently Hive 
allows creating two databases with the same location.
2. ONLY drop the tables belonging to the target database.
For this, we would have to fetch all the tables under the database when DROP 
DATABASE is invoked, but that would affect the performance of the DROP 
statement.

I prefer #1. [~ashutoshc], any comments on this? Thank you.
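
A minimal sketch of option #1, using the Hadoop FileSystem API (the class and 
method names are illustrative, not the actual metastore code):

{code:title=LocationCheck.java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LocationCheck {
  // Reject CREATE DATABASE when the target location already exists,
  // even as an empty directory, so two databases can never share a path.
  public static void validateNewDbLocation(String location) throws Exception {
    Path path = new Path(location);
    FileSystem fs = path.getFileSystem(new Configuration());
    if (fs.exists(path)) {
      throw new IllegalArgumentException(
          "Database location already exists: " + location);
    }
  }
}
{code}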


> Dropping hive database/table which was created explicitly in default database 
> location, deletes all databases data from default database location
> -
>
> Key: HIVE-16950
> URL: https://issues.apache.org/jira/browse/HIVE-16950
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Rahul Kalgunde
>Assignee: Bing Li
>Priority: Minor
>
> When database/table is created explicitly pointing to the default location, 
> dropping the database/table deletes all the data associated with the all 
> databases/tables.
> Steps to replicate: 
> in below e.g. dropping table test_db2 also deletes data of test_db1 where as 
> metastore still contains test_db1
> hive> create database test_db1;
> OK
> Time taken: 4.858 seconds
> hive> describe database test_db1;
> OK
> test_db1
> hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/test_db1.db root  
>   USER
> Time taken: 0.599 seconds, Fetched: 1 row(s)
> hive> create database test_db2 location '/apps/hive/warehouse' ;
> OK
> Time taken: 1.457 seconds
> hive> describe database test_db2;
> OK
> test_db2
> hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse rootUSER
> Time taken: 0.582 seconds, Fetched: 1 row(s)
> hive> drop database test_db2;
> OK
> Time taken: 1.317 seconds
> hive> dfs -ls /apps/hive/warehouse;
> ls: `/apps/hive/warehouse': No such file or directory
> Command failed with exit code = 1
> Query returned non-zero code: 1, cause: null
> hive> describe database test_db1;
> OK
> test_db1
> hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/test_db1.db root  
>   USER
> Time taken: 0.629 seconds, Fetched: 1 row(s)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16962) Better error msg for Hive on Spark in case user cancels query and closes session

2017-07-03 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071993#comment-16071993
 ] 

Lefty Leverenz commented on HIVE-16962:
---

[~xuefuz], you committed this to master but marked it fixed in 2.2.0 -- will 
there be another commit to 2.2.0?

> Better error msg for Hive on Spark in case user cancels query and closes 
> session
> 
>
> Key: HIVE-16962
> URL: https://issues.apache.org/jira/browse/HIVE-16962
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Fix For: 2.2.0
>
> Attachments: HIVE-16962.2.patch, HIVE-16962.patch, HIVE-16962.patch
>
>
> In case a user cancels a query and closes the session, Hive marks the query 
> as failed. However, the error message is a little confusing. It still says:
> {quote}
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create spark 
> client. This is likely because the queue you assigned to does not have free 
> resource at the moment to start the job. Please check your queue usage and 
> try the query again later.
> {quote}
> followed by an InterruptedException.
> Ideally, the error should clearly indicate that the user cancelled the 
> execution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17012) ACID Table: Number of reduce tasks should be computed correctly when sort.dynamic.partition is enabled

2017-07-03 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071990#comment-16071990
 ] 

Rajesh Balamohan commented on HIVE-17012:
-

ReducerTraits would be FIXED for ACID tables with buckets. 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java#L102
 prevents the number of reducer tasks from being computed for Reducer 3.

> ACID Table: Number of reduce tasks should be computed correctly when 
> sort.dynamic.partition is enabled
> --
>
> Key: HIVE-17012
> URL: https://issues.apache.org/jira/browse/HIVE-17012
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
> Attachments: plan.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17012) ACID Table: Number of reduce tasks should be computed correctly when sort.dynamic.partition is enabled

2017-07-03 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-17012:

Attachment: plan.txt

> ACID Table: Number of reduce tasks should be computed correctly when 
> sort.dynamic.partition is enabled
> --
>
> Key: HIVE-17012
> URL: https://issues.apache.org/jira/browse/HIVE-17012
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
> Attachments: plan.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16982) WebUI "Show Query" tab prints "UNKNOWN" instead of explaining configuration option

2017-07-03 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071987#comment-16071987
 ] 

Lefty Leverenz commented on HIVE-16982:
---

The new description of *hive.log.explain.output* looks good to me.

> WebUI "Show Query" tab prints "UNKNOWN" instead of explaining configuration 
> option
> --
>
> Key: HIVE-16982
> URL: https://issues.apache.org/jira/browse/HIVE-16982
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration, Web UI
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Minor
>  Labels: newbie, patch
> Attachments: HIVE-16982.3.patch
>
>
> In the Hive WebUI / Drilldown: the Show Query tab always displays "UNKNOWN."
> If the user wants to see the query plan here, they should set the configuration 
> property hive.log.explain.output to true. The user should be made aware of this option:
> 1) in WebUI / Drilldown / Show Query and
> 2) in HiveConf.java, line 2232.
> This configuration's description reads:
> "Whether to log explain output for every query
> When enabled, will log EXPLAIN EXTENDED output for the query at INFO log4j 
> log level."
> This should be added:
> "...and in the WebUI / Show Query tab."



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2017-07-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071981#comment-16071981
 ] 

Hive QA commented on HIVE-14797:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833869/HIVE-14797.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10830 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=237)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5868/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5868/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5868/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833869 - PreCommit-HIVE-Build

> reducer number estimating may lead to data skew
> ---
>
> Key: HIVE-14797
> URL: https://issues.apache.org/jira/browse/HIVE-14797
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: roncenzhao
>Assignee: roncenzhao
>  Labels: breaking_change
> Attachments: HIVE-14797.2.patch, HIVE-14797.3.patch, 
> HIVE-14797.4.patch, HIVE-14797.patch
>
>
> HiveKey's hash code is generated by combining the field hashes one key at a 
> time, multiplying by 31 at each step, as implemented in 
> `ObjectInspectorUtils.getBucketHashCode()`:
> for (int i = 0; i < bucketFields.length; i++) {
>   int fieldHash = ObjectInspectorUtils.hashCode(bucketFields[i], 
> bucketFieldInspectors[i]);
>   hashCode = 31 * hashCode + fieldHash;
> }
> The following example will lead to data skew:
> I have two tables called tbl1 and tbl2 with the same columns: a int, b 
> string. The values of column 'a' are not skewed in either table, but the 
> values of column 'b' are skewed in both.
> When my SQL is "select * from tbl1 join tbl2 on tbl1.a=tbl2.a and 
> tbl1.b=tbl2.b" and the estimated reducer number is 31, it will lead to data 
> skew.
> As we know, the HiveKey's hash code is generated by `hash(a)*31 + hash(b)`. 
> When the reducer number is 31, `hash(a)*31 % 31` is always 0, so the reducer 
> for each row is determined by `hash(b)%31` alone. As a result, the job will 
> be skewed.
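
To make the arithmetic concrete, a standalone sketch (plain Java, not Hive's 
actual ObjectInspectorUtils) showing why 31 reducers collapse the key space:

{code}
public class HashSkewDemo {
    public static void main(String[] args) {
        int reducers = 31;
        // Combine the field hashes the way getBucketHashCode() does:
        // hashCode = 31 * hashCode + fieldHash, field by field.
        for (int a = 0; a < 4; a++) {      // 'a' varies and is not skewed
            int b = 7;                     // 'b' is skewed: one hot value
            int hashCode = 31 * a + b;     // hash(a)*31 + hash(b)
            // 31 * hash(a) % 31 == 0, so with exactly 31 reducers the
            // reducer index depends on hash(b) alone.
            int reducer = (hashCode & Integer.MAX_VALUE) % reducers;
            System.out.println("a=" + a + ", b=" + b + " -> reducer " + reducer);
        }
        // Prints "reducer 7" four times: every row with the hot 'b' value
        // lands on the same reducer, no matter what 'a' is.
    }
}
{code}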



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-6348) Order by/Sort by in subquery

2017-07-03 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071980#comment-16071980
 ] 

Lefty Leverenz commented on HIVE-6348:
--

Doc note:  This adds *hive.remove.orderby.in.subquery* to HiveConf.java, so it 
needs to be documented in Configuration Properties.

* [Configuration Properties -- Query Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]

Do we also need general documentation?  If so, should it go in the Subqueries 
doc or the SortBy doc?

* [Subqueries | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries]
* [Sort By / Order By / etc. | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy]

Added a TODOC3.0 label.

> Order by/Sort by in subquery
> 
>
> Key: HIVE-6348
> URL: https://issues.apache.org/jira/browse/HIVE-6348
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Rui Li
>Priority: Minor
>  Labels: TODOC3.0, sub-query
> Fix For: 3.0.0
>
> Attachments: HIVE-6348.1.patch, HIVE-6348.2.patch, HIVE-6348.3.patch, 
> HIVE-6348.4.patch
>
>
> select * from (select * from foo order by c asc) bar order by c desc;
> in Hive sorts the data set twice. The optimizer should probably remove any 
> order by/sort by in the subquery unless you use 'limit '. We could even go so 
> far as to bar it at the semantic level.
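
The proposed rule is easy to state precisely. A hypothetical sketch with 
invented names (not Hive's actual optimizer code):

{code}
public class OrderByRemovalSketch {
    // Hypothetical: an inner ORDER BY / SORT BY can be dropped when the
    // subquery has no LIMIT, because without a LIMIT the inner ordering
    // cannot change which rows the outer query sees.
    static boolean canDropInnerOrderBy(boolean innerHasLimit, boolean removeEnabled) {
        return removeEnabled && !innerHasLimit;
    }

    public static void main(String[] args) {
        System.out.println(canDropInnerOrderBy(false, true)); // true: drop it
        System.out.println(canDropInnerOrderBy(true, true));  // false: LIMIT
                                                              // makes order matter
    }
}
{code}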



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-6348) Order by/Sort by in subquery

2017-07-03 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-6348:
-
Labels: TODOC3.0 sub-query  (was: sub-query)

> Order by/Sort by in subquery
> 
>
> Key: HIVE-6348
> URL: https://issues.apache.org/jira/browse/HIVE-6348
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Rui Li
>Priority: Minor
>  Labels: TODOC3.0, sub-query
> Fix For: 3.0.0
>
> Attachments: HIVE-6348.1.patch, HIVE-6348.2.patch, HIVE-6348.3.patch, 
> HIVE-6348.4.patch
>
>
> select * from (select * from foo order by c asc) bar order by c desc;
> in Hive sorts the data set twice. The optimizer should probably remove any 
> order by/sort by in the subquery unless you use 'limit '. We could even go so 
> far as to bar it at the semantic level.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16750) Support change management for rename table/partition.

2017-07-03 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071955#comment-16071955
 ] 

anishek commented on HIVE-16750:


Please also look at the failing tests for the commit patch with age of 1.

> Support change management for rename table/partition.
> -
>
> Key: HIVE-16750
> URL: https://issues.apache.org/jira/browse/HIVE-16750
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16750.01.patch
>
>
> Currently, renaming a table/partition updates the data location by renaming 
> the directory, which is equivalent to moving the files to a new path and 
> deleting the old path. So, this should trigger a move of the files into $CMROOT.
> Scenario:
> 1. Create a table (T1)
> 2. Insert a record
> 3. Rename the table(T1 -> T2)
> 4. Repl Dump till Insert.
> 5. Repl Load from the dump.
> 6. Target DB should have table T1 with the record.
> Similar scenario with rename partition as well.
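
A rough illustration of the idea using plain java.nio (hypothetical sketch, not 
Hive's actual change-management code): before the rename moves the directory, 
copy its files under the CM root so a later REPL DUMP can still find the data 
that the earlier events refer to.

{code}
import java.io.IOException;
import java.nio.file.*;

public class CmRenameSketch {
    // Hypothetical: preserve the files of the old location under cmRoot,
    // then perform the rename (the move-and-delete the description mentions).
    static void renameWithCm(Path oldDir, Path newDir, Path cmRoot) throws IOException {
        try (DirectoryStream<Path> files = Files.newDirectoryStream(oldDir)) {
            for (Path f : files) {
                Files.copy(f, cmRoot.resolve(f.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
        }
        Files.move(oldDir, newDir); // the actual rename
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("cm-demo");
        Path t1 = Files.createDirectory(base.resolve("t1"));
        Files.write(t1.resolve("part-0"), "a record".getBytes());
        Path cmRoot = Files.createDirectory(base.resolve("cmroot"));
        renameWithCm(t1, base.resolve("t2"), cmRoot);
        System.out.println(Files.exists(cmRoot.resolve("part-0"))); // true
    }
}
{code}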



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)