[jira] [Commented] (HIVE-15881) Use new thread count variable name instead of mapred.dfsclient.parallelism.max

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884127#comment-15884127
 ] 

Hive QA commented on HIVE-15881:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854267/HIVE-15881.5.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3774/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3774/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3774/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-02-25 07:26:29.532
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-3774/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-02-25 07:26:29.534
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 2f6f6bd HIVE-15951 : Make sure base persist directory is unique 
and deleted (Slim Bouguerra via Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 2f6f6bd HIVE-15951 : Make sure base persist directory is unique 
and deleted (Slim Bouguerra via Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-02-25 07:26:33.301
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: No such 
file or directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java: No such 
file or directory
error: a/ql/src/test/org/apache/hadoop/hive/ql/exec/TestUtilities.java: No such 
file or directory
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854267 - PreCommit-HIVE-Build

> Use new thread count variable name instead of mapred.dfsclient.parallelism.max
> --
>
> Key: HIVE-15881
> URL: https://issues.apache.org/jira/browse/HIVE-15881
> Project: Hive
>  Issue Type: Task
>  Components: Query Planning
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Minor
> Attachments: HIVE-15881.1.patch, HIVE-15881.2.patch, 
> HIVE-15881.3.patch, HIVE-15881.4.patch, HIVE-15881.5.patch
>
>
> The Utilities class has two methods, {{getInputSummary}} and 
> {{getInputPaths}}, that use the variable {{mapred.dfsclient.parallelism.max}} 
> to get the summary of a list of input locations in parallel. These methods 
> are Hive related, but the variable name does not look like it is specific to 
> Hive.
> Also, the above variable is not in HiveConf nor used anywhere else; I only 
> found a reference in the Hadoop MR1 code.
> I'd like to propose deprecating {{mapred.dfsclient.parallelism.max}} and 
> using a different variable name, such as 
> {{hive.get.input.listing.num.threads}}, that reflects the intention of the 
> variable. The removal of the old variable might happen in Hive 3.x.
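Not part of the JIRA text, but a minimal sketch of the deprecation pattern this proposal implies, using the plain Hadoop Configuration API; the new key name is the one proposed above, and the fallback behavior is an assumption, not the committed patch:

{noformat}
import org.apache.hadoop.conf.Configuration;

public class InputListingThreads {
  // Name proposed in the description above; the final patch may differ.
  private static final String NEW_KEY = "hive.get.input.listing.num.threads";
  // Old MR1-era name, kept only as a deprecated fallback.
  private static final String DEPRECATED_KEY = "mapred.dfsclient.parallelism.max";

  /** Prefer the Hive-specific variable; fall back to the deprecated name. */
  public static int getListingThreadCount(Configuration conf) {
    int threads = conf.getInt(NEW_KEY, 0);
    if (threads <= 0) {
      threads = conf.getInt(DEPRECATED_KEY, 0); // keep old configs working
    }
    return Math.max(threads, 1); // always use at least one thread
  }
}
{noformat}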



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15958) LLAP: IPC connections are not being reused for umbilical protocol

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884125#comment-15884125
 ] 

Hive QA commented on HIVE-15958:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854633/HIVE-15958.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10259 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_bitmap_auto_partitioned]
 (batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3773/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3773/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3773/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854633 - PreCommit-HIVE-Build

> LLAP: IPC connections are not being reused for umbilical protocol
> -
>
> Key: HIVE-15958
> URL: https://issues.apache.org/jira/browse/HIVE-15958
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-15958.1.patch, HIVE-15958.2.patch, 
> HIVE-15958.3.patch, HIVE-15958.4.patch, HIVE-15958.4.patch, HIVE-15958.5.patch
>
>
> During concurrency testing, observed thousands of IPC thread creations. 
> Ideally, connections to the same hosts should be reused.
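Not the actual patch, but a minimal sketch of the reuse pattern the report asks for: cache one client per address instead of opening a new IPC connection per task. {{UmbilicalClient}} is a hypothetical stand-in for the real umbilical proxy type.

{noformat}
import java.net.InetSocketAddress;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class UmbilicalClientCache {
  /** Hypothetical placeholder for the real IPC proxy type. */
  public static class UmbilicalClient {
    UmbilicalClient(InetSocketAddress addr) { /* open one IPC connection */ }
  }

  private final ConcurrentMap<InetSocketAddress, UmbilicalClient> cache =
      new ConcurrentHashMap<>();

  /** Reuse the client for a given AM address instead of creating a new one. */
  public UmbilicalClient getClient(InetSocketAddress addr) {
    return cache.computeIfAbsent(addr, UmbilicalClient::new);
  }
}
{noformat}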



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15708) Upgrade calcite version to 1.12

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884077#comment-15884077
 ] 

Hive QA commented on HIVE-15708:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854555/HIVE-15708.12.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10255 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_select] 
(batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_timestamp] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=2)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[materialized_view_create_rewrite_multi_db]
 (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reduce_deduplicate_extended2]
 (batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_reflect2] 
(batchId=14)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_ppd_decimal]
 (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=230)
org.apache.hive.jdbc.TestJdbcDriver2.testPrepareSetTimestamp (batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3772/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3772/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3772/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854555 - PreCommit-HIVE-Build

> Upgrade calcite version to 1.12
> ---
>
> Key: HIVE-15708
> URL: https://issues.apache.org/jira/browse/HIVE-15708
> Project: Hive
>  Issue Type: Task
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.2.0
>Reporter: Ashutosh Chauhan
>Assignee: Remus Rusanu
> Attachments: HIVE-15708.01.patch, HIVE-15708.02.patch, 
> HIVE-15708.03.patch, HIVE-15708.04.patch, HIVE-15708.05.patch, 
> HIVE-15708.06.patch, HIVE-15708.07.patch, HIVE-15708.08.patch, 
> HIVE-15708.09.patch, HIVE-15708.10.patch, HIVE-15708.11.patch, 
> HIVE-15708.12.patch
>
>
> Currently we are on 1.10; need to upgrade the Calcite version to 1.12.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15879) Fix HiveMetaStoreChecker.checkPartitionDirs method

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884045#comment-15884045
 ] 

Hive QA commented on HIVE-15879:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854556/HIVE-15879.04.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10265 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3771/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3771/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3771/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854556 - PreCommit-HIVE-Build

> Fix HiveMetaStoreChecker.checkPartitionDirs method
> --
>
> Key: HIVE-15879
> URL: https://issues.apache.org/jira/browse/HIVE-15879
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-15879.01.patch, HIVE-15879.02.patch, 
> HIVE-15879.03.patch, HIVE-15879.04.patch
>
>
> HIVE-15803 fixes the msck hang issue in the 
> HiveMetaStoreChecker.checkPartitionDirs method by adding a check to see if 
> the thread pool has any spare threads. If not, it uses single-threaded 
> listing of the files.
> {noformat}
> if (pool != null) {
>   synchronized (pool) {
>     // In case of recursive calls, it is possible to deadlock with TP.
>     // Check TP usage here.
>     if (pool.getActiveCount() < pool.getMaximumPoolSize()) {
>       useThreadPool = true;
>     }
>     if (!useThreadPool) {
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Not using threadPool as active count:" + pool.getActiveCount()
>             + ", max:" + pool.getMaximumPoolSize());
>       }
>     }
>   }
> }
> {noformat}
> Based on the javadoc of getActiveCount() below
> bq. Returns the approximate number of threads that are actively executing 
> tasks.
> it returns only an approximate number of threads, and it cannot be 
> guaranteed that it always returns the exact number of active threads. This 
> still exposes the method implementation to the msck hang bug in rare corner 
> cases.
> We could either:
> 1. Use an atomic counter to track exactly how many threads are actively 
> running, or
> 2. Rework the method itself to make it much simpler: e.g., look into the 
> possibility of changing the recursive implementation to an iterative one 
> where worker threads pick tasks from a queue until the queue is empty (see 
> the sketch below).
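A minimal sketch of option 2, combined with the exact counter from option 1: worker threads drain a shared queue of directories instead of recursing, so no thread ever blocks on work it submitted itself. {{listChildren}} is a hypothetical stand-in for the real FileSystem listing call.

{noformat}
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class IterativeDirWalk {
  /** Illustrative stand-in: real code would list the directory via FileSystem. */
  static List<String> listChildren(String dir) {
    return java.util.Collections.emptyList();
  }

  public static void walk(String root, int numThreads) throws InterruptedException {
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    AtomicInteger pending = new AtomicInteger(1); // dirs enqueued but not finished
    queue.add(root);
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    Runnable worker = () -> {
      try {
        while (pending.get() > 0) {
          String dir = queue.poll(50, TimeUnit.MILLISECONDS);
          if (dir == null) continue;      // nothing available right now; re-check
          for (String child : listChildren(dir)) {
            pending.incrementAndGet();    // count the child before publishing it
            queue.add(child);
          }
          pending.decrementAndGet();      // this directory is fully processed
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    };
    for (int i = 0; i < numThreads; i++) {
      pool.execute(worker);
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);
  }
}
{noformat}

Because {{pending}} is incremented before a child is published and decremented only after its parent is done, it can never drop to zero while work remains, avoiding the approximation problem of getActiveCount().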



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16013) Fragments without locality can stack up on nodes

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884025#comment-15884025
 ] 

Hive QA commented on HIVE-16013:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854554/HIVE-16013.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10261 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hive.service.server.TestHS2HttpServer.testContextRootUrlRewrite 
(batchId=186)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3770/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3770/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3770/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854554 - PreCommit-HIVE-Build

> Fragments without locality can stack up on nodes
> 
>
> Key: HIVE-16013
> URL: https://issues.apache.org/jira/browse/HIVE-16013
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Siddharth Seth
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16013.1.patch, HIVE-16013.2.patch
>
>
> When no locality information is provided, task requests can stack up on a 
> node because of consistent node selection. When locality information is not 
> provided, we should fall back to random selection for better work 
> distribution.
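A minimal sketch of the proposed fallback; {{selectHost}} is a hypothetical stand-in for the scheduler's host-selection step, ignoring capacity checks for brevity.

{noformat}
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class NodeSelector {
  /** Pick a random host when the request carries no locality information. */
  static String selectHost(List<String> requestedHosts, List<String> allHosts) {
    if (requestedHosts == null || requestedHosts.isEmpty()) {
      // No locality: random selection spreads fragments across the cluster
      // instead of consistently stacking them on the same node.
      return allHosts.get(ThreadLocalRandom.current().nextInt(allHosts.size()));
    }
    return requestedHosts.get(0); // prefer the requested (local) host
  }
}
{noformat}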



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator

2017-02-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15844:
--
Attachment: HIVE-15844.02.patch

> Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator
> 
>
> Key: HIVE-15844
> URL: https://issues.apache.org/jira/browse/HIVE-15844
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.0.0
>
> Attachments: HIVE-15844.01.patch, HIVE-15844.02.patch
>
>
> # Both FileSinkDesc and ReduceSinkDesc have a special code path for 
> Update/Delete operations. The write type is not always set correctly for 
> ReduceSink; ReduceSinkDeDuplication is one place where it gets lost. Even 
> when it isn't set correctly, elsewhere we set ROW_ID to be the partition 
> column of the ReduceSinkOperator, and UDFToInteger special-cases it to 
> extract the bucketId from ROW_ID. We need to modify Explain Plan to record 
> the Write Type (i.e. insert/update/delete) to make sure we have tests that 
> can catch errors here.
> # Add some validation at the end of the plan to make sure that RSO/FSO which 
> represent the end of the pipeline and write to an acid table have WriteType 
> set (to something other than the default); a sketch follows below.
> # We don't seem to have any tests where the number of buckets is > the number 
> of reducers. Add those.
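A minimal sketch of the validation suggested in item 2; {{SinkDesc}} and its accessors are hypothetical stand-ins for the real ReduceSink/FileSink descriptors.

{noformat}
import java.util.List;

public class WriteTypeValidator {
  enum WriteType { DEFAULT, INSERT, UPDATE, DELETE }

  /** Hypothetical stand-in for a terminal ReduceSink/FileSink descriptor. */
  interface SinkDesc {
    boolean writesToAcidTable();
    WriteType getWriteType();
    String getName();
  }

  /** Fail if a terminal sink writing an acid table kept the default WriteType. */
  static void validate(List<SinkDesc> terminalSinks) {
    for (SinkDesc sink : terminalSinks) {
      if (sink.writesToAcidTable() && sink.getWriteType() == WriteType.DEFAULT) {
        throw new IllegalStateException("Sink " + sink.getName()
            + " writes to an acid table but WriteType was never set");
      }
    }
  }
}
{noformat}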



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16006) Incremental REPL LOAD doesn't operate on the target database if name differs from source database.

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883998#comment-15883998
 ] 

Hive QA commented on HIVE-16006:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854551/HIVE-16006.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10255 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=230)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3769/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3769/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3769/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854551 - PreCommit-HIVE-Build

> Incremental REPL LOAD doesn't operate on the target database if name differs 
> from source database.
> --
>
> Key: HIVE-16006
> URL: https://issues.apache.org/jira/browse/HIVE-16006
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Attachments: HIVE-16006.01.patch, HIVE-16006.02.patch
>
>
> During "Incremental Load", it is not considering the database name input in 
> the command line. Hence load doesn't happen. At the same time, database with 
> original name is getting modified.
> Steps:
> 1. REPL DUMP default FROM 52;
> 2. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
> – This step modifies the default Db instead of replDb.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16040) union column expansion should take aliases from the leftmost branch

2017-02-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16040:

Status: Patch Available  (was: Open)

> union column expansion should take aliases from the leftmost branch
> ---
>
> Key: HIVE-16040
> URL: https://issues.apache.org/jira/browse/HIVE-16040
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16040.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16040) union column expansion should take aliases from the leftmost branch

2017-02-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16040:

Attachment: HIVE-16040.patch

The patch. I'd like to add a test in CliDriver but I ran into HIVE-16039.
[~ashutoshc] can you take a look at the code change, and also, do you know how 
to generate output for PerfCliDriver?
When I was testing locally I copy-pasted the entire TPCDS schema and query into 
the regular CliDriver, and that worked for me; but when I run PerfCliDriver 
locally, it OOMs.

> union column expansion should take aliases from the leftmost branch
> ---
>
> Key: HIVE-16040
> URL: https://issues.apache.org/jira/browse/HIVE-16040
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16040.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16040) union column expansion should take aliases from the leftmost branch

2017-02-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-16040:
---


> union column expansion should take aliases from the leftmost branch
> ---
>
> Key: HIVE-16040
> URL: https://issues.apache.org/jira/browse/HIVE-16040
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16039) Calcite unsupported, and non-Calcite NPE on a query

2017-02-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16039:

Description: 
This happens on master even with HIVE-15938 reverted (I was trying to make a 
test case for it based on some TPCDS query).
{noformat}
drop table src_10;
drop table src_20;
drop table src_30;
create table src_10 as select * from src limit 10;
create table src_20 as select * from src limit 11;
create table src_30 as select * from src limit 12;

explain 
with cross_items as
 (select brand_id, class_id from src_30,
(select s1.key brand_id,s1.value class_id from src_10 s1
 intersect select ics.key, ics.value from src_20 ics
 intersect select iws.key, iws.value from src_30 iws) x where key = 
brand_id),
avg_sales as
 (select kv from (select value kv from src_10
   union all select key from src_20
   union all select value src_30) x)
  select key, value
 from (select 'foo' channel, key, value from src_10 where key in (select 
brand_id from cross_items)
   union all select 'bar' channel, key, value from src_20 where value in 
(select class_id from cross_items)
   union all select 'baz' channel, key, value from src_30 where value in 
(select kv from avg_sales)
 ) y;
{noformat}

I know this query is super intuitive...
CalcitePlanner fails in a non-informative way, and then the SemanticAnalyzer NPEs.
{noformat}

org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSemanticException: 
Unsupported
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3886)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3815)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genSubQueryRelNode(CalcitePlanner.java:2419)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterRelNode(CalcitePlanner.java:2443)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterLogicalPlan(CalcitePlanner.java:2499)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3898)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3815)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1281)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1227)
 ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:113) 
~[calcite-core-1.10.0.jar:1.10.0]
at 
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:997)
 ~[calcite-core-1.10.0.jar:1.10.0]
at 

[jira] [Commented] (HIVE-16039) Calcite unsupported, and non-Calcite NPE on a query

2017-02-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883973#comment-15883973
 ] 

Sergey Shelukhin commented on HIVE-16039:
-

cc  [~jcamachorodriguez] [~pxiong]

> Calcite unsupported, and non-Calcite NPE on a query
> ---
>
> Key: HIVE-16039
> URL: https://issues.apache.org/jira/browse/HIVE-16039
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Ashutosh Chauhan
>
> This happens on master even with HIVE-15938 reverted (I was trying to make 
> a test case for it based on some TPCDS query).
> {noformat}
> drop table src_10;
> drop table src_20;
> drop table src_30;
> create table src_10 as select * from src limit 10;
> create table src_20 as select * from src limit 11;
> create table src_30 as select * from src limit 12;
> explain 
> with cross_items as
>  (select brand_id, class_id from src_30,
> (select s1.key brand_id,s1.value class_id from src_10 s1
>  intersect select ics.key, ics.value from src_20 ics
>  intersect select iws.key, iws.value from src_30 iws) x where key = 
> brand_id),
> avg_sales as
>  (select kv from (select value kv from src_10
>union all select key from src_20
>union all select value src_30) x)
>   select key, value
>  from (select 'foo' channel, key, value from src_10 where key in (select 
> brand_id from cross_items)
>union all select 'bar' channel, key, value from src_20 where value in 
> (select class_id from cross_items)
>union all select 'baz' channel, key, value from src_30 where value in 
> (select kv from avg_sales)
>  ) y;
> {noformat}
> I know this query is super intuitive...
> CalciteOptimizer fails in a non-informative way, and then SA NPEs.
> {noformat}
> org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSemanticException: 
> Unsupported
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3886)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3815)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genSubQueryRelNode(CalcitePlanner.java:2419)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterRelNode(CalcitePlanner.java:2443)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterLogicalPlan(CalcitePlanner.java:2499)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3898)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3815)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> 

[jira] [Assigned] (HIVE-16039) Calcite unsupported, and non-Calcite NPE on a query

2017-02-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-16039:
---


> Calcite unsupported, and non-Calcite NPE on a query
> ---
>
> Key: HIVE-16039
> URL: https://issues.apache.org/jira/browse/HIVE-16039
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Ashutosh Chauhan
>
> This happens on master even with HIVE-15938 reverted (I was trying to make 
> a test case for it based on some TPCDS query).
> {noformat}
> drop table src_10;
> drop table src_20;
> drop table src_30;
> create table src_10 as select * from src limit 10;
> create table src_20 as select * from src limit 11;
> create table src_30 as select * from src limit 12;
> explain 
> with cross_items as
>  (select brand_id, class_id from src_30,
> (select s1.key brand_id,s1.value class_id from src_10 s1
>  intersect select ics.key, ics.value from src_20 ics
>  intersect select iws.key, iws.value from src_30 iws) x where key = 
> brand_id),
> avg_sales as
>  (select kv from (select value kv from src_10
>union all select key from src_20
>union all select value src_30) x)
>   select key, value
>  from (select 'foo' channel, key, value from src_10 where key in (select 
> brand_id from cross_items)
>union all select 'bar' channel, key, value from src_20 where value in 
> (select class_id from cross_items)
>union all select 'baz' channel, key, value from src_30 where value in 
> (select kv from avg_sales)
>  ) y;
> {noformat}
> I know this query is super intuitive...
> CalciteOptimizer fails in a non-informative way, and then SA NPEs.
> {noformat}
> org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSemanticException: 
> Unsupported
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3886)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3815)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genSubQueryRelNode(CalcitePlanner.java:2419)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterRelNode(CalcitePlanner.java:2443)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterLogicalPlan(CalcitePlanner.java:2499)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3898)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3815)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3808)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3852)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>   at 
> 

[jira] [Updated] (HIVE-1555) JDBC Storage Handler

2017-02-24 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1555:
-
Attachment: HIVE-1555.9.patch

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1555.7.patch, HIVE-1555.8.patch, HIVE-1555.9.patch, 
> JDBCStorageHandler Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people must want to perform HiveQL joins, 
> etc., against tables in other systems.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-1555) JDBC Storage Handler

2017-02-24 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1555:
-
Status: Patch Available  (was: Open)

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1555.7.patch, HIVE-1555.8.patch, HIVE-1555.9.patch, 
> JDBCStorageHandler Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people must want to perform HiveQL joins, 
> etc., against tables in other systems.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-1555) JDBC Storage Handler

2017-02-24 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1555:
-
Attachment: (was: HIVE-1555.6.patch)

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1555.7.patch, HIVE-1555.8.patch, JDBCStorageHandler 
> Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people must want to perform HiveQL joins, 
> etc., against tables in other systems.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-1555) JDBC Storage Handler

2017-02-24 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1555:
-
Attachment: (was: HIVE-1555.5.patch)

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1555.7.patch, HIVE-1555.8.patch, JDBCStorageHandler 
> Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people must want to perform HiveQL joins, 
> etc., against tables in other systems.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-1555) JDBC Storage Handler

2017-02-24 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1555:
-
Attachment: (was: HIVE-1555.4.patch)

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1555.5.patch, HIVE-1555.6.patch, HIVE-1555.7.patch, 
> HIVE-1555.8.patch, JDBCStorageHandler Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people must want to perform HiveQL joins, 
> etc., against tables in other systems.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-1555) JDBC Storage Handler

2017-02-24 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1555:
-
Attachment: (was: HIVE-1555.3.patch)

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1555.5.patch, HIVE-1555.6.patch, HIVE-1555.7.patch, 
> HIVE-1555.8.patch, JDBCStorageHandler Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people must want to perform HiveQL joins, 
> etc., against tables in other systems.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-1555) JDBC Storage Handler

2017-02-24 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1555:
-
Status: Open  (was: Patch Available)

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1555.4.patch, HIVE-1555.5.patch, HIVE-1555.6.patch, 
> HIVE-1555.7.patch, HIVE-1555.8.patch, JDBCStorageHandler Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people must want to perform HiveQL joins, 
> etc., against tables in other systems.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-14864) Distcp is not called from MoveTask when src is a directory

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883969#comment-15883969
 ] 

Hive QA commented on HIVE-14864:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854547/HIVE-14864.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10259 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3768/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3768/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3768/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854547 - PreCommit-HIVE-Build

> Distcp is not called from MoveTask when src is a directory
> --
>
> Key: HIVE-14864
> URL: https://issues.apache.org/jira/browse/HIVE-14864
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Sahil Takiar
> Attachments: HIVE-14864.1.patch, HIVE-14864.2.patch, 
> HIVE-14864.3.patch, HIVE-14864.patch
>
>
> In FileUtils.java the following code does not get executed even when src 
> directory size is greater than HIVE_EXEC_COPYFILE_MAXSIZE because 
> srcFS.getFileStatus(src).getLen() returns 0 when src is a directory. We 
> should use srcFS.getContentSummary(src).getLength() instead.
> {noformat}
> /* Run distcp if source file/dir is too big */
> if (srcFS.getUri().getScheme().equals("hdfs") &&
>     srcFS.getFileStatus(src).getLen() >
>         conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE)) {
>   LOG.info("Source is " + srcFS.getFileStatus(src).getLen() + " bytes. (MAX: "
>       + conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE) + ")");
>   LOG.info("Launch distributed copy (distcp) job.");
>   HiveConfUtil.updateJobCredentialProviders(conf);
>   copied = shims.runDistCp(src, dst, conf);
>   if (copied && deleteSource) {
>     srcFS.delete(src, true);
>   }
> }
> {noformat}
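A minimal sketch of the suggested fix, reusing the variables from the snippet above (srcFS, conf, shims, etc.): the only change is computing the size via getContentSummary(), which sums everything under a directory, instead of getFileStatus().getLen(), which is 0 for a directory.

{noformat}
/* Suggested fix (sketch): getContentSummary(src).getLength() returns the total
 * size of all files under src, so the distcp threshold also fires when src is
 * a directory. */
long srcSize = srcFS.getContentSummary(src).getLength();
if (srcFS.getUri().getScheme().equals("hdfs") &&
    srcSize > conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE)) {
  LOG.info("Source is " + srcSize + " bytes. (MAX: "
      + conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE) + ")");
  LOG.info("Launch distributed copy (distcp) job.");
  HiveConfUtil.updateJobCredentialProviders(conf);
  copied = shims.runDistCp(src, dst, conf);
  if (copied && deleteSource) {
    srcFS.delete(src, true);
  }
}
{noformat}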



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-14990) run all tests for MM tables and fix the issues that are found

2017-02-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883956#comment-15883956
 ] 

Sergey Shelukhin commented on HIVE-14990:
-

Many of the above tests no longer fail, or turned out to be expected failures 
given the hacky nature of this testing.
Remaining tests to look at, not covered by other subtasks of the parent:
{noformat}
Minimr:
parallel_orderby

CliDriver:
avro_partitioned
extrapolate_part_stats_full,extrapolate_part_stats_partial
skewjoin

MiniLlapLocal:
dynpart_sort_opt_vectorization,dynpart_sort_optimization,dynpart_sort_optimization2
exchgpartition2lel
extrapolate_part_stats_partial_ndv
{noformat}


> run all tests for MM tables and fix the issues that are found
> -
>
> Key: HIVE-14990
> URL: https://issues.apache.org/jira/browse/HIVE-14990
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14990.01.patch, HIVE-14990.02.patch, 
> HIVE-14990.03.patch, HIVE-14990.04.patch, HIVE-14990.04.patch, 
> HIVE-14990.05.patch, HIVE-14990.05.patch, HIVE-14990.06.patch, 
> HIVE-14990.06.patch, HIVE-14990.07.patch, HIVE-14990.08.patch, 
> HIVE-14990.09.patch, HIVE-14990.10.patch, HIVE-14990.10.patch, 
> HIVE-14990.10.patch, HIVE-14990.12.patch, HIVE-14990.13.patch, 
> HIVE-14990.14.patch, HIVE-14990.15.patch, HIVE-14990.patch
>
>
> Expected failures 
> 1) All HCat tests (cannot write MM tables via the HCat writer)
> 2) Almost all merge tests (alter .. concat is not supported).
> 3) Tests that run dfs commands with specific paths (path changes).
> 4) Truncate column (not supported).
> 5) Describe formatted will have the new table fields in the output (before 
> merging MM with ACID).
> 6) Many tests w/explain extended - diff in partition "base file name" (path 
> changes).
> 7) TestTxnCommands - all the conversion tests, as they check for bucket count 
> using file lists (path changes).
> 8) HBase metastore tests, because methods are not implemented.
> 9) Some load and ExIm tests that export a table and then rely on specific 
> path for load (path changes).
> 10) Bucket map join/etc. - diffs; disabled the optimization for MM tables due 
> to how it accounts for buckets
> 11) rand - different results due to different sequence of processing.
> 12) many (not all i.e. not the ones with just one insert) tests that have 
> stats output, such as file count, for obvious reasons
> 13) materialized views, not handled by design - the test check erroneously 
> makes them "mm", no easy way to tell them apart; I don't want to plumb more 
> stuff through just for this test
> I'm filing jiras for some test failures that are not obvious and need an 
> investigation later



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15958) LLAP: IPC connections are not being reused for umbilical protocol

2017-02-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15958:
-
Attachment: HIVE-15958.5.patch

Addressed review comments.

[~sseth] Can you please take another look?

> LLAP: IPC connections are not being reused for umbilical protocol
> -
>
> Key: HIVE-15958
> URL: https://issues.apache.org/jira/browse/HIVE-15958
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-15958.1.patch, HIVE-15958.2.patch, 
> HIVE-15958.3.patch, HIVE-15958.4.patch, HIVE-15958.4.patch, HIVE-15958.5.patch
>
>
> During concurrency testing, observed thousands of IPC thread creations. 
> Ideally, connections to the same hosts should be reused.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16018) Add more information for DynamicPartitionPruningOptimization

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883927#comment-15883927
 ] 

Hive QA commented on HIVE-16018:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854545/HIVE-16018.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10259 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3767/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3767/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3767/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854545 - PreCommit-HIVE-Build

> Add more information for DynamicPartitionPruningOptimization
> 
>
> Key: HIVE-16018
> URL: https://issues.apache.org/jira/browse/HIVE-16018
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16018.01.patch, HIVE-16018.02.patch, 
> HIVE-16018.03.patch, qfile.q, qfile.q.out
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-13864) Beeline ignores the command that follows a semicolon and comment

2017-02-24 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-13864:

Attachment: HIVE-13864.5.patch

> Beeline ignores the command that follows a semicolon and comment
> 
>
> Key: HIVE-13864
> URL: https://issues.apache.org/jira/browse/HIVE-13864
> Project: Hive
>  Issue Type: Bug
>Reporter: Muthu Manickam
>Assignee: Yongzhi Chen
> Attachments: HIVE-13864.01.patch, HIVE-13864.02.patch, 
> HIVE-13864.3.patch, HIVE-13864.4.patch, HIVE-13864.5.patch
>
>
> Beeline ignores the next line/command that follows a command with a 
> semicolon and comments.
> Example 1:
> select *
> from table1; -- comments
> select * from table2;
> In this case, only the first command is executed; the second command "select * 
> from table2" is not executed.
> --
> Example 2:
> select *
> from table1; -- comments
> select * from table2;
> select * from table3;
> In this case, the first and third commands are executed; the second command 
> "select * from table2" is not executed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16019) Query fails when group by/order by on same column with uppercase name

2017-02-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883896#comment-15883896
 ] 

Ashutosh Chauhan commented on HIVE-16019:
-

+1

> Query fails when group by/order by on same column with uppercase name
> -
>
> Key: HIVE-16019
> URL: https://issues.apache.org/jira/browse/HIVE-16019
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16019.patch
>
>
> Query with group by/order by on same column KEY failed:
> {code}
> SELECT T1.KEY AS MYKEY FROM SRC T1 GROUP BY T1.KEY ORDER BY T1.KEY LIMIT 3;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-13864) Beeline ignores the command that follows a semicolon and comment

2017-02-24 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883875#comment-15883875
 ] 

Yongzhi Chen commented on HIVE-13864:
-

[~aihuaxu], I will add the comments and change the variable name to startQuote. 
But I will keep the parameter type as int[] because I need an output parameter, 
and it is really hard to pass a value out through a Character object.

> Beeline ignores the command that follows a semicolon and comment
> 
>
> Key: HIVE-13864
> URL: https://issues.apache.org/jira/browse/HIVE-13864
> Project: Hive
>  Issue Type: Bug
>Reporter: Muthu Manickam
>Assignee: Yongzhi Chen
> Attachments: HIVE-13864.01.patch, HIVE-13864.02.patch, 
> HIVE-13864.3.patch, HIVE-13864.4.patch
>
>
> Beeline ignores the next line/command that follows a command with a 
> semicolon and comments.
> Example 1:
> select *
> from table1; -- comments
> select * from table2;
> In this case, only the first command is executed; the second command "select * 
> from table2" is not executed.
> --
> Example 2:
> select *
> from table1; -- comments
> select * from table2;
> select * from table3;
> In this case, the first and third commands are executed; the second command 
> "select * from table2" is not executed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15305) Add tests for METASTORE_EVENT_LISTENERS

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883860#comment-15883860
 ] 

Hive QA commented on HIVE-15305:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854516/HIVE-15305.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10284 tests 
executed
*Failed tests:*
{noformat}
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=221)
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_mult_tables_compact]
 (batchId=33)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3766/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3766/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3766/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854516 - PreCommit-HIVE-Build

> Add tests for METASTORE_EVENT_LISTENERS
> ---
>
> Key: HIVE-15305
> URL: https://issues.apache.org/jira/browse/HIVE-15305
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-15305.patch
>
>
> HIVE-15232 reused TestDbNotificationListener to test 
> METASTORE_TRANSACTIONAL_EVENT_LISTENERS and removed unit testing of 
> METASTORE_EVENT_LISTENERS config. We should test both. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks

2017-02-24 Thread Norris Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norris Lee updated HIVE-14901:
--
Attachment: HIVE-14901.7.patch

> HiveServer2: Use user supplied fetch size to determine #rows serialized in 
> tasks
> 
>
> Key: HIVE-14901
> URL: https://issues.apache.org/jira/browse/HIVE-14901
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, ODBC
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Norris Lee
> Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, 
> HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.5.patch, 
> HIVE-14901.6.patch, HIVE-14901.7.patch, HIVE-14901.patch
>
>
> Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide 
> the max number of rows that we write in tasks. However, we should ideally use 
> the user supplied value (which can be extracted from the 
> ThriftCLIService.FetchResults' request parameter) to decide how many rows to 
> serialize in a blob in the tasks. We should however use 
> {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on 
> it, so that we don't go OOM in tasks and HS2. 
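
(A hedged sketch of the clamping the description calls for; the method and 
names below are illustrative, not the actual HiveServer2 code.)

{code}
public class FetchSizeClamp {
  // Honor the client's requested fetch size, but cap it at the
  // server-configured maximum; non-positive requests fall back to the max.
  static int effectiveFetchSize(int requestedByClient, int serverConfiguredMax) {
    return requestedByClient <= 0
        ? serverConfiguredMax
        : Math.min(requestedByClient, serverConfiguredMax);
  }

  public static void main(String[] args) {
    System.out.println(effectiveFetchSize(5000, 10000));  // 5000: request honored
    System.out.println(effectiveFetchSize(50000, 10000)); // 10000: capped at max
  }
}
{code}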



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks

2017-02-24 Thread Norris Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norris Lee updated HIVE-14901:
--
Status: Patch Available  (was: In Progress)

> HiveServer2: Use user supplied fetch size to determine #rows serialized in 
> tasks
> 
>
> Key: HIVE-14901
> URL: https://issues.apache.org/jira/browse/HIVE-14901
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, ODBC
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Norris Lee
> Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, 
> HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.5.patch, 
> HIVE-14901.6.patch, HIVE-14901.7.patch, HIVE-14901.patch
>
>
> Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide 
> the max number of rows that we write in tasks. However, we should ideally use 
> the user supplied value (which can be extracted from the 
> ThriftCLIService.FetchResults' request parameter) to decide how many rows to 
> serialize in a blob in the tasks. We should however use 
> {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on 
> it, so that we don't go OOM in tasks and HS2. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks

2017-02-24 Thread Norris Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norris Lee updated HIVE-14901:
--
Status: In Progress  (was: Patch Available)

> HiveServer2: Use user supplied fetch size to determine #rows serialized in 
> tasks
> 
>
> Key: HIVE-14901
> URL: https://issues.apache.org/jira/browse/HIVE-14901
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, ODBC
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Norris Lee
> Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, 
> HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.5.patch, 
> HIVE-14901.6.patch, HIVE-14901.patch
>
>
> Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide 
> the max number of rows that we write in tasks. However, we should ideally use 
> the user supplied value (which can be extracted from the 
> ThriftCLIService.FetchResults' request parameter) to decide how many rows to 
> serialize in a blob in the tasks. We should however use 
> {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on 
> it, so that we don't go OOM in tasks and HS2. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15457) Return path failures

2017-02-24 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15457:
---
Status: Patch Available  (was: Open)

> Return path failures
> 
>
> Key: HIVE-15457
> URL: https://issues.apache.org/jira/browse/HIVE-15457
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15457.1.patch
>
>
> Currently the following return path tests are either failing or producing wrong results:
> * cbo_rp_subq_in.q
> * cbo_rp_subq_not_in.q
> * cbo_rp_subq_exists.q
> These are disabled and will need to be re-enabled once fixed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15457) Return path failures

2017-02-24 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15457:
---
Attachment: HIVE-15457.1.patch

> Return path failures
> 
>
> Key: HIVE-15457
> URL: https://issues.apache.org/jira/browse/HIVE-15457
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-15457.1.patch
>
>
> Currently the following return path tests are either failing or producing wrong results:
> * cbo_rp_subq_in.q
> * cbo_rp_subq_not_in.q
> * cbo_rp_subq_exists.q
> These are disabled and will need to be re-enabled once fixed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15879) Fix HiveMetaStoreChecker.checkPartitionDirs method

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883803#comment-15883803
 ] 

Hive QA commented on HIVE-15879:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854556/HIVE-15879.04.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10251 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=99)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3765/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3765/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3765/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854556 - PreCommit-HIVE-Build

> Fix HiveMetaStoreChecker.checkPartitionDirs method
> --
>
> Key: HIVE-15879
> URL: https://issues.apache.org/jira/browse/HIVE-15879
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-15879.01.patch, HIVE-15879.02.patch, 
> HIVE-15879.03.patch, HIVE-15879.04.patch
>
>
> HIVE-15803 fixes the msck hang issue in 
> HiveMetaStoreChecker.checkPartitionDirs method by adding a check to see if 
> the Threadpool has any spare threads. If not it uses single threaded listing 
> of the files.
> {noformat}
> if (pool != null) {
>   synchronized (pool) {
>     // In case of recursive calls, it is possible to deadlock with TP. Check TP usage here.
>     if (pool.getActiveCount() < pool.getMaximumPoolSize()) {
>       useThreadPool = true;
>     }
>     if (!useThreadPool) {
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Not using threadPool as active count:" + pool.getActiveCount()
>             + ", max:" + pool.getMaximumPoolSize());
>       }
>     }
>   }
> }
> {noformat}
> Based on the javadoc of getActiveCount() below 
> bq. Returns the approximate number of threads that are actively executing 
> tasks.
> it returns only an approximate number of threads, and it cannot be guaranteed 
> to always return the exact number of active threads. This still exposes 
> the method implementation to the msck hang bug in rare corner cases.
> We could either:
> 1. Use an atomic counter to track exactly how many threads are actively running.
> 2. Rework the method itself to make it much simpler, e.g. look into 
> the possibility of changing the recursive implementation to an iterative 
> implementation where worker threads pick tasks from a queue until the queue 
> is empty.
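
(A hedged sketch of option 2 above: an iterative, level-by-level walk in which 
pool workers only ever list a single directory and never wait on other pool 
tasks, so the deadlock disappears. All names are illustrative, not the actual 
HiveMetaStoreChecker code.)

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class IterativeWalk {
  // stand-in for FileSystem.listStatus: returns child "partition" dirs
  static List<String> list(String dir) {
    return dir.length() > 30 ? List.of() : List.of(dir + "/p1", dir + "/p2");
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    List<String> frontier = List.of("/warehouse/t");
    while (!frontier.isEmpty()) {
      List<Future<List<String>>> futures = new ArrayList<>();
      for (String dir : frontier) {
        // workers never submit work themselves, so the pool cannot deadlock
        // on itself and getActiveCount() is no longer load-bearing
        futures.add(pool.submit(() -> list(dir)));
      }
      List<String> next = new ArrayList<>();
      for (Future<List<String>> f : futures) {
        next.addAll(f.get());
      }
      frontier = next;   // descend one level per round
    }
    pool.shutdown();
  }
}
{code}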



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-14990) run all tests for MM tables and fix the issues that are found

2017-02-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14990:

Attachment: HIVE-14990.15.patch

Fixing FetchTask on non-default CliDriver

> run all tests for MM tables and fix the issues that are found
> -
>
> Key: HIVE-14990
> URL: https://issues.apache.org/jira/browse/HIVE-14990
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14990.01.patch, HIVE-14990.02.patch, 
> HIVE-14990.03.patch, HIVE-14990.04.patch, HIVE-14990.04.patch, 
> HIVE-14990.05.patch, HIVE-14990.05.patch, HIVE-14990.06.patch, 
> HIVE-14990.06.patch, HIVE-14990.07.patch, HIVE-14990.08.patch, 
> HIVE-14990.09.patch, HIVE-14990.10.patch, HIVE-14990.10.patch, 
> HIVE-14990.10.patch, HIVE-14990.12.patch, HIVE-14990.13.patch, 
> HIVE-14990.14.patch, HIVE-14990.15.patch, HIVE-14990.patch
>
>
> Expected failures 
> 1) All HCat tests (cannot write MM tables via the HCat writer)
> 2) Almost all merge tests (alter .. concat is not supported).
> 3) Tests that run dfs commands with specific paths (path changes).
> 4) Truncate column (not supported).
> 5) Describe formatted will have the new table fields in the output (before 
> merging MM with ACID).
> 6) Many tests w/explain extended - diff in partition "base file name" (path 
> changes).
> 7) TestTxnCommands - all the conversion tests, as they check for bucket count 
> using file lists (path changes).
> 8) HBase metastore tests, because methods are not implemented.
> 9) Some load and ExIm tests that export a table and then rely on specific 
> path for load (path changes).
> 10) Bucket map join/etc. - diffs; disabled the optimization for MM tables due 
> to how it accounts for buckets
> 11) rand - different results due to different sequence of processing.
> 12) many tests (not all, i.e. not the ones with just one insert) that have 
> stats output, such as file count, for obvious reasons
> 13) materialized views, not handled by design - the test check erroneously 
> makes them "mm"; there is no easy way to tell them apart, and I don't want to 
> plumb more stuff through just for this test
> I'm filing jiras for some test failures that are not obvious and need 
> investigation later



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-24 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-15951:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Slim!

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.2.patch, HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or be shared 
> between reducers in the same physical VM.
> That will lead to failure of the job until the directory is cleaned.
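
(A hedged sketch of the two-part fix the summary describes: a per-attempt 
unique directory, deleted unconditionally afterwards. Paths and names are 
illustrative; the real code would target whatever filesystem the persist 
step uses.)

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.UUID;
import java.util.stream.Stream;

public class UniquePersistDir {
  public static void main(String[] args) throws IOException {
    // unique per task attempt, so concurrent reducers on one VM never collide
    Path base = Paths.get(System.getProperty("java.io.tmpdir"), "druid-persist");
    Path unique = base.resolve(UUID.randomUUID().toString());
    Files.createDirectories(unique);
    try {
      // ... write intermediate segment data under 'unique' ...
    } finally {
      // delete recursively (children first) so stale data never leaks
      // into the next attempt
      try (Stream<Path> paths = Files.walk(unique)) {
        paths.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
      }
    }
  }
}
{code}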



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15702) Test timeout : TestDerbyConnector

2017-02-24 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15702:
--
Attachment: HIVE-15702.patch

> Test timeout : TestDerbyConnector 
> --
>
> Key: HIVE-15702
> URL: https://issues.apache.org/jira/browse/HIVE-15702
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Thejas M Nair
>Assignee: slim bouguerra
> Attachments: HIVE-15702.patch
>
>
> TestDerbyConnector seems to be timing out quite frequently (from a search 
> of hive-issues mailing list test output).
> This was seen with HIVE-15579 - 
> bq. TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
> https://builds.apache.org/job/PreCommit-HIVE-Build/3108/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15702) Test timeout : TestDerbyConnector

2017-02-24 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15702:
--
Status: Patch Available  (was: Open)

> Test timeout : TestDerbyConnector 
> --
>
> Key: HIVE-15702
> URL: https://issues.apache.org/jira/browse/HIVE-15702
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Thejas M Nair
>Assignee: slim bouguerra
>
> TestDerbyConnector seems to be timing out quite frequently (from a search 
> of hive-issues mailing list test output).
> This was seen with HIVE-15579 - 
> bq. TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
> https://builds.apache.org/job/PreCommit-HIVE-Build/3108/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15882) HS2 generating high memory pressure with many partitions and concurrent queries

2017-02-24 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-15882:
--
Attachment: HIVE-15882.02.patch

Uploaded the second version of the patch, with comments made in reviewboard 
addressed.

> HS2 generating high memory pressure with many partitions and concurrent 
> queries
> ---
>
> Key: HIVE-15882
> URL: https://issues.apache.org/jira/browse/HIVE-15882
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-15882.01.patch, HIVE-15882.02.patch, 
> hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in an HS2 
> server with -Xmx200m, and with 50 queries in one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code:
> 1. 24.5% of memory is wasted by duplicate strings (see section 6). With 
> String.intern() calls added in the ~10 relevant places in the code, this 
> overhead can be greatly reduced.
> 2. Almost 20% of memory is wasted due to various suboptimally used 
> collections (see section 8). There are many maps and lists that are either 
> empty or have just 1 element. By modifying the code that creates and 
> populates these collections, we may likely save 5-10% of memory.
> 3. Almost 20% of memory is used by instances of java.util.Properties. It 
> looks like these objects are highly duplicated, since for each Partition, each 
> concurrently running query creates its own copy of Partition, PartitionDesc and 
> Properties. Thus we have nearly 100,000 (50 queries * 2,000 partitions) 
> Properties in memory. By interning/deduplicating these objects we may be able 
> to save perhaps 15% of memory.
> So overall, I think there is a good chance to reduce HS2 memory consumption 
> in this scenario by ~40%.
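
(A hedged illustration of point 1 above, the intern() technique itself rather 
than the patch: after interning, equal strings collapse to one shared instance.)

{code}
public class InternDemo {
  public static void main(String[] args) {
    // two logically equal strings, e.g. the same partition path seen by
    // two concurrent queries; new String(...) forces distinct objects
    String a = new String("hdfs://nn/warehouse/t/part=1").intern();
    String b = new String("hdfs://nn/warehouse/t/part=1").intern();
    System.out.println(a == b);  // true: both reference one pooled instance
  }
}
{code}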



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15702) Test timeout : TestDerbyConnector

2017-02-24 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883753#comment-15883753
 ] 

slim bouguerra commented on HIVE-15702:
---

@all, sorry, not sure how I missed this. I confirm what Ashutosh Chauhan said: 
both need to be renamed.


> Test timeout : TestDerbyConnector 
> --
>
> Key: HIVE-15702
> URL: https://issues.apache.org/jira/browse/HIVE-15702
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Thejas M Nair
>Assignee: slim bouguerra
>
> TestDerbyConnector seems to be timing out quite frequently (from a search 
> of hive-issues mailing list test output).
> This was seen with HIVE-15579 - 
> bq. TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
> https://builds.apache.org/job/PreCommit-HIVE-Build/3108/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HIVE-16032) MM tables: encrypted/(minimr?) CLI driver + fetch optimizer => no results

2017-02-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-16032.
-
   Resolution: Fixed
Fix Version/s: hive-14535

Committed to branch

> MM tables: encrypted/(minimr?) CLI driver + fetch optimizer => no results
> -
>
> Key: HIVE-16032
> URL: https://issues.apache.org/jira/browse/HIVE-16032
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
>
> The repro does not require encryption, but it doesn't happen on CliDriver.
> The easiest way to repro (the query returns results with 
> hive.fetch.task.conversion=none, and no results with "more", the default):
> {noformat}
> DROP TABLE IF EXISTS encrypted_table PURGE;
> CREATE TABLE encrypted_table (key INT, value STRING) LOCATION 
> '${hiveconf:hive.metastore.warehouse.dir}/default/encrypted_table';
> INSERT INTO encrypted_table values(1,'foo'),(2,'bar');
> set hive.fetch.task.conversion=none;
> select * from encrypted_table;
> set hive.fetch.task.conversion=more;
> select * from encrypted_table;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HIVE-16033) LLAP: Use PrintGCDateStamps for gc logging

2017-02-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-16033.
--
   Resolution: Fixed
Fix Version/s: 2.2.0

Committed to master. 

> LLAP: Use PrintGCDateStamps for gc logging
> --
>
> Key: HIVE-16033
> URL: https://issues.apache.org/jira/browse/HIVE-16033
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.2.0
>
> Attachments: HIVE-16033.1.patch
>
>
> This prints human-readable timestamps instead of timestamps relative to JVM 
> startup.
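
(For context, a hedged example of the JDK 8 GC-logging flags involved; the 
exact LLAP launch options may differ.)

{noformat}
-Xloggc:/var/log/hive/llap-gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps
{noformat}

With PrintGCDateStamps, each GC log line is prefixed with a wall-clock ISO-8601 
date (e.g. 2017-02-24T14:41:24.521-0800:) rather than only seconds since JVM 
startup.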



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16033) LLAP: Use PrintGCDateStamps for gc logging

2017-02-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883738#comment-15883738
 ] 

Prasanth Jayachandran commented on HIVE-16033:
--

Yeah. Tried on a secure cluster and it worked. Committing the patch.

> LLAP: Use PrintGCDateStamps for gc logging
> --
>
> Key: HIVE-16033
> URL: https://issues.apache.org/jira/browse/HIVE-16033
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16033.1.patch
>
>
> This prints human-readable timestamps instead of timestamps relative to JVM 
> startup.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16022) BloomFilter check not showing up in MERGE statement queries

2017-02-24 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-16022:
--
Attachment: HIVE-16022.3.patch

Looks like the partition column checking still needs to follow the parent 
operators until it hits a TableScan.
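
(A hedged sketch of the parent traversal just described; Operator below is a 
stand-in type, not Hive's actual operator hierarchy.)

{code}
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class FindTableScan {
  static class Operator {
    final String name;
    final List<Operator> parents;
    Operator(String name, List<Operator> parents) {
      this.name = name;
      this.parents = parents;
    }
  }

  // walk up through parents until a TableScan ("TS") is reached
  static Operator findTableScan(Operator op) {
    Deque<Operator> stack = new ArrayDeque<>();
    stack.push(op);
    while (!stack.isEmpty()) {
      Operator cur = stack.pop();
      if (cur.name.equals("TS")) {
        return cur;
      }
      cur.parents.forEach(stack::push);  // keep following parent operators
    }
    return null;  // no TableScan above this operator
  }

  public static void main(String[] args) {
    Operator ts = new Operator("TS", List.of());
    Operator fil = new Operator("FIL", List.of(ts));
    Operator rs = new Operator("RS", List.of(fil));
    System.out.println(findTableScan(rs).name);  // prints TS
  }
}
{code}

A stack-based walk like this also handles multi-parent operators (e.g. joins) 
without recursion.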

> BloomFilter check not showing up in MERGE statement queries
> ---
>
> Key: HIVE-16022
> URL: https://issues.apache.org/jira/browse/HIVE-16022
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16022.1.patch, HIVE-16022.2.patch, 
> HIVE-16022.3.patch
>
>
> Running explain on a MERGE statement with runtime filtering enabled, I see 
> the min/max being applied on the large table, but not the bloom filter check:
> {noformat}
> explain merge into acidTbl as t using nonAcidOrcTbl s ON t.a = s.a
> WHEN MATCHED AND s.a > 8 THEN DELETE
> WHEN MATCHED THEN UPDATE SET b = 7
> WHEN NOT MATCHED THEN INSERT VALUES(s.a, s.b)
> ...
> Map 1
> Map Operator Tree:
> TableScan
>   alias: t
>   Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL 
> Column stats: NONE
>   Filter Operator
> predicate: a BETWEEN DynamicValue(RS_3_s_a_min) AND 
> DynamicValue(RS_3_s_a_max) (type: boolean)
> Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL 
> Column stats: NONE
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-12767) Implement table property to address Parquet int96 timestamp bug

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883733#comment-15883733
 ] 

Hive QA commented on HIVE-12767:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854485/HIVE-12767.11.patch

{color:green}SUCCESS:{color} +1 due to 8 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10274 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3764/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3764/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3764/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854485 - PreCommit-HIVE-Build

> Implement table property to address Parquet int96 timestamp bug
> ---
>
> Key: HIVE-12767
> URL: https://issues.apache.org/jira/browse/HIVE-12767
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Sergio Peña
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-12767.10.patch, HIVE-12767.11.patch, 
> HIVE-12767.3.patch, HIVE-12767.4.patch, HIVE-12767.5.patch, 
> HIVE-12767.6.patch, HIVE-12767.7.patch, HIVE-12767.8.patch, 
> HIVE-12767.9.patch, TestNanoTimeUtils.java
>
>
> Parquet timestamps using INT96 are not compatible with other tools, like 
> Impala, due to issues in Hive because it adjusts timezone values in a 
> different way than Impala.
> To address such issues, a new table property (parquet.mr.int96.write.zone) 
> must be used in Hive that determines what timezone to use when writing and 
> reading timestamps from Parquet.
> The following is the exit criteria for the fix:
> * Hive will read Parquet MR int96 timestamp data and adjust values using a 
> time zone from a table property, if set, or using the local time zone if it 
> is absent. No adjustment will be applied to data written by Impala.
> * Hive will write Parquet int96 timestamps using a time zone adjustment from 
> the same table property, if set, or using the local time zone if it is 
> absent. This keeps the data in the table consistent.
> * New tables created by Hive will set the table property to UTC if the global 
> option to set the property for new tables is enabled.
> ** Tables created using CREATE TABLE and CREATE TABLE LIKE FILE will not set 
> the property unless the global setting to do so is enabled.
> ** Tables created using CREATE TABLE LIKE  will copy the 
> property of the table that is copied.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16037) with fetch optimization, the query runs after locks are released

2017-02-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16037:
--
Component/s: Transactions
 Locking

> with fetch optimization, the query runs after locks are released 
> -
>
> Key: HIVE-16037
> URL: https://issues.apache.org/jira/browse/HIVE-16037
> Project: Hive
>  Issue Type: Bug
>  Components: Locking, Transactions
>Reporter: Sergey Shelukhin
>
> Other assumptions may also be broken.
> FetchTask.execute implementation is very curious - it does nothing, and the 
> FetchTask that actually runs the query is put in the same place as the one 
> that normally fetches the results; that is to say, the whole pipeline is run 
> after Driver has "shut down" the query. That releases locks before the query 
> runs, and may also have other implications (for transactions, etc.? I don't 
> think simple fetch can be run for updates)
> Adding a log line to TSOP process method, and running encrypted_table_insert 
> from EncryptedHDFS cli driver, I get:
> {noformat}
> 2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> [no lines here]
> 2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  end=1487976084521 duration=0 from=org.apache.hadoop.hive.ql.Driver>
> ...
> 2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> ZooKeeperHiveLockManager: About to release lock for default/encrypted_table
> 2017-02-24T14:41:24,523 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> ZooKeeperHiveLockManager: About to release lock for default
> 2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  end=1487976084525 duration=4 from=org.apache.hadoop.hive.ql.Driver>
> 2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  end=1487976084525 duration=131 from=org.apache.hadoop.hive.ql.Driver>
> 2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> ql.Driver: Shutting down query 
> ...
> 2017-02-24T14:41:24,532 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> mapred.FileInputFormat: Total # of splits generated by getSplits: 1, 
> TimeTaken: 4
> 2017-02-24T14:41:24,532 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> exec.FetchOperator: Creating fetchTask ...
> ...
> 2017-02-24T14:41:24,541  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> exec.TableScanOperator: TODO# calling process
> 2017-02-24T14:41:24,543  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> exec.TableScanOperator: TODO# calling process
> 2017-02-24T14:41:24,543  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> exec.TableScanOperator: Closing operator TS[0]
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16037) with fetch optimization, the query runs after locks are released

2017-02-24 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883707#comment-15883707
 ] 

Eugene Koifman commented on HIVE-16037:
---

This may be an issue for Acid.
1. Once locks are released, someone can drop the table.
2. If the compactor runs concurrently with this, the Cleaner may delete some (pre 
compaction) files that this query is reading, because the Cleaner relies on the 
state of the Lock Manager to decide when it's safe to delete obsolete files.



> with fetch optimization, the query runs after locks are released 
> -
>
> Key: HIVE-16037
> URL: https://issues.apache.org/jira/browse/HIVE-16037
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Other assumptions may also be broken.
> FetchTask.execute implementation is very curious - it does nothing, and the 
> FetchTask that actually runs the query is put in the same place as the one 
> that normally fetches the results; that is to say, the whole pipeline is run 
> after Driver has "shut down" the query. That releases locks before the query 
> runs, and may also have other implications (for transactions, etc.? I don't 
> think simple fetch can be run for updates)
> Adding a log line to TSOP process method, and running encrypted_table_insert 
> from EncryptedHDFS cli driver, I get:
> {noformat}
> 2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> [no lines here]
> 2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  end=1487976084521 duration=0 from=org.apache.hadoop.hive.ql.Driver>
> ...
> 2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> ZooKeeperHiveLockManager: About to release lock for default/encrypted_table
> 2017-02-24T14:41:24,523 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> ZooKeeperHiveLockManager: About to release lock for default
> 2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  end=1487976084525 duration=4 from=org.apache.hadoop.hive.ql.Driver>
> 2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  end=1487976084525 duration=131 from=org.apache.hadoop.hive.ql.Driver>
> 2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> ql.Driver: Shutting down query 
> ...
> 2017-02-24T14:41:24,532 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> mapred.FileInputFormat: Total # of splits generated by getSplits: 1, 
> TimeTaken: 4
> 2017-02-24T14:41:24,532 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> exec.FetchOperator: Creating fetchTask ...
> ...
> 2017-02-24T14:41:24,541  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> exec.TableScanOperator: TODO# calling process
> 2017-02-24T14:41:24,543  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> exec.TableScanOperator: TODO# calling process
> 2017-02-24T14:41:24,543  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> exec.TableScanOperator: Closing operator TS[0]
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15935) ACL is not set in ATS data

2017-02-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883706#comment-15883706
 ] 

Siddharth Seth commented on HIVE-15935:
---

{code}
requestuser = hookContext.getUgi().getUserName() ;
{code}
Should this be getShortUserName? I'm not absolutely sure this line is even 
required. Afaik, the UGI in hookContext will be the longUser. Don't see harm in 
leaving it though.

May want to PerfLog the ATS putDomain - it is a blocking call, which can take some time.

setupDomain can take time, and should not be inline with ATSHook.run. ATS 
event publishing is already in background threads for the same reason. Also, 
timelineClient is only set up after setupAtsExecutor(conf); I believe the 
patch will fail with an NPE at the moment.
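
(A hedged sketch of the threading pattern suggested above: keep blocking ATS 
calls off the hook's calling thread by running them on the same background 
executor as event publishing. Names are illustrative, not the ATSHook code.)

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BackgroundSetupSketch {
  private static final ExecutorService ATS_EXECUTOR =
      Executors.newSingleThreadExecutor();

  // called from the hook; enqueues instead of blocking the query thread
  static void onHookInvoked(String domainId) {
    ATS_EXECUTOR.submit(() -> setupDomain(domainId));
  }

  // placeholder for the blocking timeline putDomain() call
  static void setupDomain(String domainId) {
    System.out.println("creating ATS domain " + domainId);
  }

  public static void main(String[] args) throws InterruptedException {
    onHookInvoked("hive_query_domain_1");
    ATS_EXECUTOR.shutdown();
    ATS_EXECUTOR.awaitTermination(5, TimeUnit.SECONDS);
  }
}
{code}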



> ACL is not set in ATS data
> --
>
> Key: HIVE-15935
> URL: https://issues.apache.org/jira/browse/HIVE-15935
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-15935.1.patch, HIVE-15935.2.patch, 
> HIVE-15935.3.patch, HIVE-15935.4.patch
>
>
> When publishing ATS info, Hive does not set an ACL, which makes Hive ATS entries 
> visible to all users. On the other hand, Tez ATS entries use the Tez DAG ACL, 
> which limits both the view and modify ACLs to the end user only. We shall make them 
> consistent. In this Jira, I am going to limit the ACL to the end user for both Tez ATS 
> and Hive ATS, and also provide configs "hive.view.acls" and "hive.modify.acls" in 
> case users need to override them.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16037) with fetch optimization, the query runs after locks are released

2017-02-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16037:

Description: 
Other assumptions may also be broken.
FetchTask.execute implementation is very curious - it does nothing, and the 
FetchTask that actually runs the query is put in the same place as the one that 
normally fetches the results; that is to say, the whole pipeline is run after 
Driver has "shut down" the query. That releases locks before the query runs, 
and may also have other implications (for transactions, etc.? I don't think 
simple fetch can be run for updates)
Adding a log line to TSOP process method, and running encrypted_table_insert 
from EncryptedHDFS cli driver, I get:
{noformat}
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
[no lines here]
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
...
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
ZooKeeperHiveLockManager: About to release lock for default/encrypted_table
2017-02-24T14:41:24,523 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
ZooKeeperHiveLockManager: About to release lock for default
2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
ql.Driver: Shutting down query 
...
2017-02-24T14:41:24,532 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
mapred.FileInputFormat: Total # of splits generated by getSplits: 1, TimeTaken: 
4
2017-02-24T14:41:24,532 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.FetchOperator: Creating fetchTask ...
...
2017-02-24T14:41:24,541  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.TableScanOperator: TODO# calling process
2017-02-24T14:41:24,543  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.TableScanOperator: TODO# calling process
2017-02-24T14:41:24,543  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.TableScanOperator: Closing operator TS[0]
...
{noformat}


  was:
Other assumptions may also be broken.
FetchTask.execute implementation is very curious - it does nothing, and the 
FetchTask that actually runs the query is put in the same place as the one that 
normally fetches the results; that is to say, the whole pipeline is run after 
Driver has "shut down" the query. That releases logs before the query runs, and 
may also have other implications (for transactions, etc.? I don't think simple 
fetch can be run for updates)
Adding a log line to TSOP process method, and running encrypted_table_insert 
from EncryptedHDFS cli driver, I get:
{noformat}
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
[no lines here]
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
...
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
ZooKeeperHiveLockManager: About to release lock for default/encrypted_table
2017-02-24T14:41:24,523 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
ZooKeeperHiveLockManager: About to release lock for default
2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
ql.Driver: Shutting down query 
...
2017-02-24T14:41:24,532 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
mapred.FileInputFormat: Total # of splits generated by getSplits: 1, TimeTaken: 
4
2017-02-24T14:41:24,532 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.FetchOperator: Creating fetchTask ...
...
2017-02-24T14:41:24,541  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.TableScanOperator: TODO# calling process
2017-02-24T14:41:24,543  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.TableScanOperator: TODO# calling process
2017-02-24T14:41:24,543  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.TableScanOperator: Closing operator TS[0]
...
{noformat}



> with fetch optimization, the query runs after locks are released 
> -
>
> Key: HIVE-16037
> URL: https://issues.apache.org/jira/browse/HIVE-16037
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Other assumptions may also be broken.
> FetchTask.execute implementation is very curious - it does nothing, and the 
> FetchTask that actually runs the query is put 

[jira] [Updated] (HIVE-16037) with fetch optimization, the query runs after locks are released

2017-02-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16037:

Description: 
Other assumptions may also be broken.
FetchTask.execute implementation is very curious - it does nothing, and the 
FetchTask that actually runs the query is put in the same place as the one that 
normally fetches the results; that is to say, the whole pipeline is run after 
Driver has "shut down" the query. That releases logs before the query runs, and 
may also have other implications (for transactions, etc.? I don't think simple 
fetch can be run for updates)
Adding a log line to TSOP process method, and running encrypted_table_insert 
from EncryptedHDFS cli driver, I get:
{noformat}
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
[no lines here]
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
...
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
ZooKeeperHiveLockManager: About to release lock for default/encrypted_table
2017-02-24T14:41:24,523 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
ZooKeeperHiveLockManager: About to release lock for default
2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
ql.Driver: Shutting down query 
...
2017-02-24T14:41:24,532 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
mapred.FileInputFormat: Total # of splits generated by getSplits: 1, TimeTaken: 
4
2017-02-24T14:41:24,532 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.FetchOperator: Creating fetchTask ...
...
2017-02-24T14:41:24,541  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.TableScanOperator: TODO# calling process
2017-02-24T14:41:24,543  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.TableScanOperator: TODO# calling process
2017-02-24T14:41:24,543  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.TableScanOperator: Closing operator TS[0]
...
{noformat}


  was:
Other assumptions may also be broken.
FetchTask.execute implementation is very curious - it does nothing, and the 
FetchTask that actually runs the query is put in the same place as the one that 
normally fetches the results; that is to say, the whole pipeline is run after 
Driver has "shut down" the query. That releases logs before the query runs, and 
may also have other implications.
Adding a log line to TSOP process method, and running encrypted_table_insert 
from EncryptedHDFS cli driver, I get:
{noformat}
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
[no lines here]
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
...
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
ZooKeeperHiveLockManager: About to release lock for default/encrypted_table
2017-02-24T14:41:24,523 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
ZooKeeperHiveLockManager: About to release lock for default
2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
log.PerfLogger: 
2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
ql.Driver: Shutting down query 
...
2017-02-24T14:41:24,532 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
mapred.FileInputFormat: Total # of splits generated by getSplits: 1, TimeTaken: 
4
2017-02-24T14:41:24,532 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.FetchOperator: Creating fetchTask ...
...
2017-02-24T14:41:24,541  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.TableScanOperator: TODO# calling process
2017-02-24T14:41:24,543  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.TableScanOperator: TODO# calling process
2017-02-24T14:41:24,543  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
exec.TableScanOperator: Closing operator TS[0]
...
{noformat}



> with fetch optimization, the query runs after locks are released 
> -
>
> Key: HIVE-16037
> URL: https://issues.apache.org/jira/browse/HIVE-16037
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Other assumptions may also be broken.
> FetchTask.execute implementation is very curious - it does nothing, and the 
> FetchTask that actually runs the query is put in the same place as the one 
> that normally fetches the results; that is 

[jira] [Commented] (HIVE-16037) with fetch optimization, the query runs after locks are released

2017-02-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883677#comment-15883677
 ] 

Sergey Shelukhin commented on HIVE-16037:
-

cc [~ashutoshc] [~ekoifman] [~hagleitn] this is a very amusing issue

> with fetch optimization, the query runs after locks are released 
> -
>
> Key: HIVE-16037
> URL: https://issues.apache.org/jira/browse/HIVE-16037
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Other assumptions may also be broken.
> FetchTask.execute implementation is very curious - it does nothing, and the 
> FetchTask that actually runs the query is put in the same place as the one 
> that normally fetches the results; that is to say, the whole pipeline is run 
> after Driver has "shut down" the query. That releases logs before the query 
> runs, and may also have other implications.
> Adding a log line to TSOP process method, and running encrypted_table_insert 
> from EncryptedHDFS cli driver, I get:
> {noformat}
> 2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> [no lines here]
> 2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  end=1487976084521 duration=0 from=org.apache.hadoop.hive.ql.Driver>
> ...
> 2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 2017-02-24T14:41:24,521 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> ZooKeeperHiveLockManager: About to release lock for default/encrypted_table
> 2017-02-24T14:41:24,523 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> ZooKeeperHiveLockManager: About to release lock for default
> 2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  end=1487976084525 duration=4 from=org.apache.hadoop.hive.ql.Driver>
> 2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> log.PerfLogger:  end=1487976084525 duration=131 from=org.apache.hadoop.hive.ql.Driver>
> 2017-02-24T14:41:24,525 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> ql.Driver: Shutting down query 
> ...
> 2017-02-24T14:41:24,532 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> mapred.FileInputFormat: Total # of splits generated by getSplits: 1, 
> TimeTaken: 4
> 2017-02-24T14:41:24,532 DEBUG [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> exec.FetchOperator: Creating fetchTask ...
> ...
> 2017-02-24T14:41:24,541  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> exec.TableScanOperator: TODO# calling process
> 2017-02-24T14:41:24,543  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> exec.TableScanOperator: TODO# calling process
> 2017-02-24T14:41:24,543  INFO [50cde691-3602-4273-a4d9-e35f0c8b6001 main] 
> exec.TableScanOperator: Closing operator TS[0]
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16033) LLAP: Use PrintGCDateStamps for gc logging

2017-02-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883654#comment-15883654
 ] 

Siddharth Seth commented on HIVE-16033:
---

+1, as long as you've tried this and it works. I vaguely recollect this parameter 
not working for me when I tried it.
Very useful if it does work - correlating events against a relative timer was annoying.

> LLAP: Use PrintGCDateStamps for gc logging
> --
>
> Key: HIVE-16033
> URL: https://issues.apache.org/jira/browse/HIVE-16033
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16033.1.patch
>
>
> This prints human-readable timestamps instead of timestamps relative to JVM 
> startup.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15958) LLAP: IPC connections are not being reused for umbilical protocol

2017-02-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883647#comment-15883647
 ] 

Siddharth Seth commented on HIVE-15958:
---

The stopUmbilical in hasAmFailed() - is this required? I think this has the 
potential to cause an NPE if there's a taskKilled already queued up.

In QueryTracker - instead of introducing a new map to store the NodeId - the 
NodeId can be stored in QueryInfo.

Otherwise, looks good.

> LLAP: IPC connections are not being reused for umbilical protocol
> -
>
> Key: HIVE-15958
> URL: https://issues.apache.org/jira/browse/HIVE-15958
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-15958.1.patch, HIVE-15958.2.patch, 
> HIVE-15958.3.patch, HIVE-15958.4.patch, HIVE-15958.4.patch
>
>
> During concurrency testing, observed 1000s of ipc thread creations. Ideally, 
> the connections to same hosts should be reused.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16023) Wrong estimation for number of rows generated by IN expression

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883625#comment-15883625
 ] 

Hive QA commented on HIVE-16023:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854475/HIVE-16023.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10259 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3763/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3763/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3763/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854475 - PreCommit-HIVE-Build

> Wrong estimation for number of rows generated by IN expression
> --
>
> Key: HIVE-16023
> URL: https://issues.apache.org/jira/browse/HIVE-16023
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16023.01.patch, HIVE-16023.patch
>
>
> The code seems to be wrong: we are factoring in the number of rows to create the 
> multiplying factor, instead of the NDV for the given column(s) and the NDV in the IN clause.
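
(A hedged sketch of the NDV-based estimate the description points at: standard 
selectivity arithmetic, not necessarily the exact shape of the patch.)

{code}
public class InEstimate {
  // rows_out ~= rows_in * min(1, inListSize / ndv(column))
  static long estimateRows(long inputRows, long ndv, int inListSize) {
    double selectivity = Math.min(1.0, (double) inListSize / Math.max(1L, ndv));
    return (long) Math.ceil(inputRows * selectivity);
  }

  public static void main(String[] args) {
    // 1M input rows, NDV = 1000, IN over 3 values -> ~3000 rows expected
    System.out.println(estimateRows(1_000_000L, 1_000L, 3));
  }
}
{code}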



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16013) Fragments without locality can stack up on nodes

2017-02-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883621#comment-15883621
 ] 

Siddharth Seth commented on HIVE-16013:
---

+1

> Fragments without locality can stack up on nodes
> 
>
> Key: HIVE-16013
> URL: https://issues.apache.org/jira/browse/HIVE-16013
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Siddharth Seth
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16013.1.patch, HIVE-16013.2.patch
>
>
> When no locality information is provided, task requests can stack up on a node 
> because of consistent node selection. When locality information is not provided, 
> we should fall back to random selection for better work distribution. 
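
(A hedged sketch of the fallback policy described; types and names are 
illustrative, not the LLAP task-scheduler code.)

{code}
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class NodePicker {
  static String pick(List<String> requestedHosts, List<String> allNodes) {
    if (requestedHosts != null && !requestedHosts.isEmpty()) {
      return requestedHosts.get(0);  // locality-aware path (simplified)
    }
    // no locality info: a random choice spreads load instead of consistently
    // mapping every such request to the same node
    return allNodes.get(ThreadLocalRandom.current().nextInt(allNodes.size()));
  }

  public static void main(String[] args) {
    System.out.println(pick(null, List.of("node-1", "node-2", "node-3")));
  }
}
{code}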



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16035) Investigate potential SQL injection vulnerability in Hive

2017-02-24 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883568#comment-15883568
 ] 

Vihang Karajgaonkar commented on HIVE-16035:


Thanks [~thejas], I was not aware. I tried closing this but there is no "close" 
option. Resolved it as Invalid for now.

> Investigate potential SQL injection vulnerability in Hive
> -
>
> Key: HIVE-16035
> URL: https://issues.apache.org/jira/browse/HIVE-16035
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> Some of the queries in the ObjectStore and MetastoreDirectSql classes append 
> String variables directly to the query text. This JIRA is to investigate the 
> possible vulnerabilities and fix them using parameterized queries.
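
(A hedged illustration of the parameterized-query direction named above; the 
metastore table and column names are used for flavor, but the method itself is 
illustrative, not the actual MetastoreDirectSql code.)

{code}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class ParameterizedLookup {
  // Unsafe variant under investigation:
  //   "SELECT TBL_ID FROM TBLS WHERE TBL_NAME = '" + name + "'"
  // Safe variant: the driver binds the value, so it can never terminate
  // the string literal and inject SQL.
  static long findTableId(Connection conn, String name) throws SQLException {
    try (PreparedStatement ps = conn.prepareStatement(
        "SELECT TBL_ID FROM TBLS WHERE TBL_NAME = ?")) {
      ps.setString(1, name);
      try (ResultSet rs = ps.executeQuery()) {
        return rs.next() ? rs.getLong(1) : -1L;
      }
    }
  }
}
{code}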



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HIVE-16035) Investigate potential SQL injection vulnerability in Hive

2017-02-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar resolved HIVE-16035.

Resolution: Invalid

> Investigate potential SQL injection vulnerability in Hive
> -
>
> Key: HIVE-16035
> URL: https://issues.apache.org/jira/browse/HIVE-16035
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> Some of the queries in the ObjectStore and MetastoreDirectSql classes append 
> String variables directly to the query text. This JIRA is to investigate the 
> possible vulnerabilities and fix them using parameterized queries.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15995) Syncing metastore table with serde schema

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883545#comment-15883545
 ] 

Hive QA commented on HIVE-15995:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854469/HIVE-15995.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10226 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver
 (batchId=161)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=129)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3762/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3762/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3762/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854469 - PreCommit-HIVE-Build

> Syncing metastore table with serde schema
> -
>
> Key: HIVE-15995
> URL: https://issues.apache.org/jira/browse/HIVE-15995
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Michal Ferlinski
>Assignee: Michal Ferlinski
> Fix For: 2.1.0
>
> Attachments: cx1.avsc, cx2.avsc, HIVE-15995.patch
>
>
> Hive enables table schema evolution via properties. For avro, e.g., we can 
> alter the 'avro.schema.url' property to update the table schema to the next 
> version. Updating properties, however, doesn't affect the column list stored in 
> the metastore DB, so the table is not in the newest version when returned from 
> the metastore API. This is a problem for tools working with the metastore (e.g. Presto).
> To solve this issue I suggest introducing a new DDL statement syncing 
> metastore columns with those from the serde:
> {code}
> ALTER TABLE user_test1 UPDATE COLUMNS
> {code}
> Note that this is a format-independent solution.
> To reproduce, follow the instructions below:
> - Create a table based on avro schema version 1 (cx1.avsc)
> {code}
> CREATE EXTERNAL TABLE user_test1
>   PARTITIONED BY (dt string)
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION
>   '/tmp/schema-evolution/user_test1'
>   TBLPROPERTIES ('avro.schema.url'='/tmp/schema-evolution/cx1.avsc');
> {code}
> - Update schema to version 2 (cx2.avsc)
> {code}
> ALTER TABLE user_test1 SET TBLPROPERTIES ('avro.schema.url' = 
> '/tmp/schema-evolution/cx2.avsc');
> {code}
> - Print serde columns (top info) and metastore columns (Detailed Table 
> Information):
> {code}
> DESCRIBE EXTENDED user_test1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16035) Investigate potential SQL injection vulnerability in Hive

2017-02-24 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883543#comment-15883543
 ] 

Thejas M Nair commented on HIVE-16035:
--

[~vihangk1]
Please see - https://www.apache.org/security/committers.html
TLDR - Security vulnerabilities should not be investigated/discussed in public 
until a fix is out.
Please involve the security mailing list secur...@hive.apache.org if you 
suspect there is an issue or want to report one.

I think it's better to close this jira and follow this process.


> Investigate potential SQL injection vulnerability in Hive
> -
>
> Key: HIVE-16035
> URL: https://issues.apache.org/jira/browse/HIVE-16035
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> Some of the queries in ObjectStore and MetastoreDirectSql classes append 
> String variables directly to the query text. This JIRA is to investigate the 
> possible vulnerabilities and fix them using parameterized queries.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HIVE-16015) LLAP: some Tez INFO logs are too noisy II

2017-02-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-16015.
-
Resolution: Fixed

Committed to master after discussion; also removed TezMerger in light of 
TEZ-3637.
The logs will not be gone until Tez 0.9. If you are bothered by spammy logs 
before then, you are encouraged to complain to [~sseth] personally ;)

> LLAP: some Tez INFO logs are too noisy II
> -
>
> Key: HIVE-16015
> URL: https://issues.apache.org/jira/browse/HIVE-16015
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-16015.01.patch, HIVE-16015.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-12376) make hive.compactor.worker.threads use a thread pool, etc

2017-02-24 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883468#comment-15883468
 ] 

Eugene Koifman commented on HIVE-12376:
---

Together with HIVE-11685, it is probably better to simply rely on 
_hive.compactor.job.queue_ for resource management and have the Worker just 
submit jobs to it.
Then we can have one polling thread that scans COMPACTION_QUEUE for running jobs 
and checks with Yarn whether they are done.
The number of Worker threads could default to 5 or so; since generating a 
compaction job may take time when there are very many deltas, it should not be 
used to throttle the number of jobs.
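As a rough sketch of that shape, assuming a plain ThreadPoolExecutor (the class
name and all sizes here are illustrative, not the actual Worker code):

{code}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CompactorPoolSketch {
  public static void main(String[] args) {
    // A bounded pool with separate core/max capacities instead of creating
    // all worker threads upfront; extra threads are only spawned once the
    // queue of pending compactions fills up.
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        5,      // core size: the suggested default of ~5 worker threads
        1000,   // max size: the suggested hard limit
        60L, TimeUnit.SECONDS,              // reclaim idle non-core threads
        new ArrayBlockingQueue<>(10000));   // pending compaction requests
    pool.submit(() -> System.out.println("compaction job submitted"));
    pool.shutdown();
  }
}
{code}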



> make hive.compactor.worker.threads use a thread pool, etc
> -
>
> Key: HIVE-12376
> URL: https://issues.apache.org/jira/browse/HIVE-12376
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> # use a thread pool with core/max capacities instead of creating all threads 
> upfront
> # make sure there is a limit (1000 threads?)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15708) Upgrade calcite version to 1.12

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883452#comment-15883452
 ] 

Hive QA commented on HIVE-15708:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854555/HIVE-15708.12.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 10244 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_select] 
(batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_timestamp] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=2)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[materialized_view_create_rewrite_multi_db]
 (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ops_comparison] 
(batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reduce_deduplicate_extended2]
 (batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_reflect2] 
(batchId=14)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_ppd_decimal]
 (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=105)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=211)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
org.apache.hive.jdbc.TestJdbcDriver2.testPrepareSetTimestamp (batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3761/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3761/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3761/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854555 - PreCommit-HIVE-Build

> Upgrade calcite version to 1.12
> ---
>
> Key: HIVE-15708
> URL: https://issues.apache.org/jira/browse/HIVE-15708
> Project: Hive
>  Issue Type: Task
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.2.0
>Reporter: Ashutosh Chauhan
>Assignee: Remus Rusanu
> Attachments: HIVE-15708.01.patch, HIVE-15708.02.patch, 
> HIVE-15708.03.patch, HIVE-15708.04.patch, HIVE-15708.05.patch, 
> HIVE-15708.06.patch, HIVE-15708.07.patch, HIVE-15708.08.patch, 
> HIVE-15708.09.patch, HIVE-15708.10.patch, HIVE-15708.11.patch, 
> HIVE-15708.12.patch
>
>
> Currently we are on 1.10. Need to upgrade the Calcite version to 1.12.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15166) Provide beeline option to set the jline history max size

2017-02-24 Thread Eric Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883444#comment-15883444
 ] 

Eric Lin commented on HIVE-15166:
-

Looks like I need to rebase my code, as a lot has changed. Will update the 
patch again this coming Monday.

> Provide beeline option to set the jline history max size
> 
>
> Key: HIVE-15166
> URL: https://issues.apache.org/jira/browse/HIVE-15166
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.1.0
>Reporter: Eric Lin
>Assignee: Eric Lin
>Priority: Minor
> Attachments: HIVE-15166.2.patch, HIVE-15166.patch
>
>
> Currently Beeline does not provide an option to limit the max size of the 
> beeline history file. When individual queries are very big, they flood 
> the history file and slow down beeline on startup and shutdown.
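For reference, a rough sketch of how such a cap could look with the jline2 API 
that Beeline uses (the path and size here are illustrative; this is not the 
actual patch):

{code}
import java.io.File;
import jline.console.ConsoleReader;
import jline.console.history.FileHistory;

public class HistoryCapSketch {
  public static void main(String[] args) throws Exception {
    ConsoleReader reader = new ConsoleReader();
    FileHistory history = new FileHistory(
        new File(System.getProperty("user.home"), ".beeline/history"));
    history.setMaxSize(500);   // cap entries so huge queries can't bloat the file
    reader.setHistory(history);
    history.flush();           // persist the (trimmed) history to disk
  }
}
{code}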



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16028) Fail UPDATE/DELETE/MERGE queries when Ranger authorization manager is used

2017-02-24 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-16028:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks Pengcheng for the review!

> Fail UPDATE/DELETE/MERGE queries when Ranger authorization manager is used
> --
>
> Key: HIVE-16028
> URL: https://issues.apache.org/jira/browse/HIVE-16028
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Transactions
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 2.2.0
>
> Attachments: HIVE-16028.1.patch, HIVE-16028.2.patch
>
>
> This is a followup to HIVE-15891. In that jira, error-out logic was added, 
> but the assumption that we need to do row filtering/column masking for 
> entries in a non-empty list of tables returned by 
> applyRowFilterAndColumnMasking is wrong, because on the Ranger side, 
> RangerHiveAuthorizer#applyRowFilterAndColumnMasking will unconditionally 
> return a list of tables no matter whether row filtering/column masking is 
> applicable to them.
> The fix for Hive for now is to move the error-out logic to after we figure 
> out there's no replacement text for the query. But ideally we should consider 
> modifying the Ranger logic to only return tables that need to be masked.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16005) miscellaneous small fixes to help with llap debuggability

2017-02-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883412#comment-15883412
 ] 

Prasanth Jayachandran commented on HIVE-16005:
--

+1

> miscellaneous small fixes to help with llap debuggability
> -
>
> Key: HIVE-16005
> URL: https://issues.apache.org/jira/browse/HIVE-16005
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16005.01.patch, HIVE-16005.02.patch, 
> HIVE-16005.03.patch
>
>
> - Include proc_ in cli, beeline, metastore, hs2 process args
> - LLAP history logger - log QueryId instead of dagName (dag name is 
> free-flowing text)
> - LLAP JMX ExecutorStatus - log QueryId instead of dagName. Sort by running / 
> queued
> - Include thread name in TaskRunnerCallable so that it shows up in stack 
> traces (will cause extra output in logs)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15958) LLAP: IPC connections are not being reused for umbilical protocol

2017-02-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15958:
-
Attachment: HIVE-15958.4.patch

Rebased patch.

> LLAP: IPC connections are not being reused for umbilical protocol
> -
>
> Key: HIVE-15958
> URL: https://issues.apache.org/jira/browse/HIVE-15958
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-15958.1.patch, HIVE-15958.2.patch, 
> HIVE-15958.3.patch, HIVE-15958.4.patch, HIVE-15958.4.patch
>
>
> During concurrency testing, observed 1000s of ipc thread creations. Ideally, 
> the connections to same hosts should be reused.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-14459) TestBeeLineDriver - migration and re-enable

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883377#comment-15883377
 ] 

Hive QA commented on HIVE-14459:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854459/HIVE-14459.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10259 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3760/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3760/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3760/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854459 - PreCommit-HIVE-Build

> TestBeeLineDriver - migration and re-enable
> ---
>
> Key: HIVE-14459
> URL: https://issues.apache.org/jira/browse/HIVE-14459
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
> Attachments: HIVE-14459.2.patch, HIVE-14459.3.patch, 
> HIVE-14459.4.patch, HIVE-14459.5.patch, HIVE-14459.patch
>
>
> this test has been left behind in HIVE-1 because it had some compile 
> issues.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16035) Investigate potential SQL injection vulnerability in Hive

2017-02-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-16035:
--


> Investigate potential SQL injection vulnerability in Hive
> -
>
> Key: HIVE-16035
> URL: https://issues.apache.org/jira/browse/HIVE-16035
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> Some of the queries in ObjectStore and MetastoreDirectSql classes append 
> String variables directly to the query text. This JIRA is to investigate the 
> possible vulnerabilities and fix them using parameterized queries.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15879) Fix HiveMetaStoreChecker.checkPartitionDirs method

2017-02-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-15879:
---
Attachment: HIVE-15879.04.patch

> Fix HiveMetaStoreChecker.checkPartitionDirs method
> --
>
> Key: HIVE-15879
> URL: https://issues.apache.org/jira/browse/HIVE-15879
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-15879.01.patch, HIVE-15879.02.patch, 
> HIVE-15879.03.patch, HIVE-15879.04.patch
>
>
> HIVE-15803 fixes the msck hang issue in 
> HiveMetaStoreChecker.checkPartitionDirs method by adding a check to see if 
> the Threadpool has any spare threads. If not it uses single threaded listing 
> of the files.
> {noformat}
> if (pool != null) {
>   synchronized (pool) {
> // In case of recursive calls, it is possible to deadlock with TP. 
> Check TP usage here.
> if (pool.getActiveCount() < pool.getMaximumPoolSize()) {
>   useThreadPool = true;
> }
> if (!useThreadPool) {
>   if (LOG.isDebugEnabled()) {
> LOG.debug("Not using threadPool as active count:" + 
> pool.getActiveCount()
> + ", max:" + pool.getMaximumPoolSize());
>   }
> }
>   }
> }
> {noformat}
> Based on the java doc of getActiveCount() below 
> bq. Returns the approximate number of threads that are actively executing 
> tasks.
> it returns only an approximate count, and it cannot be guaranteed to always 
> return the exact number of active threads. This still exposes 
> the method implementation to the msck hang bug in rare corner cases.
> We could either:
> 1. Use an atomic counter to track exactly how many threads are actively running
> 2. Relook at the method itself to make it much simpler, e.g. look into 
> the possibility of changing the recursive implementation to an iterative 
> implementation where worker threads pick tasks from a queue until the queue 
> is empty.
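A minimal sketch of option 2, with java.nio.file standing in for Hadoop's 
FileSystem (illustrative only, not the actual HiveMetaStoreChecker code). Each 
round drains a shared queue of directories and lists them in parallel, so no 
pool thread ever blocks waiting on a task it submitted itself:

{code}
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class IterativeDirScanSketch {
  // Lists all leaf directories under root without recursion.
  static List<Path> leafDirs(Path root, int nThreads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(nThreads);
    List<Path> leaves = Collections.synchronizedList(new ArrayList<Path>());
    Queue<Path> pending = new ConcurrentLinkedQueue<Path>();
    pending.add(root);
    try {
      while (!pending.isEmpty()) {
        List<Callable<Void>> batch = new ArrayList<Callable<Void>>();
        for (Path dir; (dir = pending.poll()) != null; ) {
          final Path d = dir;
          batch.add(() -> {
            boolean hasSubdir = false;
            try (DirectoryStream<Path> ds = Files.newDirectoryStream(d)) {
              for (Path child : ds) {
                if (Files.isDirectory(child)) {
                  pending.add(child);  // becomes the next round's frontier
                  hasSubdir = true;
                }
              }
            }
            if (!hasSubdir) {
              leaves.add(d);  // a leaf: candidate partition directory
            }
            return null;
          });
        }
        for (Future<Void> f : pool.invokeAll(batch)) {
          f.get();  // surface any listing errors from the workers
        }
      }
    } finally {
      pool.shutdown();
    }
    return leaves;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(leafDirs(Paths.get(args[0]), 4));
  }
}
{code}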



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15708) Upgrade calcite version to 1.12

2017-02-24 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-15708:

Attachment: HIVE-15708.12.patch

12.patch updates many golden files. The HIVE-16027 changes modified many plans 
in a trivial way (adding CAST to BETWEEN).

> Upgrade calcite version to 1.12
> ---
>
> Key: HIVE-15708
> URL: https://issues.apache.org/jira/browse/HIVE-15708
> Project: Hive
>  Issue Type: Task
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.2.0
>Reporter: Ashutosh Chauhan
>Assignee: Remus Rusanu
> Attachments: HIVE-15708.01.patch, HIVE-15708.02.patch, 
> HIVE-15708.03.patch, HIVE-15708.04.patch, HIVE-15708.05.patch, 
> HIVE-15708.06.patch, HIVE-15708.07.patch, HIVE-15708.08.patch, 
> HIVE-15708.09.patch, HIVE-15708.10.patch, HIVE-15708.11.patch, 
> HIVE-15708.12.patch
>
>
> Currently we are on 1.10. Need to upgrade the Calcite version to 1.12.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15702) Test timeout : TestDerbyConnector

2017-02-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883365#comment-15883365
 ] 

Ashutosh Chauhan commented on HIVE-15702:
-

yeah.. this is a helper class. Funnily, it's used by DruidStorageHandlerTest, 
whose name doesn't have the Test prefix :) 
These two classes need to be renamed.

> Test timeout : TestDerbyConnector 
> --
>
> Key: HIVE-15702
> URL: https://issues.apache.org/jira/browse/HIVE-15702
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Thejas M Nair
>Assignee: slim bouguerra
>
> TestDerbyConnector seems to be timing out quite frequently (based on a search 
> of hive-issues mailing list test output).
> This was seen with HIVE-15579 - 
> bq. TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
> https://builds.apache.org/job/PreCommit-HIVE-Build/3108/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16013) Fragments without locality can stack up on nodes

2017-02-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16013:
-
Attachment: HIVE-16013.2.patch

Addressed review comments. Added some more tests.
[~sseth] can you please take another look?

> Fragments without locality can stack up on nodes
> 
>
> Key: HIVE-16013
> URL: https://issues.apache.org/jira/browse/HIVE-16013
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Siddharth Seth
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16013.1.patch, HIVE-16013.2.patch
>
>
> When no locality information is provided, task requests can stack up on a node 
> because of consistent node selection. When locality information is not provided, 
> we should fall back to random selection for better work distribution.
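The gist of the fix, as a sketch of the general idea only (names are made up; 
the actual change lives in LLAP's task scheduling code, not shown here):

{code}
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class NoLocalityFallbackSketch {
  // Always consulting a consistent-hash ring for requests without location
  // hints sends every such request to the same node; picking uniformly at
  // random instead spreads no-locality fragments across the cluster.
  static String pickNode(List<String> liveNodes, List<String> requestedHosts) {
    if (requestedHosts == null || requestedHosts.isEmpty()) {
      return liveNodes.get(ThreadLocalRandom.current().nextInt(liveNodes.size()));
    }
    return requestedHosts.get(0);  // locality-aware path, greatly simplified
  }
}
{code}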



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16006) Incremental REPL LOAD doesn't operate on the target database if name differs from source database.

2017-02-24 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16006:

Attachment: HIVE-16006.02.patch

[~sushanth]
Agreed with your comment. The new patch ensures backward compatibility with 
replv1 import. Also modified an existing test case to verify the incremental 
insert.

Please review the patch.

> Incremental REPL LOAD doesn't operate on the target database if name differs 
> from source database.
> --
>
> Key: HIVE-16006
> URL: https://issues.apache.org/jira/browse/HIVE-16006
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Attachments: HIVE-16006.01.patch, HIVE-16006.02.patch
>
>
> During "Incremental Load", it is not considering the database name input in 
> the command line. Hence load doesn't happen. At the same time, database with 
> original name is getting modified.
> Steps:
> 1. REPL DUMP default FROM 52;
> 2. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
> – This step modifies the default Db instead of replDb.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15999) Fix flakiness in TestDbTxnManager2

2017-02-24 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-15999:
-
Component/s: Tests

> Fix flakiness in TestDbTxnManager2
> --
>
> Key: HIVE-15999
> URL: https://issues.apache.org/jira/browse/HIVE-15999
> Project: Hive
>  Issue Type: Bug
>  Components: Tests, Transactions
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 2.2.0
>
> Attachments: HIVE-15999.1.patch
>
>
> Right now there is test flakiness wrt. TestDbTxnManager2. The error is like 
> this:
> {code}
> java.sql.SQLException: Table/View 'TXNS' already exists in Schema 'APP'.
>   at 
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown 
> Source)
>   at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown 
> Source)
>   at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
>   at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnDbUtil.prepDb(TxnDbUtil.java:75)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.setUp(TestDbTxnManager2.java:90)
> {code}
> The failure is due to the HiveConf used in the test being polluted by some 
> test; e.g. in testDummyTxnManagerOnAcidTable(), the conf entry HIVE_TXN_MANAGER 
> is set to "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager" but not switched 
> back.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15999) Fix flakiness in TestDbTxnManager2

2017-02-24 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-15999:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks Eugene for the review!

> Fix flakiness in TestDbTxnManager2
> --
>
> Key: HIVE-15999
> URL: https://issues.apache.org/jira/browse/HIVE-15999
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 2.2.0
>
> Attachments: HIVE-15999.1.patch
>
>
> Right now there is test flakiness wrt. TestDbTxnManager2. The error is like 
> this:
> {code}
> java.sql.SQLException: Table/View 'TXNS' already exists in Schema 'APP'.
>   at 
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown 
> Source)
>   at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown 
> Source)
>   at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
>   at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnDbUtil.prepDb(TxnDbUtil.java:75)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.setUp(TestDbTxnManager2.java:90)
> {code}
> The failure is due to the HiveConf used in the test being polluted by some 
> test; e.g. in testDummyTxnManagerOnAcidTable(), the conf entry HIVE_TXN_MANAGER 
> is set to "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager" but not switched 
> back.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15879) Fix HiveMetaStoreChecker.checkPartitionDirs method

2017-02-24 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883338#comment-15883338
 ] 

Vihang Karajgaonkar commented on HIVE-15879:


I agree that the patch does not improve the case of having 1 level of 
partitions; it performs similarly to the existing approach. Did a simple test 
with a single-partition-key table with ~1800 partitions on S3. Both 
implementations take about the same time, ~60 sec. But we quickly start seeing 
the benefits of this approach as soon as the number of partition keys increases.

Repeated the test above with 2 partition keys and 10*10 = 100 partitions. The 
results below show a significant performance gain with the default configs.

|| Default pool size || Before || After ||
| Time taken (sec) | 19.8 | 3.27 |

Hi [~rajesh.balamohan], I can change the JIRA description and category to 
"Improvement" if you think that is more appropriate. Thanks!

Also updated the review board with patch HIVE-15879.03.patch


> Fix HiveMetaStoreChecker.checkPartitionDirs method
> --
>
> Key: HIVE-15879
> URL: https://issues.apache.org/jira/browse/HIVE-15879
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-15879.01.patch, HIVE-15879.02.patch, 
> HIVE-15879.03.patch
>
>
> HIVE-15803 fixes the msck hang issue in 
> HiveMetaStoreChecker.checkPartitionDirs method by adding a check to see if 
> the Threadpool has any spare threads. If not it uses single threaded listing 
> of the files.
> {noformat}
> if (pool != null) {
>   synchronized (pool) {
> // In case of recursive calls, it is possible to deadlock with TP. 
> Check TP usage here.
> if (pool.getActiveCount() < pool.getMaximumPoolSize()) {
>   useThreadPool = true;
> }
> if (!useThreadPool) {
>   if (LOG.isDebugEnabled()) {
> LOG.debug("Not using threadPool as active count:" + 
> pool.getActiveCount()
> + ", max:" + pool.getMaximumPoolSize());
>   }
> }
>   }
> }
> {noformat}
> Based on the java doc of getActiveCount() below 
> bq. Returns the approximate number of threads that are actively executing 
> tasks.
> it returns only an approximate count, and it cannot be guaranteed to always 
> return the exact number of active threads. This still exposes 
> the method implementation to the msck hang bug in rare corner cases.
> We could either:
> 1. Use an atomic counter to track exactly how many threads are actively running
> 2. Relook at the method itself to make it much simpler, e.g. look into 
> the possibility of changing the recursive implementation to an iterative 
> implementation where worker threads pick tasks from a queue until the queue 
> is empty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15999) Fix flakiness in TestDbTxnManager2

2017-02-24 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883328#comment-15883328
 ] 

Eugene Koifman commented on HIVE-15999:
---

+1

> Fix flakiness in TestDbTxnManager2
> --
>
> Key: HIVE-15999
> URL: https://issues.apache.org/jira/browse/HIVE-15999
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-15999.1.patch
>
>
> Right now there is test flakiness wrt. TestDbTxnManager2. The error is like 
> this:
> {code}
> java.sql.SQLException: Table/View 'TXNS' already exists in Schema 'APP'.
>   at 
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown 
> Source)
>   at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown 
> Source)
>   at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
>   at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnDbUtil.prepDb(TxnDbUtil.java:75)
>   at 
> org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.setUp(TestDbTxnManager2.java:90)
> {code}
> The failure is due to the HiveConf used in the test being polluted by some 
> test; e.g. in testDummyTxnManagerOnAcidTable(), the conf entry HIVE_TXN_MANAGER 
> is set to "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager" but not switched 
> back.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16034) Hive/Druid integration: Fix type inference for Decimal DruidOutputFormat

2017-02-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883318#comment-15883318
 ] 

Ashutosh Chauhan commented on HIVE-16034:
-

This can be made more robust by making sure the type of dimensions is in 
PrimitiveGrouping.STRING_GROUP and the type of metrics is in 
PrimitiveGrouping.NUMERIC_GROUP, and throwing an exception if that is not the 
case. Outside of these groupings we can't handle other types anyway.
The current default of assuming dimensions makes the code less predictable and 
results in hard-to-debug issues.

> Hive/Druid integration: Fix type inference for Decimal DruidOutputFormat
> 
>
> Key: HIVE-16034
> URL: https://issues.apache.org/jira/browse/HIVE-16034
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16034.patch
>
>
> We are extracting the type name as a String, which might cause issues, e.g., 
> for Decimal, where the type name includes precision and scale. Instead, we 
> should check the PrimitiveCategory enum.
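A minimal sketch of the enum-based check (illustrative; the actual 
DruidOutputFormat logic differs). Dispatching on PrimitiveCategory treats 
decimal(10,2) and decimal(38,0) alike, whereas their type-name strings differ:

{code}
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector.PrimitiveCategory;

public class DruidTypeCheckSketch {
  static boolean isNumeric(PrimitiveCategory cat) {
    switch (cat) {
      case BYTE: case SHORT: case INT: case LONG:
      case FLOAT: case DOUBLE: case DECIMAL:
        return true;   // DECIMAL matches regardless of precision/scale
      default:
        return false;
    }
  }
}
{code}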



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-14864) Distcp is not called from MoveTask when src is a directory

2017-02-24 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-14864:

Attachment: HIVE-14864.3.patch

[~spena] comments addressed, rebased the patch. While writes to S3 won't hit 
this code path, it's still useful for other destination filesystems.

> Distcp is not called from MoveTask when src is a directory
> --
>
> Key: HIVE-14864
> URL: https://issues.apache.org/jira/browse/HIVE-14864
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Sahil Takiar
> Attachments: HIVE-14864.1.patch, HIVE-14864.2.patch, 
> HIVE-14864.3.patch, HIVE-14864.patch
>
>
> In FileUtils.java the following code does not get executed even when the src 
> directory size is greater than HIVE_EXEC_COPYFILE_MAXSIZE, because 
> srcFS.getFileStatus(src).getLen() returns 0 when src is a directory. We 
> should use srcFS.getContentSummary(src).getLength() instead.
> {noformat}
> /* Run distcp if source file/dir is too big */
> if (srcFS.getUri().getScheme().equals("hdfs") &&
> srcFS.getFileStatus(src).getLen() > 
> conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE)) {
>   LOG.info("Source is " + srcFS.getFileStatus(src).getLen() + " bytes. 
> (MAX: " + conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE) + 
> ")");
>   LOG.info("Launch distributed copy (distcp) job.");
>   HiveConfUtil.updateJobCredentialProviders(conf);
>   copied = shims.runDistCp(src, dst, conf);
>   if (copied && deleteSource) {
> srcFS.delete(src, true);
>   }
> }
> {noformat}
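For comparison, the same block with the suggested change applied (a sketch only, 
reusing the surrounding variables from the snippet above):

{code}
/* getContentSummary() sums the bytes under a directory, whereas
   getFileStatus().getLen() is 0 when src is a directory. */
long srcSize = srcFS.getContentSummary(src).getLength();
if (srcFS.getUri().getScheme().equals("hdfs") &&
    srcSize > conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE)) {
  LOG.info("Source is " + srcSize + " bytes. (MAX: "
      + conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE) + ")");
  LOG.info("Launch distributed copy (distcp) job.");
  HiveConfUtil.updateJobCredentialProviders(conf);
  copied = shims.runDistCp(src, dst, conf);
  if (copied && deleteSource) {
    srcFS.delete(src, true);
  }
}
{code}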



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16034) Hive/Druid integration: Fix type inference for Decimal DruidOutputFormat

2017-02-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883272#comment-15883272
 ] 

Ashutosh Chauhan commented on HIVE-16034:
-

+1

> Hive/Druid integration: Fix type inference for Decimal DruidOutputFormat
> 
>
> Key: HIVE-16034
> URL: https://issues.apache.org/jira/browse/HIVE-16034
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16034.patch
>
>
> We are extracting the type name as a String, which might cause issues, e.g., 
> for Decimal, where the type name includes precision and scale. Instead, we 
> should check the PrimitiveCategory enum.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16018) Add more information for DynamicPartitionPruningOptimization

2017-02-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16018:
---
Attachment: HIVE-16018.03.patch

> Add more information for DynamicPartitionPruningOptimization
> 
>
> Key: HIVE-16018
> URL: https://issues.apache.org/jira/browse/HIVE-16018
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16018.01.patch, HIVE-16018.02.patch, 
> HIVE-16018.03.patch, qfile.q, qfile.q.out
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16018) Add more information for DynamicPartitionPruningOptimization

2017-02-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16018:
---
Status: Patch Available  (was: Open)

> Add more information for DynamicPartitionPruningOptimization
> 
>
> Key: HIVE-16018
> URL: https://issues.apache.org/jira/browse/HIVE-16018
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16018.01.patch, HIVE-16018.02.patch, 
> HIVE-16018.03.patch, qfile.q, qfile.q.out
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16018) Add more information for DynamicPartitionPruningOptimization

2017-02-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16018:
---
Status: Open  (was: Patch Available)

> Add more information for DynamicPartitionPruningOptimization
> 
>
> Key: HIVE-16018
> URL: https://issues.apache.org/jira/browse/HIVE-16018
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16018.01.patch, HIVE-16018.02.patch, 
> HIVE-16018.03.patch, qfile.q, qfile.q.out
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16034) Hive/Druid integration: Fix type inference for Decimal DruidOutputFormat

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883242#comment-15883242
 ] 

Hive QA commented on HIVE-16034:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854453/HIVE-16034.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10254 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=230)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3759/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3759/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3759/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854453 - PreCommit-HIVE-Build

> Hive/Druid integration: Fix type inference for Decimal DruidOutputFormat
> 
>
> Key: HIVE-16034
> URL: https://issues.apache.org/jira/browse/HIVE-16034
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16034.patch
>
>
> We are extracting the type name as a String, which might cause issues, e.g., 
> for Decimal, where the type name includes precision and scale. Instead, we 
> should check the PrimitiveCategory enum.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-10924) add support for MERGE statement

2017-02-24 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883191#comment-15883191
 ] 

Eugene Koifman commented on HIVE-10924:
---

HIVE-15269 is key for performance of Merge with WHEN NOT MATCHED clauses and 
large target tables

> add support for MERGE statement
> ---
>
> Key: HIVE-10924
> URL: https://issues.apache.org/jira/browse/HIVE-10924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning, Query Processor, Transactions
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> add support for 
> MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ...



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16024) MSCK Repair Requires nonstrict hive.mapred.mode

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883156#comment-15883156
 ] 

Hive QA commented on HIVE-16024:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854451/HIVE-16024.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10258 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3758/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3758/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3758/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854451 - PreCommit-HIVE-Build

> MSCK Repair Requires nonstrict hive.mapred.mode
> ---
>
> Key: HIVE-16024
> URL: https://issues.apache.org/jira/browse/HIVE-16024
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-16024.01.patch, HIVE-16024.02.patch
>
>
> MSCK repair fails when hive.mapred.mode is set to strict.
> HIVE-13788 modified the way we read partitions for a table to improve 
> performance. Unfortunately it uses PartitionPruner to load the partitions, 
> which in turn checks hive.mapred.mode.
> The previous code did not check hive.mapred.mode.
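Until the fix lands, a session-level workaround is to relax the mode for the 
repair (the table name here is hypothetical):

{code}
SET hive.mapred.mode=nonstrict;
MSCK REPAIR TABLE my_partitioned_table;
{code}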



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15935) ACL is not set in ATS data

2017-02-24 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883152#comment-15883152
 ] 

Thejas M Nair commented on HIVE-15935:
--

[~sseth] Can you please review ?


> ACL is not set in ATS data
> --
>
> Key: HIVE-15935
> URL: https://issues.apache.org/jira/browse/HIVE-15935
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-15935.1.patch, HIVE-15935.2.patch, 
> HIVE-15935.3.patch, HIVE-15935.4.patch
>
>
> When publishing ATS info, Hive does not set an ACL, which makes Hive ATS 
> entries visible to all users. On the other hand, Tez ATS entries use the Tez 
> DAG ACL, which limits both the view and modify ACLs to the end user only. We 
> shall make them consistent. In this Jira, I am going to limit the ACL to the 
> end user for both Tez ATS and Hive ATS, and also provide the configs 
> "hive.view.acls" and "hive.modify.acls" in case the user needs to override them.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16027) BETWEEN AND must cast to TIMESTAMP

2017-02-24 Thread Remus Rusanu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883140#comment-15883140
 ] 

Remus Rusanu commented on HIVE-16027:
-

The {{IN}} operator has the same problem. I have a fix for both in the 
HIVE-15708 patch.

>  BETWEEN  AND  must cast to TIMESTAMP
> 
>
> Key: HIVE-16027
> URL: https://issues.apache.org/jira/browse/HIVE-16027
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>
> This is the HIVE side fix for the issues manifesting itself as CALCITE-1626



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HIVE-15414) Fix batchSize for TestNegativeCliDriver

2017-02-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar resolved HIVE-15414.

Resolution: Fixed

Resolving this issue since we have changed the batch size of 
TestNegativeCliDriver and it leads to an improvement in execution time.

> Fix batchSize for TestNegativeCliDriver
> ---
>
> Key: HIVE-15414
> URL: https://issues.apache.org/jira/browse/HIVE-15414
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
>
> While analyzing pre-commit console logs, I noticed that the 
> TestNegativeCliDriver batchSize is ~770 qfiles, which doesn't look right.
> 2016-12-09 22:23:58,945 DEBUG [TestExecutor] ExecutionPhase.execute:96 
> PBatch: QFileTestBatch [batchId=84, size=774, driver=TestNegativeCliDriver, 
> queryFilesProperty=qfile, 
> name=84-TestNegativeCliDriver-nopart_insert.q-input41.q-having1.q-and-771-more..
>   
> I think {{qFileTest.clientNegative.batchSize = 1000}} in 
> {{test-configuration2.properties}} is probably the reason. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16016) Use same PersistenceManager for metadata and notification

2017-02-24 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-16016:
---
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Marking this as a duplicate of HIVE-15766.

[~vgumashta], it'd be great to commit that patch, because it's a critical fix. 
I fixed the failing tests in this patch and moved them over to HIVE-15305; I'd 
appreciate it if you could take a look at that as well. Thank you!

> Use same PersistenceManager for metadata and notification
> -
>
> Key: HIVE-16016
> URL: https://issues.apache.org/jira/browse/HIVE-16016
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-16016.patch
>
>
> HIVE-13966 added support for persisting the notification in the same JDO 
> transaction as the metadata event. However, the notification is currently 
> added using a different ObjectStore object from the one used to persist the 
> metadata event.  
> The notification is added using the ObjectStore constructed in 
> DbNotificationListener, whereas the metadata event is added via the thread 
> local ObjectStore (i.e. threadLocalMS in HiveMetaStore.HMSHandler).
> As a result, different PersistenceManagers (different transactions) are used 
> to persist notification and metadata events.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15305) Add tests for METASTORE_EVENT_LISTENERS

2017-02-24 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883113#comment-15883113
 ] 

Mohit Sabharwal commented on HIVE-15305:


[~vgumashta], could you please take a look? Thanks!

> Add tests for METASTORE_EVENT_LISTENERS
> ---
>
> Key: HIVE-15305
> URL: https://issues.apache.org/jira/browse/HIVE-15305
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-15305.patch
>
>
> HIVE-15232 reused TestDbNotificationListener to test 
> METASTORE_TRANSACTIONAL_EVENT_LISTENERS and removed unit testing of 
> METASTORE_EVENT_LISTENERS config. We should test both. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15305) Add tests for METASTORE_EVENT_LISTENERS

2017-02-24 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-15305:
---
Status: Patch Available  (was: Open)

> Add tests for METASTORE_EVENT_LISTENERS
> ---
>
> Key: HIVE-15305
> URL: https://issues.apache.org/jira/browse/HIVE-15305
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-15305.patch
>
>
> HIVE-15232 reused TestDbNotificationListener to test 
> METASTORE_TRANSACTIONAL_EVENT_LISTENERS and removed unit testing of 
> METASTORE_EVENT_LISTENERS config. We should test both. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15305) Add tests for METASTORE_EVENT_LISTENERS

2017-02-24 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-15305:
---
Attachment: HIVE-15305.patch

> Add tests for METASTORE_EVENT_LISTENERS
> ---
>
> Key: HIVE-15305
> URL: https://issues.apache.org/jira/browse/HIVE-15305
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-15305.patch
>
>
> HIVE-15232 reused TestDbNotificationListener to test 
> METASTORE_TRANSACTIONAL_EVENT_LISTENERS and removed unit testing of 
> METASTORE_EVENT_LISTENERS config. We should test both. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15702) Test timeout : TestDerbyConnector

2017-02-24 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883103#comment-15883103
 ] 

Thejas M Nair commented on HIVE-15702:
--

bq. looks like a utility class. Should we rename it?
Good point.
Is this really a test class? [~bslim] [~jcamachorodriguez] [~ashutoshc]?


> Test timeout : TestDerbyConnector 
> --
>
> Key: HIVE-15702
> URL: https://issues.apache.org/jira/browse/HIVE-15702
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Thejas M Nair
>Assignee: slim bouguerra
>
> TestDerbyConnector seems to be timing out quite frequently (based on a search 
> of hive-issues mailing list test output).
> This was seen with HIVE-15579 - 
> bq. TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
> https://builds.apache.org/job/PreCommit-HIVE-Build/3108/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-13864) Beeline ignores the command that follows a semicolon and comment

2017-02-24 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883099#comment-15883099
 ] 

Aihua Xu commented on HIVE-13864:
-

The logic seems good: we are carrying the quote info from the previous line to 
the next line. But I find it a little confusing to use an array to pass a single 
value, and also we are really tracking whether a char is within quotes (a 
string), not really an escape. Minor suggestions to make the code more readable: 
1. Can you add some comments to the function and to some lines to explain what 
you are doing? 2. Check whether you can use {{Character startQuote}} to carry 
such info, with a special value (e.g. null) meaning it's unset?
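A minimal sketch of the {{Character}}-based suggestion (illustrative only; it 
ignores escaped quotes and comment handling):

{code}
public class QuoteStateSketch {
  // The quote character currently open across lines; null means we are not
  // inside a string literal.
  private Character openQuote = null;

  void feedLine(String line) {
    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      if (openQuote == null) {
        if (c == '\'' || c == '"') {
          openQuote = c;        // a string literal starts here
        }
      } else if (c == openQuote) {
        openQuote = null;       // the literal ends; back to plain SQL
      }
    }
  }

  boolean insideStringLiteral() {
    return openQuote != null;   // a semicolon here would be part of a string
  }
}
{code}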



> Beeline ignores the command that follows a semicolon and comment
> 
>
> Key: HIVE-13864
> URL: https://issues.apache.org/jira/browse/HIVE-13864
> Project: Hive
>  Issue Type: Bug
>Reporter: Muthu Manickam
>Assignee: Yongzhi Chen
> Attachments: HIVE-13864.01.patch, HIVE-13864.02.patch, 
> HIVE-13864.3.patch, HIVE-13864.4.patch
>
>
> Beeline ignores the next line/command that follows a command ending with a 
> semicolon and a comment.
> Example 1:
> select *
> from table1; -- comments
> select * from table2;
> In this case, only the first command is executed; the second command "select * 
> from table2" is not executed.
> --
> Example 2:
> select *
> from table1; -- comments
> select * from table2;
> select * from table3;
> In this case, the first and third commands are executed; the second command 
> "select * from table2" is not executed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15879) Fix HiveMetaStoreChecker.checkPartitionDirs method

2017-02-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-15879:
---
Attachment: HIVE-15879.03.patch

One of the test cases I added for testing deeply nested partitions was using 
the real poolSize (numProcessors*2), which is very high (~20). The 
single-threaded implementation of the test case took too much time, most likely 
leading to the time-out for that batch. Fixed the test case to use a smaller 
poolSize using mockito. Local test runs take around ~14 sec now. Hopefully this 
will work with pre-commit now.

> Fix HiveMetaStoreChecker.checkPartitionDirs method
> --
>
> Key: HIVE-15879
> URL: https://issues.apache.org/jira/browse/HIVE-15879
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-15879.01.patch, HIVE-15879.02.patch, 
> HIVE-15879.03.patch
>
>
> HIVE-15803 fixes the msck hang issue in 
> HiveMetaStoreChecker.checkPartitionDirs method by adding a check to see if 
> the Threadpool has any spare threads. If not it uses single threaded listing 
> of the files.
> {noformat}
> if (pool != null) {
>   synchronized (pool) {
> // In case of recursive calls, it is possible to deadlock with TP. 
> Check TP usage here.
> if (pool.getActiveCount() < pool.getMaximumPoolSize()) {
>   useThreadPool = true;
> }
> if (!useThreadPool) {
>   if (LOG.isDebugEnabled()) {
> LOG.debug("Not using threadPool as active count:" + 
> pool.getActiveCount()
> + ", max:" + pool.getMaximumPoolSize());
>   }
> }
>   }
> }
> {noformat}
> Based on the java doc of getActiveCount() below 
> bq. Returns the approximate number of threads that are actively executing 
> tasks.
> it returns only an approximate count, and it cannot be guaranteed to always 
> return the exact number of active threads. This still exposes 
> the method implementation to the msck hang bug in rare corner cases.
> We could either:
> 1. Use an atomic counter to track exactly how many threads are actively running
> 2. Relook at the method itself to make it much simpler, e.g. look into 
> the possibility of changing the recursive implementation to an iterative 
> implementation where worker threads pick tasks from a queue until the queue 
> is empty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15879) Fix HiveMetaStoreChecker.checkPartitionDirs method

2017-02-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883045#comment-15883045
 ] 

Hive QA commented on HIVE-15879:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854416/HIVE-15879.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10195 tests executed
*Failed tests:*
{noformat}
TestContext - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
TestHiveCopyFiles - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestHiveCredentialProviders - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestHiveMetaStoreChecker - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestLog4j2Appenders - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestOperators - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestTableIterable - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestTxnCommands2 - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel (batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3757/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3757/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3757/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854416 - PreCommit-HIVE-Build

> Fix HiveMetaStoreChecker.checkPartitionDirs method
> --
>
> Key: HIVE-15879
> URL: https://issues.apache.org/jira/browse/HIVE-15879
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-15879.01.patch, HIVE-15879.02.patch
>
>
> HIVE-15803 fixes the msck hang issue in the 
> HiveMetaStoreChecker.checkPartitionDirs method by adding a check to see if 
> the thread pool has any spare threads; if not, it falls back to a 
> single-threaded listing of the files.
> {noformat}
> if (pool != null) {
>   synchronized (pool) {
>     // In case of recursive calls, it is possible to deadlock with TP. Check TP usage here.
>     if (pool.getActiveCount() < pool.getMaximumPoolSize()) {
>       useThreadPool = true;
>     }
>     if (!useThreadPool) {
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Not using threadPool as active count:" + pool.getActiveCount()
>             + ", max:" + pool.getMaximumPoolSize());
>       }
>     }
>   }
> }
> {noformat}
> Based on the javadoc of getActiveCount() below,
> bq. Returns the approximate number of threads that are actively executing 
> tasks.
> it returns only an approximate number of threads, and it cannot be guaranteed 
> that it always returns the exact number of active threads. This still exposes 
> the method implementation to the msck hang bug in rare corner cases.
> We could either:
> 1. Use an atomic counter to track exactly how many threads are actively 
> running, or
> 2. Rework the method itself to make it much simpler, e.g. look into the 
> possibility of changing the recursive implementation to an iterative one 
> where worker threads pick tasks from a queue until the queue is empty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16029) COLLECT_SET and COLLECT_LIST does not return NULL in the result

2017-02-24 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882997#comment-15882997
 ] 

Aihua Xu commented on HIVE-16029:
-

[~csun] You worked on the original feature. I am not sure it makes sense to 
include NULL in the result. Can you take a look? Note that the set in Java 
will actually also remove null.
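
For illustration only (this is not Hive's actual UDAF code), a null guard in the per-row step alone reproduces the reported behavior, independent of how the backing Java collection treats null:

{code}
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;

public class CollectSetNullDemo {

  // Simplified stand-in for a collect_set-style aggregation: when skipNulls
  // is true (mimicking a typical per-row null guard in iterate()), NULL rows
  // never reach the set at all.
  static Collection<Integer> collect(List<Integer> rows, boolean skipNulls) {
    Collection<Integer> set = new LinkedHashSet<>();
    for (Integer row : rows) {
      if (skipNulls && row == null) {
        continue; // drop NULL before it ever reaches the collection
      }
      set.add(row);
    }
    return set;
  }

  public static void main(String[] args) {
    List<Integer> rows = Arrays.asList(1, 2, null, 4, null);
    System.out.println(collect(rows, true));  // [1, 2, 4]
    System.out.println(collect(rows, false)); // [1, 2, null, 4]
  }
}
{code}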

> COLLECT_SET and COLLECT_LIST does not return NULL in the result
> ---
>
> Key: HIVE-16029
> URL: https://issues.apache.org/jira/browse/HIVE-16029
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Eric Lin
>Assignee: Eric Lin
>Priority: Minor
> Attachments: HIVE-16029.patch
>
>
> See the test case below:
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from collect_set_test;
> +----------------------+
> | collect_set_test.a   |
> +----------------------+
> | 1                    |
> | 2                    |
> | NULL                 |
> | 4                    |
> | NULL                 |
> +----------------------+
> 0: jdbc:hive2://localhost:1/default> select collect_set(a) from collect_set_test;
> +----------+
> |   _c0    |
> +----------+
> | [1,2,4]  |
> +----------+
> {code}
> The correct result should be:
> {code}
> 0: jdbc:hive2://localhost:1/default> select collect_set(a) from collect_set_test;
> +---------------+
> |      _c0      |
> +---------------+
> | [1,2,null,4]  |
> +---------------+
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15995) Syncing metastore table with serde schema

2017-02-24 Thread Michal Ferlinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michal Ferlinski updated HIVE-15995:

Target Version/s: 2.2.0, 2.1.2

> Syncing metastore table with serde schema
> -
>
> Key: HIVE-15995
> URL: https://issues.apache.org/jira/browse/HIVE-15995
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Michal Ferlinski
>Assignee: Michal Ferlinski
> Fix For: 2.1.0
>
> Attachments: cx1.avsc, cx2.avsc, HIVE-15995.patch
>
>
> Hive enables table schema evolution via properties. For Avro, e.g., we can 
> alter the 'avro.schema.url' property to update the table schema to the next 
> version. Updating properties, however, doesn't affect the column list stored 
> in the metastore DB, so the table is not in the newest version when returned 
> from the metastore API. This is a problem for tools working with the 
> metastore (e.g. Presto).
> To solve this issue I suggest introducing a new DDL statement syncing 
> metastore columns with those from the serde:
> {code}
> ALTER TABLE user_test1 UPDATE COLUMNS
> {code}
> Note that this is a format-independent solution.
> To reproduce, follow the instructions below:
> - Create a table based on Avro schema version 1 (cx1.avsc)
> {code}
> CREATE EXTERNAL TABLE user_test1
>   PARTITIONED BY (dt string)
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION
>   '/tmp/schema-evolution/user_test1'
>   TBLPROPERTIES ('avro.schema.url'='/tmp/schema-evolution/cx1.avsc');
> {code}
> - Update schema to version 2 (cx2.avsc)
> {code}
> ALTER TABLE user_test1 SET TBLPROPERTIES ('avro.schema.url' = 
> '/tmp/schema-evolution/cx2.avsc');
> {code}
> - Print serde columns (top info) and metastore columns (Detailed Table 
> Information):
> {code}
> DESCRIBE EXTENDED user_test1
> {code}
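
Until such a statement exists, a manual workaround along these lines should be possible via the metastore client API. This is a hedged sketch only: the column list below is invented for illustration (in practice it would be derived from the serde, i.e. the fields of cx2.avsc), and client method names may differ across Hive versions.

{code}
import java.util.Arrays;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.FieldSchema;
import org.apache.hadoop.hive.metastore.api.Table;

public class UpdateColumnsWorkaround {
  public static void main(String[] args) throws Exception {
    HiveMetaStoreClient client = new HiveMetaStoreClient(new HiveConf());
    Table t = client.getTable("default", "user_test1");
    // Hypothetical evolved column list; in practice, read it from the
    // serde so the metastore matches what cx2.avsc declares.
    t.getSd().setCols(Arrays.asList(
        new FieldSchema("id", "int", null),
        new FieldSchema("name", "string", null),
        new FieldSchema("email", "string", null)));
    client.alter_table("default", "user_test1", t);
    client.close();
  }
}
{code}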



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15995) Syncing metastore table with serde schema

2017-02-24 Thread Michal Ferlinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michal Ferlinski updated HIVE-15995:

Affects Version/s: 1.2.1
Fix Version/s: 2.1.0
  Component/s: Metastore

> Syncing metastore table with serde schema
> -
>
> Key: HIVE-15995
> URL: https://issues.apache.org/jira/browse/HIVE-15995
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Michal Ferlinski
>Assignee: Michal Ferlinski
> Fix For: 2.1.0
>
> Attachments: cx1.avsc, cx2.avsc, HIVE-15995.patch
>
>
> Hive enables table schema evolution via properties. For Avro, e.g., we can 
> alter the 'avro.schema.url' property to update the table schema to the next 
> version. Updating properties, however, doesn't affect the column list stored 
> in the metastore DB, so the table is not in the newest version when returned 
> from the metastore API. This is a problem for tools working with the 
> metastore (e.g. Presto).
> To solve this issue I suggest introducing a new DDL statement syncing 
> metastore columns with those from the serde:
> {code}
> ALTER TABLE user_test1 UPDATE COLUMNS
> {code}
> Note that this is a format-independent solution.
> To reproduce, follow the instructions below:
> - Create a table based on Avro schema version 1 (cx1.avsc)
> {code}
> CREATE EXTERNAL TABLE user_test1
>   PARTITIONED BY (dt string)
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION
>   '/tmp/schema-evolution/user_test1'
>   TBLPROPERTIES ('avro.schema.url'='/tmp/schema-evolution/cx1.avsc');
> {code}
> - Update schema to version 2 (cx2.avsc)
> {code}
> ALTER TABLE user_test1 SET TBLPROPERTIES ('avro.schema.url' = 
> '/tmp/schema-evolution/cx2.avsc');
> {code}
> - Print serde columns (top info) and metastore columns (Detailed Table 
> Information):
> {code}
> DESCRIBE EXTENDED user_test1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

