[jira] [Commented] (HIVE-10074) Ability to run HCat Client Unit tests in a system test setting

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381467#comment-14381467
 ] 

Hive QA commented on HIVE-10074:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707316/HIVE-10074.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8337 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3161/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3161/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3161/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707316 - PreCommit-HIVE-TRUNK-Build

> Ability to run HCat Client Unit tests in a system test setting
> --
>
> Key: HIVE-10074
> URL: https://issues.apache.org/jira/browse/HIVE-10074
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Deepesh Khandelwal
>Assignee: Deepesh Khandelwal
> Attachments: HIVE-10074.1.patch, HIVE-10074.patch
>
>
> The test suite 
> {{hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java}}
>  is a JUnit suite that tests some basic HCat client APIs. During setup it 
> brings up a Hive Metastore with embedded Derby. The suite would, however, be 
> even more useful if it could be run against an already-running Hive Metastore 
> (transparent to whatever backing DB it's running against).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10073) Runtime exception when querying HBase with Spark [Spark Branch]

2015-03-25 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381424#comment-14381424
 ] 

Chengxiang Li commented on HIVE-10073:
--

Hi, [~jxiang], I saw you only call checkOutputSpecs for ReduceWork, but there 
may be a FileSinkOperator in a map-only job as well, so we may need to call 
checkOutputSpecs for MapWork too. Besides, checkOutputSpecs is invoked in 
SparkRecordHandler::init, which is executed for each task; 
SparkPlanGenerator::generate(BaseWork work) may be a better place to do this. 
We could call checkOutputSpecs between cloning the jobconf and serializing it, 
so it would be checked only once, on the RSC side.
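The suggestion can be sketched as a toy model; the names here (Work, SparkPlanGeneratorSketch, check_output_specs) are hypothetical stand-ins, not Hive's actual Spark classes. It shows output-spec validation done at plan-generation time, once per work unit and covering map-only work, instead of in every task's record-handler init:

```python
class Work:
    """Stand-in for MapWork/ReduceWork; has_file_sink marks a FileSinkOperator."""
    def __init__(self, name, has_file_sink):
        self.name = name
        self.has_file_sink = has_file_sink

class SparkPlanGeneratorSketch:
    def __init__(self):
        self.checks_run = []               # records which works were validated

    def check_output_specs(self, work):
        # stand-in for OutputFormat.checkOutputSpecs(); would raise on bad specs
        self.checks_run.append(work.name)

    def generate(self, work):
        # validate any work with a file sink (map-only included) here, at
        # plan-generation time (once per work), rather than per task
        if work.has_file_sink:
            self.check_output_specs(work)
        return f"plan({work.name})"

gen = SparkPlanGeneratorSketch()
plans = [gen.generate(w) for w in (Work("Map 1", True), Work("Reducer 2", True))]
```

Since generate runs once per work during planning, the check runs once on the planner side instead of in every task.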

> Runtime exception when querying HBase with Spark [Spark Branch]
> ---
>
> Key: HIVE-10073
> URL: https://issues.apache.org/jira/browse/HIVE-10073
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: spark-branch
>
> Attachments: HIVE-10073.1-spark.patch
>
>
> When querying HBase with Spark, we got 
> {noformat}
>  Caused by: java.lang.IllegalArgumentException: Must specify table name
> at 
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188)
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:276)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:266)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:331)
> {noformat}
> But it works fine for MapReduce.





[jira] [Commented] (HIVE-10072) Add vectorization support for Hybrid Grace Hash Join

2015-03-25 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381397#comment-14381397
 ] 

Matt McCline commented on HIVE-10072:
-

I compared the 6th patch to the 3rd.  Still looks good for launch after tests 
succeed.

> Add vectorization support for Hybrid Grace Hash Join
> 
>
> Key: HIVE-10072
> URL: https://issues.apache.org/jira/browse/HIVE-10072
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 1.2.0
>
> Attachments: HIVE-10072.01.patch, HIVE-10072.02.patch, 
> HIVE-10072.03.patch, HIVE-10072.04.patch, HIVE-10072.05.patch, 
> HIVE-10072.06.patch
>
>
> This task is to enable vectorization support for Hybrid Grace Hash Join 
> feature.





[jira] [Commented] (HIVE-1575) get_json_object does not support JSON array at the root level

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381391#comment-14381391
 ] 

Hive QA commented on HIVE-1575:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707306/HIVE-1575.5.patch

{color:green}SUCCESS:{color} +1 8339 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3160/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3160/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3160/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707306 - PreCommit-HIVE-TRUNK-Build

> get_json_object does not support JSON array at the root level
> -
>
> Key: HIVE-1575
> URL: https://issues.apache.org/jira/browse/HIVE-1575
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 0.7.0
>Reporter: Steven Wong
>Assignee: Alexander Pivovarov
> Attachments: 
> 0001-Updated-UDFJson-to-allow-arrays-as-a-root-object.patch, 
> HIVE-1575.2.patch, HIVE-1575.3.patch, HIVE-1575.4.patch, HIVE-1575.5.patch
>
>
> Currently, get_json_object(json_txt, path) always returns null if json_txt is 
> not a JSON object (e.g. it is a JSON array) at the root level.
> I have a table column of JSON arrays at the root level, but I can't parse it 
> because of that.
> get_json_object should accept any JSON value (string, number, object, array, 
> true, false, null), not just an object, at the root level. In other words, it 
> should behave as if it were named get_json_value or simply get_json.
> Per the JSON RFC, an array is indeed legal top-level JSON text.
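For illustration, the requested root-level behavior can be sketched with a minimal resolver; get_json here is a hypothetical stand-in, and the real UDF's path syntax is richer:

```python
import json

def get_json(json_txt, path):
    """Accept any JSON value at the root; '$' denotes the root value."""
    value = json.loads(json_txt)
    for part in path.lstrip("$").strip(".").split("."):
        if not part:
            continue
        name, _, idx = part.partition("[")
        if name:
            value = value[name]         # object member access
        while idx:                      # one or more [n] array indexes
            n, _, idx = idx.partition("]")
            value = value[int(n)]
            idx = idx.lstrip("[")
    return value
```

With a root-level array, get_json('[{"a":1},{"a":2}]', '$[1].a') resolves to 2 instead of returning null.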





[jira] [Commented] (HIVE-10082) LLAP: UnwrappedRowContainer throws exceptions

2015-03-25 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381355#comment-14381355
 ] 

Gunther Hagleitner commented on HIVE-10082:
---

I think patch .1 will fix the issue, although there might be more. It turns out 
gWorkMap is still a global map, albeit now restricted by thread. That means 
different threads, especially ones that are quick to execute, could step on 
each other.
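As a toy illustration of the underlying issue (not the actual gWorkMap fix): storage shared across threads needs a collision-free key scheme, whereas threading.local gives each thread private storage with no shared keys at all:

```python
import threading

local_work = threading.local()   # each thread sees its own attributes
results = {}

def run(name, plan):
    local_work.plan = plan       # private to this thread: no shared key needed
    results[name] = local_work.plan

threads = [threading.Thread(target=run, args=(f"t{i}", f"plan-{i}"))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```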

> LLAP: UnwrappedRowContainer throws exceptions
> -
>
> Key: HIVE-10082
> URL: https://issues.apache.org/jira/browse/HIVE-10082
> Project: Hive
>  Issue Type: Bug
>Affects Versions: llap
>Reporter: Gopal V
>Assignee: Gunther Hagleitner
> Fix For: llap
>
> Attachments: HIVE-10082.1.patch
>
>
> TPC-DS Query27, run with map-joins enabled, results in errors originating 
> from these lines in UnwrappedRowContainer::unwrap():
> {code}
>for (int index : valueIndex) {
>   if (index >= 0) {
> unwrapped.add(currentKey == null ? null : currentKey[index]);
>   } else {
> unwrapped.add(values.get(-index - 1));
>   }
> }
> {code}
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:653)
> at java.util.ArrayList.get(ArrayList.java:429)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.unwrap(UnwrapRowContainer.java:79)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:62)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:33)
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:341)
> {code}
> This is intermittent and does not cause query failures as the retries 
> succeed, but slows down the query by an entire wave due to the retry.
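For reference, the valueIndex convention in the snippet can be restated as a small Python sketch (illustrative only): a non-negative index selects a join-key column, while a negative index encodes position -index - 1 in the value list. The IndexOutOfBoundsException above corresponds to a negative index arriving while the value list is empty.

```python
def unwrap(value_index, current_key, values):
    """index >= 0 selects current_key[index]; index < 0 selects
    values[-index - 1], mirroring the Java snippet above."""
    unwrapped = []
    for index in value_index:
        if index >= 0:
            unwrapped.append(None if current_key is None else current_key[index])
        else:
            # an empty values list here raises IndexError, the Python analogue
            # of the IndexOutOfBoundsException in the stack trace
            unwrapped.append(values[-index - 1])
    return unwrapped
```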





[jira] [Updated] (HIVE-10082) LLAP: UnwrappedRowContainer throws exceptions

2015-03-25 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10082:
--
Attachment: HIVE-10082.1.patch

> LLAP: UnwrappedRowContainer throws exceptions
> -
>
> Key: HIVE-10082
> URL: https://issues.apache.org/jira/browse/HIVE-10082
> Project: Hive
>  Issue Type: Bug
>Affects Versions: llap
>Reporter: Gopal V
>Assignee: Gunther Hagleitner
> Fix For: llap
>
> Attachments: HIVE-10082.1.patch
>
>
> TPC-DS Query27, run with map-joins enabled, results in errors originating 
> from these lines in UnwrappedRowContainer::unwrap():
> {code}
>for (int index : valueIndex) {
>   if (index >= 0) {
> unwrapped.add(currentKey == null ? null : currentKey[index]);
>   } else {
> unwrapped.add(values.get(-index - 1));
>   }
> }
> {code}
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:653)
> at java.util.ArrayList.get(ArrayList.java:429)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.unwrap(UnwrapRowContainer.java:79)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:62)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:33)
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:341)
> {code}
> This is intermittent and does not cause query failures as the retries 
> succeed, but slows down the query by an entire wave due to the retry.





[jira] [Commented] (HIVE-5771) Constant propagation optimizer for Hive

2015-03-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381331#comment-14381331
 ] 

Lefty Leverenz commented on HIVE-5771:
--

Doc note:  *hive.optimize.constant.propagation* is now documented in the wiki, 
so I'm removing the TODOC14 label.

* [Configuration Properties -- hive.optimize.constant.propagation | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.optimize.constant.propagation]

> Constant propagation optimizer for Hive
> ---
>
> Key: HIVE-5771
> URL: https://issues.apache.org/jira/browse/HIVE-5771
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Ted Xu
>Assignee: Ted Xu
> Fix For: 0.14.0
>
> Attachments: HIVE-5771.1.patch, HIVE-5771.10.patch, 
> HIVE-5771.11.patch, HIVE-5771.12.patch, HIVE-5771.14.patch, 
> HIVE-5771.16.patch, HIVE-5771.17.patch, HIVE-5771.2.patch, HIVE-5771.3.patch, 
> HIVE-5771.4.patch, HIVE-5771.5.patch, HIVE-5771.6.patch, HIVE-5771.7.patch, 
> HIVE-5771.8.patch, HIVE-5771.9.patch, HIVE-5771.patch, 
> HIVE-5771.patch.javaonly
>
>
> Currently there is no constant folding/propagation optimizer; all expressions 
> are evaluated at runtime. 
> HIVE-2470 did a great job of evaluating constants in the UDF initialization 
> phase; however, that is still a runtime evaluation and it doesn't propagate 
> constants from a subquery outward.
> Introducing such an optimizer may reduce I/O and speed up processing.
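As a toy example of what such an optimizer does (purely illustrative, not Hive's implementation): fold subtrees whose operands are constant, and propagate constants known from an inner query into the outer expression:

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def fold(expr, env=None):
    """expr is a constant, a column name, or an ('op', left, right) tuple;
    env maps names known to be constant (e.g. from a subquery) to values."""
    env = env or {}
    if isinstance(expr, tuple):
        op, l, r = expr
        l, r = fold(l, env), fold(r, env)
        if isinstance(l, (int, float)) and isinstance(r, (int, float)):
            return OPS[op](l, r)   # both sides constant: evaluate at compile time
        return (op, l, r)
    if isinstance(expr, str) and expr in env:
        return env[expr]           # propagate a constant from the subquery
    return expr
```

Here fold(('*', ('+', 1, 2), 'col')) folds the constant subtree to ('*', 3, 'col'), and supplying {'x': 4} lets a subquery constant collapse the whole expression.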





[jira] [Commented] (HIVE-9839) HiveServer2 leaks OperationHandle on async queries which fail at compile phase

2015-03-25 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381329#comment-14381329
 ] 

Nemon Lou commented on HIVE-9839:
-

Seems that the failure is unrelated.
Running this on my local computer doesn't fail:
{quote}
"mvn test -Phadoop-2 -Dtest=TestCliDriver -Dqfile=schemeAuthority.q" 
{quote}
The result:
{quote}
Running org.apache.hadoop.hive.cli.TestCliDriver
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 66.08 sec - in 
org.apache.hadoop.hive.cli.TestCliDriver
{quote}
Submitting again.

> HiveServer2 leaks OperationHandle on async queries which fail at compile phase
> --
>
> Key: HIVE-9839
> URL: https://issues.apache.org/jira/browse/HIVE-9839
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.14.0, 0.13.1, 1.0.0, 1.1.0
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>Priority: Critical
> Attachments: HIVE-9839.patch, HIVE-9839.patch, HIVE-9839.patch, 
> HIVE-9839.patch, OperationHandleMonitor.java
>
>
> Use beeline to connect to HiveServer2 and type the following:
> drop table if exists table_not_exists;
> select * from table_not_exists;
> An OperationHandle object will stay in HiveServer2's memory forever, even 
> after quitting beeline.
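The leak and its fix can be sketched as follows (a hypothetical API, not HiveServer2's actual OperationManager): if compilation raises, the handle must be unregistered before the exception propagates, otherwise it stays registered for the session's lifetime:

```python
class OperationManager:
    """Toy handle registry illustrating the leak described above."""
    def __init__(self):
        self.handles = {}
        self._next = 0

    def execute_async(self, query, compile_fn):
        self._next += 1
        handle = self._next
        self.handles[handle] = query      # handle registered before compile
        try:
            compile_fn(query)
        except Exception:
            del self.handles[handle]      # the fix: release on compile failure
            raise
        return handle

def bad_compile(q):
    raise ValueError("table not found")

mgr = OperationManager()
try:
    mgr.execute_async("select * from table_not_exists", bad_compile)
except ValueError:
    pass                                  # handle was released, not leaked
ok = mgr.execute_async("select 1", lambda q: None)
```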





[jira] [Updated] (HIVE-5771) Constant propagation optimizer for Hive

2015-03-25 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-5771:
-
Labels:   (was: TODOC14)

> Constant propagation optimizer for Hive
> ---
>
> Key: HIVE-5771
> URL: https://issues.apache.org/jira/browse/HIVE-5771
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Ted Xu
>Assignee: Ted Xu
> Fix For: 0.14.0
>
> Attachments: HIVE-5771.1.patch, HIVE-5771.10.patch, 
> HIVE-5771.11.patch, HIVE-5771.12.patch, HIVE-5771.14.patch, 
> HIVE-5771.16.patch, HIVE-5771.17.patch, HIVE-5771.2.patch, HIVE-5771.3.patch, 
> HIVE-5771.4.patch, HIVE-5771.5.patch, HIVE-5771.6.patch, HIVE-5771.7.patch, 
> HIVE-5771.8.patch, HIVE-5771.9.patch, HIVE-5771.patch, 
> HIVE-5771.patch.javaonly
>
>
> Currently there is no constant folding/propagation optimizer; all expressions 
> are evaluated at runtime. 
> HIVE-2470 did a great job of evaluating constants in the UDF initialization 
> phase; however, that is still a runtime evaluation and it doesn't propagate 
> constants from a subquery outward.
> Introducing such an optimizer may reduce I/O and speed up processing.





[jira] [Commented] (HIVE-9976) Possible race condition in DynamicPartitionPruner for <200ms tasks

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381320#comment-14381320
 ] 

Hive QA commented on HIVE-9976:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707303/HIVE-9976.2.patch

{color:green}SUCCESS:{color} +1 8347 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3159/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3159/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3159/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707303 - PreCommit-HIVE-TRUNK-Build

> Possible race condition in DynamicPartitionPruner for <200ms tasks
> --
>
> Key: HIVE-9976
> URL: https://issues.apache.org/jira/browse/HIVE-9976
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.0
>Reporter: Gopal V
>Assignee: Siddharth Seth
> Attachments: HIVE-9976.1.patch, HIVE-9976.2.patch, 
> llap_vertex_200ms.png
>
>
> There is a race condition in the DynamicPartitionPruner between 
> DynamicPartitionPruner::processVertex() and 
> DynamicPartitionPruner::addEvent() for tasks which respond with both the 
> result and success in a single heartbeat sequence.
> {code}
> 2015-03-16 07:05:01,589 ERROR [InputInitializer [Map 1] #0] 
> tez.DynamicPartitionPruner: Expecting: 1, received: 0
> 2015-03-16 07:05:01,590 ERROR [Dispatcher thread: Central] impl.VertexImpl: 
> Vertex Input: store_sales initializer failed, 
> vertex=vertex_1424502260528_1113_4_04 [Map 1]
> org.apache.tez.dag.app.dag.impl.AMUserCodeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Incorrect event count in 
> dynamic parition pruning
> {code}
> !llap_vertex_200ms.png!
> All 4 upstream vertices of Map 1 need to finish within ~200ms to trigger 
> this, which seems to be consistently happening with LLAP.
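One way to avoid such a race can be sketched with a condition variable: block until every expected event has arrived instead of checking the count at an arbitrary moment. This is a toy illustration with hypothetical names, not the actual DynamicPartitionPruner API:

```python
import threading

class PrunerSketch:
    def __init__(self, expected):
        self.expected = expected
        self.received = 0
        self.cond = threading.Condition()

    def add_event(self, payload):
        with self.cond:
            self.received += 1
            self.cond.notify_all()

    def process_vertex(self, timeout=5.0):
        # wait until every expected event has arrived (or time out), rather
        # than failing immediately with "Expecting: 1, received: 0"
        with self.cond:
            ok = self.cond.wait_for(lambda: self.received >= self.expected,
                                    timeout=timeout)
        if not ok:
            raise RuntimeError("Incorrect event count in dynamic partition pruning")
        return self.received

p = PrunerSketch(expected=1)
t = threading.Thread(target=p.add_event, args=("payload",))
t.start()
result = p.process_vertex()
t.join()
```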





[jira] [Commented] (HIVE-10091) Generate Hbase execution plan for partition filter conditions in HbaseStore api calls

2015-03-25 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381307#comment-14381307
 ] 

Alan Gates commented on HIVE-10091:
---

I've left some comments on review board.

Also a question.  If I read HBaseFilterPlanUtil correctly, this can handle 
non-boolean expressions on initial keys right now, but not booleans or 
expressions on keys beyond the first one.  I think it's fine to get this 
checked in as is and add functionality as we go.  I just wanted to make sure I 
understood correctly.


> Generate Hbase execution plan for partition filter conditions in HbaseStore 
> api calls
> -
>
> Key: HIVE-10091
> URL: https://issues.apache.org/jira/browse/HIVE-10091
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-10091.1.patch
>
>
> RawStore functions that support partition filtering are the following - 
> getPartitionsByExpr
> getPartitionsByFilter (takes filter string as argument, used from hcatalog)
> We need to generate a query execution plan in terms of Hbase scan api calls 
> for a given filter condition.
> NO PRECOMMIT TESTS
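A toy sketch of the kind of plan generation involved (hypothetical function, simplified key encoding): a comparison on the leading partition key maps to an HBase-style (start_row, stop_row) scan range:

```python
def plan_scan(column, op, value):
    """Map a filter on the leading key column to a (start_row, stop_row)
    scan range; encodings and prefixes are simplified and hypothetical."""
    if op == "=":
        return (value, value + "\x00")  # exact key: tight range
    if op == ">=":
        return (value, None)            # open-ended upper bound
    if op == "<":
        return (None, value)            # open-ended lower bound
    return (None, None)                 # unsupported op: full scan
```

An equality predicate becomes a tight range ending at the key plus a zero byte; unsupported operators fall back to a full scan.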





[jira] [Updated] (HIVE-10062) HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data

2015-03-25 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10062:
---
Attachment: HIVE-10062.01.patch

A quick try at fixing it.

> HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data
> -
>
> Key: HIVE-10062
> URL: https://issues.apache.org/jira/browse/HIVE-10062
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Critical
> Attachments: HIVE-10062.01.patch
>
>
> In the q.test environment with the src table, execute the following query: 
> {code}
> CREATE TABLE DEST1(key STRING, value STRING) STORED AS TEXTFILE;
> CREATE TABLE DEST2(key STRING, val1 STRING, val2 STRING) STORED AS TEXTFILE;
> FROM (select 'tst1' as key, cast(count(1) as string) as value from src s1
>  UNION all 
>   select s2.key as key, s2.value as value from src s2) unionsrc
> INSERT OVERWRITE TABLE DEST1 SELECT unionsrc.key, COUNT(DISTINCT 
> SUBSTR(unionsrc.value,5)) GROUP BY unionsrc.key
> INSERT OVERWRITE TABLE DEST2 SELECT unionsrc.key, unionsrc.value, 
> COUNT(DISTINCT SUBSTR(unionsrc.value,5)) 
> GROUP BY unionsrc.key, unionsrc.value;
> select * from DEST1;
> select * from DEST2;
> {code}
> DEST1 and DEST2 should both have 310 rows. However, DEST2 only has 1 row 
> "tst1500 1"





[jira] [Commented] (HIVE-10090) Add connection manager for Tephra

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381255#comment-14381255
 ] 

Hive QA commented on HIVE-10090:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707280/HIVE-10090.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3158/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3158/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3158/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-3158/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 'ql/src/test/results/clientpositive/show_functions.q.out'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java'
++ awk '{print $2}'
++ egrep -v '^X|^Performing status on external'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20S/target 
shims/0.23/target shims/aggregator/target shims/common/target 
shims/scheduler/target packaging/target hbase-handler/target testutils/target 
jdbc/target metastore/target itests/target itests/thirdparty 
itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target 
itests/hive-unit-hadoop2/target itests/hive-minikdc/target 
itests/hive-jmh/target itests/hive-unit/target itests/custom-serde/target 
itests/util/target itests/qtest-spark/target hcatalog/target 
hcatalog/core/target hcatalog/streaming/target 
hcatalog/server-extensions/target hcatalog/webhcat/svr/target 
hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target 
accumulo-handler/target hwi/target common/target common/src/gen 
spark-client/target contrib/target service/target serde/target beeline/target 
odbc/target cli/target ql/dependency-reduced-pom.xml ql/target 
ql/src/test/results/clientpositive/udf_months_between.q.out 
ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFMonthsBetween.java
 ql/src/test/queries/clientpositive/udf_months_between.q 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMonthsBetween.java
+ svn update

Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1669247.

At revision 1669247.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707280 - PreCommit-HIVE-TRUNK-Build

> Add connection manager for Tephra
> -
>
> Key: HIVE-10090
> URL: https://issues.apache.org/jira/browse/HIVE-10090
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: hbase-metastore-branch
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-10090.patch
>
>
> The task is to create an implementation of HBaseConnection that will use 
> Tephra for transaction management.





[jira] [Commented] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381252#comment-14381252
 ] 

Hive QA commented on HIVE-9518:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707266/HIVE-9518.5.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8341 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3157/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3157/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3157/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707266 - PreCommit-HIVE-TRUNK-Build

> Implement MONTHS_BETWEEN aligned with Oracle one
> 
>
> Key: HIVE-9518
> URL: https://issues.apache.org/jira/browse/HIVE-9518
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Xiaobing Zhou
>Assignee: Alexander Pivovarov
> Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, 
> HIVE-9518.4.patch, HIVE-9518.5.patch
>
>
> This is used to track work to build an Oracle-like months_between. Here are 
> the semantics:
> MONTHS_BETWEEN returns the number of months between dates date1 and date2. If 
> date1 is later than date2, the result is positive. If date1 is earlier than 
> date2, the result is negative. If date1 and date2 are either the same day of 
> the month or both the last days of their months, the result is always an 
> integer. Otherwise, Oracle Database calculates the fractional portion of the 
> result based on a 31-day month and considers the difference in the time 
> components of date1 and date2.
> It should accept date, timestamp and string arguments in the format 
> 'yyyy-MM-dd' or 'yyyy-MM-dd HH:mm:ss'. The time part should be ignored.
> The result should be rounded to 8 decimal places.
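The semantics above can be rendered as a short Python sketch (dates only, time part ignored as specified; an illustrative re-implementation, not Hive's UDF):

```python
import calendar
from datetime import date

def months_between(d1: date, d2: date) -> float:
    """Oracle-style MONTHS_BETWEEN per the semantics described above."""
    months = (d1.year - d2.year) * 12 + (d1.month - d2.month)
    last1 = d1.day == calendar.monthrange(d1.year, d1.month)[1]
    last2 = d2.day == calendar.monthrange(d2.year, d2.month)[1]
    if d1.day == d2.day or (last1 and last2):
        return float(months)  # same day of month, or both month-ends: integer
    # otherwise: fractional part based on a 31-day month, rounded to 8 places
    return round(months + (d1.day - d2.day) / 31.0, 8)
```

For example, months_between(date(1995, 2, 2), date(1995, 1, 1)) gives 1 + 1/31 ≈ 1.03225806.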





[jira] [Updated] (HIVE-10038) Add Calcite's ProjectMergeRule.

2015-03-25 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10038:

Attachment: HIVE-10038.4.patch

> Add Calcite's ProjectMergeRule.
> ---
>
> Key: HIVE-10038
> URL: https://issues.apache.org/jira/browse/HIVE-10038
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-10038.2.patch, HIVE-10038.3.patch, 
> HIVE-10038.4.patch, HIVE-10038.patch
>
>
> Helps to improve latency by shortening the operator pipeline. Folds adjacent 
> projections into one.





[jira] [Commented] (HIVE-9780) Add another level of explain for RDBMS audience

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381174#comment-14381174
 ] 

Hive QA commented on HIVE-9780:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707262/HIVE-9780.03.patch

{color:red}ERROR:{color} -1 due to 546 failed/errored test(s), 8340 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ambiguitycheck
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_explain
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_genericudaf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_genericudf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_udaf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_multi_partitions
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explainuser_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_rc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_inputddl6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge_incompat2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_columns
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_indexes_edge_cases
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_indexes_syntax
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_tables
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_tablestatus
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_showparts
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_compare_java_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_logic_java_boolean
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_testlength
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_testlength2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin7
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap_auto
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_parallel_orderby
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join0
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join21
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join29
org.apac

[jira] [Commented] (HIVE-10085) Lateral view on top of a view throws RuntimeException

2015-03-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381168#comment-14381168
 ] 

Ashutosh Chauhan commented on HIVE-10085:
-

+1

> Lateral view on top of a view throws RuntimeException
> -
>
> Key: HIVE-10085
> URL: https://issues.apache.org/jira/browse/HIVE-10085
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10085.patch
>
>
> Run the following SQL statements to create the table and the view, then 
> execute the SELECT statement; it will throw this runtime exception:
> {noformat}
> FAILED: RuntimeException 
> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: "map" or "list" is 
> expected at function SIZE, but "int" is found
> {noformat}
> {noformat} 
> CREATE TABLE t1( symptom STRING,  pattern ARRAY,  occurrence INT, index 
> INT);
> CREATE OR REPLACE VIEW v1 AS
> SELECT TRIM(pd.symptom) AS symptom, pd.index, pd.pattern, pd.occurrence, 
> pd.occurrence as cnt from t1 pd;
> SELECT pattern_data.symptom, pattern_data.index, pattern_data.occurrence, 
> pattern_data.cnt, size(pattern_data.pattern) as pattern_length, 
> pattern.pattern_id
> FROM v1 pattern_data LATERAL VIEW explode(pattern) pattern AS pattern_id;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10077) Use new ParquetInputSplit constructor API

2015-03-25 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381165#comment-14381165
 ] 

Ferdinand Xu commented on HIVE-10077:
-

Hi [~spena], could you help me review it though it depends on HIVE-10076?

> Use new ParquetInputSplit constructor API
> -
>
> Key: HIVE-10077
> URL: https://issues.apache.org/jira/browse/HIVE-10077
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-10077.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10076) Update parquet-hadoop-bundle and parquet-column to the version of 1.6.0rc6

2015-03-25 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381164#comment-14381164
 ] 

Ferdinand Xu commented on HIVE-10076:
-

Hi [~spena], can you help me review it? The failed case is not related to my 
patch. Thank you!

> Update parquet-hadoop-bundle and parquet-column to the version of 1.6.0rc6
> --
>
> Key: HIVE-10076
> URL: https://issues.apache.org/jira/browse/HIVE-10076
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-10076.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10095) format_number udf throws NPE

2015-03-25 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-10095:
---
Attachment: HIVE-10095.1.patch

patch #1

> format_number udf throws NPE
> 
>
> Key: HIVE-10095
> URL: https://issues.apache.org/jira/browse/HIVE-10095
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-10095.1.patch
>
>
> For example
> {code}
> select format_number(cast(null as int), 0);
> FAILED: NullPointerException null
> {code}
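The failure mode is easy to reproduce outside Hive. Below is a minimal, hypothetical Java sketch (not Hive's actual GenericUDFFormatNumber code; `FormatNumberSketch` and `formatNumber` are illustrative names) of the null guard a format_number-style function needs so that a NULL input yields NULL, per SQL semantics, instead of an exception:

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class FormatNumberSketch {
    // Returns null for a null input, mirroring SQL NULL semantics;
    // without the guard, the format call below would throw.
    static String formatNumber(Integer value, int decimalPlaces) {
        if (value == null) {
            return null;
        }
        // Build a pattern like "#,###,###,##0.00" for 2 decimal places.
        StringBuilder pattern = new StringBuilder("#,###,###,##0");
        if (decimalPlaces > 0) {
            pattern.append('.');
            for (int i = 0; i < decimalPlaces; i++) {
                pattern.append('0');
            }
        }
        // Fix the locale so grouping/decimal separators are deterministic.
        DecimalFormat fmt = new DecimalFormat(
                pattern.toString(), DecimalFormatSymbols.getInstance(Locale.US));
        return fmt.format(value);
    }

    public static void main(String[] args) {
        System.out.println(formatNumber(1234567, 2)); // 1,234,567.00
        System.out.println(formatNumber(null, 0));    // null
    }
}
```

The same pattern applies in a GenericUDF's evaluate(): check the deferred argument for null before formatting and return null early.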



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9766) Add JavaConstantXXXObjectInspector

2015-03-25 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-9766:
-
Attachment: HIVE-9766.3.patch

You are right. Added the check to all of them.

> Add JavaConstantXXXObjectInspector
> --
>
> Key: HIVE-9766
> URL: https://issues.apache.org/jira/browse/HIVE-9766
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-9766.1.patch, HIVE-9766.2.patch, HIVE-9766.3.patch
>
>
> JavaConstantXXXObjectInspector is needed when implementing PIG-3294. There 
> are two approaches:
> 1. Add those classes in Pig. However, most constructors of the base class 
> JavaXXXObjectInspector have default scope and would need to be changed to 
> protected.
> 2. Add those classes in Hive.
> Approach 2 should be better since those classes might be useful to Hive as 
> well. A patch is attached to provide them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9767) Fixes in Hive UDF to be usable in Pig

2015-03-25 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-9767:
-
Attachment: HIVE-9767.3.patch

There was an NPE; it is fixed in HIVE-9767.3.patch.

> Fixes in Hive UDF to be usable in Pig
> -
>
> Key: HIVE-9767
> URL: https://issues.apache.org/jira/browse/HIVE-9767
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-9767.1.patch, HIVE-9767.2.patch, HIVE-9767.3.patch
>
>
> There are issues in UDFs that never get exposed because the execution path 
> is never tested:
> # Assuming the ObjectInspector is a WritableObjectInspector rather than the 
> ObjectInspector passed to the UDF
> # Assuming the input parameter is a Writable, not respecting the 
> ObjectInspector passed to the UDF
> # Assuming a ConstantObjectInspector is a WritableConstantXXXObjectInspector
> # The InputObjectInspector does not match the OutputObjectInspector of the 
> previous stage in a UDAF
> # The execution path involving convertIfNecessary has never been tested
> A patch is attached to fix these.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10072) Add vectorization support for Hybrid Grace Hash Join

2015-03-25 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-10072:
-
Attachment: HIVE-10072.06.patch

Uploading the 6th patch for testing.

> Add vectorization support for Hybrid Grace Hash Join
> 
>
> Key: HIVE-10072
> URL: https://issues.apache.org/jira/browse/HIVE-10072
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 1.2.0
>
> Attachments: HIVE-10072.01.patch, HIVE-10072.02.patch, 
> HIVE-10072.03.patch, HIVE-10072.04.patch, HIVE-10072.05.patch, 
> HIVE-10072.06.patch
>
>
> This task is to enable vectorization support for Hybrid Grace Hash Join 
> feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10062) HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data

2015-03-25 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-10062:
--

Assignee: Pengcheng Xiong

> HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data
> -
>
> Key: HIVE-10062
> URL: https://issues.apache.org/jira/browse/HIVE-10062
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Critical
>
> In the q-test environment with the src table, execute the following query: 
> {code}
> CREATE TABLE DEST1(key STRING, value STRING) STORED AS TEXTFILE;
> CREATE TABLE DEST2(key STRING, val1 STRING, val2 STRING) STORED AS TEXTFILE;
> FROM (select 'tst1' as key, cast(count(1) as string) as value from src s1
>  UNION all 
>   select s2.key as key, s2.value as value from src s2) unionsrc
> INSERT OVERWRITE TABLE DEST1 SELECT unionsrc.key, COUNT(DISTINCT 
> SUBSTR(unionsrc.value,5)) GROUP BY unionsrc.key
> INSERT OVERWRITE TABLE DEST2 SELECT unionsrc.key, unionsrc.value, 
> COUNT(DISTINCT SUBSTR(unionsrc.value,5)) 
> GROUP BY unionsrc.key, unionsrc.value;
> select * from DEST1;
> select * from DEST2;
> {code}
> DEST1 and DEST2 should both have 310 rows. However, DEST2 only has 1 row: 
> "tst1 500 1"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10085) Lateral view on top of a view throws RuntimeException

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381066#comment-14381066
 ] 

Hive QA commented on HIVE-10085:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707263/HIVE-10085.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 8338 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin_mapjoin6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoinopt10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt10
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3155/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3155/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3155/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707263 - PreCommit-HIVE-TRUNK-Build

> Lateral view on top of a view throws RuntimeException
> -
>
> Key: HIVE-10085
> URL: https://issues.apache.org/jira/browse/HIVE-10085
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10085.patch
>
>
> Run the following SQL statements to create the table and the view, then 
> execute the SELECT statement; it will throw this runtime exception:
> {noformat}
> FAILED: RuntimeException 
> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: "map" or "list" is 
> expected at function SIZE, but "int" is found
> {noformat}
> {noformat} 
> CREATE TABLE t1( symptom STRING,  pattern ARRAY,  occurrence INT, index 
> INT);
> CREATE OR REPLACE VIEW v1 AS
> SELECT TRIM(pd.symptom) AS symptom, pd.index, pd.pattern, pd.occurrence, 
> pd.occurrence as cnt from t1 pd;
> SELECT pattern_data.symptom, pattern_data.index, pattern_data.occurrence, 
> pattern_data.cnt, size(pattern_data.pattern) as pattern_length, 
> pattern.pattern_id
> FROM v1 pattern_data LATERAL VIEW explode(pattern) pattern AS pattern_id;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-03-25 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381040#comment-14381040
 ] 

Sushanth Sowmyan commented on HIVE-9582:


Admittedly, I also agree that we should slowly push people to use HCatClient 
and away from HiveMetaStoreClient for our interface purposes, and so, 
eventually, we should internalize HCatUtil.

> HCatalog should use IMetaStoreClient interface
> --
>
> Key: HIVE-9582
> URL: https://issues.apache.org/jira/browse/HIVE-9582
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog, Metastore
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>  Labels: hcatalog, metastore, rolling_upgrade
> Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, 
> HIVE-9582.4.patch, HIVE-9582.5.patch, HIVE-9583.1.patch
>
>
> Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy. 
> Hence, during a failure, the client retries and possibly succeeds. But 
> HCatalog has long been using HiveMetaStoreClient directly, so failures are 
> costly, especially if they occur during the commit stage of a job. It is 
> also not possible to do a rolling upgrade of the MetaStore Server.
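The retry motivation above can be illustrated without Hive's classes. Below is a hypothetical, simplified Java sketch of the idea behind RetryingMetaStoreClient (the real class wraps IMetaStoreClient with reconnection logic and configurable retry policies; the `Client` interface, `withRetries`, and the attempt count here are illustrative names only, not Hive API):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

public class RetryProxySketch {

    // Toy stand-in for IMetaStoreClient, with a single metadata call.
    interface Client {
        String getTable(String name);
    }

    // Wrap `target` so every interface call is retried up to `attempts` times.
    @SuppressWarnings("unchecked")
    static <T> T withRetries(Class<T> iface, T target, int attempts) {
        InvocationHandler handler = (proxy, method, args) -> {
            Throwable last = null;
            for (int i = 0; i < attempts; i++) {
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    last = e.getCause(); // transient failure: try again
                }
            }
            throw last; // all attempts failed
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails twice, then succeeds -- like a metastore during a brief outage.
        Client flaky = name -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("transient failure");
            }
            return name;
        };
        Client retrying = withRetries(Client.class, flaky, 5);
        System.out.println(retrying.getTable("t1")); // prints "t1" on the 3rd attempt
    }
}
```

Because the wrapper works against the interface rather than a concrete class, callers that hold an IMetaStoreClient get retry behavior transparently, which is exactly what direct use of HiveMetaStoreClient prevents.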



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-03-25 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381036#comment-14381036
 ] 

Sushanth Sowmyan commented on HIVE-9582:


I think that was the goal behind InternalUtil, while HCatUtil was intended to 
be public-ish-facing. And even if there are functions in HCatUtil that might 
still be worth putting behind an InterfaceAudience.Private wall, getHiveClient, 
closeHiveClientQuietly and getHiveConf were intended to be public.

> HCatalog should use IMetaStoreClient interface
> --
>
> Key: HIVE-9582
> URL: https://issues.apache.org/jira/browse/HIVE-9582
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog, Metastore
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>  Labels: hcatalog, metastore, rolling_upgrade
> Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, 
> HIVE-9582.4.patch, HIVE-9582.5.patch, HIVE-9583.1.patch
>
>
> Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy. 
> Hence, during a failure, the client retries and possibly succeeds. But 
> HCatalog has long been using HiveMetaStoreClient directly, so failures are 
> costly, especially if they occur during the commit stage of a job. It is 
> also not possible to do a rolling upgrade of the MetaStore Server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-03-25 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381023#comment-14381023
 ] 

Thiruvel Thirumoolan commented on HIVE-9582:


Thanks [~sushanth] for your time and inputs. Much appreciated.

I was of the opinion that getHiveClient() was internal and not supposed to be 
used outside (yes, we have a lot of public methods). I asked the Oozie team at 
Yahoo! to use HCatClient going forward; I am not sure those changes are in 
trunk. I didn't know falcon/sqoop might be using it.

I will change as you suggested and update another patch.

> HCatalog should use IMetaStoreClient interface
> --
>
> Key: HIVE-9582
> URL: https://issues.apache.org/jira/browse/HIVE-9582
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog, Metastore
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>  Labels: hcatalog, metastore, rolling_upgrade
> Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, 
> HIVE-9582.4.patch, HIVE-9582.5.patch, HIVE-9583.1.patch
>
>
> Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy. 
> Hence, during a failure, the client retries and possibly succeeds. But 
> HCatalog has long been using HiveMetaStoreClient directly, so failures are 
> costly, especially if they occur during the commit stage of a job. It is 
> also not possible to do a rolling upgrade of the MetaStore Server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-03-25 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381017#comment-14381017
 ] 

Mithun Radhakrishnan commented on HIVE-9582:


I wonder, should {{HCatUtils}} be package-protected?

> HCatalog should use IMetaStoreClient interface
> --
>
> Key: HIVE-9582
> URL: https://issues.apache.org/jira/browse/HIVE-9582
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog, Metastore
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>  Labels: hcatalog, metastore, rolling_upgrade
> Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, 
> HIVE-9582.4.patch, HIVE-9582.5.patch, HIVE-9583.1.patch
>
>
> Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy. 
> Hence, during a failure, the client retries and possibly succeeds. But 
> HCatalog has long been using HiveMetaStoreClient directly, so failures are 
> costly, especially if they occur during the commit stage of a job. It is 
> also not possible to do a rolling upgrade of the MetaStore Server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-03-25 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381007#comment-14381007
 ] 

Sushanth Sowmyan commented on HIVE-9582:


+cc [~venkatnrangan]


> HCatalog should use IMetaStoreClient interface
> --
>
> Key: HIVE-9582
> URL: https://issues.apache.org/jira/browse/HIVE-9582
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog, Metastore
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>  Labels: hcatalog, metastore, rolling_upgrade
> Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, 
> HIVE-9582.4.patch, HIVE-9582.5.patch, HIVE-9583.1.patch
>
>
> Hive uses IMetaStoreClient and it makes using RetryingMetaStoreClient easy. 
> Hence during a failure, the client retries and possibly succeeds. But 
> HCatalog has long been using HiveMetaStoreClient directly and hence failures 
> are costly, especially if they are during the commit stage of a job. Its also 
> not possible to do rolling upgrade of MetaStore Server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10086) Hive throws error when accessing Parquet file schema using field name match

2015-03-25 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381006#comment-14381006
 ] 

Sergio Peña commented on HIVE-10086:


[~rdblue] Could you help me review this?

> Hive throws error when accessing Parquet file schema using field name match
> ---
>
> Key: HIVE-10086
> URL: https://issues.apache.org/jira/browse/HIVE-10086
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-10086.1.patch, HiveGroup.parquet
>
>
> When a Hive table schema contains a portion of the schema of a Parquet file, 
> access to the values should work as long as the field names match the 
> schema. This does not work when a struct<> data type is in the schema and 
> the Hive schema contains just a portion of the struct elements; Hive throws 
> an error instead.
> Here is an example of how to reproduce it.
> First, create a Parquet table and add some values to it:
> {code}
> CREATE TABLE test1 (id int, name string, address 
> struct<number:int,street:string,zip:string>) STORED AS PARQUET;
> INSERT INTO TABLE test1 SELECT 1, 'Roger', 
> named_struct('number',8600,'street','Congress Ave.','zip','87366') FROM 
> srcpart LIMIT 1;
> {code}
> Note: {{srcpart}} could be any table. It is just used to leverage the INSERT 
> statement.
> The above table example generates the following Parquet file schema:
> {code}
> message hive_schema {
>   optional int32 id;
>   optional binary name (UTF8);
>   optional group address {
> optional int32 number;
> optional binary street (UTF8);
> optional binary zip (UTF8);
>   }
> }
> {code} 
> Afterwards, create a table that contains just a portion of the schema and 
> load the Parquet file generated above; a query on that table will fail:
> {code}
> CREATE TABLE test1 (name string, address struct) STORED AS 
> PARQUET;
> LOAD DATA LOCAL INPATH '/tmp/HiveGroup.parquet' OVERWRITE INTO TABLE test1;
> hive> SELECT name FROM test1;
> OK
> Roger
> Time taken: 0.071 seconds, Fetched: 1 row(s)
> hive> SELECT address FROM test1;
> OK
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.UnsupportedOperationException: Cannot inspect 
> org.apache.hadoop.io.IntWritable
> Time taken: 0.085 seconds
> {code}
> I would expect that Parquet can access the matched names, but Hive throws an 
> error instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-9891) LLAP: disable plan caching

2015-03-25 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner resolved HIVE-9891.
--
Resolution: Fixed

Committed to branch.

> LLAP: disable plan caching
> --
>
> Key: HIVE-9891
> URL: https://issues.apache.org/jira/browse/HIVE-9891
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-9891.1.patch
>
>
> Can't share the same plan objects in LLAP as they are used concurrently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10087) Beeline's --silent option should suppress query from being echoed when running with -f option

2015-03-25 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380945#comment-14380945
 ] 

Xuefu Zhang commented on HIVE-10087:


+1

> Beeline's --silent option should suppress query from being echoed when 
> running with -f option
> -
>
> Key: HIVE-10087
> URL: https://issues.apache.org/jira/browse/HIVE-10087
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 0.13.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-10087.patch
>
>
> The {{-e}} and the {{-f}} options behave differently. 
> {code}
> beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
> --silent=true -f select.sql
> 0: jdbc:hive2://localhost:1/default> select * from sample_07 limit 5;
> --
> 00- All Occupations 134354250 40690
> 11- Management occupations 6003930 96150
> 11-1011 Chief executives 299160 151370
> 11-1021 General and operations managers 1655410 103780
> 11-1031 Legislators 61110 33880
> --
> beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
> --silent=true -e "select * from sample_07 limit 5;"
> --
> 00-   All Occupations 134354250   40690
> 11-   Management occupations  6003930 96150
> 11-1011   Chief executives299160  151370
> 11-1021   General and operations managers 1655410 103780
> 11-1031   Legislators 61110   33880
> --
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10086) Hive throws error when accessing Parquet file schema using field name match

2015-03-25 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380937#comment-14380937
 ] 

Sergio Peña commented on HIVE-10086:


[~xuefuz] [~szehon] Could you help me review this patch?

> Hive throws error when accessing Parquet file schema using field name match
> ---
>
> Key: HIVE-10086
> URL: https://issues.apache.org/jira/browse/HIVE-10086
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-10086.1.patch, HiveGroup.parquet
>
>
> When a Hive table schema contains a portion of the schema of a Parquet file, 
> access to the values should work as long as the field names match the 
> schema. This does not work when a struct<> data type is in the schema and 
> the Hive schema contains just a portion of the struct elements; Hive throws 
> an error instead.
> Here is an example of how to reproduce it.
> First, create a Parquet table and add some values to it:
> {code}
> CREATE TABLE test1 (id int, name string, address 
> struct<number:int,street:string,zip:string>) STORED AS PARQUET;
> INSERT INTO TABLE test1 SELECT 1, 'Roger', 
> named_struct('number',8600,'street','Congress Ave.','zip','87366') FROM 
> srcpart LIMIT 1;
> {code}
> Note: {{srcpart}} could be any table. It is just used to leverage the INSERT 
> statement.
> The above table example generates the following Parquet file schema:
> {code}
> message hive_schema {
>   optional int32 id;
>   optional binary name (UTF8);
>   optional group address {
> optional int32 number;
> optional binary street (UTF8);
> optional binary zip (UTF8);
>   }
> }
> {code} 
> Afterwards, create a table that contains just a portion of the schema and 
> load the Parquet file generated above; a query on that table will fail:
> {code}
> CREATE TABLE test1 (name string, address struct) STORED AS 
> PARQUET;
> LOAD DATA LOCAL INPATH '/tmp/HiveGroup.parquet' OVERWRITE INTO TABLE test1;
> hive> SELECT name FROM test1;
> OK
> Roger
> Time taken: 0.071 seconds, Fetched: 1 row(s)
> hive> SELECT address FROM test1;
> OK
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.UnsupportedOperationException: Cannot inspect 
> org.apache.hadoop.io.IntWritable
> Time taken: 0.085 seconds
> {code}
> I would expect that Parquet can access the matched names, but Hive throws an 
> error instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10086) Hive throws error when accessing Parquet file schema using field name match

2015-03-25 Thread Sergio Peña (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10086:
---
Attachment: HiveGroup.parquet

> Hive throws error when accessing Parquet file schema using field name match
> ---
>
> Key: HIVE-10086
> URL: https://issues.apache.org/jira/browse/HIVE-10086
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-10086.1.patch, HiveGroup.parquet
>
>
> When a Hive table schema contains a portion of the schema of a Parquet file, 
> access to the values should work as long as the field names match the 
> schema. This does not work when a struct<> data type is in the schema and 
> the Hive schema contains just a portion of the struct elements; Hive throws 
> an error instead.
> Here is an example of how to reproduce it.
> First, create a Parquet table and add some values to it:
> {code}
> CREATE TABLE test1 (id int, name string, address 
> struct<number:int,street:string,zip:string>) STORED AS PARQUET;
> INSERT INTO TABLE test1 SELECT 1, 'Roger', 
> named_struct('number',8600,'street','Congress Ave.','zip','87366') FROM 
> srcpart LIMIT 1;
> {code}
> Note: {{srcpart}} could be any table. It is just used to leverage the INSERT 
> statement.
> The above table example generates the following Parquet file schema:
> {code}
> message hive_schema {
>   optional int32 id;
>   optional binary name (UTF8);
>   optional group address {
> optional int32 number;
> optional binary street (UTF8);
> optional binary zip (UTF8);
>   }
> }
> {code} 
> Afterwards, create a table that contains just a portion of the schema and 
> load the Parquet file generated above; a query on that table will fail:
> {code}
> CREATE TABLE test1 (name string, address struct) STORED AS 
> PARQUET;
> LOAD DATA LOCAL INPATH '/tmp/HiveGroup.parquet' OVERWRITE INTO TABLE test1;
> hive> SELECT name FROM test1;
> OK
> Roger
> Time taken: 0.071 seconds, Fetched: 1 row(s)
> hive> SELECT address FROM test1;
> OK
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.UnsupportedOperationException: Cannot inspect 
> org.apache.hadoop.io.IntWritable
> Time taken: 0.085 seconds
> {code}
> I would expect that Parquet can access the matched names, but Hive throws an 
> error instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10086) Hive throws error when accessing Parquet file schema using field name match

2015-03-25 Thread Sergio Peña (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10086:
---
Attachment: HIVE-10086.1.patch

> Hive throws error when accessing Parquet file schema using field name match
> ---
>
> Key: HIVE-10086
> URL: https://issues.apache.org/jira/browse/HIVE-10086
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-10086.1.patch
>
>
> When a Hive table schema contains a portion of the schema of a Parquet file, 
> access to the values should work as long as the field names match the file 
> schema. This does not work when a struct<> data type is in the schema and the 
> Hive schema contains just a portion of the struct elements; Hive throws an 
> error instead.
> Here is an example of how to reproduce it.
> First, create a Parquet table and add some values to it:
> {code}
> CREATE TABLE test1 (id int, name string, address 
> struct<number:int,street:string,zip:string>) STORED AS PARQUET;
> INSERT INTO TABLE test1 SELECT 1, 'Roger', 
> named_struct('number',8600,'street','Congress Ave.','zip','87366') FROM 
> srcpart LIMIT 1;
> {code}
> Note: {{srcpart}} could be any table. It is just used to leverage the INSERT 
> statement.
> The above table example generates the following Parquet file schema:
> {code}
> message hive_schema {
>   optional int32 id;
>   optional binary name (UTF8);
>   optional group address {
> optional int32 number;
> optional binary street (UTF8);
> optional binary zip (UTF8);
>   }
> }
> {code} 
> Afterwards, if I create a table that contains just a portion of the schema 
> and load the Parquet file generated above, a query on that table will fail:
> {code}
> CREATE TABLE test1 (name string, address struct<street:string,zip:string>) STORED AS 
> PARQUET;
> LOAD DATA LOCAL INPATH '/tmp/HiveGroup.parquet' OVERWRITE INTO TABLE test1;
> hive> SELECT name FROM test1;
> OK
> Roger
> Time taken: 0.071 seconds, Fetched: 1 row(s)
> hive> SELECT address FROM test1;
> OK
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.UnsupportedOperationException: Cannot inspect 
> org.apache.hadoop.io.IntWritable
> Time taken: 0.085 seconds
> {code}
> I would expect that Parquet can access the matched names, but Hive throws an 
> error instead.
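The name-based resolution the reporter expects can be illustrated with a small, self-contained sketch. This is hypothetical code, not Hive's Parquet reader; the class and method names are made up. It shows why matching the reader's struct fields by name works even when the reader schema is a subset, whereas positional matching would pair the reader's "street" with the file's int32 "number" field, which is exactly the "Cannot inspect IntWritable" failure described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SchemaMatchSketch {
    // Resolve the reader's requested struct fields against the file's struct
    // schema by field NAME, skipping file fields the reader does not ask for.
    public static Map<String, String> resolveByName(
            Map<String, String> fileStruct, String[] readerFields) {
        Map<String, String> resolved = new LinkedHashMap<>();
        for (String field : readerFields) {
            String type = fileStruct.get(field);
            if (type != null) {
                resolved.put(field, type);
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        // File schema from the example: address { number, street, zip }.
        Map<String, String> file = new LinkedHashMap<>();
        file.put("number", "int32");
        file.put("street", "binary (UTF8)");
        file.put("zip", "binary (UTF8)");
        // The reader's table schema carries only a portion of the struct.
        Map<String, String> result =
            resolveByName(file, new String[] {"street", "zip"});
        System.out.println(result); // "number" is pruned; names are matched
    }
}
```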





[jira] [Commented] (HIVE-10087) Beeline's --silent option should suppress query from being echoed when running with -f option

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380933#comment-14380933
 ] 

Hive QA commented on HIVE-10087:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707246/HIVE-10087.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8337 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_multi_distinct
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3154/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3154/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3154/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707246 - PreCommit-HIVE-TRUNK-Build

> Beeline's --silent option should suppress query from being echoed when 
> running with -f option
> -
>
> Key: HIVE-10087
> URL: https://issues.apache.org/jira/browse/HIVE-10087
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 0.13.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-10087.patch
>
>
> The {{-e}} and the {{-f}} options behave differently. 
> {code}
> beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
> --silent=true -f select.sql
> 0: jdbc:hive2://localhost:1/default> select * from sample_07 limit 5;
> --
> 00- All Occupations 134354250 40690
> 11- Management occupations 6003930 96150
> 11-1011 Chief executives 299160 151370
> 11-1021 General and operations managers 1655410 103780
> 11-1031 Legislators 61110 33880
> --
> beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
> --silent=true -e "select * from sample_07 limit 5;"
> --
> 00-   All Occupations 134354250   40690
> 11-   Management occupations  6003930 96150
> 11-1011   Chief executives299160  151370
> 11-1021   General and operations managers 1655410 103780
> 11-1031   Legislators 61110   33880
> --
> {code}





[jira] [Commented] (HIVE-9664) Hive "add jar" command should be able to download and add jars from a repository

2015-03-25 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380928#comment-14380928
 ] 

Anthony Hsu commented on HIVE-9664:
---

+1 (though I don't have Hive commit access)

> Hive "add jar" command should be able to download and add jars from a 
> repository
> 
>
> Key: HIVE-9664
> URL: https://issues.apache.org/jira/browse/HIVE-9664
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Anant Nag
>Assignee: Anant Nag
>  Labels: hive, patch
> Attachments: HIVE-9664.4.patch, HIVE-9664.5.patch, HIVE-9664.patch, 
> HIVE-9664.patch, HIVE-9664.patch
>
>
> Currently, Hive's "add jar" command takes a local path to the dependency jar. 
> This clutters the local file-system, as users may forget to remove the jar 
> later.
> It would be nice if Hive supported a Gradle-like notation to download the jar 
> from a repository.
> Example:  add jar org:module:version
> 
> It should also be backward compatible and take a jar from the local 
> file-system as well.
> RB:  https://reviews.apache.org/r/31628/





[jira] [Updated] (HIVE-9664) Hive "add jar" command should be able to download and add jars from a repository

2015-03-25 Thread Anthony Hsu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-9664:
--
Description: 
Currently, Hive's "add jar" command takes a local path to the dependency jar. 
This clutters the local file-system, as users may forget to remove the jar later.
It would be nice if Hive supported a Gradle-like notation to download the jar 
from a repository.

Example:  add jar org:module:version

It should also be backward compatible and take a jar from the local 
file-system as well.

RB:  https://reviews.apache.org/r/31628/

  was:
Currently, Hive's "add jar" command takes a local path to the dependency jar. 
This clutters the local file-system, as users may forget to remove the jar later.
It would be nice if Hive supported a Gradle-like notation to download the jar 
from a repository.

Example:  add jar org:module:version

It should also be backward compatible and take a jar from the local 
file-system as well.




> Hive "add jar" command should be able to download and add jars from a 
> repository
> 
>
> Key: HIVE-9664
> URL: https://issues.apache.org/jira/browse/HIVE-9664
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Anant Nag
>Assignee: Anant Nag
>  Labels: hive, patch
> Attachments: HIVE-9664.4.patch, HIVE-9664.5.patch, HIVE-9664.patch, 
> HIVE-9664.patch, HIVE-9664.patch
>
>
> Currently, Hive's "add jar" command takes a local path to the dependency jar. 
> This clutters the local file-system, as users may forget to remove the jar 
> later.
> It would be nice if Hive supported a Gradle-like notation to download the jar 
> from a repository.
> Example:  add jar org:module:version
> 
> It should also be backward compatible and take a jar from the local 
> file-system as well.
> RB:  https://reviews.apache.org/r/31628/





[jira] [Commented] (HIVE-4941) PTest2 Investigate Ignores

2015-03-25 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380823#comment-14380823
 ] 

Anthony Hsu commented on HIVE-4941:
---

I figured it out. I'm not supposed to run TestHiveMetaStore directly. I'm 
supposed to run the subclasses TestEmbeddedHiveMetaStore and 
TestRemoteHiveMetaStore instead. Running the subclasses works.

> PTest2 Investigate Ignores
> --
>
> Key: HIVE-4941
> URL: https://issues.apache.org/jira/browse/HIVE-4941
> Project: Hive
>  Issue Type: Task
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
>
> Currently we are excluding the following tests:
> unitTests.exclude = TestHiveMetaStore TestSerDe TestBeeLineDriver 
> TestHiveServer2Concurrency TestJdbcDriver2 TestHiveServer2Concurrency 
> TestBeeLineDriver
> Some of them we got from the build files, but I am not sure about 
> TestJdbcDriver2, for example. We should investigate why these are excluded.





[jira] [Updated] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-03-25 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-9582:
---
Attachment: HIVE-9582.5.patch

(Attaching rebased patch)

> HCatalog should use IMetaStoreClient interface
> --
>
> Key: HIVE-9582
> URL: https://issues.apache.org/jira/browse/HIVE-9582
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog, Metastore
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>  Labels: hcatalog, metastore, rolling_upgrade
> Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, 
> HIVE-9582.4.patch, HIVE-9582.5.patch, HIVE-9583.1.patch
>
>
> Hive uses IMetaStoreClient and it makes using RetryingMetaStoreClient easy. 
> Hence during a failure, the client retries and possibly succeeds. But 
> HCatalog has long been using HiveMetaStoreClient directly and hence failures 
> are costly, especially if they are during the commit stage of a job. It's also 
> not possible to do a rolling upgrade of the MetaStore Server.





[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-03-25 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380812#comment-14380812
 ] 

Sushanth Sowmyan commented on HIVE-9582:


I've gone through this patch, and am +1 on the intent of the .4.patch. It did 
not apply cleanly on trunk, however, so I've rebased it slightly to make it 
apply. I'm uploading the .5.patch to reflect this change so we can get one 
more unit test run completed with the latest patch.

That said, I do have one point of concern. Changing HCatUtil.getHiveClient to 
return IMetaStoreClient instead of a HMSC can potentially break oozie, falcon, 
and sqoop compilation, not to mention other external user code that uses that 
method to get a HiveMetaStoreClient. I agree that it should have been IMSC all 
along, rather than HMSC, but it is now a published interface.

We could solve this as follows:

a) create a new function getMetaStoreClient() which returns an IMSC. Change all 
code in HCat to refer to this function instead of the current getHiveClient().

b) retain the existing getHiveClient() and have it continue to return a HMSC, 
and mark it for deprecation over 2 releases - i.e. deprecated in 1.2, gone in 
1.3. I was worried whether this was possible given that you changed the Cache 
class, but luckily your underlying implementation of ICacheableMetaStoreClient 
is a CacheableHiveMetaStoreClient, which is a HMSC, so this is still possible 
with minimal changes.

Once we do this, we should communicate to oozie/falcon/sqoop developers to 
change their usage of this function to the more generic one.
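The two-step compatibility plan above can be sketched as follows. This is hypothetical illustration code: the interface and classes are empty stand-ins for the real Hive types, and only the method shapes reflect the proposal.

```java
// Stand-ins for the real Hive metastore types; bodies elided for illustration.
interface IMetaStoreClient { }

class HiveMetaStoreClient implements IMetaStoreClient { }

class HCatUtilSketch {
    // (a) New, generic entry point: callers depend only on the interface,
    // so the underlying client implementation can later be swapped out.
    public static IMetaStoreClient getMetaStoreClient() {
        return new HiveMetaStoreClient();
    }

    // (b) Retained for oozie/falcon/sqoop source compatibility; deprecated
    // now, to be removed after two releases. The cast is only safe while the
    // underlying implementation is still a HiveMetaStoreClient subclass.
    @Deprecated
    public static HiveMetaStoreClient getHiveClient() {
        return (HiveMetaStoreClient) getMetaStoreClient();
    }
}
```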

> HCatalog should use IMetaStoreClient interface
> --
>
> Key: HIVE-9582
> URL: https://issues.apache.org/jira/browse/HIVE-9582
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog, Metastore
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>  Labels: hcatalog, metastore, rolling_upgrade
> Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, 
> HIVE-9582.4.patch, HIVE-9583.1.patch
>
>
> Hive uses IMetaStoreClient and it makes using RetryingMetaStoreClient easy. 
> Hence during a failure, the client retries and possibly succeeds. But 
> HCatalog has long been using HiveMetaStoreClient directly and hence failures 
> are costly, especially if they are during the commit stage of a job. Its also 
> not possible to do rolling upgrade of MetaStore Server.





[jira] [Commented] (HIVE-4941) PTest2 Investigate Ignores

2015-03-25 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380793#comment-14380793
 ] 

Anthony Hsu commented on HIVE-4941:
---

I'm trying to run TestHiveMetaStore but I'm having trouble getting it to run. 
I made the following changes to the root pom.xml file:
{code}
diff --git a/pom.xml b/pom.xml
index 5d4f13c..a8098ce 100644
--- a/pom.xml
+++ b/pom.xml
@@ -712,12 +712,12 @@
 
   
     <exclude>**/TestSerDe.java</exclude>
-    <exclude>**/TestHiveMetaStore.java</exclude>
+
     <exclude>**/ql/exec/vector/util/*.java</exclude>
     <exclude>**/ql/exec/vector/udf/legacy/*.java</exclude>
     <exclude>**/ql/exec/vector/udf/generic/*.java</exclude>
     <exclude>**/TestHiveServer2Concurrency.java</exclude>
-    <exclude>**/TestHiveMetaStore.java</exclude>
+
   
   true
   false
{code}
and then ran
{code}
mvn clean install -DskipTests -Phadoop-2
cd itests
mvn clean install -DskipTests -Phadoop-2
mvn test -Phadoop-2 -Dtest=TestHiveMetaStore
{code}

However, in the output, I see no tests were run:
{code}
---
 T E S T S
---

Results :

Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
{code}

A couple questions:
# Why is the TestHiveMetaStore test excluded?
# How do I run the TestHiveMetaStore test?

> PTest2 Investigate Ignores
> --
>
> Key: HIVE-4941
> URL: https://issues.apache.org/jira/browse/HIVE-4941
> Project: Hive
>  Issue Type: Task
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
>
> Currently we are excluding the following tests:
> unitTests.exclude = TestHiveMetaStore TestSerDe TestBeeLineDriver 
> TestHiveServer2Concurrency TestJdbcDriver2 TestHiveServer2Concurrency 
> TestBeeLineDriver
> Some of them we got from the build files, but I am not sure about 
> TestJdbcDriver2, for example. We should investigate why these are excluded.





[jira] [Commented] (HIVE-9664) Hive "add jar" command should be able to download and add jars from a repository

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380792#comment-14380792
 ] 

Hive QA commented on HIVE-9664:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707187/HIVE-9664.5.patch

{color:green}SUCCESS:{color} +1 8344 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3153/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3153/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3153/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707187 - PreCommit-HIVE-TRUNK-Build

> Hive "add jar" command should be able to download and add jars from a 
> repository
> 
>
> Key: HIVE-9664
> URL: https://issues.apache.org/jira/browse/HIVE-9664
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Anant Nag
>Assignee: Anant Nag
>  Labels: hive, patch
> Attachments: HIVE-9664.4.patch, HIVE-9664.5.patch, HIVE-9664.patch, 
> HIVE-9664.patch, HIVE-9664.patch
>
>
> Currently, Hive's "add jar" command takes a local path to the dependency jar. 
> This clutters the local file-system, as users may forget to remove the jar 
> later.
> It would be nice if Hive supported a Gradle-like notation to download the jar 
> from a repository.
> Example:  add jar org:module:version
> 
> It should also be backward compatible and take a jar from the local 
> file-system as well.





[jira] [Commented] (HIVE-10078) Optionally allow logging of records processed in fixed intervals

2015-03-25 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380780#comment-14380780
 ] 

Prasanth Jayachandran commented on HIVE-10078:
--

LGTM, +1. Pending unit tests.

> Optionally allow logging of records processed in fixed intervals
> 
>
> Key: HIVE-10078
> URL: https://issues.apache.org/jira/browse/HIVE-10078
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-10078.1.patch, HIVE-10078.2.patch
>
>
> Tasks today log progress (records in/records out) on an exponential scale (1, 
> 10, 100, ...). Sometimes it's helpful to be able to switch to a fixed interval. 
> That can help in debugging certain issues that look like a hang, etc.
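The difference between the two policies can be sketched in a few lines. This is illustrative code only; the class and method names are made up and do not correspond to Hive's implementation.

```java
public class ProgressLogSketch {
    // Exponential policy: log at 1, 10, 100, ... records; after logging,
    // the caller multiplies the cutoff by 10.
    static boolean shouldLogExponential(long count, long nextCutoff) {
        return count == nextCutoff;
    }

    // Fixed-interval policy: log every `interval` records, giving a steady
    // cadence that makes an apparent hang easy to spot in the logs.
    static boolean shouldLogFixed(long count, long interval) {
        return count % interval == 0;
    }

    public static void main(String[] args) {
        long cutoff = 1;
        for (long n = 1; n <= 10000; n++) {
            if (shouldLogExponential(n, cutoff)) {
                System.out.println("processed " + n + " records");
                cutoff *= 10; // next log only at 10x as many records
            }
        }
    }
}
```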





[jira] [Updated] (HIVE-10078) Optionally allow logging of records processed in fixed intervals

2015-03-25 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10078:
--
Attachment: HIVE-10078.2.patch

> Optionally allow logging of records processed in fixed intervals
> 
>
> Key: HIVE-10078
> URL: https://issues.apache.org/jira/browse/HIVE-10078
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-10078.1.patch, HIVE-10078.2.patch
>
>
> Tasks today log progress (records in/records out) on an exponential scale (1, 
> 10, 100, ...). Sometimes it's helpful to be able to switch to a fixed interval. 
> That can help in debugging certain issues that look like a hang, etc.





[jira] [Commented] (HIVE-10078) Optionally allow logging of records processed in fixed intervals

2015-03-25 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380746#comment-14380746
 ] 

Gunther Hagleitner commented on HIVE-10078:
---

Thanks [~prasanth_j]. I thought I had already done that. Sigh.

> Optionally allow logging of records processed in fixed intervals
> 
>
> Key: HIVE-10078
> URL: https://issues.apache.org/jira/browse/HIVE-10078
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-10078.1.patch
>
>
> Tasks today log progress (records in/records out) on an exponential scale (1, 
> 10, 100, ...). Sometimes it's helpful to be able to switch to a fixed interval. 
> That can help in debugging certain issues that look like a hang, etc.





[jira] [Commented] (HIVE-9976) Possible race condition in DynamicPartitionPruner for <200ms tasks

2015-03-25 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380740#comment-14380740
 ] 

Gunther Hagleitner commented on HIVE-9976:
--

+1

> Possible race condition in DynamicPartitionPruner for <200ms tasks
> --
>
> Key: HIVE-9976
> URL: https://issues.apache.org/jira/browse/HIVE-9976
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.0
>Reporter: Gopal V
>Assignee: Siddharth Seth
> Attachments: HIVE-9976.1.patch, HIVE-9976.2.patch, 
> llap_vertex_200ms.png
>
>
> Race condition in the DynamicPartitionPruner between 
> DynamicPartitionPruner::processVertex() and 
> DynamicPartitionPruner::addEvent() for tasks which respond with both the 
> result and success in a single heartbeat sequence.
> {code}
> 2015-03-16 07:05:01,589 ERROR [InputInitializer [Map 1] #0] 
> tez.DynamicPartitionPruner: Expecting: 1, received: 0
> 2015-03-16 07:05:01,590 ERROR [Dispatcher thread: Central] impl.VertexImpl: 
> Vertex Input: store_sales initializer failed, 
> vertex=vertex_1424502260528_1113_4_04 [Map 1]
> org.apache.tez.dag.app.dag.impl.AMUserCodeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Incorrect event count in 
> dynamic parition pruning
> {code}
> !llap_vertex_200ms.png!
> All 4 upstream vertices of Map 1 need to finish within ~200ms to trigger 
> this, which seems to be consistently happening with LLAP.
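A simplified model of the race (hypothetical code, not the actual DynamicPartitionPruner): when the success notification from a source task is processed before the pruning event delivered in the same heartbeat, a fail-fast check observes "Expecting: 1, received: 0". Blocking until the expected count is reached closes that window.

```java
// Simplified event bookkeeping for the pruner; names are made up.
class PrunerSketch {
    private int received = 0;

    // Invoked when a pruning event arrives from a source task.
    synchronized void addEvent() {
        received++;
        notifyAll();
    }

    // Invoked when the source vertex reports success. Instead of failing
    // fast when received < expected, block until the events catch up.
    synchronized void awaitEvents(int expected) throws InterruptedException {
        while (received < expected) {
            wait();
        }
    }
}
```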





[jira] [Commented] (HIVE-10085) Lateral view on top of a view throws RuntimeException

2015-03-25 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380733#comment-14380733
 ] 

Aihua Xu commented on HIVE-10085:
-

Please try again. This was my first time using the rbt tool; it seems I still 
needed to set permissions and publish.

> Lateral view on top of a view throws RuntimeException
> -
>
> Key: HIVE-10085
> URL: https://issues.apache.org/jira/browse/HIVE-10085
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10085.patch
>
>
> Run the following SQL statements to create the table and view, then execute 
> the select statement. It will throw the runtime exception:
> {noformat}
> FAILED: RuntimeException 
> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: "map" or "list" is 
> expected at function SIZE, but "int" is found
> {noformat}
> {noformat} 
> CREATE TABLE t1( symptom STRING,  pattern ARRAY<STRING>,  occurrence INT, index 
> INT);
> CREATE OR REPLACE VIEW v1 AS
> SELECT TRIM(pd.symptom) AS symptom, pd.index, pd.pattern, pd.occurrence, 
> pd.occurrence as cnt from t1 pd;
> SELECT pattern_data.symptom, pattern_data.index, pattern_data.occurrence, 
> pattern_data.cnt, size(pattern_data.pattern) as pattern_length, 
> pattern.pattern_id
> FROM v1 pattern_data LATERAL VIEW explode(pattern) pattern AS pattern_id;
> {noformat}





[jira] [Updated] (HIVE-10074) Ability to run HCat Client Unit tests in a system test setting

2015-03-25 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-10074:

Attachment: HIVE-10074.1.patch

> Ability to run HCat Client Unit tests in a system test setting
> --
>
> Key: HIVE-10074
> URL: https://issues.apache.org/jira/browse/HIVE-10074
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Deepesh Khandelwal
>Assignee: Deepesh Khandelwal
> Attachments: HIVE-10074.1.patch, HIVE-10074.patch
>
>
> Following testsuite 
> {{hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java}}
>  is a JUnit testsuite to test some basic HCat client API. During setup it 
> brings up a Hive Metastore with embedded Derby. The testsuite however will be 
> even more useful if it can be run against a running Hive Metastore 
> (transparent to whatever backing DB its running against).





[jira] [Commented] (HIVE-10074) Ability to run HCat Client Unit tests in a system test setting

2015-03-25 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380727#comment-14380727
 ] 

Sushanth Sowmyan commented on HIVE-10074:
-

The test failures do not look related to this patch. Still, it might be worth 
rerunning the tests to confirm. I'm going to cancel the patch, re-upload the 
same patch, and set it to Patch Available again.



> Ability to run HCat Client Unit tests in a system test setting
> --
>
> Key: HIVE-10074
> URL: https://issues.apache.org/jira/browse/HIVE-10074
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Deepesh Khandelwal
>Assignee: Deepesh Khandelwal
> Attachments: HIVE-10074.patch
>
>
> Following testsuite 
> {{hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java}}
>  is a JUnit testsuite to test some basic HCat client API. During setup it 
> brings up a Hive Metastore with embedded Derby. The testsuite however will be 
> even more useful if it can be run against a running Hive Metastore 
> (transparent to whatever backing DB its running against).





[jira] [Commented] (HIVE-1575) get_json_object does not support JSON array at the root level

2015-03-25 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380723#comment-14380723
 ] 

Jason Dere commented on HIVE-1575:
--

+1 if tests look good

> get_json_object does not support JSON array at the root level
> -
>
> Key: HIVE-1575
> URL: https://issues.apache.org/jira/browse/HIVE-1575
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 0.7.0
>Reporter: Steven Wong
>Assignee: Alexander Pivovarov
> Attachments: 
> 0001-Updated-UDFJson-to-allow-arrays-as-a-root-object.patch, 
> HIVE-1575.2.patch, HIVE-1575.3.patch, HIVE-1575.4.patch, HIVE-1575.5.patch
>
>
> Currently, get_json_object(json_txt, path) always returns null if json_txt is 
> not a JSON object (e.g. is a JSON array) at the root level.
> I have a table column of JSON arrays at the root level, but I can't parse it 
> because of that.
> get_json_object should accept any JSON value (string, number, object, array, 
> true, false, null), not just object, at the root level. In other words, it 
> should behave as if it were named get_json_value or simply get_json.
> Per the JSON RFC, an array is indeed a legal top-level JSON text.





[jira] [Commented] (HIVE-10087) Beeline's --silent option should suppress query from being echoed when running with -f option

2015-03-25 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380721#comment-14380721
 ] 

Naveen Gangam commented on HIVE-10087:
--

I have filed a new jira against jline to have this fixed.
https://github.com/jline/jline2/issues/181


> Beeline's --silent option should suppress query from being echoed when 
> running with -f option
> -
>
> Key: HIVE-10087
> URL: https://issues.apache.org/jira/browse/HIVE-10087
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 0.13.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-10087.patch
>
>
> The {{-e}} and the {{-f}} options behave differently. 
> {code}
> beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
> --silent=true -f select.sql
> 0: jdbc:hive2://localhost:1/default> select * from sample_07 limit 5;
> --
> 00- All Occupations 134354250 40690
> 11- Management occupations 6003930 96150
> 11-1011 Chief executives 299160 151370
> 11-1021 General and operations managers 1655410 103780
> 11-1031 Legislators 61110 33880
> --
> beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
> --silent=true -e "select * from sample_07 limit 5;"
> --
> 00-   All Occupations 134354250   40690
> 11-   Management occupations  6003930 96150
> 11-1011   Chief executives299160  151370
> 11-1021   General and operations managers 1655410 103780
> 11-1031   Legislators 61110   33880
> --
> {code}





[jira] [Updated] (HIVE-1575) get_json_object does not support JSON array at the root level

2015-03-25 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-1575:
--
Attachment: HIVE-1575.5.patch

Patch #5: fixed two issues with the initial JSON path validation.

> get_json_object does not support JSON array at the root level
> -
>
> Key: HIVE-1575
> URL: https://issues.apache.org/jira/browse/HIVE-1575
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 0.7.0
>Reporter: Steven Wong
>Assignee: Alexander Pivovarov
> Attachments: 
> 0001-Updated-UDFJson-to-allow-arrays-as-a-root-object.patch, 
> HIVE-1575.2.patch, HIVE-1575.3.patch, HIVE-1575.4.patch, HIVE-1575.5.patch
>
>
> Currently, get_json_object(json_txt, path) always returns null if json_txt is 
> not a JSON object (e.g. is a JSON array) at the root level.
> I have a table column of JSON arrays at the root level, but I can't parse it 
> because of that.
> get_json_object should accept any JSON value (string, number, object, array, 
> true, false, null), not just object, at the root level. In other words, it 
> should behave as if it were named get_json_value or simply get_json.
> Per the JSON RFC, an array is indeed a legal top-level JSON text.





[jira] [Commented] (HIVE-10087) Beeline's --silent option should suppress query from being echoed when running with -f option

2015-03-25 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380690#comment-14380690
 ] 

Naveen Gangam commented on HIVE-10087:
--

This is coming from the jline code. As it reads the script file character by 
character, it appends each character to the StringBuffer and also echoes it to 
the terminal if no mask character is set. In this case, we have set the mask to 
the null character value to prevent this echo. When it encounters an 
end-of-line character in the file, i.e. ";", it completes the line, prints a 
newline character, and returns the line to Beeline to be executed.

{code}

case ACCEPT_LINE:
return accept();


public String accept() throws IOException {
moveToEnd();
println(); // output newline
flush();
return finishBuffer();
}


public final void println() throws IOException {
print(CR);
}

{code}

If this newline were to be turned off, the first line of the result set would 
be printed on the same line as the query. Something like this
{code}
$ beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
--silent=false --outputformat=tsv2 -f query.sql 
scan complete in 2ms
Connecting to jdbc:hive2://localhost:1/default
0: jdbc:hive2://localhost> select * from booleantest limit 5;true   t
false   f
TRUET
FALSE   F
ZERO0
{code}

The jline code does not provide any API that lets me control this. Something 
like the following should fix it; I will file a jira against jline.
{code}
public final void println() throws IOException {
    if (mask == null)
        print(CR);
}
{code}

For now, we may have to get by with what we have.
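For illustration, a minimal standalone sketch of how the proposed guard behaves; the class and method names here are hypothetical, not jline's actual API. The newline is echoed only when no mask character is active:

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

/**
 * Hypothetical sketch of the proposed fix: suppress the echoed newline
 * whenever a mask character is set (as Beeline does for -f scripts).
 * Names are illustrative, not jline's real API.
 */
public class MaskAwareReader {
    private final Writer out;
    private final Character mask; // non-null means input is masked (not echoed)

    public MaskAwareReader(Writer out, Character mask) {
        this.out = out;
        this.mask = mask;
    }

    public void println() throws IOException {
        if (mask == null) { // proposed guard: only echo when not masked
            out.write('\n');
            out.flush();
        }
    }

    /** Returns what would be echoed to the terminal after accepting a line. */
    public static String echoedNewline(Character mask) {
        StringWriter sw = new StringWriter();
        try {
            new MaskAwareReader(sw, mask).println();
        } catch (IOException e) {
            throw new AssertionError(e); // StringWriter never throws
        }
        return sw.toString();
    }
}
```

With a null mask the newline is echoed; with the null character as mask (Beeline's -f case) nothing is printed.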

> Beeline's --silent option should suppress query from being echoed when 
> running with -f option
> -
>
> Key: HIVE-10087
> URL: https://issues.apache.org/jira/browse/HIVE-10087
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 0.13.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-10087.patch
>
>
> The {{-e}} and the {{-f}} options behave differently. 
> {code}
> beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
> --silent=true -f select.sql
> 0: jdbc:hive2://localhost:1/default> select * from sample_07 limit 5;
> --
> 00- All Occupations 134354250 40690
> 11- Management occupations 6003930 96150
> 11-1011 Chief executives 299160 151370
> 11-1021 General and operations managers 1655410 103780
> 11-1031 Legislators 61110 33880
> --
> beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
> --silent=true -e "select * from sample_07 limit 5;"
> --
> 00-   All Occupations 134354250   40690
> 11-   Management occupations  6003930 96150
> 11-1011   Chief executives299160  151370
> 11-1021   General and operations managers 1655410 103780
> 11-1031   Legislators 61110   33880
> --
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9976) Possible race condition in DynamicPartitionPruner for <200ms tasks

2015-03-25 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-9976:
-
Attachment: HIVE-9976.2.patch

Thanks for the review. Updated patch with comments addressed, and some more 
changes.

bq. Not your fault - but there are 2 paths through HiveSplitGenerator.
Moved the methods into SplitGrouper. There is a static cache in there which 
seems a little strange; I will create a follow-up JIRA to investigate it. For 
now I have changed it to a ConcurrentMap, since split generation can run in 
parallel.

bq. i see you've fixed calling close consistently on the data input stream. 
maybe use try{}finally there?
Fixed. There was a bug with some of the other conditions which I'd changed. 
Fixed that as well.

bq. it seems you're setting numexpectedevents to 0 first and then turn around 
and call decrement. Why not just set to -1? Also - why atomic integers? as far 
as i can tell all access to these maps is synchronized.
numExpectedEvents is decremented for each column for which a source will send 
events; it tracks the total number of events expected from that source. Added a 
comment explaining this.
Moved from AtomicInteger to MutableInt; this was just to avoid re-inserting the 
Integer into the map, not for thread safety.

bq. does it make sense to make initialize in the pruner private now? (can't be 
used to init anymore - only from the constr). Also, the parameters aren't used 
anymore, right?
Done, along with some other methods.


> Possible race condition in DynamicPartitionPruner for <200ms tasks
> --
>
> Key: HIVE-9976
> URL: https://issues.apache.org/jira/browse/HIVE-9976
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.0
>Reporter: Gopal V
>Assignee: Siddharth Seth
> Attachments: HIVE-9976.1.patch, HIVE-9976.2.patch, 
> llap_vertex_200ms.png
>
>
> Race condition in the DynamicPartitionPruner between 
> DynamicPartitionPruner::processVertex() and 
> DynamicPartitionpruner::addEvent() for tasks which respond with both the 
> result and success in a single heartbeat sequence.
> {code}
> 2015-03-16 07:05:01,589 ERROR [InputInitializer [Map 1] #0] 
> tez.DynamicPartitionPruner: Expecting: 1, received: 0
> 2015-03-16 07:05:01,590 ERROR [Dispatcher thread: Central] impl.VertexImpl: 
> Vertex Input: store_sales initializer failed, 
> vertex=vertex_1424502260528_1113_4_04 [Map 1]
> org.apache.tez.dag.app.dag.impl.AMUserCodeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Incorrect event count in 
> dynamic parition pruning
> {code}
> !llap_vertex_200ms.png!
> All 4 upstream vertices of Map 1 need to finish within ~200ms to trigger 
> this, which seems to be consistently happening with LLAP.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10085) Lateral view on top of a view throws RuntimeException

2015-03-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380680#comment-14380680
 ] 

Ashutosh Chauhan commented on HIVE-10085:
-

The above link says:
{noformat}
You don't have access to this review request.
This review request is private. You must be a requested reviewer, either 
directly or on a requested group, and have permission to access the repository 
in order to view this review request.
{noformat}

> Lateral view on top of a view throws RuntimeException
> -
>
> Key: HIVE-10085
> URL: https://issues.apache.org/jira/browse/HIVE-10085
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10085.patch
>
>
> Run the following SQL statements to create the table and view, then execute 
> the SELECT statement; it throws this runtime exception:
> {noformat}
> FAILED: RuntimeException 
> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: "map" or "list" is 
> expected at function SIZE, but "int" is found
> {noformat}
> {noformat} 
> CREATE TABLE t1( symptom STRING,  pattern ARRAY,  occurrence INT, index 
> INT);
> CREATE OR REPLACE VIEW v1 AS
> SELECT TRIM(pd.symptom) AS symptom, pd.index, pd.pattern, pd.occurrence, 
> pd.occurrence as cnt from t1 pd;
> SELECT pattern_data.symptom, pattern_data.index, pattern_data.occurrence, 
> pattern_data.cnt, size(pattern_data.pattern) as pattern_length, 
> pattern.pattern_id
> FROM v1 pattern_data LATERAL VIEW explode(pattern) pattern AS pattern_id;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10091) Generate Hbase execution plan for partition filter conditions in HbaseStore api calls

2015-03-25 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-10091:
-
Attachment: HIVE-10091.1.patch

> Generate Hbase execution plan for partition filter conditions in HbaseStore 
> api calls
> -
>
> Key: HIVE-10091
> URL: https://issues.apache.org/jira/browse/HIVE-10091
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-10091.1.patch
>
>
> RawStore functions that support partition filtering are the following - 
> getPartitionsByExpr
> getPartitionsByFilter (takes filter string as argument, used from hcatalog)
> We need to generate a query execution plan in terms of Hbase scan api calls 
> for a given filter condition.
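One plausible shape for such a plan, sketched without the real HBase client API: an equality condition on the leading partition column becomes a start/stop row-key range for a scan. All names and the key layout below are assumptions for illustration, not the HbaseStore design:

```java
/**
 * Hypothetical sketch: turn "p1 = value" on the leading partition column
 * into a [startRow, stopRow) range that an HBase Scan could use. Partition
 * keys are assumed to look like "<table>/<p1>/<p2>/...".
 */
public class PartitionScanPlan {
    public final String startRow;
    public final String stopRow;

    private PartitionScanPlan(String startRow, String stopRow) {
        this.startRow = startRow;
        this.stopRow = stopRow;
    }

    public static PartitionScanPlan forLeadingEquality(String table, String value) {
        String prefix = table + "/" + value + "/";
        // stop row: the prefix with its last character incremented, so the
        // scan covers exactly the keys strictly under the prefix
        char last = prefix.charAt(prefix.length() - 1);
        String stop = prefix.substring(0, prefix.length() - 1) + (char) (last + 1);
        return new PartitionScanPlan(prefix, stop);
    }
}
```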



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10091) Generate Hbase execution plan for partition filter conditions in HbaseStore api calls

2015-03-25 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-10091:
-
Description: 
RawStore functions that support partition filtering are the following - 
getPartitionsByExpr
getPartitionsByFilter (takes filter string as argument, used from hcatalog)

We need to generate a query execution plan in terms of Hbase scan api calls for 
a given filter condition.


NO PRECOMMIT TESTS

  was:
RawStore functions that support partition filtering are the following - 
getPartitionsByExpr
getPartitionsByFilter (takes filter string as argument, used from hcatalog)

We need to generate a query execution plan in terms of Hbase scan api calls for 
a given filter condition.



> Generate Hbase execution plan for partition filter conditions in HbaseStore 
> api calls
> -
>
> Key: HIVE-10091
> URL: https://issues.apache.org/jira/browse/HIVE-10091
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-10091.1.patch
>
>
> RawStore functions that support partition filtering are the following - 
> getPartitionsByExpr
> getPartitionsByFilter (takes filter string as argument, used from hcatalog)
> We need to generate a query execution plan in terms of Hbase scan api calls 
> for a given filter condition.
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10085) Lateral view on top of a view throws RuntimeException

2015-03-25 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380625#comment-14380625
 ] 

Aihua Xu commented on HIVE-10085:
-

RB entry: https://reviews.apache.org/r/32491/

> Lateral view on top of a view throws RuntimeException
> -
>
> Key: HIVE-10085
> URL: https://issues.apache.org/jira/browse/HIVE-10085
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10085.patch
>
>
> Run the following SQL statements to create the table and view, then execute 
> the SELECT statement; it throws this runtime exception:
> {noformat}
> FAILED: RuntimeException 
> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: "map" or "list" is 
> expected at function SIZE, but "int" is found
> {noformat}
> {noformat} 
> CREATE TABLE t1( symptom STRING,  pattern ARRAY,  occurrence INT, index 
> INT);
> CREATE OR REPLACE VIEW v1 AS
> SELECT TRIM(pd.symptom) AS symptom, pd.index, pd.pattern, pd.occurrence, 
> pd.occurrence as cnt from t1 pd;
> SELECT pattern_data.symptom, pattern_data.index, pattern_data.occurrence, 
> pattern_data.cnt, size(pattern_data.pattern) as pattern_length, 
> pattern.pattern_id
> FROM v1 pattern_data LATERAL VIEW explode(pattern) pattern AS pattern_id;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8817) Create unit test where we insert into an encrypted table and then read from it with pig

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380623#comment-14380623
 ] 

Hive QA commented on HIVE-8817:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12704794/HIVE-8817.patch

{color:green}SUCCESS:{color} +1 8342 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3151/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3151/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3151/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12704794 - PreCommit-HIVE-TRUNK-Build

> Create unit test where we insert into an encrypted table and then read from 
> it with pig
> ---
>
> Key: HIVE-8817
> URL: https://issues.apache.org/jira/browse/HIVE-8817
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: encryption-branch
>Reporter: Brock Noland
>Assignee: Ferdinand Xu
> Fix For: encryption-branch
>
> Attachments: HIVE-8817.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-9563) CBO(Calcite Return Path): Translate GB to Hive OP [CBO branch]

2015-03-25 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran resolved HIVE-9563.
--
Resolution: Fixed

> CBO(Calcite Return Path): Translate GB to Hive OP [CBO branch]
> --
>
> Key: HIVE-9563
> URL: https://issues.apache.org/jira/browse/HIVE-9563
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: cbo-branch
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Fix For: 1.2.0
>
> Attachments: HIVE-9563.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10085) Lateral view on top of a view throws RuntimeException

2015-03-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380536#comment-14380536
 ] 

Ashutosh Chauhan commented on HIVE-10085:
-

Can you create a RB entry for this?

> Lateral view on top of a view throws RuntimeException
> -
>
> Key: HIVE-10085
> URL: https://issues.apache.org/jira/browse/HIVE-10085
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10085.patch
>
>
> Run the following SQL statements to create the table and view, then execute 
> the SELECT statement; it throws this runtime exception:
> {noformat}
> FAILED: RuntimeException 
> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: "map" or "list" is 
> expected at function SIZE, but "int" is found
> {noformat}
> {noformat} 
> CREATE TABLE t1( symptom STRING,  pattern ARRAY,  occurrence INT, index 
> INT);
> CREATE OR REPLACE VIEW v1 AS
> SELECT TRIM(pd.symptom) AS symptom, pd.index, pd.pattern, pd.occurrence, 
> pd.occurrence as cnt from t1 pd;
> SELECT pattern_data.symptom, pattern_data.index, pattern_data.occurrence, 
> pattern_data.cnt, size(pattern_data.pattern) as pattern_length, 
> pattern.pattern_id
> FROM v1 pattern_data LATERAL VIEW explode(pattern) pattern AS pattern_id;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10085) Lateral view on top of a view throws RuntimeException

2015-03-25 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380529#comment-14380529
 ] 

Aihua Xu commented on HIVE-10085:
-

[~ashutoshc] It seems you have worked on this area before. Could you help review the change? 

> Lateral view on top of a view throws RuntimeException
> -
>
> Key: HIVE-10085
> URL: https://issues.apache.org/jira/browse/HIVE-10085
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10085.patch
>
>
> Run the following SQL statements to create the table and view, then execute 
> the SELECT statement; it throws this runtime exception:
> {noformat}
> FAILED: RuntimeException 
> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: "map" or "list" is 
> expected at function SIZE, but "int" is found
> {noformat}
> {noformat} 
> CREATE TABLE t1( symptom STRING,  pattern ARRAY,  occurrence INT, index 
> INT);
> CREATE OR REPLACE VIEW v1 AS
> SELECT TRIM(pd.symptom) AS symptom, pd.index, pd.pattern, pd.occurrence, 
> pd.occurrence as cnt from t1 pd;
> SELECT pattern_data.symptom, pattern_data.index, pattern_data.occurrence, 
> pattern_data.cnt, size(pattern_data.pattern) as pattern_length, 
> pattern.pattern_id
> FROM v1 pattern_data LATERAL VIEW explode(pattern) pattern AS pattern_id;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10089) RCFile: lateral view explode caused ConcurrentModificationException

2015-03-25 Thread Selina Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380511#comment-14380511
 ] 

Selina Zhang commented on HIVE-10089:
-

Three conditions trigger this problem: 1. the table is stored as RCFile (tested 
on ORC/Avro/text, all good); 2. the map column and the exploded columns are both 
included in the projection list; 3. the map column has more than one entry. 

The reason HIVE-2540 does not fix this issue is that RCFile uses LazyBinaryMap 
and LazyBinaryArray, which are missing from the HIVE-2540 patch. 
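The failure mode itself is easy to reproduce in isolation: structurally modifying a HashMap while iterating it, roughly what happens when the lazy map is re-parsed mid-iteration, throws ConcurrentModificationException. A minimal standalone demonstration:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

/**
 * Minimal demonstration: a structural modification of a HashMap during
 * iteration trips the fail-fast iterator on its next step, producing
 * ConcurrentModificationException. Requires more than one entry so the
 * iteration takes a step after the modification (condition 3 above).
 */
public class CmeDemo {
    public static boolean triggersCme() {
        Map<String, String> m = new HashMap<>();
        m.put("a1", "b1");
        m.put("c1", "d1"); // more than one entry
        try {
            for (String key : m.keySet()) {
                m.remove(key); // structural modification mid-iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```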

> RCFile: lateral view explode caused ConcurrentModificationException
> ---
>
> Key: HIVE-10089
> URL: https://issues.apache.org/jira/browse/HIVE-10089
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Selina Zhang
>Assignee: Selina Zhang
>
> CREATE TABLE test_table123 (a INT, b MAP) STORED AS RCFILE;
> INSERT OVERWRITE TABLE test_table123 SELECT 1, MAP("a1", "b1", "c1", "d1") 
> FROM src LIMIT 1;
> The following query will lead to ConcurrentModificationException
> SELECT * FROM (SELECT b FROM test_table123) t1 LATERAL VIEW explode(b) x AS 
> b,c LIMIT 1;
> Failed with exception 
> java.io.IOException:java.util.ConcurrentModificationException



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10069) CBO (Calcite Return Path): Ambiguity table name causes problem in field trimmer [CBO Branch]

2015-03-25 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380507#comment-14380507
 ] 

Jesus Camacho Rodriguez commented on HIVE-10069:


[~jpullokkaran], can you check it? Thanks

> CBO (Calcite Return Path): Ambiguity table name causes problem in field 
> trimmer [CBO Branch]
> 
>
> Key: HIVE-10069
> URL: https://issues.apache.org/jira/browse/HIVE-10069
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: cbo-branch
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: cbo-branch
>
> Attachments: HIVE-10069.cbo.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10090) Add connection manager for Tephra

2015-03-25 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380488#comment-14380488
 ] 

Alan Gates commented on HIVE-10090:
---

I have not yet run this against a standalone instance of Tephra, but I have run 
it with the unit tests against an in-memory version of Tephra and it works.  To 
run with this rather than the no-transaction connection, just add 
{{-Dhive.metastore.hbase.connection.class=org.apache.hadoop.hive.metastore.hbase.TephraHBaseConnection}}
 to your mvn test command.

> Add connection manager for Tephra
> -
>
> Key: HIVE-10090
> URL: https://issues.apache.org/jira/browse/HIVE-10090
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: hbase-metastore-branch
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-10090.patch
>
>
> The task is to create an implementation of HBaseConnection that will use 
> Tephra for transaction management.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10090) Add connection manager for Tephra

2015-03-25 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-10090:
--
Attachment: HIVE-10090.patch

> Add connection manager for Tephra
> -
>
> Key: HIVE-10090
> URL: https://issues.apache.org/jira/browse/HIVE-10090
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: hbase-metastore-branch
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-10090.patch
>
>
> The task is to create an implementation of HBaseConnection that will use 
> Tephra for transaction management.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one

2015-03-25 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380477#comment-14380477
 ] 

Jason Dere commented on HIVE-9518:
--

+1 pending tests

> Implement MONTHS_BETWEEN aligned with Oracle one
> 
>
> Key: HIVE-9518
> URL: https://issues.apache.org/jira/browse/HIVE-9518
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Xiaobing Zhou
>Assignee: Alexander Pivovarov
> Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, 
> HIVE-9518.4.patch, HIVE-9518.5.patch
>
>
> This is used to track work to build an Oracle-like months_between. Here are 
> the semantics:
> MONTHS_BETWEEN returns the number of months between dates date1 and date2. If 
> date1 is later than date2, the result is positive. If date1 is earlier 
> than date2, the result is negative. If date1 and date2 are either the 
> same day of the month or both the last days of their months, the result is 
> always an integer. Otherwise Oracle Database calculates the fractional 
> portion of the result based on a 31-day month and considers the difference in 
> the time components of date1 and date2.
> Should accept date, timestamp and string arguments in the format '-MM-dd' 
> or '-MM-dd HH:mm:ss'. The time part should be ignored.
> The result should be rounded to 8 decimal places.
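A sketch of the semantics quoted above, ignoring time-of-day and the 8-decimal rounding; this is an illustration using java.time, not Hive's actual UDF:

```java
import java.time.LocalDate;

/**
 * Hedged sketch of Oracle-style MONTHS_BETWEEN for date-only inputs:
 * integer result for same day-of-month or both month-ends, otherwise a
 * fractional part based on a 31-day month.
 */
public class MonthsBetweenSketch {
    public static double monthsBetween(LocalDate d1, LocalDate d2) {
        boolean bothLastDays = d1.getDayOfMonth() == d1.lengthOfMonth()
                && d2.getDayOfMonth() == d2.lengthOfMonth();
        int wholeMonths = (d1.getYear() - d2.getYear()) * 12
                + (d1.getMonthValue() - d2.getMonthValue());
        if (d1.getDayOfMonth() == d2.getDayOfMonth() || bothLastDays) {
            return wholeMonths; // same day, or both last days: integer result
        }
        // otherwise the fractional part assumes a 31-day month
        return wholeMonths + (d1.getDayOfMonth() - d2.getDayOfMonth()) / 31.0;
    }
}
```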



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9976) Possible race condition in DynamicPartitionPruner for <200ms tasks

2015-03-25 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-9976:
-
Fix Version/s: (was: 1.0.1)

> Possible race condition in DynamicPartitionPruner for <200ms tasks
> --
>
> Key: HIVE-9976
> URL: https://issues.apache.org/jira/browse/HIVE-9976
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.0
>Reporter: Gopal V
>Assignee: Siddharth Seth
> Attachments: HIVE-9976.1.patch, llap_vertex_200ms.png
>
>
> Race condition in the DynamicPartitionPruner between 
> DynamicPartitionPruner::processVertex() and 
> DynamicPartitionpruner::addEvent() for tasks which respond with both the 
> result and success in a single heartbeat sequence.
> {code}
> 2015-03-16 07:05:01,589 ERROR [InputInitializer [Map 1] #0] 
> tez.DynamicPartitionPruner: Expecting: 1, received: 0
> 2015-03-16 07:05:01,590 ERROR [Dispatcher thread: Central] impl.VertexImpl: 
> Vertex Input: store_sales initializer failed, 
> vertex=vertex_1424502260528_1113_4_04 [Map 1]
> org.apache.tez.dag.app.dag.impl.AMUserCodeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Incorrect event count in 
> dynamic parition pruning
> {code}
> !llap_vertex_200ms.png!
> All 4 upstream vertices of Map 1 need to finish within ~200ms to trigger 
> this, which seems to be consistently happening with LLAP.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9976) Possible race condition in DynamicPartitionPruner for <200ms tasks

2015-03-25 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380451#comment-14380451
 ] 

Gunther Hagleitner commented on HIVE-9976:
--

other than the above looks good to me. like the extra comments and conditions 
you've put in!

> Possible race condition in DynamicPartitionPruner for <200ms tasks
> --
>
> Key: HIVE-9976
> URL: https://issues.apache.org/jira/browse/HIVE-9976
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.0
>Reporter: Gopal V
>Assignee: Siddharth Seth
> Fix For: 1.0.1
>
> Attachments: HIVE-9976.1.patch, llap_vertex_200ms.png
>
>
> Race condition in the DynamicPartitionPruner between 
> DynamicPartitionPruner::processVertex() and 
> DynamicPartitionpruner::addEvent() for tasks which respond with both the 
> result and success in a single heartbeat sequence.
> {code}
> 2015-03-16 07:05:01,589 ERROR [InputInitializer [Map 1] #0] 
> tez.DynamicPartitionPruner: Expecting: 1, received: 0
> 2015-03-16 07:05:01,590 ERROR [Dispatcher thread: Central] impl.VertexImpl: 
> Vertex Input: store_sales initializer failed, 
> vertex=vertex_1424502260528_1113_4_04 [Map 1]
> org.apache.tez.dag.app.dag.impl.AMUserCodeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Incorrect event count in 
> dynamic parition pruning
> {code}
> !llap_vertex_200ms.png!
> All 4 upstream vertices of Map 1 need to finish within ~200ms to trigger 
> this, which seems to be consistently happening with LLAP.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9976) Possible race condition in DynamicPartitionPruner for <200ms tasks

2015-03-25 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380446#comment-14380446
 ] 

Gunther Hagleitner commented on HIVE-9976:
--

does it make sense to make initialize in the pruner private now? (can't be used 
to init anymore - only from the constr). Also, the parameters aren't used 
anymore, right?

> Possible race condition in DynamicPartitionPruner for <200ms tasks
> --
>
> Key: HIVE-9976
> URL: https://issues.apache.org/jira/browse/HIVE-9976
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.0
>Reporter: Gopal V
>Assignee: Siddharth Seth
> Fix For: 1.0.1
>
> Attachments: HIVE-9976.1.patch, llap_vertex_200ms.png
>
>
> Race condition in the DynamicPartitionPruner between 
> DynamicPartitionPruner::processVertex() and 
> DynamicPartitionpruner::addEvent() for tasks which respond with both the 
> result and success in a single heartbeat sequence.
> {code}
> 2015-03-16 07:05:01,589 ERROR [InputInitializer [Map 1] #0] 
> tez.DynamicPartitionPruner: Expecting: 1, received: 0
> 2015-03-16 07:05:01,590 ERROR [Dispatcher thread: Central] impl.VertexImpl: 
> Vertex Input: store_sales initializer failed, 
> vertex=vertex_1424502260528_1113_4_04 [Map 1]
> org.apache.tez.dag.app.dag.impl.AMUserCodeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Incorrect event count in 
> dynamic parition pruning
> {code}
> !llap_vertex_200ms.png!
> All 4 upstream vertices of Map 1 need to finish within ~200ms to trigger 
> this, which seems to be consistently happening with LLAP.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9563) CBO(Calcite Return Path): Translate GB to Hive OP [CBO branch]

2015-03-25 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-9563:
-
Attachment: HIVE-9563.patch

> CBO(Calcite Return Path): Translate GB to Hive OP [CBO branch]
> --
>
> Key: HIVE-9563
> URL: https://issues.apache.org/jira/browse/HIVE-9563
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: cbo-branch
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Fix For: 1.2.0
>
> Attachments: HIVE-9563.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9976) Possible race condition in DynamicPartitionPruner for <200ms tasks

2015-03-25 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380438#comment-14380438
 ] 

Gunther Hagleitner commented on HIVE-9976:
--

it seems you're setting numexpectedevents to 0 first and then turn around and 
call decrement. Why not just set to -1? Also - why atomic integers? as far as i 
can tell all access to these maps is synchronized.

> Possible race condition in DynamicPartitionPruner for <200ms tasks
> --
>
> Key: HIVE-9976
> URL: https://issues.apache.org/jira/browse/HIVE-9976
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.0
>Reporter: Gopal V
>Assignee: Siddharth Seth
> Fix For: 1.0.1
>
> Attachments: HIVE-9976.1.patch, llap_vertex_200ms.png
>
>
> Race condition in the DynamicPartitionPruner between 
> DynamicPartitionPruner::processVertex() and 
> DynamicPartitionpruner::addEvent() for tasks which respond with both the 
> result and success in a single heartbeat sequence.
> {code}
> 2015-03-16 07:05:01,589 ERROR [InputInitializer [Map 1] #0] 
> tez.DynamicPartitionPruner: Expecting: 1, received: 0
> 2015-03-16 07:05:01,590 ERROR [Dispatcher thread: Central] impl.VertexImpl: 
> Vertex Input: store_sales initializer failed, 
> vertex=vertex_1424502260528_1113_4_04 [Map 1]
> org.apache.tez.dag.app.dag.impl.AMUserCodeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Incorrect event count in 
> dynamic parition pruning
> {code}
> !llap_vertex_200ms.png!
> All 4 upstream vertices of Map 1 need to finish within ~200ms to trigger 
> this, which seems to be consistently happening with LLAP.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9976) Possible race condition in DynamicPartitionPruner for <200ms tasks

2015-03-25 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380426#comment-14380426
 ] 

Gunther Hagleitner commented on HIVE-9976:
--

i see you've fixed calling close consistently on the data input stream. maybe 
use try{}finally there?
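As an illustration of the suggestion, a standalone sketch in which the stream is closed in a finally block, so close() runs even if the read throws (names are illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

/**
 * Minimal sketch of the try/finally pattern for a data input stream:
 * the finally block guarantees close() runs whether or not read() throws.
 */
public class SafeClose {
    public static int readFirstByte(byte[] data) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        try {
            try {
                return in.read();
            } finally {
                in.close(); // runs even if read() throws
            }
        } catch (IOException e) {
            throw new AssertionError(e); // in-memory stream: cannot happen
        }
    }
}
```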

> Possible race condition in DynamicPartitionPruner for <200ms tasks
> --
>
> Key: HIVE-9976
> URL: https://issues.apache.org/jira/browse/HIVE-9976
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.0
>Reporter: Gopal V
>Assignee: Siddharth Seth
> Fix For: 1.0.1
>
> Attachments: HIVE-9976.1.patch, llap_vertex_200ms.png
>
>
> Race condition in the DynamicPartitionPruner between 
> DynamicPartitionPruner::processVertex() and 
> DynamicPartitionpruner::addEvent() for tasks which respond with both the 
> result and success in a single heartbeat sequence.
> {code}
> 2015-03-16 07:05:01,589 ERROR [InputInitializer [Map 1] #0] 
> tez.DynamicPartitionPruner: Expecting: 1, received: 0
> 2015-03-16 07:05:01,590 ERROR [Dispatcher thread: Central] impl.VertexImpl: 
> Vertex Input: store_sales initializer failed, 
> vertex=vertex_1424502260528_1113_4_04 [Map 1]
> org.apache.tez.dag.app.dag.impl.AMUserCodeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Incorrect event count in 
> dynamic parition pruning
> {code}
> !llap_vertex_200ms.png!
> All 4 upstream vertices of Map 1 need to finish within ~200ms to trigger 
> this, which seems to be consistently happening with LLAP.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9839) HiveServer2 leaks OperationHandle on async queries which fail at compile phase

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380428#comment-14380428
 ] 

Hive QA commented on HIVE-9839:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707139/HIVE-9839.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8338 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3150/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3150/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3150/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707139 - PreCommit-HIVE-TRUNK-Build

> HiveServer2 leaks OperationHandle on async queries which fail at compile phase
> --
>
> Key: HIVE-9839
> URL: https://issues.apache.org/jira/browse/HIVE-9839
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.14.0, 0.13.1, 1.0.0, 1.1.0
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>Priority: Critical
> Attachments: HIVE-9839.patch, HIVE-9839.patch, HIVE-9839.patch, 
> HIVE-9839.patch, OperationHandleMonitor.java
>
>
> Use beeline to connect to HiveServer2 and type the following:
> drop table if exists table_not_exists;
> select * from table_not_exists;
> An OperationHandle object will stay in HiveServer2's memory forever, even 
> after quitting beeline.
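The leak pattern described above can be sketched with a toy registry (the class and method names below are hypothetical, not HiveServer2's actual OperationManager API): the handle is registered before compilation starts, so a compile-phase failure must also unregister it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the leak: async execution registers the handle first,
// so if compilation throws, the handle must be removed or it lives forever.
class OperationRegistrySketch {
    private final Map<String, Object> liveOperations = new ConcurrentHashMap<>();

    void executeAsync(String handle, Runnable compileStep) {
        liveOperations.put(handle, new Object()); // registered up front
        try {
            compileStep.run();                    // may fail at compile phase
        } catch (RuntimeException e) {
            liveOperations.remove(handle);        // the fix: unregister on failure
            throw e;
        }
    }

    int liveCount() { return liveOperations.size(); }
}
```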





[jira] [Commented] (HIVE-10087) Beeline's --silent option should suppress query from being echoed when running with -f option

2015-03-25 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380419#comment-14380419
 ] 

Naveen Gangam commented on HIVE-10087:
--

Thanks Xuefu, I will take a look at where this comes from. I suspect it might be 
from jline printing an empty buffer that holds the prompt & the query string.

> Beeline's --silent option should suppress query from being echoed when 
> running with -f option
> -
>
> Key: HIVE-10087
> URL: https://issues.apache.org/jira/browse/HIVE-10087
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 0.13.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-10087.patch
>
>
> The {{-e}} and the {{-f}} options behave differently. 
> {code}
> beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
> --silent=true -f select.sql
> 0: jdbc:hive2://localhost:1/default> select * from sample_07 limit 5;
> --
> 00- All Occupations 134354250 40690
> 11- Management occupations 6003930 96150
> 11-1011 Chief executives 299160 151370
> 11-1021 General and operations managers 1655410 103780
> 11-1031 Legislators 61110 33880
> --
> beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
> --silent=true -e "select * from sample_07 limit 5;"
> --
> 00-   All Occupations 134354250   40690
> 11-   Management occupations  6003930 96150
> 11-1011   Chief executives299160  151370
> 11-1021   General and operations managers 1655410 103780
> 11-1031   Legislators 61110   33880
> --
> {code}





[jira] [Commented] (HIVE-9486) Use session classloader instead of application loader

2015-03-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380424#comment-14380424
 ] 

Ashutosh Chauhan commented on HIVE-9486:


The changes in JobState were made for completeness' sake (to have a uniform way of 
class loading in Hive), no other particular reason AFAIK. I am fine with just 
using Class.forName() there instead, since this call happens on a remote, 
short-lived launcher process, where the issue described here is likely irrelevant.
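For illustration, the two loading strategies being weighed read roughly like this (a sketch, not Hive's actual utility code; in Hive the session/context classloader is what can see jars registered at runtime, e.g. via ADD JAR):

```java
// Sketch contrasting the two strategies discussed above: plain Class.forName
// resolves against the caller's defining loader, while the session (context)
// classloader can also see classes added to the session after startup.
class ClassLoadingSketch {
    static Class<?> viaApplicationLoader(String name) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(name, e);
        }
    }

    static Class<?> viaSessionLoader(String name) {
        try {
            ClassLoader sessionLoader = Thread.currentThread().getContextClassLoader();
            return Class.forName(name, true, sessionLoader);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(name, e);
        }
    }
}
```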

> Use session classloader instead of application loader
> -
>
> Key: HIVE-9486
> URL: https://issues.apache.org/jira/browse/HIVE-9486
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-9486.1.patch.txt, HIVE-9486.2.patch.txt
>
>
> From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html
> Looks reasonable





[jira] [Updated] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one

2015-03-25 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9518:
--
Description: 
This is used to track the work to build an Oracle-like months_between. Here are 
the semantics:
MONTHS_BETWEEN returns the number of months between dates date1 and date2. If 
date1 is later than date2, the result is positive. If date1 is earlier than 
date2, the result is negative. If date1 and date2 are either the same days of 
the month or both last days of months, the result is always an integer. 
Otherwise Oracle Database calculates the fractional portion of the result based 
on a 31-day month and considers the difference in the time components of date1 
and date2.
It should accept date, timestamp and string arguments in the format 'yyyy-MM-dd' 
or 'yyyy-MM-dd HH:mm:ss'. The time part should be ignored.
The result should be rounded to 8 decimal places.

  was:
This is used to track work to build Oracle like months_between. Here's 
semantics:
MONTHS_BETWEEN returns number of months between dates date1 and date2. If date1 
is later than date2, then the result is positive. If date1 is earlier than 
date2, then the result is negative. If date1 and date2 are either the same days 
of the month or both last days of months, then the result is always an integer. 
Otherwise Oracle Database calculates the fractional portion of the result based 
on a 31-day month and considers the difference in time components date1 and 
date2.


> Implement MONTHS_BETWEEN aligned with Oracle one
> 
>
> Key: HIVE-9518
> URL: https://issues.apache.org/jira/browse/HIVE-9518
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Xiaobing Zhou
>Assignee: Alexander Pivovarov
> Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, 
> HIVE-9518.4.patch, HIVE-9518.5.patch
>
>
> This is used to track work to build Oracle like months_between. Here's 
> semantics:
> MONTHS_BETWEEN returns number of months between dates date1 and date2. If 
> date1 is later than date2, then the result is positive. If date1 is earlier 
> than date2, then the result is negative. If date1 and date2 are either the 
> same days of the month or both last days of months, then the result is always 
> an integer. Otherwise Oracle Database calculates the fractional portion of 
> the result based on a 31-day month and considers the difference in time 
> components date1 and date2.
> Should accept date, timestamp and string arguments in the format 'yyyy-MM-dd' 
> or 'yyyy-MM-dd HH:mm:ss'. The time part should be ignored.
> The result should be rounded to 8 decimal places.
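A straightforward reading of these semantics can be sketched as follows (a hedged illustration, assuming java.time types; this is not Hive's actual GenericUDFMonthsBetween implementation, and per the description above the time part of the input is ignored):

```java
import java.time.LocalDate;
import java.time.YearMonth;

// Sketch of the described semantics: integral result when the days of month
// match or both dates are month-ends; otherwise a 31-day-month fraction,
// rounded to 8 decimal places. Time-of-day is ignored, so LocalDate suffices.
class MonthsBetweenSketch {
    static double monthsBetween(LocalDate d1, LocalDate d2) {
        int whole = (d1.getYear() - d2.getYear()) * 12
                  + (d1.getMonthValue() - d2.getMonthValue());
        boolean sameDay = d1.getDayOfMonth() == d2.getDayOfMonth();
        boolean bothLastDays =
                d1.getDayOfMonth() == YearMonth.from(d1).lengthOfMonth()
             && d2.getDayOfMonth() == YearMonth.from(d2).lengthOfMonth();
        if (sameDay || bothLastDays) {
            return whole;  // always an integer in these cases
        }
        double frac = (d1.getDayOfMonth() - d2.getDayOfMonth()) / 31.0;
        return Math.round((whole + frac) * 1e8) / 1e8;  // 8 decimal places
    }
}
```

For example, two mid-month dates two months apart yield exactly 2, while dates with differing days of month pick up the 31-day-month fraction.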





[jira] [Updated] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one

2015-03-25 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9518:
--
Component/s: UDF

> Implement MONTHS_BETWEEN aligned with Oracle one
> 
>
> Key: HIVE-9518
> URL: https://issues.apache.org/jira/browse/HIVE-9518
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Xiaobing Zhou
>Assignee: Alexander Pivovarov
> Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, 
> HIVE-9518.4.patch, HIVE-9518.5.patch
>
>
> This is used to track work to build Oracle like months_between. Here's 
> semantics:
> MONTHS_BETWEEN returns number of months between dates date1 and date2. If 
> date1 is later than date2, then the result is positive. If date1 is earlier 
> than date2, then the result is negative. If date1 and date2 are either the 
> same days of the month or both last days of months, then the result is always 
> an integer. Otherwise Oracle Database calculates the fractional portion of 
> the result based on a 31-day month and considers the difference in time 
> components date1 and date2.





[jira] [Updated] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one

2015-03-25 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9518:
--
Issue Type: Improvement  (was: Bug)

> Implement MONTHS_BETWEEN aligned with Oracle one
> 
>
> Key: HIVE-9518
> URL: https://issues.apache.org/jira/browse/HIVE-9518
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Xiaobing Zhou
>Assignee: Alexander Pivovarov
> Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, 
> HIVE-9518.4.patch, HIVE-9518.5.patch
>
>
> This is used to track work to build Oracle like months_between. Here's 
> semantics:
> MONTHS_BETWEEN returns number of months between dates date1 and date2. If 
> date1 is later than date2, then the result is positive. If date1 is earlier 
> than date2, then the result is negative. If date1 and date2 are either the 
> same days of the month or both last days of months, then the result is always 
> an integer. Otherwise Oracle Database calculates the fractional portion of 
> the result based on a 31-day month and considers the difference in time 
> components date1 and date2.





[jira] [Commented] (HIVE-10073) Runtime exception when querying HBase with Spark [Spark Branch]

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380408#comment-14380408
 ] 

Hive QA commented on HIVE-10073:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707243/HIVE-10073.1-spark.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7644 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_memcheck
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/805/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/805/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-805/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707243 - PreCommit-HIVE-SPARK-Build

> Runtime exception when querying HBase with Spark [Spark Branch]
> ---
>
> Key: HIVE-10073
> URL: https://issues.apache.org/jira/browse/HIVE-10073
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: spark-branch
>
> Attachments: HIVE-10073.1-spark.patch
>
>
> When querying HBase with Spark, we got 
> {noformat}
>  Caused by: java.lang.IllegalArgumentException: Must specify table name
> at 
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188)
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:276)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:266)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:331)
> {noformat}
> But it works fine for MapReduce.





[jira] [Commented] (HIVE-9976) Possible race condition in DynamicPartitionPruner for <200ms tasks

2015-03-25 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380407#comment-14380407
 ] 

Gunther Hagleitner commented on HIVE-9976:
--

Not your fault - but there are 2 paths through HiveSplitGenerator. The class is 
used once without calling init and once after being properly init'd. The reason is 
that some other code needs to use the "group splits" method. Since you've moved 
init to the constructor now, this has gotten even uglier. Could you move the split 
grouper methods to a separate (static) util class and leave the pruner to just 
prune?

Also - I think you've moved the initialization of the dynamic pruner to the 
constructor of the input initializer in order to not miss any events. Can you add 
a comment to the code explaining this?

Very cool to see a real unit test  :-) thanks.
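The "initialize in the constructor so no events are missed" idea, combined with waiting for the expected event count instead of failing fast, can be sketched generically (hypothetical names; the real DynamicPartitionPruner logic is considerably more involved):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: events are accepted from construction time onward, and
// the processing side blocks until every expected event has arrived, so a task
// that reports its result and its success in one heartbeat can no longer race
// the "Expecting: N, received: M" check.
class EventBufferSketch {
    private final Queue<String> events = new ConcurrentLinkedQueue<>();
    private final CountDownLatch expected;

    EventBufferSketch(int expectedEvents) {        // ready before any event can fire
        this.expected = new CountDownLatch(expectedEvents);
    }

    void addEvent(String payload) {                // called from the heartbeat path
        events.add(payload);
        expected.countDown();
    }

    int processVertex() {                          // waits instead of failing fast
        try {
            expected.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return events.size();
    }
}
```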

> Possible race condition in DynamicPartitionPruner for <200ms tasks
> --
>
> Key: HIVE-9976
> URL: https://issues.apache.org/jira/browse/HIVE-9976
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.0
>Reporter: Gopal V
>Assignee: Siddharth Seth
> Fix For: 1.0.1
>
> Attachments: HIVE-9976.1.patch, llap_vertex_200ms.png
>
>
> Race condition in the DynamicPartitionPruner between 
> DynamicPartitionPruner::processVertex() and 
> DynamicPartitionpruner::addEvent() for tasks which respond with both the 
> result and success in a single heartbeat sequence.
> {code}
> 2015-03-16 07:05:01,589 ERROR [InputInitializer [Map 1] #0] 
> tez.DynamicPartitionPruner: Expecting: 1, received: 0
> 2015-03-16 07:05:01,590 ERROR [Dispatcher thread: Central] impl.VertexImpl: 
> Vertex Input: store_sales initializer failed, 
> vertex=vertex_1424502260528_1113_4_04 [Map 1]
> org.apache.tez.dag.app.dag.impl.AMUserCodeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Incorrect event count in 
> dynamic parition pruning
> {code}
> !llap_vertex_200ms.png!
> All 4 upstream vertices of Map 1 need to finish within ~200ms to trigger 
> this, which seems to be consistently happening with LLAP.





[jira] [Updated] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one

2015-03-25 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9518:
--
Attachment: HIVE-9518.5.patch

patch #5
refactored the code

> Implement MONTHS_BETWEEN aligned with Oracle one
> 
>
> Key: HIVE-9518
> URL: https://issues.apache.org/jira/browse/HIVE-9518
> Project: Hive
>  Issue Type: Bug
>Reporter: Xiaobing Zhou
>Assignee: Alexander Pivovarov
> Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, 
> HIVE-9518.4.patch, HIVE-9518.5.patch
>
>
> This is used to track work to build Oracle like months_between. Here's 
> semantics:
> MONTHS_BETWEEN returns number of months between dates date1 and date2. If 
> date1 is later than date2, then the result is positive. If date1 is earlier 
> than date2, then the result is negative. If date1 and date2 are either the 
> same days of the month or both last days of months, then the result is always 
> an integer. Otherwise Oracle Database calculates the fractional portion of 
> the result based on a 31-day month and considers the difference in time 
> components date1 and date2.





[jira] [Updated] (HIVE-10085) Lateral view on top of a view throws RuntimeException

2015-03-25 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10085:

Attachment: HIVE-10085.patch

Currently, when the size of the RowSchema and the number of columns after pruning 
are the same, we don't update the SEL operator info in 
ColumnPrunerLateralViewForwardProc. However, the columns may still get reordered, 
which requires updating the SEL operator as well; this patch addresses that.

> Lateral view on top of a view throws RuntimeException
> -
>
> Key: HIVE-10085
> URL: https://issues.apache.org/jira/browse/HIVE-10085
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10085.patch
>
>
> Run the following SQL statements to create the table and view, then execute the 
> SELECT statement. It will throw this runtime exception:
> {noformat}
> FAILED: RuntimeException 
> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: "map" or "list" is 
> expected at function SIZE, but "int" is found
> {noformat}
> {noformat} 
> CREATE TABLE t1( symptom STRING,  pattern ARRAY<STRING>,  occurrence INT, index 
> INT);
> CREATE OR REPLACE VIEW v1 AS
> SELECT TRIM(pd.symptom) AS symptom, pd.index, pd.pattern, pd.occurrence, 
> pd.occurrence as cnt from t1 pd;
> SELECT pattern_data.symptom, pattern_data.index, pattern_data.occurrence, 
> pattern_data.cnt, size(pattern_data.pattern) as pattern_length, 
> pattern.pattern_id
> FROM v1 pattern_data LATERAL VIEW explode(pattern) pattern AS pattern_id;
> {noformat}





[jira] [Updated] (HIVE-9780) Add another level of explain for RDBMS audience

2015-03-25 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9780:
--
Attachment: HIVE-9780.03.patch

> Add another level of explain for RDBMS audience
> ---
>
> Key: HIVE-9780
> URL: https://issues.apache.org/jira/browse/HIVE-9780
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-9780.01.patch, HIVE-9780.02.patch, 
> HIVE-9780.03.patch
>
>
> Current Hive explain (default) is targeted at an MR audience. We need a new 
> level of explain plan targeted at an RDBMS audience. The explain requires the 
> following:
> 1) The focus needs to be on what part of the query is being executed, rather 
> than the internals of the engines.
> 2) There needs to be a clearly readable tree of operations.
> 3) Examples - The table scan should mention the table being scanned, the Sarg, 
> the size of the table, and the expected cardinality after the Sarg'ed read. The 
> join should mention the table being joined with and the join condition. The 
> aggregate should mention the columns in the group-by. 





[jira] [Commented] (HIVE-10087) Beeline's --silent option should suppress query from being echoed when running with -f option

2015-03-25 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380336#comment-14380336
 ] 

Xuefu Zhang commented on HIVE-10087:


Patch looks good. One minor thing: I noticed there is a blank line in the 
console output for -f when --silent=true. Is there a way to get rid of that?

> Beeline's --silent option should suppress query from being echoed when 
> running with -f option
> -
>
> Key: HIVE-10087
> URL: https://issues.apache.org/jira/browse/HIVE-10087
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 0.13.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-10087.patch
>
>
> The {{-e}} and the {{-f}} options behave differently. 
> {code}
> beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
> --silent=true -f select.sql
> 0: jdbc:hive2://localhost:1/default> select * from sample_07 limit 5;
> --
> 00- All Occupations 134354250 40690
> 11- Management occupations 6003930 96150
> 11-1011 Chief executives 299160 151370
> 11-1021 General and operations managers 1655410 103780
> 11-1031 Legislators 61110 33880
> --
> beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
> --silent=true -e "select * from sample_07 limit 5;"
> --
> 00-   All Occupations 134354250   40690
> 11-   Management occupations  6003930 96150
> 11-1011   Chief executives299160  151370
> 11-1021   General and operations managers 1655410 103780
> 11-1031   Legislators 61110   33880
> --
> {code}





[jira] [Commented] (HIVE-10078) Optionally allow logging of records processed in fixed intervals

2015-03-25 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380323#comment-14380323
 ] 

Prasanth Jayachandran commented on HIVE-10078:
--

[~hagleitn] Can you take a look at the failures? I suspect it's because of 
hiveconf.getLongVar(). The default value should be "0L" instead of "0", I guess. 
Here is the exception:
{code}
Caused by: java.lang.RuntimeException: Map operator initialization failed
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:140)
... 23 more
Caused by: java.lang.AssertionError: hive.log.every.n.records
at org.apache.hadoop.hive.conf.HiveConf.getLongVar(HiveConf.java:2408)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:432)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:119)
... 23 more
{code}
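The suspected failure mode can be sketched with a toy config class (names hypothetical; HiveConf's real getLongVar validates the variable's declared default type, which is what the AssertionError above points at):

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of the suspicion above: a long-typed getter that asserts the
// registered default is actually a Long, so defining the default as "0"
// (an Integer) instead of "0L" trips the AssertionError at lookup time.
class TypedConfSketch {
    private final Map<String, Object> defaults = new HashMap<>();

    void define(String name, Object defaultValue) {
        defaults.put(name, defaultValue);
    }

    long getLongVar(String name) {
        Object d = defaults.get(name);
        if (!(d instanceof Long)) {
            throw new AssertionError(name);  // mirrors HiveConf.getLongVar's check
        }
        return (Long) d;
    }
}
```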


> Optionally allow logging of records processed in fixed intervals
> 
>
> Key: HIVE-10078
> URL: https://issues.apache.org/jira/browse/HIVE-10078
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-10078.1.patch
>
>
> Tasks today log progress (records in/records out) on an exponential scale (1, 
> 10, 100, ...). Sometimes it's helpful to be able to switch to fixed interval. 
> That can help debugging certain issues that look like a hang, etc.





[jira] [Commented] (HIVE-9674) *DropPartitionEvent should handle partition-sets.

2015-03-25 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380320#comment-14380320
 ] 

Mithun Radhakrishnan commented on HIVE-9674:


Sush, could you please review this one? I'd like to avoid another rebase.

> *DropPartitionEvent should handle partition-sets.
> -
>
> Key: HIVE-9674
> URL: https://issues.apache.org/jira/browse/HIVE-9674
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-9674.2.patch, HIVE-9736.3.patch, HIVE-9736.4.patch
>
>
> Dropping a set of N partitions from a table currently results in N 
> DropPartitionEvents (and N PreDropPartitionEvents) being fired serially. This 
> is wasteful, especially so for large N. It also makes it impossible to even 
> try to run authorization-checks on all partitions in a batch.
> Taking the cue from HIVE-9609, we should compose an {{Iterable}} 
> in the event, and expose them via an {{Iterator}}.
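The batched-event shape proposed above can be sketched like this (a hypothetical class for illustration, not the actual metastore API): one event carries all dropped partitions and exposes them through an Iterator, instead of firing N single-partition events.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: a single drop event wrapping an Iterable of partitions,
// so listeners (and authorization checks) can walk the whole batch at once.
class DropPartitionsEventSketch<P> {
    private final Iterable<P> partitions;

    DropPartitionsEventSketch(Iterable<P> partitions) {
        this.partitions = partitions;
    }

    Iterator<P> getPartitionIterator() {
        return partitions.iterator();
    }
}
```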





[jira] [Updated] (HIVE-10087) Beeline's --silent option should suppress query from being echoed when running with -f option

2015-03-25 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-10087:
-
Attachment: HIVE-10087.patch

The following is the output with the proposed fix, with --silent=true and with 
--silent=false (the query is echoed back):
{code}
$ beeline -u jdbc:hive2://localhost.com:1/default --showHeader=false 
--silent=true -f query.sql 

+---++--+
| true  | t  |
| false | f  |
| TRUE  | T  |
| FALSE | F  |
| ZERO  | 0  |
+---++--+

$ beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
--silent=false -f query.sql 
0: jdbc:hive2://localhost> select * from booleantest limit 5;
+---++--+
| true  | t  |
| false | f  |
| TRUE  | T  |
| FALSE | F  |
| ZERO  | 0  |
+---++--+
5 rows selected (1.087 seconds)
0: jdbc:hive2://localhost> 
Closing: 0: jdbc:hive2://localhost:1/default
{code}


> Beeline's --silent option should suppress query from being echoed when 
> running with -f option
> -
>
> Key: HIVE-10087
> URL: https://issues.apache.org/jira/browse/HIVE-10087
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 0.13.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-10087.patch
>
>
> The {{-e}} and the {{-f}} options behave differently. 
> {code}
> beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
> --silent=true -f select.sql
> 0: jdbc:hive2://localhost:1/default> select * from sample_07 limit 5;
> --
> 00- All Occupations 134354250 40690
> 11- Management occupations 6003930 96150
> 11-1011 Chief executives 299160 151370
> 11-1021 General and operations managers 1655410 103780
> 11-1031 Legislators 61110 33880
> --
> beeline -u jdbc:hive2://localhost:1/default --showHeader=false 
> --silent=true -e "select * from sample_07 limit 5;"
> --
> 00-   All Occupations 134354250   40690
> 11-   Management occupations  6003930 96150
> 11-1011   Chief executives299160  151370
> 11-1021   General and operations managers 1655410 103780
> 11-1031   Legislators 61110   33880
> --
> {code}





[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380246#comment-14380246
 ] 

Hive QA commented on HIVE-9937:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707114/HIVE-9937.06.patch

{color:green}SUCCESS:{color} +1 8342 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3149/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3149/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3149/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707114 - PreCommit-HIVE-TRUNK-Build

> LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new 
> Vectorized Map Join
> --
>
> Key: HIVE-9937
> URL: https://issues.apache.org/jira/browse/HIVE-9937
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-9937.01.patch, HIVE-9937.02.patch, 
> HIVE-9937.03.patch, HIVE-9937.04.patch, HIVE-9937.05.patch, HIVE-9937.06.patch
>
>






[jira] [Commented] (HIVE-10073) Runtime exception when querying HBase with Spark [Spark Branch]

2015-03-25 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380213#comment-14380213
 ] 

Jimmy Xiang commented on HIVE-10073:


Attached a patch that invokes checkOutputSpecs for Spark.
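The general idea can be sketched as follows (a minimal illustration, not the patch itself; the configuration key below is an assumption based on the stack trace): validate the output configuration eagerly, as MapReduce's OutputFormat.checkOutputSpecs path does, so the missing-table-name error surfaces before anything is written.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: fail fast when the output table name is missing, mirroring the
// "Must specify table name" check that TableOutputFormat.setConf performs.
// The key name below is an assumption for illustration.
class OutputSpecCheckSketch {
    static final String OUTPUT_TABLE = "hbase.mapred.outputtable";

    static void checkOutputSpecs(Map<String, String> conf) {
        if (conf.get(OUTPUT_TABLE) == null) {
            throw new IllegalArgumentException("Must specify table name");
        }
    }
}
```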

> Runtime exception when querying HBase with Spark [Spark Branch]
> ---
>
> Key: HIVE-10073
> URL: https://issues.apache.org/jira/browse/HIVE-10073
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: spark-branch
>
> Attachments: HIVE-10073.1-spark.patch
>
>
> When querying HBase with Spark, we got 
> {noformat}
>  Caused by: java.lang.IllegalArgumentException: Must specify table name
> at 
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188)
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:276)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:266)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:331)
> {noformat}
> But it works fine for MapReduce.





[jira] [Updated] (HIVE-10073) Runtime exception when querying HBase with Spark [Spark Branch]

2015-03-25 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-10073:
---
Attachment: HIVE-10073.1-spark.patch

> Runtime exception when querying HBase with Spark [Spark Branch]
> ---
>
> Key: HIVE-10073
> URL: https://issues.apache.org/jira/browse/HIVE-10073
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: HIVE-10073.1-spark.patch
>
>
> When querying HBase with Spark, we got 
> {noformat}
>  Caused by: java.lang.IllegalArgumentException: Must specify table name
> at 
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188)
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:276)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:266)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:331)
> {noformat}
> But it works fine for MapReduce.





[jira] [Assigned] (HIVE-10073) Runtime exception when querying HBase with Spark [Spark Branch]

2015-03-25 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reassigned HIVE-10073:
--

Assignee: Jimmy Xiang

> Runtime exception when querying HBase with Spark [Spark Branch]
> ---
>
> Key: HIVE-10073
> URL: https://issues.apache.org/jira/browse/HIVE-10073
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> When querying HBase with Spark, we got 
> {noformat}
>  Caused by: java.lang.IllegalArgumentException: Must specify table name
> at 
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188)
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:276)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:266)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:331)
> {noformat}
> But it works fine for MapReduce.





[jira] [Updated] (HIVE-10083) SMBJoin fails in case one table is uninitialized

2015-03-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alain Schröder updated HIVE-10083:
--
Description: 
We experience an IndexOutOfBoundsException in an SMBJoin in the case where one of
the tables used for the JOIN is uninitialized. Everything works if both are
uninitialized or both are initialized.

{code}
2015-03-24 09:12:58,967 ERROR [main]: ql.Driver (SessionState.java:printError(545)) - FAILED: IndexOutOfBoundsException Index: 0, Size: 0
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.fillMappingBigTableBucketFileNameToSmallTableBucketFileNames(AbstractBucketJoinProc.java:486)
at org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.convertMapJoinToBucketMapJoin(AbstractBucketJoinProc.java:429)
at org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToBucketMapJoin(AbstractSMBJoinProc.java:540)
at org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToSMBJoin(AbstractSMBJoinProc.java:549)
at org.apache.hadoop.hive.ql.optimizer.SortedMergeJoinProc.process(SortedMergeJoinProc.java:51)
[...]
{code}

Simplest way to reproduce:

{code}
SET hive.enforce.sorting=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition=true;
SET mapreduce.reduce.import.limit=-1;

SET hive.optimize.bucketmapjoin=true;
SET hive.optimize.bucketmapjoin.sortedmerge=true;
SET hive.auto.convert.join=true;
SET hive.auto.convert.sortmerge.join=true;
SET hive.auto.convert.sortmerge.join.noconditionaltask=true;

CREATE DATABASE IF NOT EXISTS tmp;
USE tmp;

CREATE  TABLE `test1` (
  `foo` bigint )
CLUSTERED BY (
  foo)
SORTED BY (
  foo ASC)
INTO 384 BUCKETS
stored as orc;

CREATE  TABLE `test2`(
  `foo` bigint )
CLUSTERED BY (
  foo)
SORTED BY (
  foo ASC)
INTO 384 BUCKETS
STORED AS ORC;

-- Initialize ONE table of the two tables with any data.
INSERT INTO TABLE test1 SELECT foo FROM table_with_some_content LIMIT 100;

SELECT t1.foo, t2.foo
FROM test1 t1 INNER JOIN test2 t2 
ON (t1.foo = t2.foo);
{code}

I took a look at the procedure
fillMappingBigTableBucketFileNameToSmallTableBucketFileNames in
AbstractBucketJoinProc.java and it does not seem to have changed between our MapR
Hive 0.13 and the current snapshot, so this should also be an error in the current
version.
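A minimal sketch of the indexing failure and one possible guard. This is hypothetical Java, not the actual AbstractBucketJoinProc code; the method and variable names are invented. The mapping code effectively indexes into the small table's list of bucket files, and for an uninitialized (empty) table that list is empty, so get(0) throws exactly as in the trace above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BucketMappingSketch {
    // Hypothetical guard: return an empty mapping instead of indexing an empty list.
    static List<String> bucketFilesFor(List<String> smallTblBucketFiles, int bigTableBucket) {
        if (smallTblBucketFiles.isEmpty()) {
            return Collections.emptyList(); // uninitialized table: nothing to map
        }
        // Wrap the big-table bucket index around the small table's bucket count.
        int idx = bigTableBucket % smallTblBucketFiles.size();
        return Collections.singletonList(smallTblBucketFiles.get(idx));
    }

    public static void main(String[] args) {
        List<String> empty = new ArrayList<>();
        List<String> buckets = List.of("000000_0", "000001_0");
        System.out.println(bucketFilesFor(empty, 0));   // [] rather than IndexOutOfBoundsException
        System.out.println(bucketFilesFor(buckets, 3)); // [000001_0]
    }
}
```

Whether the real fix should skip the bucket-map-join conversion entirely for an empty table, rather than map to nothing, is a design question for the patch; this only illustrates where the unchecked get() bites.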



> SMBJoin fails in case one table is uninitialized
> 
>
> Key

[jira] [Commented] (HIVE-1575) get_json_object does not support JSON array at the root level

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380071#comment-14380071
 ] 

Hive QA commented on HIVE-1575:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707117/HIVE-1575.4.patch

{color:green}SUCCESS:{color} +1 8339 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3148/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3148/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3148/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707117 - PreCommit-HIVE-TRUNK-Build

> get_json_object does not support JSON array at the root level
> -
>
> Key: HIVE-1575
> URL: https://issues.apache.org/jira/browse/HIVE-1575
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 0.7.0
>Reporter: Steven Wong
>Assignee: Alexander Pivovarov
> Attachments: 
> 0001-Updated-UDFJson-to-allow-arrays-as-a-root-object.patch, 
> HIVE-1575.2.patch, HIVE-1575.3.patch, HIVE-1575.4.patch
>
>
> Currently, get_json_object(json_txt, path) always returns null if json_txt is 
> not a JSON object (e.g. is a JSON array) at the root level.
> I have a table column of JSON arrays at the root level, but I can't parse it 
> because of that.
> get_json_object should accept any JSON value (string, number, object, array, 
> true, false, null), not just object, at the root level. In other words, it 
> should behave as if it were named get_json_value or simply get_json.
> Per the JSON RFC, an array is indeed a legal top-level JSON text.
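The requested behavior can be illustrated with a root-type probe. This is not the UDFJson implementation, just a hypothetical sketch of the distinction the patch removes: today the UDF only proceeds when the root is an object, while the RFC also permits arrays (and, in later revisions, scalars) at the top level.

```java
public class JsonRootCheck {
    // Classify the root of a JSON text by its first non-whitespace character.
    static String rootType(String json) {
        String t = json.trim();
        if (t.isEmpty()) return "empty";
        char c = t.charAt(0);
        if (c == '{') return "object";
        if (c == '[') return "array";  // legal top-level JSON text per the RFC
        return "scalar";
    }

    public static void main(String[] args) {
        System.out.println(rootType("{\"a\":1}")); // object: accepted today
        System.out.println(rootType("[1,2,3]"));   // array: currently mapped to null
    }
}
```

Under the patch, a path like {{$[0]}} against an array-rooted document would be expected to resolve instead of returning null; the exact supported path syntax is defined by the UDF, not this sketch.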





[jira] [Updated] (HIVE-10083) SMBJoin fails in case one table is uninitialized

2015-03-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alain Schröder updated HIVE-10083:
--
Affects Version/s: (was: 0.13.1)
   0.13.0

> SMBJoin fails in case one table is uninitialized
> 
>
> Key: HIVE-10083
> URL: https://issues.apache.org/jira/browse/HIVE-10083
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 0.13.0
> Environment: MapR Hive 0.13
>Reporter: Alain Schröder
>Priority: Minor
>
> We experience an IndexOutOfBoundsException in an SMBJoin in the case where one of
> the tables used for the JOIN is uninitialized. Everything works if both are
> uninitialized or both are initialized.
> {code}
> 2015-03-24 09:12:58,967 ERROR [main]: ql.Driver (SessionState.java:printError(545)) - FAILED: IndexOutOfBoundsException Index: 0, Size: 0
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.fillMappingBigTableBucketFileNameToSmallTableBucketFileNames(AbstractBucketJoinProc.java:486)
> at org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.convertMapJoinToBucketMapJoin(AbstractBucketJoinProc.java:429)
> at org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToBucketMapJoin(AbstractSMBJoinProc.java:540)
> at org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToSMBJoin(AbstractSMBJoinProc.java:549)
> at org.apache.hadoop.hive.ql.optimizer.SortedMergeJoinProc.process(SortedMergeJoinProc.java:51)
> {code}
> Simplest way to reproduce:
> {code}
> SET hive.enforce.sorting=true;
> SET hive.enforce.bucketing=true;
> SET hive.exec.dynamic.partition=true;
> SET mapreduce.reduce.import.limit=-1;
> SET hive.optimize.bucketmapjoin=true;
> SET hive.optimize.bucketmapjoin.sortedmerge=true;
> SET hive.auto.convert.join=true;
> SET hive.auto.convert.sortmerge.join=true;
> SET hive.auto.convert.sortmerge.join.noconditionaltask=true;
> CREATE DATABASE IF NOT EXISTS tmp;
> USE tmp;
> CREATE  TABLE `test1` (
>   `foo` bigint )
> CLUSTERED BY (
>   foo)
> SORTED BY (
>   foo ASC)
> INTO 384 BUCKETS
> stored as orc;
> CREATE  TABLE `test2`(
>   `foo` bigint )
> CLUSTERED BY (
>   foo)
> SORTED BY (
>   foo ASC)
> INTO 384 BUCKETS
> STORED AS ORC;
> -- Initialize ONE table of the two tables with any data.
> INSERT INTO TABLE test1 SELECT foo FROM table_with_some_content LIMIT 100;
> SELECT t1.foo, t2.foo
> FROM test1 t1 INNER JOIN test2 t2 
> ON (t1.foo = t2.foo);
> {code}
> I took a look at the procedure
> fillMappingBigTableBucketFileNameToSmallTableBucketFileNames in
> AbstractBucketJoinProc.java and it does not seem to have changed between our
> MapR Hive 0.13 and the current snapshot, so this should also be an error in the
> current version.





[jira] [Commented] (HIVE-9839) HiveServer2 leaks OperationHandle on async queries which fail at compile phase

2015-03-25 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379998#comment-14379998
 ] 

Jimmy Xiang commented on HIVE-9839:
---

Thanks for making the change. +1 pending tests.

> HiveServer2 leaks OperationHandle on async queries which fail at compile phase
> --
>
> Key: HIVE-9839
> URL: https://issues.apache.org/jira/browse/HIVE-9839
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.14.0, 0.13.1, 1.0.0, 1.1.0
>Reporter: Nemon Lou
>Assignee: Nemon Lou
>Priority: Critical
> Attachments: HIVE-9839.patch, HIVE-9839.patch, HIVE-9839.patch, 
> HIVE-9839.patch, OperationHandleMonitor.java
>
>
> Using beeline, connect to HiveServer2 and type the following:
> drop table if exists table_not_exists;
> select * from table_not_exists;
> There will be an OperationHandle object staying in HiveServer2's memory forever,
> even after quitting beeline.
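The leak pattern described above can be sketched as follows. This is hypothetical Java, not the actual HiveServer2 classes: an async query registers an operation handle before compilation, and if compilation throws without the error path releasing the handle, it stays registered indefinitely.

```java
import java.util.HashSet;
import java.util.Set;

public class HandleRegistry {
    final Set<String> openHandles = new HashSet<>();

    // Register a handle, simulate compilation, and (optionally) release on failure.
    String runAsync(String query, boolean compileFails, boolean closeOnError) {
        String handle = "op-" + Integer.toHexString(query.hashCode());
        openHandles.add(handle); // handle is created before compile
        try {
            if (compileFails) throw new RuntimeException("compile error");
            return handle;
        } catch (RuntimeException e) {
            if (closeOnError) openHandles.remove(handle); // the fix: release on the error path
            return null;
        }
    }

    public static void main(String[] args) {
        HandleRegistry leaky = new HandleRegistry();
        leaky.runAsync("select * from table_not_exists", true, false);
        System.out.println(leaky.openHandles.size()); // 1: handle leaked

        HandleRegistry fixed = new HandleRegistry();
        fixed.runAsync("select * from table_not_exists", true, true);
        System.out.println(fixed.openHandles.size()); // 0: handle released
    }
}
```

The attached patch presumably closes the real OperationHandle on the compile-failure path; this sketch only shows why a failure before the async task starts would otherwise leave the handle orphaned.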





[jira] [Commented] (HIVE-10078) Optionally allow logging of records processed in fixed intervals

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379914#comment-14379914
 ] 

Hive QA commented on HIVE-10078:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707111/HIVE-10078.1.patch

{color:red}ERROR:{color} -1 due to 1711 failed/errored test(s), 8337 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_project
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_exist
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_allcolref_in_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_concatenate_indexed_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_index
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_2_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_change_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_update_status
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_cascade
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_invalidate_column_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_update_status
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_view_as_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ambiguitycheck
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ambiguous_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_analyze_table_null_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_analyze_tbl_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ansi_sql_arithmetic
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_excludeHadoop20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_array_map_access_nonconstant
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3
org.apache.hadoop.hive.cli.TestCliDriv

[jira] [Commented] (HIVE-10077) Use new ParquetInputSplit constructor API

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379844#comment-14379844
 ] 

Hive QA commented on HIVE-10077:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707110/HIVE-10077.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3146/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3146/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3146/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client ---
[INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar
[INFO] Copying guava-14.0.1.jar to 
/data/hive-ptest/working/apache-svn-trunk-source/spark-client/target/dependency/guava-14.0.1.jar
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ spark-client ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-svn-trunk-source/spark-client/target/spark-client-1.2.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
spark-client ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client ---
[INFO] Installing 
/data/hive-ptest/working/apache-svn-trunk-source/spark-client/target/spark-client-1.2.0-SNAPSHOT.jar
 to 
/data/hive-ptest/working/maven/org/apache/hive/spark-client/1.2.0-SNAPSHOT/spark-client-1.2.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-svn-trunk-source/spark-client/pom.xml to 
/data/hive-ptest/working/maven/org/apache/hive/spark-client/1.2.0-SNAPSHOT/spark-client-1.2.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive Query Language 1.2.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec ---
[INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/ql (includes = 
[datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
hive-exec ---
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen
Generating vector expression code
Generating vector expression test code
[INFO] Executed tasks
[INFO] 
[INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec ---
[INFO] Source directory: 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/gen/protobuf/gen-java 
added.
[INFO] Source directory: 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/gen/thrift/gen-javabean 
added.
[INFO] Source directory: 
/data/hive-ptest/working/apache-svn-trunk-source/ql/target/generated-sources/java
 added.
[INFO] 
[INFO] --- antlr3-maven-plugin:3.4:antlr (default) @ hive-exec ---
[INFO] ANTLR: Processing source directory 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java
ANTLR Parser Generator  Version 3.4
org/apache/hadoop/hive/ql/parse/HiveLexer.g
org/apache/hadoop/hive/ql/parse/HiveParser.g
warning(200): IdentifiersParser.g:455:5: 
Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_ORDER KW_BY" using 
multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:455:5: 
Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_FROM" using 
multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:455:5: 
Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_REDUCE" 
using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:455:5: 
Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_MAP" using 
multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:455:5: 
Decision can match input su
