[jira] [Commented] (HIVE-9293) Cleanup SparkTask getMapWork to skip UnionWork check [Spark Branch]

2015-01-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268948#comment-14268948
 ] 

Hive QA commented on HIVE-9293:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12690718/HIVE-9293.1-spark.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7283 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/618/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/618/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-618/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12690718 - PreCommit-HIVE-SPARK-Build

> Cleanup SparkTask getMapWork to skip UnionWork check [Spark Branch]
> ---
>
> Key: HIVE-9293
> URL: https://issues.apache.org/jira/browse/HIVE-9293
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Chao
>Priority: Minor
> Attachments: HIVE-9293.1-spark.patch
>
>
> As we don't have UnionWork anymore, we can simplify the logic to get root 
> mapworks from the SparkWork.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9289) TODO : Store user name in session [Spark Branch]

2015-01-07 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268928#comment-14268928
 ] 

Chengxiang Li commented on HIVE-9289:
-

We do reuse sessions for queries from the same client, [~xuefuz]. I verified 
this before and just did so again; it may simply not work the way we thought.
HiveServer2 introduces a new approach to managing sessions: every RPC call 
references a session ID, which the server then maps to persistent session 
state. The linear mapping is: Hive Client->SessionHandler(session ID 
inside)->HiveSessionImpl->SessionState. For Hive on Spark, since we would like 
to share the singleton SparkContext within a user session, we may extend the 
mapping to: Hive Client->SessionHandler(session ID inside)-> 
HiveSessionImpl->SessionState->SparkSession->SparkClient->RemoteDriver->SparkContext,
 with only one exception: a new SparkSession is created when the Spark 
configuration is updated.
The Hive Client->SessionHandler(session ID inside)->HiveSessionImpl mapping 
should already determine whether a Hive session is reused, so I think we do not 
need to check again by user name in SparkSessionManager unless we have other 
reasons to create a new SparkSession besides Spark configuration updates.
[~chinnalalam], I just read the related code today and am not sure I fully 
understand it; do you have any thoughts?
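
The session-reuse rule described above (one SparkSession per Hive session, 
replaced only on a Spark configuration change) could be sketched roughly as 
follows. This is a hypothetical illustration; the class and method names are 
not the actual Hive classes.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: cache one Spark session per Hive session ID, and
// replace it only when the supplied Spark configuration has changed.
class SparkSessionCache {
    static class Session {
        final String hiveSessionId;
        final Map<String, String> sparkConf;

        Session(String hiveSessionId, Map<String, String> sparkConf) {
            this.hiveSessionId = hiveSessionId;
            this.sparkConf = sparkConf;
        }
    }

    private final Map<String, Session> sessions = new ConcurrentHashMap<>();

    // Reuse the cached session for this Hive session unless the conf changed.
    Session getSession(String hiveSessionId, Map<String, String> sparkConf) {
        Session existing = sessions.get(hiveSessionId);
        if (existing != null && Objects.equals(existing.sparkConf, sparkConf)) {
            return existing; // same Hive session, same conf: reuse
        }
        Session fresh = new Session(hiveSessionId, sparkConf);
        sessions.put(hiveSessionId, fresh); // conf updated: new session
        return fresh;
    }
}
```

Under this rule no user-name check is needed: the Hive session ID alone keys 
the cache, matching the argument above.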

> TODO : Store user name in session [Spark Branch]
> 
>
> Key: HIVE-9289
> URL: https://issues.apache.org/jira/browse/HIVE-9289
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chinna Rao Lalam
>Assignee: Chinna Rao Lalam
> Attachments: HIVE-9289.1-spark.patch
>
>
> TODO: we need to store the session username somewhere else, as 
> getUGIForConf never uses the conf. (SparkSessionManagerImpl.java, 
> /hive-exec/src/java/org/apache/hadoop/hive/ql/exec/spark/session, line 145; 
> Java task)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9188) BloomFilter in ORC row group index

2015-01-07 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268907#comment-14268907
 ] 

Gopal V commented on HIVE-9188:
---

Left some comments, particularly about the encoding of the bloom filter itself.

The List<Long> is a bad idea, as the 2nd long in the list is actually a double 
containing the fpp value.

Otherwise the patch looks good. I've added it to the build right now, and will 
ETL in a bunch of the NYC taxi data with this and run some point-scan queries.

> BloomFilter in ORC row group index
> --
>
> Key: HIVE-9188
> URL: https://issues.apache.org/jira/browse/HIVE-9188
> Project: Hive
>  Issue Type: New Feature
>  Components: File Formats
>Affects Versions: 0.15.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>  Labels: orcfile
> Attachments: HIVE-9188.1.patch, HIVE-9188.2.patch, HIVE-9188.3.patch, 
> HIVE-9188.4.patch
>
>
> BloomFilters are well known probabilistic data structure for set membership 
> checking. We can use bloom filters in ORC index for better row group pruning. 
> Currently, ORC row group index uses min/max statistics to eliminate row 
> groups (stripes as well) that do not satisfy predicate condition specified in 
> the query. But in some cases, the efficiency of min/max based elimination is 
> not optimal (unsorted columns with wide range of entries). Bloom filters can 
> be an effective and efficient alternative for row group/split elimination for 
> point queries or queries with IN clause.
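
The pruning idea in the description can be illustrated with a minimal generic 
bloom filter: a query predicate first probes the filter, and a row group is 
skipped only when the filter reports the value definitely absent. This is a 
toy sketch (class name and hashing scheme are illustrative), not the ORC 
implementation from the patch.

```java
import java.util.BitSet;

// Minimal illustrative bloom filter: add() sets k bit positions derived
// from an element's hash; mightContain() reports "definitely absent"
// (row group can be skipped) or "probably present" (must be read).
class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
        this.numBits = numBits;
        this.numHashes = numHashes;
        this.bits = new BitSet(numBits);
    }

    // Derive the i-th position from two hashes (double-hashing scheme).
    private int position(long h1, long h2, int i) {
        return (int) Math.floorMod(h1 + i * h2, (long) numBits);
    }

    void add(String element) {
        long h1 = element.hashCode();
        long h2 = h1 * 0x9E3779B97F4A7C15L; // golden-ratio mix as a second hash
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(h1, h2, i));
        }
    }

    boolean mightContain(String element) {
        long h1 = element.hashCode();
        long h2 = h1 * 0x9E3779B97F4A7C15L;
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(h1, h2, i))) {
                return false; // definitely not in this row group
            }
        }
        return true; // probably present; false positives possible
    }
}
```

Unlike min/max statistics, this check stays effective on unsorted columns with 
a wide value range, which is exactly the case the description calls out.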



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9299) Reuse Configuration in AvroSerdeUtils

2015-01-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268903#comment-14268903
 ] 

Hive QA commented on HIVE-9299:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12690643/HIVE-9299.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6732 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2285/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2285/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2285/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12690643 - PreCommit-HIVE-TRUNK-Build

> Reuse Configuration in AvroSerdeUtils
> -
>
> Key: HIVE-9299
> URL: https://issues.apache.org/jira/browse/HIVE-9299
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.14.0, 0.13.1, 0.15.0
>Reporter: Nitay Joffe
>Assignee: Nitay Joffe
> Fix For: 0.15.0
>
> Attachments: HIVE-9299.patch
>
>
> I am getting an issue where the original Configuration has some parameters 
> needed to read the remote Avro schema (specifically S3 keys).
> Creating a new Configuration doesn't pick them up, because those keys are 
> not on the classpath.
> We should reuse the Configuration already present in callers.
> I'm using Hive/Avro from Spark, so it'd be nice if we could put this into 
> Hive 0.13, since that's what Spark is built against.
> See also https://github.com/jghoman/haivvreo/pull/30
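
The failure pattern described above is generic: a utility that constructs its 
own configuration cannot see keys the caller set only at runtime. A toy 
sketch, assuming a stand-in config class (this is not Hadoop's Configuration, 
and the method names are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a Hadoop-style Configuration: defaults come from the
// "classpath" (here, a static map), and callers may set extra keys at runtime.
class ToyConfiguration {
    private static final Map<String, String> CLASSPATH_DEFAULTS = new HashMap<>();
    private final Map<String, String> values = new HashMap<>(CLASSPATH_DEFAULTS);

    void set(String key, String value) { values.put(key, value); }
    String get(String key) { return values.get(key); }
}

class AvroSchemaUtils {
    // Buggy shape: builds a fresh configuration, so runtime-only keys
    // (e.g. S3 credentials) set by the caller are invisible here.
    static String s3KeyFresh() {
        return new ToyConfiguration().get("fs.s3.awsAccessKeyId");
    }

    // Fixed shape: reuse the Configuration already present in the caller.
    static String s3KeyReused(ToyConfiguration callerConf) {
        return callerConf.get("fs.s3.awsAccessKeyId");
    }
}
```

Only the second shape sees credentials the caller injected programmatically, 
which is the behavior the patch asks for.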



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9306) Let Context.isLocalOnlyExecutionMode() return false if execution engine is Spark [Spark Branch]

2015-01-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268896#comment-14268896
 ] 

Hive QA commented on HIVE-9306:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12690716/HIVE-9306.1-spark.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 7283 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fs_default_name2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/617/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/617/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-617/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12690716 - PreCommit-HIVE-SPARK-Build

> Let Context.isLocalOnlyExecutionMode() return false if execution engine is 
> Spark [Spark Branch]
> ---
>
> Key: HIVE-9306
> URL: https://issues.apache.org/jira/browse/HIVE-9306
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-9306.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9305) Set default miniClusterType back to none in QTestUtil.[Spark branch]

2015-01-07 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-9305:

   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

Committed to spark-branch.  Thanks Chengxiang!

> Set default miniClusterType back to none in QTestUtil.[Spark branch]
> 
>
> Key: HIVE-9305
> URL: https://issues.apache.org/jira/browse/HIVE-9305
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>Priority: Minor
> Fix For: spark-branch
>
> Attachments: HIVE-9305.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-4639) Add has null flag to ORC internal index

2015-01-07 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268884#comment-14268884
 ] 

Gopal V commented on HIVE-4639:
---

Added this patch to my daily TPC-H 1TB ETL and reloaded lineitem with the new 
format.

Testing {{select * from lineitem where l_shipdate is null;}}.

Before: 66.728 seconds (208774320430 bytes read)
After: 7.87 seconds  (539046900 bytes read)

LGTM - +1.

> Add has null flag to ORC internal index
> ---
>
> Key: HIVE-4639
> URL: https://issues.apache.org/jira/browse/HIVE-4639
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-4639.1.patch, HIVE-4639.2.patch
>
>
> It would enable more predicate pushdown if we added a flag to the index entry 
> recording if there were any null values in the column for the 10k rows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9038) Join tests fail on Tez

2015-01-07 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268881#comment-14268881
 ] 

Navis commented on HIVE-9038:
-

Unlike MR, MapJoin in Tez does not make filterTag in the previous vertex but 
still wants to read it. At first it sounded simple, but I couldn't get through 
all of the complex code in MapJoinTableContainer, ReusableGetAdaptor, 
KeyValueHelper, KvSource, etc. Maybe [~sershe] can fix this at a glance.

> Join tests fail on Tez
> --
>
> Key: HIVE-9038
> URL: https://issues.apache.org/jira/browse/HIVE-9038
> Project: Hive
>  Issue Type: Bug
>  Components: Tests, Tez
>Reporter: Ashutosh Chauhan
>Assignee: Vikram Dixit K
>
> Tez doesn't run all tests. But if you run them, the following tests fail 
> with run-time exceptions pointing to bugs: 
> {{auto_join21.q,auto_join29.q,auto_join30.q
> ,auto_join_filters.q,auto_join_nulls.q}} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8327) mvn site -Pfindbugs

2015-01-07 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268870#comment-14268870
 ] 

Gopal V commented on HIVE-8327:
---

I tested this now: findbugs adds about 4 mins to the build, but a regular "mvn 
site" takes 40+ mins without this patch, because it tries to print out a 
dependency report and emits warnings like

{code}
[WARNING] The repository url 
'http://s3.amazonaws.com/maven.springframework.org/milestone' is invalid - 
Repository 'spring-milestone' will be blacklisted.
[WARNING] The repository url 
'https://nexus.codehaus.org/content/repositories/snapshots/' is invalid - 
Repository 'codehaus-nexus-snapshots' will be blacklisted.
{code}


> mvn site -Pfindbugs
> ---
>
> Key: HIVE-8327
> URL: https://issues.apache.org/jira/browse/HIVE-8327
> Project: Hive
>  Issue Type: Test
>  Components: Diagnosability
>Reporter: Gopal V
>Assignee: Gopal V
> Fix For: 0.15.0
>
> Attachments: HIVE-8327.1.patch, HIVE-8327.2.patch, ql-findbugs.html
>
>
> HIVE-3099 originally added findbugs into the old ant build.
> Get basic findbugs working for the maven build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9308) Refine the logic for the isSub method to support local file in HIVE.java

2015-01-07 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-9308:
---
Attachment: HIVE-9308-encryption.patch

> Refine the logic for the isSub method to support local file in HIVE.java
> 
>
> Key: HIVE-9308
> URL: https://issues.apache.org/jira/browse/HIVE-9308
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-9308-encryption.patch
>
>
> Refine the isSubDir method in Hive to support local files instead of using 
> the method from FileUtil, which only supports files with the HDFS scheme.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9308) Refine the logic for the isSub method to support local file in HIVE.java

2015-01-07 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-9308:
---
Status: Patch Available  (was: Open)

> Refine the logic for the isSub method to support local file in HIVE.java
> 
>
> Key: HIVE-9308
> URL: https://issues.apache.org/jira/browse/HIVE-9308
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-9308-encryption.patch
>
>
> Refine the isSubDir method in Hive to support local files instead of using 
> the method from FileUtil, which only supports files with the HDFS scheme.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9308) Refine the logic for the isSub method to support local file in HIVE.java

2015-01-07 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-9308:
---
Fix Version/s: encryption-branch

> Refine the logic for the isSub method to support local file in HIVE.java
> 
>
> Key: HIVE-9308
> URL: https://issues.apache.org/jira/browse/HIVE-9308
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Fix For: encryption-branch
>
> Attachments: HIVE-9308-encryption.patch
>
>
> Refine the isSubDir method in Hive to support local files instead of using 
> the method from FileUtil, which only supports files with the HDFS scheme.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-9308) Refine the logic for the isSub method to support local file in HIVE.java

2015-01-07 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu reassigned HIVE-9308:
--

Assignee: Ferdinand Xu

> Refine the logic for the isSub method to support local file in HIVE.java
> 
>
> Key: HIVE-9308
> URL: https://issues.apache.org/jira/browse/HIVE-9308
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>
> Refine the isSubDir method in Hive to support local files instead of using 
> the method from FileUtil, which only supports files with the HDFS scheme.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9308) Refine the logic for the isSub method to support local file in HIVE.java

2015-01-07 Thread Ferdinand Xu (JIRA)
Ferdinand Xu created HIVE-9308:
--

 Summary: Refine the logic for the isSub method to support local 
file in HIVE.java
 Key: HIVE-9308
 URL: https://issues.apache.org/jira/browse/HIVE-9308
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu


Refine the isSubDir method in Hive to support local files instead of using the 
method from FileUtil, which only supports files with the HDFS scheme.
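
A sub-directory check that works for both local paths and URI-style paths 
might look like the sketch below. This is a hypothetical helper written for 
illustration, not the actual Hive.isSubDir implementation from the patch.

```java
import java.net.URI;

// Illustrative sub-directory check that handles plain local paths
// ("/tmp/a"), file URIs ("file:/tmp/a"), and HDFS URIs ("hdfs://nn:8020/a").
class PathUtils {
    static boolean isSubDir(String child, String parent) {
        String c = normalize(child);
        String p = normalize(parent);
        if (!p.endsWith("/")) {
            p = p + "/"; // avoid matching "/tmp/ab" as a child of "/tmp/a"
        }
        return c.startsWith(p);
    }

    // Strip scheme/authority so "file:/tmp/a" and "/tmp/a" compare equal.
    private static String normalize(String path) {
        URI uri = URI.create(path);
        String s = uri.getPath() != null ? uri.getPath() : path;
        // Drop trailing slashes, except for the root itself.
        while (s.length() > 1 && s.endsWith("/")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }
}
```

Normalizing through the path component is what lets the same check cover local 
files, rather than only files addressed with the HDFS scheme.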



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9290) Make some test results deterministic

2015-01-07 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268843#comment-14268843
 ] 

Rui Li commented on HIVE-9290:
--

Thanks [~xuefuz] for the explanation!

> Make some test results deterministic
> 
>
> Key: HIVE-9290
> URL: https://issues.apache.org/jira/browse/HIVE-9290
> Project: Hive
>  Issue Type: Test
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-9290.1.patch
>
>
> {noformat}
> limit_pushdown.q
> optimize_nullscan.q
> ppd_gby_join.q
> vector_string_concat.q
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]

2015-01-07 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268841#comment-14268841
 ] 

Xuefu Zhang commented on HIVE-9251:
---

It should be okay. Limit is still pushed down in the extra stage introduced by 
order by.

> SetSparkReducerParallelism is likely to set too small number of reducers 
> [Spark Branch]
> ---
>
> Key: HIVE-9251
> URL: https://issues.apache.org/jira/browse/HIVE-9251
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch, 
> HIVE-9251.3-spark.patch
>
>
> This may hurt performance or even lead to task failures. For example, spark's 
> netty-based shuffle limits the max frame size to be 2G.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8327) mvn site -Pfindbugs

2015-01-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268837#comment-14268837
 ] 

Hive QA commented on HIVE-8327:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12690640/HIVE-8327.2.patch

{color:green}SUCCESS:{color} +1 6732 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2284/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2284/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2284/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12690640 - PreCommit-HIVE-TRUNK-Build

> mvn site -Pfindbugs
> ---
>
> Key: HIVE-8327
> URL: https://issues.apache.org/jira/browse/HIVE-8327
> Project: Hive
>  Issue Type: Test
>  Components: Diagnosability
>Reporter: Gopal V
>Assignee: Gopal V
> Fix For: 0.15.0
>
> Attachments: HIVE-8327.1.patch, HIVE-8327.2.patch, ql-findbugs.html
>
>
> HIVE-3099 originally added findbugs into the old ant build.
> Get basic findbugs working for the maven build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9283) Improve encryption related test cases

2015-01-07 Thread Dong Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268836#comment-14268836
 ] 

Dong Chen commented on HIVE-9283:
-

Good idea! Thanks for your review, [~spena], [~brocknoland]. I filed HIVE-9307 
to track this.

> Improve encryption related test cases
> -
>
> Key: HIVE-9283
> URL: https://issues.apache.org/jira/browse/HIVE-9283
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Dong Chen
>Assignee: Dong Chen
> Fix For: encryption-branch
>
> Attachments: HIVE-9283.patch
>
>
> NO PRECOMMIT TESTS
> I found some test case .q files that could be improved by:
> 1. Changing the table location from {{/user/hive/warehouse...}} to 
> {{/build/ql/test/data/warehouse/...}}.
> The reason is that the default warehouse dir defined in QTestUtil is the 
> latter, and the partial mask is based on it. I think it is better to make 
> the test cases consistent with the code. The .hive_staging location we want 
> in the .out files will then be shown as well.
> 2. Adding cleanup at the end:
> drop the table and delete the key. Otherwise, some cases will fail because 
> an existing key cannot be re-created. (Put in HIVE-9286)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9290) Make some test results deterministic

2015-01-07 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268835#comment-14268835
 ] 

Xuefu Zhang commented on HIVE-9290:
---

It should be okay. Even though an extra stage is introduced, I see the limit is 
still pushed down in the second stage according to the plan.

> Make some test results deterministic
> 
>
> Key: HIVE-9290
> URL: https://issues.apache.org/jira/browse/HIVE-9290
> Project: Hive
>  Issue Type: Test
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-9290.1.patch
>
>
> {noformat}
> limit_pushdown.q
> optimize_nullscan.q
> ppd_gby_join.q
> vector_string_concat.q
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9307) Use MetaStore dir variable from conf instead of hard coded dir in encryption test

2015-01-07 Thread Dong Chen (JIRA)
Dong Chen created HIVE-9307:
---

 Summary: Use MetaStore dir variable from conf instead of hard 
coded dir in encryption test
 Key: HIVE-9307
 URL: https://issues.apache.org/jira/browse/HIVE-9307
 Project: Hive
  Issue Type: Sub-task
Reporter: Dong Chen
Assignee: Dong Chen


Use the following variable to get the metastore directory 
$\{hiveconf:hive.metastore.warehouse.dir\} in test cases.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9305) Set default miniClusterType back to none in QTestUtil.[Spark branch]

2015-01-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268833#comment-14268833
 ] 

Hive QA commented on HIVE-9305:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12690713/HIVE-9305.1-spark.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 7283 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/616/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/616/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-616/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12690713 - PreCommit-HIVE-SPARK-Build

> Set default miniClusterType back to none in QTestUtil.[Spark branch]
> 
>
> Key: HIVE-9305
> URL: https://issues.apache.org/jira/browse/HIVE-9305
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>Priority: Minor
> Attachments: HIVE-9305.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9194) Support select distinct *

2015-01-07 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268820#comment-14268820
 ] 

Pengcheng Xiong commented on HIVE-9194:
---

[~jpullokkaran], could you please take a look? It is ready to go. Thanks.

> Support select distinct *
> -
>
> Key: HIVE-9194
> URL: https://issues.apache.org/jira/browse/HIVE-9194
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-9194.00.patch
>
>
> As per [~jpullokkaran]'s review comments, implement select distinct *



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9293) Cleanup SparkTask getMapWork to skip UnionWork check [Spark Branch]

2015-01-07 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-9293:
---
Status: Patch Available  (was: Open)

> Cleanup SparkTask getMapWork to skip UnionWork check [Spark Branch]
> ---
>
> Key: HIVE-9293
> URL: https://issues.apache.org/jira/browse/HIVE-9293
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Chao
>Priority: Minor
> Attachments: HIVE-9293.1-spark.patch
>
>
> As we don't have UnionWork anymore, we can simplify the logic to get root 
> mapworks from the SparkWork.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9293) Cleanup SparkTask getMapWork to skip UnionWork check [Spark Branch]

2015-01-07 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-9293:
---
Attachment: HIVE-9293.1-spark.patch

> Cleanup SparkTask getMapWork to skip UnionWork check [Spark Branch]
> ---
>
> Key: HIVE-9293
> URL: https://issues.apache.org/jira/browse/HIVE-9293
> Project: Hive
>  Issue Type: Task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Chao
>Priority: Minor
> Attachments: HIVE-9293.1-spark.patch
>
>
> As we don't have UnionWork anymore, we can simplify the logic to get root 
> mapworks from the SparkWork.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9305) Set default miniClusterType back to none in QTestUtil.[Spark branch]

2015-01-07 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268805#comment-14268805
 ] 

Szehon Ho commented on HIVE-9305:
-

+1

> Set default miniClusterType back to none in QTestUtil.[Spark branch]
> 
>
> Key: HIVE-9305
> URL: https://issues.apache.org/jira/browse/HIVE-9305
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>Priority: Minor
> Attachments: HIVE-9305.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9306) Let Context.isLocalOnlyExecutionMode() return false if execution engine is Spark [Spark Branch]

2015-01-07 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268803#comment-14268803
 ] 

Szehon Ho commented on HIVE-9306:
-

+1

> Let Context.isLocalOnlyExecutionMode() return false if execution engine is 
> Spark [Spark Branch]
> ---
>
> Key: HIVE-9306
> URL: https://issues.apache.org/jira/browse/HIVE-9306
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-9306.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9306) Let Context.isLocalOnlyExecutionMode() return false if execution engine is Spark [Spark Branch]

2015-01-07 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-9306:
--
Status: Patch Available  (was: Open)

> Let Context.isLocalOnlyExecutionMode() return false if execution engine is 
> Spark [Spark Branch]
> ---
>
> Key: HIVE-9306
> URL: https://issues.apache.org/jira/browse/HIVE-9306
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-9306.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9306) Let Context.isLocalOnlyExecutionMode() return false if execution engine is Spark [Spark Branch]

2015-01-07 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-9306:
--
Attachment: HIVE-9306.1-spark.patch

> Let Context.isLocalOnlyExecutionMode() return false if execution engine is 
> Spark [Spark Branch]
> ---
>
> Key: HIVE-9306
> URL: https://issues.apache.org/jira/browse/HIVE-9306
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-9306.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9194) Support select distinct *

2015-01-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268795#comment-14268795
 ] 

Hive QA commented on HIVE-9194:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12690634/HIVE-9194.00.patch

{color:green}SUCCESS:{color} +1 6734 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2283/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2283/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2283/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12690634 - PreCommit-HIVE-TRUNK-Build

> Support select distinct *
> -
>
> Key: HIVE-9194
> URL: https://issues.apache.org/jira/browse/HIVE-9194
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-9194.00.patch
>
>
> As per [~jpullokkaran]'s review comments, implement select distinct *



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9306) Let Context.isLocalOnlyExecutionMode() return false if execution engine is Spark [Spark Branch]

2015-01-07 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created HIVE-9306:
-

 Summary: Let Context.isLocalOnlyExecutionMode() return false if 
execution engine is Spark [Spark Branch]
 Key: HIVE-9306
 URL: https://issues.apache.org/jira/browse/HIVE-9306
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9305) Set default miniClusterType back to none in QTestUtil.[Spark branch]

2015-01-07 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-9305:

Status: Patch Available  (was: Open)

> Set default miniClusterType back to none in QTestUtil.[Spark branch]
> 
>
> Key: HIVE-9305
> URL: https://issues.apache.org/jira/browse/HIVE-9305
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>Priority: Minor
> Attachments: HIVE-9305.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9305) Set default miniClusterType back to none in QTestUtil.[Spark branch]

2015-01-07 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-9305:

Attachment: HIVE-9305.1-spark.patch

> Set default miniClusterType back to none in QTestUtil.[Spark branch]
> 
>
> Key: HIVE-9305
> URL: https://issues.apache.org/jira/browse/HIVE-9305
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>Priority: Minor
> Attachments: HIVE-9305.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9305) Set default miniClusterType back to none in QTestUtil.[Spark branch]

2015-01-07 Thread Chengxiang Li (JIRA)
Chengxiang Li created HIVE-9305:
---

 Summary: Set default miniClusterType back to none in 
QTestUtil.[Spark branch]
 Key: HIVE-9305
 URL: https://issues.apache.org/jira/browse/HIVE-9305
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Commented] (HIVE-9259) Fix ClassCastException when CBO is enabled for HOS [Spark Branch]

2015-01-07 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268777#comment-14268777
 ] 

Chao commented on HIVE-9259:


After updating the cluster with latest jars, I cannot reproduce this exception 
anymore.
The only exception I found in hiveserver2.log is:

{noformat}
2015-01-07 22:01:22,421 ERROR [HiveServer2-Handler-Pool: Thread-29]: 
server.TThreadPoolServer (TThreadPoolServer.java:run(296)) - Error occurred 
during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: 
Invalid status 71
at 
org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException: Invalid status 71
at 
org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
at 
org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:184)
at 
org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
at 
org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
at 
org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
at 
org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
... 4 more
{noformat}

Not sure if it's related.

> Fix ClassCastException when CBO is enabled for HOS [Spark Branch]
> -
>
> Key: HIVE-9259
> URL: https://issues.apache.org/jira/browse/HIVE-9259
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Brock Noland
>Assignee: Chao
>
> {noformat}
> 2015-01-05 22:10:19,414 ERROR [HiveServer2-Handler-Pool: Thread-33]: 
> parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(10109)) - CBO 
> failed, skipping CBO.
> java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.optimizer.calcite.HiveTypeSystemImpl cannot be cast 
> to org.eigenbase.reltype.RelDataTypeSystem
> at 
> net.hydromatic.optiq.jdbc.OptiqConnectionImpl.<init>(OptiqConnectionImpl.java:92)
> at 
> net.hydromatic.optiq.jdbc.OptiqJdbc41Factory$OptiqJdbc41Connection.<init>(OptiqJdbc41Factory.java:103)
> at 
> net.hydromatic.optiq.jdbc.OptiqJdbc41Factory.newConnection(OptiqJdbc41Factory.java:49)
> at 
> net.hydromatic.optiq.jdbc.OptiqJdbc41Factory.newConnection(OptiqJdbc41Factory.java:34)
> at 
> net.hydromatic.optiq.jdbc.OptiqFactory.newConnection(OptiqFactory.java:52)
> at 
> net.hydromatic.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:135)
> at java.sql.DriverManager.getConnection(DriverManager.java:571)
> at java.sql.DriverManager.getConnection(DriverManager.java:187)
> at 
> org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:140)
> at 
> org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:105)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$CalciteBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:12560)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$CalciteBasedPlanner.access$400(SemanticAnalyzer.java:12540)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10070)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
> at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:420)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:306)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1108)
> at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1102)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:101)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:172)
> at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:388)
> at 
> org.apache.hive.service.

[jira] [Commented] (HIVE-9242) Many places in CBO code eat exceptions

2015-01-07 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268776#comment-14268776
 ] 

Brock Noland commented on HIVE-9242:


Thx Navis, +1

> Many places in CBO code eat exceptions
> --
>
> Key: HIVE-9242
> URL: https://issues.apache.org/jira/browse/HIVE-9242
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Priority: Blocker
> Attachments: HIVE-9242.1.patch.txt
>
>
> I've noticed that there are a number of places in the CBO code which eat 
> exceptions. This is not acceptable. Example:
> https://github.com/apache/hive/blob/357b473a354aace3bd59b522ad7108be561e9d0f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java#L274



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9242) Many places in CBO code eat exceptions

2015-01-07 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9242:

Status: Patch Available  (was: Open)

> Many places in CBO code eat exceptions
> --
>
> Key: HIVE-9242
> URL: https://issues.apache.org/jira/browse/HIVE-9242
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Priority: Blocker
> Attachments: HIVE-9242.1.patch.txt
>
>
> I've noticed that there are a number of places in the CBO code which eat 
> exceptions. This is not acceptable. Example:
> https://github.com/apache/hive/blob/357b473a354aace3bd59b522ad7108be561e9d0f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java#L274



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9242) Many places in CBO code eat exceptions

2015-01-07 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9242:

Attachment: HIVE-9242.1.patch.txt

> Many places in CBO code eat exceptions
> --
>
> Key: HIVE-9242
> URL: https://issues.apache.org/jira/browse/HIVE-9242
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Priority: Blocker
> Attachments: HIVE-9242.1.patch.txt
>
>
> I've noticed that there are a number of places in the CBO code which eat 
> exceptions. This is not acceptable. Example:
> https://github.com/apache/hive/blob/357b473a354aace3bd59b522ad7108be561e9d0f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java#L274



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9272) Tests for utf-8 support

2015-01-07 Thread Aswathy Chellammal Sreekumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aswathy Chellammal Sreekumar updated HIVE-9272:
---
Attachment: HIVE-9272.1.patch

> Tests for utf-8 support
> ---
>
> Key: HIVE-9272
> URL: https://issues.apache.org/jira/browse/HIVE-9272
> Project: Hive
>  Issue Type: Test
>  Components: Tests, WebHCat
>Reporter: Aswathy Chellammal Sreekumar
>Priority: Minor
> Attachments: HIVE-9272.1.patch, HIVE-9272.patch
>
>
> Including some test cases for utf8 support in webhcat. The first four tests 
> invoke the hive, pig, mapred and streaming apis to test utf8 support for 
> processed data, file names and job names. The last test case tests the 
> filtering of job names with utf8 characters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9272) Tests for utf-8 support

2015-01-07 Thread Aswathy Chellammal Sreekumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268764#comment-14268764
 ] 

Aswathy Chellammal Sreekumar commented on HIVE-9272:


Please find the updated patch for utf-8 tests for review.

> Tests for utf-8 support
> ---
>
> Key: HIVE-9272
> URL: https://issues.apache.org/jira/browse/HIVE-9272
> Project: Hive
>  Issue Type: Test
>  Components: Tests, WebHCat
>Reporter: Aswathy Chellammal Sreekumar
>Priority: Minor
> Attachments: HIVE-9272.patch
>
>
> Including some test cases for utf8 support in webhcat. The first four tests 
> invoke the hive, pig, mapred and streaming apis to test utf8 support for 
> processed data, file names and job names. The last test case tests the 
> filtering of job names with utf8 characters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-9302) Beeline add jar local to client

2015-01-07 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu reassigned HIVE-9302:
--

Assignee: Ferdinand Xu

> Beeline add jar local to client
> ---
>
> Key: HIVE-9302
> URL: https://issues.apache.org/jira/browse/HIVE-9302
> Project: Hive
>  Issue Type: New Feature
>Reporter: Brock Noland
>Assignee: Ferdinand Xu
>
> At present if a beeline user uses {{add jar}} the path they give is actually 
> on the HS2 server. It'd be great to allow beeline users to add local jars as 
> well.
> It might be useful to do this in the jdbc driver itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9290) Make some test results deterministic

2015-01-07 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9290:
-
Status: Patch Available  (was: Open)

> Make some test results deterministic
> 
>
> Key: HIVE-9290
> URL: https://issues.apache.org/jira/browse/HIVE-9290
> Project: Hive
>  Issue Type: Test
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-9290.1.patch
>
>
> {noformat}
> limit_pushdown.q
> optimize_nullscan.q
> ppd_gby_join.q
> vector_string_concat.q
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9290) Make some test results deterministic

2015-01-07 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9290:
-
Attachment: HIVE-9290.1.patch

Not sure if it's correct to make limit_pushdown.q deterministic.
cc [~xuefuz]

> Make some test results deterministic
> 
>
> Key: HIVE-9290
> URL: https://issues.apache.org/jira/browse/HIVE-9290
> Project: Hive
>  Issue Type: Test
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-9290.1.patch
>
>
> {noformat}
> limit_pushdown.q
> optimize_nullscan.q
> ppd_gby_join.q
> vector_string_concat.q
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]

2015-01-07 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268728#comment-14268728
 ] 

Rui Li commented on HIVE-9251:
--

[~xuefuz] - you're right. I think we should fix HIVE-9290 first and merge it to 
spark.
One thing I'm not sure about is limit_pushdown.q. To make it deterministic, I 
have to add an order by to the query. Will that somehow prevent the limit 
pushdown from working?

> SetSparkReducerParallelism is likely to set too small number of reducers 
> [Spark Branch]
> ---
>
> Key: HIVE-9251
> URL: https://issues.apache.org/jira/browse/HIVE-9251
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch, 
> HIVE-9251.3-spark.patch
>
>
> This may hurt performance or even lead to task failures. For example, Spark's 
> netty-based shuffle limits the max frame size to 2G.
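
As a rough illustration of the risk the description mentions, a minimal reducer-count estimator that respects the 2G frame limit (the heuristic, the helper name, and the 256MB per-reducer target are illustrative assumptions, not SparkTask code):

```python
MAX_FRAME_BYTES = 2 * 1024**3  # Spark netty shuffle frame limit (~2G)

def min_safe_reducers(shuffle_bytes, target_bytes_per_reducer=256 * 1024**2):
    """Pick enough reducers that no single partition can approach the
    2G frame limit, while aiming at a smaller per-reducer target."""
    by_target = -(-shuffle_bytes // target_bytes_per_reducer)  # ceil division
    by_limit = -(-shuffle_bytes // MAX_FRAME_BYTES)
    return max(1, by_target, by_limit)

# 1 TB of shuffle data needs at least 4096 reducers at a 256MB target;
# picking e.g. 100 would leave ~10GB per reducer, far past the frame limit.
print(min_safe_reducers(1024**4))  # 4096
```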



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9281) Code cleanup [Spark Branch]

2015-01-07 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-9281:

   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

Committed to spark.  Thanks Xuefu for the heavy reading.

> Code cleanup [Spark Branch]
> ---
>
> Key: HIVE-9281
> URL: https://issues.apache.org/jira/browse/HIVE-9281
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Fix For: spark-branch
>
> Attachments: HIVE-9281-spark.patch, HIVE-9281.2-spark.patch
>
>
> In preparation for merge, we need to cleanup the codes.
> This includes removing TODO's, fixing checkstyles, removing commented or 
> unused code, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9281) Code cleanup [Spark Branch]

2015-01-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268689#comment-14268689
 ] 

Hive QA commented on HIVE-9281:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12690647/HIVE-9281.2-spark.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7283 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/615/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/615/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-615/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12690647 - PreCommit-HIVE-SPARK-Build

> Code cleanup [Spark Branch]
> ---
>
> Key: HIVE-9281
> URL: https://issues.apache.org/jira/browse/HIVE-9281
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-9281-spark.patch, HIVE-9281.2-spark.patch
>
>
> In preparation for merge, we need to cleanup the codes.
> This includes removing TODO's, fixing checkstyles, removing commented or 
> unused code, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9304) [Refactor] remove unused method in SemAly

2015-01-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9304:
---
Attachment: HIVE-9304.patch

> [Refactor] remove unused method in SemAly
> -
>
> Key: HIVE-9304
> URL: https://issues.apache.org/jira/browse/HIVE-9304
> Project: Hive
>  Issue Type: Task
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-9304.patch
>
>
> Seems like method {{genConversionOps}} doesn't serve any purpose any longer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9304) [Refactor] remove unused method in SemAly

2015-01-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9304:
---
Status: Patch Available  (was: Open)

> [Refactor] remove unused method in SemAly
> -
>
> Key: HIVE-9304
> URL: https://issues.apache.org/jira/browse/HIVE-9304
> Project: Hive
>  Issue Type: Task
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-9304.patch
>
>
> Seems like method {{genConversionOps}} doesn't serve any purpose any longer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9304) [Refactor] remove unused method in SemAly

2015-01-07 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-9304:
--

 Summary: [Refactor] remove unused method in SemAly
 Key: HIVE-9304
 URL: https://issues.apache.org/jira/browse/HIVE-9304
 Project: Hive
  Issue Type: Task
  Components: Query Processor
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


Seems like method {{genConversionOps}} doesn't serve any purpose any longer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9278) Cached expression feature broken in one case

2015-01-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268680#comment-14268680
 ] 

Ashutosh Chauhan commented on HIVE-9278:


Aah.. I see. +1

> Cached expression feature broken in one case
> 
>
> Key: HIVE-9278
> URL: https://issues.apache.org/jira/browse/HIVE-9278
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Navis
>Priority: Blocker
> Attachments: HIVE-9278.1.patch.txt
>
>
> Different query result depending on whether hive.cache.expr.evaluation is 
> true or false.  When true, no query results are produced (this is wrong).
> The q file:
> {noformat}
> set hive.cache.expr.evaluation=true;
> CREATE TABLE cache_expr_repro (date_str STRING);
> LOAD DATA LOCAL INPATH '../../data/files/cache_expr_repro.txt' INTO TABLE 
> cache_expr_repro;
> SELECT MONTH(date_str) AS `mon`, CAST((MONTH(date_str) - 1) / 3 + 1 AS int) 
> AS `quarter`,   YEAR(date_str) AS `year` FROM cache_expr_repro WHERE 
> ((CAST((MONTH(date_str) - 1) / 3 + 1 AS int) = 1) AND (YEAR(date_str) = 
> 2015)) GROUP BY MONTH(date_str), CAST((MONTH(date_str) - 1) / 3 + 1 AS int),  
>  YEAR(date_str) ;
> {noformat}
> cache_expr_repro.txt
> {noformat}
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-01-01 00:00:00
> {noformat}
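
For reference, the quarter arithmetic in the repro query is straightforward, and with correct caching the filtered grouping over that data is non-empty; a sketch of the expected semantics in plain Python (not Hive's evaluator):

```python
from datetime import datetime

# Same rows as cache_expr_repro.txt: five January and five February dates.
rows = ["2015-01-01 00:00:00", "2015-02-01 00:00:00"] * 5

def parts(s):
    """Return (month, quarter, year), mirroring the query's expressions:
    quarter = CAST((MONTH(date_str) - 1) / 3 + 1 AS int)."""
    d = datetime.strptime(s, "%Y-%m-%d %H:%M:%S")
    return d.month, (d.month - 1) // 3 + 1, d.year

# WHERE quarter = 1 AND year = 2015, GROUP BY (month, quarter, year):
groups = sorted({p for p in map(parts, rows) if p[1] == 1 and p[2] == 2015})
print(groups)  # [(1, 1, 2015), (2, 1, 2015)] -- two groups, not an empty result
```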



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-4639) Add has null flag to ORC internal index

2015-01-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-4639:

Attachment: HIVE-4639.2.patch

Fixes test failures. All of them are file size diffs.

> Add has null flag to ORC internal index
> ---
>
> Key: HIVE-4639
> URL: https://issues.apache.org/jira/browse/HIVE-4639
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-4639.1.patch, HIVE-4639.2.patch
>
>
> It would enable more predicate pushdown if we added a flag to the index entry 
> recording if there were any null values in the column for the 10k rows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9290) Make some test results deterministic

2015-01-07 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9290:
-
Description: 
{noformat}
limit_pushdown.q
optimize_nullscan.q
ppd_gby_join.q
vector_string_concat.q
{noformat}

  was:
{noformat}
limit_pushdown.q
ppd_gby_join.q
vector_string_concat.q
{noformat}


> Make some test results deterministic
> 
>
> Key: HIVE-9290
> URL: https://issues.apache.org/jira/browse/HIVE-9290
> Project: Hive
>  Issue Type: Test
>Reporter: Rui Li
>Assignee: Rui Li
>
> {noformat}
> limit_pushdown.q
> optimize_nullscan.q
> ppd_gby_join.q
> vector_string_concat.q
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9303) Parquet files are written with incorrect definition levels

2015-01-07 Thread Skye Wanderman-Milne (JIRA)
Skye Wanderman-Milne created HIVE-9303:
--

 Summary: Parquet files are written with incorrect definition levels
 Key: HIVE-9303
 URL: https://issues.apache.org/jira/browse/HIVE-9303
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Skye Wanderman-Milne


The definition level, which determines which level of nesting is NULL, appears 
to always be n or n-1, where n is the maximum definition level. This means that 
only the innermost level of nesting can be NULL. This is only relevant for 
Parquet files. For example:

{code:sql}
CREATE TABLE text_tbl (a STRUCT<b:STRUCT<c:INT>>)
STORED AS TEXTFILE;

INSERT OVERWRITE TABLE text_tbl
SELECT IF(false, named_struct("b", named_struct("c", 1)), NULL)
FROM tbl LIMIT 1;

CREATE TABLE parq_tbl
STORED AS PARQUET
AS SELECT * FROM text_tbl;

SELECT * FROM text_tbl;
=> NULL # right

SELECT * FROM parq_tbl;
=> {"b":{"c":null}} # wrong
{code}
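
As background, a value's definition level counts how many optional fields along its path are actually present; a toy sketch of that rule (hypothetical helper, not Hive or Parquet code) shows why a NULL outer struct should encode as level 0 rather than n-1:

```python
def definition_level(path_present):
    """path_present: booleans for each optional field along a column's
    path, outermost first. The definition level is the count of present
    fields before the first NULL."""
    level = 0
    for present in path_present:
        if not present:
            break
        level += 1
    return level

# Schema a STRUCT<b:STRUCT<c:INT>> with all three levels optional:
# the maximum definition level n is 3.
print(definition_level([False, False, False]))  # a itself is NULL -> 0
print(definition_level([True, True, False]))    # only c is NULL   -> 2
print(definition_level([True, True, True]))     # fully present    -> 3
```

Per the report, the writer only ever emits n or n-1, so a NULL at the outer struct (level 0 above) is read back as `{"b":{"c":null}}` instead of NULL.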



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9224) CBO (Calcite Return Path): Inline Table, Properties

2015-01-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268676#comment-14268676
 ] 

Hive QA commented on HIVE-9224:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12690584/HIVE-9224.2.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 6732 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cast_qualified_types
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_acid
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_main
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2281/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2281/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2281/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12690584 - PreCommit-HIVE-TRUNK-Build

> CBO (Calcite Return Path): Inline Table, Properties
> ---
>
> Key: HIVE-9224
> URL: https://issues.apache.org/jira/browse/HIVE-9224
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Fix For: 0.15.0
>
> Attachments: HIVE-9224.1.patch, HIVE-9224.2.patch, HIVE-9224.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9278) Cached expression feature broken in one case

2015-01-07 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268669#comment-14268669
 ] 

Navis commented on HIVE-9278:
-

Should be included in hive-0.14.1

> Cached expression feature broken in one case
> 
>
> Key: HIVE-9278
> URL: https://issues.apache.org/jira/browse/HIVE-9278
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Navis
>Priority: Blocker
> Attachments: HIVE-9278.1.patch.txt
>
>
> Different query result depending on whether hive.cache.expr.evaluation is 
> true or false.  When true, no query results are produced (this is wrong).
> The q file:
> {noformat}
> set hive.cache.expr.evaluation=true;
> CREATE TABLE cache_expr_repro (date_str STRING);
> LOAD DATA LOCAL INPATH '../../data/files/cache_expr_repro.txt' INTO TABLE 
> cache_expr_repro;
> SELECT MONTH(date_str) AS `mon`, CAST((MONTH(date_str) - 1) / 3 + 1 AS int) 
> AS `quarter`,   YEAR(date_str) AS `year` FROM cache_expr_repro WHERE 
> ((CAST((MONTH(date_str) - 1) / 3 + 1 AS int) = 1) AND (YEAR(date_str) = 
> 2015)) GROUP BY MONTH(date_str), CAST((MONTH(date_str) - 1) / 3 + 1 AS int),  
>  YEAR(date_str) ;
> {noformat}
> cache_expr_repro.txt
> {noformat}
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-01-01 00:00:00
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9278) Cached expression feature broken in one case

2015-01-07 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9278:

Priority: Blocker  (was: Critical)

> Cached expression feature broken in one case
> 
>
> Key: HIVE-9278
> URL: https://issues.apache.org/jira/browse/HIVE-9278
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Navis
>Priority: Blocker
> Attachments: HIVE-9278.1.patch.txt
>
>
> Different query result depending on whether hive.cache.expr.evaluation is 
> true or false.  When true, no query results are produced (this is wrong).
> The q file:
> {noformat}
> set hive.cache.expr.evaluation=true;
> CREATE TABLE cache_expr_repro (date_str STRING);
> LOAD DATA LOCAL INPATH '../../data/files/cache_expr_repro.txt' INTO TABLE 
> cache_expr_repro;
> SELECT MONTH(date_str) AS `mon`, CAST((MONTH(date_str) - 1) / 3 + 1 AS int) 
> AS `quarter`,   YEAR(date_str) AS `year` FROM cache_expr_repro WHERE 
> ((CAST((MONTH(date_str) - 1) / 3 + 1 AS int) = 1) AND (YEAR(date_str) = 
> 2015)) GROUP BY MONTH(date_str), CAST((MONTH(date_str) - 1) / 3 + 1 AS int),  
>  YEAR(date_str) ;
> {noformat}
> cache_expr_repro.txt
> {noformat}
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-01-01 00:00:00
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7550) Extend cached evaluation to multiple expressions

2015-01-07 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268664#comment-14268664
 ] 

Navis commented on HIVE-7550:
-

Sure, but HIVE-9278 should be included first.

> Extend cached evaluation to multiple expressions
> 
>
> Key: HIVE-7550
> URL: https://issues.apache.org/jira/browse/HIVE-7550
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-7550.1.patch.txt
>
>
> Currently, hive.cache.expr.evaluation caches per expression. But cache 
> context might be shared for multiple expressions. 





[jira] [Commented] (HIVE-9278) Cached expression feature broken in one case

2015-01-07 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268662#comment-14268662
 ] 

Navis commented on HIVE-9278:
-

[~ashutoshc] In caching, expression identity was checked by comparing
toString(). But for UDFs (as opposed to GenericUDFs), toString() always returns
the same class name (org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge),
so different expressions end up sharing one cache entry. In the testcase,
"length(key)" and "reverse(key)" are both UDFs, so length(key) always evaluated
to the same cached value as reverse(key). Identity is now checked correctly
against the ExprNodeDesc itself (via its isSame() method).
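The collision Navis describes can be sketched outside Hive: if a cache keys expressions by a string that is identical for every bridged UDF, distinct expressions alias each other, while an isSame()-style structural comparison keeps them apart. The classes below are simplified stand-ins for illustration, not the actual Hive ExprNodeDesc/GenericUDFBridge code.

```java
import java.util.HashMap;
import java.util.Map;

public class CacheKeyDemo {
    // Stand-in for an expression node; 'udfName' plays the role of the real UDF.
    static final class Expr {
        final String udfName;
        final String column;
        Expr(String udfName, String column) { this.udfName = udfName; this.column = column; }
        // Mimics the buggy behavior: toString() only exposes the bridge class
        // name, which is the same for every non-generic UDF.
        @Override public String toString() { return "GenericUDFBridge(" + column + ")"; }
        // isSame()-style structural comparison used by the fix.
        boolean isSame(Expr o) { return udfName.equals(o.udfName) && column.equals(o.column); }
    }

    // Cache keyed by toString(): length(key) and reverse(key) collide.
    static boolean collides(Expr a, Expr b) {
        Map<String, Expr> cache = new HashMap<>();
        cache.put(a.toString(), a);
        return cache.containsKey(b.toString()); // true => b would reuse a's cached value
    }
}
```

With these stand-ins, collides(length(key), reverse(key)) is true even though the two expressions are structurally different, which is exactly the wrong-result scenario in the testcase.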

> Cached expression feature broken in one case
> 
>
> Key: HIVE-9278
> URL: https://issues.apache.org/jira/browse/HIVE-9278
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Matt McCline
>Assignee: Navis
>Priority: Critical
> Attachments: HIVE-9278.1.patch.txt
>
>
> Different query result depending on whether hive.cache.expr.evaluation is 
> true or false.  When true, no query results are produced (this is wrong).
> The q file:
> {noformat}
> set hive.cache.expr.evaluation=true;
> CREATE TABLE cache_expr_repro (date_str STRING);
> LOAD DATA LOCAL INPATH '../../data/files/cache_expr_repro.txt' INTO TABLE 
> cache_expr_repro;
> SELECT MONTH(date_str) AS `mon`, CAST((MONTH(date_str) - 1) / 3 + 1 AS int) 
> AS `quarter`,   YEAR(date_str) AS `year` FROM cache_expr_repro WHERE 
> ((CAST((MONTH(date_str) - 1) / 3 + 1 AS int) = 1) AND (YEAR(date_str) = 
> 2015)) GROUP BY MONTH(date_str), CAST((MONTH(date_str) - 1) / 3 + 1 AS int),  
>  YEAR(date_str) ;
> {noformat}
> cache_expr_repro.txt
> {noformat}
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-01-01 00:00:00
> 2015-02-01 00:00:00
> 2015-02-01 00:00:00
> 2015-01-01 00:00:00
> 2015-01-01 00:00:00
> {noformat}





[jira] [Updated] (HIVE-9296) Need to add schema upgrade changes for queueing events in the database

2015-01-07 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-9296:
-
Fix Version/s: 0.15.0
   Status: Patch Available  (was: Open)

> Need to add schema upgrade changes for queueing events in the database
> --
>
> Key: HIVE-9296
> URL: https://issues.apache.org/jira/browse/HIVE-9296
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.15.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 0.15.0
>
> Attachments: HIVE-9296.patch
>
>
> HIVE-9174 added the ability to queue notification events in the database, but 
> did not include the schema upgrade scripts.
> Also, in the thrift changes the convention was not followed properly in 
> naming the thrift methods.  HIVE-9174 used camel case, whereas the thrift 
> methods use all lower case separated by underscores.
> Both of these issues should be fixed.





[jira] [Updated] (HIVE-4790) MapredLocalTask task does not make virtual columns

2015-01-07 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4790:

Attachment: HIVE-4790.13.patch.txt

> MapredLocalTask task does not make virtual columns
> --
>
> Key: HIVE-4790
> URL: https://issues.apache.org/jira/browse/HIVE-4790
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: D11511.3.patch, D11511.4.patch, HIVE-4790.10.patch.txt, 
> HIVE-4790.11.patch.txt, HIVE-4790.12.patch.txt, HIVE-4790.13.patch.txt, 
> HIVE-4790.5.patch.txt, HIVE-4790.6.patch.txt, HIVE-4790.7.patch.txt, 
> HIVE-4790.8.patch.txt, HIVE-4790.9.patch.txt, HIVE-4790.D11511.1.patch, 
> HIVE-4790.D11511.2.patch
>
>
> From mailing list, 
> http://www.mail-archive.com/user@hive.apache.org/msg08264.html
> {noformat}
> SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON 
> b.rownumber = a.number;
> fails with this error:
>  
> > SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON b.rownumber = 
> a.number;
> Automatically selecting local only mode for query
> Total MapReduce jobs = 1
> setting HADOOP_USER_NAMEpmarron
> 13/06/25 10:52:56 WARN conf.HiveConf: DEPRECATED: Configuration property 
> hive.metastore.local no longer has any effect. Make sure to provide a valid 
> value for hive.metastore.uris if you are connecting to a remote metastore.
> Execution log at: /tmp/pmarron/.log
> 2013-06-25 10:52:56 Starting to launch local task to process map join;
>   maximum memory = 932118528
> java.lang.RuntimeException: cannot find field block__offset__inside__file 
> from [0:rownumber, 1:offset]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:366)
> at 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldRef(LazySimpleStructObjectInspector.java:168)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.DelegatedStructObjectInspector.getStructFieldRef(DelegatedStructObjectInspector.java:74)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
> at 
> org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:68)
> at 
> org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.initializeOp(HashTableSinkOperator.java:222)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:186)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
> at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:394)
> at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
> at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Execution failed with exit status: 2
> {noformat}





[jira] [Updated] (HIVE-9296) Need to add schema upgrade changes for queueing events in the database

2015-01-07 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-9296:
-
Attachment: HIVE-9296.patch

This patch fixes the thrift function name issues and adds the two new tables to 
the metastore creation and upgrade scripts.

> Need to add schema upgrade changes for queueing events in the database
> --
>
> Key: HIVE-9296
> URL: https://issues.apache.org/jira/browse/HIVE-9296
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.15.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 0.15.0
>
> Attachments: HIVE-9296.patch
>
>
> HIVE-9174 added the ability to queue notification events in the database, but 
> did not include the schema upgrade scripts.
> Also, in the thrift changes the convention was not followed properly in 
> naming the thrift methods.  HIVE-9174 used camel case, whereas the thrift 
> methods use all lower case separated by underscores.
> Both of these issues should be fixed.





[jira] [Commented] (HIVE-9299) Reuse Configuration in AvroSerdeUtils

2015-01-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268646#comment-14268646
 ] 

Ashutosh Chauhan commented on HIVE-9299:


+1

> Reuse Configuration in AvroSerdeUtils
> -
>
> Key: HIVE-9299
> URL: https://issues.apache.org/jira/browse/HIVE-9299
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.14.0, 0.13.1, 0.15.0
>Reporter: Nitay Joffe
>Assignee: Nitay Joffe
> Fix For: 0.15.0
>
> Attachments: HIVE-9299.patch
>
>
> I am getting an issue where the original Configuration has some parameters 
> needed to read the remote Avro schema (specifically S3 keys).
> Doing new Configuration doesn't pick it up because the keys are not on the 
> classpath.
> We should reuse the Configuration already present in callers.
> I'm using Hive/Avro from Spark so it'd be nice if we could put this into Hive 
> 0.13 since that's what Spark's built against.
> See also https://github.com/jghoman/haivvreo/pull/30





[jira] [Commented] (HIVE-8327) mvn site -Pfindbugs

2015-01-07 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268637#comment-14268637
 ] 

Szehon Ho commented on HIVE-8327:
-

Sorry, I don't have the environment right now to give it a try.

If it doesn't increase build time too much, it should be possible for HiveQA to
run it before and after the patch and parse the summary reports to do a
comparison.

Alternatively, it would be much simpler if HiveQA just printed out the summary
report.

> mvn site -Pfindbugs
> ---
>
> Key: HIVE-8327
> URL: https://issues.apache.org/jira/browse/HIVE-8327
> Project: Hive
>  Issue Type: Test
>  Components: Diagnosability
>Reporter: Gopal V
>Assignee: Gopal V
> Fix For: 0.15.0
>
> Attachments: HIVE-8327.1.patch, HIVE-8327.2.patch, ql-findbugs.html
>
>
> HIVE-3099 originally added findbugs into the old ant build.
> Get basic findbugs working for the maven build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9188) BloomFilter in ORC row group index

2015-01-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268632#comment-14268632
 ] 

Prasanth Jayachandran commented on HIVE-9188:
-

[~owen.omalley] The current patch has bloom filters at all 3 levels. The size
is kept constant across all 3 levels, but the fpp for a stripe will be >0.05
(assuming >10k unique items) and for the file level it will be much worse. With
this we get good row group elimination and reasonably good stripe elimination.
I can drop the file-level bloom filter, which we don't use for any purpose.

The merging of disk ranges happens after we pick the row groups that satisfy
the SARG (readPartialDataStreams() runs after pickRowGroups()). But we need the
bloom filter before that, to eliminate row groups.
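The fpp degradation across levels follows from the standard bloom filter estimate p ≈ (1 − e^(−kn/m))^k: holding the bit size m and hash count k fixed while the item count n grows roughly 10x (row group → stripe → file) drives the false-positive rate toward 1. A quick sketch of that arithmetic (the m, n, and k values are illustrative, not taken from the patch):

```java
public class BloomFpp {
    // Estimated false-positive probability of a bloom filter with
    // m bits, n inserted items, and k hash functions: (1 - e^(-kn/m))^k.
    static double fpp(long m, long n, int k) {
        return Math.pow(1.0 - Math.exp(-(double) k * n / m), k);
    }
}
```

For example, a filter sized for ~10k items at fpp ≈ 0.05 becomes nearly useless once ~100k items (a stripe or file's worth) are inserted into the same number of bits.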

> BloomFilter in ORC row group index
> --
>
> Key: HIVE-9188
> URL: https://issues.apache.org/jira/browse/HIVE-9188
> Project: Hive
>  Issue Type: New Feature
>  Components: File Formats
>Affects Versions: 0.15.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>  Labels: orcfile
> Attachments: HIVE-9188.1.patch, HIVE-9188.2.patch, HIVE-9188.3.patch, 
> HIVE-9188.4.patch
>
>
> BloomFilters are a well-known probabilistic data structure for set membership 
> checking. We can use bloom filters in ORC index for better row group pruning. 
> Currently, ORC row group index uses min/max statistics to eliminate row 
> groups (stripes as well) that do not satisfy predicate condition specified in 
> the query. But in some cases, the efficiency of min/max based elimination is 
> not optimal (unsorted columns with wide range of entries). Bloom filters can 
> be an effective and efficient alternative for row group/split elimination for 
> point queries or queries with IN clause.





[jira] [Commented] (HIVE-4022) Structs and struct fields cannot be NULL in INSERT statements

2015-01-07 Thread Alexander Behm (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268625#comment-14268625
 ] 

Alexander Behm commented on HIVE-4022:
--

Easier workaround:
IF(false, named_struct("a", 1), NULL)


> Structs and struct fields cannot be NULL in INSERT statements
> -
>
> Key: HIVE-4022
> URL: https://issues.apache.org/jira/browse/HIVE-4022
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Michael Malak
>
> Originally thought to be Avro-specific, and first noted with respect to 
> HIVE-3528 "Avro SerDe doesn't handle serializing Nullable types that require 
> access to a Schema", it turns out even native Hive tables cannot store NULL 
> in a STRUCT field or for the entire STRUCT itself, at least when the NULL is 
> specified directly in the INSERT statement.
> Again, this affects both Avro-backed tables and native Hive tables.
> ***For native Hive tables:
> The following:
> echo 1,2 >twovalues.csv
> hive
> CREATE TABLE tc (x INT, y INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> LOAD DATA LOCAL INPATH 'twovalues.csv' INTO TABLE tc;
> CREATE TABLE oc (z STRUCT);
> INSERT INTO TABLE oc SELECT null FROM tc;
> produces the error
> FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target 
> table because column number/types are different 'oc': Cannot convert column 0 
> from void to struct.
> The following:
> INSERT INTO TABLE oc SELECT named_struct('a', null, 'b', null) FROM tc;
> produces the error:
> FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target 
> table because column number/types are different 'oc': Cannot convert column 0 
> from struct to struct.
> ***For Avro:
> In HIVE-3528, there is in fact a null-struct test case in line 14 of
> https://github.com/apache/hive/blob/15cc604bf10f4c2502cb88fb8bb3dcd45647cf2c/data/files/csv.txt
> The test script at
> https://github.com/apache/hive/blob/12d6f3e7d21f94e8b8490b7c6d291c9f4cac8a4f/ql/src/test/queries/clientpositive/avro_nullable_fields.q
> does indeed work.  But in that test, the query gets all of its data from a 
> test table verbatim:
> INSERT OVERWRITE TABLE as_avro SELECT * FROM test_serializer;
> If instead we stick in a hard-coded null for the struct directly into the 
> query, it fails:
> INSERT OVERWRITE TABLE as_avro SELECT string1, int1, tinyint1, smallint1, 
> bigint1, boolean1, float1, double1, list1, map1, null, enum1, nullableint, 
> bytes1, fixed1 FROM test_serializer;
> with the following error:
> FAILED: SemanticException [Error 10044]: Line 1:23 Cannot insert into target 
> table because column number/types are different 'as_avro': Cannot convert 
> column 10 from void to struct.
> Note, though, that substituting a hard-coded null for string1 (and restoring 
> struct1 into the query) does work:
> INSERT OVERWRITE TABLE as_avro SELECT null, int1, tinyint1, smallint1, 
> bigint1, boolean1, float1, double1, list1, map1, struct1, enum1, nullableint, 
> bytes1, fixed1 FROM test_serializer;





[jira] [Updated] (HIVE-9039) Support Union Distinct

2015-01-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9039:
--
Attachment: HIVE-9039.10.patch

> Support Union Distinct
> --
>
> Key: HIVE-9039
> URL: https://issues.apache.org/jira/browse/HIVE-9039
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, 
> HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, 
> HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, 
> HIVE-9039.09.patch, HIVE-9039.10.patch
>
>
> Current version (Hive 0.14) does not support union (or union distinct). It 
> only supports union all. In this patch, we try to add this new feature by 
> rewriting union distinct to union all followed by group by.





[jira] [Updated] (HIVE-9039) Support Union Distinct

2015-01-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9039:
--
Status: Patch Available  (was: Open)

> Support Union Distinct
> --
>
> Key: HIVE-9039
> URL: https://issues.apache.org/jira/browse/HIVE-9039
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, 
> HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, 
> HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, 
> HIVE-9039.09.patch, HIVE-9039.10.patch
>
>
> Current version (Hive 0.14) does not support union (or union distinct). It 
> only supports union all. In this patch, we try to add this new feature by 
> rewriting union distinct to union all followed by group by.





[jira] [Updated] (HIVE-9039) Support Union Distinct

2015-01-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9039:
--
Attachment: HIVE-9039.10.patch

After discussing with [~jpullokkaran], we are going to separate union distinct
from the union order by/limit bug. This patch is for the union distinct
implementation only and should be committed after select distinct *

> Support Union Distinct
> --
>
> Key: HIVE-9039
> URL: https://issues.apache.org/jira/browse/HIVE-9039
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, 
> HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, 
> HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch
>
>
> Current version (Hive 0.14) does not support union (or union distinct). It 
> only supports union all. In this patch, we try to add this new feature by 
> rewriting union distinct to union all followed by group by.





[jira] [Updated] (HIVE-9039) Support Union Distinct

2015-01-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9039:
--
Attachment: (was: HIVE-9039.10.patch)

> Support Union Distinct
> --
>
> Key: HIVE-9039
> URL: https://issues.apache.org/jira/browse/HIVE-9039
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, 
> HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, 
> HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch
>
>
> Current version (Hive 0.14) does not support union (or union distinct). It 
> only supports union all. In this patch, we try to add this new feature by 
> rewriting union distinct to union all followed by group by.





[jira] [Updated] (HIVE-9039) Support Union Distinct

2015-01-07 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9039:
--
Status: Open  (was: Patch Available)

> Support Union Distinct
> --
>
> Key: HIVE-9039
> URL: https://issues.apache.org/jira/browse/HIVE-9039
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, 
> HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, 
> HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch
>
>
> Current version (Hive 0.14) does not support union (or union distinct). It 
> only supports union all. In this patch, we try to add this new feature by 
> rewriting union distinct to union all followed by group by.





[jira] [Updated] (HIVE-8814) Support custom virtual columns from serde implementation

2015-01-07 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-8814:

Attachment: HIVE-8814.7.patch.txt

> Support custom virtual columns from serde implementation
> 
>
> Key: HIVE-8814
> URL: https://issues.apache.org/jira/browse/HIVE-8814
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-8814.1.patch.txt, HIVE-8814.2.patch.txt, 
> HIVE-8814.3.patch.txt, HIVE-8814.4.patch.txt, HIVE-8814.5.patch.txt, 
> HIVE-8814.6.patch.txt, HIVE-8814.7.patch.txt
>
>
> Currently, virtual columns are fixed in hive. But some serdes can provide 
> more virtual columns if needed. Idea from 
> https://issues.apache.org/jira/browse/HIVE-7513?focusedCommentId=14073912&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14073912





[jira] [Commented] (HIVE-9300) Revert HIVE-9049 and make TCompactProtocol configurable

2015-01-07 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268593#comment-14268593
 ] 

Brock Noland commented on HIVE-9300:


+1 pending tests

Thank you Prasanth!

> Revert HIVE-9049 and make TCompactProtocol configurable
> ---
>
> Key: HIVE-9300
> URL: https://issues.apache.org/jira/browse/HIVE-9300
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.15.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-9300.1.patch, HIVE-9300.2.patch
>
>
> Revert HIVE-9049 as it breaks compatibility. Make TCompactProtocol 
> configurable with default disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9104) windowing.q failed when mapred.reduce.tasks is set to larger than one

2015-01-07 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-9104:
---
Attachment: HIVE-9104.patch

The issue seems to be that {{FirstValStreamingFixedWindow::terminate}} does
not expect {{fb.skipNulls}} to be false AND {{s.valueChain.size() == 0}}, but
this can happen when there are multiple reduce tasks, some of which get 0 rows.
I changed the code to set the ValIndexPair to null in that case.

Also, I added SORT_QUERY_RESULTS to the qfile, since I got some ordering
differences after regenerating the golden file.

[~rhbutani] Can you take a look at this patch and give some suggestions?
Thanks.
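The NoSuchElementException in the quoted stack trace comes from ArrayDeque.getFirst() being called on an empty deque: a reduce task that receives 0 rows leaves the value chain empty. A minimal illustration of the failure mode and of the kind of guard the fix adds (generic Java, not the actual GenericUDAFFirstValue code):

```java
import java.util.ArrayDeque;

public class EmptyPartitionDemo {
    // Unguarded access, as in the failing terminate(): throws
    // java.util.NoSuchElementException when the partition is empty.
    static Integer firstUnguarded(ArrayDeque<Integer> valueChain) {
        return valueChain.getFirst();
    }

    // Guarded access: an empty partition yields null instead of throwing.
    static Integer firstGuarded(ArrayDeque<Integer> valueChain) {
        return valueChain.isEmpty() ? null : valueChain.getFirst();
    }
}
```

(peekFirst() would achieve the same null-on-empty behavior as the explicit isEmpty() check.)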

> windowing.q failed when mapred.reduce.tasks is set to larger than one
> -
>
> Key: HIVE-9104
> URL: https://issues.apache.org/jira/browse/HIVE-9104
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chao
>Assignee: Chao
> Attachments: HIVE-9104.patch
>
>
> Test {{windowing.q}} is actually not enabled in Spark branch - in test 
> configurations it is {{windowing.q.q}}.
> I just ran this test, and the query
> {code}
> -- 12. testFirstLastWithWhere
> select  p_mfgr,p_name, p_size,
> rank() over(distribute by p_mfgr sort by p_name) as r,
> sum(p_size) over (distribute by p_mfgr sort by p_name rows between current 
> row and current row) as s2,
> first_value(p_size) over w1 as f,
> last_value(p_size, false) over w1 as l
> from part
> where p_mfgr = 'Manufacturer#3'
> window w1 as (distribute by p_mfgr sort by p_name rows between 2 preceding 
> and 2 following);
> {code}
> failed with the following exception:
> {noformat}
> java.lang.RuntimeException: Hive Runtime Error while closing operators: null
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:446)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.closeRecordProcessor(HiveReduceFunctionResultList.java:58)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
>   at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.NoSuchElementException
>   at java.util.ArrayDeque.getFirst(ArrayDeque.java:318)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFFirstValue$FirstValStreamingFixedWindow.terminate(GenericUDAFFirstValue.java:290)
>   at 
> org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:413)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
>   at org.apache.hadoop.hive.ql.exec.PTFOperator.closeOp(PTFOperator.java:95)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:431)
>   ... 15 more
> {noformat}
> We need to find out:
> - Since which commit this test started failing, and
> - Why it fails





[jira] [Updated] (HIVE-9104) windowing.q failed when mapred.reduce.tasks is set to larger than one

2015-01-07 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-9104:
---
Status: Patch Available  (was: Open)

> windowing.q failed when mapred.reduce.tasks is set to larger than one
> -
>
> Key: HIVE-9104
> URL: https://issues.apache.org/jira/browse/HIVE-9104
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chao
>Assignee: Chao
> Attachments: HIVE-9104.patch
>
>
> Test {{windowing.q}} is actually not enabled in Spark branch - in test 
> configurations it is {{windowing.q.q}}.
> I just ran this test, and the query
> {code}
> -- 12. testFirstLastWithWhere
> select  p_mfgr,p_name, p_size,
> rank() over(distribute by p_mfgr sort by p_name) as r,
> sum(p_size) over (distribute by p_mfgr sort by p_name rows between current 
> row and current row) as s2,
> first_value(p_size) over w1 as f,
> last_value(p_size, false) over w1 as l
> from part
> where p_mfgr = 'Manufacturer#3'
> window w1 as (distribute by p_mfgr sort by p_name rows between 2 preceding 
> and 2 following);
> {code}
> failed with the following exception:
> {noformat}
> java.lang.RuntimeException: Hive Runtime Error while closing operators: null
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:446)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.closeRecordProcessor(HiveReduceFunctionResultList.java:58)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
>   at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.NoSuchElementException
>   at java.util.ArrayDeque.getFirst(ArrayDeque.java:318)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFFirstValue$FirstValStreamingFixedWindow.terminate(GenericUDAFFirstValue.java:290)
>   at 
> org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:413)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
>   at org.apache.hadoop.hive.ql.exec.PTFOperator.closeOp(PTFOperator.java:95)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:431)
>   ... 15 more
> {noformat}
> We need to find out:
> - Since which commit this test started failing, and
> - Why it fails





[jira] [Commented] (HIVE-9301) Potential null dereference in MoveTask#createTargetPath()

2015-01-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268580#comment-14268580
 ] 

Ted Yu commented on HIVE-9301:
--

I stepped through similar code in a debugger - the single ampersand prevents
short-circuit evaluation of the expression, leading to the NPE.
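The difference is easy to demonstrate: {{&&}} stops evaluating once the left operand is false, while {{&}} always evaluates both operands, so the null check on the left offers no protection. A minimal reproduction mirroring the mkDirPath pattern (plain Java, not the actual MoveTask code):

```java
public class ShortCircuitDemo {
    // Safe: '&&' short-circuits, so s.length() is never called when s is null.
    static boolean safeCheck(String s) {
        return s != null && s.length() > 0;
    }

    // Buggy: '&' evaluates both sides, so s.length() throws a
    // NullPointerException when s is null, despite the null check.
    static boolean eagerCheck(String s) {
        return s != null & s.length() > 0;
    }
}
```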

> Potential null dereference in MoveTask#createTargetPath()
> -
>
> Key: HIVE-9301
> URL: https://issues.apache.org/jira/browse/HIVE-9301
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HIVE-9301.patch
>
>
> {code}
> if (mkDirPath != null & !fs.exists(mkDirPath)) {
> {code}
> '&&' should be used instead of single ampersand.
> If mkDirPath is null, fs.exists() would still be called - resulting in NPE.
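
For readers less familiar with the distinction, here is a minimal stand-alone sketch (hypothetical code, not the actual MoveTask implementation; {{exists()}} is a stand-in for {{fs.exists()}}) that reproduces the failure mode:

```java
// Sketch: why a non-short-circuiting '&' evaluates both operands
// even when the left-hand null check is false.
public class ShortCircuitDemo {
    // Stand-in for fs.exists(); dereferences its argument like the real call.
    static boolean exists(String path) {
        return path.length() > 0;
    }

    public static void main(String[] args) {
        String mkDirPath = null;

        // Safe: '&&' short-circuits, so exists() is never called on null.
        boolean safe = (mkDirPath != null) && !exists(mkDirPath);
        System.out.println("safe = " + safe); // prints "safe = false"

        // Buggy: '&' evaluates both sides, so exists(null) throws NPE.
        try {
            boolean buggy = (mkDirPath != null) & !exists(mkDirPath);
            System.out.println("buggy = " + buggy);
        } catch (NullPointerException e) {
            System.out.println("NPE, as the report predicts");
        }
    }
}
```

With {{&&}} the right-hand side is skipped as soon as the left side is false, which is exactly the guard the patch restores.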





[jira] [Commented] (HIVE-9188) BloomFilter in ORC row group index

2015-01-07 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268573#comment-14268573
 ] 

Owen O'Malley commented on HIVE-9188:
-

[~prasanth_j] Ok, I thought you said you were going to have bloom 
filters at the row group, stripe, and file levels. I agree completely that ORC 
should only have bloom filters at the row group level.

Having the bloom filter as a separate stream means the reader does *far* less 
IO. It will still go through the code that merges adjacent ranges into a single 
read, so if it needs all of the indexes and bloom filters for all of the 
columns, the reader can still fetch them in a single IO operation. On the other 
hand, if it doesn't need any bloom filter, it shouldn't have to load the extra 
MB of data it doesn't need.

> BloomFilter in ORC row group index
> --
>
> Key: HIVE-9188
> URL: https://issues.apache.org/jira/browse/HIVE-9188
> Project: Hive
>  Issue Type: New Feature
>  Components: File Formats
>Affects Versions: 0.15.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>  Labels: orcfile
> Attachments: HIVE-9188.1.patch, HIVE-9188.2.patch, HIVE-9188.3.patch, 
> HIVE-9188.4.patch
>
>
> Bloom filters are a well-known probabilistic data structure for set-membership 
> checking. We can use Bloom filters in the ORC index for better row group pruning. 
> Currently, the ORC row group index uses min/max statistics to eliminate row 
> groups (and stripes) that do not satisfy the predicate specified in the query. 
> But in some cases min/max based elimination is not effective (e.g. unsorted 
> columns with a wide range of entries). Bloom filters can be an effective and 
> efficient alternative for row group/split elimination for point queries or 
> queries with an IN clause.
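
As a rough illustration of the idea (a toy sketch, not ORC's actual implementation, which uses murmur-style hashing and tuned sizing), a Bloom filter answers "definitely absent" or "possibly present", which is what makes it usable for row-group pruning:

```java
import java.util.BitSet;

// Toy Bloom filter sketch: how a reader could use one filter per row group
// to skip groups for a point lookup. Sizes and hashing are illustrative only.
public class RowGroupBloomDemo {
    static final int BITS = 1024;
    static final int HASHES = 3;

    final BitSet bits = new BitSet(BITS);

    void add(String value) {
        for (int i = 0; i < HASHES; i++) {
            bits.set(index(value, i));
        }
    }

    // May return true for values never added (false positive),
    // but never returns false for a value that was added.
    boolean mightContain(String value) {
        for (int i = 0; i < HASHES; i++) {
            if (!bits.get(index(value, i))) {
                return false; // definitely absent -> the row group can be skipped
            }
        }
        return true;
    }

    static int index(String value, int seed) {
        // Simple seeded hash; a real filter would use murmur-style hashing.
        return Math.floorMod(value.hashCode() * 31 + seed, BITS);
    }

    public static void main(String[] args) {
        RowGroupBloomDemo rowGroup = new RowGroupBloomDemo();
        rowGroup.add("Manufacturer#3");

        // Predicate "p_mfgr = 'Manufacturer#5'": if the filter says absent,
        // the whole row group is skipped (a false positive is merely wasted IO).
        System.out.println(rowGroup.mightContain("Manufacturer#5"));

        // An added value is always reported as (possibly) present.
        System.out.println(rowGroup.mightContain("Manufacturer#3")); // prints "true"
    }
}
```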





[jira] [Commented] (HIVE-9301) Potential null dereference in MoveTask#createTargetPath()

2015-01-07 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268564#comment-14268564
 ] 

Xuefu Zhang commented on HIVE-9301:
---

+1

> Potential null dereference in MoveTask#createTargetPath()
> -
>
> Key: HIVE-9301
> URL: https://issues.apache.org/jira/browse/HIVE-9301
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HIVE-9301.patch
>
>
> {code}
> if (mkDirPath != null & !fs.exists(mkDirPath)) {
> {code}
> '&&' should be used instead of single ampersand.
> If mkDirPath is null, fs.exists() would still be called - resulting in NPE.





[jira] [Updated] (HIVE-9302) Beeline add jar local to client

2015-01-07 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9302:
---
Description: 
At present, if a Beeline user uses {{add jar}}, the path they give is actually on 
the HS2 server. It'd be great to allow Beeline users to add local jars as well.

It might be useful to do this in the JDBC driver itself.

  was:At present if a beeline user uses {{add jar}} the path they give is 
actually on the HS2 server. It'd be great to allow beeline users to add local 
jars as well.


> Beeline add jar local to client
> ---
>
> Key: HIVE-9302
> URL: https://issues.apache.org/jira/browse/HIVE-9302
> Project: Hive
>  Issue Type: New Feature
>Reporter: Brock Noland
>
> At present if a beeline user uses {{add jar}} the path they give is actually 
> on the HS2 server. It'd be great to allow beeline users to add local jars as 
> well.
> It might be useful to do this in the jdbc driver itself.





[jira] [Updated] (HIVE-9302) Beeline add jar local to client

2015-01-07 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9302:
---
Summary: Beeline add jar local to client  (was: Beeline add jar)

> Beeline add jar local to client
> ---
>
> Key: HIVE-9302
> URL: https://issues.apache.org/jira/browse/HIVE-9302
> Project: Hive
>  Issue Type: New Feature
>Reporter: Brock Noland
>
> At present if a beeline user uses {{add jar}} the path they give is actually 
> on the HS2 server. It'd be great to allow beeline users to add local jars as 
> well.





[jira] [Created] (HIVE-9302) Beeline add jar

2015-01-07 Thread Brock Noland (JIRA)
Brock Noland created HIVE-9302:
--

 Summary: Beeline add jar
 Key: HIVE-9302
 URL: https://issues.apache.org/jira/browse/HIVE-9302
 Project: Hive
  Issue Type: New Feature
Reporter: Brock Noland


At present if a beeline user uses {{add jar}} the path they give is actually on 
the HS2 server. It'd be great to allow beeline users to add local jars as well.





[jira] [Comment Edited] (HIVE-6179) OOM occurs when query spans to a large number of partitions

2015-01-07 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268550#comment-14268550
 ] 

Brock Noland edited comment on HIVE-6179 at 1/8/15 12:09 AM:
-

I think this is actually about the API: either (1) the client gets a remote 
iterator over the list of partitions, (2) we provide an API that returns only 
the information the client actually needs, removing the bloat, or (3) we find 
a more compact way to transfer this data.

The iterator idea is discussed in HIVE-7195.
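
Option (1) could look roughly like the sketch below. Everything here is hypothetical: {{PagedStore}} and {{listPartitionsPaged}} are invented stand-ins for a metastore RPC, used only to show how an iterator keeps at most one page of partition metadata in client memory:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class PartitionIteratorSketch {
    // Stand-in for a metastore RPC that returns one page of partition names.
    interface PagedStore {
        List<String> listPartitionsPaged(String table, int offset, int limit);
    }

    // Wraps the paged RPC in an Iterator so the client never holds more
    // than 'pageSize' partitions in memory at a time.
    static Iterator<String> partitionIterator(PagedStore store, String table, int pageSize) {
        return new Iterator<String>() {
            List<String> page = store.listPartitionsPaged(table, 0, pageSize);
            int offset = 0, pos = 0;

            @Override
            public boolean hasNext() {
                if (pos < page.size()) return true;
                if (page.size() < pageSize) return false; // short page: no more data
                offset += page.size();
                page = store.listPartitionsPaged(table, offset, pageSize);
                pos = 0;
                return !page.isEmpty();
            }

            @Override
            public String next() {
                return page.get(pos++);
            }
        };
    }

    public static void main(String[] args) {
        // Fake store with 7 partitions, served in pages of 3.
        List<String> all = new ArrayList<>();
        for (int i = 0; i < 7; i++) all.add("ds=" + i);
        PagedStore store = (t, off, lim) ->
                all.subList(Math.min(off, all.size()), Math.min(off + lim, all.size()));

        int count = 0;
        Iterator<String> it = partitionIterator(store, "t", 3);
        while (it.hasNext()) { it.next(); count++; }
        System.out.println(count); // prints 7
    }
}
```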


was (Author: brocknoland):
I think this is actually about having an API so either (1) the client can get a 
remote iterator over the list of partitions (2) providing an API which gives 
only the information required by the client so as to remove bloat or (3) 
finding a more compact way to transfer this data.

> OOM occurs when query spans to a large number of partitions
> ---
>
> Key: HIVE-6179
> URL: https://issues.apache.org/jira/browse/HIVE-6179
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>
> When executing a query against a large number of partitions, such as "select 
> count(\*) from table", an OOM error may occur because Hive fetches the metadata 
> for all partitions involved and tries to store it all in memory.
> {code}
> 2014-01-09 13:14:17,090 ERROR metastore.RetryingHMSHandler 
> (RetryingHMSHandler.java:invoke(141)) - java.lang.OutOfMemoryError: Java heap 
> space
> at java.util.Arrays.copyOf(Arrays.java:2367)
> at 
> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
> at 
> java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
> at 
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
> at java.lang.StringBuffer.append(StringBuffer.java:237)
> at 
> org.apache.derby.impl.sql.conn.GenericStatementContext.appendErrorInfo(Unknown
>  Source)
> at 
> org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(Unknown 
> Source)
> at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
> Source)
> at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
> Source)
> at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
> Source)
> at 
> org.apache.derby.impl.jdbc.EmbedResultSet.closeOnTransactionError(Unknown 
> Source)
> at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(Unknown 
> Source)
> at org.apache.derby.impl.jdbc.EmbedResultSet.next(Unknown Source) 
> at 
> org.datanucleus.store.rdbms.query.ForwardQueryResult.nextResultSetElement(ForwardQueryResult.java:191)
> at 
> org.datanucleus.store.rdbms.query.ForwardQueryResult$QueryResultIterator.next(ForwardQueryResult.java:379)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.loopJoinOrderedResult(MetaStoreDirectSql.java:641)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:410)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitions(MetaStoreDirectSql.java:205)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1433)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1420)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at 
> org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:122)
> at com.sun.proxy.$Proxy7.getPartitions(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:2128)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
> {code}
> The above error happened when executing "select count(\*)" on a table with 
> 40K partitions.





[jira] [Commented] (HIVE-6179) OOM occurs when query spans to a large number of partitions

2015-01-07 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268550#comment-14268550
 ] 

Brock Noland commented on HIVE-6179:


I think this is actually about the API: either (1) the client gets a remote 
iterator over the list of partitions, (2) we provide an API that returns only 
the information the client actually needs, removing the bloat, or (3) we find 
a more compact way to transfer this data.

> OOM occurs when query spans to a large number of partitions
> ---
>
> Key: HIVE-6179
> URL: https://issues.apache.org/jira/browse/HIVE-6179
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>
> When executing a query against a large number of partitions, such as "select 
> count(\*) from table", an OOM error may occur because Hive fetches the metadata 
> for all partitions involved and tries to store it all in memory.
> {code}
> 2014-01-09 13:14:17,090 ERROR metastore.RetryingHMSHandler 
> (RetryingHMSHandler.java:invoke(141)) - java.lang.OutOfMemoryError: Java heap 
> space
> at java.util.Arrays.copyOf(Arrays.java:2367)
> at 
> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
> at 
> java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
> at 
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
> at java.lang.StringBuffer.append(StringBuffer.java:237)
> at 
> org.apache.derby.impl.sql.conn.GenericStatementContext.appendErrorInfo(Unknown
>  Source)
> at 
> org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(Unknown 
> Source)
> at 
> org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
> Source)
> at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
> Source)
> at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
> Source)
> at 
> org.apache.derby.impl.jdbc.EmbedResultSet.closeOnTransactionError(Unknown 
> Source)
> at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(Unknown 
> Source)
> at org.apache.derby.impl.jdbc.EmbedResultSet.next(Unknown Source) 
> at 
> org.datanucleus.store.rdbms.query.ForwardQueryResult.nextResultSetElement(ForwardQueryResult.java:191)
> at 
> org.datanucleus.store.rdbms.query.ForwardQueryResult$QueryResultIterator.next(ForwardQueryResult.java:379)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.loopJoinOrderedResult(MetaStoreDirectSql.java:641)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:410)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitions(MetaStoreDirectSql.java:205)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1433)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1420)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at 
> org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:122)
> at com.sun.proxy.$Proxy7.getPartitions(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:2128)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
> {code}
> The above error happened when executing "select count(\*)" on a table with 
> 40K partitions.





[jira] [Updated] (HIVE-9301) Potential null dereference in MoveTask#createTargetPath()

2015-01-07 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HIVE-9301:
-
Assignee: Ted Yu
  Status: Patch Available  (was: Open)

> Potential null dereference in MoveTask#createTargetPath()
> -
>
> Key: HIVE-9301
> URL: https://issues.apache.org/jira/browse/HIVE-9301
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HIVE-9301.patch
>
>
> {code}
> if (mkDirPath != null & !fs.exists(mkDirPath)) {
> {code}
> '&&' should be used instead of single ampersand.
> If mkDirPath is null, fs.exists() would still be called - resulting in NPE.





[jira] [Updated] (HIVE-9301) Potential null dereference in MoveTask#createTargetPath()

2015-01-07 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HIVE-9301:
-
Attachment: HIVE-9301.patch

> Potential null dereference in MoveTask#createTargetPath()
> -
>
> Key: HIVE-9301
> URL: https://issues.apache.org/jira/browse/HIVE-9301
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
> Attachments: HIVE-9301.patch
>
>
> {code}
> if (mkDirPath != null & !fs.exists(mkDirPath)) {
> {code}
> '&&' should be used instead of single ampersand.
> If mkDirPath is null, fs.exists() would still be called - resulting in NPE.





[jira] [Updated] (HIVE-9300) Revert HIVE-9049 and make TCompactProtocol configurable

2015-01-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9300:

Attachment: HIVE-9300.2.patch

Updated to fix import ordering difference.

> Revert HIVE-9049 and make TCompactProtocol configurable
> ---
>
> Key: HIVE-9300
> URL: https://issues.apache.org/jira/browse/HIVE-9300
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.15.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-9300.1.patch, HIVE-9300.2.patch
>
>
> Revert HIVE-9049 as it breaks compatibility. Make TCompactProtocol 
> configurable with default disabled.





[jira] [Created] (HIVE-9301) Potential null dereference in MoveTask#createTargetPath()

2015-01-07 Thread Ted Yu (JIRA)
Ted Yu created HIVE-9301:


 Summary: Potential null dereference in MoveTask#createTargetPath()
 Key: HIVE-9301
 URL: https://issues.apache.org/jira/browse/HIVE-9301
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu


{code}
if (mkDirPath != null & !fs.exists(mkDirPath)) {
{code}
'&&' should be used instead of single ampersand.

If mkDirPath is null, fs.exists() would still be called - resulting in NPE.





[jira] [Commented] (HIVE-9300) Revert HIVE-9049 and make TCompactProtocol configurable

2015-01-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268528#comment-14268528
 ] 

Prasanth Jayachandran commented on HIVE-9300:
-

That's an import-order difference between IntelliJ and Eclipse. I will see if 
IntelliJ can be configured to follow the Eclipse import order.

> Revert HIVE-9049 and make TCompactProtocol configurable
> ---
>
> Key: HIVE-9300
> URL: https://issues.apache.org/jira/browse/HIVE-9300
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.15.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-9300.1.patch
>
>
> Revert HIVE-9049 as it breaks compatibility. Make TCompactProtocol 
> configurable with default disabled.





[jira] [Commented] (HIVE-9300) Revert HIVE-9049 and make TCompactProtocol configurable

2015-01-07 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268527#comment-14268527
 ] 

Brock Noland commented on HIVE-9300:


Thank you [~prasanth_j]! LGTM, but can we remove the re-ordering of the imports 
in {{HiveConf}}? I don't see a need for that.

> Revert HIVE-9049 and make TCompactProtocol configurable
> ---
>
> Key: HIVE-9300
> URL: https://issues.apache.org/jira/browse/HIVE-9300
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.15.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-9300.1.patch
>
>
> Revert HIVE-9049 as it breaks compatibility. Make TCompactProtocol 
> configurable with default disabled.





[jira] [Updated] (HIVE-9104) windowing.q failed when mapred.reduce.tasks is set to larger than one

2015-01-07 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-9104:
--
Affects Version/s: (was: spark-branch)

> windowing.q failed when mapred.reduce.tasks is set to larger than one
> -
>
> Key: HIVE-9104
> URL: https://issues.apache.org/jira/browse/HIVE-9104
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chao
>Assignee: Chao
>
> Test {{windowing.q}} is actually not enabled in the Spark branch - in the test 
> configurations it is listed as {{windowing.q.q}}.
> I just ran this test, and the query
> {code}
> -- 12. testFirstLastWithWhere
> select  p_mfgr,p_name, p_size,
> rank() over(distribute by p_mfgr sort by p_name) as r,
> sum(p_size) over (distribute by p_mfgr sort by p_name rows between current 
> row and current row) as s2,
> first_value(p_size) over w1 as f,
> last_value(p_size, false) over w1 as l
> from part
> where p_mfgr = 'Manufacturer#3'
> window w1 as (distribute by p_mfgr sort by p_name rows between 2 preceding 
> and 2 following);
> {code}
> failed with the following exception:
> {noformat}
> java.lang.RuntimeException: Hive Runtime Error while closing operators: null
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:446)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.closeRecordProcessor(HiveReduceFunctionResultList.java:58)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
>   at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.NoSuchElementException
>   at java.util.ArrayDeque.getFirst(ArrayDeque.java:318)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFFirstValue$FirstValStreamingFixedWindow.terminate(GenericUDAFFirstValue.java:290)
>   at 
> org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:413)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
>   at org.apache.hadoop.hive.ql.exec.PTFOperator.closeOp(PTFOperator.java:95)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:431)
>   ... 15 more
> {noformat}
> We need to find out:
> - Since which commit this test started failing, and
> - Why it fails
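
The immediate cause in the trace is independent of Hive: {{java.util.ArrayDeque.getFirst()}} throws {{NoSuchElementException}} on an empty deque (the exception raised inside GenericUDAFFirstValue's streaming window buffer), whereas {{peekFirst()}} returns null. A minimal reproduction:

```java
import java.util.ArrayDeque;
import java.util.NoSuchElementException;

// Sketch of the failure mode: calling getFirst() on an empty deque
// (e.g. a window buffer that was never filled) throws, peekFirst() does not.
public class EmptyDequeDemo {
    public static void main(String[] args) {
        ArrayDeque<Integer> window = new ArrayDeque<>();

        System.out.println(window.peekFirst()); // prints "null": safe on empty deque

        try {
            window.getFirst(); // throws on empty deque
        } catch (NoSuchElementException e) {
            System.out.println("NoSuchElementException, matching the stack trace");
        }
    }
}
```

So one line of investigation is why the fixed-window buffer is empty when {{terminate()}} runs, which likely ties back to how the rows are split across reducers when {{mapred.reduce.tasks}} is greater than one.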





[jira] [Commented] (HIVE-9281) Code cleanup [Spark Branch]

2015-01-07 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268523#comment-14268523
 ] 

Xuefu Zhang commented on HIVE-9281:
---

+1

> Code cleanup [Spark Branch]
> ---
>
> Key: HIVE-9281
> URL: https://issues.apache.org/jira/browse/HIVE-9281
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-9281-spark.patch, HIVE-9281.2-spark.patch
>
>
> In preparation for the merge, we need to clean up the code.
> This includes removing TODOs, fixing checkstyle violations, and removing 
> commented-out or unused code.





[jira] [Updated] (HIVE-9104) windowing.q failed when mapred.reduce.tasks is set to larger than one

2015-01-07 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-9104:
---
Summary: windowing.q failed when mapred.reduce.tasks is set to larger than 
one  (was: windowing.q failed [Spark Branch])

> windowing.q failed when mapred.reduce.tasks is set to larger than one
> -
>
> Key: HIVE-9104
> URL: https://issues.apache.org/jira/browse/HIVE-9104
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Chao
>Assignee: Chao
>
> Test {{windowing.q}} is actually not enabled in the Spark branch - in the test 
> configurations it is listed as {{windowing.q.q}}.
> I just ran this test, and the query
> {code}
> -- 12. testFirstLastWithWhere
> select  p_mfgr,p_name, p_size,
> rank() over(distribute by p_mfgr sort by p_name) as r,
> sum(p_size) over (distribute by p_mfgr sort by p_name rows between current 
> row and current row) as s2,
> first_value(p_size) over w1 as f,
> last_value(p_size, false) over w1 as l
> from part
> where p_mfgr = 'Manufacturer#3'
> window w1 as (distribute by p_mfgr sort by p_name rows between 2 preceding 
> and 2 following);
> {code}
> failed with the following exception:
> {noformat}
> java.lang.RuntimeException: Hive Runtime Error while closing operators: null
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:446)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.closeRecordProcessor(HiveReduceFunctionResultList.java:58)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
>   at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
>   at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.NoSuchElementException
>   at java.util.ArrayDeque.getFirst(ArrayDeque.java:318)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFFirstValue$FirstValStreamingFixedWindow.terminate(GenericUDAFFirstValue.java:290)
>   at 
> org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:413)
>   at 
> org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
>   at org.apache.hadoop.hive.ql.exec.PTFOperator.closeOp(PTFOperator.java:95)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:431)
>   ... 15 more
> {noformat}
> We need to find out:
> - Since which commit this test started failing, and
> - Why it fails





[jira] [Commented] (HIVE-9281) Code cleanup [Spark Branch]

2015-01-07 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268517#comment-14268517
 ] 

Szehon Ho commented on HIVE-9281:
-

Hope this works: 
[https://reviews.apache.org/r/29686/|https://reviews.apache.org/r/29686/]

> Code cleanup [Spark Branch]
> ---
>
> Key: HIVE-9281
> URL: https://issues.apache.org/jira/browse/HIVE-9281
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-9281-spark.patch, HIVE-9281.2-spark.patch
>
>
> In preparation for the merge, we need to clean up the code.
> This includes removing TODOs, fixing checkstyle violations, and removing 
> commented-out or unused code.





[jira] [Updated] (HIVE-9300) Revert HIVE-9049 and make TCompactProtocol configurable

2015-01-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9300:

Status: Patch Available  (was: Open)

> Revert HIVE-9049 and make TCompactProtocol configurable
> ---
>
> Key: HIVE-9300
> URL: https://issues.apache.org/jira/browse/HIVE-9300
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.15.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-9300.1.patch
>
>
> Revert HIVE-9049 as it breaks compatibility. Make TCompactProtocol 
> configurable with default disabled.





[jira] [Updated] (HIVE-9300) Revert HIVE-9049 and make TCompactProtocol configurable

2015-01-07 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9300:

Attachment: HIVE-9300.1.patch

[~brocknoland]/[~ashutoshc] can someone take a look?


> Revert HIVE-9049 and make TCompactProtocol configurable
> ---
>
> Key: HIVE-9300
> URL: https://issues.apache.org/jira/browse/HIVE-9300
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.15.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-9300.1.patch
>
>
> Revert HIVE-9049 as it breaks compatibility. Make TCompactProtocol 
> configurable with default disabled.





[jira] [Created] (HIVE-9300) Revert HIVE-9049 and make TCompactProtocol configurable

2015-01-07 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-9300:
---

 Summary: Revert HIVE-9049 and make TCompactProtocol configurable
 Key: HIVE-9300
 URL: https://issues.apache.org/jira/browse/HIVE-9300
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.15.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Revert HIVE-9049 as it breaks compatibility. Make TCompactProtocol configurable 
with default disabled.





[jira] [Commented] (HIVE-9049) Metastore should use TCompactProtocol as opposed to TBinaryProtocol

2015-01-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268499#comment-14268499
 ] 

Prasanth Jayachandran commented on HIVE-9049:
-

Created HIVE-9300 to track the change.

> Metastore should use TCompactProtocol as opposed to TBinaryProtocol
> ---
>
> Key: HIVE-9049
> URL: https://issues.apache.org/jira/browse/HIVE-9049
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.15.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Minor
> Fix For: 0.15.0
>
> Attachments: HIVE-9049.1.patch
>
>
> The Hive metastore server/client uses TBinaryProtocol. Although the binary 
> protocol is better than a simple text/JSON protocol, it is not as efficient 
> as TCompactProtocol, which typically uses less space and CPU. As seen in this 
> benchmark, TCompactProtocol is better than TBinaryProtocol in almost all 
> aspects:
> https://code.google.com/p/thrift-protobuf-compare/wiki/BenchmarkingV2
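The space advantage comes largely from the compact protocol's zigzag-varint integer encoding, versus the fixed-width integers TBinaryProtocol writes. A minimal sketch of that encoding for 64-bit values (an illustration, not Thrift's actual implementation):

```python
def zigzag(n):
    # Map signed ints to unsigned so small magnitudes stay small:
    # 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
    # Assumes n fits in a signed 64-bit range.
    return (n << 1) ^ (n >> 63)

def varint(u):
    # Base-128 encoding: 7 payload bits per byte, high bit set on
    # every byte except the last.
    out = bytearray()
    while True:
        b = u & 0x7F
        u >>= 7
        if u:
            out.append(b | 0x80)
        else:
            out.append(b)
            return bytes(out)

# A small i64 field costs 8 bytes fixed-width, but only 1 byte here.
print(len(varint(zigzag(42))))  # prints 1
```

Large magnitudes can cost up to 10 bytes, so the win assumes most integer fields are small, which is the common case for metastore-style metadata.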





[jira] [Updated] (HIVE-8485) HMS on Oracle incompatibility

2015-01-07 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-8485:
---
Status: Patch Available  (was: Open)

> HMS on Oracle incompatibility
> -
>
> Key: HIVE-8485
> URL: https://issues.apache.org/jira/browse/HIVE-8485
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
> Environment: Oracle as metastore DB
>Reporter: Ryan Pridgeon
>Assignee: Chaoyu Tang
> Attachments: HIVE-8485.2.patch, HIVE-8485.patch
>
>
> Oracle does not distinguish between empty strings and NULL, which proves 
> problematic for DataNucleus.
> In the event a user creates a table with some property stored as an empty 
> string, the table will no longer be accessible.
> e.g. TBLPROPERTIES ('serialization.null.format'='')
> If they try to select, describe, drop, etc., the client prints the following 
> exception:
> ERROR ql.Driver: FAILED: SemanticException [Error 10001]: Table not found 
> 
> The workaround for this was to go into the Hive metastore on the Oracle 
> database and replace NULL with some other string. Users could then drop the 
> tables or alter their data to use the new null format they just set.
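A sketch of that manual fix, assuming the standard metastore schema (the TABLE_PARAMS table with PARAM_KEY/PARAM_VALUE columns); verify the table and column names against your schema version, and back up the metastore before running anything like this:

```sql
-- Replace the NULL that Oracle stored for the empty null-format string
-- with an explicit token, so the table becomes readable again.
UPDATE TABLE_PARAMS
   SET PARAM_VALUE = '\N'
 WHERE PARAM_KEY = 'serialization.null.format'
   AND PARAM_VALUE IS NULL;
```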





[jira] [Updated] (HIVE-8485) HMS on Oracle incompatibility

2015-01-07 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-8485:
---
Attachment: HIVE-8485.2.patch

Updated patch.

> HMS on Oracle incompatibility
> -
>
> Key: HIVE-8485
> URL: https://issues.apache.org/jira/browse/HIVE-8485
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
> Environment: Oracle as metastore DB
>Reporter: Ryan Pridgeon
>Assignee: Chaoyu Tang
> Attachments: HIVE-8485.2.patch, HIVE-8485.patch
>
>
> Oracle does not distinguish between empty strings and NULL, which proves 
> problematic for DataNucleus.
> In the event a user creates a table with some property stored as an empty 
> string, the table will no longer be accessible.
> e.g. TBLPROPERTIES ('serialization.null.format'='')
> If they try to select, describe, drop, etc., the client prints the following 
> exception:
> ERROR ql.Driver: FAILED: SemanticException [Error 10001]: Table not found 
> 
> The workaround for this was to go into the Hive metastore on the Oracle 
> database and replace NULL with some other string. Users could then drop the 
> tables or alter their data to use the new null format they just set.





[jira] [Commented] (HIVE-9281) Code cleanup [Spark Branch]

2015-01-07 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268466#comment-14268466
 ] 

Xuefu Zhang commented on HIVE-9281:
---

Could you upload both versions to RB so that I can just look at the diff 
between the versions?

> Code cleanup [Spark Branch]
> ---
>
> Key: HIVE-9281
> URL: https://issues.apache.org/jira/browse/HIVE-9281
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-9281-spark.patch, HIVE-9281.2-spark.patch
>
>
> In preparation for the merge, we need to clean up the code.
> This includes removing TODOs, fixing checkstyle violations, removing 
> commented-out or unused code, etc.





[jira] [Commented] (HIVE-9281) Code cleanup [Spark Branch]

2015-01-07 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268461#comment-14268461
 ] 

Szehon Ho commented on HIVE-9281:
-

It's mostly the same patch; you can quickly look through it, or I can commit 
once tests pass on the latest patch.

> Code cleanup [Spark Branch]
> ---
>
> Key: HIVE-9281
> URL: https://issues.apache.org/jira/browse/HIVE-9281
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-9281-spark.patch, HIVE-9281.2-spark.patch
>
>
> In preparation for the merge, we need to clean up the code.
> This includes removing TODOs, fixing checkstyle violations, removing 
> commented-out or unused code, etc.





[jira] [Updated] (HIVE-9175) Add alters to list of events handled by NotificationListener

2015-01-07 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-9175:
-
Fix Version/s: 0.15.0
   Status: Patch Available  (was: Open)

> Add alters to list of events handled by NotificationListener
> 
>
> Key: HIVE-9175
> URL: https://issues.apache.org/jira/browse/HIVE-9175
> Project: Hive
>  Issue Type: New Feature
>  Components: HCatalog
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 0.15.0
>
> Attachments: HIVE-9175.patch
>
>
> HCatalog currently doesn't implement onAlterTable and onAlterPartition. It 
> should.




