[jira] [Commented] (HIVE-8920) IOContext problem with multiple MapWorks cloned for multi-insert [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14260905#comment-14260905 ] Hive QA commented on HIVE-8920: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12689440/HIVE-8920.3-spark.patch {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 7281 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_list_bucket_dml_10 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_join4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/598/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/598/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-598/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12689440 - PreCommit-HIVE-SPARK-Build IOContext problem with multiple MapWorks cloned for multi-insert [Spark Branch] --- Key: HIVE-8920 URL: https://issues.apache.org/jira/browse/HIVE-8920 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Chao Assignee: Xuefu Zhang Attachments: HIVE-8920.1-spark.patch, HIVE-8920.2-spark.patch, HIVE-8920.3-spark.patch The following query will not work:
{code}
from (select * from table0 union all select * from table1) s
insert overwrite table table3 select s.x, count(1) group by s.x
insert overwrite table table4 select s.y, count(1) group by s.y;
{code}
Currently, the plan for this query, before SplitSparkWorkResolver, looks like below:
{noformat}
M1    M2
  \  /  \
   U3    R5
   |
   R4
{noformat}
{{SplitSparkWorkResolver#splitBaseWork}} assumes that the {{childWork}} is a ReduceWork, but in this case the childWork of M2 could be the UnionWork U3, so the code will fail. HIVE-9041 partially addressed the problem by removing the union task. However, it is still necessary to clone M1 and M2 to support multi-insert. Because M1 and M2 can run in a single JVM, the original solution of storing a single global IOContext will not work: M1 and M2 have different IOContexts, and both need to be stored. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
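The direction described above can be sketched as follows: instead of one JVM-global IOContext, keep one context per cloned MapWork, keyed by the work's input name. This is an illustrative stand-in only; the class and field names below are hypothetical, not Hive's actual code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Minimal stand-in for Hive's real IOContext (hypothetical). */
class IOContext {
    long currentBlockStart;
}

/** Hypothetical sketch: one IOContext per cloned MapWork, keyed by the
 *  work's input name, instead of a single JVM-global instance. */
class IOContextRegistry {
    private static final Map<String, IOContext> CONTEXTS = new ConcurrentHashMap<>();

    /** Returns the context for the given input, creating it on first use. */
    static IOContext get(String inputName) {
        return CONTEXTS.computeIfAbsent(inputName, k -> new IOContext());
    }

    static void clear(String inputName) {
        CONTEXTS.remove(inputName);
    }
}
```

With this pattern, the cloned works M1 and M2 each look up their own context, so both can be stored even when the works run in the same JVM.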
[jira] [Commented] (HIVE-8181) Upgrade JavaEWAH version to allow for unsorted bitset creation
[ https://issues.apache.org/jira/browse/HIVE-8181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14260907#comment-14260907 ] Hive QA commented on HIVE-8181: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12689432/HIVE-8181.2.patch.txt {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6723 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_list_bucket_dml_10 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2219/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2219/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2219/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12689432 - PreCommit-HIVE-TRUNK-Build Upgrade JavaEWAH version to allow for unsorted bitset creation -- Key: HIVE-8181 URL: https://issues.apache.org/jira/browse/HIVE-8181 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.14.0, 0.13.1 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-8181.1.patch, HIVE-8181.2.patch.txt The latest JavaEWAH release has removed the restriction that bits can only be set in order. Currently, using the {{ewah_bitmap}} UDAF requires a {{SORT BY}}.
{code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Can't set bits out of order with EWAHCompressedBitmap at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:824) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474) at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:249) ... 7 more Caused by: java.lang.RuntimeException: Can't set bits out of order with EWAHCompressedBitmap at {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
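The stack trace above comes from the in-order restriction: a compressed, append-only bitmap can only be extended at its tail, so setting a lower bit after a higher one would require rewriting earlier words. A toy stand-in that enforces the same constraint (this is not the JavaEWAH API, just an illustration of why the pre-upgrade library required sorted input):

```java
import java.util.ArrayList;
import java.util.List;

/** Toy append-only bitmap that, like pre-upgrade JavaEWAH, only accepts
 *  bit indices in strictly increasing order. Hypothetical, for illustration. */
class AppendOnlyBitmap {
    private final List<Integer> setBits = new ArrayList<>();
    private int last = -1;

    void set(int i) {
        // A compressed stream of run/literal words can only be extended at
        // the tail; an earlier bit would force rewriting prior words.
        if (i <= last) {
            throw new RuntimeException("Can't set bits out of order");
        }
        setBits.add(i);
        last = i;
    }

    int cardinality() {
        return setBits.size();
    }
}
```

This is why the {{ewah_bitmap}} UDAF needed a {{SORT BY}}: without it, rows arrive unordered and the out-of-order set fails as in the trace above.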
[jira] [Created] (HIVE-9226) Beeline interweaves the query result and query log sometimes
Dong Chen created HIVE-9226: --- Summary: Beeline interweaves the query result and query log sometimes Key: HIVE-9226 URL: https://issues.apache.org/jira/browse/HIVE-9226 Project: Hive Issue Type: Improvement Reporter: Dong Chen Assignee: Dong Chen Priority: Minor In most cases, Beeline outputs the query log during execution and the result at the end. However, sometimes log lines are output after the result even though the query has finished. This might confuse users. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9226) Beeline interweaves the query result and query log sometimes
[ https://issues.apache.org/jira/browse/HIVE-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Chen updated HIVE-9226: Status: Patch Available (was: Open) Beeline interweaves the query result and query log sometimes Key: HIVE-9226 URL: https://issues.apache.org/jira/browse/HIVE-9226 Project: Hive Issue Type: Improvement Reporter: Dong Chen Assignee: Dong Chen Priority: Minor In most cases, Beeline outputs the query log during execution and the result at the end. However, sometimes log lines are output after the result even though the query has finished. This might confuse users. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8155) In select statement after * any random characters are allowed in hive but in RDBMS its not allowed
[ https://issues.apache.org/jira/browse/HIVE-8155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-8155: - Labels: TODOC15 (was: ) In select statement after * any random characters are allowed in hive but in RDBMS its not allowed --- Key: HIVE-8155 URL: https://issues.apache.org/jira/browse/HIVE-8155 Project: Hive Issue Type: Improvement Reporter: Ferdinand Xu Assignee: Dong Chen Priority: Critical Labels: TODOC15 Fix For: 0.15.0 Attachments: HIVE-8155.1.patch, HIVE-8155.patch In a select statement, Hive allows any random characters after *, but an RDBMS does not. Steps: In the queries below, abcdef is a random character sequence. In RDBMS (Oracle): select *abcdef from mytable; Output: ERROR prepare() failed with: ORA-00923: FROM keyword not found where expected In Hive: select *abcdef from mytable; Output: The query worked fine and displayed all the records of mytable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9226) Beeline interweaves the query result and query log sometimes
[ https://issues.apache.org/jira/browse/HIVE-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Chen updated HIVE-9226: Attachment: HIVE-9226.patch Hi [~brocknoland], I uploaded a patch to fix this issue. Could you take a look when you have time? Thanks! In Beeline, query execution and result fetching run in one thread, and query log fetching runs in another. The original idea was to interrupt the log thread and return the result as soon as possible, then fetch any remaining logs. Compared with the long query execution time, it should be acceptable to show all the logs before outputting the result. Beeline interweaves the query result and query log sometimes Key: HIVE-9226 URL: https://issues.apache.org/jira/browse/HIVE-9226 Project: Hive Issue Type: Improvement Reporter: Dong Chen Assignee: Dong Chen Priority: Minor Attachments: HIVE-9226.patch In most cases, Beeline outputs the query log during execution and the result at the end. However, sometimes log lines are output after the result even though the query has finished. This might confuse users. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
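The approach described in the comment above, showing all logs before the result, amounts to joining the log-fetching thread before printing the result row. A minimal sketch of that coordination, with illustrative names (this is not Beeline's actual code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical sketch of the proposed ordering: the result is printed only
 *  after the log-fetching thread has been joined, so no log line can trail
 *  the result. */
class LogThenResult {
    /** Returns the output lines in the order they would be printed. */
    static List<String> run() {
        List<String> printed = new ArrayList<>();
        Queue<String> pendingLogs = new ConcurrentLinkedQueue<>();
        pendingLogs.add("log-1");
        pendingLogs.add("log-2");

        // Log-fetching thread: drains whatever the server still has buffered.
        Thread logThread = new Thread(() -> {
            String line;
            while ((line = pendingLogs.poll()) != null) {
                printed.add(line);
            }
        });
        logThread.start();

        // Query is done; wait for the log thread to finish BEFORE the result,
        // so no log line can appear after the result row.
        try {
            logThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        printed.add("result-row");
        return printed;
    }
}
```

The join gives a happens-before edge, so the main thread safely reads everything the log thread appended before emitting the result.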
[jira] [Commented] (HIVE-8155) In select statement after * any random characters are allowed in hive but in RDBMS its not allowed
[ https://issues.apache.org/jira/browse/HIVE-8155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14260924#comment-14260924 ] Lefty Leverenz commented on HIVE-8155: -- Doc note: This can be documented (with release information) in the Simple query bullet after the SELECT syntax. * [Select Syntax | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-SelectSyntax] In select statement after * any random characters are allowed in hive but in RDBMS its not allowed --- Key: HIVE-8155 URL: https://issues.apache.org/jira/browse/HIVE-8155 Project: Hive Issue Type: Improvement Reporter: Ferdinand Xu Assignee: Dong Chen Priority: Critical Labels: TODOC15 Fix For: 0.15.0 Attachments: HIVE-8155.1.patch, HIVE-8155.patch In a select statement, Hive allows any random characters after *, but an RDBMS does not. Steps: In the queries below, abcdef is a random character sequence. In RDBMS (Oracle): select *abcdef from mytable; Output: ERROR prepare() failed with: ORA-00923: FROM keyword not found where expected In Hive: select *abcdef from mytable; Output: The query worked fine and displayed all the records of mytable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9226) Beeline interweaves the query result and query log sometimes
[ https://issues.apache.org/jira/browse/HIVE-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14260928#comment-14260928 ] Dong Chen commented on HIVE-9226: - cc [~chengxiang li] Beeline interweaves the query result and query log sometimes Key: HIVE-9226 URL: https://issues.apache.org/jira/browse/HIVE-9226 Project: Hive Issue Type: Improvement Reporter: Dong Chen Assignee: Dong Chen Priority: Minor Attachments: HIVE-9226.patch In most cases, Beeline outputs the query log during execution and the result at the end. However, sometimes log lines are output after the result even though the query has finished. This might confuse users. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7685) Parquet memory manager
[ https://issues.apache.org/jira/browse/HIVE-7685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14260929#comment-14260929 ] Dong Chen commented on HIVE-7685: - After verification, the value is correctly passed down. Parquet memory manager -- Key: HIVE-7685 URL: https://issues.apache.org/jira/browse/HIVE-7685 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Brock Noland Assignee: Dong Chen Attachments: HIVE-7685.1.patch, HIVE-7685.1.patch.ready, HIVE-7685.patch, HIVE-7685.patch.ready Similar to HIVE-4248, Parquet tries to write very large row groups. This causes Hive to run out of memory during dynamic partition inserts, when a reducer may have many Parquet files open at a given time. As such, we should implement a memory manager which ensures that we don't run out of memory due to writing too many row groups within a single JVM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
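The memory-manager idea above can be sketched simply: scale each writer's row-group buffer down as more files are open, so the total stays under a fixed pool. Names and numbers below are hypothetical, not the actual Parquet/Hive MemoryManager API:

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: divide a fixed memory pool among all currently open
 *  writers, capping each at the configured default row-group size. */
class RowGroupMemoryManager {
    private final long poolBytes;
    private final long defaultRowGroupBytes;
    private final Set<Object> writers = new HashSet<>();

    RowGroupMemoryManager(long poolBytes, long defaultRowGroupBytes) {
        this.poolBytes = poolBytes;
        this.defaultRowGroupBytes = defaultRowGroupBytes;
    }

    synchronized void register(Object writer) { writers.add(writer); }

    synchronized void unregister(Object writer) { writers.remove(writer); }

    /** Row-group buffer size each open writer should use right now. */
    synchronized long allocation() {
        if (writers.isEmpty()) {
            return defaultRowGroupBytes;
        }
        // More open files -> smaller per-writer row groups, bounded by pool.
        return Math.min(defaultRowGroupBytes, poolBytes / writers.size());
    }
}
```

This directly addresses the dynamic-partition case: many simultaneously open Parquet files shrink each other's buffers instead of exhausting the JVM heap.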
[jira] [Commented] (HIVE-9119) ZooKeeperHiveLockManager does not use zookeeper in the proper way
[ https://issues.apache.org/jira/browse/HIVE-9119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14260937#comment-14260937 ] Lefty Leverenz commented on HIVE-9119: -- Thanks for the changes, [~nyang]. One new question: When you changed the default of *hive.zookeeper.session.timeout* to use a TimeValidator, did an extra zero slip into the value or was the original not in milliseconds? (600*1000 - 600ms.) Also, you might want to split the description of *hive.zookeeper.connection.basesleeptime* into two lines. ZooKeeperHiveLockManager does not use zookeeper in the proper way - Key: HIVE-9119 URL: https://issues.apache.org/jira/browse/HIVE-9119 Project: Hive Issue Type: Improvement Components: Locking Affects Versions: 0.13.0, 0.14.0, 0.13.1 Reporter: Na Yang Assignee: Na Yang Attachments: HIVE-9119.1.patch, HIVE-9119.2.patch ZooKeeperHiveLockManager does not use ZooKeeper in the proper way. Currently a new ZooKeeper client instance is created for each getlock/releaselock query, which sometimes causes the number of open connections between HiveServer2 and ZooKeeper to exceed the maximum connection number that the ZooKeeper server allows. To use ZooKeeper as a distributed lock, there is no need to create a new ZooKeeper instance for every getlock attempt. A single ZooKeeper instance could be reused and shared by ZooKeeperHiveLockManagers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
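The proposed fix, one client shared by all lock managers instead of a connection per lock operation, is the classic lazily initialized singleton. A sketch with the ZooKeeper client stubbed out (only the sharing pattern is shown; these are not Hive's actual classes):

```java
/** Hypothetical sketch: all lock managers in the JVM share one lazily
 *  created client rather than opening a new connection per getlock call. */
class SharedLockClient {
    /** Stand-in for a ZooKeeper connection. */
    static class Client {
        final long createdAt = System.nanoTime();
    }

    private static volatile Client instance;

    /** Double-checked locking: at most one Client is ever created. */
    static Client get() {
        Client c = instance;
        if (c == null) {
            synchronized (SharedLockClient.class) {
                if (instance == null) {
                    instance = new Client(); // one connection per JVM
                }
                c = instance;
            }
        }
        return c;
    }
}
```

Every getlock/releaselock path then calls `SharedLockClient.get()`, so the open-connection count toward the ZooKeeper server stays constant regardless of lock traffic.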
[jira] [Updated] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-7613: Attachment: Hive on Spark Join Master Design.pdf Attaching the master design doc that describes all the Hive on Spark join optimizations, not just map join. It is now updated to match the latest codebase, so it can be useful for future code maintenance. Research optimization of auto convert join to map join [Spark branch] - Key: HIVE-7613 URL: https://issues.apache.org/jira/browse/HIVE-7613 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Suhas Satish Priority: Minor Fix For: spark-branch Attachments: HIve on Spark Map join background.docx, Hive on Spark Join Master Design.pdf, small_table_broadcasting.pdf ConvertJoinMapJoin is an optimization that replaces a common join (aka shuffle join) with a map join (aka broadcast or fragment replicate join) when possible. We need to research how to make it work with Hive on Spark. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9227) Make HiveInputSplit support InputSplitWithLocationInfo
Rui Li created HIVE-9227: Summary: Make HiveInputSplit support InputSplitWithLocationInfo Key: HIVE-9227 URL: https://issues.apache.org/jira/browse/HIVE-9227 Project: Hive Issue Type: Improvement Reporter: Rui Li Assignee: Rui Li This feature was introduced in MAPREDUCE-5896. We should support it in Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Building Hive-0.14 is failing because artifact pentaho-aggdesigner-algorithm-5.1.3-jhyde could not be resolved
Should this issue be documented in a release box in Getting Started: Building Hive from Source https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-BuildingHivefromSource ? -- Lefty On Mon, Dec 29, 2014 at 2:04 PM, Alan Gates ga...@hortonworks.com wrote: There was an issue with Hive 0.14 when it was released. We missed the fact that it still had two SNAPSHOT dependencies in the pom. If you apply the patches on HIVE-8845 (for Tez) and HIVE-8873 (for Calcite), that should address your issue. This will be fixed in Hive 0.14.1. Alan. Ravi Prakash ravi...@ymail.com December 22, 2014 at 14:14 Hi! Has anyone tried building Hive-0.14 from source? I'm using the tag for release-0.14.0 https://github.com/apache/hive/releases/tag/release-0.14.0 The command I use is: mvn install -DskipTests -Phadoop-2 -DcreateChecksum=true -Dtez.version=0.5.3 -Dcalcite.version=0.9.2-incubating The build fails for me with the following error: [ERROR] Failed to execute goal on project hive-exec: Could not resolve dependencies for project org.apache.hive:hive-exec:jar:0.14.0: The following artifacts could not be resolved: org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.3-jhyde, net.hydromatic:linq4j:jar:0.4, net.hydromatic:quidem:jar:0.1.1: Could not find artifact org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.3-jhyde in nexus (http://localhost:8081/nexus/content/groups/public) - [Help 1] This is a transitive dependency via the calcite-0.9.2-incubating artifact. Is there a JIRA which someone can please point me to? It seems wrong that an artifact with version 5.1.3-jhyde is required to build Apache Hive, no disrespect to Julian. Am I missing something? Thanks, Ravi
[jira] [Commented] (HIVE-7613) Research optimization of auto convert join to map join [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14260943#comment-14260943 ] Lefty Leverenz commented on HIVE-7613: -- Should this join design doc be added to the wiki? Or if not, should the existing Hive on Spark: Getting Started include a link to it? * [Hive on Spark: Getting Started | https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started] Research optimization of auto convert join to map join [Spark branch] - Key: HIVE-7613 URL: https://issues.apache.org/jira/browse/HIVE-7613 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Suhas Satish Priority: Minor Fix For: spark-branch Attachments: HIve on Spark Map join background.docx, Hive on Spark Join Master Design.pdf, small_table_broadcasting.pdf ConvertJoinMapJoin is an optimization that replaces a common join (aka shuffle join) with a map join (aka broadcast or fragment replicate join) when possible. We need to research how to make it work with Hive on Spark. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9226) Beeline interweaves the query result and query log sometimes
[ https://issues.apache.org/jira/browse/HIVE-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261010#comment-14261010 ] Hive QA commented on HIVE-9226: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12689457/HIVE-9226.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6723 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_list_bucket_dml_10 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2220/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2220/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2220/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12689457 - PreCommit-HIVE-TRUNK-Build Beeline interweaves the query result and query log sometimes Key: HIVE-9226 URL: https://issues.apache.org/jira/browse/HIVE-9226 Project: Hive Issue Type: Improvement Reporter: Dong Chen Assignee: Dong Chen Priority: Minor Attachments: HIVE-9226.patch In most cases, Beeline outputs the query log during execution and the result at the end. However, sometimes log lines are output after the result even though the query has finished. This might confuse users. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9228) Problem with subquery using windowing functions
Aihua Xu created HIVE-9228: -- Summary: Problem with subquery using windowing functions Key: HIVE-9228 URL: https://issues.apache.org/jira/browse/HIVE-9228 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Aihua Xu Assignee: Aihua Xu The following query with window functions fails; the inner query works fine.
select st_fips_cd, zip_cd_5, hh_surr_key
from (
  select st_fips_cd, zip_cd_5, hh_surr_key,
    count(case when advtg_len_rsdnc_cd = '1' then 1 end) over (partition by st_fips_cd, zip_cd_5) as CNT_ADVTG_LEN_RSDNC_CD_1,
    row_number() over (partition by st_fips_cd, zip_cd_5 order by hh_surr_key asc) as analytic_row_number3
  from hh_agg
  where analytic_row_number2 = 1
) t;
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9228) Problem with subquery using windowing functions
[ https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariano Dominguez updated HIVE-9228: Affects Version/s: 0.13.1 Problem with subquery using windowing functions --- Key: HIVE-9228 URL: https://issues.apache.org/jira/browse/HIVE-9228 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.1 Reporter: Aihua Xu Assignee: Aihua Xu Original Estimate: 96h Remaining Estimate: 96h The following query with window functions fails; the inner query works fine.
select st_fips_cd, zip_cd_5, hh_surr_key
from (
  select st_fips_cd, zip_cd_5, hh_surr_key,
    count(case when advtg_len_rsdnc_cd = '1' then 1 end) over (partition by st_fips_cd, zip_cd_5) as CNT_ADVTG_LEN_RSDNC_CD_1,
    row_number() over (partition by st_fips_cd, zip_cd_5 order by hh_surr_key asc) as analytic_row_number3
  from hh_agg
  where analytic_row_number2 = 1
) t;
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Backup Stage in query plan
Hi all, I found a backup stage in the explain output, but I didn't find any docs explaining what it is. What is a backup stage? Do you have any doc about it? I got this from the `explain formatted` of TPC-H query 16.
STAGE DEPENDENCIES: {
  Stage-9: { DEPENDENT STAGES: Stage-3, Stage-6 },
  Stage-8: { ROOT STAGE: TRUE, CONDITIONAL CHILD TASKS: Stage-10, Stage-3 },
  Stage-2: { DEPENDENT STAGES: Stage-0 },
  Stage-0: { DEPENDENT STAGES: Stage-5 },
  Stage-6: { DEPENDENT STAGES: Stage-10 },
  Stage-10: { BACKUP STAGE: Stage-3 },
  Stage-5: { DEPENDENT STAGES: Stage-9 },
  Stage-3: {}
}
Thanks in advance, Edson Ramiro
[jira] [Updated] (HIVE-8920) IOContext problem with multiple MapWorks cloned for multi-insert [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-8920: -- Attachment: HIVE-8920.4-spark.patch IOContext problem with multiple MapWorks cloned for multi-insert [Spark Branch] --- Key: HIVE-8920 URL: https://issues.apache.org/jira/browse/HIVE-8920 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Chao Assignee: Xuefu Zhang Attachments: HIVE-8920.1-spark.patch, HIVE-8920.2-spark.patch, HIVE-8920.3-spark.patch, HIVE-8920.4-spark.patch The following query will not work:
{code}
from (select * from table0 union all select * from table1) s
insert overwrite table table3 select s.x, count(1) group by s.x
insert overwrite table table4 select s.y, count(1) group by s.y;
{code}
Currently, the plan for this query, before SplitSparkWorkResolver, looks like below:
{noformat}
M1    M2
  \  /  \
   U3    R5
   |
   R4
{noformat}
{{SplitSparkWorkResolver#splitBaseWork}} assumes that the {{childWork}} is a ReduceWork, but in this case the childWork of M2 could be the UnionWork U3, so the code will fail. HIVE-9041 partially addressed the problem by removing the union task. However, it is still necessary to clone M1 and M2 to support multi-insert. Because M1 and M2 can run in a single JVM, the original solution of storing a single global IOContext will not work: M1 and M2 have different IOContexts, and both need to be stored. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9038) Join tests fail on Tez
[ https://issues.apache.org/jira/browse/HIVE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-9038: - Assignee: Vikram Dixit K Join tests fail on Tez -- Key: HIVE-9038 URL: https://issues.apache.org/jira/browse/HIVE-9038 Project: Hive Issue Type: Bug Components: Tests, Tez Reporter: Ashutosh Chauhan Assignee: Vikram Dixit K Tez doesn't run all tests. But if you run them, the following tests fail with runtime exceptions pointing to bugs. {{auto_join21.q,auto_join29.q,auto_join30.q,auto_join_filters.q,auto_join_nulls.q}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9167) Enhance encryption testing framework to allow create keys zones inside .q files
[ https://issues.apache.org/jira/browse/HIVE-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9167: --- Labels: Kanban (was: ) Enhance encryption testing framework to allow create keys zones inside .q files - Key: HIVE-9167 URL: https://issues.apache.org/jira/browse/HIVE-9167 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Labels: Kanban The current implementation of the encryption testing framework from HIVE-8900 initializes a couple of encrypted databases to be used in .q test files. This is useful for keeping tests small, but it does not exercise all the details of the encryption implementation, such as encrypted tables with different encryption strengths in the same database. We need to allow this kind of encryption, as this is how it will be used in the real world, where a database will have a few encrypted tables (not the whole DB). Also, we need to make this encryption framework flexible so that we can create/delete key zones on demand when running the .q files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9167) Enhance encryption testing framework to allow create keys zones inside .q files
[ https://issues.apache.org/jira/browse/HIVE-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9167: --- Labels: Hive-Scrum (was: Kanban) Enhance encryption testing framework to allow create keys zones inside .q files - Key: HIVE-9167 URL: https://issues.apache.org/jira/browse/HIVE-9167 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Labels: Hive-Scrum The current implementation of the encryption testing framework from HIVE-8900 initializes a couple of encrypted databases to be used in .q test files. This is useful for keeping tests small, but it does not exercise all the details of the encryption implementation, such as encrypted tables with different encryption strengths in the same database. We need to allow this kind of encryption, as this is how it will be used in the real world, where a database will have a few encrypted tables (not the whole DB). Also, we need to make this encryption framework flexible so that we can create/delete key zones on demand when running the .q files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7898) HCatStorer should ignore namespaces generated by Pig
[ https://issues.apache.org/jira/browse/HIVE-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261275#comment-14261275 ] Justin Leet commented on HIVE-7898: --- This actually already happens in my patch. HCatStorer will abort with an error, e.g. "Field named field already exists". This isn't specifically in HCatBaseStorer; it actually occurs during the conversion from Pig Schema to HCatSchema in convertPigSchemaToHCatSchema(). The modified getColFromSchema will pass the now-truncated name, so convertPigSchemaToHCatSchema() will attempt to add the now-duplicated column, and HCat won't allow the duplicated field to go through. HCatStorer should ignore namespaces generated by Pig Key: HIVE-7898 URL: https://issues.apache.org/jira/browse/HIVE-7898 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.13.1 Reporter: Justin Leet Assignee: Justin Leet Priority: Minor Attachments: HIVE-7898.1.patch Currently, Pig aliases must exactly match the names of HCat columns for HCatStorer to be successful. However, several Pig operations prepend a namespace to the alias in order to differentiate fields (e.g. after a group with field b, you might have A::b). In this case, even if the fields are in the right order and the alias without the namespace matches, the store will fail because it tries to match the long form of the alias, even though the namespace is extraneous information in this case. Note that multiple aliases can be applied (e.g. A::B::C::d). A workaround is possible by doing a FOREACH relation GENERATE field1 AS field1, field2 AS field2, etc. This quickly becomes tedious and bloated for tables with many fields. Changing this would normally require care around columns named, for example, `A::b`, as introduced in Hive 13. However, a different function call only validates Pig aliases if they follow the old rules for Hive columns. 
As such, a direct change (rather than attempting to match either the namespace::alias or just alias) maintains compatibility for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
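The stripping behavior discussed above, dropping Pig's generated namespace prefixes before matching HCat column names, is a small string operation. A hypothetical helper (not the actual HCatBaseStorer code) sketching it:

```java
/** Hypothetical sketch: drop Pig-generated namespace prefixes from an alias,
 *  e.g. "A::B::C::d" -> "d", before matching against HCat column names. */
class PigAliasUtil {
    static String stripNamespace(String alias) {
        // Pig separates namespaces with "::"; only the last segment is the
        // underlying field name.
        int idx = alias.lastIndexOf("::");
        return idx < 0 ? alias : alias.substring(idx + 2);
    }
}
```

As the comment above notes, when two aliases truncate to the same name (e.g. `A::b` and `B::b`), the subsequent schema conversion rejects the duplicate column, so collisions still fail loudly.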
[jira] [Commented] (HIVE-7898) HCatStorer should ignore namespaces generated by Pig
[ https://issues.apache.org/jira/browse/HIVE-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261274#comment-14261274 ] Justin Leet commented on HIVE-7898: --- This actually already happens in my patch. HCatStorer will abort with an error: e.g. Field named field already exists. This isn't specifically in HCatBaseStorer, it actually occurs during the conversion from Pig Schema to HCatSchema in convertPigSchemaToHCatSchema(). The modified getColFromSchema will pass the now truncated name, so convertPigSchemaToHCatSchema() will attempt to add the now duplicated column and HCat won't allow the duplicated field to go through. HCatStorer should ignore namespaces generated by Pig Key: HIVE-7898 URL: https://issues.apache.org/jira/browse/HIVE-7898 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.13.1 Reporter: Justin Leet Assignee: Justin Leet Priority: Minor Attachments: HIVE-7898.1.patch Currently, Pig aliases must exactly match the names of HCat columns for HCatStorer to be successful. However, several Pig operations prepend a namespace to the alias in order to differentiate fields (e.g. after a group with field b, you might have A::b). In this case, even if the fields are in the right order and the alias without namespace matches, the store will fail because it tries to match the long form of the alias, despite the namespace being extraneous information in this case. Note that multiple aliases can be applied (e.g. A::B::C::d). A workaround is possible by doing a FOREACH relation GENERATE field1 AS field1, field2 AS field2, etc. This quickly becomes tedious and bloated for tables with many fields. Changing this would normally require care around columns named, for example, `A::b` as has been introduced in Hive 13. However, a different function call only validates Pig aliases if they follow the old rules for Hive columns. 
As such, a direct change (rather than attempting to match either the namespace::alias or just alias) maintains compatibility for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
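The truncation-and-collision behavior described in the comment can be sketched roughly as follows (a Python illustration; the function names are hypothetical, not HCatalog's actual API):

```python
def strip_namespace(alias):
    """Drop Pig-generated namespaces: 'A::B::C::d' -> 'd'."""
    return alias.rsplit("::", 1)[-1]


def to_hcat_columns(pig_aliases):
    """Truncate each alias, refusing duplicate column names
    (mirroring the 'Field named ... already exists' abort)."""
    columns = []
    for alias in pig_aliases:
        name = strip_namespace(alias)
        if name in columns:
            raise ValueError("Field named %s already exists" % name)
        columns.append(name)
    return columns
```

So aliases that differ only in namespace (A::f vs B::f) truncate to the same column name and are rejected, which is the safe behavior the patch relies on.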
[jira] [Commented] (HIVE-9167) Enhance encryption testing framework to allow create keys zones inside .q files
[ https://issues.apache.org/jira/browse/HIVE-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261279#comment-14261279 ] Brock Noland commented on HIVE-9167: Hi, Thank you Sergio! I am going to go ahead and commit this since you will be out after today. We can address any remaining issues as follow-on JIRAs. Thank you! Enhance encryption testing framework to allow create keys zones inside .q files - Key: HIVE-9167 URL: https://issues.apache.org/jira/browse/HIVE-9167 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Labels: Hive-Scrum Attachments: HIVE-9167.4.patch The current implementation of the encryption testing framework on HIVE-8900 initializes a couple of encrypted databases to be used on .q test files. This is useful in order to make tests small, but it does not test all details found on the encryption implementation, such as: encrypted tables with different encryption strength in the same database. We need to allow this kind of encryption as it is how it will be used in the real world where a database will have a few encrypted tables (not all the DB). Also, we need to make this encryption framework flexible so that we can create/delete keys zones on demand when running the .q files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9167) Enhance encryption testing framework to allow create keys zones inside .q files
[ https://issues.apache.org/jira/browse/HIVE-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9167: -- Attachment: HIVE-9167.4.patch Enhance encryption testing framework to allow create keys zones inside .q files - Key: HIVE-9167 URL: https://issues.apache.org/jira/browse/HIVE-9167 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Labels: Hive-Scrum Attachments: HIVE-9167.4.patch The current implementation of the encryption testing framework on HIVE-8900 initializes a couple of encrypted databases to be used on .q test files. This is useful in order to make tests small, but it does not test all details found on the encryption implementation, such as: encrypted tables with different encryption strength in the same database. We need to allow this kind of encryption as it is how it will be used in the real world where a database will have a few encrypted tables (not all the DB). Also, we need to make this encryption framework flexible so that we can create/delete keys zones on demand when running the .q files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9167) Enhance encryption testing framework to allow create keys zones inside .q files
[ https://issues.apache.org/jira/browse/HIVE-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9167: -- Status: Patch Available (was: Open) Enhance encryption testing framework to allow create keys zones inside .q files - Key: HIVE-9167 URL: https://issues.apache.org/jira/browse/HIVE-9167 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Labels: Hive-Scrum Attachments: HIVE-9167.4.patch The current implementation of the encryption testing framework on HIVE-8900 initializes a couple of encrypted databases to be used on .q test files. This is useful in order to make tests small, but it does not test all details found on the encryption implementation, such as: encrypted tables with different encryption strength in the same database. We need to allow this kind of encryption as it is how it will be used in the real world where a database will have a few encrypted tables (not all the DB). Also, we need to make this encryption framework flexible so that we can create/delete keys zones on demand when running the .q files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9222) Fix ordering differences due to Java 8 (Part 4)
[ https://issues.apache.org/jira/browse/HIVE-9222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9222: --- Resolution: Fixed Fix Version/s: 0.15.0 Status: Resolved (was: Patch Available) Thank you Mohit! I have committed this to trunk! Fix ordering differences due to Java 8 (Part 4) --- Key: HIVE-9222 URL: https://issues.apache.org/jira/browse/HIVE-9222 Project: Hive Issue Type: Sub-task Components: Tests Reporter: Mohit Sabharwal Assignee: Mohit Sabharwal Fix For: 0.15.0 Attachments: HIVE-9222.patch This patch fixes the following tests: (1) TestNegativeCliDriver.testNegativeCliDriver: unset_view_property.q and unset_table_property.q {{DDLSemanticAnalyzer.analyzeAlterTableProps()}} gets table properties via getProps(), which must be an insert order map. (2) TestCliDriver.testCliDriver_overridden_confs {{VerifyOverriddenConfigsHook}} emits overridden configs. Changed {{SessionState.overriddenConfigurations}} to an insert order map. (3) TestNegativeCliDriver.testNegativeCliDriver_columnstats_partlvl_invalid_values {{ColumnStatsSemanticAnalyzer.getPartKeyValuePairsFromAST()}} gets {{((ASTNode) tree.getChild(0)}} in a different order between Java 7 and Java 8. The order is different in {{HiveParser.statement()}} itself in {{ParseDriver.parse()}}, so this difference comes from the antlr library. Generated Java-version-specific output. (4) TestMinimrCliDriver.testCliDriver_list_bucket_dml_10, TestCliDriver tests: stats_list_bucket.q, list_bucket_dml_12.q and list_bucket_dml_13.q Looks like these need a rebase after HIVE-9206? Not sure what happened here... (5) TestCliDriver.testCliDriver: mapjoin_hook.q, auto_join_without_localtask.q, auto_join25.q, multiMapJoin2.q {{PrintCompletedTasksHook}} prints the completed task list, which depends on the list of tasks added to the runnable task list in {{DriverContext}}. Some of these tasks may get filtered. 
We see that different tasks are getting filtered out by the condition resolver in {{ConditionalTask}} in Java 8 compared to Java 7. {{ConditionalTask.execute()}} calls {{ConditionalResolverCommonJoin.resolveDriverAlias()}} via getTasks(), which returns a single task based on the task-to-alias map. The next mapred task in the task list gets filtered out by the resolver in {{ConditionalTask.resolveTask()}}. In other words, the mapred task that shows up first will be kept and the next one will be filtered. Converted the task-to-alias map to an insert order map so the order is the same on Java 8 and Java 7. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
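The insert-order-map fix used in (1), (2) and (5) can be illustrated with a small Python sketch (Python dicts preserve insertion order, analogous to Java's LinkedHashMap; the function name is illustrative, not Hive's API):

```python
from collections import OrderedDict


def record_overridden_configs(updates):
    """Store overridden configs in an insertion-order map so a hook
    emits them in a deterministic order regardless of the runtime's
    hash implementation (the idea behind the
    SessionState.overriddenConfigurations change)."""
    props = OrderedDict()
    for key, value in updates:
        props[key] = value  # re-setting a key keeps its original slot
    return list(props.items())
```

With a plain hash map, iteration order can change between Java 7 and Java 8 and the .q test output diverges; with an insertion-order map, the emitted order depends only on the order the configs were set.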
[jira] [Updated] (HIVE-9220) HIVE-9109 missed updating result of list_bucket_dml_10
[ https://issues.apache.org/jira/browse/HIVE-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9220: --- Resolution: Duplicate Status: Resolved (was: Patch Available) HIVE-9109 missed updating result of list_bucket_dml_10 -- Key: HIVE-9220 URL: https://issues.apache.org/jira/browse/HIVE-9220 Project: Hive Issue Type: Sub-task Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-9109.1.patch.txt list_bucket_dml_10.q.java1.7.out is missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9167) Enhance encryption testing framework to allow create keys zones inside .q files
[ https://issues.apache.org/jira/browse/HIVE-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9167: --- Resolution: Fixed Fix Version/s: encryption-branch Status: Resolved (was: Patch Available) Thank you everyone for the reviews! I have committed this to branch! Enhance encryption testing framework to allow create keys zones inside .q files - Key: HIVE-9167 URL: https://issues.apache.org/jira/browse/HIVE-9167 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Labels: Hive-Scrum Fix For: encryption-branch Attachments: HIVE-9167.4.patch The current implementation of the encryption testing framework on HIVE-8900 initializes a couple of encrypted databases to be used on .q test files. This is useful in order to make tests small, but it does not test all details found on the encryption implementation, such as: encrypted tables with different encryption strength in the same database. We need to allow this kind of encryption as it is how it will be used in the real world where a database will have a few encrypted tables (not all the DB). Also, we need to make this encryption framework flexible so that we can create/delete keys zones on demand when running the .q files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7685) Parquet memory manager
[ https://issues.apache.org/jira/browse/HIVE-7685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261295#comment-14261295 ] Brock Noland commented on HIVE-7685: +1 Parquet memory manager -- Key: HIVE-7685 URL: https://issues.apache.org/jira/browse/HIVE-7685 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Brock Noland Assignee: Dong Chen Attachments: HIVE-7685.1.patch, HIVE-7685.1.patch.ready, HIVE-7685.patch, HIVE-7685.patch.ready Similar to HIVE-4248, Parquet tries to write very large row groups. This causes Hive to run out of memory during dynamic partition inserts, when a reducer may have many Parquet files open at a given time. As such, we should implement a memory manager which ensures that we don't run out of memory due to writing too many row groups within a single JVM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9226) Beeline interweaves the query result and query log sometimes
[ https://issues.apache.org/jira/browse/HIVE-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261300#comment-14261300 ] Brock Noland commented on HIVE-9226: +1 Beeline interweaves the query result and query log sometimes Key: HIVE-9226 URL: https://issues.apache.org/jira/browse/HIVE-9226 Project: Hive Issue Type: Improvement Reporter: Dong Chen Assignee: Dong Chen Priority: Minor Attachments: HIVE-9226.patch In most cases, Beeline outputs the query log during execution and outputs the result at the end. However, sometimes log lines are output after the result, even though the query has finished. This can confuse users. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8920) IOContext problem with multiple MapWorks cloned for multi-insert [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261303#comment-14261303 ] Hive QA commented on HIVE-8920: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12689523/HIVE-8920.4-spark.patch {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 7280 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_list_bucket_dml_10 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_annotate_stats_join org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_join4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/599/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/599/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-599/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12689523 - PreCommit-HIVE-SPARK-Build IOContext problem with multiple MapWorks cloned for multi-insert [Spark Branch] --- Key: HIVE-8920 URL: https://issues.apache.org/jira/browse/HIVE-8920 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Chao Assignee: Xuefu Zhang Attachments: HIVE-8920.1-spark.patch, HIVE-8920.2-spark.patch, HIVE-8920.3-spark.patch, HIVE-8920.4-spark.patch The following query will not work: {code} from (select * from table0 union all select * from table1) s insert overwrite table table3 select s.x, count(1) group by s.x insert overwrite table table4 select s.y, count(1) group by s.y; {code} Currently, the plan for this query, before SplitSparkWorkResolver, looks like below: {noformat}
M1  M2
 \  / \
  U3   R5
  |
  R4
{noformat} In {{SplitSparkWorkResolver#splitBaseWork}}, it assumes that the {{childWork}} is a ReduceWork, but for this case, you can see that for M2 the childWork could be UnionWork U3. Thus, the code will fail. HIVE-9041 partially addressed the problem by removing the union task. However, it's still necessary to clone M1 and M2 to support multi-insert. Because M1 and M2 can run in a single JVM, the original solution of storing a global IOContext will not work: M1 and M2 have different IOContexts, both of which need to be stored. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
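The idea in the description — keeping one IOContext per input rather than a single global one — can be sketched minimally as a map keyed by input path (a Python illustration under that assumption; the names are hypothetical, not Hive's actual classes):

```python
import threading


class IOContextMap:
    """One context per input path instead of a single global instance,
    so cloned MapWorks sharing a JVM don't clobber each other's state.
    All names here are illustrative, not Hive's actual API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._contexts = {}

    def get(self, input_path):
        with self._lock:
            return self._contexts.setdefault(input_path, {"currentRow": 0})


ctx_map = IOContextMap()
m1_ctx = ctx_map.get("hdfs://table0")  # cloned MapWork M1
m2_ctx = ctx_map.get("hdfs://table1")  # cloned MapWork M2
m1_ctx["currentRow"] = 42  # M1's progress does not affect M2's context
```

Each cloned MapWork looks up its own context by its input, so two clones running in the same JVM no longer race on one global object.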
[jira] [Updated] (HIVE-8065) Support HDFS encryption functionality on Hive
[ https://issues.apache.org/jira/browse/HIVE-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-8065: -- Labels: Hive-Scrum (was: ) Support HDFS encryption functionality on Hive - Key: HIVE-8065 URL: https://issues.apache.org/jira/browse/HIVE-8065 Project: Hive Issue Type: Improvement Affects Versions: 0.13.1 Reporter: Sergio Peña Assignee: Sergio Peña Labels: Hive-Scrum The new encryption support on HDFS makes Hive incompatible and unusable when this feature is used. HDFS encryption is designed so that a user can configure different encryption zones (or directories) for multi-tenant environments. An encryption zone has an exclusive encryption key, such as AES-128 or AES-256. For security compliance, HDFS does not allow moving/renaming files between encryption zones. Renames are allowed only inside the same encryption zone. A copy is allowed between encryption zones. See HDFS-6134 for more details about the HDFS encryption design. Hive currently uses a scratch directory (like /tmp/$user/$random). This scratch directory is used for the output of intermediate data (between MR jobs) and for the final output of the Hive query, which is later moved to the table directory location. If Hive tables are in different encryption zones than the scratch directory, then Hive won't be able to rename those files/directories, making Hive unusable. To handle this problem, we can change the scratch directory of the query/statement to be inside the same encryption zone as the table directory location. This way, the renaming process will succeed. Also, for statements that move files between encryption zones (e.g. LOAD DATA), a copy may be executed instead of a rename. This will cause overhead when copying large data files, but it won't break encryption in Hive. Another security consideration arises when joining tables in a SELECT. 
If Hive joins tables with different encryption key strengths, the results of the select might break the security compliance of the tables. Say two tables with 128-bit and 256-bit encryption are joined; the temporary results might then be stored in the 128-bit encryption zone, which conflicts with the compliance requirements of the 256-bit table. To fix this, Hive should select the most secured/encrypted scratch directory available so the intermediate data can be stored temporarily with no compliance issues. For instance: {noformat} SELECT * FROM table-aes128 t1 JOIN table-aes256 t2 WHERE t1.id == t2.id; {noformat} - This should use a scratch directory (or staging directory) inside the table-aes256 table location. {noformat} INSERT OVERWRITE TABLE table-unencrypted SELECT * FROM table-aes1; {noformat} - This should use a scratch directory inside the table-aes1 location. {noformat} FROM table-unencrypted INSERT OVERWRITE TABLE table-aes128 SELECT id, name INSERT OVERWRITE TABLE table-aes256 SELECT id, name {noformat} - This should use a scratch directory on each of the table locations. - The first SELECT will have its scratch directory in the table-aes128 directory. - The second SELECT will have its scratch directory in the table-aes256 directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
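The scratch-directory selection rule described above can be sketched as follows (a hypothetical helper in Python; the key strengths, paths, and the ".hive-staging" suffix are illustrative, not Hive's actual implementation):

```python
def pick_staging_dir(table_locations):
    """Choose the staging dir under the most strongly encrypted table
    referenced by the query, given (path, key_bits) pairs where
    key_bits is 0 for an unencrypted table."""
    path, _bits = max(table_locations, key=lambda loc: loc[1])
    return path + "/.hive-staging"
```

For the join example above, the aes256 location wins (256 > 128), so intermediate data never lands in a weaker encryption zone than any input table.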
[jira] [Commented] (HIVE-9167) Enhance encryption testing framework to allow create keys zones inside .q files
[ https://issues.apache.org/jira/browse/HIVE-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261325#comment-14261325 ] Hive QA commented on HIVE-9167: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12689535/HIVE-9167.4.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2221/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2221/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2221/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-2221/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'beeline/src/java/org/apache/hive/beeline/Commands.java' ++ awk '{print $2}' ++ egrep -v '^X|^Performing status on external' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/scheduler/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target hcatalog/hcatalog-pig-adapter/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target accumulo-handler/target hwi/target common/target common/src/gen contrib/target service/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target + svn update U ql/src/test/org/apache/hadoop/hive/ql/plan/TestConditionalResolverCommonJoin.java Uql/src/test/queries/clientnegative/columnstats_partlvl_invalid_values.q Dql/src/test/results/clientnegative/columnstats_partlvl_invalid_values.q.out A ql/src/test/results/clientnegative/columnstats_partlvl_invalid_values.q.java1.7.out A ql/src/test/results/clientnegative/columnstats_partlvl_invalid_values.q.java1.8.out Uql/src/test/results/clientnegative/unset_table_property.q.out Dql/src/test/results/clientpositive/list_bucket_dml_10.q.out Uql/src/test/results/clientpositive/stats_list_bucket.q.java1.8.out Uql/src/test/results/clientpositive/multiMapJoin2.q.out 
Uql/src/test/results/clientpositive/list_bucket_dml_12.q.java1.8.out Uql/src/test/results/clientpositive/auto_join_without_localtask.q.out Aql/src/test/results/clientpositive/list_bucket_dml_10.q.java1.7.out Aql/src/test/results/clientpositive/list_bucket_dml_10.q.java1.8.out Uql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java Uql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java U ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java U ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java U ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java Fetching external item into 'hcatalog/src/test/e2e/harness' Updated external to revision 1648561. Updated to revision 1648561. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh +
[jira] [Updated] (HIVE-9167) Enhance encryption testing framework to allow create keys zones inside .q files
[ https://issues.apache.org/jira/browse/HIVE-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9167: -- Labels: Kanban (was: Hive-Scrum) Enhance encryption testing framework to allow create keys zones inside .q files - Key: HIVE-9167 URL: https://issues.apache.org/jira/browse/HIVE-9167 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Labels: Kanban Fix For: encryption-branch Attachments: HIVE-9167.4.patch The current implementation of the encryption testing framework on HIVE-8900 initializes a couple of encrypted databases to be used on .q test files. This is useful in order to make tests small, but it does not test all details found on the encryption implementation, such as: encrypted tables with different encryption strength in the same database. We need to allow this kind of encryption as it is how it will be used in the real world where a database will have a few encrypted tables (not all the DB). Also, we need to make this encryption framework flexible so that we can create/delete keys zones on demand when running the .q files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9205) Change default tez install directory to use /tmp instead of /user and create the directory if it does not exist
[ https://issues.apache.org/jira/browse/HIVE-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261340#comment-14261340 ] Vikram Dixit K commented on HIVE-9205: -- Test failure is unrelated. [~prasanth_j] [~hagleitn] can you take a look? Change default tez install directory to use /tmp instead of /user and create the directory if it does not exist --- Key: HIVE-9205 URL: https://issues.apache.org/jira/browse/HIVE-9205 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0, 0.15.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.15.0, 0.14.1 Attachments: HIVE-9205.1.patch, HIVE-9205.2.patch The common deployment scenario is to install the packages and start services. Creating the /user/user directory is currently an extra step during manual installation. If the user tries to bring up the Hive shell with Tez enabled, this results in an exception. The solution is to change the default install directory to /tmp (so that we have the permissions to create the directory /tmp/user) and create the /tmp/user directory if it did not exist earlier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9229) How to calculate the Kendall coefficient of correlation?
Marcin Kosiński created HIVE-9229: - Summary: How to calculate the Kendall coefficient of correlation? Key: HIVE-9229 URL: https://issues.apache.org/jira/browse/HIVE-9229 Project: Hive Issue Type: Wish Reporter: Marcin Kosiński Priority: Trivial In this [wiki page](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF) there is a function `corr()` that calculates the Pearson coefficient of correlation, but my question is: is there any function in Hive that calculates the Kendall coefficient of correlation for a pair of numeric columns? If anyone has any idea how to implement it, please answer [this](http://stackoverflow.com/questions/27231039/hive-how-to-calculate-the-kendall-coefficient-of-correlation) stackoverflow question. Thanks for help, Marcin -- This message was sent by Atlassian JIRA (v6.3.4#6332)
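For reference, the Kendall tau-a coefficient the reporter asks about can be computed with a naive O(n^2) pass over all pairs; a custom Hive UDAF could follow the same logic (this is a plain Python sketch, not an existing Hive function):

```python
def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) / total pairs.
    A pair (i, j) is concordant when xs and ys order it the same way."""
    assert len(xs) == len(ys) and len(xs) > 1
    n = len(xs)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

The result ranges from -1.0 (perfectly reversed ranking) to 1.0 (identical ranking); ties contribute to neither count in this tau-a variant.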
[jira] [Assigned] (HIVE-8816) Create unit test join of two encrypted tables with different keys
[ https://issues.apache.org/jira/browse/HIVE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña reassigned HIVE-8816: - Assignee: Sergio Peña (was: Ferdinand Xu) Create unit test join of two encrypted tables with different keys - Key: HIVE-8816 URL: https://issues.apache.org/jira/browse/HIVE-8816 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-8816.1.patch, HIVE-8816.patch NO PRECOMMIT TESTS The results should be inserted into a third table encrypted with a separate key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8816) Create unit test join of two encrypted tables with different keys
[ https://issues.apache.org/jira/browse/HIVE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-8816: -- Attachment: (was: HIVE-8816.1.patch) Create unit test join of two encrypted tables with different keys - Key: HIVE-8816 URL: https://issues.apache.org/jira/browse/HIVE-8816 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-8816.1.patch, HIVE-8816.patch NO PRECOMMIT TESTS The results should be inserted into a third table encrypted with a separate key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8816) Create unit test join of two encrypted tables with different keys
[ https://issues.apache.org/jira/browse/HIVE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-8816: -- Attachment: HIVE-8816.1.patch Create unit test join of two encrypted tables with different keys - Key: HIVE-8816 URL: https://issues.apache.org/jira/browse/HIVE-8816 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-8816.1.patch, HIVE-8816.1.patch, HIVE-8816.patch NO PRECOMMIT TESTS The results should be inserted into a third table encrypted with a separate key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-9229) How to calculate the Kendall coefficient of correlation?
[ https://issues.apache.org/jira/browse/HIVE-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang resolved HIVE-9229. --- Resolution: Invalid [~mkosinski], JIRA is used for reporting a problem or requesting a feature, not for asking questions, for which the user mailing list is a better place. How to calculate the Kendall coefficient of correlation? Key: HIVE-9229 URL: https://issues.apache.org/jira/browse/HIVE-9229 Project: Hive Issue Type: Wish Reporter: Marcin Kosiński Priority: Trivial In this [wiki page](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF) there is a function `corr()` that calculates the Pearson coefficient of correlation, but my question is: is there any function in Hive that calculates the Kendall coefficient of correlation for a pair of numeric columns? If anyone has any idea how to implement it, please answer [this](http://stackoverflow.com/questions/27231039/hive-how-to-calculate-the-kendall-coefficient-of-correlation) stackoverflow question. Thanks for help, Marcin -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HIVE-9167) Enhance encryption testing framework to allow create keys zones inside .q files
[ https://issues.apache.org/jira/browse/HIVE-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9167: --- Comment: was deleted (was: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12689535/HIVE-9167.4.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2221/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2221/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2221/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-2221/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'beeline/src/java/org/apache/hive/beeline/Commands.java' ++ awk '{print $2}' ++ egrep -v '^X|^Performing status on external' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/scheduler/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target hcatalog/hcatalog-pig-adapter/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target accumulo-handler/target hwi/target common/target common/src/gen contrib/target service/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target + svn update U ql/src/test/org/apache/hadoop/hive/ql/plan/TestConditionalResolverCommonJoin.java U ql/src/test/queries/clientnegative/columnstats_partlvl_invalid_values.q D ql/src/test/results/clientnegative/columnstats_partlvl_invalid_values.q.out A ql/src/test/results/clientnegative/columnstats_partlvl_invalid_values.q.java1.7.out A ql/src/test/results/clientnegative/columnstats_partlvl_invalid_values.q.java1.8.out U ql/src/test/results/clientnegative/unset_table_property.q.out D ql/src/test/results/clientpositive/list_bucket_dml_10.q.out U ql/src/test/results/clientpositive/stats_list_bucket.q.java1.8.out U ql/src/test/results/clientpositive/multiMapJoin2.q.out
U ql/src/test/results/clientpositive/list_bucket_dml_12.q.java1.8.out U ql/src/test/results/clientpositive/auto_join_without_localtask.q.out A ql/src/test/results/clientpositive/list_bucket_dml_10.q.java1.7.out A ql/src/test/results/clientpositive/list_bucket_dml_10.q.java1.8.out U ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java U ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java U ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java U ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java U ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java Fetching external item into 'hcatalog/src/test/e2e/harness' Updated external to revision 1648561. Updated to revision 1648561. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh +
[jira] [Commented] (HIVE-8410) Typo in DOAP - incorrect category URL
[ https://issues.apache.org/jira/browse/HIVE-8410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261449#comment-14261449 ] Brock Noland commented on HIVE-8410: +1 Typo in DOAP - incorrect category URL - Key: HIVE-8410 URL: https://issues.apache.org/jira/browse/HIVE-8410 Project: Hive Issue Type: Bug Environment: http://svn.apache.org/repos/asf/hive/trunk/doap_Hive.rdf Reporter: Sebb Assignee: Ferdinand Xu Attachments: HIVE-8410.1.patch, HIVE-8410.patch, doap_Hive.rdf NO PRECOMMIT TESTS The DOAP contains the following:
{code}
<category rdf:resource="http://www.apache.org/category/database" />
{code}
However, the URL is incorrect; it must be
{code}
<category rdf:resource="http://projects.apache.org/category/database" />
{code}
Please fix this -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8821) Create unit test where we insert into dynamically partitioned table
[ https://issues.apache.org/jira/browse/HIVE-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261451#comment-14261451 ] Brock Noland commented on HIVE-8821: +1 Create unit test where we insert into dynamically partitioned table --- Key: HIVE-8821 URL: https://issues.apache.org/jira/browse/HIVE-8821 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Dong Chen Fix For: encryption-branch Attachments: HIVE-8821.1.patch, HIVE-8821.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9205) Change default tez install directory to use /tmp instead of /user and create the directory if it does not exist
[ https://issues.apache.org/jira/browse/HIVE-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261450#comment-14261450 ] Prasanth Jayachandran commented on HIVE-9205: - +1 Change default tez install directory to use /tmp instead of /user and create the directory if it does not exist --- Key: HIVE-9205 URL: https://issues.apache.org/jira/browse/HIVE-9205 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0, 0.15.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.15.0, 0.14.1 Attachments: HIVE-9205.1.patch, HIVE-9205.2.patch The common deployment scenario is to install the packages and start services. Creating the /user/user directory is currently an extra step during manual installation. In case the user tries to bring up the hive shell with tez enabled, this would result in an exception. The solution is to change the default install directory to /tmp (so that we have the permissions to create the directory /tmp/user) and create the /tmp/user directory if it did not exist earlier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
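The create-the-directory-if-missing behavior described in HIVE-9205 can be sketched against the local filesystem. This is an illustrative stand-alone version, not the patch itself; Hive would go through the Hadoop FileSystem API against HDFS, and the class and method names below are hypothetical:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of the HIVE-9205 behavior: default the install
// directory under /tmp (where we have permission to create it) and
// create it on first use instead of failing when it does not exist.
public class TezInstallDir {
    public static Path ensureInstallDir(String userName) {
        Path dir = Paths.get("/tmp", userName); // default moved from /user to /tmp
        try {
            Files.createDirectories(dir);       // no-op if the directory already exists
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(ensureInstallDir(System.getProperty("user.name")));
    }
}
```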
[jira] [Updated] (HIVE-8821) Create unit test where we insert into dynamically partitioned table
[ https://issues.apache.org/jira/browse/HIVE-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8821: --- Resolution: Fixed Status: Resolved (was: Patch Available) Thank you [~dongc]! I have committed this to branch. Create unit test where we insert into dynamically partitioned table --- Key: HIVE-8821 URL: https://issues.apache.org/jira/browse/HIVE-8821 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Dong Chen Fix For: encryption-branch Attachments: HIVE-8821.1.patch, HIVE-8821.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9221) Remove deprecation warning for hive.metastore.local
[ https://issues.apache.org/jira/browse/HIVE-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9221: --- Resolution: Fixed Fix Version/s: 0.15.0 Status: Resolved (was: Patch Available) Thank you Ashutosh for the review! I have committed this to trunk. Remove deprecation warning for hive.metastore.local --- Key: HIVE-9221 URL: https://issues.apache.org/jira/browse/HIVE-9221 Project: Hive Issue Type: Bug Affects Versions: 0.15.0 Reporter: Brock Noland Assignee: Brock Noland Priority: Minor Fix For: 0.15.0 Attachments: HIVE-9221.patch The property {{hive.metastore.local}} has been removed for years. We can remove the warning. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9205) Change default tez install directory to use /tmp instead of /user and create the directory if it does not exist
[ https://issues.apache.org/jira/browse/HIVE-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261486#comment-14261486 ] Gunther Hagleitner commented on HIVE-9205: -- +1 Change default tez install directory to use /tmp instead of /user and create the directory if it does not exist --- Key: HIVE-9205 URL: https://issues.apache.org/jira/browse/HIVE-9205 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0, 0.15.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.15.0, 0.14.1 Attachments: HIVE-9205.1.patch, HIVE-9205.2.patch The common deployment scenario is to install the packages and start services. Creating the /user/user directory is currently an extra step during manual installation. In case the user tries to bring up the hive shell with tez enabled, this would result in an exception. The solution is to change the default install directory to /tmp (so that we have the permissions to create the directory /tmp/user) and create the /tmp/user directory if it did not exist earlier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8920) IOContext problem with multiple MapWorks cloned for multi-insert [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-8920: -- Resolution: Fixed Fix Version/s: spark-branch Status: Resolved (was: Patch Available) Patch #3 is committed to Spark branch. IOContext problem with multiple MapWorks cloned for multi-insert [Spark Branch] --- Key: HIVE-8920 URL: https://issues.apache.org/jira/browse/HIVE-8920 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Chao Assignee: Xuefu Zhang Fix For: spark-branch Attachments: HIVE-8920.1-spark.patch, HIVE-8920.2-spark.patch, HIVE-8920.3-spark.patch, HIVE-8920.4-spark.patch The following query will not work:
{code}
from (select * from table0 union all select * from table1) s
insert overwrite table table3 select s.x, count(1) group by s.x
insert overwrite table table4 select s.y, count(1) group by s.y;
{code}
Currently, the plan for this query, before SplitSparkWorkResolver, looks like below:
{noformat}
M1    M2
  \  /  \
   U3    R5
   |
   R4
{noformat}
{{SplitSparkWorkResolver#splitBaseWork}} assumes that the {{childWork}} is a ReduceWork, but in this case, the childWork of M2 could be the UnionWork U3, so the code will fail. HIVE-9041 partially addressed the problem by removing the union task. However, it is still necessary to clone M1 and M2 to support multi-insert. Because M1 and M2 can run in a single JVM, the original solution of storing a single global IOContext will not work: M1 and M2 have different IOContexts, and both need to be stored. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
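The constraint described above can be sketched generically: instead of one global context, keep a map keyed by the cloned MapWork's input name, so that M1 and M2, running in the same JVM, each see their own state. The names {{IOContextRegistry}} and {{IoState}} are illustrative only, not Hive's actual classes:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: rather than a single global IOContext, store one
// context per MapWork name, so cloned works sharing a JVM do not clobber
// each other's read positions.
public class IOContextRegistry {
    public static class IoState {
        public long currentBlockStart; // per-work read position (illustrative field)
    }

    private static final Map<String, IoState> CONTEXTS = new ConcurrentHashMap<>();

    // Each distinct input name (e.g. "M1", "M2") gets its own lazily created state.
    public static IoState get(String inputName) {
        return CONTEXTS.computeIfAbsent(inputName, k -> new IoState());
    }

    public static void main(String[] args) {
        get("M1").currentBlockStart = 100L;
        get("M2").currentBlockStart = 200L;
        System.out.println(get("M1").currentBlockStart); // M2's update did not overwrite M1's
    }
}
```

With a single global variable, whichever of M1 or M2 wrote last would win; the map keeps the two states independent.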
[jira] [Created] (HIVE-9230) Followup for HIVE-9125, update ppd_join4.q.out for Spark [Spark Branch]
Xuefu Zhang created HIVE-9230: - Summary: Followup for HIVE-9125, update ppd_join4.q.out for Spark [Spark Branch] Key: HIVE-9230 URL: https://issues.apache.org/jira/browse/HIVE-9230 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9230) Followup for HIVE-9125, update ppd_join4.q.out for Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang reassigned HIVE-9230: - Assignee: Xuefu Zhang Followup for HIVE-9125, update ppd_join4.q.out for Spark [Spark Branch] --- Key: HIVE-9230 URL: https://issues.apache.org/jira/browse/HIVE-9230 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Xuefu Zhang -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Follow up on HIVE-3405
Hi everyone, Can anyone review HIVE-3405.3.patch? https://issues.apache.org/jira/browse/HIVE-3405 Build 2218 was successful except for 2 errors in org.apache.hadoop.hive.cli (which were also present in previous builds). Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2218/testReport Thank you, Alex
[jira] [Commented] (HIVE-8817) Create unit test where we insert into an encrypted table and then read from it with pig
[ https://issues.apache.org/jira/browse/HIVE-8817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261516#comment-14261516 ] Brock Noland commented on HIVE-8817: I don't think we want to reuse the entire tests, but we can look at {{TestHCatLoader}}, {{TestHCatStorer}}, and {{TestHCatHiveCompatibility}} as basic examples. Create unit test where we insert into an encrypted table and then read from it with pig --- Key: HIVE-8817 URL: https://issues.apache.org/jira/browse/HIVE-8817 Project: Hive Issue Type: Sub-task Affects Versions: encryption-branch Reporter: Brock Noland Assignee: Dong Chen Fix For: encryption-branch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8818) Create unit test where we insert into an encrypted table and then read from it with hcatalog mapreduce
[ https://issues.apache.org/jira/browse/HIVE-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261517#comment-14261517 ] Brock Noland commented on HIVE-8818: A good example is {{TestSequenceFileReadWrite.testSequenceTableWriteReadMR}}: https://github.com/apache/hive/blob/trunk/itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/mapreduce/TestSequenceFileReadWrite.java#L160 Create unit test where we insert into an encrypted table and then read from it with hcatalog mapreduce -- Key: HIVE-8818 URL: https://issues.apache.org/jira/browse/HIVE-8818 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Dong Chen -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: HIVE-3405 initcap UDF
Thanks for the patch Alexander! I will review it. On Mon, Dec 29, 2014 at 10:54 PM, Alexander Pivovarov apivova...@gmail.com wrote: Hi Everyone I've attached patch HIVE-3405.3.patch which includes: - initcap UDF implementation GenericUDFInitCap - vectorized expressions StringInitCap - initcap unit test - udf_initcap.q itest qfile - fixed show_functions.q https://issues.apache.org/jira/browse/HIVE-3405 Alex -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
[jira] [Commented] (HIVE-3405) UDF initcap to obtain a string with the first letter of each word in uppercase other letters in lowercase
[ https://issues.apache.org/jira/browse/HIVE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261527#comment-14261527 ] Thejas M Nair commented on HIVE-3405: - Thanks for the patch [~apivovarov]. Can you also upload it to http://reviews.apache.org/ ? (instructions if you end up needing them - https://cwiki.apache.org/confluence/display/Hive/Review+Board ) Can you also format the code to use two spaces for indentation? (https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-CodingConventions) UDF initcap to obtain a string with the first letter of each word in uppercase other letters in lowercase - Key: HIVE-3405 URL: https://issues.apache.org/jira/browse/HIVE-3405 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.8.1, 0.9.0, 0.9.1, 0.10.0, 0.11.0, 0.13.0, 0.14.0, 0.15.0, 0.14.1 Reporter: Archana Nair Assignee: Alexander Pivovarov Labels: patch Attachments: HIVE-3405.1.patch.txt, HIVE-3405.2.patch, HIVE-3405.3.patch Hive's current releases lack an INITCAP function. INITCAP returns a string with the first letter of each word in uppercase and all other letters in lowercase; words are delimited by white space. This will be useful for report generation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
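For reference, the INITCAP semantics described in this issue (first letter of each whitespace-delimited word in uppercase, remaining letters in lowercase) can be sketched in plain Java. This is an illustrative stand-alone version, not the GenericUDFInitCap implementation from the patch:

```java
// Illustrative initcap: capitalize the first letter of each
// whitespace-delimited word and lowercase the remaining letters.
public class InitCap {
    public static String initCap(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        boolean startOfWord = true;
        for (char c : s.toCharArray()) {
            if (Character.isWhitespace(c)) {
                startOfWord = true;       // next non-whitespace char starts a word
                sb.append(c);
            } else {
                sb.append(startOfWord ? Character.toUpperCase(c)
                                      : Character.toLowerCase(c));
                startOfWord = false;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(initCap("hive QUERY language")); // Hive Query Language
    }
}
```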
[jira] [Updated] (HIVE-9119) ZooKeeperHiveLockManager does not use zookeeper in the proper way
[ https://issues.apache.org/jira/browse/HIVE-9119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Na Yang updated HIVE-9119: -- Attachment: HIVE-9119.3.patch ZooKeeperHiveLockManager does not use zookeeper in the proper way - Key: HIVE-9119 URL: https://issues.apache.org/jira/browse/HIVE-9119 Project: Hive Issue Type: Improvement Components: Locking Affects Versions: 0.13.0, 0.14.0, 0.13.1 Reporter: Na Yang Assignee: Na Yang Attachments: HIVE-9119.1.patch, HIVE-9119.2.patch, HIVE-9119.3.patch ZooKeeperHiveLockManager does not use ZooKeeper in the proper way. Currently a new ZooKeeper client instance is created for each getlock/releaselock query, which sometimes causes the number of open connections between HiveServer2 and ZooKeeper to exceed the maximum number of connections the ZooKeeper server allows. To use ZooKeeper as a distributed lock, there is no need to create a new ZooKeeper instance for every getlock attempt; a single ZooKeeper instance can be reused and shared by all ZooKeeperHiveLockManagers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9119) ZooKeeperHiveLockManager does not use zookeeper in the proper way
[ https://issues.apache.org/jira/browse/HIVE-9119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261561#comment-14261561 ] Na Yang commented on HIVE-9119: --- [~leftylev], thank you for reviewing the patch. I uploaded a new patch according to your suggestion and also changed hive.zookeeper.session.timeout to 60ms. ZooKeeperHiveLockManager does not use zookeeper in the proper way - Key: HIVE-9119 URL: https://issues.apache.org/jira/browse/HIVE-9119 Project: Hive Issue Type: Improvement Components: Locking Affects Versions: 0.13.0, 0.14.0, 0.13.1 Reporter: Na Yang Assignee: Na Yang Attachments: HIVE-9119.1.patch, HIVE-9119.2.patch, HIVE-9119.3.patch ZooKeeperHiveLockManager does not use ZooKeeper in the proper way. Currently a new ZooKeeper client instance is created for each getlock/releaselock query, which sometimes causes the number of open connections between HiveServer2 and ZooKeeper to exceed the maximum number of connections the ZooKeeper server allows. To use ZooKeeper as a distributed lock, there is no need to create a new ZooKeeper instance for every getlock attempt; a single ZooKeeper instance can be reused and shared by all ZooKeeperHiveLockManagers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 29494: HIVE-9119: ZooKeeperHiveLockManager does not use zookeeper in the proper way
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29494/ --- Review request for hive, Brock Noland, Szehon Ho, and Xuefu Zhang. Bugs: HIVE-9119 https://issues.apache.org/jira/browse/HIVE-9119 Repository: hive-git Description --- 1. Use a singleton ZooKeeper client for ZooKeeperHiveLockManager 2. Use CuratorFramework to manage the ZooKeeper client Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2e51518 itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 878202a ql/pom.xml 84e912e ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/CuratorFrameworkSingleton.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java 1334a91 ql/src/test/org/apache/hadoop/hive/ql/lockmgr/zookeeper/TestZookeeperLockManager.java aacb73f Diff: https://reviews.apache.org/r/29494/diff/ Testing --- Thanks, Na Yang
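The shared-client design in this review (one lazily created ZooKeeper client reused by all lock managers, rather than a new connection per getlock/releaselock call) can be illustrated with a generic singleton sketch. {{Client}} below is a placeholder stand-in, not the real CuratorFramework API; the actual change lives in the new CuratorFrameworkSingleton class:

```java
// Generic sketch of the shared-client idea: one lazily created client,
// reused by every caller, instead of a new connection per lock operation.
public class SharedClientSingleton {
    // Placeholder for a real ZooKeeper/Curator client.
    public static class Client {
        private boolean open = true;
        public boolean isOpen() { return open; }
        public void close() { open = false; }
    }

    private static Client instance;

    // Connect once; hand the same client to all lock managers. Recreate
    // only if the shared client has been closed.
    public static synchronized Client getInstance() {
        if (instance == null || !instance.isOpen()) {
            instance = new Client();
        }
        return instance;
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance()); // same connection reused
    }
}
```

The point is that the number of server-side connections stays bounded by one per JVM, rather than growing with the number of lock/unlock queries.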
[jira] [Updated] (HIVE-3405) UDF initcap to obtain a string with the first letter of each word in uppercase other letters in lowercase
[ https://issues.apache.org/jira/browse/HIVE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-3405: -- Status: In Progress (was: Patch Available) UDF initcap to obtain a string with the first letter of each word in uppercase other letters in lowercase - Key: HIVE-3405 URL: https://issues.apache.org/jira/browse/HIVE-3405 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.14.0, 0.13.0, 0.11.0, 0.10.0, 0.9.0, 0.8.1, 0.15.0, 0.14.1, 0.9.1 Reporter: Archana Nair Assignee: Alexander Pivovarov Labels: patch Attachments: HIVE-3405.1.patch.txt, HIVE-3405.2.patch, HIVE-3405.3.patch Hive's current releases lack an INITCAP function. INITCAP returns a string with the first letter of each word in uppercase and all other letters in lowercase; words are delimited by white space. This will be useful for report generation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-3405) UDF initcap to obtain a string with the first letter of each word in uppercase other letters in lowercase
[ https://issues.apache.org/jira/browse/HIVE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-3405: -- Attachment: HIVE-3405.4.patch Use 2 spaces for indentation. UDF initcap to obtain a string with the first letter of each word in uppercase other letters in lowercase - Key: HIVE-3405 URL: https://issues.apache.org/jira/browse/HIVE-3405 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.8.1, 0.9.0, 0.9.1, 0.10.0, 0.11.0, 0.13.0, 0.14.0, 0.15.0, 0.14.1 Reporter: Archana Nair Assignee: Alexander Pivovarov Labels: patch Attachments: HIVE-3405.1.patch.txt, HIVE-3405.2.patch, HIVE-3405.3.patch, HIVE-3405.4.patch Hive's current releases lack an INITCAP function. INITCAP returns a string with the first letter of each word in uppercase and all other letters in lowercase; words are delimited by white space. This will be useful for report generation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-3405) UDF initcap to obtain a string with the first letter of each word in uppercase other letters in lowercase
[ https://issues.apache.org/jira/browse/HIVE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-3405: -- Status: Patch Available (was: In Progress) UDF initcap to obtain a string with the first letter of each word in uppercase other letters in lowercase - Key: HIVE-3405 URL: https://issues.apache.org/jira/browse/HIVE-3405 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.14.0, 0.13.0, 0.11.0, 0.10.0, 0.9.0, 0.8.1, 0.15.0, 0.14.1, 0.9.1 Reporter: Archana Nair Assignee: Alexander Pivovarov Labels: patch Attachments: HIVE-3405.1.patch.txt, HIVE-3405.2.patch, HIVE-3405.3.patch, HIVE-3405.4.patch Hive's current releases lack an INITCAP function. INITCAP returns a string with the first letter of each word in uppercase and all other letters in lowercase; words are delimited by white space. This will be useful for report generation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8816) Create unit test join of two encrypted tables with different keys
[ https://issues.apache.org/jira/browse/HIVE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-8816: -- Status: Patch Available (was: Open) Create unit test join of two encrypted tables with different keys - Key: HIVE-8816 URL: https://issues.apache.org/jira/browse/HIVE-8816 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-8816.1.patch, HIVE-8816.2.patch, HIVE-8816.patch NO PRECOMMIT TESTS The results should be inserted into a third table encrypted with a separate key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8816) Create unit test join of two encrypted tables with different keys
[ https://issues.apache.org/jira/browse/HIVE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-8816: -- Attachment: HIVE-8816.2.patch Hi [~Ferd] Here's the test updated with the new testing framework. Create unit test join of two encrypted tables with different keys - Key: HIVE-8816 URL: https://issues.apache.org/jira/browse/HIVE-8816 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-8816.1.patch, HIVE-8816.2.patch, HIVE-8816.patch NO PRECOMMIT TESTS The results should be inserted into a third table encrypted with a separate key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9205) Change default tez install directory to use /tmp instead of /user and create the directory if it does not exist
[ https://issues.apache.org/jira/browse/HIVE-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9205: - Resolution: Fixed Status: Resolved (was: Patch Available) Change default tez install directory to use /tmp instead of /user and create the directory if it does not exist --- Key: HIVE-9205 URL: https://issues.apache.org/jira/browse/HIVE-9205 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0, 0.15.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.15.0, 0.14.1 Attachments: HIVE-9205.1.patch, HIVE-9205.2.patch The common deployment scenario is to install the packages and start services. Creating the /user/user directory is currently an extra step during manual installation. In case the user tries to bring up the hive shell with tez enabled, this would result in an exception. The solution is to change the default install directory to /tmp (so that we have the permissions to create the directory /tmp/user) and create the /tmp/user directory if it did not exist earlier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9205) Change default tez install directory to use /tmp instead of /user and create the directory if it does not exist
[ https://issues.apache.org/jira/browse/HIVE-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261583#comment-14261583 ] Vikram Dixit K commented on HIVE-9205: -- Committed to trunk and branch 0.14. Change default tez install directory to use /tmp instead of /user and create the directory if it does not exist --- Key: HIVE-9205 URL: https://issues.apache.org/jira/browse/HIVE-9205 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0, 0.15.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.15.0, 0.14.1 Attachments: HIVE-9205.1.patch, HIVE-9205.2.patch The common deployment scenario is to install the packages and start services. Creating the /user/user directory is currently an extra step during manual installation. In case the user tries to bring up the hive shell with tez enabled, this would result in an exception. The solution is to change the default install directory to /tmp (so that we have the permissions to create the directory /tmp/user) and create the /tmp/user directory if it did not exist earlier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8816) Create unit test join of two encrypted tables with different keys
[ https://issues.apache.org/jira/browse/HIVE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-8816: -- Status: Patch Available (was: Open) Create unit test join of two encrypted tables with different keys - Key: HIVE-8816 URL: https://issues.apache.org/jira/browse/HIVE-8816 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-8816.1.patch, HIVE-8816.3.patch, HIVE-8816.patch NO PRECOMMIT TESTS The results should be inserted into a third table encrypted with a separate key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8816) Create unit test join of two encrypted tables with different keys
[ https://issues.apache.org/jira/browse/HIVE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-8816: -- Attachment: HIVE-8816.3.patch Create unit test join of two encrypted tables with different keys - Key: HIVE-8816 URL: https://issues.apache.org/jira/browse/HIVE-8816 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-8816.1.patch, HIVE-8816.3.patch, HIVE-8816.patch NO PRECOMMIT TESTS The results should be inserted into a third table encrypted with a separate key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8816) Create unit test join of two encrypted tables with different keys
[ https://issues.apache.org/jira/browse/HIVE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-8816: -- Attachment: (was: HIVE-8816.2.patch) Create unit test join of two encrypted tables with different keys - Key: HIVE-8816 URL: https://issues.apache.org/jira/browse/HIVE-8816 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-8816.1.patch, HIVE-8816.3.patch, HIVE-8816.patch NO PRECOMMIT TESTS The results should be inserted into a third table encrypted with a separate key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-3405) UDF initcap to obtain a string with the first letter of each word in uppercase other letters in lowercase
[ https://issues.apache.org/jira/browse/HIVE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261585#comment-14261585 ] Alexander Pivovarov commented on HIVE-3405: --- Review board link: https://reviews.apache.org/r/29495/ UDF initcap to obtain a string with the first letter of each word in uppercase other letters in lowercase - Key: HIVE-3405 URL: https://issues.apache.org/jira/browse/HIVE-3405 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.8.1, 0.9.0, 0.9.1, 0.10.0, 0.11.0, 0.13.0, 0.14.0, 0.15.0, 0.14.1 Reporter: Archana Nair Assignee: Alexander Pivovarov Labels: patch Attachments: HIVE-3405.1.patch.txt, HIVE-3405.2.patch, HIVE-3405.3.patch, HIVE-3405.4.patch Hive's current releases lack an INITCAP function. INITCAP returns a string with the first letter of each word in uppercase and all other letters in lowercase; words are delimited by white space. This will be useful for report generation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8816) Create unit test join of two encrypted tables with different keys
[ https://issues.apache.org/jira/browse/HIVE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-8816: -- Status: Open (was: Patch Available) Create unit test join of two encrypted tables with different keys - Key: HIVE-8816 URL: https://issues.apache.org/jira/browse/HIVE-8816 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-8816.1.patch, HIVE-8816.3.patch, HIVE-8816.patch NO PRECOMMIT TESTS The results should be inserted into a third table encrypted with a separate key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9230) Followup for HIVE-9125, update ppd_join4.q.out for Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9230: -- Status: Patch Available (was: Open) Followup for HIVE-9125, update ppd_join4.q.out for Spark [Spark Branch] --- Key: HIVE-9230 URL: https://issues.apache.org/jira/browse/HIVE-9230 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-9230.1-spark.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9230) Followup for HIVE-9125, update ppd_join4.q.out for Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9230: -- Attachment: HIVE-9230.1-spark.patch Followup for HIVE-9125, update ppd_join4.q.out for Spark [Spark Branch] --- Key: HIVE-9230 URL: https://issues.apache.org/jira/browse/HIVE-9230 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-9230.1-spark.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8816) Create unit test join of two encrypted tables with different keys
[ https://issues.apache.org/jira/browse/HIVE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8816: --- Assignee: Ferdinand Xu (was: Sergio Peña) Create unit test join of two encrypted tables with different keys - Key: HIVE-8816 URL: https://issues.apache.org/jira/browse/HIVE-8816 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Ferdinand Xu Fix For: encryption-branch Attachments: HIVE-8816.1.patch, HIVE-8816.3.patch, HIVE-8816.patch NO PRECOMMIT TESTS The results should be inserted into a third table encrypted with a separate key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8816) Create unit test join of two encrypted tables with different keys
[ https://issues.apache.org/jira/browse/HIVE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261586#comment-14261586 ] Brock Noland commented on HIVE-8816: Hi [~spena], Thank you for updating the patch with the new changes! Since [~Ferd] was already working on this, I am going to keep it assigned to him. I see there is an additional fix, {{keyProvider.flush();}}, in this patch. Could you make that change in a separate issue? [~Ferd], could you review Sergio's update and let me know what you think? Create unit test join of two encrypted tables with different keys - Key: HIVE-8816 URL: https://issues.apache.org/jira/browse/HIVE-8816 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-8816.1.patch, HIVE-8816.3.patch, HIVE-8816.patch NO PRECOMMIT TESTS The results should be inserted into a third table encrypted with a separate key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9231) Encryption keys deletion need to be flushed so that it updates the JKS file
Sergio Peña created HIVE-9231: - Summary: Encryption keys deletion need to be flushed so that it updates the JKS file Key: HIVE-9231 URL: https://issues.apache.org/jira/browse/HIVE-9231 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Fix For: encryption-branch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9231) Encryption keys deletion need to be flushed so that it updates the JKS file
[ https://issues.apache.org/jira/browse/HIVE-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9231: -- Attachment: HIVE-9231.1.patch Encryption keys deletion need to be flushed so that it updates the JKS file --- Key: HIVE-9231 URL: https://issues.apache.org/jira/browse/HIVE-9231 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-9231.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9231) Encryption keys deletion need to be flushed so that it updates the JKS file
[ https://issues.apache.org/jira/browse/HIVE-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9231: -- Status: Patch Available (was: Open) Encryption keys deletion need to be flushed so that it updates the JKS file --- Key: HIVE-9231 URL: https://issues.apache.org/jira/browse/HIVE-9231 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-9231.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9231) Encryption keys deletion need to be flushed so that it updates the JKS file
[ https://issues.apache.org/jira/browse/HIVE-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9231: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to branch! Encryption keys deletion need to be flushed so that it updates the JKS file --- Key: HIVE-9231 URL: https://issues.apache.org/jira/browse/HIVE-9231 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-9231.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
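The pattern behind HIVE-9231 is that keystore mutations only reach the backing JKS file once {{flush()}} is called, so a {{deleteKey}} without a subsequent flush leaves the old key on disk. A minimal self-contained sketch of that semantics; {{JksBackedKeys}} and its methods are hypothetical stand-ins, not the real Hadoop {{KeyProvider}} API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a provider whose deletions reach the backing "JKS file" only on
// flush() -- mirroring why a flush is needed after deleteKey(). All names
// here are hypothetical stand-ins for illustration.
class JksBackedKeys {
    private final Map<String, byte[]> pending = new HashMap<>();   // in-memory view
    private final Map<String, byte[]> persisted = new HashMap<>(); // simulated JKS file

    void createKey(String name, byte[] material) {
        pending.put(name, material);
    }

    void deleteKey(String name) {
        pending.remove(name); // removed in memory only, not on "disk"
    }

    void flush() {
        // Only now does the backing file reflect pending mutations.
        persisted.clear();
        persisted.putAll(pending);
    }

    boolean persistedContains(String name) {
        return persisted.containsKey(name);
    }
}
```

Without the second flush, a q-test that deletes a key and exits would leave the key in the JKS file for the next test run, which is the inconsistency the patch addresses.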
[jira] [Commented] (HIVE-8816) Create unit test join of two encrypted tables with different keys
[ https://issues.apache.org/jira/browse/HIVE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261592#comment-14261592 ] Brock Noland commented on HIVE-8816: [~Ferd] FYI I just committed HIVE-9231 to branch thus the {{keyprovider.flush}} will need to be removed from this patch. Create unit test join of two encrypted tables with different keys - Key: HIVE-8816 URL: https://issues.apache.org/jira/browse/HIVE-8816 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Ferdinand Xu Fix For: encryption-branch Attachments: HIVE-8816.1.patch, HIVE-8816.3.patch, HIVE-8816.patch NO PRECOMMIT TESTS The results should be inserted into a third table encrypted with a separate key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9232) We should throw an error on Hadoop23Shims.createKey() if the key already exists
Sergio Peña created HIVE-9232: - Summary: We should throw an error on Hadoop23Shims.createKey() if the key already exists Key: HIVE-9232 URL: https://issues.apache.org/jira/browse/HIVE-9232 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Fix For: encryption-branch We should throw an error when creating an encryption key if the key already exists. Developers might forget to delete the keys during the q-tests, and the next q-test that creates the same key name with a different bit-length will not fail, causing the test to run successfully (but not correctly). Let's just throw an error on Hadoop23Shims.createKey() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9233) Delete default encrypted databases created by TestEncryptedHDFSCliDriver
Sergio Peña created HIVE-9233: - Summary: Delete default encrypted databases created by TestEncryptedHDFSCliDriver Key: HIVE-9233 URL: https://issues.apache.org/jira/browse/HIVE-9233 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Fix For: encryption-branch The default encrypted databases created/deleted by HIVE-8900: - q_test_init_for_encryption.sql - q_test_cleanup_for_encrypted.sql are not needed anymore because of the changes made by HIVE-9167. We should delete all code related to those default databases for testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9234) HiveServer2 leaks FileSystem objects in FileSystem.CACHE
[ https://issues.apache.org/jira/browse/HIVE-9234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-9234: --- Component/s: HiveServer2 HiveServer2 leaks FileSystem objects in FileSystem.CACHE Key: HIVE-9234 URL: https://issues.apache.org/jira/browse/HIVE-9234 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Running over extended period (48+ hrs), we've noticed HiveServer2 leaking FileSystem objects in FileSystem.CACHE. Linked jiras were previous attempts to fix it, but the issue still seems to be there. A workaround is to disable the caching (by setting {{fs.hdfs.impl.disable.cache}} and {{fs.file.impl.disable.cache}} to {{true}}), but creating new FileSystem objects is expensive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9234) HiveServer2 leaks FileSystem objects in FileSystem.CACHE
Vaibhav Gumashta created HIVE-9234: -- Summary: HiveServer2 leaks FileSystem objects in FileSystem.CACHE Key: HIVE-9234 URL: https://issues.apache.org/jira/browse/HIVE-9234 Project: Hive Issue Type: Bug Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Running over extended period (48+ hrs), we've noticed HiveServer2 leaking FileSystem objects in FileSystem.CACHE. Linked jiras were previous attempts to fix it, but the issue still seems to be there. A workaround is to disable the caching (by setting {{fs.hdfs.impl.disable.cache}} and {{fs.file.impl.disable.cache}} to {{true}}), but creating new FileSystem objects is expensive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9234) HiveServer2 leaks FileSystem objects in FileSystem.CACHE
[ https://issues.apache.org/jira/browse/HIVE-9234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-9234: --- Fix Version/s: 0.14.1 HiveServer2 leaks FileSystem objects in FileSystem.CACHE Key: HIVE-9234 URL: https://issues.apache.org/jira/browse/HIVE-9234 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0, 0.12.1, 0.14.0, 0.13.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.14.1 Running over extended period (48+ hrs), we've noticed HiveServer2 leaking FileSystem objects in FileSystem.CACHE. Linked jiras were previous attempts to fix it, but the issue still seems to be there. A workaround is to disable the caching (by setting {{fs.hdfs.impl.disable.cache}} and {{fs.file.impl.disable.cache}} to {{true}}), but creating new FileSystem objects is expensive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9234) HiveServer2 leaks FileSystem objects in FileSystem.CACHE
[ https://issues.apache.org/jira/browse/HIVE-9234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-9234: --- Affects Version/s: 0.12.1 0.12.0 0.13.0 0.14.0 0.13.1 HiveServer2 leaks FileSystem objects in FileSystem.CACHE Key: HIVE-9234 URL: https://issues.apache.org/jira/browse/HIVE-9234 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0, 0.12.1, 0.14.0, 0.13.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.14.1 Running over extended period (48+ hrs), we've noticed HiveServer2 leaking FileSystem objects in FileSystem.CACHE. Linked jiras were previous attempts to fix it, but the issue still seems to be there. A workaround is to disable the caching (by setting {{fs.hdfs.impl.disable.cache}} and {{fs.file.impl.disable.cache}} to {{true}}), but creating new FileSystem objects is expensive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
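The workaround named in the report, expressed as the corresponding Hadoop configuration properties (e.g. in core-site.xml or the HiveServer2 configuration):

```xml
<!-- Workaround from HIVE-9234: disable the FileSystem cache so HiveServer2
     does not accumulate entries in FileSystem.CACHE. The report notes the
     trade-off: creating new FileSystem objects is expensive. -->
<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>true</value>
</property>
<property>
  <name>fs.file.impl.disable.cache</name>
  <value>true</value>
</property>
```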
[jira] [Updated] (HIVE-9232) We should throw an error on Hadoop23Shims.createKey() if the key already exists
[ https://issues.apache.org/jira/browse/HIVE-9232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9232: -- Status: Patch Available (was: Open) We should throw an error on Hadoop23Shims.createKey() if the key already exists --- Key: HIVE-9232 URL: https://issues.apache.org/jira/browse/HIVE-9232 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-9232.1.patch We should throw an error when creating an encryption key if the key already exists. Developers might forget to delete the keys during the q-tests, and the next q-test that creates the same key name with a different bit-length will not fail, causing the test to run successfully (but not correctly). Let's just throw an error on Hadoop23Shims.createKey() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9232) We should throw an error on Hadoop23Shims.createKey() if the key already exists
[ https://issues.apache.org/jira/browse/HIVE-9232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9232: -- Attachment: HIVE-9232.1.patch We should throw an error on Hadoop23Shims.createKey() if the key already exists --- Key: HIVE-9232 URL: https://issues.apache.org/jira/browse/HIVE-9232 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-9232.1.patch We should throw an error when creating an encryption key if the key already exists. Developers might forget to delete the keys during the q-tests, and the next q-test that creates the same key name with a different bit-length will not fail, causing the test to run successfully (but not correctly). Let's just throw an error on Hadoop23Shims.createKey() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
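The proposed behavior is fail-fast key creation: if a q-test re-creates an existing key name with a different bit-length, creation should error out instead of silently keeping the old key. A minimal sketch of that check; {{KeyStoreSim}} is a hypothetical stand-in, not the real {{Hadoop23Shims}}/{{KeyProvider}} API:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Sketch of the HIVE-9232 proposal: reject duplicate key names up front so a
// test cannot "pass" while running against a stale key with the wrong
// bit-length. KeyStoreSim and its methods are hypothetical stand-ins.
class KeyStoreSim {
    private final Map<String, Integer> keys = new HashMap<>(); // name -> bit length

    void createKey(String name, int bitLength) throws IOException {
        if (keys.containsKey(name)) {
            throw new IOException("key '" + name + "' already exists");
        }
        keys.put(name, bitLength);
    }
}
```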
[jira] [Commented] (HIVE-8815) Create unit test join of encrypted and unencrypted table
[ https://issues.apache.org/jira/browse/HIVE-8815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261661#comment-14261661 ] Sergio Peña commented on HIVE-8815: --- Hi [~Ferd] Could you update the patch so that it uses the new encryption testing framework HIVE-9167? Here's an example on HIVE-8816 Create unit test join of encrypted and unencrypted table Key: HIVE-8815 URL: https://issues.apache.org/jira/browse/HIVE-8815 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Ferdinand Xu Fix For: encryption-branch Attachments: HIVE-8815.1.patch, HIVE-8815.patch NO PRECOMMIT TESTS The results should be inserted into a third table encrypted with a separate key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9233) Delete default encrypted databases created by TestEncryptedHDFSCliDriver
[ https://issues.apache.org/jira/browse/HIVE-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9233: -- Attachment: HIVE-9233.1.patch Delete default encrypted databases created by TestEncryptedHDFSCliDriver Key: HIVE-9233 URL: https://issues.apache.org/jira/browse/HIVE-9233 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-9233.1.patch The default encrypted databases created/deleted by HIVE-8900: - q_test_init_for_encryption.sql - q_test_cleanup_for_encrypted.sql are not needed anymore because of the changes made by HIVE-9167. We should delete all code related to those default databases for testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9233) Delete default encrypted databases created by TestEncryptedHDFSCliDriver
[ https://issues.apache.org/jira/browse/HIVE-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9233: -- Status: Patch Available (was: Open) Delete default encrypted databases created by TestEncryptedHDFSCliDriver Key: HIVE-9233 URL: https://issues.apache.org/jira/browse/HIVE-9233 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-9233.1.patch The default encrypted databases created/deleted by HIVE-8900: - q_test_init_for_encryption.sql - q_test_cleanup_for_encrypted.sql are not needed anymore because of the changes made by HIVE-9167. We should delete all code related to those default databases for testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8821) Create unit test where we insert into dynamically partitioned table
[ https://issues.apache.org/jira/browse/HIVE-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261686#comment-14261686 ] Sergio Peña commented on HIVE-8821: --- Hi [~dongc] Could you update the patch so that it uses the new encryption testing framework HIVE-9167? Here's an example on HIVE-8816 Create unit test where we insert into dynamically partitioned table --- Key: HIVE-8821 URL: https://issues.apache.org/jira/browse/HIVE-8821 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Dong Chen Fix For: encryption-branch Attachments: HIVE-8821.1.patch, HIVE-8821.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8822) Create unit test where we insert into statically partitioned table
[ https://issues.apache.org/jira/browse/HIVE-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261687#comment-14261687 ] Sergio Peña commented on HIVE-8822: --- Hi [~dongc] Could you update the patch so that it uses the new encryption testing framework HIVE-9167? Here's an example on HIVE-8816 Create unit test where we insert into statically partitioned table -- Key: HIVE-8822 URL: https://issues.apache.org/jira/browse/HIVE-8822 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Dong Chen Fix For: encryption-branch Attachments: HIVE-8822.patch, encryption_insert_partition_static.q.out.orig -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9230) Followup for HIVE-9125, update ppd_join4.q.out for Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261722#comment-14261722 ] Hive QA commented on HIVE-9230: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12689582/HIVE-9230.1-spark.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 7281 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_list_bucket_dml_10 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing org.apache.hive.hcatalog.streaming.TestStreaming.testAddPartition {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/600/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/600/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-600/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12689582 - PreCommit-HIVE-SPARK-Build Followup for HIVE-9125, update ppd_join4.q.out for Spark [Spark Branch] --- Key: HIVE-9230 URL: https://issues.apache.org/jira/browse/HIVE-9230 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-9230.1-spark.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9230) Followup for HIVE-9125, update ppd_join4.q.out for Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9230: -- Resolution: Fixed Fix Version/s: spark-branch Status: Resolved (was: Patch Available) Patch committed to Spark branch. Followup for HIVE-9125, update ppd_join4.q.out for Spark [Spark Branch] --- Key: HIVE-9230 URL: https://issues.apache.org/jira/browse/HIVE-9230 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: spark-branch Attachments: HIVE-9230.1-spark.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8816) Create unit test join of two encrypted tables with different keys
[ https://issues.apache.org/jira/browse/HIVE-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14261731#comment-14261731 ] Ferdinand Xu commented on HIVE-8816: [~brocknoland], I am working on this jira and am blocked by the encryption zone inconsistency issue. Specifically, the encryption zone list is empty when I try to get it in Hive.java. I think it may be related to HIVE-9231. I will rebase my code and try again. Create unit test join of two encrypted tables with different keys - Key: HIVE-8816 URL: https://issues.apache.org/jira/browse/HIVE-8816 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Ferdinand Xu Fix For: encryption-branch Attachments: HIVE-8816.1.patch, HIVE-8816.3.patch, HIVE-8816.patch NO PRECOMMIT TESTS The results should be inserted into a third table encrypted with a separate key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9084) Investigate IOContext object initialization problem [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9084: -- Resolution: Done Status: Resolved (was: Patch Available) Investigation is done. Problem is fixed via HIVE-8920. Investigate IOContext object initialization problem [Spark Branch] -- Key: HIVE-9084 URL: https://issues.apache.org/jira/browse/HIVE-9084 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-9084.1-spark.patch, HIVE-9084.2-spark.patch, HIVE-9084.2-spark.patch, HIVE-9084.3-spark.patch, HIVE-9084.4-spark.patch, HIVE-9084.4-spark.patch In recent ptest run (Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/511/testReport), test groupby_multi_single_reducer.q failed w/ the following stacktrace: {code} java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:136) at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:54) at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:29) at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:167) at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:167) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:601) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:601) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263) at org.apache.spark.rdd.RDD.iterator(RDD.scala:230) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:56) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.IOContext.copy(IOContext.java:119) at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:97) ... 16 more {code} This failure is again about IOContext object, which needs further investigation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 29498: Upgrade JavaEWAH version to allow for unsorted bitset creation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29498/ --- Review request for hive. Bugs: HIVE-8181 https://issues.apache.org/jira/browse/HIVE-8181 Repository: hive-git Description --- JavaEWAH has removed the restriction that bitsets can only be set in order in the latest release. Currently the use of {{ewah_bitmap}} UDAF requires a {{SORT BY}}. {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Can't set bits out of order with EWAHCompressedBitmap at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:824) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474) at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:249) ... 
7 more Caused by: java.lang.RuntimeException: Can't set bits out of order with EWAHCompressedBitmap at {code} Diffs - pom.xml 0e30078 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/AbstractGenericUDFEWAHBitmapBop.java 58ea3ba ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFEWAHBitmap.java e4b412e ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java e3fb558 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEWAHBitmapAnd.java 7838b54 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEWAHBitmapEmpty.java 4a14a65 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEWAHBitmapOr.java d438f82 ql/src/test/queries/clientpositive/index_bitmap2.q 89fbe76 ql/src/test/queries/clientpositive/udf_bitmap_empty.q 142b248 ql/src/test/results/clientpositive/index_bitmap2.q.out 73c5b90 ql/src/test/results/clientpositive/index_bitmap3.q.out 599bf3a ql/src/test/results/clientpositive/index_bitmap_auto.q.out 81c1795 ql/src/test/results/clientpositive/udf_bitmap_and.q.out 8c93398 ql/src/test/results/clientpositive/udf_bitmap_empty.q.out ca96e78 ql/src/test/results/clientpositive/udf_bitmap_or.q.out 43521da Diff: https://reviews.apache.org/r/29498/diff/ Testing --- Thanks, Navis Ryu
Hive 0.14.1 release
Hi Folks, Given that a number of fixes have gone into branch 0.14 in the past 8 weeks, I would like to make a release of 0.14.1 soon. I would like to fix some of the release issues as well this time around. I am thinking of some time around January 15th for getting an RC out. Please let me know if you have any concerns. Also, from a previous thread, I would like to make this release the 1.0 branch of Hive. The process for getting jiras into this release is going to be the same as the previous one, viz.: 1. Mark the jira with fix version 0.14.1 and update the status to blocker/critical. 2. If a committer +1s the patch for 0.14.1, it is good to go in. Please mention me in the jira if you are not sure whether it should make 0.14.1. Thanks, Vikram.