[jira] [Updated] (HIVE-11094) Beeline redirecting all output to ErrorStream

2015-06-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11094:
---
Assignee: (was: Jesus Camacho Rodriguez)

 Beeline redirecting all output to ErrorStream
 -

 Key: HIVE-11094
 URL: https://issues.apache.org/jira/browse/HIVE-11094
 Project: Hive
  Issue Type: Bug
  Components: CLI
Reporter: Jesus Camacho Rodriguez
 Attachments: HIVE-11094.patch


 Beeline is sending all output to the ErrorStream, instead of using the 
 OutputStream for normal output and the ErrorStream only for info or debug 
 information.
 The problem can be reproduced by running:
 {noformat}
 ./bin/beeline -u jdbc:hive2:// -e show databases > exec.out
 {noformat}
 It will still print the output to the terminal. The reason seems to be 
 that the normal output is also sent through the ErrorStream.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11104) Select operator doesn't propagate constants appearing in expressions

2015-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602451#comment-14602451
 ] 

Hive QA commented on HIVE-11104:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741994/HIVE-11104.3.patch

{color:green}SUCCESS:{color} +1 9024 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4387/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4387/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4387/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741994 - PreCommit-HIVE-TRUNK-Build

 Select operator doesn't propagate constants appearing in expressions
 

 Key: HIVE-11104
 URL: https://issues.apache.org/jira/browse/HIVE-11104
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-11104.2.patch, HIVE-11104.3.patch, HIVE-11104.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11094) Beeline redirecting all output to ErrorStream

2015-06-26 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602450#comment-14602450
 ] 

Thejas M Nair commented on HIVE-11094:
--

The 'correct' behavior (based on hive-cli as well as most other tools) is to 
send only query output to stdout; all info/warning messages go to stderr.
The info messages are considered similar to log messages, just at a lower 
level than warn.
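The split described above can be sketched in a few lines (a generic illustration, not Beeline's actual code): query results go to the output stream, log-level messages to the error stream, so shell redirection can separate them.

```java
// Generic sketch (not Beeline's actual code): results -> stdout, info -> stderr.
public class StreamSplitDemo {
    // The only thing a caller should capture with `> exec.out`.
    static String queryResult() {
        return "default";
    }

    public static void main(String[] args) {
        System.err.println("INFO  : Connecting to jdbc:hive2://"); // log-level -> stderr
        System.out.println(queryResult());                         // result    -> stdout
        System.err.println("INFO  : Closing connection");          // log-level -> stderr
    }
}
```

Run as `java StreamSplitDemo > exec.out 2> info.log` and only the result row lands in exec.out.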


 Beeline redirecting all output to ErrorStream
 -

 Key: HIVE-11094
 URL: https://issues.apache.org/jira/browse/HIVE-11094
 Project: Hive
  Issue Type: Bug
  Components: CLI
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-11094.patch


 Beeline is sending all output to the ErrorStream, instead of using the 
 OutputStream for normal output and the ErrorStream only for info or debug 
 information.
 The problem can be reproduced by running:
 {noformat}
 ./bin/beeline -u jdbc:hive2:// -e show databases > exec.out
 {noformat}
 It will still print the output to the terminal. The reason seems to be 
 that the normal output is also sent through the ErrorStream.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6791) Support variable substition for Beeline shell command

2015-06-26 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-6791:
---
Attachment: HIVE-6791.4-beeline-cli.patch

Update the patch addressing Xuefu's latest comments.

 Support variable substition for Beeline shell command
 -

 Key: HIVE-6791
 URL: https://issues.apache.org/jira/browse/HIVE-6791
 Project: Hive
  Issue Type: New Feature
  Components: CLI, Clients
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ferdinand Xu
 Attachments: HIVE-6791-beeline-cli.2.patch, 
 HIVE-6791-beeline-cli.3.patch, HIVE-6791-beeline-cli.patch, 
 HIVE-6791.3-beeline-cli.patch, HIVE-6791.3-beeline-cli.patch, 
 HIVE-6791.4-beeline-cli.patch


 A follow-up task from HIVE-6694. Similar to HIVE-6570.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11094) Beeline redirecting all output to ErrorStream

2015-06-26 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602454#comment-14602454
 ] 

Jesus Camacho Rodriguez commented on HIVE-11094:


Ok, thanks for the clarification, I was confused. I'll proceed and close the 
issue then. Thanks!

 Beeline redirecting all output to ErrorStream
 -

 Key: HIVE-11094
 URL: https://issues.apache.org/jira/browse/HIVE-11094
 Project: Hive
  Issue Type: Bug
  Components: CLI
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-11094.patch


 Beeline is sending all output to the ErrorStream, instead of using the 
 OutputStream for normal output and the ErrorStream only for info or debug 
 information.
 The problem can be reproduced by running:
 {noformat}
 ./bin/beeline -u jdbc:hive2:// -e show databases > exec.out
 {noformat}
 It will still print the output to the terminal. The reason seems to be 
 that the normal output is also sent through the ErrorStream.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9970) Hive on spark

2015-06-26 Thread JoneZhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602467#comment-14602467
 ] 

JoneZhang commented on HIVE-9970:
-

I have resolved the problem.
First, the Hive CLI correctly loads $HIVE_HOME/lib/*.jar.
Then Spark loads old-version Hive jars, because $SPARK_HOME/conf/spark-env.sh 
contains:
export 
SPARK_CLASSPATH=$SPARK_HOME/lib/*:$HADOOP_HOME/share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar:$HIVE_HOME/lib/hive-contrib-0.12.0.jar:$HIVE_HOME/lib/hive-common-0.12.0.jar:$HIVE_HOME/bin/hive-cli-0.12.0.jar:$HIVE_HOME/lib/hive-serde-0.12.0.jar:$HIVE_HOME/lib/:$EXTRA_CLASSPATH
However, HiveConf.class in hive-common-0.12.0.jar does not contain 
SPARK_RPC_CLIENT_CONNECT_TIMEOUT, so java.lang.NoSuchFieldError: 
SPARK_RPC_CLIENT_CONNECT_TIMEOUT occurred.
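A quick way to confirm this kind of mismatch before Spark even starts is a reflection probe: check whether the class actually visible on the classpath declares the field the newer code expects. The probe below is a generic sketch (class and field names from the comment above would be substituted in a real check).

```java
// Probe whether a class on the current classpath declares a given public
// field -- a NoSuchFieldError at runtime usually means an older jar shadowed
// the one you intended (here: hive-common-0.12.0.jar shadowing a newer
// HiveConf that has SPARK_RPC_CLIENT_CONNECT_TIMEOUT).
public class ClasspathProbe {
    static boolean hasField(String className, String fieldName) {
        try {
            Class.forName(className).getField(fieldName);
            return true;
        } catch (ClassNotFoundException | NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // In a real check you would pass e.g.
        //   "org.apache.hadoop.hive.conf.HiveConf$ConfVars",
        //   "SPARK_RPC_CLIENT_CONNECT_TIMEOUT"
        System.out.println(hasField("java.lang.Integer", "MAX_VALUE")); // present
        System.out.println(hasField("java.lang.Integer", "NO_SUCH"));   // absent
    }
}
```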

 Hive on spark
 -

 Key: HIVE-9970
 URL: https://issues.apache.org/jira/browse/HIVE-9970
 Project: Hive
  Issue Type: Bug
Reporter: Amithsha

 Hi all,
 Recently I have configured Spark 1.2.0, and my environment is Hadoop
 2.6.0 and Hive 1.1.0. While executing an insert with Hive on Spark,
 I am getting the following error.
 Query ID = hadoop2_20150313162828_8764adad-a8e4-49da-9ef5-35e4ebd6bc63
 Total jobs = 1
 Launching Job 1 out of 1
 In order to change the average load for a reducer (in bytes):
 set hive.exec.reducers.bytes.per.reducer=<number>
 In order to limit the maximum number of reducers:
 set hive.exec.reducers.max=<number>
 In order to set a constant number of reducers:
 set mapreduce.job.reduces=<number>
 Failed to execute spark task, with exception
 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create
 spark client.)'
 FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.spark.SparkTask
 Have added the spark-assembly jar in hive lib
 And also in hive console using the command add jar followed by the steps
 set spark.home=/opt/spark-1.2.1/;
 add jar 
 /opt/spark-1.2.1/assembly/target/scala-2.10/spark-assembly-1.2.1-hadoop2.4.0.jar;
 set hive.execution.engine=spark;
 set spark.master=spark://xxx:7077;
 set spark.eventLog.enabled=true;
 set spark.executor.memory=512m;
 set spark.serializer=org.apache.spark.serializer.KryoSerializer;
 Can anyone suggest?
 Thanks & Regards,
 Amithsha



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6791) Support variable substition for Beeline shell command

2015-06-26 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-6791:
---
Attachment: HIVE-6791.5-beeline-cli.patch

Rebase the patch

 Support variable substition for Beeline shell command
 -

 Key: HIVE-6791
 URL: https://issues.apache.org/jira/browse/HIVE-6791
 Project: Hive
  Issue Type: New Feature
  Components: CLI, Clients
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ferdinand Xu
 Attachments: HIVE-6791-beeline-cli.2.patch, 
 HIVE-6791-beeline-cli.3.patch, HIVE-6791-beeline-cli.patch, 
 HIVE-6791.3-beeline-cli.patch, HIVE-6791.3-beeline-cli.patch, 
 HIVE-6791.4-beeline-cli.patch, HIVE-6791.5-beeline-cli.patch


 A follow-up task from HIVE-6694. Similar to HIVE-6570.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-26 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602551#comment-14602551
 ] 

Chengxiang Li commented on HIVE-10983:
--

Nice find, thanks for working on this issue, [~xiaowei].
For the patch, do you think we can just use
{code:java}
return new Text(new String(text.getBytes(), 0, text.getLength(), 
previousCharset));
{code}
so that we do not need the extra memory copy introduced in the patch?
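The suggested one-liner works because the `String(byte[], offset, length, Charset)` constructor reads only the valid prefix of a reused buffer. A minimal stand-in sketch (the `ReusedText` class below is hypothetical, for illustration only; it mimics Hadoop's `Text` buffer-reuse behavior without the Hadoop dependency):

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Minimal stand-in for Hadoop's Text (hypothetical, illustration only):
// a reused backing array plus a valid-length field.
class ReusedText {
    private byte[] bytes = new byte[0];
    private int length;

    void set(String s, Charset cs) {
        byte[] src = s.getBytes(cs);
        if (src.length > bytes.length) {
            bytes = Arrays.copyOf(src, src.length);         // grow to fit
        } else {
            System.arraycopy(src, 0, bytes, 0, src.length); // reuse: stale tail survives
        }
        length = src.length;
    }

    byte[] getBytes() { return bytes; }   // raw buffer, may be longer than length
    int getLength()   { return length; }  // only this many bytes are valid
}

public class TextReuseDemo {
    // Buggy: decodes the whole raw buffer, including stale tail bytes.
    static String decodeAll(ReusedText t, Charset cs) {
        return new String(t.getBytes(), cs);
    }

    // Fixed (the suggestion above): decode only the valid prefix, no extra copy.
    static String decodeValid(ReusedText t, Charset cs) {
        return new String(t.getBytes(), 0, t.getLength(), cs);
    }
}
```

After setting a long string and then a shorter one, `decodeAll` returns the short string with the previous string's tail appended, while `decodeValid` returns exactly the short string.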

 SerDeUtils bug  ,when Text is reused 
 -

 Key: HIVE-10983
 URL: https://issues.apache.org/jira/browse/HIVE-10983
 Project: Hive
  Issue Type: Bug
  Components: API, CLI
Affects Versions: 0.14.0, 1.0.0, 1.2.0
 Environment: Hadoop 2.3.0-cdh5.0.0
 Hive 0.14
Reporter: xiaowei wang
Assignee: xiaowei wang
  Labels: patch
 Fix For: 0.14.1, 1.2.0

 Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt


 {noformat}
 The method transformTextToUTF8 has a bug: it invokes a bad method of 
 Text, getBytes()!
 The getBytes method of Text returns the raw bytes; however, only data up to 
 Text.length is valid. A better way is to use copyBytes() if you need the 
 returned array to be precisely the length of the data.
 But copyBytes was only added after hadoop1.
 {noformat}
 When I query data from an LZO table, I found in the results that the length 
 of the current row is always larger than that of the previous row, and 
 sometimes the current row contains the contents of the previous row. For 
 example, I execute a SQL:
 {code:sql}
 select *   from web_searchhub where logdate=2015061003
 {code}
 The result of the SQL is shown below. Notice that the second row's content 
 contains the first row's content.
 {noformat}
 INFO [03:00:05.589] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
 INFO [03:00:05.594] 18941e66-9962-44ad-81bc-3519f47ba274 
 session=901,thread=223ession=3151,thread=254 2015061003
 {noformat}
 The content of the original LZO file is shown below, just 2 rows.
 {noformat}
 INFO [03:00:05.635] b88e0473-7530-494c-82d8-e2d2ebd2666c_forweb 
 session=3148,thread=285
 INFO [03:00:05.635] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
 {noformat}
 I think this error is caused by the Text reuse, and I found the solution.
 Additionally, the table create SQL is:
 {code:sql}
 CREATE EXTERNAL TABLE `web_searchhub`(
   `line` string)
 PARTITIONED BY (
   `logdate` string)
 ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\\U'
 WITH SERDEPROPERTIES (
   'serialization.encoding'='GBK')
 STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
   OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
 LOCATION
   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub';
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11112) ISO-8859-1 text output has fragments of previous longer rows appended

2015-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602518#comment-14602518
 ] 

Hive QA commented on HIVE-11112:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741917/HIVE-11112.1.patch

{color:green}SUCCESS:{color} +1 9026 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4388/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4388/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4388/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741917 - PreCommit-HIVE-TRUNK-Build

 ISO-8859-1 text output has fragments of previous longer rows appended
 -

 Key: HIVE-11112
 URL: https://issues.apache.org/jira/browse/HIVE-11112
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-11112.1.patch


 If a LazySimpleSerDe table is created using ISO 8859-1 encoding, query 
 results for a string column are incorrect for any row that was preceded by a 
 row containing a longer string.
 Example steps to reproduce:
 1. Create a table using ISO 8859-1 encoding:
 CREATE TABLE person_lat1 (name STRING)
 ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH 
 SERDEPROPERTIES ('serialization.encoding'='ISO8859_1');
 2. Copy an ISO-8859-1 encoded text file into the appropriate warehouse folder 
 in HDFS. I'll attach an example file containing the following text: 
 Müller,Thomas
 Jørgensen,Jørgen
 Peña,Andrés
 Nåm,Fæk
 3. Execute SELECT * FROM person_lat1
 Result - The following output appears:
 +------------------+
 | person_lat1.name |
 +------------------+
 | Müller,Thomas    |
 | Jørgensen,Jørgen |
 | Peña,Andrésørgen |
 | Nåm,Fækdrésørgen |
 +------------------+
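The garbled rows above are the textbook reused-buffer symptom: the shorter row overwrites only the front of the buffer, and decoding with the previous (longer) row's length appends the stale tail. A small simulation with a plain byte[] (an assumption for illustration; the real code path is LazySimpleSerDe's reused Text buffer):

```java
import java.nio.charset.StandardCharsets;

// Simulate the reused-buffer symptom: a shorter row overwrites only the
// front of a shared buffer, so decoding with the wrong length appends the
// previous row's stale tail bytes.
public class StaleTailDemo {
    static final byte[] BUF = new byte[64];

    // Write a row into the front of the shared buffer; old bytes survive.
    static int writeRow(String row) {
        byte[] b = row.getBytes(StandardCharsets.ISO_8859_1);
        System.arraycopy(b, 0, BUF, 0, b.length);
        return b.length; // the only valid length for this row
    }

    static String decode(int length) {
        return new String(BUF, 0, length, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        int len1 = writeRow("Jørgensen,Jørgen"); // 16 bytes
        int len2 = writeRow("Peña,Andrés");      // 11 bytes, overwrites front only
        System.out.println(decode(len1)); // buggy: stale tail appended -> Peña,Andrésørgen
        System.out.println(decode(len2)); // correct: valid bytes only  -> Peña,Andrés
    }
}
```

The buggy decode reproduces the "Peña,Andrésørgen" row from the output above exactly.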



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6791) Support variable substition for Beeline shell command

2015-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602519#comment-14602519
 ] 

Hive QA commented on HIVE-6791:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742061/HIVE-6791.4-beeline-cli.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-BEELINE-Build/3/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-BEELINE-Build/3/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-BEELINE-Build-3/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-BEELINE-Build-3/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z beeline-cli ]]
+ [[ -d apache-git-beeline-source ]]
+ [[ ! -d apache-git-beeline-source/.git ]]
+ [[ ! -d apache-git-beeline-source ]]
+ cd apache-git-beeline-source
+ git fetch origin
From https://github.com/apache/hive
   2243de3..00e0d55  beeline-cli -> origin/beeline-cli
   c5dc87a..cc4075b  llap        -> origin/llap
+ git reset --hard HEAD
HEAD is now at 2243de3 HIVE-10905 QuitExit fails ending with ';' [beeline-cli 
Branch](Chinna Rao Lalam, reviewed by Ferdinand Xu)
+ git clean -f -d
Removing common/src/java/org/apache/hadoop/hive/conf/HiveVariableSource.java
Removing common/src/java/org/apache/hadoop/hive/conf/VariableSubstitution.java
Removing 
common/src/test/org/apache/hadoop/hive/conf/TestVariableSubstitution.java
+ git checkout beeline-cli
Already on 'beeline-cli'
Your branch is behind 'origin/beeline-cli' by 259 commits, and can be 
fast-forwarded.
+ git reset --hard origin/beeline-cli
HEAD is now at 00e0d55 Merge branch 'master' into beeline-cli
+ git merge --ff-only origin/beeline-cli
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742061 - PreCommit-HIVE-BEELINE-Build

 Support variable substition for Beeline shell command
 -

 Key: HIVE-6791
 URL: https://issues.apache.org/jira/browse/HIVE-6791
 Project: Hive
  Issue Type: New Feature
  Components: CLI, Clients
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ferdinand Xu
 Attachments: HIVE-6791-beeline-cli.2.patch, 
 HIVE-6791-beeline-cli.3.patch, HIVE-6791-beeline-cli.patch, 
 HIVE-6791.3-beeline-cli.patch, HIVE-6791.3-beeline-cli.patch, 
 HIVE-6791.4-beeline-cli.patch


 A follow-up task from HIVE-6694. Similar to HIVE-6570.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6791) Support variable substition for Beeline shell command

2015-06-26 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-6791:
---
Attachment: HIVE-6791.5-beeline-cli.patch

 Support variable substition for Beeline shell command
 -

 Key: HIVE-6791
 URL: https://issues.apache.org/jira/browse/HIVE-6791
 Project: Hive
  Issue Type: New Feature
  Components: CLI, Clients
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ferdinand Xu
 Attachments: HIVE-6791-beeline-cli.2.patch, 
 HIVE-6791-beeline-cli.patch, HIVE-6791.3-beeline-cli.patch, 
 HIVE-6791.3-beeline-cli.patch, HIVE-6791.4-beeline-cli.patch, 
 HIVE-6791.5-beeline-cli.patch


 A follow-up task from HIVE-6694. Similar to HIVE-6570.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6791) Support variable substition for Beeline shell command

2015-06-26 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-6791:
---
Attachment: (was: HIVE-6791-beeline-cli.3.patch)

 Support variable substition for Beeline shell command
 -

 Key: HIVE-6791
 URL: https://issues.apache.org/jira/browse/HIVE-6791
 Project: Hive
  Issue Type: New Feature
  Components: CLI, Clients
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ferdinand Xu
 Attachments: HIVE-6791-beeline-cli.2.patch, 
 HIVE-6791-beeline-cli.patch, HIVE-6791.3-beeline-cli.patch, 
 HIVE-6791.3-beeline-cli.patch, HIVE-6791.4-beeline-cli.patch, 
 HIVE-6791.5-beeline-cli.patch


 A follow-up task from HIVE-6694. Similar to HIVE-6570.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6791) Support variable substition for Beeline shell command

2015-06-26 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-6791:
---
Attachment: (was: HIVE-6791.5-beeline-cli.patch)

 Support variable substition for Beeline shell command
 -

 Key: HIVE-6791
 URL: https://issues.apache.org/jira/browse/HIVE-6791
 Project: Hive
  Issue Type: New Feature
  Components: CLI, Clients
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ferdinand Xu
 Attachments: HIVE-6791-beeline-cli.2.patch, 
 HIVE-6791-beeline-cli.patch, HIVE-6791.3-beeline-cli.patch, 
 HIVE-6791.3-beeline-cli.patch, HIVE-6791.4-beeline-cli.patch, 
 HIVE-6791.5-beeline-cli.patch


 A follow-up task from HIVE-6694. Similar to HIVE-6570.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-26 Thread xiaowei wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602583#comment-14602583
 ] 

xiaowei wang commented on HIVE-10983:
-

Your method is better and more concise.
Following your suggestion, I will put up another patch.
Thanks very much!

 SerDeUtils bug  ,when Text is reused 
 -

 Key: HIVE-10983
 URL: https://issues.apache.org/jira/browse/HIVE-10983
 Project: Hive
  Issue Type: Bug
  Components: API, CLI
Affects Versions: 0.14.0, 1.0.0, 1.2.0
 Environment: Hadoop 2.3.0-cdh5.0.0
 Hive 0.14
Reporter: xiaowei wang
Assignee: xiaowei wang
  Labels: patch
 Fix For: 0.14.1, 1.2.0

 Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt


 {noformat}
 The method transformTextToUTF8 has a bug: it invokes a bad method of 
 Text, getBytes()!
 The getBytes method of Text returns the raw bytes; however, only data up to 
 Text.length is valid. A better way is to use copyBytes() if you need the 
 returned array to be precisely the length of the data.
 But copyBytes was only added after hadoop1.
 {noformat}
 When I query data from an LZO table, I found in the results that the length 
 of the current row is always larger than that of the previous row, and 
 sometimes the current row contains the contents of the previous row. For 
 example, I execute a SQL:
 {code:sql}
 select *   from web_searchhub where logdate=2015061003
 {code}
 The result of the SQL is shown below. Notice that the second row's content 
 contains the first row's content.
 {noformat}
 INFO [03:00:05.589] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
 INFO [03:00:05.594] 18941e66-9962-44ad-81bc-3519f47ba274 
 session=901,thread=223ession=3151,thread=254 2015061003
 {noformat}
 The content of the original LZO file is shown below, just 2 rows.
 {noformat}
 INFO [03:00:05.635] b88e0473-7530-494c-82d8-e2d2ebd2666c_forweb 
 session=3148,thread=285
 INFO [03:00:05.635] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
 {noformat}
 I think this error is caused by the Text reuse, and I found the solution.
 Additionally, the table create SQL is:
 {code:sql}
 CREATE EXTERNAL TABLE `web_searchhub`(
   `line` string)
 PARTITIONED BY (
   `logdate` string)
 ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\\U'
 WITH SERDEPROPERTIES (
   'serialization.encoding'='GBK')
 STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
   OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
 LOCATION
   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub';
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603108#comment-14603108
 ] 

Hive QA commented on HIVE-10983:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742078/HIVE-10983.4.patch.txt

{color:green}SUCCESS:{color} +1 9025 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4394/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4394/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4394/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742078 - PreCommit-HIVE-TRUNK-Build

 SerDeUtils bug  ,when Text is reused 
 -

 Key: HIVE-10983
 URL: https://issues.apache.org/jira/browse/HIVE-10983
 Project: Hive
  Issue Type: Bug
  Components: API, CLI
Affects Versions: 0.14.0, 1.0.0, 1.2.0
 Environment: Hadoop 2.3.0-cdh5.0.0
 Hive 0.14
Reporter: xiaowei wang
Assignee: xiaowei wang
  Labels: patch
 Fix For: 0.14.1, 1.2.0

 Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt, 
 HIVE-10983.3.patch.txt, HIVE-10983.4.patch.txt, HIVE-10983.5.patch.txt


 {noformat}
 The methods transformTextToUTF8 and transformTextFromUTF8 have a bug: they 
 invoke a bad method of Text, getBytes()!
 The getBytes method of Text returns the raw bytes; however, only data up to 
 Text.length is valid. A better way is to use copyBytes() if you need the 
 returned array to be precisely the length of the data.
 But copyBytes was only added after hadoop1.
 {noformat}
 When I query data from an LZO table, I found in the results that the length 
 of the current row is always larger than that of the previous row, and 
 sometimes the current row contains the contents of the previous row. For 
 example, I execute a SQL:
 {code:sql}
 select *   from web_searchhub where logdate=2015061003
 {code}
 The result of the SQL is shown below. Notice that the second row's content 
 contains the first row's content.
 {noformat}
 INFO [03:00:05.589] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
 INFO [03:00:05.594] 18941e66-9962-44ad-81bc-3519f47ba274 
 session=901,thread=223ession=3151,thread=254 2015061003
 {noformat}
 The content of the original LZO file is shown below, just 2 rows.
 {noformat}
 INFO [03:00:05.635] b88e0473-7530-494c-82d8-e2d2ebd2666c_forweb 
 session=3148,thread=285
 INFO [03:00:05.635] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
 {noformat}
 I think this error is caused by the Text reuse, and I found the solution.
 Additionally, the table create SQL is:
 {code:sql}
 CREATE EXTERNAL TABLE `web_searchhub`(
   `line` string)
 PARTITIONED BY (
   `logdate` string)
 ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\\U'
 WITH SERDEPROPERTIES (
   'serialization.encoding'='GBK')
 STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
   OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
 LOCATION
   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub';
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11125) when i run a sql use hive on spark, it seem like the hive cli finished, but the application is always running

2015-06-26 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-11125:
---
Labels: TODOC-SPARK  (was: )

 when i run a sql use hive on spark, it seem like the hive cli finished, but 
 the application is always running
 -

 Key: HIVE-11125
 URL: https://issues.apache.org/jira/browse/HIVE-11125
 Project: Hive
  Issue Type: Bug
  Components: spark-branch
Affects Versions: 1.2.0
 Environment: Hive1.2.0
 Spark1.3.1
 Hadoop2.5.1
Reporter: JoneZhang
Assignee: Xuefu Zhang
  Labels: TODOC-SPARK

 When I run a SQL using Hive on Spark,
 the Hive CLI has finished:
 hive (default)> select count(id) from t1 where id>100;
 Query ID = mqq_20150626174732_9e18f0c9-7b56-46ab-bf90-3b66f1a51300
 Total jobs = 1
 Launching Job 1 out of 1
 In order to change the average load for a reducer (in bytes):
   set hive.exec.reducers.bytes.per.reducer=<number>
 In order to limit the maximum number of reducers:
   set hive.exec.reducers.max=<number>
 In order to set a constant number of reducers:
   set mapreduce.job.reduces=<number>
 Starting Spark Job = 7d34cb8c-eaad-4724-a99a-37e517db80d9
 Query Hive on Spark job[0] stages:
 0
 1
 Status: Running (Hive on Spark job[0])
 Job Progress Format
 CurrentTime StageId_StageAttemptId: 
 SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount 
 [StageCost]
 2015-06-26 17:47:53,746 Stage-0_0: 0(+1)/5  Stage-1_0: 0/1
 2015-06-26 17:47:56,771 Stage-0_0: 1(+0)/5  Stage-1_0: 0/1
 2015-06-26 17:47:57,778 Stage-0_0: 4(+1)/5  Stage-1_0: 0/1
 2015-06-26 17:47:59,791 Stage-0_0: 5/5 Finished Stage-1_0: 0(+1)/1
 2015-06-26 17:48:00,797 Stage-0_0: 5/5 Finished Stage-1_0: 1/1 Finished
 Status: Finished successfully in 18.08 seconds
 OK
 5
 Time taken: 28.512 seconds, Fetched: 1 row(s)
 But the application is still in RUNNING state on the ResourceManager:
 User: mqq
 Name: Hive on Spark
 Application Type: SPARK
 Application Tags: 
 State:RUNNING
 FinalStatus:  UNDEFINED
 Started:  2015-06-26 17:47:38
 Elapsed:  24mins, 33sec
 Tracking URL: ApplicationMaster
 Diagnostics:  
 The hive.log shows:
 2015-06-26 18:12:26,878 INFO  [stderr-redir-1]: client.SparkClientImpl 
 (SparkClientImpl.java:run(569)) - 15/06/26 18:12:26 main INFO 
 org.apache.spark.deploy.yarn.Client Application report for 
 application_1433328839160_0071 (state: RUNNING)
 2015-06-26 18:12:27,879 INFO  [stderr-redir-1]: client.SparkClientImpl 
 (SparkClientImpl.java:run(569)) - 15/06/26 18:12:27 main INFO 
 org.apache.spark.deploy.yarn.Client Application report for 
 application_1433328839160_0071 (state: RUNNING)
 2015-06-26 18:12:28,880 INFO  [stderr-redir-1]: client.SparkClientImpl 
 (SparkClientImpl.java:run(569)) - 15/06/26 18:12:28 main INFO 
 org.apache.spark.deploy.yarn.Client Application report for 
 application_1433328839160_0071 (state: RUNNING)
 ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11100) Beeline should escape semi-colon in queries

2015-06-26 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-11100:
---
Attachment: (was: HIVE-11100.patch)

 Beeline should escape semi-colon in queries
 ---

 Key: HIVE-11100
 URL: https://issues.apache.org/jira/browse/HIVE-11100
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 1.2.0, 1.1.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
Priority: Minor

 Beeline should escape the semicolon in queries. For example, queries like the 
 following:
 CREATE TABLE beeline_tb (c1 int, c2 string) ROW FORMAT DELIMITED FIELDS 
 TERMINATED BY ';' LINES TERMINATED BY '\n';
 or 
 CREATE TABLE beeline_tb (c1 int, c2 string) ROW FORMAT DELIMITED FIELDS 
 TERMINATED BY '\;' LINES TERMINATED BY '\n';
 Both failed.
 But the 2nd query, with the semicolon escaped with \, works in the CLI.





[jira] [Commented] (HIVE-11112) ISO-8859-1 text output has fragments of previous longer rows appended

2015-06-26 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602848#comment-14602848
 ] 

Yongzhi Chen commented on HIVE-11112:
-

[~ctang.ma], [~xuefuz], could you review the change? Thanks.

 ISO-8859-1 text output has fragments of previous longer rows appended
 -

 Key: HIVE-11112
 URL: https://issues.apache.org/jira/browse/HIVE-11112
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
  Attachments: HIVE-11112.1.patch


 If a LazySimpleSerDe table is created using ISO 8859-1 encoding, query 
 results for a string column are incorrect for any row that was preceded by a 
 row containing a longer string.
 Example steps to reproduce:
 1. Create a table using ISO 8859-1 encoding:
 CREATE TABLE person_lat1 (name STRING)
 ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH 
 SERDEPROPERTIES ('serialization.encoding'='ISO8859_1');
 2. Copy an ISO-8859-1 encoded text file into the appropriate warehouse folder 
 in HDFS. I'll attach an example file containing the following text: 
 Müller,Thomas
 Jørgensen,Jørgen
 Peña,Andrés
 Nåm,Fæk
 3. Execute SELECT * FROM person_lat1
 Result - The following output appears:
 +-------------------+
 | person_lat1.name  |
 +-------------------+
 | Müller,Thomas     |
 | Jørgensen,Jørgen  |
 | Peña,Andrésørgen  |
 | Nåm,Fækdrésørgen  |
 +-------------------+
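The symptom above is consistent with decoding a reused byte buffer past its valid length. A minimal, self-contained sketch in plain Java (not Hive's actual SerDe code; class and method names are illustrative) reproduces the same trailing-fragment effect:

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Minimal sketch (plain Java, not Hive's SerDe code) of why a reused byte
// buffer produces the output above: after a shorter row overwrites the
// buffer, stale bytes from the longer previous row remain past the valid
// length, and decoding the whole array appends them to the result.
public class ReusedBufferDemo {
    static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    // Buggy shape: decodes the entire backing array, stale tail included.
    static String decodeWholeArray(byte[] buf) {
        return new String(buf, LATIN1);
    }

    // Fixed shape: decodes only the valid prefix of the buffer.
    static String decodeValidPrefix(byte[] buf, int len) {
        return new String(buf, 0, len, LATIN1);
    }

    public static void main(String[] args) {
        byte[] longRow = "Jørgensen,Jørgen".getBytes(LATIN1);  // 16 bytes
        byte[] shortRow = "Peña,Andrés".getBytes(LATIN1);      // 11 bytes
        // Simulate buffer reuse: the shorter row overwrites the start of
        // the buffer that still holds the longer row.
        byte[] reused = Arrays.copyOf(longRow, longRow.length);
        System.arraycopy(shortRow, 0, reused, 0, shortRow.length);

        System.out.println(decodeWholeArray(reused));                   // Peña,Andrésørgen
        System.out.println(decodeValidPrefix(reused, shortRow.length)); // Peña,Andrés
    }
}
```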





[jira] [Commented] (HIVE-11125) when i run a sql use hive on spark, it seem like the hive cli finished, but the application is always running

2015-06-26 Thread JoneZhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603083#comment-14603083
 ] 

JoneZhang commented on HIVE-11125:
--

Thank you very much.
This situation is different from Hive on MapReduce.
I suggest that we add some description about this to the wiki.

 when i run a sql use hive on spark, it seem like the hive cli finished, but 
 the application is always running
 -

 Key: HIVE-11125
 URL: https://issues.apache.org/jira/browse/HIVE-11125
 Project: Hive
  Issue Type: Bug
  Components: spark-branch
Affects Versions: 1.2.0
 Environment: Hive1.2.0
 Spark1.3.1
 Hadoop2.5.1
Reporter: JoneZhang
Assignee: Xuefu Zhang

 When I run a SQL query using Hive on Spark,
 the Hive CLI has finished:
 hive (default)> select count(id) from t1 where id100;
 Query ID = mqq_20150626174732_9e18f0c9-7b56-46ab-bf90-3b66f1a51300
 Total jobs = 1
 Launching Job 1 out of 1
 In order to change the average load for a reducer (in bytes):
    set hive.exec.reducers.bytes.per.reducer=<number>
  In order to limit the maximum number of reducers:
    set hive.exec.reducers.max=<number>
  In order to set a constant number of reducers:
    set mapreduce.job.reduces=<number>
 Starting Spark Job = 7d34cb8c-eaad-4724-a99a-37e517db80d9
 Query Hive on Spark job[0] stages:
 0
 1
 Status: Running (Hive on Spark job[0])
 Job Progress Format
 CurrentTime StageId_StageAttemptId: 
 SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount 
 [StageCost]
 2015-06-26 17:47:53,746 Stage-0_0: 0(+1)/5  Stage-1_0: 0/1
 2015-06-26 17:47:56,771 Stage-0_0: 1(+0)/5  Stage-1_0: 0/1
 2015-06-26 17:47:57,778 Stage-0_0: 4(+1)/5  Stage-1_0: 0/1
 2015-06-26 17:47:59,791 Stage-0_0: 5/5 Finished Stage-1_0: 0(+1)/1
 2015-06-26 17:48:00,797 Stage-0_0: 5/5 Finished Stage-1_0: 1/1 Finished
 Status: Finished successfully in 18.08 seconds
 OK
 5
 Time taken: 28.512 seconds, Fetched: 1 row(s)
 But the application is always running state on resourcemanager
 User: mqq
 Name: Hive on Spark
 Application Type: SPARK
 Application Tags: 
 State:RUNNING
 FinalStatus:  UNDEFINED
 Started:  2015-06-26 17:47:38
 Elapsed:  24mins, 33sec
 Tracking URL: ApplicationMaster
 Diagnostics:  
 the hive.log is 
 2015-06-26 18:12:26,878 INFO  [stderr-redir-1]: client.SparkClientImpl 
 (SparkClientImpl.java:run(569)) - 15/06/26 18:12:26 main INFO 
 org.apache.spark.deploy.yarn.Client Application report for 
 application_1433328839160_0071 (state: RUNNING)
 2015-06-26 18:12:27,879 INFO  [stderr-redir-1]: client.SparkClientImpl 
 (SparkClientImpl.java:run(569)) - 15/06/26 18:12:27 main INFO 
 org.apache.spark.deploy.yarn.Client Application report for 
 application_1433328839160_0071 (state: RUNNING)
 2015-06-26 18:12:28,880 INFO  [stderr-redir-1]: client.SparkClientImpl 
 (SparkClientImpl.java:run(569)) - 15/06/26 18:12:28 main INFO 
 org.apache.spark.deploy.yarn.Client Application report for 
 application_1433328839160_0071 (state: RUNNING)
 ...





[jira] [Resolved] (HIVE-11125) when i run a sql use hive on spark, it seem like the hive cli finished, but the application is always running

2015-06-26 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang resolved HIVE-11125.

Resolution: Not A Problem

 when i run a sql use hive on spark, it seem like the hive cli finished, but 
 the application is always running
 -

 Key: HIVE-11125
 URL: https://issues.apache.org/jira/browse/HIVE-11125
 Project: Hive
  Issue Type: Bug
  Components: spark-branch
Affects Versions: 1.2.0
 Environment: Hive1.2.0
 Spark1.3.1
 Hadoop2.5.1
Reporter: JoneZhang
Assignee: Xuefu Zhang

 When I run a SQL query using Hive on Spark,
 the Hive CLI has finished:
 hive (default)> select count(id) from t1 where id100;
 Query ID = mqq_20150626174732_9e18f0c9-7b56-46ab-bf90-3b66f1a51300
 Total jobs = 1
 Launching Job 1 out of 1
 In order to change the average load for a reducer (in bytes):
    set hive.exec.reducers.bytes.per.reducer=<number>
  In order to limit the maximum number of reducers:
    set hive.exec.reducers.max=<number>
  In order to set a constant number of reducers:
    set mapreduce.job.reduces=<number>
 Starting Spark Job = 7d34cb8c-eaad-4724-a99a-37e517db80d9
 Query Hive on Spark job[0] stages:
 0
 1
 Status: Running (Hive on Spark job[0])
 Job Progress Format
 CurrentTime StageId_StageAttemptId: 
 SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount 
 [StageCost]
 2015-06-26 17:47:53,746 Stage-0_0: 0(+1)/5  Stage-1_0: 0/1
 2015-06-26 17:47:56,771 Stage-0_0: 1(+0)/5  Stage-1_0: 0/1
 2015-06-26 17:47:57,778 Stage-0_0: 4(+1)/5  Stage-1_0: 0/1
 2015-06-26 17:47:59,791 Stage-0_0: 5/5 Finished Stage-1_0: 0(+1)/1
 2015-06-26 17:48:00,797 Stage-0_0: 5/5 Finished Stage-1_0: 1/1 Finished
 Status: Finished successfully in 18.08 seconds
 OK
 5
 Time taken: 28.512 seconds, Fetched: 1 row(s)
 But the application is always running state on resourcemanager
 User: mqq
 Name: Hive on Spark
 Application Type: SPARK
 Application Tags: 
 State:RUNNING
 FinalStatus:  UNDEFINED
 Started:  2015-06-26 17:47:38
 Elapsed:  24mins, 33sec
 Tracking URL: ApplicationMaster
 Diagnostics:  
 the hive.log is 
 2015-06-26 18:12:26,878 INFO  [stderr-redir-1]: client.SparkClientImpl 
 (SparkClientImpl.java:run(569)) - 15/06/26 18:12:26 main INFO 
 org.apache.spark.deploy.yarn.Client Application report for 
 application_1433328839160_0071 (state: RUNNING)
 2015-06-26 18:12:27,879 INFO  [stderr-redir-1]: client.SparkClientImpl 
 (SparkClientImpl.java:run(569)) - 15/06/26 18:12:27 main INFO 
 org.apache.spark.deploy.yarn.Client Application report for 
 application_1433328839160_0071 (state: RUNNING)
 2015-06-26 18:12:28,880 INFO  [stderr-redir-1]: client.SparkClientImpl 
 (SparkClientImpl.java:run(569)) - 15/06/26 18:12:28 main INFO 
 org.apache.spark.deploy.yarn.Client Application report for 
 application_1433328839160_0071 (state: RUNNING)
 ...





[jira] [Commented] (HIVE-11100) Beeline should escape semi-colon in queries

2015-06-26 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603184#comment-14603184
 ] 

Xuefu Zhang commented on HIVE-11100:


Can we find out how the CLI does its escaping? I feel a little uncomfortable 
with the way we process multiple commands by splitting on ';' and then manually 
fixing up the escaping. Ideally, we should add a grammar that can handle this.
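Short of a real grammar, a quote-aware split is one possible middle ground. A minimal sketch (a hypothetical helper, not Beeline's actual code): it keeps a ';' that sits inside single quotes, or that follows a backslash, as part of the statement instead of treating it as a terminator.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch (hypothetical helper, not Beeline's actual parser) of
// splitting a script on ';' while respecting single quotes and backslash
// escapes, rather than a naive String.split(";").
public class SemicolonSplitDemo {
    static List<String> splitCommands(String script) {
        List<String> cmds = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuote = false;
        for (int i = 0; i < script.length(); i++) {
            char c = script.charAt(i);
            if (c == '\\' && i + 1 < script.length()) {
                cur.append(c).append(script.charAt(++i)); // keep escape pair intact
            } else if (c == '\'') {
                inQuote = !inQuote;                       // toggle quoted state
                cur.append(c);
            } else if (c == ';' && !inQuote) {
                if (cur.length() > 0) cmds.add(cur.toString().trim());
                cur.setLength(0);                         // statement boundary
            } else {
                cur.append(c);
            }
        }
        if (cur.toString().trim().length() > 0) cmds.add(cur.toString().trim());
        return cmds;
    }

    public static void main(String[] args) {
        // The ';' inside quotes does not end the statement: two commands.
        System.out.println(splitCommands(
            "CREATE TABLE t (c string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ';'; SELECT 1"));
    }
}
```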

 Beeline should escape semi-colon in queries
 ---

 Key: HIVE-11100
 URL: https://issues.apache.org/jira/browse/HIVE-11100
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 1.2.0, 1.1.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
Priority: Minor
 Attachments: HIVE-11100.patch


 Beeline should escape the semicolon in queries. For example, queries like the 
 following:
 CREATE TABLE beeline_tb (c1 int, c2 string) ROW FORMAT DELIMITED FIELDS 
 TERMINATED BY ';' LINES TERMINATED BY '\n';
 or 
 CREATE TABLE beeline_tb (c1 int, c2 string) ROW FORMAT DELIMITED FIELDS 
 TERMINATED BY '\;' LINES TERMINATED BY '\n';
 Both failed.
 But the 2nd query, with the semicolon escaped with \, works in the CLI.





[jira] [Commented] (HIVE-11112) ISO-8859-1 text output has fragments of previous longer rows appended

2015-06-26 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603194#comment-14603194
 ] 

Xuefu Zhang commented on HIVE-11112:


+1

 ISO-8859-1 text output has fragments of previous longer rows appended
 -

 Key: HIVE-11112
 URL: https://issues.apache.org/jira/browse/HIVE-11112
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
  Attachments: HIVE-11112.1.patch


 If a LazySimpleSerDe table is created using ISO 8859-1 encoding, query 
 results for a string column are incorrect for any row that was preceded by a 
 row containing a longer string.
 Example steps to reproduce:
 1. Create a table using ISO 8859-1 encoding:
 CREATE TABLE person_lat1 (name STRING)
 ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH 
 SERDEPROPERTIES ('serialization.encoding'='ISO8859_1');
 2. Copy an ISO-8859-1 encoded text file into the appropriate warehouse folder 
 in HDFS. I'll attach an example file containing the following text: 
 Müller,Thomas
 Jørgensen,Jørgen
 Peña,Andrés
 Nåm,Fæk
 3. Execute SELECT * FROM person_lat1
 Result - The following output appears:
 +-------------------+
 | person_lat1.name  |
 +-------------------+
 | Müller,Thomas     |
 | Jørgensen,Jørgen  |
 | Peña,Andrésørgen  |
 | Nåm,Fækdrésørgen  |
 +-------------------+





[jira] [Updated] (HIVE-11100) Beeline should escape semi-colon in queries

2015-06-26 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-11100:
---
Attachment: HIVE-11100.patch

For an unknown reason, the precommit test did not run. Reattaching the patch to 
kick off the build.

 Beeline should escape semi-colon in queries
 ---

 Key: HIVE-11100
 URL: https://issues.apache.org/jira/browse/HIVE-11100
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 1.2.0, 1.1.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
Priority: Minor
 Attachments: HIVE-11100.patch


 Beeline should escape the semicolon in queries. For example, queries like the 
 following:
 CREATE TABLE beeline_tb (c1 int, c2 string) ROW FORMAT DELIMITED FIELDS 
 TERMINATED BY ';' LINES TERMINATED BY '\n';
 or 
 CREATE TABLE beeline_tb (c1 int, c2 string) ROW FORMAT DELIMITED FIELDS 
 TERMINATED BY '\;' LINES TERMINATED BY '\n';
 Both failed.
 But the 2nd query, with the semicolon escaped with \, works in the CLI.





[jira] [Commented] (HIVE-11100) Beeline should escape semi-colon in queries

2015-06-26 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602982#comment-14602982
 ] 

Chaoyu Tang commented on HIVE-11100:


Patch was uploaded to https://reviews.apache.org/r/35907/ and requested for 
review. Thanks in advance.

 Beeline should escape semi-colon in queries
 ---

 Key: HIVE-11100
 URL: https://issues.apache.org/jira/browse/HIVE-11100
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 1.2.0, 1.1.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
Priority: Minor
 Attachments: HIVE-11100.patch


 Beeline should escape the semicolon in queries. For example, queries like the 
 following:
 CREATE TABLE beeline_tb (c1 int, c2 string) ROW FORMAT DELIMITED FIELDS 
 TERMINATED BY ';' LINES TERMINATED BY '\n';
 or 
 CREATE TABLE beeline_tb (c1 int, c2 string) ROW FORMAT DELIMITED FIELDS 
 TERMINATED BY '\;' LINES TERMINATED BY '\n';
 Both failed.
 But the 2nd query, with the semicolon escaped with \, works in the CLI.





[jira] [Commented] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources

2015-06-26 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603259#comment-14603259
 ] 

Aihua Xu commented on HIVE-10895:
-

[~thejas], [~xuefuz], [~ctang.ma], could you please review the code? 

 ObjectStore does not close Query objects in some calls, causing a potential 
 leak in some metastore db resources
 ---

 Key: HIVE-10895
 URL: https://issues.apache.org/jira/browse/HIVE-10895
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
Reporter: Takahiko Saito
Assignee: Aihua Xu
 Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch, 
 HIVE-10895.3.patch


 During testing, we've noticed Oracle db running out of cursors. Might be 
 related to this.
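The cursor exhaustion described above follows a familiar shape: a query handle that holds a server-side cursor is only released on the happy path. The sketch below uses a stand-in class (FakeQuery and the cursor counter are hypothetical, shown only to contrast the leaky and safe shapes; in the real code the handle is a javax.jdo.Query released via closeAll()):

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the leak pattern: a query handle backed by a db
// cursor must be closed on every code path, not just on success.
public class QueryCloseDemo {
    static int openCursors = 0; // stand-in for the db's cursor count

    static class FakeQuery implements AutoCloseable {
        FakeQuery() { openCursors++; }                    // acquires a cursor
        List<String> execute() { return new ArrayList<>(); }
        @Override public void close() { openCursors--; }  // releases it
    }

    // Leaky shape: if execute() throws, close() is never reached and the
    // cursor stays open until the db runs out.
    static List<String> listTablesLeaky() {
        FakeQuery q = new FakeQuery();
        List<String> result = q.execute();
        q.close(); // skipped on exception
        return result;
    }

    // Fixed shape: try-with-resources guarantees release on all paths.
    static List<String> listTablesSafe() {
        try (FakeQuery q = new FakeQuery()) {
            return q.execute();
        }
    }

    public static void main(String[] args) {
        listTablesSafe();
        System.out.println(openCursors); // 0
    }
}
```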





[jira] [Updated] (HIVE-7598) Potential null pointer dereference in MergeTask#closeJob()

2015-06-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HIVE-7598:
-
Description: 
Call to Utilities.mvFileToFinalPath() passes null as second last parameter, 
conf.

null gets passed to createEmptyBuckets() which dereferences conf directly:
{code}
boolean isCompressed = conf.getCompressed();
TableDesc tableInfo = conf.getTableInfo();
{code}

  was:
Call to Utilities.mvFileToFinalPath() passes null as second last parameter, 
conf.
null gets passed to createEmptyBuckets() which dereferences conf directly:
{code}
boolean isCompressed = conf.getCompressed();
TableDesc tableInfo = conf.getTableInfo();
{code}


 Potential null pointer dereference in MergeTask#closeJob()
 --

 Key: HIVE-7598
 URL: https://issues.apache.org/jira/browse/HIVE-7598
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Assignee: SUYEON LEE
Priority: Minor
 Attachments: HIVE-7598.patch


 Call to Utilities.mvFileToFinalPath() passes null as second last parameter, 
 conf.
 null gets passed to createEmptyBuckets() which dereferences conf directly:
 {code}
 boolean isCompressed = conf.getCompressed();
 TableDesc tableInfo = conf.getTableInfo();
 {code}
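A minimal sketch of the defensive shape implied by the report (FileSinkConf here is a hypothetical stand-in, not Hive's actual class): check the possibly-null conf before dereferencing it, rather than calling its getters unconditionally.

```java
// Minimal sketch (hypothetical names, not Hive's actual classes) of the
// defensive fix: guard against a null conf instead of dereferencing it.
public class NullGuardDemo {
    static class FileSinkConf {
        boolean getCompressed() { return false; }
    }

    // Buggy shape: dereferences conf without a check; throws
    // NullPointerException when conf is null.
    static boolean isCompressedUnsafe(FileSinkConf conf) {
        return conf.getCompressed();
    }

    // Guarded shape: treat a null conf as "not compressed" (or throw a
    // descriptive exception, depending on the caller's contract).
    static boolean isCompressedSafe(FileSinkConf conf) {
        return conf != null && conf.getCompressed();
    }

    public static void main(String[] args) {
        System.out.println(isCompressedSafe(null));               // false, no NPE
        System.out.println(isCompressedSafe(new FileSinkConf())); // false
    }
}
```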





[jira] [Updated] (HIVE-7672) Potential resource leak in EximUtil#createExportDump()

2015-06-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HIVE-7672:
-
Description: 
Here is related code:
{code}
  OutputStream out = fs.create(metadataPath);
  out.write(jsonContainer.toString().getBytes("UTF-8"));
  out.close();
{code}
If out.write() throws exception, out would be left unclosed.
out.close() should be enclosed in finally block.

  was:
Here is related code:

{code}
  OutputStream out = fs.create(metadataPath);
  out.write(jsonContainer.toString().getBytes("UTF-8"));
  out.close();
{code}
If out.write() throws exception, out would be left unclosed.
out.close() should be enclosed in finally block.


 Potential resource leak in EximUtil#createExportDump()
 --

 Key: HIVE-7672
 URL: https://issues.apache.org/jira/browse/HIVE-7672
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Assignee: SUYEON LEE
Priority: Minor
 Attachments: HIVE-7672.patch


 Here is related code:
 {code}
   OutputStream out = fs.create(metadataPath);
   out.write(jsonContainer.toString().getBytes("UTF-8"));
   out.close();
 {code}
 If out.write() throws exception, out would be left unclosed.
 out.close() should be enclosed in finally block.
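The suggested fix can be sketched with plain java.io (a stand-in for the Hadoop FileSystem stream; this assumes nothing about EximUtil beyond the snippet above). Try-with-resources is the modern equivalent of the finally block: the stream is closed even if write() throws.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Minimal sketch of the fix: close the stream in a finally block, or
// more idiomatically with try-with-resources, so an exception in
// write() cannot leak the stream.
public class SafeCloseDemo {
    static byte[] writeJson(String json) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // try-with-resources closes 'out' even if write() throws.
        try (OutputStream out = sink) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(writeJson("{}").length); // 2
    }
}
```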





[jira] [Commented] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)

2015-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603287#comment-14603287
 ] 

Hive QA commented on HIVE-11055:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742081/HIVE-11055.2.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4396/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4396/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4396/

Messages:
{noformat}
 This message was trimmed, see log for full details 
Downloading: 
http://repo.maven.apache.org/maven2/org/abego/treelayout/org.abego.treelayout.core/1.0.1/org.abego.treelayout.core-1.0.1.pom
4/4 KB   
 
Downloaded: 
http://repo.maven.apache.org/maven2/org/abego/treelayout/org.abego.treelayout.core/1.0.1/org.abego.treelayout.core-1.0.1.pom
 (4 KB at 275.5 KB/sec)
Downloading: 
http://www.datanucleus.org/downloads/maven2/org/antlr/antlr4-runtime/4.5/antlr4-runtime-4.5.jar
Downloading: 
http://www.datanucleus.org/downloads/maven2/org/abego/treelayout/org.abego.treelayout.core/1.0.1/org.abego.treelayout.core-1.0.1.jar
 
 
Downloading: 
http://repo.maven.apache.org/maven2/org/antlr/antlr4-runtime/4.5/antlr4-runtime-4.5.jar
Downloading: 
http://repo.maven.apache.org/maven2/org/abego/treelayout/org.abego.treelayout.core/1.0.1/org.abego.treelayout.core-1.0.1.jar
4/366 KB   
...
Downloaded: 
http://repo.maven.apache.org/maven2/org/abego/treelayout/org.abego.treelayout.core/1.0.1/org.abego.treelayout.core-1.0.1.jar
 (25 KB at 692.1 KB/sec)
...
366/366 KB   
 
Downloaded: 
http://repo.maven.apache.org/maven2/org/antlr/antlr4-runtime/4.5/antlr4-runtime-4.5.jar
 (366 KB at 2536.6 KB/sec)
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-hplsql ---
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/hplsql 
(includes = [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
hive-hplsql ---
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-hplsql ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
hive-hplsql ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-github-source-source/hplsql/src/main/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-hplsql ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-hplsql ---
[INFO] Compiling 30 source files to 
/data/hive-ptest/working/apache-github-source-source/hplsql/target/classes
[INFO] -
[ERROR] COMPILATION ERROR : 
[INFO] -
[ERROR] 
/data/hive-ptest/working/apache-github-source-source/hplsql/src/main/java/org/apache/hive/hplsql/Copy.java:[292,38]
 cannot find symbol
  symbol:   method resolvePath(org.apache.hadoop.fs.Path)
  location: variable fs of type 

[jira] [Updated] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10233:
--
Attachment: HIVE-10233.18.patch

 Hive on tez: memory manager for grace hash join
 ---

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap, 2.0.0
Reporter: Vikram Dixit K
Assignee: Gunther Hagleitner
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
 HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
 HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
 HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
 HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch, 
 HIVE-10233.18.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 





[jira] [Commented] (HIVE-10754) new Job() is deprecated. Replaced all with Job.getInstance() for Hcatalog

2015-06-26 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603366#comment-14603366
 ] 

Aihua Xu commented on HIVE-10754:
-

[~mithun] and [~ctang.ma], can you guys take a look? It should be 
straightforward. The failed test is not related.

 new Job() is deprecated. Replaced all with Job.getInstance() for Hcatalog
 -

 Key: HIVE-10754
 URL: https://issues.apache.org/jira/browse/HIVE-10754
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 1.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-10754.patch


 Replace all the deprecated new Job() with Job.getInstance() in HCatalog.





[jira] [Commented] (HIVE-11128) Stats annotation should consider select star same as select without column list

2015-06-26 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603384#comment-14603384
 ] 

Prasanth Jayachandran commented on HIVE-11128:
--

I think this is not the actual issue. "SELECT 1 FROM table" is a valid query; 
although it is not a "SELECT *" query, it will contain an empty column list. But 
the output column expression map and output signature will have references to 
the constant, which will be taken into account during data size estimation.

 Stats annotation should consider select star same as select without column 
 list
 ---

 Key: HIVE-11128
 URL: https://issues.apache.org/jira/browse/HIVE-11128
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 1.2.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-11128.patch








[jira] [Commented] (HIVE-11119) Spark reduce vectorization doesnt account for scratch columns

2015-06-26 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603406#comment-14603406
 ] 

Xuefu Zhang commented on HIVE-11119:


Patch looks good to me. Just left a minor question on RB. Thanks.

+1

 Spark reduce vectorization doesnt account for scratch columns
 -

 Key: HIVE-11119
 URL: https://issues.apache.org/jira/browse/HIVE-11119
 Project: Hive
  Issue Type: Bug
  Components: Spark, Vectorization
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
  Attachments: HIVE-11119.patch








[jira] [Commented] (HIVE-11119) Spark reduce vectorization doesnt account for scratch columns

2015-06-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603417#comment-14603417
 ] 

Ashutosh Chauhan commented on HIVE-11119:
-

yeah.. will move to util class

 Spark reduce vectorization doesnt account for scratch columns
 -

 Key: HIVE-11119
 URL: https://issues.apache.org/jira/browse/HIVE-11119
 Project: Hive
  Issue Type: Bug
  Components: Spark, Vectorization
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
  Attachments: HIVE-11119.patch








[jira] [Updated] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources

2015-06-26 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10895:

Attachment: (was: HIVE-10895.3.patch)

 ObjectStore does not close Query objects in some calls, causing a potential 
 leak in some metastore db resources
 ---

 Key: HIVE-10895
 URL: https://issues.apache.org/jira/browse/HIVE-10895
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
Reporter: Takahiko Saito
Assignee: Aihua Xu
 Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch, 
 HIVE-10895.3.patch


 During testing, we've noticed Oracle db running out of cursors. Might be 
 related to this.





[jira] [Commented] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603272#comment-14603272
 ] 

Hive QA commented on HIVE-11095:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742079/HIVE-11095.2.patch.txt

{color:green}SUCCESS:{color} +1 9025 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4395/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4395/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4395/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742079 - PreCommit-HIVE-TRUNK-Build

 SerDeUtils  another bug ,when Text is reused
 

 Key: HIVE-11095
 URL: https://issues.apache.org/jira/browse/HIVE-11095
 Project: Hive
  Issue Type: Bug
  Components: API, CLI
Affects Versions: 0.14.0, 1.0.0, 1.2.0
 Environment: Hadoop 2.3.0-cdh5.0.0
 Hive 0.14
Reporter: xiaowei wang
Assignee: xiaowei wang
 Fix For: 1.2.0

 Attachments: HIVE-11095.1.patch.txt, HIVE-11095.2.patch.txt


 {noformat}
 The method transformTextFromUTF8 has a bug: it invokes a problematic method of 
 Text, getBytes()!
 The getBytes method of Text returns the raw bytes; however, only data up to 
 Text.length is valid. A better way is to use copyBytes() if you need the 
 returned array to be precisely the length of the data.
 But copyBytes() was only added after hadoop1. 
 {noformat}
 How did I find this bug?
 When I queried data from an LZO table, I found in the results that the length 
 of the current row is always larger than that of the previous row, and 
 sometimes the current row contains the contents of the previous row. For 
 example, I executed this SQL:
 {code:sql}
 select * from web_searchhub where logdate=2015061003
 {code}
 The result of the SQL is shown below. Notice that the second row's content 
 contains the first row's content.
 {noformat}
 INFO [03:00:05.589] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
 INFO [03:00:05.594] 18941e66-9962-44ad-81bc-3519f47ba274 
 session=901,thread=223ession=3151,thread=254 2015061003
 {noformat}
 The content of the original LZO file is shown below, just 2 rows.
 {noformat}
 INFO [03:00:05.635] b88e0473-7530-494c-82d8-e2d2ebd2666c_forweb 
 session=3148,thread=285
 INFO [03:00:05.635] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
 {noformat}
 I think this error is caused by Text reuse, and I found a solution.
 Additionally, the table create SQL is: 
 {code:sql}
 CREATE EXTERNAL TABLE `web_searchhub`(
 `line` string)
 PARTITIONED BY (
 `logdate` string)
 ROW FORMAT DELIMITED
 FIELDS TERMINATED BY '
 U'
 WITH SERDEPROPERTIES (
 'serialization.encoding'='GBK')
 STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
 OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
 LOCATION
 'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
 {code}
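The getBytes()/copyBytes() distinction described above can be shown with a tiny stand-in class (FakeText is hypothetical, not org.apache.hadoop.io.Text itself): the backing array grows but never shrinks, so after reuse for a shorter record, getBytes() returns stale trailing bytes while copyBytes() returns exactly `length` bytes.

```java
import java.util.Arrays;

// Minimal sketch of the Text reuse pitfall: getBytes() returns the whole
// backing array, which may be longer than the valid data after the
// object is reused for a shorter record.
public class TextReuseDemo {
    static class FakeText {
        byte[] bytes = new byte[0];
        int length;

        void set(byte[] data) {
            if (bytes.length < data.length) {
                bytes = new byte[data.length]; // grow, never shrink
            }
            System.arraycopy(data, 0, bytes, 0, data.length);
            length = data.length;              // only this many bytes are valid
        }

        byte[] getBytes() { return bytes; }    // raw array, may be oversized
        byte[] copyBytes() { return Arrays.copyOf(bytes, length); } // exact
    }

    public static void main(String[] args) {
        FakeText t = new FakeText();
        t.set("a longer first record".getBytes());
        t.set("short".getBytes());
        System.out.println(new String(t.getBytes()));  // "short" + stale tail
        System.out.println(new String(t.copyBytes())); // short
    }
}
```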





[jira] [Updated] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10233:
--
Attachment: HIVE-10233.17.patch

 Hive on tez: memory manager for grace hash join
 ---

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap, 2.0.0
Reporter: Vikram Dixit K
Assignee: Gunther Hagleitner
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
 HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
 HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
 HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
 HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10950) Unit test against HBase Metastore

2015-06-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-10950:
--
Attachment: HIVE-10950-2.patch

The previous patch could not recover from failures, so it could only run individual 
qtests, not the whole TestCliDriver. The attached new patch solves the problem: 
it restores a clean snapshot of the HBase metastore after every test. 

 Unit test against HBase Metastore
 -

 Key: HIVE-10950
 URL: https://issues.apache.org/jira/browse/HIVE-10950
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: hbase-metastore-branch

 Attachments: HIVE-10950-1.patch, HIVE-10950-2.patch


 We need to run the entire Hive UT against HBase Metastore and make sure they 
 pass.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10376) Move code to create jar for ivydownload.q to a separate id in maven ant-run-plugin in itests/pom.xml and remove sed dependency.

2015-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603296#comment-14603296
 ] 

Hive QA commented on HIVE-10376:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726129/HIVE-10376.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4397/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4397/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4397/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4397/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 2a77e87 HIVE-11051: Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray 
cannot be cast to [Ljava.lang.Object; (Matt McCline via Gopal V)
+ git clean -f -d
Removing bin/ext/hplsql.sh
Removing bin/hplsql
Removing bin/hplsql.cmd
Removing hplsql/
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 2a77e87 HIVE-11051: Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray 
cannot be cast to [Ljava.lang.Object; (Matt McCline via Gopal V)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726129 - PreCommit-HIVE-TRUNK-Build

 Move code to create jar for ivydownload.q to a separate id in maven 
 ant-run-plugin in itests/pom.xml and remove sed dependency.
 ---

 Key: HIVE-10376
 URL: https://issues.apache.org/jira/browse/HIVE-10376
 Project: Hive
  Issue Type: Improvement
Reporter: Anant Nag
Assignee: Anant Nag
 Attachments: HIVE-10376.patch


 Currently the code to create an example jar for ivyDownload.q is piggybacked 
 on the download-spark ant-run-plugin id. This code should be moved to a 
 separate execution id called something like create-ivytest-jar or, more 
 generally, itests-setup. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11128) Stats annotation should consider select star same as select without column list

2015-06-26 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11128:

Attachment: HIVE-11128.patch

 Stats annotation should consider select star same as select without column 
 list
 ---

 Key: HIVE-11128
 URL: https://issues.apache.org/jira/browse/HIVE-11128
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 1.2.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-11128.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11104) Select operator doesn't propagate constants appearing in expressions

2015-06-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603307#comment-14603307
 ] 

Ashutosh Chauhan commented on HIVE-11104:
-

[~prasanth_j] It's an unrelated issue which exists in the StatsAnnotation rules. 
Opened HIVE-11128 for it.

 Select operator doesn't propagate constants appearing in expressions
 

 Key: HIVE-11104
 URL: https://issues.apache.org/jira/browse/HIVE-11104
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-11104.2.patch, HIVE-11104.3.patch, HIVE-11104.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11104) Select operator doesn't propagate constants appearing in expressions

2015-06-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603205#comment-14603205
 ] 

Ashutosh Chauhan commented on HIVE-11104:
-

[~prasanth_j] Can you please take a look?

 Select operator doesn't propagate constants appearing in expressions
 

 Key: HIVE-11104
 URL: https://issues.apache.org/jira/browse/HIVE-11104
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-11104.2.patch, HIVE-11104.3.patch, HIVE-11104.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources

2015-06-26 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10895:

Attachment: HIVE-10895.3.patch

 ObjectStore does not close Query objects in some calls, causing a potential 
 leak in some metastore db resources
 ---

 Key: HIVE-10895
 URL: https://issues.apache.org/jira/browse/HIVE-10895
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
Reporter: Takahiko Saito
Assignee: Aihua Xu
 Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch, 
 HIVE-10895.3.patch


 During testing, we've noticed Oracle db running out of cursors. Might be 
 related to this.
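The leak pattern described here is generic: a cursor-backed resource that is opened but never closed. A minimal stand-alone sketch (FakeQuery is a stand-in for a JDO Query object, not Hive code) shows why try-with-resources is the usual fix:

```java
public class QueryCloseDemo {
    // Stand-in for a JDO Query that holds a server-side cursor.
    static class FakeQuery implements AutoCloseable {
        static int openCursors = 0;
        FakeQuery() { openCursors++; }
        Object execute() { return "rows"; }
        @Override public void close() { openCursors--; }
    }

    public static void main(String[] args) {
        // Leaky pattern: the cursor stays open because close() is never called.
        FakeQuery leaked = new FakeQuery();
        leaked.execute();

        // Safe pattern: try-with-resources releases the cursor
        // even if execute() throws.
        try (FakeQuery q = new FakeQuery()) {
            q.execute();
        }
        System.out.println(FakeQuery.openCursors);  // only the leaked one remains
    }
}
```

With enough leaked queries, a database like Oracle eventually exhausts its cursor limit, which matches the symptom reported here.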



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7305) Return value from in.read() is ignored in SerializationUtils#readLongLE()

2015-06-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HIVE-7305:
-
Description: 
{code}
  long readLongLE(InputStream in) throws IOException {
    in.read(readBuffer, 0, 8);
    return (((readBuffer[0] & 0xff) << 0)
        + ((readBuffer[1] & 0xff) << 8)
{code}

Return value from read() may indicate fewer than 8 bytes read.
The return value should be checked.

  was:
{code}
  long readLongLE(InputStream in) throws IOException {
    in.read(readBuffer, 0, 8);
    return (((readBuffer[0] & 0xff) << 0)
        + ((readBuffer[1] & 0xff) << 8)
{code}
Return value from read() may indicate fewer than 8 bytes read.
The return value should be checked.


 Return value from in.read() is ignored in SerializationUtils#readLongLE()
 -

 Key: HIVE-7305
 URL: https://issues.apache.org/jira/browse/HIVE-7305
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Assignee: skrho
Priority: Minor
 Attachments: HIVE-7305_001.patch


 {code}
   long readLongLE(InputStream in) throws IOException {
     in.read(readBuffer, 0, 8);
     return (((readBuffer[0] & 0xff) << 0)
         + ((readBuffer[1] & 0xff) << 8)
 {code}
 Return value from read() may indicate fewer than 8 bytes read.
 The return value should be checked.
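The fix this report asks for is the standard read-fully loop: InputStream.read() may legitimately return fewer bytes than requested, so the caller must loop until the buffer is full. A sketch of the little-endian decode with a checked read (illustrative only, not the attached patch):

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class ReadFullyDemo {
    // Reads exactly len bytes into buf, or throws EOFException.
    // A single read() call is not enough: it may return fewer bytes.
    static void readFully(InputStream in, byte[] buf, int off, int len)
            throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n < 0) {
                throw new EOFException("expected " + len + " bytes, got " + total);
            }
            total += n;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5, 6, 7, 8};
        byte[] buf = new byte[8];
        readFully(new ByteArrayInputStream(data), buf, 0, 8);
        long v = 0;
        for (int i = 7; i >= 0; i--) {        // assemble little-endian long
            v = (v << 8) | (buf[i] & 0xffL);
        }
        System.out.println(Long.toHexString(v));
    }
}
```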



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11104) Select operator doesn't propagate constants appearing in expressions

2015-06-26 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603284#comment-14603284
 ] 

Prasanth Jayachandran commented on HIVE-11104:
--

The stats diff does not look correct. All data sizes are now 0 which will break 
all join optimizations.

 Select operator doesn't propagate constants appearing in expressions
 

 Key: HIVE-11104
 URL: https://issues.apache.org/jira/browse/HIVE-11104
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-11104.2.patch, HIVE-11104.3.patch, HIVE-11104.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11112) ISO-8859-1 text output has fragments of previous longer rows appended

2015-06-26 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603301#comment-14603301
 ] 

Yongzhi Chen commented on HIVE-11112:
-

Thanks [~xuefuz] for reviewing it. 

 ISO-8859-1 text output has fragments of previous longer rows appended
 -

 Key: HIVE-11112
 URL: https://issues.apache.org/jira/browse/HIVE-11112
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-11112.1.patch


 If a LazySimpleSerDe table is created using ISO 8859-1 encoding, query 
 results for a string column are incorrect for any row that was preceded by a 
 row containing a longer string.
 Example steps to reproduce:
 1. Create a table using ISO 8859-1 encoding:
 CREATE TABLE person_lat1 (name STRING)
 ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH 
 SERDEPROPERTIES ('serialization.encoding'='ISO8859_1');
 2. Copy an ISO-8859-1 encoded text file into the appropriate warehouse folder 
 in HDFS. I'll attach an example file containing the following text: 
 Müller,Thomas
 Jørgensen,Jørgen
 Peña,Andrés
 Nåm,Fæk
 3. Execute SELECT * FROM person_lat1
 Result - The following output appears:
 +---+--+
 | person_lat1.name |
 +---+--+
 | Müller,Thomas |
 | Jørgensen,Jørgen |
 | Peña,Andrésørgen |
 | Nåm,Fækdrésørgen |
 +---+--+



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11118) Load data query should validate file formats with destination tables

2015-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-11118:
--
Attachment: HIVE-11118.4.patch

 Load data query should validate file formats with destination tables
 --

 Key: HIVE-11118
 URL: https://issues.apache.org/jira/browse/HIVE-11118
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-11118.2.patch, HIVE-11118.3.patch, 
 HIVE-11118.4.patch, HIVE-11118.patch


 Load data local inpath queries do not do any validation wrt file format. If 
 the destination table is ORC and we try to load files that are not ORC, 
 the load will succeed, but querying such tables will result in runtime 
 exceptions. We can do some simple sanity checks to prevent loading files 
 that do not match the destination table's file format.
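One cheap sanity check of the kind suggested is to inspect the file's magic bytes before loading; ORC files begin with the ASCII bytes "ORC". A stand-alone sketch (illustrative, not the actual patch):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class OrcMagicCheck {
    // ORC files begin with the 3-byte magic "ORC".
    static boolean looksLikeOrc(Path file) throws IOException {
        byte[] magic = new byte[3];
        try (InputStream in = Files.newInputStream(file)) {
            int n = in.read(magic);
            return n == 3 && new String(magic, StandardCharsets.UTF_8).equals("ORC");
        }
    }

    public static void main(String[] args) throws IOException {
        // A fake ORC file (correct magic) and a plain text file.
        Path orc = Files.createTempFile("demo", ".orc");
        Files.write(orc, "ORC...fake body...".getBytes(StandardCharsets.UTF_8));
        Path text = Files.createTempFile("demo", ".txt");
        Files.write(text, "plain text row".getBytes(StandardCharsets.UTF_8));

        System.out.println(looksLikeOrc(orc));
        System.out.println(looksLikeOrc(text));
    }
}
```

A real check would also need to handle other destination formats, but the principle is the same: reject the load before bad files land in the table directory.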



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11104) Select operator doesn't propagate constants appearing in expressions

2015-06-26 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603378#comment-14603378
 ] 

Prasanth Jayachandran commented on HIVE-11104:
--

I think the issue seems to be that the column expression map does not contain 
ExprNodeConstantDesc. Stats annotation is aware of constant projections if they 
are contained in colExprMap. That being said, I am fine with taking this up in a 
subsequent follow-up jira. +1

 Select operator doesn't propagate constants appearing in expressions
 

 Key: HIVE-11104
 URL: https://issues.apache.org/jira/browse/HIVE-11104
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-11104.2.patch, HIVE-11104.3.patch, HIVE-11104.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11032) Enable more tests for grouping by skewed data [Spark Branch]

2015-06-26 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603622#comment-14603622
 ] 

Xuefu Zhang commented on HIVE-11032:


[~mohitsabharwal], could you please create a JIRA tracking the missing feature 
of hive.explain.user and related tests? Thanks.

 Enable more tests for grouping by skewed data [Spark Branch]
 

 Key: HIVE-11032
 URL: https://issues.apache.org/jira/browse/HIVE-11032
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Mohit Sabharwal
Priority: Minor
 Attachments: HIVE-11032.1-spark.patch, HIVE-11032.2-spark.patch


 Not all such tests are enabled, e.g. {{groupby1_map_skew.q}}. We can use 
 this JIRA to track whether we need more of them.
 Basically, we need to look at all tests with {{set 
 hive.groupby.skewindata=true;}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10233:
--
Attachment: HIVE-10233.20.patch

 Hive on tez: memory manager for grace hash join
 ---

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap, 2.0.0
Reporter: Vikram Dixit K
Assignee: Gunther Hagleitner
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
 HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
 HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
 HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
 HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch, 
 HIVE-10233.18.patch, HIVE-10233.19.patch, HIVE-10233.20.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11031) ORC concatenation of old files can fail while merging column statistics

2015-06-26 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603439#comment-14603439
 ] 

Prasanth Jayachandran commented on HIVE-11031:
--

It could be because of this or HIVE-10685. Can you try with branch-1.2 and see 
if it works for your query? Alternatively, you can provide me with a small repro; I 
can verify and confirm.

 ORC concatenation of old files can fail while merging column statistics
 ---

 Key: HIVE-11031
 URL: https://issues.apache.org/jira/browse/HIVE-11031
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical
 Fix For: 1.2.1

 Attachments: HIVE-11031-branch-1.0.patch, HIVE-11031.2.patch, 
 HIVE-11031.3.patch, HIVE-11031.4.patch, HIVE-11031.patch


 Column statistics in ORC are optional protobuf fields. Old ORC files might 
 not have statistics for newly added types like decimal, date, timestamp, etc. 
 But column statistics merging assumes column statistics exist for these 
 types and invokes merge. For example, merging of TimestampColumnStatistics 
 directly casts the received ColumnStatistics object without doing an instanceof 
 check. If the ORC file contains timestamp column statistics then this will 
 work; else it will throw ClassCastException.
 Also, the file merge operator swallows the exception.
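The instanceof guard the description calls for can be sketched in isolation (the classes below are simplified stand-ins for the ORC statistics types, not the real implementation):

```java
public class StatsMergeDemo {
    interface ColumnStatistics { }

    static class TimestampColumnStatistics implements ColumnStatistics {
        long min = Long.MAX_VALUE, max = Long.MIN_VALUE;

        // Merging with a blind cast throws ClassCastException for old files
        // whose statistics lack timestamp fields; checking instanceof first
        // lets the merge degrade gracefully instead.
        boolean merge(ColumnStatistics other) {
            if (!(other instanceof TimestampColumnStatistics)) {
                return false;   // incompatible stats: caller can drop them
            }
            TimestampColumnStatistics ts = (TimestampColumnStatistics) other;
            min = Math.min(min, ts.min);
            max = Math.max(max, ts.max);
            return true;
        }
    }

    // Models statistics from an old file without timestamp fields.
    static class GenericColumnStatistics implements ColumnStatistics { }

    public static void main(String[] args) {
        TimestampColumnStatistics a = new TimestampColumnStatistics();
        System.out.println(a.merge(new TimestampColumnStatistics()));
        System.out.println(a.merge(new GenericColumnStatistics()));  // no cast failure
    }
}
```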



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10983) SerDeUtils bug, when Text is reused

2015-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603458#comment-14603458
 ] 

Hive QA commented on HIVE-10983:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742100/HIVE-10983.5.patch.txt

{color:green}SUCCESS:{color} +1 9025 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4398/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4398/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4398/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742100 - PreCommit-HIVE-TRUNK-Build

 SerDeUtils bug, when Text is reused 
 -

 Key: HIVE-10983
 URL: https://issues.apache.org/jira/browse/HIVE-10983
 Project: Hive
  Issue Type: Bug
  Components: API, CLI
Affects Versions: 0.14.0, 1.0.0, 1.2.0
 Environment: Hadoop 2.3.0-cdh5.0.0
 Hive 0.14
Reporter: xiaowei wang
Assignee: xiaowei wang
  Labels: patch
 Fix For: 0.14.1, 1.2.0

 Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt, 
 HIVE-10983.3.patch.txt, HIVE-10983.4.patch.txt, HIVE-10983.5.patch.txt


 {noformat}
 The methods transformTextToUTF8 and transformTextFromUTF8 have a bug: they 
 invoke an unsafe method of Text, getBytes()!
 The getBytes() method of Text returns the raw bytes; however, only data up to 
 Text.length is valid. A better way is to use copyBytes() if you need the 
 returned array to be precisely the length of the data.
 But copyBytes() was only added after hadoop1. 
 {noformat}
 When I query data from an LZO table, the length of the current row in the 
 results is always larger than the previous row, and sometimes the current 
 row contains the contents of the previous row. For example, I execute a SQL query,
 {code:sql}
 select *   from web_searchhub where logdate=2015061003
 {code}
 The result of the SQL is shown below. Notice that the second row's content 
 contains the first row's content.
 {noformat}
 INFO [03:00:05.589] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
 INFO [03:00:05.594] 18941e66-9962-44ad-81bc-3519f47ba274 
 session=901,thread=223ession=3151,thread=254 2015061003
 {noformat}
 The content of the original LZO file is shown below; it has just 2 rows.
 {noformat}
 INFO [03:00:05.635] b88e0473-7530-494c-82d8-e2d2ebd2666c_forweb 
 session=3148,thread=285
 INFO [03:00:05.635] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
 {noformat}
 I think this error is caused by Text reuse, and I found a solution.
 Additionally, the table create SQL is: 
 {code:sql}
 CREATE EXTERNAL TABLE `web_searchhub`(
   `line` string)
 PARTITIONED BY (
   `logdate` string)
 ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\\U'
 WITH SERDEPROPERTIES (
   'serialization.encoding'='GBK')
 STORED AS INPUTFORMAT  com.hadoop.mapred.DeprecatedLzoTextInputFormat
   OUTPUTFORMAT 
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat;
 LOCATION
   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
 {code}
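The getBytes()/copyBytes() pitfall described above is easy to reproduce without Hadoop; the class below mimics Text's buffer-reuse semantics (it is a stand-in, not org.apache.hadoop.io.Text):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TextReuseDemo {
    // Minimal stand-in for org.apache.hadoop.io.Text: a growable,
    // reused byte buffer with a logical length.
    static class FakeText {
        private byte[] bytes = new byte[0];
        private int length;

        void set(String s) {
            byte[] b = s.getBytes(StandardCharsets.UTF_8);
            if (b.length > bytes.length) {
                bytes = new byte[b.length];   // buffer grows, never shrinks
            }
            System.arraycopy(b, 0, bytes, 0, b.length);
            length = b.length;
        }

        byte[] getBytes() { return bytes; }                          // raw backing array
        byte[] copyBytes() { return Arrays.copyOf(bytes, length); }  // exact length
    }

    public static void main(String[] args) {
        FakeText t = new FakeText();
        t.set("a long first row");   // 16 bytes
        t.set("short");              // 5 bytes, buffer is reused

        // getBytes() leaks the tail of the previous, longer row:
        System.out.println(new String(t.getBytes(), StandardCharsets.UTF_8));
        // copyBytes() returns only the valid data:
        System.out.println(new String(t.copyBytes(), StandardCharsets.UTF_8));
    }
}
```

This is exactly the symptom in the query output above: the short second row drags along the tail of the longer first row.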



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)

2015-06-26 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603489#comment-14603489
 ] 

Dmitry Tolpeko commented on HIVE-11055:
---

I will need to compile for hadoop-1 and try to find a replacement for the method 
resolvePath(org.apache.hadoop.fs.Path), which is available in hadoop-2 only. Any 
hints on how to deal with such cases? Thanks.

 HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
 ---

 Key: HIVE-11055
 URL: https://issues.apache.org/jira/browse/HIVE-11055
 Project: Hive
  Issue Type: Improvement
Reporter: Dmitry Tolpeko
Assignee: Dmitry Tolpeko
 Attachments: HIVE-11055.1.patch, HIVE-11055.2.patch


 There is a PL/HQL tool (www.plhql.org) that implements procedural SQL for Hive 
 (actually any SQL-on-Hadoop implementation and any JDBC source).
 Alan Gates offered to contribute it to Hive under the HPL/SQL name 
 (org.apache.hive.hplsql package). This JIRA is to create a patch to 
 contribute the PL/HQL code. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11130) Refactoring the code so that HiveTxnManager interface will support lock/unlock table/database object

2015-06-26 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-11130:

Attachment: HIVE-11130.patch

 Refactoring the code so that HiveTxnManager interface will support 
 lock/unlock table/database object
 

 Key: HIVE-11130
 URL: https://issues.apache.org/jira/browse/HIVE-11130
 Project: Hive
  Issue Type: Sub-task
  Components: Locking
Affects Versions: 2.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-11130.patch


 This is just a refactoring step which keeps the current logic, but it exposes 
 explicit lock/unlock of table and database objects in HiveTxnManager, which 
 should be implemented differently by the subclasses (currently it's not; e.g., 
 for the ZooKeeper implementation, we should lock the table and database when 
 we try to lock the table).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11104) Select operator doesn't propagate constants appearing in expressions

2015-06-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603549#comment-14603549
 ] 

Ashutosh Chauhan commented on HIVE-11104:
-

Pushed to branch-1 as well.

 Select operator doesn't propagate constants appearing in expressions
 

 Key: HIVE-11104
 URL: https://issues.apache.org/jira/browse/HIVE-11104
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 2.0.0

 Attachments: HIVE-11104.2.patch, HIVE-11104.3.patch, HIVE-11104.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11100) Beeline should escape semi-colon in queries

2015-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603561#comment-14603561
 ] 

Hive QA commented on HIVE-11100:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742137/HIVE-11100.patch

{color:green}SUCCESS:{color} +1 9032 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4399/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4399/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4399/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742137 - PreCommit-HIVE-TRUNK-Build

 Beeline should escape semi-colon in queries
 ---

 Key: HIVE-11100
 URL: https://issues.apache.org/jira/browse/HIVE-11100
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 1.2.0, 1.1.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
Priority: Minor
 Attachments: HIVE-11100.patch


 Beeline should escape the semicolon in queries. For example, queries like 
 the following:
 CREATE TABLE beeline_tb (c1 int, c2 string) ROW FORMAT DELIMITED FIELDS 
 TERMINATED BY ';' LINES TERMINATED BY '\n';
 or 
 CREATE TABLE beeline_tb (c1 int, c2 string) ROW FORMAT DELIMITED FIELDS 
 TERMINATED BY '\;' LINES TERMINATED BY '\n';
 both fail.
 But the 2nd query, with the semicolon escaped with \, works in the CLI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11104) Select operator doesn't propagate constants appearing in expressions

2015-06-26 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603468#comment-14603468
 ] 

Gunther Hagleitner commented on HIVE-11104:
---

[~ashutoshc] branch-1?

 Select operator doesn't propagate constants appearing in expressions
 

 Key: HIVE-11104
 URL: https://issues.apache.org/jira/browse/HIVE-11104
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 2.0.0

 Attachments: HIVE-11104.2.patch, HIVE-11104.3.patch, HIVE-11104.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10233:
--
Attachment: HIVE-10233.19.patch

 Hive on tez: memory manager for grace hash join
 ---

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap, 2.0.0
Reporter: Vikram Dixit K
Assignee: Gunther Hagleitner
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
 HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
 HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
 HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
 HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch, 
 HIVE-10233.18.patch, HIVE-10233.19.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-26 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603478#comment-14603478
 ] 

Gunther Hagleitner commented on HIVE-10233:
---

.19 has more reported fixes (fallback in case all joins are small, actually 
making fallback work...)

 Hive on tez: memory manager for grace hash join
 ---

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap, 2.0.0
Reporter: Vikram Dixit K
Assignee: Gunther Hagleitner
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
 HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
 HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
 HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
 HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch, 
 HIVE-10233.18.patch, HIVE-10233.19.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11129) Issue a warning when copied from UTF-8 to ISO 8859-1

2015-06-26 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-11129:
---

Assignee: Aihua Xu

 Issue a warning when copied from UTF-8 to ISO 8859-1
 

 Key: HIVE-11129
 URL: https://issues.apache.org/jira/browse/HIVE-11129
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Reporter: Aihua Xu
Assignee: Aihua Xu

 Copying data from a table using UTF-8 encoding to one using ISO 8859-1 
 encoding causes data corruption without warning.
 {noformat}
 CREATE TABLE person_utf8 (name STRING)
 ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
 WITH SERDEPROPERTIES ('serialization.encoding'='UTF8');
 {noformat}
 Put the following data in the table:
 Müller,Thomas
 Jørgensen,Jørgen
 Vega,Andrés
 中村,浩人
 אביה,נועם
 {noformat}
 CREATE TABLE person_2 ROW FORMAT SERDE 
 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
 WITH SERDEPROPERTIES ('serialization.encoding'='ISO8859_1')
 AS select * from person_utf8;
 {noformat}
 It is expected that the data gets mangled, but we should give a warning. 
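The silent corruption described above can be reproduced in plain Java (a sketch of the charset behavior only, not Hive's SerDe code): `String.getBytes(Charset)` replaces characters that ISO-8859-1 cannot represent with `?` without raising any error.

```java
import java.nio.charset.StandardCharsets;

public class EncodingLossDemo {
    // Encode to ISO-8859-1 and decode back; unmappable characters are
    // silently replaced with '?' by String.getBytes(Charset).
    static String roundTrip(String s) {
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Müller"));  // representable, survives
        System.out.println(roundTrip("中村"));     // unmappable, becomes "??"
    }
}
```

A warning at CTAS time would require detecting this replacement, e.g. by using a CharsetEncoder configured to report (rather than replace) unmappable input.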



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11032) Enable more tests for grouping by skewed data [Spark Branch]

2015-06-26 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603481#comment-14603481
 ] 

Mohit Sabharwal commented on HIVE-11032:


Thanks [~lirui], yes, I verified that the query plan is in line with what we 
see in MR.

When {{hive.groupby.skewindata=true}} is set, unless there is a distinct 
clause, the Reduce Output Operator partitions based on {{rand()}}. (The 
subsequent Reducer then does partial aggregation and the following reducer does 
final aggregation.)

I also verified the behavior for other cases as well, for example when 
{{hive.map.aggr=true}} is set in addition to {{hive.groupby.skewindata=true}} 
as documented here: 
https://cwiki.apache.org/confluence/display/Hive/GroupByWithRollup

The {{index_bitmap3}} test failure is unrelated to this patch. 
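The two-stage plan described above can be sketched in plain Java (a hypothetical stand-in for the reducers, not Hive code): stage 1 spreads rows across reducers by rand() and computes partial aggregates, so a skewed key does not land on a single hot reducer; stage 2 merges the partials by key.

```java
import java.util.*;

public class SkewGroupByDemo {
    // Two-stage count(*) group by: random partitioning, then final merge.
    static Map<String, Integer> skewedGroupCount(List<String> rows,
                                                 int numReducers, Random rnd) {
        // Stage 1: each "reducer" holds partial counts for a random slice.
        List<Map<String, Integer>> partials = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) partials.add(new HashMap<>());
        for (String key : rows) {
            partials.get(rnd.nextInt(numReducers)).merge(key, 1, Integer::sum);
        }
        // Stage 2: merge partials, now partitioned by key.
        Map<String, Integer> finalCounts = new TreeMap<>();
        for (Map<String, Integer> p : partials) {
            p.forEach((k, v) -> finalCounts.merge(k, v, Integer::sum));
        }
        return finalCounts;
    }

    public static void main(String[] args) {
        // "a" is the skewed key; the result is the same for any random seed.
        List<String> rows = Arrays.asList("a", "a", "a", "a", "b", "a", "a", "c");
        System.out.println(skewedGroupCount(rows, 3, new Random()));  // {a=6, b=1, c=1}
    }
}
```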

 Enable more tests for grouping by skewed data [Spark Branch]
 

 Key: HIVE-11032
 URL: https://issues.apache.org/jira/browse/HIVE-11032
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Mohit Sabharwal
Priority: Minor
 Attachments: HIVE-11032.1-spark.patch, HIVE-11032.2-spark.patch


 Not all such tests are enabled, e.g. {{groupby1_map_skew.q}}. We can use 
 this JIRA to track whether we need more of them.
 Basically, we need to look at all tests with {{set 
 hive.groupby.skewindata=true;}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11106) HiveServer2 JDBC (greater than v0.13.1) cannot connect to non-default database

2015-06-26 Thread Tom Coleman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom Coleman updated HIVE-11106:
---
Description: 
Using HiveServer 0.14.0 or greater, I cannot connect to a non-default database.

For example, when connecting to HiveServer via the following URL, the session 
uses the 'default' database instead of the intended database.
jdbc://localhost:1/customDb

This exact issue was fixed in 0.13.1 of HiveServer in 
https://issues.apache.org/jira/browse/HIVE-5904, but for some reason the fix 
was not ported to v0.14.0 or greater. From looking at the source, it looks as 
if the fix was overridden by another change to the HiveConnection class; was 
this intentional, or was the defect reintroduced by another defect fix?

This means that we need to use 0.13.1 in order to connect to a non-default 
database via JDBC, and we cannot upgrade Hive versions. We don't want to place 
a JDBC interceptor that injects "use customDb" each time a connection is 
borrowed from the pool in production code. One should be able to connect 
straight to the non-default database via the JDBC URL.

It could perhaps be a simple oversight on my behalf, in that the syntax for 
connecting to a non-default database has changed from 0.14.0 onwards, but I'd 
be grateful if this could be confirmed.

  was:
Using HiveServer 0.14.0 or greater, I cannot connect a non-default database.

For example when connecting to HiveServer to via the following URLs, the 
session uses the 'default' database, instead of the intended database.
jdbc://localhost:1/customDb

This exact issue was fixed in 0.13.1 of HiveServer from 
https://issues.apache.org/jira/browse/HIVE-5904 but for some reason this fix 
was not ported to v0.14.0 or greater. From looking at the source, it looks as 
if this fix was overriden by another change to the HiveConnection class, was 
this intentional or a defect reintroduced from another defect fix?

This means that we need to use 0.13.1 in order to connect to a non-default 
database via JDBC and we cannot upgrade Hive versions.

Now it perhaps could be a simple oversight on my behalf in which the syntax to 
connect to a non-default database has changed from 0.14.0 onwards but I'd be 
grateful is this could be confirmed.


 HiveServer2 JDBC (greater than v0.13.1) cannot connect to non-default database
 --

 Key: HIVE-11106
 URL: https://issues.apache.org/jira/browse/HIVE-11106
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.14.0
Reporter: Tom Coleman

 Using HiveServer 0.14.0 or greater, I cannot connect to a non-default database.
 For example, when connecting to HiveServer via the following URL, the session 
 uses the 'default' database instead of the intended database.
 jdbc://localhost:1/customDb
 This exact issue was fixed in 0.13.1 of HiveServer in 
 https://issues.apache.org/jira/browse/HIVE-5904, but for some reason the fix 
 was not ported to v0.14.0 or greater. From looking at the source, it looks as 
 if the fix was overridden by another change to the HiveConnection class; was 
 this intentional, or was the defect reintroduced by another defect fix?
 This means that we need to use 0.13.1 in order to connect to a non-default 
 database via JDBC, and we cannot upgrade Hive versions. We don't want to place 
 a JDBC interceptor that injects "use customDb" each time a connection is 
 borrowed from the pool in production code. One should be able to connect 
 straight to the non-default database via the JDBC URL.
 It could perhaps be a simple oversight on my behalf, in that the syntax for 
 connecting to a non-default database has changed from 0.14.0 onwards, but I'd 
 be grateful if this could be confirmed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-26 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602637#comment-14602637
 ] 

Chengxiang Li commented on HIVE-10983:
--

Great, [~xiaowei], let's wait for the unit test result. Besides, could you 
also test it with your own test case?

 SerDeUtils bug  ,when Text is reused 
 -

 Key: HIVE-10983
 URL: https://issues.apache.org/jira/browse/HIVE-10983
 Project: Hive
  Issue Type: Bug
  Components: API, CLI
Affects Versions: 0.14.0, 1.0.0, 1.2.0
 Environment: Hadoop 2.3.0-cdh5.0.0
 Hive 0.14
Reporter: xiaowei wang
Assignee: xiaowei wang
  Labels: patch
 Fix For: 0.14.1, 1.2.0

 Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt, 
 HIVE-10983.3.patch.txt, HIVE-10983.4.patch.txt


 {noformat}
 The method transformTextToUTF8 has a bug: it invokes a problematic method of 
 Text, getBytes()!
 The getBytes method of Text returns the raw byte array; however, only data up 
 to Text.length is valid. A better way is to use copyBytes() if you need the 
 returned array to be precisely the length of the data.
 But copyBytes() was only added after hadoop1. 
 {noformat}
 When I query data from an LZO table, I found in the results that the length 
 of the current row is always larger than that of the previous row, and 
 sometimes the current row contains the contents of the previous row. For 
 example, I execute a SQL query:
 {code:sql}
 select *   from web_searchhub where logdate=2015061003
 {code}
 The result of the SQL query is shown below. Notice that the second row's 
 content contains the first row's content.
 {noformat}
 INFO [03:00:05.589] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
 INFO [03:00:05.594] 18941e66-9962-44ad-81bc-3519f47ba274 
 session=901,thread=223ession=3151,thread=254 2015061003
 {noformat}
 The content of the original LZO file is shown below, just 2 rows.
 {noformat}
 INFO [03:00:05.635] b88e0473-7530-494c-82d8-e2d2ebd2666c_forweb 
 session=3148,thread=285
 INFO [03:00:05.635] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
 {noformat}
 I think this error is caused by the Text reuse, and I found a solution.
 Additionally, the table create SQL is: 
 {code:sql}
 CREATE EXTERNAL TABLE `web_searchhub`(
   `line` string)
 PARTITIONED BY (
   `logdate` string)
 ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\\U'
 WITH SERDEPROPERTIES (
   'serialization.encoding'='GBK')
 STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
   OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
 LOCATION
   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
 {code}
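The getBytes()/copyBytes() distinction at the root of this bug can be demonstrated with a minimal plain-Java stand-in for Hadoop's Text (a sketch mimicking its reuse semantics, not the actual Hadoop class): the backing array only grows on reuse, so decoding the raw getBytes() buffer leaks the tail of the previous, longer row.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Stand-in for org.apache.hadoop.io.Text reuse: buffer grows, never shrinks.
class ReusableText {
    private byte[] bytes = new byte[0];
    private int length = 0;

    void set(String s) {
        byte[] src = s.getBytes(StandardCharsets.UTF_8);
        if (src.length > bytes.length) {
            bytes = new byte[src.length];   // grow only
        }
        System.arraycopy(src, 0, bytes, 0, src.length);
        length = src.length;                // old bytes past length remain
    }

    byte[] getBytes()  { return bytes; }                        // raw buffer
    byte[] copyBytes() { return Arrays.copyOf(bytes, length); } // exact length
}

public class TextReuseDemo {
    public static void main(String[] args) {
        ReusableText t = new ReusableText();
        t.set("a long first row");
        t.set("short");
        // Bug: raw buffer still contains the previous row's tail.
        System.out.println(new String(t.getBytes(), StandardCharsets.UTF_8));
        // prints "shortg first row" (corrupted)
        System.out.println(new String(t.copyBytes(), StandardCharsets.UTF_8));
        // prints "short"
    }
}
```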



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)

2015-06-26 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602645#comment-14602645
 ] 

Dmitry Tolpeko commented on HIVE-11055:
---

Correction: plhql-site.xml - hplsql-site.xml

 HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
 ---

 Key: HIVE-11055
 URL: https://issues.apache.org/jira/browse/HIVE-11055
 Project: Hive
  Issue Type: Improvement
Reporter: Dmitry Tolpeko
Assignee: Dmitry Tolpeko
 Attachments: HIVE-11055.1.patch, HIVE-11055.2.patch


 There is a PL/HQL tool (www.plhql.org) that implements procedural SQL for 
 Hive (actually, any SQL-on-Hadoop implementation and any JDBC source).
 Alan Gates offered to contribute it to Hive under the HPL/SQL name 
 (org.apache.hive.hplsql package). This JIRA is to create a patch to 
 contribute the PL/HQL code. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11118) Load data query should validate file formats with destination tables

2015-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8:
--
Summary: Load data query should validate file formats with destination 
tables  (was: Load data query should valide file formats with destination 
tables)

 Load data query should validate file formats with destination tables
 

 Key: HIVE-8
 URL: https://issues.apache.org/jira/browse/HIVE-8
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-8.2.patch, HIVE-8.3.patch, 
 HIVE-8.4.patch, HIVE-8.patch


 Load data local inpath queries do not do any validation with respect to file 
 format. If the destination table is ORC and we try to load files that are not 
 ORC, the load will succeed, but querying such tables will result in runtime 
 exceptions. We can do some simple sanity checks to prevent loading files that 
 do not match the destination table's file format.
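One possible shape of such a sanity check (a hypothetical sketch, not the actual patch) relies on the fact that ORC files begin with the 3-byte magic "ORC", so a cheap header read can reject obviously mismatched files before they are moved into the table directory:

```java
public class OrcMagicCheck {
    // ORC files start with the 3-byte magic "ORC"; anything else is
    // certainly not an ORC file. (A real check would also consult the
    // postscript at the end of the file.)
    static boolean looksLikeOrc(byte[] header) {
        return header.length >= 3
            && header[0] == 'O' && header[1] == 'R' && header[2] == 'C';
    }

    public static void main(String[] args) {
        byte[] orcHeader  = {'O', 'R', 'C', 0x01};
        byte[] textHeader = "plain text".getBytes();
        System.out.println(looksLikeOrc(orcHeader));   // true
        System.out.println(looksLikeOrc(textHeader));  // false
    }
}
```

In practice the load path would read only the first few bytes of each input file, keeping the check O(1) per file.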



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11128) Stats annotation should consider select star same as select without column list

2015-06-26 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11128:

Attachment: HIVE-11128.2.patch

 Stats annotation should consider select star same as select without column 
 list
 ---

 Key: HIVE-11128
 URL: https://issues.apache.org/jira/browse/HIVE-11128
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 1.2.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-11128.2.patch, HIVE-11128.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11118) Load data query should validate file formats with destination tables

2015-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603885#comment-14603885
 ] 

Hive QA commented on HIVE-8:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742179/HIVE-8.4.patch

{color:green}SUCCESS:{color} +1 9030 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4402/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4402/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4402/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742179 - PreCommit-HIVE-TRUNK-Build

 Load data query should validate file formats with destination tables
 

 Key: HIVE-8
 URL: https://issues.apache.org/jira/browse/HIVE-8
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-8.2.patch, HIVE-8.3.patch, 
 HIVE-8.4.patch, HIVE-8.patch


 Load data local inpath queries do not do any validation with respect to file 
 format. If the destination table is ORC and we try to load files that are not 
 ORC, the load will succeed, but querying such tables will result in runtime 
 exceptions. We can do some simple sanity checks to prevent loading files that 
 do not match the destination table's file format.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11133) Support hive.explain.user for Spark [Spark Branch]

2015-06-26 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-11133:
---
Summary: Support hive.explain.user for Spark [Spark Branch]  (was: Support 
hive.explain.user for Spark)

 Support hive.explain.user for Spark [Spark Branch]
 --

 Key: HIVE-11133
 URL: https://issues.apache.org/jira/browse/HIVE-11133
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Mohit Sabharwal

 User-friendly explain output ({{set hive.explain.user=true}}) should support 
 Spark as well. 
 Once supported, we should also enable related q-tests like {{explainuser_1.q}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-26 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603901#comment-14603901
 ] 

Prasanth Jayachandran commented on HIVE-10233:
--

Looks good to me too. 

 Hive on tez: memory manager for grace hash join
 ---

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap, 2.0.0
Reporter: Vikram Dixit K
Assignee: Gunther Hagleitner
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
 HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
 HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
 HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
 HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch, 
 HIVE-10233.18.patch, HIVE-10233.19.patch, HIVE-10233.20.patch, 
 HIVE-10233.21.patch, HIVE-10233.22.patch, HIVE-10233.23.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10233:
--
Attachment: HIVE-10233.22.patch

 Hive on tez: memory manager for grace hash join
 ---

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap, 2.0.0
Reporter: Vikram Dixit K
Assignee: Gunther Hagleitner
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
 HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
 HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
 HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
 HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch, 
 HIVE-10233.18.patch, HIVE-10233.19.patch, HIVE-10233.20.patch, 
 HIVE-10233.21.patch, HIVE-10233.22.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11032) Enable more tests for grouping by skewed data [Spark Branch]

2015-06-26 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603814#comment-14603814
 ] 

Mohit Sabharwal commented on HIVE-11032:


Created HIVE-11133 to support {{hive.explain.user}} for Spark.

 Enable more tests for grouping by skewed data [Spark Branch]
 

 Key: HIVE-11032
 URL: https://issues.apache.org/jira/browse/HIVE-11032
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Mohit Sabharwal
Priority: Minor
 Attachments: HIVE-11032.1-spark.patch, HIVE-11032.2-spark.patch


 Not all such tests are enabled, e.g. {{groupby1_map_skew.q}}. We can use 
 this JIRA to track whether we need more of them.
 Basically, we need to look at all tests with {{set 
 hive.groupby.skewindata=true;}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11028) Tez: table self join and join with another table fails with IndexOutOfBoundsException

2015-06-26 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-11028:
--
Fix Version/s: 1.2.2

 Tez: table self join and join with another table fails with 
 IndexOutOfBoundsException
 -

 Key: HIVE-11028
 URL: https://issues.apache.org/jira/browse/HIVE-11028
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Reporter: Jason Dere
Assignee: Jason Dere
 Fix For: 2.0.0, 1.2.2

 Attachments: HIVE-11028.1.patch, HIVE-11028.2.patch, 
 HIVE-11028.3.patch


 {noformat}
 create table tez_self_join1(id1 int, id2 string, id3 string);
 insert into table tez_self_join1 values(1, 'aa','bb'), (2, 'ab','ab'), 
 (3,'ba','ba');
 create table tez_self_join2(id1 int);
 insert into table tez_self_join2 values(1),(2),(3);
 explain
 select s.id2, s.id3
 from
 (
  select self1.id1, self1.id2, self1.id3
  from tez_self_join1 self1 join tez_self_join1 self2
  on self1.id2=self2.id3 ) s
 join tez_self_join2
 on s.id1=tez_self_join2.id1
 where s.id2='ab';
 {noformat}
 fails with error:
 {noformat}
 2015-06-16 15:41:55,759 ERROR [main]: ql.Driver 
 (SessionState.java:printError(979)) - FAILED: Execution Error, return code 2 
 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, 
 vertexName=Reducer 3, vertexId=vertex_1434494327112_0002_4_04, 
 diagnostics=[Task failed, taskId=task_1434494327112_0002_4_04_00, 
 diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
 task:java.lang.RuntimeException: java.lang.IndexOutOfBoundsException: Index: 
 0, Size: 0
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
 at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
 at java.util.ArrayList.rangeCheck(ArrayList.java:635)
 at java.util.ArrayList.get(ArrayList.java:411)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:118)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:109)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:290)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:275)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:175)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:313)
 at 
 org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:71)
 at 
 org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.initializeOp(CommonMergeJoinOperator.java:99)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
 at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:146)
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
 ... 13 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603817#comment-14603817
 ] 

Hive QA commented on HIVE-10233:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742236/HIVE-10233.21.patch

{color:green}SUCCESS:{color} +1 9027 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4401/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4401/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4401/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742236 - PreCommit-HIVE-TRUNK-Build

 Hive on tez: memory manager for grace hash join
 ---

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap, 2.0.0
Reporter: Vikram Dixit K
Assignee: Gunther Hagleitner
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
 HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
 HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
 HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
 HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch, 
 HIVE-10233.18.patch, HIVE-10233.19.patch, HIVE-10233.20.patch, 
 HIVE-10233.21.patch, HIVE-10233.22.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10233:
--
Attachment: HIVE-10233.23.patch

 Hive on tez: memory manager for grace hash join
 ---

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap, 2.0.0
Reporter: Vikram Dixit K
Assignee: Gunther Hagleitner
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
 HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
 HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
 HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
 HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch, 
 HIVE-10233.18.patch, HIVE-10233.19.patch, HIVE-10233.20.patch, 
 HIVE-10233.21.patch, HIVE-10233.22.patch, HIVE-10233.23.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10233:
--
Attachment: HIVE-10233.21.patch

 Hive on tez: memory manager for grace hash join
 ---

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap, 2.0.0
Reporter: Vikram Dixit K
Assignee: Gunther Hagleitner
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
 HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
 HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
 HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
 HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch, 
 HIVE-10233.18.patch, HIVE-10233.19.patch, HIVE-10233.20.patch, 
 HIVE-10233.21.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11131) Get row information on DataWritableWriter once for better writing performance

2015-06-26 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11131:
---
Attachment: HIVE-11131.1.patch

 Get row information on DataWritableWriter once for better writing performance
 -

 Key: HIVE-11131
 URL: https://issues.apache.org/jira/browse/HIVE-11131
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 1.2.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-11131.1.patch


 DataWritableWriter is a class used to write Hive records to Parquet files. 
 This class gets all the information about how to parse a record, such as the 
 schema and object inspector, every time a record is written (i.e., every time 
 write() is called).
 We can make this class perform better by initializing the writers per data
 type once, and saving the object inspectors on each writer.
 The class expects that the next records written will have the same object 
 inspectors and schema, so there is no need to check for that on every write. 
 When a new schema is written, DataWritableWriter is created again by Parquet. 
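The proposed optimization can be sketched in plain Java (hypothetical stand-in types, not the actual Parquet/Hive classes): derive one value writer per column type when the schema is first seen, then reuse those writers for every record instead of re-inspecting the schema on each write() call.

```java
import java.util.*;

public class CachedWriterDemo {
    interface ValueWriter { String write(Object v); }

    // Type dispatch happens once per schema, not once per record.
    static ValueWriter writerFor(String type) {
        switch (type) {
            case "int":    return v -> "int:" + v;
            case "string": return v -> "str:" + v;
            default:       throw new IllegalArgumentException("unknown type: " + type);
        }
    }

    // The per-record path just invokes the pre-built writers.
    static String writeRecord(List<ValueWriter> writers, Object[] record) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < record.length; i++) {
            if (i > 0) out.append(' ');
            out.append(writers.get(i).write(record[i]));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<ValueWriter> writers = new ArrayList<>();
        for (String t : Arrays.asList("int", "string")) writers.add(writerFor(t));
        System.out.println(writeRecord(writers, new Object[]{1, "a"}));  // int:1 str:a
        System.out.println(writeRecord(writers, new Object[]{2, "b"}));  // int:2 str:b
    }
}
```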



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11131) Get row information on DataWritableWriter once for better writing performance

2015-06-26 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11131:
---
Attachment: (was: HIVE-11131.1.patch)

 Get row information on DataWritableWriter once for better writing performance
 -

 Key: HIVE-11131
 URL: https://issues.apache.org/jira/browse/HIVE-11131
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 1.2.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-11131.1.patch


 DataWritableWriter is a class used to write Hive records to Parquet files. 
 This class gets all the information about how to parse a record, such as the 
 schema and object inspector, every time a record is written (i.e., every time 
 write() is called).
 We can make this class perform better by initializing the writers per data
 type once, and saving the object inspectors on each writer.
 The class expects that the next records written will have the same object 
 inspectors and schema, so there is no need to check for that on every write. 
 When a new schema is written, DataWritableWriter is created again by Parquet. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-26 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603852#comment-14603852
 ] 

Vikram Dixit K commented on HIVE-10233:
---

+1 LGTM.

 Hive on tez: memory manager for grace hash join
 ---

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap, 2.0.0
Reporter: Vikram Dixit K
Assignee: Gunther Hagleitner
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
 HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
 HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
 HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
 HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch, 
 HIVE-10233.18.patch, HIVE-10233.19.patch, HIVE-10233.20.patch, 
 HIVE-10233.21.patch, HIVE-10233.22.patch, HIVE-10233.23.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8177) Wrong parameter order in ExplainTask#getJSONLogicalPlan()

2015-06-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HIVE-8177:
-
Description: 
{code}
  JSONObject jsonPlan = outputMap(work.getParseContext().getTopOps(), true,
  out, jsonOutput, work.getExtended(), 0);
{code}

The order of the 4th and 5th parameters is reversed.
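Such a swap compiles silently because the parameters are loosely typed; the following plain-Java sketch (hypothetical signature, not the actual ExplainTask code) shows how two adjacent booleans can be crossed without any compiler warning:

```java
public class ParamOrderDemo {
    // Hypothetical stand-in for outputMap: several loosely typed parameters,
    // so passing two booleans in the wrong order is still type-correct.
    static String outputMap(String data, boolean hasHeader, StringBuilder out,
                            boolean jsonOutput, boolean extended, int indent) {
        return "json=" + jsonOutput + " extended=" + extended;
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        // Caller intends jsonOutput=true, extended=false, but swaps them:
        System.out.println(outputMap("plan", true, out, false, true, 0));
        // prints: json=false extended=true  <- flags silently crossed
    }
}
```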

  was:
{code}
  JSONObject jsonPlan = outputMap(work.getParseContext().getTopOps(), true,
  out, jsonOutput, work.getExtended(), 0);
{code}
The order of 4th and 5th parameters is reverted.


 Wrong parameter order in ExplainTask#getJSONLogicalPlan()
 -

 Key: HIVE-8177
 URL: https://issues.apache.org/jira/browse/HIVE-8177
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Assignee: SUYEON LEE
Priority: Minor
 Attachments: HIVE-8177.patch


 {code}
   JSONObject jsonPlan = outputMap(work.getParseContext().getTopOps(), 
 true,
   out, jsonOutput, work.getExtended(), 0);
 {code}
 The order of the 4th and 5th parameters is reversed.





[jira] [Updated] (HIVE-8342) Potential null dereference in ColumnTruncateMapper#jobClose()

2015-06-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HIVE-8342:
-
Description: 
{code}
Utilities.mvFileToFinalPath(outputPath, job, success, LOG, dynPartCtx, null,
  reporter);
{code}
Utilities.mvFileToFinalPath() calls createEmptyBuckets() where conf is 
dereferenced:
{code}
boolean isCompressed = conf.getCompressed();
TableDesc tableInfo = conf.getTableInfo();
{code}

  was:
{code}
Utilities.mvFileToFinalPath(outputPath, job, success, LOG, dynPartCtx, null,
  reporter);
{code}

Utilities.mvFileToFinalPath() calls createEmptyBuckets() where conf is 
dereferenced:
{code}
boolean isCompressed = conf.getCompressed();
TableDesc tableInfo = conf.getTableInfo();
{code}


 Potential null dereference in ColumnTruncateMapper#jobClose()
 -

 Key: HIVE-8342
 URL: https://issues.apache.org/jira/browse/HIVE-8342
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Assignee: skrho
Priority: Minor
 Attachments: HIVE-8342_001.patch, HIVE-8342_002.patch


 {code}
 Utilities.mvFileToFinalPath(outputPath, job, success, LOG, dynPartCtx, 
 null,
   reporter);
 {code}
 Utilities.mvFileToFinalPath() calls createEmptyBuckets() where conf is 
 dereferenced:
 {code}
 boolean isCompressed = conf.getCompressed();
 TableDesc tableInfo = conf.getTableInfo();
 {code}
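The hazard described above can be sketched with a minimal stdlib-only example (hypothetical names; this is a stand-in, not the actual Utilities/ColumnTruncateMapper code): a helper that dereferences a config object even though a caller may legitimately pass null, and the guarded alternative.

```java
public class NullConfDemo {
    // Stand-in for the FileSinkDesc-style conf that createEmptyBuckets() reads.
    static class Conf {
        boolean compressed;
        boolean getCompressed() { return compressed; }
    }

    // Mirrors the reported pattern: dereferencing conf without a null check
    // would throw NullPointerException when the caller passes null.
    static String describe(Conf conf) {
        if (conf == null) {
            return "uncompressed (no conf supplied)"; // guard instead of NPE
        }
        return conf.getCompressed() ? "compressed" : "uncompressed";
    }

    public static void main(String[] args) {
        System.out.println(describe(null));       // safe: guarded
        System.out.println(describe(new Conf())); // normal path
    }
}
```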





[jira] [Updated] (HIVE-8343) Return value from BlockingQueue.offer() is not checked in DynamicPartitionPruner

2015-06-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HIVE-8343:
-
Description: 
In addEvent() and processVertex(), there is a call such as the following:

{code}
  queue.offer(event);
{code}
The return value should be checked. If false is returned, the event would not 
have been queued.

Take a look at line 328 in:
http://fuseyism.com/classpath/doc/java/util/concurrent/LinkedBlockingQueue-source.html

  was:
In addEvent() and processVertex(), there is a call such as the following:
{code}
  queue.offer(event);
{code}
The return value should be checked. If false is returned, the event would not 
have been queued.

Take a look at line 328 in:
http://fuseyism.com/classpath/doc/java/util/concurrent/LinkedBlockingQueue-source.html


 Return value from BlockingQueue.offer() is not checked in 
 DynamicPartitionPruner
 

 Key: HIVE-8343
 URL: https://issues.apache.org/jira/browse/HIVE-8343
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Assignee: JongWon Park
Priority: Minor
 Attachments: HIVE-8343.patch


 In addEvent() and processVertex(), there is a call such as the following:
 {code}
   queue.offer(event);
 {code}
 The return value should be checked. If false is returned, the event would not 
 have been queued.
 Take a look at line 328 in:
 http://fuseyism.com/classpath/doc/java/util/concurrent/LinkedBlockingQueue-source.html
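The semantics at issue can be seen in a small stdlib-only sketch: on a bounded LinkedBlockingQueue, offer() drops the element when the queue is full and signals this only through its boolean return value, whereas put() blocks until space is available. (This is a generic illustration of the java.util.concurrent contract, not the DynamicPartitionPruner code itself.)

```java
import java.util.concurrent.LinkedBlockingQueue;

public class OfferDemo {
    // offer() never blocks: it reports success or failure via its return value,
    // so ignoring that value can silently lose an element.
    public static boolean tryEnqueue(LinkedBlockingQueue<String> q, String event) {
        return q.offer(event);
    }

    public static void main(String[] args) throws InterruptedException {
        LinkedBlockingQueue<String> q = new LinkedBlockingQueue<>(1);
        System.out.println(tryEnqueue(q, "event-1")); // true: queued
        System.out.println(tryEnqueue(q, "event-2")); // false: queue full, dropped
        q.take();
        q.put("event-2"); // put() blocks until room exists, so nothing is lost
        System.out.println(q.peek());
    }
}
```

Note that an unbounded LinkedBlockingQueue's offer() only fails under capacity pressure, but checking (or asserting on) the return value still documents the intent.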





[jira] [Updated] (HIVE-8285) Reference equality is used on boolean values in PartitionPruner#removeTruePredciates()

2015-06-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HIVE-8285:
-
Description: 
{code}
  if (e.getTypeInfo() == TypeInfoFactory.booleanTypeInfo
      && eC.getValue() == Boolean.TRUE) {
{code}
equals() should be used in the above comparison.

  was:
{code}
  if (e.getTypeInfo() == TypeInfoFactory.booleanTypeInfo
      && eC.getValue() == Boolean.TRUE) {
{code}

equals() should be used in the above comparison.


 Reference equality is used on boolean values in 
 PartitionPruner#removeTruePredciates()
 --

 Key: HIVE-8285
 URL: https://issues.apache.org/jira/browse/HIVE-8285
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Ted Yu
Priority: Minor
 Attachments: HIVE-8285.patch


 {code}
   if (e.getTypeInfo() == TypeInfoFactory.booleanTypeInfo
       && eC.getValue() == Boolean.TRUE) {
 {code}
 equals() should be used in the above comparison.
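Why reference equality is unreliable here can be shown with a stdlib-only sketch: `Boolean.TRUE` is a cached constant, so `==` only works when the value happens to be that exact object; a Boolean produced by a constructor (or deserialization) is a distinct instance.

```java
public class BooleanEqualityDemo {
    // Null-safe check: equals() on the constant never throws, and compares value.
    static boolean isTrue(Object v) {
        return Boolean.TRUE.equals(v);
    }

    public static void main(String[] args) {
        // A distinct Boolean object with value true:
        Boolean fromConstructor = new Boolean(true);
        System.out.println(fromConstructor == Boolean.TRUE);   // false: different objects
        System.out.println(isTrue(fromConstructor));           // true: same value
        System.out.println(isTrue(null));                      // false: no NPE
    }
}
```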





[jira] [Updated] (HIVE-11131) Get row information on DataWritableWriter once for better writing performance

2015-06-26 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11131:
---
Attachment: (was: HIVE-11131.1.patch)

 Get row information on DataWritableWriter once for better writing performance
 -

 Key: HIVE-11131
 URL: https://issues.apache.org/jira/browse/HIVE-11131
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 1.2.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-11131.1.patch


 DataWritableWriter is a class used to write Hive records to Parquet files. 
 This class is getting all the information about how to parse a record, such 
 as schema and object inspector, every time a record is written (or write() is 
 called).
 We can make this class perform better by initializing some writers per data
 type once, and saving all object inspectors on each writer.
 The class expects that the next records written will have the same object 
 inspectors and schema, so there is no need to have conditions for that. When 
 a new schema is written, DataWritableWriter is created again by Parquet. 
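The initialize-once idea above can be sketched in a few lines (hypothetical names and a trivial type system; this is not the actual patch): resolve one writer per field type when the schema is first seen, then reuse it for every record instead of re-dispatching on type per write().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class CachedWriterDemo {
    // Hypothetical per-type writer: turns a field value into its serialized form.
    interface FieldWriter extends Function<Object, String> {}

    private final Map<String, FieldWriter> writers = new HashMap<>();

    // Called once per schema, not once per record.
    void init(String[] fieldTypes) {
        for (String type : fieldTypes) {
            writers.computeIfAbsent(type, t ->
                t.equals("int") ? v -> Integer.toString((Integer) v)
                                : v -> String.valueOf(v));
        }
    }

    // Per-record hot path: no type inspection, just a lookup and apply.
    String write(String type, Object value) {
        return writers.get(type).apply(value);
    }

    public static void main(String[] args) {
        CachedWriterDemo w = new CachedWriterDemo();
        w.init(new String[]{"int", "string"});
        System.out.println(w.write("int", 42));
        System.out.println(w.write("string", "hi"));
    }
}
```

This is safe precisely because of the invariant stated in the description: all records written through one DataWritableWriter share the same schema and object inspectors.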





[jira] [Updated] (HIVE-11131) Get row information on DataWritableWriter once for better writing performance

2015-06-26 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11131:
---
Attachment: HIVE-11131.1.patch

 Get row information on DataWritableWriter once for better writing performance
 -

 Key: HIVE-11131
 URL: https://issues.apache.org/jira/browse/HIVE-11131
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 1.2.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-11131.1.patch


 DataWritableWriter is a class used to write Hive records to Parquet files. 
 This class is getting all the information about how to parse a record, such 
 as schema and object inspector, every time a record is written (or write() is 
 called).
 We can make this class perform better by initializing some writers per data
 type once, and saving all object inspectors on each writer.
 The class expects that the next records written will have the same object 
 inspectors and schema, so there is no need to have conditions for that. When 
 a new schema is written, DataWritableWriter is created again by Parquet. 





[jira] [Commented] (HIVE-11132) Queries using join and group by produce incorrect output when hive.auto.convert.join=false and hive.optimize.reducededuplication=true

2015-06-26 Thread Rich Haase (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603705#comment-14603705
 ] 

Rich Haase commented on HIVE-11132:
---

Explain plan when hive.auto.convert.join=false and 
hive.optimize.reducededuplication=true:

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Map Operator Tree:
  TableScan
alias: mooo
Statistics: Num rows: 1511511 Data size: 3402058087 Basic stats: 
COMPLETE Column stats: NONE
Filter Operator
  predicate: (((oppty_id is not null and oppty_line_id is not null) 
and (order_order_system <> 'sfdc_performance')) and (oppty_id = 
'006400CZbnWAAT')) (type: boolean)
  Statistics: Num rows: 188939 Data size: 425257542 Basic stats: 
COMPLETE Column stats: NONE
  Reduce Output Operator
key expressions: '006400CZbnWAAT' (type: string), 
oppty_line_id (type: string)
sort order: ++
Map-reduce partition columns: '006400CZbnWAAT' (type: 
string), oppty_line_id (type: string)
Statistics: Num rows: 188939 Data size: 425257542 Basic stats: 
COMPLETE Column stats: NONE
  TableScan
alias: mooo_s
Statistics: Num rows: 1511511 Data size: 940228122 Basic stats: 
COMPLETE Column stats: NONE
Filter Operator
  predicate: ((oppty_id is not null and oppty_line_id is not null) 
and (oppty_id = '006400CZbnWAAT')) (type: boolean)
  Statistics: Num rows: 188939 Data size: 117528593 Basic stats: 
COMPLETE Column stats: NONE
  Reduce Output Operator
key expressions: '006400CZbnWAAT' (type: string), 
oppty_line_id (type: string)
sort order: ++
Map-reduce partition columns: '006400CZbnWAAT' (type: 
string), oppty_line_id (type: string)
Statistics: Num rows: 188939 Data size: 117528593 Basic stats: 
COMPLETE Column stats: NONE
  TableScan
alias: forecast
Statistics: Num rows: 29923099 Data size: 7723657280 Basic stats: 
COMPLETE Column stats: NONE
Filter Operator
  predicate: ((oppty_id is not null and oppty_line_id is not null) 
and (oppty_id = '006400CZbnWAAT')) (type: boolean)
  Statistics: Num rows: 3740387 Data size: 965457063 Basic stats: 
COMPLETE Column stats: NONE
  Reduce Output Operator
key expressions: '006400CZbnWAAT' (type: string), 
oppty_line_id (type: string)
sort order: ++
Map-reduce partition columns: '006400CZbnWAAT' (type: 
string), oppty_line_id (type: string)
Statistics: Num rows: 3740387 Data size: 965457063 Basic stats: 
COMPLETE Column stats: NONE
  TableScan
alias: split
Statistics: Num rows: 2072636 Data size: 524862652 Basic stats: 
COMPLETE Column stats: NONE
Filter Operator
  predicate: ((oppty_id is not null and oppty_line_id is not null) 
and (oppty_id = '006400CZbnWAAT')) (type: boolean)
  Statistics: Num rows: 259079 Data size: 65607704 Basic stats: 
COMPLETE Column stats: NONE
  Reduce Output Operator
key expressions: '006400CZbnWAAT' (type: string), 
oppty_line_id (type: string)
sort order: ++
Map-reduce partition columns: '006400CZbnWAAT' (type: 
string), oppty_line_id (type: string)
Statistics: Num rows: 259079 Data size: 65607704 Basic stats: 
COMPLETE Column stats: NONE
  Reduce Operator Tree:
Join Operator
  condition map:
   Inner Join 0 to 1
   Inner Join 0 to 2
   Inner Join 0 to 3
  condition expressions:
0
1
2
3
  Statistics: Num rows: 12343277 Data size: 3186008376 Basic stats: 
COMPLETE Column stats: NONE
  Select Operator
expressions: '006400CZbnWAAT' (type: string)
outputColumnNames: _col0
Statistics: Num rows: 12343277 Data size: 3186008376 Basic stats: 
COMPLETE Column stats: NONE
Group By Operator
  aggregations: count()
  keys: _col0 (type: string)
  mode: complete
  outputColumnNames: _col0, _col1
  Statistics: Num rows: 6171638 Data size: 1593004058 Basic stats: 
COMPLETE Column stats: NONE
  Select Operator
expressions: _col0 (type: string), _col1 (type: bigint)
outputColumnNames: _col0, _col1
Statistics: Num rows: 6171638 Data size: 1593004058 Basic 
stats: COMPLETE Column stats: NONE
File Output Operator
  compressed: 

[jira] [Commented] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources

2015-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603712#comment-14603712
 ] 

Hive QA commented on HIVE-10895:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742160/HIVE-10895.3.patch

{color:green}SUCCESS:{color} +1 9032 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4400/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4400/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4400/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742160 - PreCommit-HIVE-TRUNK-Build

 ObjectStore does not close Query objects in some calls, causing a potential 
 leak in some metastore db resources
 ---

 Key: HIVE-10895
 URL: https://issues.apache.org/jira/browse/HIVE-10895
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
Reporter: Takahiko Saito
Assignee: Aihua Xu
 Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch, 
 HIVE-10895.3.patch


 During testing, we've noticed Oracle db running out of cursors. Might be 
 related to this.





[jira] [Updated] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-26 Thread xiaowei wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaowei wang updated HIVE-10983:

Attachment: HIVE-10983.3.patch.txt

 SerDeUtils bug  ,when Text is reused 
 -

 Key: HIVE-10983
 URL: https://issues.apache.org/jira/browse/HIVE-10983
 Project: Hive
  Issue Type: Bug
  Components: API, CLI
Affects Versions: 0.14.0, 1.0.0, 1.2.0
 Environment: Hadoop 2.3.0-cdh5.0.0
 Hive 0.14
Reporter: xiaowei wang
Assignee: xiaowei wang
  Labels: patch
 Fix For: 0.14.1, 1.2.0

 Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt, 
 HIVE-10983.3.patch.txt


 {noformat}
 The method transformTextToUTF8 has a bug: it invokes a problematic method of 
 Text, getBytes().
 getBytes() returns the raw bytes; however, only data up to Text.length is 
 valid. A better way is to use copyBytes() if you need the returned array to be 
 precisely the length of the data, but copyBytes() was only added after Hadoop 1.
 {noformat}
 When I query data from an LZO table, I found that in the results the length of 
 the current row is always larger than that of the previous row, and sometimes 
 the current row contains the contents of the previous row. For example, I 
 executed a SQL statement:
 {code:sql}
 select *   from web_searchhub where logdate=2015061003
 {code}
 the result of the SQL is shown below. Notice that the second row's content 
 contains the first row's content.
 {noformat}
 INFO [03:00:05.589] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
 INFO [03:00:05.594] 18941e66-9962-44ad-81bc-3519f47ba274 
 session=901,thread=223ession=3151,thread=254 2015061003
 {noformat}
 The content of the original LZO file is shown below, just 2 rows.
 {noformat}
 INFO [03:00:05.635] b88e0473-7530-494c-82d8-e2d2ebd2666c_forweb 
 session=3148,thread=285
 INFO [03:00:05.635] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
 {noformat}
 I think this error is caused by Text reuse, and I found a solution.
 Additionally, the table create SQL is:
 {code:sql}
 CREATE EXTERNAL TABLE `web_searchhub`(
   `line` string)
 PARTITIONED BY (
   `logdate` string)
 ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\\U'
 WITH SERDEPROPERTIES (
   'serialization.encoding'='GBK')
 STORED AS INPUTFORMAT  com.hadoop.mapred.DeprecatedLzoTextInputFormat
   OUTPUTFORMAT 
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat;
 LOCATION
   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
 {code}
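The reuse problem can be reproduced with a stdlib-only stand-in (Text lives in hadoop-common, so a minimal `ReusableText` class mimicking its growable, reused buffer is used here; the names are ours, not Hadoop's):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TextReuseDemo {
    // Minimal stand-in for org.apache.hadoop.io.Text: a reused byte buffer
    // that grows but never shrinks, plus a valid-data length.
    static class ReusableText {
        byte[] bytes = new byte[0];
        int length;

        void set(String s) {
            byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
            if (utf8.length > bytes.length) {
                bytes = new byte[utf8.length]; // grow only
            }
            System.arraycopy(utf8, 0, bytes, 0, utf8.length);
            length = utf8.length;
        }

        byte[] getBytes() { return bytes; }                         // raw buffer
        byte[] copyBytes() { return Arrays.copyOf(bytes, length); } // exact length
    }

    public static void main(String[] args) {
        ReusableText t = new ReusableText();
        t.set("a long first row");
        t.set("row2");
        // getBytes() still carries the tail of the previous, longer row
        // (e.g. "row2ng first row"); copyBytes() honors the valid length.
        System.out.println(new String(t.getBytes(), StandardCharsets.UTF_8));
        System.out.println(new String(t.copyBytes(), StandardCharsets.UTF_8));
    }
}
```

This is exactly the symptom in the report: each decoded row is as long as the longest row seen so far, with the previous row's tail appended.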





[jira] [Commented] (HIVE-2998) Making Hive run on Windows Server and Windows Azure environment

2015-06-26 Thread nevi_me (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602616#comment-14602616
 ] 

nevi_me commented on HIVE-2998:
---

Will this be for Windows Server only?

 Making Hive run on Windows Server and Windows Azure environment
 ---

 Key: HIVE-2998
 URL: https://issues.apache.org/jira/browse/HIVE-2998
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.7.1, 0.8.1
 Environment: Windows Server 2008 R2 and Windows Azure
Reporter: Lengning Liu

 This is the master JIRA for improvements to Hive that would enable it to run 
 natively on Windows Server and Windows Azure environments.  Microsoft has 
 done the initial work here to have Hive (releases 0.7.1 and 0.8.1) running on 
 Windows and would like to contribute this work back to the community. The 
 end-to-end HiveQL query tests pass. We are currently working on investigating 
 failed unit test cases. It is expected that we post the initial patches 
 within a few weeks for review. Looking forward to the collaboration. Thanks!





[jira] [Commented] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-26 Thread xiaowei wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602617#comment-14602617
 ] 

xiaowei wang commented on HIVE-11095:
-

According to the suggestion of Chengxiang Li, I put up a new patch, 
HIVE-11095.2.patch.txt.

 SerDeUtils  another bug ,when Text is reused
 

 Key: HIVE-11095
 URL: https://issues.apache.org/jira/browse/HIVE-11095
 Project: Hive
  Issue Type: Bug
  Components: API, CLI
Affects Versions: 0.14.0, 1.0.0, 1.2.0
 Environment: Hadoop 2.3.0-cdh5.0.0
 Hive 0.14
Reporter: xiaowei wang
Assignee: xiaowei wang
 Fix For: 1.2.0

 Attachments: HIVE-11095.1.patch.txt, HIVE-11095.2.patch.txt


 {noformat}
 The method transformTextFromUTF8 has a bug: it invokes a problematic method of 
 Text, getBytes().
 getBytes() returns the raw bytes; however, only data up to Text.length is 
 valid. A better way is to use copyBytes() if you need the returned array to be 
 precisely the length of the data, but copyBytes() was only added after Hadoop 1.
 {noformat}
 How did I find this bug?
 When I query data from an LZO table, I found that in the results the length of 
 the current row is always larger than that of the previous row, and sometimes 
 the current row contains the contents of the previous row. For example, I 
 executed a SQL statement:
 {code:sql}
 select * from web_searchhub where logdate=2015061003
 {code}
 the result of the SQL is shown below. Notice that the second row's content 
 contains the first row's content.
 {noformat}
 INFO [03:00:05.589] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
 INFO [03:00:05.594] 18941e66-9962-44ad-81bc-3519f47ba274 
 session=901,thread=223ession=3151,thread=254 2015061003
 {noformat}
 The content of the original LZO file is shown below, just 2 rows.
 {noformat}
 INFO [03:00:05.635] b88e0473-7530-494c-82d8-e2d2ebd2666c_forweb 
 session=3148,thread=285
 INFO [03:00:05.635] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
 {noformat}
 I think this error is caused by Text reuse, and I found a solution.
 Additionally, the table create SQL is:
 {code:sql}
 CREATE EXTERNAL TABLE `web_searchhub`(
 `line` string)
 PARTITIONED BY (
 `logdate` string)
 ROW FORMAT DELIMITED
 FIELDS TERMINATED BY '\\U'
 WITH SERDEPROPERTIES (
 'serialization.encoding'='GBK')
 STORED AS INPUTFORMAT com.hadoop.mapred.DeprecatedLzoTextInputFormat
 OUTPUTFORMAT org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat;
 LOCATION
 'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
 {code}





[jira] [Updated] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-26 Thread xiaowei wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaowei wang updated HIVE-10983:

Attachment: HIVE-10983.4.patch.txt

 SerDeUtils bug  ,when Text is reused 
 -

 Key: HIVE-10983
 URL: https://issues.apache.org/jira/browse/HIVE-10983
 Project: Hive
  Issue Type: Bug
  Components: API, CLI
Affects Versions: 0.14.0, 1.0.0, 1.2.0
 Environment: Hadoop 2.3.0-cdh5.0.0
 Hive 0.14
Reporter: xiaowei wang
Assignee: xiaowei wang
  Labels: patch
 Fix For: 0.14.1, 1.2.0

 Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt, 
 HIVE-10983.3.patch.txt, HIVE-10983.4.patch.txt


 {noformat}
 The method transformTextToUTF8 has a bug: it invokes a problematic method of 
 Text, getBytes().
 getBytes() returns the raw bytes; however, only data up to Text.length is 
 valid. A better way is to use copyBytes() if you need the returned array to be 
 precisely the length of the data, but copyBytes() was only added after Hadoop 1.
 {noformat}
 When I query data from an LZO table, I found that in the results the length of 
 the current row is always larger than that of the previous row, and sometimes 
 the current row contains the contents of the previous row. For example, I 
 executed a SQL statement:
 {code:sql}
 select *   from web_searchhub where logdate=2015061003
 {code}
 the result of the SQL is shown below. Notice that the second row's content 
 contains the first row's content.
 {noformat}
 INFO [03:00:05.589] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
 INFO [03:00:05.594] 18941e66-9962-44ad-81bc-3519f47ba274 
 session=901,thread=223ession=3151,thread=254 2015061003
 {noformat}
 The content of the original LZO file is shown below, just 2 rows.
 {noformat}
 INFO [03:00:05.635] b88e0473-7530-494c-82d8-e2d2ebd2666c_forweb 
 session=3148,thread=285
 INFO [03:00:05.635] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
 {noformat}
 I think this error is caused by Text reuse, and I found a solution.
 Additionally, the table create SQL is:
 {code:sql}
 CREATE EXTERNAL TABLE `web_searchhub`(
   `line` string)
 PARTITIONED BY (
   `logdate` string)
 ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\\U'
 WITH SERDEPROPERTIES (
   'serialization.encoding'='GBK')
 STORED AS INPUTFORMAT  com.hadoop.mapred.DeprecatedLzoTextInputFormat
   OUTPUTFORMAT 
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat;
 LOCATION
   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
 {code}





[jira] [Commented] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-26 Thread xiaowei wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602596#comment-14602596
 ] 

xiaowei wang commented on HIVE-10983:
-

According to the suggestion of Chengxiang Li, I put up a new patch, 
HIVE-10983.4.patch.txt.

 SerDeUtils bug  ,when Text is reused 
 -

 Key: HIVE-10983
 URL: https://issues.apache.org/jira/browse/HIVE-10983
 Project: Hive
  Issue Type: Bug
  Components: API, CLI
Affects Versions: 0.14.0, 1.0.0, 1.2.0
 Environment: Hadoop 2.3.0-cdh5.0.0
 Hive 0.14
Reporter: xiaowei wang
Assignee: xiaowei wang
  Labels: patch
 Fix For: 0.14.1, 1.2.0

 Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt, 
 HIVE-10983.3.patch.txt, HIVE-10983.4.patch.txt


 {noformat}
 The method transformTextToUTF8 has a bug: it invokes a problematic method of 
 Text, getBytes().
 getBytes() returns the raw bytes; however, only data up to Text.length is 
 valid. A better way is to use copyBytes() if you need the returned array to be 
 precisely the length of the data, but copyBytes() was only added after Hadoop 1.
 {noformat}
 When I query data from an LZO table, I found that in the results the length of 
 the current row is always larger than that of the previous row, and sometimes 
 the current row contains the contents of the previous row. For example, I 
 executed a SQL statement:
 {code:sql}
 select *   from web_searchhub where logdate=2015061003
 {code}
 the result of the SQL is shown below. Notice that the second row's content 
 contains the first row's content.
 {noformat}
 INFO [03:00:05.589] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
 INFO [03:00:05.594] 18941e66-9962-44ad-81bc-3519f47ba274 
 session=901,thread=223ession=3151,thread=254 2015061003
 {noformat}
 The content of the original LZO file is shown below, just 2 rows.
 {noformat}
 INFO [03:00:05.635] b88e0473-7530-494c-82d8-e2d2ebd2666c_forweb 
 session=3148,thread=285
 INFO [03:00:05.635] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
 {noformat}
 I think this error is caused by Text reuse, and I found a solution.
 Additionally, the table create SQL is:
 {code:sql}
 CREATE EXTERNAL TABLE `web_searchhub`(
   `line` string)
 PARTITIONED BY (
   `logdate` string)
 ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\\U'
 WITH SERDEPROPERTIES (
   'serialization.encoding'='GBK')
 STORED AS INPUTFORMAT  com.hadoop.mapred.DeprecatedLzoTextInputFormat
   OUTPUTFORMAT 
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat;
 LOCATION
   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
 {code}





[jira] [Commented] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources

2015-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602603#comment-14602603
 ] 

Hive QA commented on HIVE-10895:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741934/HIVE-10895.3.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9029 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.TestAdminUser.testCreateAdminNAddUser
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4389/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4389/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4389/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741934 - PreCommit-HIVE-TRUNK-Build

 ObjectStore does not close Query objects in some calls, causing a potential 
 leak in some metastore db resources
 ---

 Key: HIVE-10895
 URL: https://issues.apache.org/jira/browse/HIVE-10895
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
Reporter: Takahiko Saito
Assignee: Aihua Xu
 Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch, 
 HIVE-10895.3.patch


 During testing, we've noticed Oracle db running out of cursors. Might be 
 related to this.





[jira] [Commented] (HIVE-11031) ORC concatenation of old files can fail while merging column statistics

2015-06-26 Thread Demeter Sztanko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602613#comment-14602613
 ] 

Demeter Sztanko commented on HIVE-11031:


Hello [~prasanth_j], my MR jobs are getting this error when concatenating ORC 
files:

{code}
java.io.IOException: java.io.IOException: java.lang.IndexOutOfBoundsException: 
Index: 0
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:226)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:136)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:230)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:210)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException: Index: 0
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:355)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:105)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:224)
... 11 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 0
at java.util.Collections$EmptyList.get(Collections.java:3212)
at 
org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeRecordReader.nextStripe(OrcFileStripeMergeRecordReader.java:82)
at 
org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeRecordReader.next(OrcFileStripeMergeRecordReader.java:71)
at 
org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeRecordReader.next(OrcFileStripeMergeRecordReader.java:31)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
... 15 more
2015-06-26 08:24:19,248 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
for the task
{code}

Is this failure a result of the bug described in this ticket, or could it be a 
different problem?

 ORC concatenation of old files can fail while merging column statistics
 ---

 Key: HIVE-11031
 URL: https://issues.apache.org/jira/browse/HIVE-11031
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical
 Fix For: 1.2.1

 Attachments: HIVE-11031-branch-1.0.patch, HIVE-11031.2.patch, 
 HIVE-11031.3.patch, HIVE-11031.4.patch, HIVE-11031.patch


 Column statistics in ORC are optional protobuf fields. Old ORC files might 
 not have statistics for newly added types like decimal, date, timestamp, etc. 
 But column statistics merging assumes column statistics exist for these 
 types and invokes merge. For example, merging of TimestampColumnStatistics 
 directly casts the received ColumnStatistics object without an instanceof 
 check. If the ORC file contains timestamp column statistics, this will 
 work; otherwise it will throw a ClassCastException.
 Also, the file merge operator swallows the exception.
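
A minimal sketch of the defensive merge described above (all type names here are hypothetical stand-ins, not the actual ORC classes): check `instanceof` before casting, so stripes from old files that lack a given statistics kind are skipped instead of triggering a ClassCastException.

```java
public class StatsMergeSketch {
    // Hypothetical stand-ins for the ORC statistics types discussed above.
    interface ColumnStatistics { }

    static class TimestampColumnStatistics implements ColumnStatistics {
        long min = Long.MAX_VALUE, max = Long.MIN_VALUE;

        void merge(TimestampColumnStatistics other) {
            min = Math.min(min, other.min);
            max = Math.max(max, other.max);
        }
    }

    // Returns false (skips the merge) when the incoming statistics object is
    // not of the expected type, e.g. a stripe from an old file without
    // timestamp statistics, rather than casting blindly.
    static boolean safeMerge(TimestampColumnStatistics target, ColumnStatistics other) {
        if (!(other instanceof TimestampColumnStatistics)) {
            return false; // old file: no timestamp stats, skip instead of cast
        }
        target.merge((TimestampColumnStatistics) other);
        return true;
    }

    public static void main(String[] args) {
        TimestampColumnStatistics a = new TimestampColumnStatistics();
        a.min = 1; a.max = 5;
        TimestampColumnStatistics b = new TimestampColumnStatistics();
        b.min = 0; b.max = 9;
        System.out.println(safeMerge(a, b));                        // merged
        System.out.println(safeMerge(a, new ColumnStatistics() {})); // skipped
        System.out.println(a.min + ".." + a.max);
    }
}
```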



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-26 Thread xiaowei wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaowei wang updated HIVE-11095:

Attachment: HIVE-11095.2.patch.txt

 SerDeUtils  another bug ,when Text is reused
 

 Key: HIVE-11095
 URL: https://issues.apache.org/jira/browse/HIVE-11095
 Project: Hive
  Issue Type: Bug
  Components: API, CLI
Affects Versions: 0.14.0, 1.0.0, 1.2.0
 Environment: Hadoop 2.3.0-cdh5.0.0
 Hive 0.14
Reporter: xiaowei wang
Assignee: xiaowei wang
 Fix For: 1.2.0

 Attachments: HIVE-11095.1.patch.txt, HIVE-11095.2.patch.txt


 {noformat}
 The method transformTextFromUTF8 has a bug: it invokes a problematic method of 
 Text, getBytes()!
 The method getBytes of Text returns the raw bytes; however, only data up to 
 Text.length is valid. A better way is to use copyBytes() if you need the 
 returned array to be precisely the length of the data.
 But copyBytes() was only added after Hadoop 1. 
 {noformat}
 How I found this bug?
 When I query data from an LZO table, I found in the results that the length of 
 the current row is always larger than that of the previous row, and sometimes 
 the current row contains the contents of the previous row. For example, I execute a SQL query,
 {code:sql}
 select * from web_searchhub where logdate=2015061003
 {code}
 The result of the SQL is shown below. Notice that the second row's content contains 
 the first row's content.
 {noformat}
 INFO [03:00:05.589] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
 INFO [03:00:05.594] 18941e66-9962-44ad-81bc-3519f47ba274 
 session=901,thread=223ession=3151,thread=254 2015061003
 {noformat}
 The content of the original LZO file is shown below, just 2 rows.
 {noformat}
 INFO [03:00:05.635] b88e0473-7530-494c-82d8-e2d2ebd2666c_forweb 
 session=3148,thread=285
 INFO [03:00:05.635] HttpFrontServer::FrontSH 
 msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
 {noformat}
 I think this error is caused by the Text reuse, and I found a solution.
 Additionally, the table create SQL is: 
 {code:sql}
 CREATE EXTERNAL TABLE `web_searchhub`(
 `line` string)
 PARTITIONED BY (
 `logdate` string)
 ROW FORMAT DELIMITED
 FIELDS TERMINATED BY '
 U'
 WITH SERDEPROPERTIES (
 'serialization.encoding'='GBK')
 STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
 OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
 LOCATION
 'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
 {code}
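
The reuse hazard can be illustrated without Hadoop. This is a minimal sketch (the `ReusableText` class is a hypothetical stand-in for `org.apache.hadoop.io.Text`) showing why reading the raw backing array leaks the tail of the previous, longer record, while a length-bounded copy does not:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReuseDemo {
    // Hypothetical stand-in for org.apache.hadoop.io.Text: a reusable buffer
    // whose backing array may be longer than the currently valid data.
    static class ReusableText {
        byte[] bytes = new byte[0];
        int length;

        void set(String s) {
            byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
            if (utf8.length > bytes.length) {
                bytes = new byte[utf8.length]; // buffer grows, never shrinks
            }
            System.arraycopy(utf8, 0, bytes, 0, utf8.length);
            length = utf8.length;
        }

        // Like Text.getBytes(): raw array, may contain a stale tail.
        byte[] getBytes() { return bytes; }

        // Like Text.copyBytes(): exactly `length` valid bytes.
        byte[] copyBytes() { return Arrays.copyOf(bytes, length); }
    }

    public static void main(String[] args) {
        ReusableText t = new ReusableText();
        t.set("a long first record");
        t.set("short");
        // Bug pattern: decoding the raw array keeps bytes of the previous record.
        System.out.println(new String(t.getBytes(), StandardCharsets.UTF_8));
        // Correct: decode only the valid prefix.
        System.out.println(new String(t.copyBytes(), StandardCharsets.UTF_8));
    }
}
```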



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)

2015-06-26 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko updated HIVE-11055:
--
Attachment: HIVE-11055.2.patch

 HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
 ---

 Key: HIVE-11055
 URL: https://issues.apache.org/jira/browse/HIVE-11055
 Project: Hive
  Issue Type: Improvement
Reporter: Dmitry Tolpeko
Assignee: Dmitry Tolpeko
 Attachments: HIVE-11055.1.patch, HIVE-11055.2.patch


 There is PL/HQL tool (www.plhql.org) that implements procedural SQL for Hive 
 (actually any SQL-on-Hadoop implementation and any JDBC source).
 Alan Gates offered to contribute it to Hive under HPL/SQL name 
 (org.apache.hive.hplsql package). This JIRA is to create a patch to 
 contribute  the PL/HQL code. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11055) HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)

2015-06-26 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602626#comment-14602626
 ] 

Dmitry Tolpeko commented on HIVE-11055:
---

HIVE-11055.2.patch created:

Updates:

1) Modified Hive pom.xml, added <module>hplsql</module> to build the HPL/SQL tool
2) Added hplsql/pom.xml
3) Added bin/hplsql and bin/ext/hplsql.sh to run the tool from shell
4) Added bin/hplsql.cmd for Windows.

Open issues:

1) The tool depends on antlr-runtime-4.5.jar, which needs to be put into $HIVE_LIB
2) The tool uses the plhql-site.xml configuration file, which needs to be distributed 
as well




 HPL/SQL - Implementing Procedural SQL in Hive (PL/HQL Contribution)
 ---

 Key: HIVE-11055
 URL: https://issues.apache.org/jira/browse/HIVE-11055
 Project: Hive
  Issue Type: Improvement
Reporter: Dmitry Tolpeko
Assignee: Dmitry Tolpeko
 Attachments: HIVE-11055.1.patch, HIVE-11055.2.patch


 There is PL/HQL tool (www.plhql.org) that implements procedural SQL for Hive 
 (actually any SQL-on-Hadoop implementation and any JDBC source).
 Alan Gates offered to contribute it to Hive under HPL/SQL name 
 (org.apache.hive.hplsql package). This JIRA is to create a patch to 
 contribute  the PL/HQL code. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11131) Get row information on DataWritableWriter once for better writing performance

2015-06-26 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11131:
---
Attachment: HIVE-11131.2.patch

 Get row information on DataWritableWriter once for better writing performance
 -

 Key: HIVE-11131
 URL: https://issues.apache.org/jira/browse/HIVE-11131
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 1.2.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-11131.1.patch, HIVE-11131.2.patch


 DataWritableWriter is a class used to write Hive records to Parquet files. 
 This class is getting all the information about how to parse a record, such 
 as schema and object inspector, every time a record is written (or write() is 
 called).
 We can make this class perform better by initializing some writers per data
 type once, and saving all object inspectors on each writer.
 The class expects that the next records written will have the same object 
 inspectors and schema, so there is no need to have conditions for that. When 
 a new schema is written, DataWritableWriter is created again by Parquet. 
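
The caching idea can be sketched in miniature (names hypothetical, not the actual DataWritableWriter API): resolve one writer per column from the schema once, then run only the per-value code on the hot write path.

```java
import java.util.List;

public class CachedWriterDemo {
    // Hypothetical per-column writer, resolved once per schema.
    interface ValueWriter { void write(Object value); }

    // Type dispatch happens here, once per column, not once per record.
    static ValueWriter writerFor(String type, StringBuilder out) {
        switch (type) {
            case "int":    return v -> out.append((Integer) v).append(';');
            case "string": return v -> out.append((String) v).append(';');
            default: throw new IllegalArgumentException("unsupported: " + type);
        }
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        List<String> schema = List.of("int", "string");

        // One-time setup: build the writer chain from the schema.
        ValueWriter[] writers = new ValueWriter[schema.size()];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = writerFor(schema.get(i), out);
        }

        // Hot path: no schema or inspector lookups per record.
        Object[][] records = { {1, "a"}, {2, "b"} };
        for (Object[] record : records) {
            for (int i = 0; i < writers.length; i++) {
                writers[i].write(record[i]);
            }
        }
        System.out.println(out); // 1;a;2;b;
    }
}
```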



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11131) Get row information on DataWritableWriter once for better writing performance

2015-06-26 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11131:
---
Attachment: (was: HIVE-11131.1.patch)

 Get row information on DataWritableWriter once for better writing performance
 -

 Key: HIVE-11131
 URL: https://issues.apache.org/jira/browse/HIVE-11131
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 1.2.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-11131.2.patch


 DataWritableWriter is a class used to write Hive records to Parquet files. 
 This class is getting all the information about how to parse a record, such 
 as schema and object inspector, every time a record is written (or write() is 
 called).
 We can make this class perform better by initializing some writers per data
 type once, and saving all object inspectors on each writer.
 The class expects that the next records written will have the same object 
 inspectors and schema, so there is no need to have conditions for that. When 
 a new schema is written, DataWritableWriter is created again by Parquet. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11123) Fix how to confirm the RDBMS product name at Metastore.

2015-06-26 Thread Shinichi Yamashita (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shinichi Yamashita updated HIVE-11123:
--
Attachment: HIVE-11123.1.patch

I attach a patch file.

 Fix how to confirm the RDBMS product name at Metastore.
 ---

 Key: HIVE-11123
 URL: https://issues.apache.org/jira/browse/HIVE-11123
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.2.0
 Environment: PostgreSQL
Reporter: Shinichi Yamashita
Assignee: Shinichi Yamashita
 Attachments: HIVE-11123.1.patch


 I use PostgreSQL for the Hive Metastore, and I saw the following messages in the 
 PostgreSQL log.
 {code}
  2015-06-26 10:58:15.488 JST ERROR:  syntax error at or near @@ at 
 character 5
  2015-06-26 10:58:15.488 JST STATEMENT:  SET @@session.sql_mode=ANSI_QUOTES
  2015-06-26 10:58:15.489 JST ERROR:  relation v$instance does not exist 
 at character 21
  2015-06-26 10:58:15.489 JST STATEMENT:  SELECT version FROM v$instance
  2015-06-26 10:58:15.490 JST ERROR:  column version does not exist at 
 character 10
  2015-06-26 10:58:15.490 JST STATEMENT:  SELECT @@version
 {code}
 When Hive CLI or Beeline in embedded mode is run, these messages are output 
 to the PostgreSQL log.
 These queries are issued from MetaStoreDirectSql#determineDbType. If we 
 use MetaStoreDirectSql#getProductName instead, we do not need to issue these queries.
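
The suggested approach amounts to classifying the database from the JDBC driver's reported product name rather than probing with vendor-specific SQL such as `SELECT @@version` (MySQL) or `SELECT version FROM v$instance` (Oracle). A hedged sketch (enum and method names hypothetical, not the actual Hive patch):

```java
public class DbTypeProbe {
    enum DbType { MYSQL, ORACLE, POSTGRES, DERBY, OTHER }

    // In real code the input string would come from
    // connection.getMetaData().getDatabaseProductName(), which requires no
    // vendor-specific query against the database.
    static DbType fromProductName(String product) {
        String p = product == null ? "" : product.toLowerCase();
        if (p.contains("mysql"))      return DbType.MYSQL;
        if (p.contains("oracle"))     return DbType.ORACLE;
        if (p.contains("postgresql")) return DbType.POSTGRES;
        if (p.contains("derby"))      return DbType.DERBY;
        return DbType.OTHER;
    }

    public static void main(String[] args) {
        System.out.println(fromProductName("PostgreSQL")); // POSTGRES
        System.out.println(fromProductName("MySQL"));      // MYSQL
        System.out.println(fromProductName(null));         // OTHER
    }
}
```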



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11130) Refactoring the code so that HiveTxnManager interface will support lock/unlock table/database object

2015-06-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603929#comment-14603929
 ] 

Hive QA commented on HIVE-11130:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742216/HIVE-11130.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9027 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4403/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4403/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4403/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742216 - PreCommit-HIVE-TRUNK-Build

 Refactoring the code so that HiveTxnManager interface will support 
 lock/unlock table/database object
 

 Key: HIVE-11130
 URL: https://issues.apache.org/jira/browse/HIVE-11130
 Project: Hive
  Issue Type: Sub-task
  Components: Locking
Affects Versions: 2.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-11130.patch


 This is just a refactoring step which keeps the current logic, but it exposes 
 explicit lock/unlock of table and database objects in HiveTxnManager, which should 
 be implemented differently by the subclasses (currently it is not; e.g., for the 
 ZooKeeper implementation, we should lock both the table and the database when we 
 try to lock the table).
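
The shape of the refactoring can be sketched as follows (interface and method names are hypothetical, not the actual HiveTxnManager API): the interface exposes explicit table/database lock operations, and a ZooKeeper-style subclass acquires the enclosing database lock before the table lock.

```java
public class TxnManagerSketch {
    // Hypothetical slice of the transaction-manager interface.
    interface TxnLockManager {
        void lockDatabase(String db);
        void unlockDatabase(String db);
        void lockTable(String db, String table);
        void unlockTable(String db, String table);
    }

    // A ZooKeeper-style implementation might take the parent (database) lock
    // first, then the table lock, releasing in the reverse order.
    static class ZkStyleTxnManager implements TxnLockManager {
        final StringBuilder log = new StringBuilder(); // records lock order

        public void lockDatabase(String db)   { log.append("lockDb(").append(db).append(");"); }
        public void unlockDatabase(String db) { log.append("unlockDb(").append(db).append(");"); }

        public void lockTable(String db, String table) {
            lockDatabase(db); // parent lock first
            log.append("lockTable(").append(db).append('.').append(table).append(");");
        }

        public void unlockTable(String db, String table) {
            log.append("unlockTable(").append(db).append('.').append(table).append(");");
            unlockDatabase(db); // release parent last
        }
    }

    public static void main(String[] args) {
        ZkStyleTxnManager tm = new ZkStyleTxnManager();
        tm.lockTable("default", "t1");
        tm.unlockTable("default", "t1");
        System.out.println(tm.log);
    }
}
```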



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-7150) FileInputStream is not closed in HiveConnection#getHttpClient()

2015-06-26 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov reassigned HIVE-7150:
-

Assignee: Alexander Pivovarov

 FileInputStream is not closed in HiveConnection#getHttpClient()
 ---

 Key: HIVE-7150
 URL: https://issues.apache.org/jira/browse/HIVE-7150
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Alexander Pivovarov
  Labels: jdbc
 Attachments: HIVE-7150.1.patch, HIVE-7150.2.patch


 Here is related code:
 {code}
 sslTrustStore.load(new FileInputStream(sslTrustStorePath),
 sslTrustStorePassword.toCharArray());
 {code}
 The FileInputStream is not closed upon returning from the method.
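
The usual fix for this pattern (a sketch, not necessarily the actual HIVE-7150 patch) is to wrap the stream in try-with-resources so it is closed even when `KeyStore.load` throws:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyStore;

public class TrustStoreLoad {
    // The stream is closed automatically on both the success and failure paths.
    static KeyStore loadTrustStore(String path, char[] password)
            throws IOException, GeneralSecurityException {
        KeyStore sslTrustStore = KeyStore.getInstance(KeyStore.getDefaultType());
        try (InputStream in = new FileInputStream(path)) {
            sslTrustStore.load(in, password);
        }
        return sslTrustStore;
    }

    public static void main(String[] args) throws Exception {
        // Demo only: create an empty keystore file, then reload it.
        File f = File.createTempFile("truststore", ".tmp");
        KeyStore empty = KeyStore.getInstance(KeyStore.getDefaultType());
        empty.load(null, null);
        try (FileOutputStream out = new FileOutputStream(f)) {
            empty.store(out, "changeit".toCharArray());
        }
        KeyStore loaded = loadTrustStore(f.getPath(), "changeit".toCharArray());
        System.out.println(loaded.size());
    }
}
```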



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


  1   2   >