[jira] [Updated] (HIVE-7890) SessionStart creates HMS Client while not impersonating

2014-08-26 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7890:
---

Description: 
In SessionState.start [an instance of the HMSClient is 
created|https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L367].
 When impersonation is enabled, this call does not occur within a "doAs" call 
and thus the HMSClient is created as the server user, not the impersonated user.

Thus calls to the HMS are made by the "hive" user as opposed to the end user. 
This causes file ownership, such as the owner of a database directory, to be incorrect.

  was:In SessionStart an instance of the HMSClient is created. When 
impersonation is enabled, this call does not occur within a "doAs" call and 
thus the HMSClient is created as the server user, not the impersonated user.


> SessionStart creates HMS Client while not impersonating
> ---
>
> Key: HIVE-7890
> URL: https://issues.apache.org/jira/browse/HIVE-7890
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
> Attachments: HIVE-7890.2.patch
>
>
> In SessionState.start [an instance of the HMSClient is 
> created|https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L367].
>  When impersonation is enabled, this call does not occur within a "doAs" call 
> and thus the HMSClient is created as the server user, not the impersonated 
> user.
> Thus calls to the HMS are made by the "hive" user as opposed to the end user. 
> This causes file ownership, such as the owner of a database directory, to be incorrect.
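The creation-order problem described above can be illustrated with a minimal, self-contained sketch. All names here (CURRENT_USER, MetaStoreClient, doAs) are illustrative stand-ins, not Hive or Hadoop APIs; in the real code the fix would wrap the metastore client creation in UserGroupInformation.doAs so it runs as the impersonated user.

```java
import java.util.concurrent.Callable;

// Simplified model of the impersonation gap: the client captures whichever
// user is "current" at construction time, so creating it outside doAs binds
// it to the server user ("hive") instead of the end user.
public class DoAsSketch {
    // Stand-in for the security layer's notion of the current user.
    static final ThreadLocal<String> CURRENT_USER =
        ThreadLocal.withInitial(() -> "hive"); // server principal

    static class MetaStoreClient {
        final String owner = CURRENT_USER.get(); // captured at creation
    }

    // Analogue of running an action as another user (doAs).
    static <T> T doAs(String user, Callable<T> action) throws Exception {
        String previous = CURRENT_USER.get();
        CURRENT_USER.set(user);
        try {
            return action.call();
        } finally {
            CURRENT_USER.set(previous);
        }
    }

    public static void main(String[] args) throws Exception {
        MetaStoreClient outside = new MetaStoreClient();              // the bug
        MetaStoreClient inside = doAs("alice", MetaStoreClient::new); // the fix
        // outside.owner is "hive", inside.owner is "alice"
        System.out.println(outside.owner + " vs " + inside.owner);
    }
}
```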



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7881) enable Qtest scriptfile1.q [Spark Branch]

2014-08-26 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-7881:


Status: Patch Available  (was: Open)

> enable Qtest scriptfile1.q [Spark Branch]
> 
>
> Key: HIVE-7881
> URL: https://issues.apache.org/jira/browse/HIVE-7881
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M1
> Attachments: HIVE-7881.1-spark.patch
>
>
> scriptfile1.q failed because the script file was not found; we should verify 
> whether the script file is added to the SparkContext.





[jira] [Updated] (HIVE-7890) SessionState creates HMS Client while not impersonating

2014-08-26 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7890:
---

Attachment: HIVE-7890.2.patch

> SessionState creates HMS Client while not impersonating
> ---
>
> Key: HIVE-7890
> URL: https://issues.apache.org/jira/browse/HIVE-7890
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
> Attachments: HIVE-7890.2.patch
>
>
> In SessionState.start [an instance of the HMSClient is 
> created|https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L367].
>  When impersonation is enabled, this call does not occur within a "doAs" call 
> and thus the HMSClient is created as the server user, not the impersonated 
> user.
> Thus calls to the HMS are made by the "hive" user as opposed to the end user. 
> This causes file ownership, such as the owner of a database directory, to be incorrect.





[jira] [Updated] (HIVE-7890) SessionState creates HMS Client while not impersonating

2014-08-26 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7890:
---

Assignee: Brock Noland
  Status: Patch Available  (was: Open)

> SessionState creates HMS Client while not impersonating
> ---
>
> Key: HIVE-7890
> URL: https://issues.apache.org/jira/browse/HIVE-7890
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-7890.2.patch
>
>
> In SessionState.start [an instance of the HMSClient is 
> created|https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L367].
>  When impersonation is enabled, this call does not occur within a "doAs" call 
> and thus the HMSClient is created as the server user, not the impersonated 
> user.
> Thus calls to the HMS are made by the "hive" user as opposed to the end user. 
> This causes file ownership, such as the owner of a database directory, to be incorrect.





[jira] [Updated] (HIVE-7890) SessionState creates HMS Client while not impersonating

2014-08-26 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7890:
---

Summary: SessionState creates HMS Client while not impersonating  (was: 
SessionStart creates HMS Client while not impersonating)

> SessionState creates HMS Client while not impersonating
> ---
>
> Key: HIVE-7890
> URL: https://issues.apache.org/jira/browse/HIVE-7890
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
> Attachments: HIVE-7890.2.patch
>
>
> In SessionState.start [an instance of the HMSClient is 
> created|https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L367].
>  When impersonation is enabled, this call does not occur within a "doAs" call 
> and thus the HMSClient is created as the server user, not the impersonated 
> user.
> Thus calls to the HMS are made by the "hive" user as opposed to the end user. 
> This causes file ownership, such as the owner of a database directory, to be incorrect.





[jira] [Updated] (HIVE-7881) enable Qtest scriptfile1.q [Spark Branch]

2014-08-26 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-7881:


Attachment: HIVE-7881.1-spark.patch

ScriptOperator tries to find the script file in the system PATH and the user's 
working directory. In Spark local mode, added files are downloaded to the Spark 
root directory, so this patch adds the Spark root directory to the 
ScriptOperator search path while in Spark mode.
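The lookup described above can be sketched as a simple first-match search over an ordered list of directories. The names here are illustrative, not Hive's actual ScriptOperator API; the proposed change amounts to appending the Spark root directory to the list when running in Spark mode.

```java
import java.io.File;
import java.util.List;

// Sketch of the script lookup: try each candidate directory in order and
// return the first location where the script file actually exists.
public class ScriptResolver {
    static File resolve(String scriptName, List<File> searchDirs) {
        for (File dir : searchDirs) {
            File candidate = new File(dir, scriptName);
            if (candidate.isFile()) {
                return candidate; // first match wins
            }
        }
        return null; // not found in any search directory
    }
}
```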

> enable Qtest scriptfile1.q [Spark Branch]
> 
>
> Key: HIVE-7881
> URL: https://issues.apache.org/jira/browse/HIVE-7881
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M1
> Attachments: HIVE-7881.1-spark.patch
>
>
> scriptfile1.q failed because the script file was not found; we should verify 
> whether the script file is added to the SparkContext.





[jira] [Commented] (HIVE-7764) Support all JDBC-HiveServer2 authentication modes on a secure cluster

2014-08-26 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111921#comment-14111921
 ] 

Lefty Leverenz commented on HIVE-7764:
--

Does this need to be documented in the wiki?  (If so, where?)

> Support all JDBC-HiveServer2 authentication modes on a secure cluster
> -
>
> Key: HIVE-7764
> URL: https://issues.apache.org/jira/browse/HIVE-7764
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-7764.1.patch, HIVE-7764.2.patch
>
>
> Currently, HiveServer2 logs in with its keytab only if 
> hive.server2.authentication is set to KERBEROS. However, 
> hive.server2.authentication is the config that determines the auth type an 
> end user will use while authenticating with HiveServer2. There is a valid use 
> case of a user authenticating with HiveServer2 using LDAP, for example, while 
> HiveServer2 runs the query on a kerberized cluster.





[jira] [Commented] (HIVE-7775) enable sample8.q.[Spark Branch]

2014-08-26 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111915#comment-14111915
 ] 

Lefty Leverenz commented on HIVE-7775:
--

Fix Version should be spark-branch, not 0.14.0.

> enable sample8.q.[Spark Branch]
> ---
>
> Key: HIVE-7775
> URL: https://issues.apache.org/jira/browse/HIVE-7775
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
> Fix For: 0.14.0
>
> Attachments: HIVE-7775.1-spark.patch, HIVE-7775.2-spark.patch
>
>
> sample8.q contains a join query; we should enable this qtest after Hive on 
> Spark supports the join operation.





[jira] [Commented] (HIVE-7730) Extend ReadEntity to add accessed columns from query

2014-08-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111907#comment-14111907
 ] 

Hive QA commented on HIVE-7730:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12664542/HIVE-7730.004.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6115 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/519/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/519/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-519/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12664542

> Extend ReadEntity to add accessed columns from query
> 
>
> Key: HIVE-7730
> URL: https://issues.apache.org/jira/browse/HIVE-7730
> Project: Hive
>  Issue Type: Bug
>Reporter: Xiaomeng Huang
> Attachments: HIVE-7730.001.patch, HIVE-7730.002.patch, 
> HIVE-7730.003.patch, HIVE-7730.004.patch
>
>
> -Now what we get from HiveSemanticAnalyzerHookContextImpl is limited. If we 
> have a hook of HiveSemanticAnalyzerHook, we may want to get more things from 
> hookContext (e.g. the needed columns from the query).-
> -So we should get an instance of HiveSemanticAnalyzerHookContext from 
> configuration, extend HiveSemanticAnalyzerHookContext with a new 
> implementation, override HiveSemanticAnalyzerHookContext.update(), and put 
> what you want into the class.-
> Hive should store accessed columns in ReadEntity when 
> HIVE_STATS_COLLECT_SCANCOLS (or a confVar we could add) is set to true.
> Then an external authorization model can get the accessed columns when doing 
> authorization at compile time, before execution. Maybe we will remove 
> columnAccessInfo from BaseSemanticAnalyzer; the old authorization and 
> AuthorizationModeV2 can get accessed columns from ReadEntity too.
> Here is a quick implementation in SemanticAnalyzer.analyzeInternal() below:
> {code}
> boolean isColumnInfoNeedForAuth = SessionState.get().isAuthorizationModeV2()
>     && HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_AUTHORIZATION_ENABLED);
> if (isColumnInfoNeedForAuth
>     || HiveConf.getBoolVar(this.conf, HiveConf.ConfVars.HIVE_STATS_COLLECT_SCANCOLS)) {
>   ColumnAccessAnalyzer columnAccessAnalyzer = new ColumnAccessAnalyzer(pCtx);
>   setColumnAccessInfo(columnAccessAnalyzer.analyzeColumnAccess());
> }
> compiler.compile(pCtx, rootTasks, inputs, outputs);
> // TODO: after compile, we can put the accessed column list into ReadEntity,
> // obtained from columnAccessInfo, if HIVE_AUTHORIZATION_ENABLED is set to true
> {code}





[jira] [Updated] (HIVE-7876) further improve the columns stats update speed for all the partitions of a table

2014-08-26 Thread pengcheng xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pengcheng xiong updated HIVE-7876:
--

Attachment: HIVE-7876.3.patch

Further reduces write-path time: set partition=null; now it just reads the 
stats, then inserts the stats (if null) or updates the stats (if not null).
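The read-then-insert-or-update flow in that comment can be sketched generically. All names here are illustrative; the map stands in for the metastore's column-stats store, where the insert/update distinction corresponds to issuing an INSERT versus an UPDATE.

```java
import java.util.HashMap;
import java.util.Map;

// Generic sketch: a single read decides whether the write path performs an
// insert or an update, instead of a separate existence check per entry.
public class StatsUpsert {
    final Map<String, Long> stats = new HashMap<>();

    /** Returns which operation was performed, mirroring INSERT vs UPDATE. */
    String upsert(String column, long value) {
        Long existing = stats.get(column); // read stats once
        stats.put(column, value);
        return existing == null ? "insert" : "update";
    }
}
```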

> further improve the columns stats update speed for all the partitions of a 
> table
> 
>
> Key: HIVE-7876
> URL: https://issues.apache.org/jira/browse/HIVE-7876
> Project: Hive
>  Issue Type: Improvement
>Reporter: pengcheng xiong
>Assignee: pengcheng xiong
>Priority: Minor
> Attachments: HIVE-7876.2.patch, HIVE-7876.3.patch
>
>
> The previous solution, https://issues.apache.org/jira/browse/HIVE-7736, 
> is not sufficient when there are too many columns/partitions.
> The user will encounter 
> org.apache.thrift.transport.TTransportException: 
> java.net.SocketTimeoutException: Read timed out
> We try to remove more of the transaction overhead.





[jira] [Updated] (HIVE-7876) further improve the columns stats update speed for all the partitions of a table

2014-08-26 Thread pengcheng xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pengcheng xiong updated HIVE-7876:
--

Status: Patch Available  (was: Open)

Further reduces write-path time: set partition=null; now it just reads the 
stats, then inserts the stats (if null) or updates the stats (if not null).

> further improve the columns stats update speed for all the partitions of a 
> table
> 
>
> Key: HIVE-7876
> URL: https://issues.apache.org/jira/browse/HIVE-7876
> Project: Hive
>  Issue Type: Improvement
>Reporter: pengcheng xiong
>Assignee: pengcheng xiong
>Priority: Minor
> Attachments: HIVE-7876.2.patch, HIVE-7876.3.patch
>
>
> The previous solution, https://issues.apache.org/jira/browse/HIVE-7736, 
> is not sufficient when there are too many columns/partitions.
> The user will encounter 
> org.apache.thrift.transport.TTransportException: 
> java.net.SocketTimeoutException: Read timed out
> We try to remove more of the transaction overhead.





Re: Review Request 25047: further improve the columns stats update speed for all the partitions of a table

2014-08-26 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25047/
---

(Updated Aug. 27, 2014, 5:55 a.m.)


Review request for hive.


Changes
---

Further reduces write-path time: set partition=null; now it just reads the 
stats, then inserts the stats (if null) or updates the stats (if not null).


Repository: hive-git


Description
---

The previous solution, https://issues.apache.org/jira/browse/HIVE-7736, 
is not sufficient when there are too many columns/partitions.
The user will encounter
org.apache.thrift.transport.TTransportException: 
java.net.SocketTimeoutException: Read timed out
We try to remove more of the transaction overhead.


Diffs (updated)
-

  metastore/bin/.gitignore 0dd9890 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
06d7595 
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 0693039 
  metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java e435d69 
  
metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
 3847d99 
  
metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
 981fa1a 
  ql/.gitignore 916e17c 

Diff: https://reviews.apache.org/r/25047/diff/


Testing
---


Thanks,

pengcheng xiong



Re: Review Request 15449: session/operation timeout for hiveserver2

2014-08-26 Thread Lefty Leverenz

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15449/#review51637
---



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


Just curious:  for all 3 parameters, if the units are msec by default (as 
hive-default.xml.template said) then why are the default values shown as "0s" 
-- doesn't that mean 0 seconds, or is "s" something else?



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


The parameter description needs the time unit information that used to be 
in hive-default.xml.template:  "Accepts a numeric value which is msec by 
default but also can be used with other time units appended (sec, min, hour, 
day)."



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


Description needs unit information that was in hive-default.xml.template:  
"Accepts a numeric value which is msec by default but also can be used with 
other time units appended (sec, min, hour, day)."



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


Description needs unit information that was in hive-default.xml.template:  
"Accepts a numeric value which is msec by default but also can be used with 
other time units appended (sec, min, hour, day)."


- Lefty Leverenz
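The convention these review comments describe (a bare number means milliseconds; a unit suffix such as sec, min, hour, or day may be appended, so "0s" and "0" both denote zero) can be sketched as follows. This is an illustrative parser, not HiveConf's actual implementation.

```java
import java.util.concurrent.TimeUnit;

// Illustrative parser for values like "0s", "30min", "500" (bare = msec).
public class TimeVarParser {
    static long toMillis(String value) {
        String v = value.trim().toLowerCase();
        int i = 0;
        while (i < v.length() && Character.isDigit(v.charAt(i))) i++;
        long n = Long.parseLong(v.substring(0, i)); // numeric part
        String unit = v.substring(i).trim();        // optional unit suffix
        switch (unit) {
            case "": case "ms": case "msec":
                return n; // msec by default
            case "s": case "sec":
                return TimeUnit.SECONDS.toMillis(n);
            case "min":
                return TimeUnit.MINUTES.toMillis(n);
            case "hour":
                return TimeUnit.HOURS.toMillis(n);
            case "day":
                return TimeUnit.DAYS.toMillis(n);
            default:
                throw new IllegalArgumentException("Unknown time unit: " + unit);
        }
    }
}
```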


On Aug. 27, 2014, 4:42 a.m., Navis Ryu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15449/
> ---
> 
> (Updated Aug. 27, 2014, 4:42 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-5799
> https://issues.apache.org/jira/browse/HIVE-5799
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Need some timeout facility for preventing resource leakages from unstable or 
> bad clients.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7f4afd9 
>   common/src/java/org/apache/hadoop/hive/conf/Validator.java cea9c41 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2SessionTimeout.java
>  PRE-CREATION 
>   service/src/java/org/apache/hive/service/cli/OperationState.java 3e15f0c 
>   service/src/java/org/apache/hive/service/cli/operation/Operation.java 
> 45fbd61 
>   
> service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
> 21c33bc 
>   service/src/java/org/apache/hive/service/cli/session/HiveSession.java 
> 9785e95 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 
> eee1cc6 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> bc0a02c 
>   
> service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
>  39d2184 
>   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
> d573592 
> 
> Diff: https://reviews.apache.org/r/15449/diff/
> 
> 
> Testing
> ---
> 
> Confirmed in the local environment.
> 
> 
> Thanks,
> 
> Navis Ryu
> 
>



[jira] [Created] (HIVE-7890) SessionStart creates HMS Client while not impersonating

2014-08-26 Thread Brock Noland (JIRA)
Brock Noland created HIVE-7890:
--

 Summary: SessionStart creates HMS Client while not impersonating
 Key: HIVE-7890
 URL: https://issues.apache.org/jira/browse/HIVE-7890
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland


In SessionStart an instance of the HMSClient is created. When impersonation 
is enabled, this call does not occur within a "doAs" call and thus the 
HMSClient is created as the server user, not the impersonated user.





[jira] [Commented] (HIVE-7889) Query fails with char partition column

2014-08-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111864#comment-14111864
 ] 

Szehon Ho commented on HIVE-7889:
-

Got it, I took a look. It's very weird code; some of the other 
JavaObjInspectors do this kind of thing too. Java is pass-by-value, so you 
can't change the caller's reference to 'o' like that; the assignment does 
nothing useful, unless I'm misunderstanding something.

Let's go with the first patch then, if that is ok. I left a minor comment on 
the rb.
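The pass-by-value point can be demonstrated in isolation with plain Java, no Hive types involved: reassigning a parameter only changes the method's local copy of the reference, while mutating the shared object is visible to the caller.

```java
// Minimal demonstration: Java passes object references by value, so
// reassigning a parameter never changes the caller's variable.
public class PassByValue {
    static void reassign(StringBuilder o) {
        o = new StringBuilder("replaced"); // only the local copy changes
    }

    static void mutate(StringBuilder o) {
        o.append("!"); // mutating the shared object IS visible to the caller
    }

    public static void main(String[] args) {
        StringBuilder s = new StringBuilder("original");
        reassign(s);
        System.out.println(s); // still "original"
        mutate(s);
        System.out.println(s); // "original!"
    }
}
```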

> Query fails with char partition column
> --
>
> Key: HIVE-7889
> URL: https://issues.apache.org/jira/browse/HIVE-7889
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-7889.1.patch, HIVE-7889.patch
>
>
> For a char partition column, JavaHiveCharObjectInspector attempts to cast 
> HiveCharWritable to HiveChar:
> {code}
> create table partition_char_1 (key string, value char(20)) partitioned by (dt 
> char(10), region int);
> insert overwrite table partition_char_1 partition(dt='2000-01-01', region=1)
>   select * from src tablesample (10 rows);
> select * from partition_char_1 limit 1;
> java.sql.SQLException: Error while compiling statement: FAILED: 
> RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: Failed 
> with exception org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be 
> cast to 
> org.apache.hadoop.hive.common.type.HiveCharjava.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveChar
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveCharObjectInspector.set(JavaHiveCharObjectInspector.java:67)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$HiveCharConverter.convert(PrimitiveObjectInspectorConverter.java:506)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.createPartValue(FetchOperator.java:315)
> {code}





Re: Review Request 25086: HIVE-7889 : Query fails with char partition column

2014-08-26 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25086/#review51636
---


Looks good overall, just a minor comment below.


serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaHiveCharObjectInspector.java


Please put a space after the cast, and let's get rid of the useless 
assignment in the line below, even though it's like that in other inspectors.


- Szehon Ho


On Aug. 27, 2014, 12:55 a.m., Mohit Sabharwal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25086/
> ---
> 
> (Updated Aug. 27, 2014, 12:55 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-7889
> https://issues.apache.org/jira/browse/HIVE-7889
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> For a char partition column, JavaHiveCharObjectInspector attempts to cast 
> HiveCharWritable to HiveChar.
> 
> Similar issue for Varchar was fixed in HIVE-6642.
> 
> 
> Diffs
> -
> 
>   ql/src/test/queries/clientpositive/partition_char.q PRE-CREATION 
>   ql/src/test/results/clientpositive/partition_char.q.out PRE-CREATION 
>   
> serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaHiveCharObjectInspector.java
>  ff114c04f396fa3b51aa6c065ae019dac2db3a81 
> 
> Diff: https://reviews.apache.org/r/25086/diff/
> 
> 
> Testing
> ---
> 
> Added q-test
> 
> 
> Thanks,
> 
> Mohit Sabharwal
> 
>



[jira] [Commented] (HIVE-7701) Upgrading tez to 0.4.1 causes metadata only query to fail.

2014-08-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111855#comment-14111855
 ] 

Hive QA commented on HIVE-7701:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12664534/HIVE-7701.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6116 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/518/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/518/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-518/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12664534

> Upgrading tez to 0.4.1 causes metadata only query to fail.
> --
>
> Key: HIVE-7701
> URL: https://issues.apache.org/jira/browse/HIVE-7701
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Gunther Hagleitner
> Attachments: HIVE-7701.1.patch
>
>
> With HIVE-7477 we will be upgrading to tez 0.4.1. However, the 
> metadataonly1.q file fails with the upgrade. The upgrade is required to 
> prevent hanging of test runs. We can track the single failure here.





[jira] [Commented] (HIVE-5799) session/operation timeout for hiveserver2

2014-08-26 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111844#comment-14111844
 ] 

Navis commented on HIVE-5799:
-

Addressed comments and updated RB

> session/operation timeout for hiveserver2
> -
>
> Key: HIVE-5799
> URL: https://issues.apache.org/jira/browse/HIVE-5799
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-5799.1.patch.txt, HIVE-5799.10.patch.txt, 
> HIVE-5799.11.patch.txt, HIVE-5799.12.patch.txt, HIVE-5799.13.patch.txt, 
> HIVE-5799.2.patch.txt, HIVE-5799.3.patch.txt, HIVE-5799.4.patch.txt, 
> HIVE-5799.5.patch.txt, HIVE-5799.6.patch.txt, HIVE-5799.7.patch.txt, 
> HIVE-5799.8.patch.txt, HIVE-5799.9.patch.txt
>
>
> Need some timeout facility for preventing resource leakages from unstable or 
> bad clients.





Re: Review Request 15449: session/operation timeout for hiveserver2

2014-08-26 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15449/
---

(Updated Aug. 27, 2014, 4:42 a.m.)


Review request for hive.


Changes
---

Addressed comments


Bugs: HIVE-5799
https://issues.apache.org/jira/browse/HIVE-5799


Repository: hive-git


Description
---

Need some timeout facility for preventing resource leakages from unstable or 
bad clients.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7f4afd9 
  common/src/java/org/apache/hadoop/hive/conf/Validator.java cea9c41 
  
itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2SessionTimeout.java
 PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/OperationState.java 3e15f0c 
  service/src/java/org/apache/hive/service/cli/operation/Operation.java 45fbd61 
  service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
21c33bc 
  service/src/java/org/apache/hive/service/cli/session/HiveSession.java 9785e95 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 
eee1cc6 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
bc0a02c 
  
service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
 39d2184 
  service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
d573592 

Diff: https://reviews.apache.org/r/15449/diff/


Testing
---

Confirmed in the local environment.


Thanks,

Navis Ryu



[jira] [Updated] (HIVE-5799) session/operation timeout for hiveserver2

2014-08-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-5799:


Attachment: HIVE-5799.13.patch.txt

> session/operation timeout for hiveserver2
> -
>
> Key: HIVE-5799
> URL: https://issues.apache.org/jira/browse/HIVE-5799
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-5799.1.patch.txt, HIVE-5799.10.patch.txt, 
> HIVE-5799.11.patch.txt, HIVE-5799.12.patch.txt, HIVE-5799.13.patch.txt, 
> HIVE-5799.2.patch.txt, HIVE-5799.3.patch.txt, HIVE-5799.4.patch.txt, 
> HIVE-5799.5.patch.txt, HIVE-5799.6.patch.txt, HIVE-5799.7.patch.txt, 
> HIVE-5799.8.patch.txt, HIVE-5799.9.patch.txt
>
>
> Need some timeout facility for preventing resource leakages from unstable or 
> bad clients.





[jira] [Updated] (HIVE-6361) Un-fork Sqlline

2014-08-26 Thread Julian Hyde (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-6361:
--

Attachment: HIVE-6361.2.patch

> Un-fork Sqlline
> ---
>
> Key: HIVE-6361
> URL: https://issues.apache.org/jira/browse/HIVE-6361
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 0.12.0
>Reporter: Julian Hyde
>Assignee: Julian Hyde
> Attachments: HIVE-6361.2.patch, HIVE-6361.patch
>
>
> I propose to merge the two development forks of sqlline: Hive's beeline 
> module, and the fork at https://github.com/julianhyde/sqlline.
> How did the forks come about? Hive’s SQL command-line interface Beeline was 
> created by forking Sqlline (see HIVE-987, HIVE-3100), which at the time was 
> a useful but low-activity project languishing on SourceForge without an 
> active owner. Around the same time, Julian Hyde independently started a 
> github repo based on the same code base. Now several projects are using 
> Julian Hyde's sqlline, including Apache Drill, Apache Phoenix, Cascading 
> Lingual and Optiq.
> Merging these two forks will allow us to pool our resources. (Case in point: 
> Drill issue DRILL-327 had already been fixed in a later version of sqlline; 
> it still exists in beeline.)
> I propose the following steps:
> 1. Copy Julian Hyde's sqlline as a new Hive module, hive-sqlline.
> 2. Port fixes to hive-beeline into hive-sqlline.
> 3. Make hive-beeline depend on hive-sqlline, and remove code that is 
> identical. What remains in the hive-beeline module is Beeline.java (a derived 
> class of Sqlline.java) and Hive-specific extensions.
> 4. Make the hive-sqlline the official successor to Julian Hyde's sqlline.
> This achieves continuity for Hive’s users, gives the users of the non-Hive 
> sqlline a version with minimal dependencies, unifies the two code lines, and 
> brings everything under the Apache roof.





[jira] [Updated] (HIVE-6361) Un-fork Sqlline

2014-08-26 Thread Julian Hyde (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-6361:
--

Status: Patch Available  (was: Open)

Attached, patch HIVE-6361.2.patch, commit c96a790, parent 253a869.

> Un-fork Sqlline
> ---
>
> Key: HIVE-6361
> URL: https://issues.apache.org/jira/browse/HIVE-6361
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 0.12.0
>Reporter: Julian Hyde
>Assignee: Julian Hyde
> Attachments: HIVE-6361.2.patch, HIVE-6361.patch
>
>
> I propose to merge the two development forks of sqlline: Hive's beeline 
> module, and the fork at https://github.com/julianhyde/sqlline.
> How did the forks come about? Hive’s SQL command-line interface Beeline was 
> created by forking Sqlline (see HIVE-987, HIVE-3100), which at the time 
> was a useful but low-activity project languishing on SourceForge without an 
> active owner. Around the same time, Julian Hyde independently started a 
> github repo based on the same code base. Now several projects are using 
> Julian Hyde's sqlline, including Apache Drill, Apache Phoenix, Cascading 
> Lingual and Optiq.
> Merging these two forks will allow us to pool our resources. (Case in point: 
> Drill issue DRILL-327 had already been fixed in a later version of sqlline; 
> it still exists in beeline.)
> I propose the following steps:
> 1. Copy Julian Hyde's sqlline as a new Hive module, hive-sqlline.
> 2. Port fixes to hive-beeline into hive-sqlline.
> 3. Make hive-beeline depend on hive-sqlline, and remove code that is 
> identical. What remains in the hive-beeline module is Beeline.java (a derived 
> class of Sqlline.java) and Hive-specific extensions.
> 4. Make the hive-sqlline the official successor to Julian Hyde's sqlline.
> This achieves continuity for Hive’s users, gives the users of the non-Hive 
> sqlline a version with minimal dependencies, unifies the two code lines, and 
> brings everything under the Apache roof.





[jira] [Updated] (HIVE-6361) Un-fork Sqlline

2014-08-26 Thread Julian Hyde (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-6361:
--

Status: Open  (was: Patch Available)

> Un-fork Sqlline
> ---
>
> Key: HIVE-6361
> URL: https://issues.apache.org/jira/browse/HIVE-6361
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 0.12.0
>Reporter: Julian Hyde
>Assignee: Julian Hyde
> Attachments: HIVE-6361.patch
>
>
> I propose to merge the two development forks of sqlline: Hive's beeline 
> module, and the fork at https://github.com/julianhyde/sqlline.
> How did the forks come about? Hive’s SQL command-line interface Beeline was 
> created by forking Sqlline (see HIVE-987, HIVE-3100), which at the time 
> was a useful but low-activity project languishing on SourceForge without an 
> active owner. Around the same time, Julian Hyde independently started a 
> github repo based on the same code base. Now several projects are using 
> Julian Hyde's sqlline, including Apache Drill, Apache Phoenix, Cascading 
> Lingual and Optiq.
> Merging these two forks will allow us to pool our resources. (Case in point: 
> Drill issue DRILL-327 had already been fixed in a later version of sqlline; 
> it still exists in beeline.)
> I propose the following steps:
> 1. Copy Julian Hyde's sqlline as a new Hive module, hive-sqlline.
> 2. Port fixes to hive-beeline into hive-sqlline.
> 3. Make hive-beeline depend on hive-sqlline, and remove code that is 
> identical. What remains in the hive-beeline module is Beeline.java (a derived 
> class of Sqlline.java) and Hive-specific extensions.
> 4. Make the hive-sqlline the official successor to Julian Hyde's sqlline.
> This achieves continuity for Hive’s users, gives the users of the non-Hive 
> sqlline a version with minimal dependencies, unifies the two code lines, and 
> brings everything under the Apache roof.





[jira] [Updated] (HIVE-5799) session/operation timeout for hiveserver2

2014-08-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-5799:


Attachment: (was: HIVE-5799.12.patch.txt)

> session/operation timeout for hiveserver2
> -
>
> Key: HIVE-5799
> URL: https://issues.apache.org/jira/browse/HIVE-5799
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-5799.1.patch.txt, HIVE-5799.10.patch.txt, 
> HIVE-5799.11.patch.txt, HIVE-5799.12.patch.txt, HIVE-5799.2.patch.txt, 
> HIVE-5799.3.patch.txt, HIVE-5799.4.patch.txt, HIVE-5799.5.patch.txt, 
> HIVE-5799.6.patch.txt, HIVE-5799.7.patch.txt, HIVE-5799.8.patch.txt, 
> HIVE-5799.9.patch.txt
>
>
> Need a timeout facility to prevent resource leaks from unstable or bad 
> clients.





[jira] [Commented] (HIVE-6329) Support column level encryption/decryption

2014-08-26 Thread Larry McCay (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111827#comment-14111827
 ] 

Larry McCay commented on HIVE-6329:
---

I see, [~navis]. So, the intent of this patch is to provide a hook within the 
SerDe mechanism with enough fidelity to do encryption but the initial 
implementation just provides an encoding to Base64 implementation. That helps 
me understand the patch more and I think you have accomplished this.

I would be a bit leery of calling the hook and Base64 implementation that we 
are providing in this patch "column level encryption/decryption" - even though 
you are enabling someone to use it for that. What this patch actually 
introduces is column/value encoding/decoding. That encoding is easily 
reversible, and the encoded values remain joinable across tables, allowing 
correlations to be made.

Are we able to frame the use case that this patch actually represents as a 
problem that needs solving, or do we need to make this implementation more 
robust in terms of encryption/decryption and all the key management 
requirements needed to do that properly?

I am just concerned about introducing new interfaces and hooks that need to be 
supported if they are not what we would consider strategic implementation 
choices for a given feature like encryption. Does the SerDe mechanism provide 
everything that we need? It seems like this approach provides little in terms 
of key management and metadata which are requisite for encryption mechanisms. 
Though, I may still be missing the forest for the trees.

What I would like to do is ensure that our customers have a path forward with 
their needs met while not moving this forward in apache until we have an actual 
encryption mechanism available.

Does that make sense?

What do you think that will require?
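
Larry's point that the Base64 encoding is "easily reversible" is simple to 
demonstrate. The sketch below is plain JDK code (not Hive source); the two 
encoded strings are taken verbatim from the example output quoted in this 
issue's description:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Demonstrates the reversibility concern: values written by a Base64
// "encoding" SerDe are trivially decodable by anyone who can read the raw
// files. The encoded strings come from the example output in this issue.
public class Base64Reversibility {
    static String decode(String b64) {
        return new String(Base64.getDecoder().decode(b64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decode("MDEwLTAwMDAtMDAwMA=="));  // the phone column
        System.out.println(decode("U2VvdWwsIFNlb2Nobw=="));  // prints "Seoul, Seocho"
    }
}
```

No key is involved, so any reader of the warehouse files can recover the 
original values with one library call.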

> Support column level encryption/decryption
> --
>
> Key: HIVE-6329
> URL: https://issues.apache.org/jira/browse/HIVE-6329
> Project: Hive
>  Issue Type: New Feature
>  Components: Security, Serializers/Deserializers
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-6329.1.patch.txt, HIVE-6329.10.patch.txt, 
> HIVE-6329.11.patch.txt, HIVE-6329.2.patch.txt, HIVE-6329.3.patch.txt, 
> HIVE-6329.4.patch.txt, HIVE-6329.5.patch.txt, HIVE-6329.6.patch.txt, 
> HIVE-6329.7.patch.txt, HIVE-6329.8.patch.txt, HIVE-6329.9.patch.txt
>
>
> We have been receiving some requirements for encryption recently, but Hive 
> does not support it. Before the full implementation via HIVE-5207, this 
> might be useful for some cases.
> {noformat}
> hive> create table encode_test(id int, name STRING, phone STRING, address 
> STRING) 
> > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> > WITH SERDEPROPERTIES ('column.encode.columns'='phone,address', 
> 'column.encode.classname'='org.apache.hadoop.hive.serde2.Base64WriteOnly') 
> STORED AS TEXTFILE;
> OK
> Time taken: 0.584 seconds
> hive> insert into table encode_test select 
> 100,'navis','010-0000-0000','Seoul, Seocho' from src tablesample (1 rows);
> ..
> OK
> Time taken: 5.121 seconds
> hive> select * from encode_test;
> OK
> 100   navis MDEwLTAwMDAtMDAwMA==  U2VvdWwsIFNlb2Nobw==
> Time taken: 0.078 seconds, Fetched: 1 row(s)
> hive> 
> {noformat}





[jira] [Commented] (HIVE-7889) Query fails with char partition column

2014-08-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111791#comment-14111791
 ] 

Hive QA commented on HIVE-7889:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12664532/HIVE-7889.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6116 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/517/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/517/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-517/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12664532

> Query fails with char partition column
> --
>
> Key: HIVE-7889
> URL: https://issues.apache.org/jira/browse/HIVE-7889
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-7889.1.patch, HIVE-7889.patch
>
>
> For a char partition column, JavaHiveCharObjectInspector attempts to cast 
> HiveCharWritable to HiveChar:
> {code}
> create table partition_char_1 (key string, value char(20)) partitioned by (dt 
> char(10), region int);
> insert overwrite table partition_char_1 partition(dt='2000-01-01', region=1)
>   select * from src tablesample (10 rows);
> select * from partition_char_1 limit 1;
> java.sql.SQLException: Error while compiling statement: FAILED: 
> RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: Failed 
> with exception org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be 
> cast to 
> org.apache.hadoop.hive.common.type.HiveCharjava.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveChar
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveCharObjectInspector.set(JavaHiveCharObjectInspector.java:67)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$HiveCharConverter.convert(PrimitiveObjectInspectorConverter.java:506)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.createPartValue(FetchOperator.java:315)
> {code}
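
The ClassCastException pattern above can be illustrated abstractly. The 
classes below are stand-ins (suffixed "Like"), not Hive source, and the 
"checked" variant is only one plausible fix direction, not the actual patch:

```java
// Illustrates the failure mode: a setter that blindly casts to the plain
// value class (like HiveChar) throws when handed the Writable wrapper
// (like HiveCharWritable), which is how partition-column values arrive.
public class CharCastSketch {
    static class HiveCharLike {
        final String value;
        HiveCharLike(String value) { this.value = value; }
    }

    static class HiveCharWritableLike {
        final String value;
        HiveCharWritableLike(String value) { this.value = value; }
    }

    // Mimics the buggy cast: assumes the plain value class.
    static String setUnchecked(Object o) {
        return ((HiveCharLike) o).value; // ClassCastException for the wrapper
    }

    // One plausible fix direction: accept either representation.
    static String setChecked(Object o) {
        if (o instanceof HiveCharLike) {
            return ((HiveCharLike) o).value;
        }
        if (o instanceof HiveCharWritableLike) {
            return ((HiveCharWritableLike) o).value;
        }
        throw new IllegalArgumentException("unsupported type: " + o.getClass());
    }

    public static void main(String[] args) {
        Object partValue = new HiveCharWritableLike("2000-01-01");
        try {
            setUnchecked(partValue);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the bug report");
        }
        System.out.println(setChecked(partValue)); // prints "2000-01-01"
    }
}
```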





[jira] [Commented] (HIVE-7669) parallel order by clause on a string column fails with IOException: Split points are out of order

2014-08-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111765#comment-14111765
 ] 

Szehon Ho commented on HIVE-7669:
-

Thanks, +1 on latest patch pending tests

> parallel order by clause on a string column fails with IOException: Split 
> points are out of order
> -
>
> Key: HIVE-7669
> URL: https://issues.apache.org/jira/browse/HIVE-7669
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Query Processor, SQL
>Affects Versions: 0.12.0
> Environment: Hive 0.12.0-cdh5.0.0
> OS: Redhat linux
>Reporter: Vishal Kamath
>Assignee: Navis
>  Labels: orderby
> Attachments: HIVE-7669.1.patch.txt, HIVE-7669.2.patch.txt, 
> HIVE-7669.3.patch.txt
>
>
> The source table has 600 million rows and a String column "l_shipinstruct" 
> with 4 unique values (i.e., these 4 values are repeated across the 600 
> million rows).
> We sort on this string column "l_shipinstruct" as shown in the HiveQL 
> below, with the following parameters:
> {code:sql}
> set hive.optimize.sampling.orderby=true;
> set hive.optimize.sampling.orderby.number=1000;
> set hive.optimize.sampling.orderby.percent=0.1f;
> insert overwrite table lineitem_temp_report 
> select 
>   l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, 
> l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, 
> l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment
> from 
>   lineitem
> order by l_shipinstruct;
> {code}
> Stack Trace
> Diagnostic Messages for this Task:
> {noformat}
> Error: java.lang.RuntimeException: Error in configuring object
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at 
> org.apache.hadoop.mapred.MapTask$OldOutputCollector.<init>(MapTask.java:569)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
> ... 10 more
> Caused by: java.lang.IllegalArgumentException: Can't read partitions file
> at 
> org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:116)
> at 
> org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:42)
> at 
> org.apache.hadoop.hive.ql.exec.HiveTotalOrderPartitioner.configure(HiveTotalOrderPartitioner.java:37)
> ... 15 more
> Caused by: java.io.IOException: Split points are out of order
> at 
> org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:96)
> ... 17 more
> {noformat}





[jira] [Commented] (HIVE-7730) Extend ReadEntity to add accessed columns from query

2014-08-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111763#comment-14111763
 ] 

Szehon Ho commented on HIVE-7730:
-

Thanks Xiaomeng, +1 pending latest test result

> Extend ReadEntity to add accessed columns from query
> 
>
> Key: HIVE-7730
> URL: https://issues.apache.org/jira/browse/HIVE-7730
> Project: Hive
>  Issue Type: Bug
>Reporter: Xiaomeng Huang
> Attachments: HIVE-7730.001.patch, HIVE-7730.002.patch, 
> HIVE-7730.003.patch, HIVE-7730.004.patch
>
>
> -Now what we get from HiveSemanticAnalyzerHookContextImpl is limited. If we 
> have a HiveSemanticAnalyzerHook hook, we may want to get more things from 
> hookContext (e.g. the columns needed by the query).-
> -So we should get an instance of HiveSemanticAnalyzerHookContext from the 
> configuration, extend HiveSemanticAnalyzerHookContext with a new 
> implementation, override HiveSemanticAnalyzerHookContext.update(), and put 
> what you want into the class.-
> Hive should store the accessed columns in ReadEntity when 
> HIVE_STATS_COLLECT_SCANCOLS (or a new confVar we add) is set to true.
> External authorization models can then get the accessed columns when doing 
> authorization at compile time, before execution. Maybe we will remove 
> columnAccessInfo from BaseSemanticAnalyzer; the old authorization and 
> AuthorizationModeV2 can get the accessed columns from ReadEntity too.
> Here is the quick implement in SemanticAnalyzer.analyzeInternal() below:
> {code}   boolean isColumnInfoNeedForAuth = 
> SessionState.get().isAuthorizationModeV2()
> && HiveConf.getBoolVar(conf, 
> HiveConf.ConfVars.HIVE_AUTHORIZATION_ENABLED);
> if (isColumnInfoNeedForAuth
> || HiveConf.getBoolVar(this.conf, 
> HiveConf.ConfVars.HIVE_STATS_COLLECT_SCANCOLS) == true) {
>   ColumnAccessAnalyzer columnAccessAnalyzer = new 
> ColumnAccessAnalyzer(pCtx);
>   setColumnAccessInfo(columnAccessAnalyzer.analyzeColumnAccess()); 
> }
> compiler.compile(pCtx, rootTasks, inputs, outputs);
> // TODO: 
> // after compile, we can put accessed column list to ReadEntity getting 
> from columnAccessInfo if HIVE_AUTHORIZATION_ENABLED is set true
> {code}
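
The flow proposed in the description (collect accessed columns at compile 
time, then attach them to each read entity) can be sketched abstractly. This 
is a hypothetical illustration with stand-in classes and method names, not the 
actual HIVE-7730 patch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: after compilation, the table -> accessed-columns map
// produced by the column access analyzer is attached to each read entity,
// so an external authorizer can check column-level access before execution.
public class AccessedColumnsSketch {
    // Stand-in for Hive's ReadEntity, extended with accessed columns.
    static class ReadEntity {
        final String tableName;
        final List<String> accessedColumns = new ArrayList<>();
        ReadEntity(String tableName) { this.tableName = tableName; }
    }

    // Stand-in for the post-compile step: copy the analyzer's result
    // (table name -> column list) onto the matching read entities.
    static void attachColumns(Map<String, List<String>> columnAccessInfo,
                              List<ReadEntity> inputs) {
        for (ReadEntity entity : inputs) {
            List<String> cols = columnAccessInfo.get(entity.tableName);
            if (cols != null) {
                entity.accessedColumns.addAll(cols);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> info = new HashMap<>();
        info.put("src", Arrays.asList("key", "value"));
        ReadEntity src = new ReadEntity("src");
        attachColumns(info, Collections.singletonList(src));
        System.out.println(src.accessedColumns); // prints "[key, value]"
    }
}
```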





[jira] [Commented] (HIVE-7887) VectorFileSinkOp does not publish the stats correctly

2014-08-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111754#comment-14111754
 ] 

Hive QA commented on HIVE-7887:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12664526/HIVE-7887.1.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6115 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_part_project
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_timestamp_funcs
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/516/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/516/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-516/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12664526

> VectorFileSinkOp does not publish the stats correctly
> -
>
> Key: HIVE-7887
> URL: https://issues.apache.org/jira/browse/HIVE-7887
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-7887.1.patch
>
>
> VectorFSOp inherits FSOp, but the stats collection code in processOp() is 
> out-of-date. Needs to be updated to be in sync with FSOp.





[jira] [Commented] (HIVE-7876) further improve the columns stats update speed for all the partitions of a table

2014-08-26 Thread Sujesh Chirackkal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111752#comment-14111752
 ] 

Sujesh Chirackkal commented on HIVE-7876:
-

Though the first property change worked for some queries, we started facing 
the same exception again. We changed the property below as well, and it seems 
to be working now. Will keep you posted if we hit the exception again.

set hive.metastore.client.socket.timeout=300;

> further improve the columns stats update speed for all the partitions of a 
> table
> 
>
> Key: HIVE-7876
> URL: https://issues.apache.org/jira/browse/HIVE-7876
> Project: Hive
>  Issue Type: Improvement
>Reporter: pengcheng xiong
>Assignee: pengcheng xiong
>Priority: Minor
> Attachments: HIVE-7876.2.patch
>
>
> The previous solution https://issues.apache.org/jira/browse/HIVE-7736
> is not enough for the case when there are too many columns/partitions.
> The user will encounter 
> org.apache.thrift.transport.TTransportException: 
> java.net.SocketTimeoutException: Read timed out
> We try to remove more of the transaction overhead.





Re: Review Request 24688: parallel order by clause on a string column fails with IOException: Split points are out of order

2014-08-26 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24688/
---

(Updated Aug. 27, 2014, 2:18 a.m.)


Review request for hive.


Changes
---

Addressed comment


Bugs: HIVE-7669
https://issues.apache.org/jira/browse/HIVE-7669


Repository: hive-git


Description
---

The source table has 600 million rows and a String column "l_shipinstruct" 
with 4 unique values (i.e., these 4 values are repeated across the 600 
million rows).

We sort on this string column "l_shipinstruct" as shown in the HiveQL below, 
with the following parameters:
{code:sql}
set hive.optimize.sampling.orderby=true;
set hive.optimize.sampling.orderby.number=1000;
set hive.optimize.sampling.orderby.percent=0.1f;

insert overwrite table lineitem_temp_report 
select 
  l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, l_extendedprice, 
l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, l_commitdate, 
l_receiptdate, l_shipinstruct, l_shipmode, l_comment
from 
  lineitem
order by l_shipinstruct;
{code}
Stack Trace
Diagnostic Messages for this Task:
{noformat}
Error: java.lang.RuntimeException: Error in configuring object
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at 
org.apache.hadoop.mapred.MapTask$OldOutputCollector.<init>(MapTask.java:569)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 10 more
Caused by: java.lang.IllegalArgumentException: Can't read partitions file
at 
org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:116)
at 
org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:42)
at 
org.apache.hadoop.hive.ql.exec.HiveTotalOrderPartitioner.configure(HiveTotalOrderPartitioner.java:37)
... 15 more
Caused by: java.io.IOException: Split points are out of order
at 
org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:96)
... 17 more
{noformat}
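
The root cause follows from the description above: TotalOrderPartitioner 
requires its split points to be strictly increasing, and sampling a sort 
column with only 4 distinct values produces duplicates. The sketch below is 
illustrative only (not Hive code and not this patch); the "deduped" variant 
shows one way to keep the split points strictly increasing:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Why "Split points are out of order": duplicate split points violate
// TotalOrderPartitioner's strictly-increasing requirement.
public class SplitPointSketch {
    // Naive selection of (numReduces - 1) split points from a sorted sample.
    static List<String> naiveSplits(List<String> sortedSample, int numReduces) {
        List<String> splits = new ArrayList<>();
        float step = sortedSample.size() / (float) numReduces;
        for (int i = 1; i < numReduces; i++) {
            splits.add(sortedSample.get(Math.round(step * i)));
        }
        return splits;
    }

    // One way to avoid the failure: collapse duplicates so the split points
    // are strictly increasing (fewer effective partitions, but no exception).
    static List<String> dedupedSplits(List<String> splits) {
        return new ArrayList<>(new TreeSet<>(splits));
    }

    public static void main(String[] args) {
        // 4 distinct l_shipinstruct-style values, heavily repeated.
        List<String> sample = new ArrayList<>();
        for (String v : Arrays.asList("COLLECT COD", "DELIVER IN PERSON",
                                      "NONE", "TAKE BACK RETURN")) {
            for (int i = 0; i < 250; i++) {
                sample.add(v);
            }
        }
        List<String> naive = naiveSplits(sample, 8);
        System.out.println(naive);                 // contains duplicates
        System.out.println(dedupedSplits(naive));  // strictly increasing
    }
}
```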


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7f4afd9 
  common/src/java/org/apache/hadoop/hive/conf/Validator.java cea9c41 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HiveTotalOrderPartitioner.java 
6c22362 
  ql/src/java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java 166461a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java ef72039 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestPartitionKeySampler.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/24688/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Updated] (HIVE-7669) parallel order by clause on a string column fails with IOException: Split points are out of order

2014-08-26 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7669:


Attachment: HIVE-7669.3.patch.txt

> parallel order by clause on a string column fails with IOException: Split 
> points are out of order
> -
>
> Key: HIVE-7669
> URL: https://issues.apache.org/jira/browse/HIVE-7669
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Query Processor, SQL
>Affects Versions: 0.12.0
> Environment: Hive 0.12.0-cdh5.0.0
> OS: Redhat linux
>Reporter: Vishal Kamath
>Assignee: Navis
>  Labels: orderby
> Attachments: HIVE-7669.1.patch.txt, HIVE-7669.2.patch.txt, 
> HIVE-7669.3.patch.txt
>
>
> The source table has 600 million rows and a String column "l_shipinstruct" 
> with 4 unique values (i.e., these 4 values are repeated across the 600 
> million rows).
> We sort on this string column "l_shipinstruct" as shown in the HiveQL 
> below, with the following parameters:
> {code:sql}
> set hive.optimize.sampling.orderby=true;
> set hive.optimize.sampling.orderby.number=1000;
> set hive.optimize.sampling.orderby.percent=0.1f;
> insert overwrite table lineitem_temp_report 
> select 
>   l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, 
> l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, 
> l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment
> from 
>   lineitem
> order by l_shipinstruct;
> {code}
> Stack Trace
> Diagnostic Messages for this Task:
> {noformat}
> Error: java.lang.RuntimeException: Error in configuring object
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at 
> org.apache.hadoop.mapred.MapTask$OldOutputCollector.<init>(MapTask.java:569)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
> ... 10 more
> Caused by: java.lang.IllegalArgumentException: Can't read partitions file
> at 
> org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:116)
> at 
> org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:42)
> at 
> org.apache.hadoop.hive.ql.exec.HiveTotalOrderPartitioner.configure(HiveTotalOrderPartitioner.java:37)
> ... 15 more
> Caused by: java.io.IOException: Split points are out of order
> at 
> org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:96)
> ... 17 more
> {noformat}





[jira] [Commented] (HIVE-7876) further improve the columns stats update speed for all the partitions of a table

2014-08-26 Thread Sujesh Chirackkal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111741#comment-14111741
 ] 

Sujesh Chirackkal commented on HIVE-7876:
-

Faced the same exception for a table with more than 2000 partitions. Changing 
the property below (from 30 seconds to 300) helped us run the job:

hive.stats.jdbc.timeout=300;

> further improve the columns stats update speed for all the partitions of a 
> table
> 
>
> Key: HIVE-7876
> URL: https://issues.apache.org/jira/browse/HIVE-7876
> Project: Hive
>  Issue Type: Improvement
>Reporter: pengcheng xiong
>Assignee: pengcheng xiong
>Priority: Minor
> Attachments: HIVE-7876.2.patch
>
>
> The previous solution https://issues.apache.org/jira/browse/HIVE-7736
> is not enough for the case when there are too many columns/partitions.
> The user will encounter 
> org.apache.thrift.transport.TTransportException: 
> java.net.SocketTimeoutException: Read timed out
> We try to remove more of the transaction overhead.





[jira] [Commented] (HIVE-6250) sql std auth - view authorization should not underlying table. More tests and fixes.

2014-08-26 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111726#comment-14111726
 ] 

Thejas M Nair commented on HIVE-6250:
-

This is a bug. Please create a new jira and upload your fix. I will review it. 
Thanks for reporting it!

> sql std auth - view authorization should not underlying table. More tests and 
> fixes.
> 
>
> Key: HIVE-6250
> URL: https://issues.apache.org/jira/browse/HIVE-6250
> Project: Hive
>  Issue Type: Sub-task
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.13.0
>
> Attachments: HIVE-6250.1.patch, HIVE-6250.2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> This patch adds more tests for table and view authorization and also fixes a 
> number of issues found during testing -
> - View authorization should happen only on the view, and not the 
> underlying table (Change in ReadEntity to indicate if it is a direct/indirect 
> dependency)
> - table owner in metadata should be the user as per SessionState 
> authentication provider
> - added utility function for finding the session state authentication 
> provider user
> - authorization should be based on current roles
> - admin user should have all permissions
> - error message improvements





[jira] [Commented] (HIVE-6329) Support column level encryption/decryption

2014-08-26 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111728#comment-14111728
 ] 

Navis commented on HIVE-6329:
-

Admittedly, I don't know this area (security, encryption, etc.) well. This 
patch was provided to a customer by request, and I've heard that they 
implemented an AES+OTP combination (for encryption) with some Hive hooks (for 
authorization) to secure their data. Some others are just happy with simple 
Base64-like obfuscation. It's not a complete encryption solution, just a layer 
for handling the format. I'm fine with someone taking this and improving it 
further.

> Support column level encryption/decryption
> --
>
> Key: HIVE-6329
> URL: https://issues.apache.org/jira/browse/HIVE-6329
> Project: Hive
>  Issue Type: New Feature
>  Components: Security, Serializers/Deserializers
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-6329.1.patch.txt, HIVE-6329.10.patch.txt, 
> HIVE-6329.11.patch.txt, HIVE-6329.2.patch.txt, HIVE-6329.3.patch.txt, 
> HIVE-6329.4.patch.txt, HIVE-6329.5.patch.txt, HIVE-6329.6.patch.txt, 
> HIVE-6329.7.patch.txt, HIVE-6329.8.patch.txt, HIVE-6329.9.patch.txt
>
>
> Receiving some requirements on encryption recently but hive is not supporting 
> it. Before the full implementation via HIVE-5207, this might be useful for 
> some cases.
> {noformat}
> hive> create table encode_test(id int, name STRING, phone STRING, address 
> STRING) 
> > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> > WITH SERDEPROPERTIES ('column.encode.columns'='phone,address', 
> 'column.encode.classname'='org.apache.hadoop.hive.serde2.Base64WriteOnly') 
> STORED AS TEXTFILE;
> OK
> Time taken: 0.584 seconds
> hive> insert into table encode_test select 
> 100,'navis','010--','Seoul, Seocho' from src tablesample (1 rows);
> ..
> OK
> Time taken: 5.121 seconds
> hive> select * from encode_test;
> OK
> 100   navis MDEwLTAwMDAtMDAwMA==  U2VvdWwsIFNlb2Nobw==
> Time taken: 0.078 seconds, Fetched: 1 row(s)
> hive> 
> {noformat}





[jira] [Updated] (HIVE-7730) Extend ReadEntity to add accessed columns from query

2014-08-26 Thread Xiaomeng Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Huang updated HIVE-7730:
-

Attachment: HIVE-7730.004.patch

> Extend ReadEntity to add accessed columns from query
> 
>
> Key: HIVE-7730
> URL: https://issues.apache.org/jira/browse/HIVE-7730
> Project: Hive
>  Issue Type: Bug
>Reporter: Xiaomeng Huang
> Attachments: HIVE-7730.001.patch, HIVE-7730.002.patch, 
> HIVE-7730.003.patch, HIVE-7730.004.patch
>
>
> -Now what we get from HiveSemanticAnalyzerHookContextImpl is limited. If we 
> have a HiveSemanticAnalyzerHook hook, we may want to get more things from 
> hookContext (e.g. the columns needed by the query).-
> -So we should get an instance of HiveSemanticAnalyzerHookContext from the 
> configuration, extend HiveSemanticAnalyzerHookContext with a new 
> implementation, override HiveSemanticAnalyzerHookContext.update(), and put 
> whatever is wanted into the class.-
> Hive should store the accessed columns in ReadEntity when 
> HIVE_STATS_COLLECT_SCANCOLS (or a new confVar we could add) is set to true.
> Then an external authorization model can get the accessed columns when doing 
> authorization at compile time, before execution. Maybe we will remove 
> columnAccessInfo from BaseSemanticAnalyzer; the old authorization and 
> AuthorizationModeV2 can get the accessed columns from ReadEntity too.
> Here is the quick implementation in SemanticAnalyzer.analyzeInternal():
> {code}
> boolean isColumnInfoNeedForAuth = SessionState.get().isAuthorizationModeV2()
>     && HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_AUTHORIZATION_ENABLED);
> if (isColumnInfoNeedForAuth
>     || HiveConf.getBoolVar(this.conf, HiveConf.ConfVars.HIVE_STATS_COLLECT_SCANCOLS)) {
>   ColumnAccessAnalyzer columnAccessAnalyzer = new ColumnAccessAnalyzer(pCtx);
>   setColumnAccessInfo(columnAccessAnalyzer.analyzeColumnAccess());
> }
> compiler.compile(pCtx, rootTasks, inputs, outputs);
> // TODO: after compile, we can put the accessed column list into ReadEntity,
> // obtained from columnAccessInfo, if HIVE_AUTHORIZATION_ENABLED is true
> {code}





Re: Hive User Group Meeting

2014-08-26 Thread Xuefu Zhang
Dear Apache Hive users and developers,

The next Hive user group meeting mentioned previously was officially
announced here:
http://www.meetup.com/Hive-User-Group-Meeting/events/202007872/. As it's
only about one and a half months away, please RSVP if you plan to go so that
the organizers can plan the meeting accordingly.

Currently, we still have a few talk slots open. Please let me know if
you're interested in giving a talk.

Regards,
Xuefu


On Mon, Jul 7, 2014 at 6:01 PM, Xuefu Zhang  wrote:

> Dear Hive users,
>
> The Hive community is considering a user group meeting during Hadoop World,
> which will be held in New York October 15-17th. To make this happen, your
> support is essential. First, I'm wondering if any user, especially those in
> the New York area, would be willing to host the meetup. Secondly, I'm
> soliciting talks from users as well as developers, so please propose or
> share your thoughts on the contents of the meetup.
>
> I will soon set up a meetup event to formally announce this. In the
> meantime, your suggestions, comments, and kind assistance are greatly
> appreciated.
>
> Sincerely,
>
> Xuefu
>


Re: Review Request 24962: HIVE-7730: Extend ReadEntity to add accessed columns from query

2014-08-26 Thread Xiaomeng Huang


> On Aug. 26, 2014, 6:04 p.m., Szehon Ho wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 9537
> > 
> >
> > Can you please remove this as per our previous discussion?  The 
> > construction of new linkedlist is not needed.

fixed


> On Aug. 26, 2014, 6:04 p.m., Szehon Ho wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 9540
> > 
> >
> > Please indent 2 spaces instead of 4.

fixed


> On Aug. 26, 2014, 6:04 p.m., Szehon Ho wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 9544
> > 
> >
> > Please indent 2 spaces.

fixed


- Xiaomeng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24962/#review51554
---


On Aug. 26, 2014, 2:22 a.m., Xiaomeng Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24962/
> ---
> 
> (Updated Aug. 26, 2014, 2:22 a.m.)
> 
> 
> Review request for hive, Prasad Mujumdar and Szehon Ho.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> An external authorization model cannot get the accessed columns from a query. 
> Hive should store the accessed columns in ReadEntity. 
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadEntity.java 7ed50b4 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b05d3b4 
> 
> Diff: https://reviews.apache.org/r/24962/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Xiaomeng Huang
> 
>



Re: Review Request 24962: HIVE-7730: Extend ReadEntity to add accessed columns from query

2014-08-26 Thread Xiaomeng Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24962/
---

(Updated Aug. 27, 2014, 1:37 a.m.)


Review request for hive, Prasad Mujumdar and Szehon Ho.


Repository: hive-git


Description
---

An external authorization model cannot get the accessed columns from a query. 
Hive should store the accessed columns in ReadEntity. 


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadEntity.java 7ed50b4 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b05d3b4 

Diff: https://reviews.apache.org/r/24962/diff/


Testing
---


Thanks,

Xiaomeng Huang



[jira] [Commented] (HIVE-7889) Query fails with char partition column

2014-08-26 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111708#comment-14111708
 ] 

Mohit Sabharwal commented on HIVE-7889:
---

Hmm, the "o" arg isn't really used anywhere. I assigned to it just to respect 
the "set" interface. 

In PrimitiveObjectInspectorConverter.HiveCharConverter.convert(), the "hc" arg 
passed to set() isn't used anywhere either. The method returns a HiveChar or a 
HiveCharWritable, depending on the object inspector:
  HiveCharWritable for WritableHiveCharObjectInspector
  HiveChar for JavaHiveCharObjectInspector
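The type-dispatch pattern under discussion can be sketched with stand-in classes (PlainChar and WritableChar below are hypothetical, not the real HiveChar/HiveCharWritable): rather than unconditionally casting the incoming object, the set/convert path checks its runtime type first.

```java
// Stand-ins for the real types; not Hive classes.
class PlainChar {
    final String value;
    PlainChar(String value) { this.value = value; }
}

class WritableChar {
    private PlainChar held;
    void set(PlainChar v) { held = v; }
    PlainChar get() { return held; }
}

public class CharConvertSketch {
    // Dispatch on the runtime type instead of casting blindly -- a blind cast
    // is what produces a ClassCastException like the one in the stack trace.
    static PlainChar toPlain(Object o) {
        if (o instanceof PlainChar) {
            return (PlainChar) o;
        }
        if (o instanceof WritableChar) {
            return ((WritableChar) o).get();
        }
        throw new IllegalArgumentException("unsupported type: " + o.getClass());
    }

    public static void main(String[] args) {
        WritableChar w = new WritableChar();
        w.set(new PlainChar("2000-01-01"));
        System.out.println(toPlain(w).value);                    // 2000-01-01
        System.out.println(toPlain(new PlainChar("x")).value);   // x
    }
}
```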



> Query fails with char partition column
> --
>
> Key: HIVE-7889
> URL: https://issues.apache.org/jira/browse/HIVE-7889
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-7889.1.patch, HIVE-7889.patch
>
>
> For a char partition column, JavaHiveCharObjectInspector attempts to cast 
> HiveCharWritable to HiveChar:
> {code}
> create table partition_char_1 (key string, value char(20)) partitioned by (dt 
> char(10), region int);
> insert overwrite table partition_char_1 partition(dt='2000-01-01', region=1)
>   select * from src tablesample (10 rows);
> select * from partition_char_1 limit 1;
> java.sql.SQLException: Error while compiling statement: FAILED: 
> RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: Failed 
> with exception org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be 
> cast to 
> org.apache.hadoop.hive.common.type.HiveCharjava.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveChar
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveCharObjectInspector.set(JavaHiveCharObjectInspector.java:67)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$HiveCharConverter.convert(PrimitiveObjectInspectorConverter.java:506)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.createPartValue(FetchOperator.java:315)
> {code}





[jira] [Commented] (HIVE-6847) Improve / fix bugs in Hive scratch dir setup

2014-08-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111704#comment-14111704
 ] 

Hive QA commented on HIVE-6847:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12664510/HIVE-6847.6.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/515/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/515/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-515/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-515/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 'service/src/java/org/apache/hive/service/cli/CLIService.java'
++ egrep -v '^X|^Performing status on external'
++ awk '{print $2}'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20/target 
shims/0.20S/target shims/0.23/target shims/aggregator/target 
shims/common/target shims/common-secure/target packaging/target 
hbase-handler/target testutils/target jdbc/target metastore/target 
itests/target itests/hcatalog-unit/target itests/test-serde/target 
itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target 
itests/hive-unit/target itests/custom-serde/target itests/util/target 
hcatalog/target hcatalog/core/target hcatalog/streaming/target 
hcatalog/server-extensions/target hcatalog/hcatalog-pig-adapter/target 
hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target 
accumulo-handler/target hwi/target common/target common/src/gen service/target 
service/src/java/org/apache/hive/service/cli/CLIService.java.orig 
contrib/target serde/target beeline/target odbc/target cli/target 
ql/dependency-reduced-pom.xml ql/target
+ svn update

Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1620773.

At revision 1620773.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12664510

> Improve / fix bugs in Hive scratch dir setup
> 
>
> Key: HIVE-6847
> URL: https://issues.apache.org/jira/browse/HIVE-6847
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, HiveServer2
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-6847.1.patch, HIVE-6847.2.patch, HIVE-6847.3.patch, 
> HIVE-6847.4.patch, HIVE-6847.5.patch, HIVE-6847.6.patch
>
>
> Currently, the Hive server creates the scratch directory and changes its 
> permissions to 777; however, this is not great with respect to security. We 
> need to create user-specific scratch directories instead. Also refer to the 
> 1st iteration of the patch on HIVE-6782 for the approach.
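The user-specific alternative described in the issue can be sketched with `java.nio.file` (a generic illustration under assumed names; the directory layout and method names are not Hive's actual implementation): create a subdirectory per user and restrict it to owner-only permissions instead of 777.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Sketch of per-user scratch directories with owner-only (700) permissions,
// in contrast to a single shared directory chmod'ed to 777.
public class ScratchDirSketch {

    static Path createUserScratchDir(Path base, String user) throws IOException {
        Set<PosixFilePermission> ownerOnly =
                PosixFilePermissions.fromString("rwx------");
        Path dir = base.resolve(user);
        Files.createDirectories(dir);
        // Explicit chmod so the result is not subject to the process umask.
        Files.setPosixFilePermissions(dir, ownerOnly);
        return dir;
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("hive-scratch");
        Path userDir = createUserScratchDir(base, "alice");
        // Prints the permission set of the new per-user directory.
        System.out.println(Files.getPosixFilePermissions(userDir));
    }
}
```

This requires a POSIX-compliant filesystem; HDFS exposes an analogous permission model through its own API.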





[jira] [Commented] (HIVE-7885) CLIServer.openSessionWithImpersonation logs as if it were openSessionW

2014-08-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111700#comment-14111700
 ] 

Hive QA commented on HIVE-7885:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12664497/HIVE-7885.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6115 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/514/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/514/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-514/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12664497

> CLIServer.openSessionWithImpersonation logs as if it were openSessionW
> --
>
> Key: HIVE-7885
> URL: https://issues.apache.org/jira/browse/HIVE-7885
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
> Attachments: HIVE-7885.patch
>
>






[jira] [Commented] (HIVE-7701) Upgrading tez to 0.4.1 causes metadata only query to fail.

2014-08-26 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111682#comment-14111682
 ] 

Gunther Hagleitner commented on HIVE-7701:
--

rb: https://reviews.apache.org/r/25089

> Upgrading tez to 0.4.1 causes metadata only query to fail.
> --
>
> Key: HIVE-7701
> URL: https://issues.apache.org/jira/browse/HIVE-7701
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Gunther Hagleitner
> Attachments: HIVE-7701.1.patch
>
>
> With HIVE-7477 we will be upgrading to tez 0.4.1. However, the 
> metadataonly1.q file fails with the upgrade. The upgrade is required to 
> prevent hanging of test runs. We can track the single failure here.





[jira] [Updated] (HIVE-7701) Upgrading tez to 0.4.1 causes metadata only query to fail.

2014-08-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7701:
-

Status: Patch Available  (was: Open)

> Upgrading tez to 0.4.1 causes metadata only query to fail.
> --
>
> Key: HIVE-7701
> URL: https://issues.apache.org/jira/browse/HIVE-7701
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Gunther Hagleitner
> Attachments: HIVE-7701.1.patch
>
>
> With HIVE-7477 we will be upgrading to tez 0.4.1. However, the 
> metadataonly1.q file fails with the upgrade. The upgrade is required to 
> prevent hanging of test runs. We can track the single failure here.





[jira] [Updated] (HIVE-7701) Upgrading tez to 0.4.1 causes metadata only query to fail.

2014-08-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7701:
-

Attachment: HIVE-7701.1.patch

> Upgrading tez to 0.4.1 causes metadata only query to fail.
> --
>
> Key: HIVE-7701
> URL: https://issues.apache.org/jira/browse/HIVE-7701
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Gunther Hagleitner
> Attachments: HIVE-7701.1.patch
>
>
> With HIVE-7477 we will be upgrading to tez 0.4.1. However, the 
> metadataonly1.q file fails with the upgrade. The upgrade is required to 
> prevent hanging of test runs. We can track the single failure here.





[jira] [Commented] (HIVE-7889) Query fails with char partition column

2014-08-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111673#comment-14111673
 ] 

Szehon Ho commented on HIVE-7889:
-

So I think this won't work (it just transitively assigns). What I meant is: 
don't we need to call o.setValue() as it did in the past? A different method 
depending on whether it's a HiveChar or a HiveCharWritable.

> Query fails with char partition column
> --
>
> Key: HIVE-7889
> URL: https://issues.apache.org/jira/browse/HIVE-7889
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-7889.1.patch, HIVE-7889.patch
>
>
> For a char partition column, JavaHiveCharObjectInspector attempts to cast 
> HiveCharWritable to HiveChar:
> {code}
> create table partition_char_1 (key string, value char(20)) partitioned by (dt 
> char(10), region int);
> insert overwrite table partition_char_1 partition(dt='2000-01-01', region=1)
>   select * from src tablesample (10 rows);
> select * from partition_char_1 limit 1;
> java.sql.SQLException: Error while compiling statement: FAILED: 
> RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: Failed 
> with exception org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be 
> cast to 
> org.apache.hadoop.hive.common.type.HiveCharjava.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveChar
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveCharObjectInspector.set(JavaHiveCharObjectInspector.java:67)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$HiveCharConverter.convert(PrimitiveObjectInspectorConverter.java:506)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.createPartValue(FetchOperator.java:315)
> {code}





Re: Review Request 25086: HIVE-7889 : Query fails with char partition column

2014-08-26 Thread Mohit Sabharwal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25086/
---

(Updated Aug. 27, 2014, 12:55 a.m.)


Review request for hive.


Changes
---

Incorporated feedback.


Bugs: HIVE-7889
https://issues.apache.org/jira/browse/HIVE-7889


Repository: hive-git


Description
---

For a char partition column, JavaHiveCharObjectInspector attempts to cast 
HiveCharWritable to HiveChar.

Similar issue for Varchar was fixed in HIVE-6642.


Diffs (updated)
-

  ql/src/test/queries/clientpositive/partition_char.q PRE-CREATION 
  ql/src/test/results/clientpositive/partition_char.q.out PRE-CREATION 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaHiveCharObjectInspector.java
 ff114c04f396fa3b51aa6c065ae019dac2db3a81 

Diff: https://reviews.apache.org/r/25086/diff/


Testing
---

Added q-test


Thanks,

Mohit Sabharwal



[jira] [Commented] (HIVE-7889) Query fails with char partition column

2014-08-26 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111649#comment-14111649
 ] 

Mohit Sabharwal commented on HIVE-7889:
---

Thanks for catching that! Updated patch with feedback.

> Query fails with char partition column
> --
>
> Key: HIVE-7889
> URL: https://issues.apache.org/jira/browse/HIVE-7889
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-7889.1.patch, HIVE-7889.patch
>
>
> For a char partition column, JavaHiveCharObjectInspector attempts to cast 
> HiveCharWritable to HiveChar:
> {code}
> create table partition_char_1 (key string, value char(20)) partitioned by (dt 
> char(10), region int);
> insert overwrite table partition_char_1 partition(dt='2000-01-01', region=1)
>   select * from src tablesample (10 rows);
> select * from partition_char_1 limit 1;
> java.sql.SQLException: Error while compiling statement: FAILED: 
> RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: Failed 
> with exception org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be 
> cast to 
> org.apache.hadoop.hive.common.type.HiveCharjava.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveChar
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveCharObjectInspector.set(JavaHiveCharObjectInspector.java:67)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$HiveCharConverter.convert(PrimitiveObjectInspectorConverter.java:506)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.createPartValue(FetchOperator.java:315)
> {code}





[jira] [Updated] (HIVE-7889) Query fails with char partition column

2014-08-26 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-7889:
--

Attachment: HIVE-7889.1.patch

> Query fails with char partition column
> --
>
> Key: HIVE-7889
> URL: https://issues.apache.org/jira/browse/HIVE-7889
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-7889.1.patch, HIVE-7889.patch
>
>
> For a char partition column, JavaHiveCharObjectInspector attempts to cast 
> HiveCharWritable to HiveChar:
> {code}
> create table partition_char_1 (key string, value char(20)) partitioned by (dt 
> char(10), region int);
> insert overwrite table partition_char_1 partition(dt='2000-01-01', region=1)
>   select * from src tablesample (10 rows);
> select * from partition_char_1 limit 1;
> java.sql.SQLException: Error while compiling statement: FAILED: 
> RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: Failed 
> with exception org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be 
> cast to 
> org.apache.hadoop.hive.common.type.HiveCharjava.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveChar
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveCharObjectInspector.set(JavaHiveCharObjectInspector.java:67)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$HiveCharConverter.convert(PrimitiveObjectInspectorConverter.java:506)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.createPartValue(FetchOperator.java:315)
> {code}





[jira] [Commented] (HIVE-7889) Query fails with char partition column

2014-08-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111628#comment-14111628
 ] 

Szehon Ho commented on HIVE-7889:
-

Hey Mohit, thanks for the fix and the test case. But I am wondering: shouldn't 
the method still do a set? Right now it is changed to just return the object.

> Query fails with char partition column
> --
>
> Key: HIVE-7889
> URL: https://issues.apache.org/jira/browse/HIVE-7889
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-7889.patch
>
>
> For a char partition column, JavaHiveCharObjectInspector attempts to cast 
> HiveCharWritable to HiveChar:
> {code}
> create table partition_char_1 (key string, value char(20)) partitioned by (dt 
> char(10), region int);
> insert overwrite table partition_char_1 partition(dt='2000-01-01', region=1)
>   select * from src tablesample (10 rows);
> select * from partition_char_1 limit 1;
> java.sql.SQLException: Error while compiling statement: FAILED: 
> RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: Failed 
> with exception org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be 
> cast to 
> org.apache.hadoop.hive.common.type.HiveCharjava.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveChar
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveCharObjectInspector.set(JavaHiveCharObjectInspector.java:67)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$HiveCharConverter.convert(PrimitiveObjectInspectorConverter.java:506)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.createPartValue(FetchOperator.java:315)
> {code}





[jira] [Updated] (HIVE-7889) Query fails with char partition column

2014-08-26 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-7889:
--

Status: Patch Available  (was: Open)

> Query fails with char partition column
> --
>
> Key: HIVE-7889
> URL: https://issues.apache.org/jira/browse/HIVE-7889
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-7889.patch
>
>
> For a char partition column, JavaHiveCharObjectInspector attempts to cast 
> HiveCharWritable to HiveChar:
> {code}
> create table partition_char_1 (key string, value char(20)) partitioned by (dt 
> char(10), region int);
> insert overwrite table partition_char_1 partition(dt='2000-01-01', region=1)
>   select * from src tablesample (10 rows);
> select * from partition_char_1 limit 1;
> java.sql.SQLException: Error while compiling statement: FAILED: 
> RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: Failed 
> with exception org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be 
> cast to 
> org.apache.hadoop.hive.common.type.HiveCharjava.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveChar
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveCharObjectInspector.set(JavaHiveCharObjectInspector.java:67)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$HiveCharConverter.convert(PrimitiveObjectInspectorConverter.java:506)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.createPartValue(FetchOperator.java:315)
> {code}





[jira] [Commented] (HIVE-7887) VectorFileSinkOp does not publish the stats correctly

2014-08-26 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111625#comment-14111625
 ] 

Gunther Hagleitner commented on HIVE-7887:
--

+1

> VectorFileSinkOp does not publish the stats correctly
> -
>
> Key: HIVE-7887
> URL: https://issues.apache.org/jira/browse/HIVE-7887
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-7887.1.patch
>
>
> VectorFSOp inherits from FSOp, but the stats collection code in its 
> processOp() is out of date. It needs to be updated to stay in sync with FSOp.





[jira] [Updated] (HIVE-7889) Query fails with char partition column

2014-08-26 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-7889:
--

Attachment: HIVE-7889.patch

> Query fails with char partition column
> --
>
> Key: HIVE-7889
> URL: https://issues.apache.org/jira/browse/HIVE-7889
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-7889.patch
>
>
> For a char partition column, JavaHiveCharObjectInspector attempts to cast 
> HiveCharWritable to HiveChar:
> {code}
> create table partition_char_1 (key string, value char(20)) partitioned by (dt 
> char(10), region int);
> insert overwrite table partition_char_1 partition(dt='2000-01-01', region=1)
>   select * from src tablesample (10 rows);
> select * from partition_char_1 limit 1;
> java.sql.SQLException: Error while compiling statement: FAILED: 
> RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: Failed 
> with exception org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be 
> cast to 
> org.apache.hadoop.hive.common.type.HiveCharjava.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveChar
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveCharObjectInspector.set(JavaHiveCharObjectInspector.java:67)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$HiveCharConverter.convert(PrimitiveObjectInspectorConverter.java:506)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.createPartValue(FetchOperator.java:315)
> {code}



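Editor's note: the failure above is the classic Writable-wrapper-versus-value cast mismatch. The sketch below illustrates the pattern with simplified stand-in classes (these are illustrative names, not the real Hive types): the buggy path casts the incoming object straight to the value type and throws when handed the Writable wrapper, while the fixed path unwraps first, as HIVE-6642 did for varchar.

```java
// Minimal sketch of the HIVE-7889 cast failure, using hypothetical stand-ins
// for HiveChar / HiveCharWritable (not the real Hive classes).
public class CharCastSketch {
    static class HiveCharValue {
        final String value;
        HiveCharValue(String v) { value = v; }
    }
    static class HiveCharWritableStandIn {
        final HiveCharValue wrapped;
        HiveCharWritableStandIn(HiveCharValue v) { wrapped = v; }
        HiveCharValue getHiveChar() { return wrapped; }
    }

    // Buggy version: mirrors the unconditional cast, which throws
    // ClassCastException when the partition value arrives as a Writable.
    static HiveCharValue setBuggy(Object o) {
        return (HiveCharValue) o;
    }

    // Fixed version: unwrap the Writable before using the value.
    static HiveCharValue setFixed(Object o) {
        if (o instanceof HiveCharWritableStandIn) {
            return ((HiveCharWritableStandIn) o).getHiveChar();
        }
        return (HiveCharValue) o;
    }

    public static void main(String[] args) {
        Object partValue = new HiveCharWritableStandIn(new HiveCharValue("2000-01-01"));
        boolean threw = false;
        try {
            setBuggy(partValue);
        } catch (ClassCastException e) {
            threw = true; // the failure reported in the JIRA
        }
        System.out.println("buggy threw: " + threw);
        System.out.println("fixed value: " + setFixed(partValue).value);
    }
}
```

The same unwrap-before-cast shape applies wherever an ObjectInspector may receive either the lazy Writable form or the materialized value.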


Review Request 25086: HIVE-7889 : Query fails with char partition column

2014-08-26 Thread Mohit Sabharwal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25086/
---

Review request for hive.


Bugs: HIVE-7889
https://issues.apache.org/jira/browse/HIVE-7889


Repository: hive-git


Description
---

For a char partition column, JavaHiveCharObjectInspector attempts to cast 
HiveCharWritable to HiveChar.

Similar issue for Varchar was fixed in HIVE-6642.


Diffs
-

  ql/src/test/queries/clientpositive/partition_char.q PRE-CREATION 
  ql/src/test/results/clientpositive/partition_char.q.out PRE-CREATION 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaHiveCharObjectInspector.java
 ff114c04f396fa3b51aa6c065ae019dac2db3a81 

Diff: https://reviews.apache.org/r/25086/diff/


Testing
---

Added q-test


Thanks,

Mohit Sabharwal



[jira] [Resolved] (HIVE-7888) fix dynpart_sort_opt_vectorization.q test failure in trunk

2014-08-26 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J resolved HIVE-7888.
--

Resolution: Duplicate

Duplicate of HIVE-7557

> fix dynpart_sort_opt_vectorization.q test failure in trunk
> --
>
> Key: HIVE-7888
> URL: https://issues.apache.org/jira/browse/HIVE-7888
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth J
>
> Test is part of TestMiniTezCliDriver
> Test fails with the following exception
> {code}
> [Error getting row data with exception java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
>  ]
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:188)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing vector batch (tag=0) [Error getting row data with exception 
> java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> {code}



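Editor's note: the ClassCastException in the trace above comes from a vectorized expression writer that was bound to one column-vector type at plan time while the batch produced at runtime carried another. A compact sketch of that mismatch, with simplified stand-ins for Hive's ColumnVector hierarchy (illustrative names only):

```java
// Sketch of the HIVE-7888 failure mode: a writer built for long columns
// unconditionally casts, and fails when handed a double column.
public class VectorMismatchSketch {
    static abstract class ColumnVector {}
    static class LongColumnVector extends ColumnVector {
        long[] vector = new long[] { 42L };
    }
    static class DoubleColumnVector extends ColumnVector {
        double[] vector = new double[] { 4.2d };
    }

    // Mirrors the shape of VectorExpressionWriterLong.writeValue: the cast
    // assumes plan-time typing matched the batch actually produced.
    static long writeValueLong(ColumnVector cv, int row) {
        return ((LongColumnVector) cv).vector[row]; // CCE if cv holds doubles
    }

    public static void main(String[] args) {
        ColumnVector doubles = new DoubleColumnVector();
        try {
            writeValueLong(doubles, 0);
        } catch (ClassCastException e) {
            // The dynpart_sort_opt_vectorization.q run hit exactly this:
            // the reducer's declared column type disagreed with the batch.
            System.out.println("mismatch detected");
        }
    }
}
```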


[jira] [Commented] (HIVE-7353) HiveServer2 using embedded MetaStore leaks JDOPersistanceManager

2014-08-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111616#comment-14111616
 ] 

Hive QA commented on HIVE-7353:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12664493/HIVE-7353.8.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6115 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.metastore.TestMetastoreVersion.testVersionMisMatch
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/513/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/513/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-513/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12664493

> HiveServer2 using embedded MetaStore leaks JDOPersistanceManager
> 
>
> Key: HIVE-7353
> URL: https://issues.apache.org/jira/browse/HIVE-7353
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-7353.1.patch, HIVE-7353.2.patch, HIVE-7353.3.patch, 
> HIVE-7353.4.patch, HIVE-7353.5.patch, HIVE-7353.6.patch, HIVE-7353.7.patch, 
> HIVE-7353.8.patch
>
>
> While using an embedded metastore, HiveServer2 creates background threads to 
> run async operations, and each ends up creating a new instance of 
> JDOPersistanceManager, which is cached in JDOPersistanceManagerFactory. Even 
> when a background thread is killed by the thread pool manager, its 
> JDOPersistanceManager is never GCed because it is still cached by 
> JDOPersistanceManagerFactory.



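Editor's note: the leak described above is a general pattern — a factory caches one manager per thread, so a pooled worker that exits without an explicit close leaves its manager strongly referenced by the cache forever. A minimal sketch of that pattern (the class and method names here are illustrative, not the DataNucleus API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the HIVE-7353 leak: a per-thread cache of managers pins each
// entry even after its owning thread has died.
public class PmLeakSketch {
    static class PersistenceManager {}

    static final Map<Thread, PersistenceManager> CACHE = new ConcurrentHashMap<>();

    static PersistenceManager getPm() {
        return CACHE.computeIfAbsent(Thread.currentThread(), t -> new PersistenceManager());
    }

    // The fix direction: the owning code must release the entry before the
    // worker thread exits, rather than relying on GC.
    static void closePm() {
        CACHE.remove(Thread.currentThread());
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(PmLeakSketch::getPm);
        worker.start();
        worker.join();
        // The worker is dead, but its manager is still strongly referenced
        // through the cache, so it can never be collected.
        System.out.println("leaked entries: " + CACHE.size());
    }
}
```

A dead Thread key does not help here: a plain map holds strong references, so neither the key nor the cached manager becomes collectable until something removes the entry.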


Hive-branch-0.12-hadoop1 - Build # 44 - Still Failing

2014-08-26 Thread Apache Jenkins Server
Changes for Build #35

Changes for Build #36

Changes for Build #37

Changes for Build #38

Changes for Build #39

Changes for Build #40

Changes for Build #41

Changes for Build #42

Changes for Build #43
[daijy] PIG-4119: Add message at end of each testcase with timestamp in Pig 
system tests


Changes for Build #44



No tests ran.

The Apache Jenkins build system has built Hive-branch-0.12-hadoop1 (build #44)

Status: Still Failing

Check console output at 
https://builds.apache.org/job/Hive-branch-0.12-hadoop1/44/ to view the results.

[jira] [Updated] (HIVE-7888) fix dynpart_sort_opt_vectorization.q test failure in trunk

2014-08-26 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7888:
-

Attachment: (was: HIVE-7887.1.patch)

> fix dynpart_sort_opt_vectorization.q test failure in trunk
> --
>
> Key: HIVE-7888
> URL: https://issues.apache.org/jira/browse/HIVE-7888
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth J
>
> Test is part of TestMiniTezCliDriver
> Test fails with the following exception
> {code}
> [Error getting row data with exception java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
>  ]
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:188)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing vector batch (tag=0) [Error getting row data with exception 
> java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> {code}





[jira] [Updated] (HIVE-7888) fix dynpart_sort_opt_vectorization.q test failure in trunk

2014-08-26 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7888:
-

Status: Patch Available  (was: Open)

> fix dynpart_sort_opt_vectorization.q test failure in trunk
> --
>
> Key: HIVE-7888
> URL: https://issues.apache.org/jira/browse/HIVE-7888
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth J
>
> Test is part of TestMiniTezCliDriver
> Test fails with the following exception
> {code}
> [Error getting row data with exception java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
>  ]
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:188)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing vector batch (tag=0) [Error getting row data with exception 
> java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> {code}





[jira] [Updated] (HIVE-7888) fix dynpart_sort_opt_vectorization.q test failure in trunk

2014-08-26 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7888:
-

Status: Open  (was: Patch Available)

Wrong patch uploaded; the attachment was HIVE-7887's.


> fix dynpart_sort_opt_vectorization.q test failure in trunk
> --
>
> Key: HIVE-7888
> URL: https://issues.apache.org/jira/browse/HIVE-7888
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth J
>
> Test is part of TestMiniTezCliDriver
> Test fails with the following exception
> {code}
> [Error getting row data with exception java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
>  ]
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:188)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing vector batch (tag=0) [Error getting row data with exception 
> java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> {code}





[jira] [Updated] (HIVE-7887) VectorFileSinkOp does not publish the stats correctly

2014-08-26 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7887:
-

Status: Patch Available  (was: Open)

> VectorFileSinkOp does not publish the stats correctly
> -
>
> Key: HIVE-7887
> URL: https://issues.apache.org/jira/browse/HIVE-7887
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-7887.1.patch
>
>
> VectorFSOp inherits FSOp, but the stats collection code in processOp() is 
> out-of-date. Needs to be updated to be in sync with FSOp.





[jira] [Updated] (HIVE-7888) fix dynpart_sort_opt_vectorization.q test failure in trunk

2014-08-26 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7888:
-

Attachment: HIVE-7887.1.patch

> fix dynpart_sort_opt_vectorization.q test failure in trunk
> --
>
> Key: HIVE-7888
> URL: https://issues.apache.org/jira/browse/HIVE-7888
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth J
>
> Test is part of TestMiniTezCliDriver
> Test fails with the following exception
> {code}
> [Error getting row data with exception java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
>  ]
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:188)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing vector batch (tag=0) [Error getting row data with exception 
> java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
> at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> {code}





[jira] [Updated] (HIVE-7887) VectorFileSinkOp does not publish the stats correctly

2014-08-26 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7887:
-

Attachment: HIVE-7887.1.patch

> VectorFileSinkOp does not publish the stats correctly
> -
>
> Key: HIVE-7887
> URL: https://issues.apache.org/jira/browse/HIVE-7887
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-7887.1.patch
>
>
> VectorFSOp inherits FSOp, but the stats collection code in processOp() is 
> out-of-date. Needs to be updated to be in sync with FSOp.





[jira] [Updated] (HIVE-7888) fix dynpart_sort_opt_vectorization.q test failure in trunk

2014-08-26 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7888:
-

Description: 
Test is part of TestMiniTezCliDriver
Test fails with the following exception
{code}
[Error getting row data with exception java.lang.ClassCastException: 
org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
at 
org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at 
org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
 ]
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:188)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
at 
org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at 
org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing vector batch (tag=0) [Error getting row data with exception 
java.lang.ClassCastException: 
org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
at 
org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)

{code}

  was:
Test fails with the following exception
{code}
[Error getting row data with exception java.lang.ClassCastException: 
org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
at 
org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
at java.security.AccessController.doPrivileged(Native Method)
a

[jira] [Created] (HIVE-7888) fix dynpart_sort_opt_vectorization.q test failure in trunk

2014-08-26 Thread Prasanth J (JIRA)
Prasanth J created HIVE-7888:


 Summary: fix dynpart_sort_opt_vectorization.q test failure in trunk
 Key: HIVE-7888
 URL: https://issues.apache.org/jira/browse/HIVE-7888
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth J


Test fails with the following exception
{code}
[Error getting row data with exception java.lang.ClassCastException: 
org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
at 
org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at 
org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
 ]
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:188)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
at 
org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at 
org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:553)
Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing vector batch (tag=0) [Error getting row data with exception 
java.lang.ClassCastException: 
org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:155)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
at 
org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:564)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)

{code}
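For context, the ClassCastException above is the generic pattern of a vectorized writer assuming one column-vector subtype while the batch actually holds another. A minimal stand-alone sketch (hypothetical stand-in classes, not the real Hive ones):

```java
// Hypothetical stand-ins for Hive's column-vector hierarchy, for illustration only.
class ColumnVector {}
class LongColumnVector extends ColumnVector {}
class DoubleColumnVector extends ColumnVector {}

public class CastDemo {
    // Mimics a writer generated for a long column receiving a double column:
    // the downcast fails at runtime, as in the stack trace above.
    static String writeValue(ColumnVector col) {
        try {
            LongColumnVector longCol = (LongColumnVector) col;
            return "ok";
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(writeValue(new DoubleColumnVector())); // ClassCastException
        System.out.println(writeValue(new LongColumnVector()));   // ok
    }
}
```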



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7889) Query fails with char partition column

2014-08-26 Thread Mohit Sabharwal (JIRA)
Mohit Sabharwal created HIVE-7889:
-

 Summary: Query fails with char partition column
 Key: HIVE-7889
 URL: https://issues.apache.org/jira/browse/HIVE-7889
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Mohit Sabharwal
Assignee: Mohit Sabharwal


For a char partition column, JavaHiveCharObjectInspector attempts to cast 
HiveCharWritable to HiveChar:

{code}
create table partition_char_1 (key string, value char(20)) partitioned by (dt 
char(10), region int);

insert overwrite table partition_char_1 partition(dt='2000-01-01', region=1)
  select * from src tablesample (10 rows);

select * from partition_char_1 limit 1;

java.sql.SQLException: Error while compiling statement: FAILED: 
RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: Failed with 
exception org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be cast to 
org.apache.hadoop.hive.common.type.HiveCharjava.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be cast to 
org.apache.hadoop.hive.common.type.HiveChar
at 
org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveCharObjectInspector.set(JavaHiveCharObjectInspector.java:67)
at 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$HiveCharConverter.convert(PrimitiveObjectInspectorConverter.java:506)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.createPartValue(FetchOperator.java:315)
{code}
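The mismatch can be illustrated with a minimal stand-alone sketch (hypothetical stand-in classes, not the real Hive serde types): the serde hands over the Writable wrapper, while a "Java" object inspector expects the plain Java value type.

```java
// Hypothetical stand-ins, for illustration only: Hive keeps a Writable wrapper
// around the Java value type, and a "Java" object inspector expects the latter.
class HiveChar {
    final String value;
    HiveChar(String v) { value = v; }
}

class HiveCharWritable {
    final HiveChar wrapped;
    HiveCharWritable(HiveChar c) { wrapped = c; }
}

public class CharInspectorDemo {
    // Mimics an inspector method that assumes the plain Java type:
    // handing it the Writable wrapper fails at runtime.
    static String inspect(Object o) {
        try {
            HiveChar c = (HiveChar) o;
            return c.value;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(inspect(new HiveChar("2000-01-01")));
        System.out.println(inspect(new HiveCharWritable(new HiveChar("2000-01-01"))));
    }
}
```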







[jira] [Commented] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-08-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111607#comment-14111607
 ] 

Ashutosh Chauhan commented on HIVE-7405:


Mostly looks good. Some minor comments on RB.

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7405.1.patch, HIVE-7405.10.patch, 
> HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, 
> HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, 
> HIVE-7405.A.patch
>
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each 
> input VectorizedRowBatch has only values for one key at a time.  Thus, the 
> values in the batch can be aggregated quickly.
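The single-key-per-batch property described above can be sketched as follows (a hypothetical illustration, not the VectorGroupByOperator code): because the whole batch belongs to one grouping key, aggregation degenerates to a tight loop with no per-row key comparison or hash lookup.

```java
public class SingleKeyBatchAgg {
    // Hypothetical sketch: when a batch is known to contain values for exactly
    // one grouping key, a sum aggregate is a single loop over the batch.
    static long sumSingleKeyBatch(long[] values, int size) {
        long sum = 0;
        for (int i = 0; i < size; i++) {
            sum += values[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] batch = {1, 2, 3, 4};
        System.out.println(sumSingleKeyBatch(batch, batch.length)); // 10
    }
}
```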





[jira] [Commented] (HIVE-7604) Add Metastore API to fetch one or more partition names

2014-08-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111595#comment-14111595
 ] 

Ashutosh Chauhan commented on HIVE-7604:


+1 Proposed api LGTM

> Add Metastore API to fetch one or more partition names
> --
>
> Key: HIVE-7604
> URL: https://issues.apache.org/jira/browse/HIVE-7604
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 0.14.0
>
> Attachments: Design_HIVE_7604.1.txt, Design_HIVE_7604.txt
>
>
> We need a new API in Metastore to address the following use cases. Both use 
> cases arise from having tables with hundreds of thousands or in some cases 
> millions of partitions.
> 1. It should be quick and easy to obtain the distinct values of a partition 
> key, e.g. all dates for which partitions are available. Tools/frameworks can 
> use this programmatically to find gaps in partitions before reprocessing 
> them. Currently one has to run Hive queries (JDBC or CLI) to obtain this 
> information, which is unfriendly and heavyweight; for tables with a large 
> number of partitions, the queries take a long time to run and require a 
> large heap.
> 2. Typically users want the list of available partitions and run queries 
> that involve only partition keys (select distinct partkey1 from table), or 
> fetch the latest date partition of a dimension table to join against a fact 
> table (select * from fact_table join select max(dt) from dimension_table). 
> Such metadata-only queries can be pushed down to the metastore and need not 
> run in Hive at all. If they can be converted into database queries, clients 
> can be lightweight and need not fetch all partition names; results come back 
> much faster with fewer resources.





[jira] [Updated] (HIVE-7835) [CBO] Handle a case where FieldTrimmer trims all fields from input

2014-08-26 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7835:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to cbo branch.

> [CBO] Handle a case where FieldTrimmer trims all fields from input
> -
>
> Key: HIVE-7835
> URL: https://issues.apache.org/jira/browse/HIVE-7835
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: h-7835.patch
>
>
> E.g. in queries like "select 1 from t1", where nothing needs to be projected, 
> we generate an empty project list, which results in an incorrect AST. 
> Currently we fix this while generating the AST by eliminating such AST nodes, 
> but it is better to generate a correct project list to begin with.





[jira] [Updated] (HIVE-7604) Add Metastore API to fetch one or more partition names

2014-08-26 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-7604:
---

Attachment: Design_HIVE_7604.1.txt

Thanks [~ashutoshc], uploading a revised document with additional information on 
return values. Let me know if it's unclear.

> Add Metastore API to fetch one or more partition names
> --
>
> Key: HIVE-7604
> URL: https://issues.apache.org/jira/browse/HIVE-7604
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 0.14.0
>
> Attachments: Design_HIVE_7604.1.txt, Design_HIVE_7604.txt
>
>
> We need a new API in Metastore to address the following use cases. Both use 
> cases arise from having tables with hundreds of thousands or in some cases 
> millions of partitions.
> 1. It should be quick and easy to obtain the distinct values of a partition 
> key, e.g. all dates for which partitions are available. Tools/frameworks can 
> use this programmatically to find gaps in partitions before reprocessing 
> them. Currently one has to run Hive queries (JDBC or CLI) to obtain this 
> information, which is unfriendly and heavyweight; for tables with a large 
> number of partitions, the queries take a long time to run and require a 
> large heap.
> 2. Typically users want the list of available partitions and run queries 
> that involve only partition keys (select distinct partkey1 from table), or 
> fetch the latest date partition of a dimension table to join against a fact 
> table (select * from fact_table join select max(dt) from dimension_table). 
> Such metadata-only queries can be pushed down to the metastore and need not 
> run in Hive at all. If they can be converted into database queries, clients 
> can be lightweight and need not fetch all partition names; results come back 
> much faster with fewer resources.





Re: Review Request 25083: Handle a case where FieldTrimmer trims all fields from input

2014-08-26 Thread Harish Butani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25083/#review51611
---

Ship it!


Ship It!

- Harish Butani


On Aug. 26, 2014, 10:43 p.m., Ashutosh Chauhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25083/
> ---
> 
> (Updated Aug. 26, 2014, 10:43 p.m.)
> 
> 
> Review request for hive and Harish Butani.
> 
> 
> Bugs: HIVE-7835
> https://issues.apache.org/jira/browse/HIVE-7835
> 
> 
> Repository: hive
> 
> 
> Description
> ---
> 
> Handle a case where FieldTrimmer trims all fields from input
> 
> 
> Diffs
> -
> 
>   
> branches/cbo/ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/rules/HiveRelFieldTrimmer.java
>  1620376 
>   
> branches/cbo/ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/ASTConverter.java
>  1620376 
> 
> Diff: https://reviews.apache.org/r/25083/diff/
> 
> 
> Testing
> ---
> 
> Existing test in cbo_correctness.q select null from t3;
> 
> 
> Thanks,
> 
> Ashutosh Chauhan
> 
>



[jira] [Commented] (HIVE-7835) [CBO] Handle a case where FieldTrimmer trims all fields from input

2014-08-26 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111559#comment-14111559
 ] 

Harish Butani commented on HIVE-7835:
-

+1

> [CBO] Handle a case where FieldTrimmer trims all fields from input
> -
>
> Key: HIVE-7835
> URL: https://issues.apache.org/jira/browse/HIVE-7835
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: h-7835.patch
>
>
> E.g. in queries like "select 1 from t1", where nothing needs to be projected, 
> we generate an empty project list, which results in an incorrect AST. 
> Currently we fix this while generating the AST by eliminating such AST nodes, 
> but it is better to generate a correct project list to begin with.





Re: Review Request 24688: parallel order by clause on a string column fails with IOException: Split points are out of order

2014-08-26 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24688/#review51610
---


Ah, I got it: you decrease the old 'stepSize' when the value is still the same, 
so that you can increase the chance of finding a higher value. Mostly looks 
good; I just had that one comment from last time.


ql/src/java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java


Can we add some context, like "Sampled partition key: current..."?


- Szehon Ho


On Aug. 26, 2014, 3:51 a.m., Navis Ryu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24688/
> ---
> 
> (Updated Aug. 26, 2014, 3:51 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-7669
> https://issues.apache.org/jira/browse/HIVE-7669
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The source table has 600 million rows and a String column "l_shipinstruct" 
> with 4 unique values (i.e., these 4 values are repeated across the 600 
> million rows).
> 
> We sort on this string column "l_shipinstruct" as shown in the HiveQL below, 
> with the following parameters. 
> {code:sql}
> set hive.optimize.sampling.orderby=true;
> set hive.optimize.sampling.orderby.number=1000;
> set hive.optimize.sampling.orderby.percent=0.1f;
> 
> insert overwrite table lineitem_temp_report 
> select 
>   l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, 
> l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, 
> l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment
> from 
>   lineitem
> order by l_shipinstruct;
> {code}
> Stack Trace
> Diagnostic Messages for this Task:
> {noformat}
> Error: java.lang.RuntimeException: Error in configuring object
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at 
> org.apache.hadoop.mapred.MapTask$OldOutputCollector.(MapTask.java:569)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
> ... 10 more
> Caused by: java.lang.IllegalArgumentException: Can't read partitions file
> at 
> org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:116)
> at 
> org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:42)
> at 
> org.apache.hadoop.hive.ql.exec.HiveTotalOrderPartitioner.configure(HiveTotalOrderPartitioner.java:37)
> ... 15 more
> Caused by: java.io.IOException: Split points are out of order
> at 
> org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:96)
> ... 17 more
> {noformat}
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7f4afd9 
>   common/src/java/org/apache/hadoop/hive/conf/Validator.java cea9c41 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/HiveTotalOrderPartitioner.java 
> 6c22362 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java 166461a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java ef72039 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestPartitionKeySampler.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24688/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Navis Ryu
> 
>
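For background on the failure quoted above: TotalOrderPartitioner requires its split points to be strictly increasing, and sampling a column with only 4 distinct values almost inevitably yields duplicates. A hypothetical stand-alone sketch of that invariant (not the actual Hadoop code):

```java
import java.util.Arrays;
import java.util.List;

public class SplitPointCheck {
    // Split points must be strictly increasing; duplicates (common when the
    // sort column has very few distinct values) violate the invariant and
    // trigger "Split points are out of order".
    static boolean splitPointsValid(List<String> splitPoints) {
        for (int i = 1; i < splitPoints.size(); i++) {
            if (splitPoints.get(i).compareTo(splitPoints.get(i - 1)) <= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Illustrative values only: a low-cardinality column sampled many times.
        System.out.println(splitPointsValid(Arrays.asList("a", "a", "b"))); // duplicate -> false
        System.out.println(splitPointsValid(Arrays.asList("a", "b", "c"))); // strictly increasing -> true
    }
}
```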



[jira] [Updated] (HIVE-7826) Dynamic partition pruning on Tez

2014-08-26 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7826:
-

Attachment: HIVE-7826.4.patch

.4 fixes a small issue with stats annotation for event operators.

> Dynamic partition pruning on Tez
> 
>
> Key: HIVE-7826
> URL: https://issues.apache.org/jira/browse/HIVE-7826
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
>  Labels: TODOC14, tez
> Attachments: HIVE-7826.1.patch, HIVE-7826.2.patch, HIVE-7826.3.patch, 
> HIVE-7826.4.patch
>
>
> It's natural in a star schema to map one or more dimensions to partition 
> columns. Time or location are likely candidates. 
> It can also be useful to compute the partitions one would like to scan via 
> a subquery (where p in select ... from ...).
> The resulting joins in Hive require a full table scan of the large table, 
> though, because partition pruning takes place before the corresponding values 
> are known.
> On Tez it's relatively straightforward to send the values needed for pruning 
> to the application master, where splits are generated and tasks are 
> submitted. Using these values we can strip out any unneeded partitions 
> dynamically, while the query is running.
> The approach is straightforward:
> - Insert a synthetic condition for each join representing "x in (keys of the 
> other side of the join)"
> - These conditions are pushed down as far as possible
> - If the condition hits a table scan and the column involved is a partition 
> column:
>- Set up an operator to send key events to the AM
> - else:
>- Remove the synthetic predicate
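The branching at the end of the description can be sketched as a tiny decision function (a hypothetical illustration of the described approach, not Hive's actual code):

```java
public class PruningDecision {
    // Hypothetical sketch: a synthetic "x in (keys of other side)" condition
    // is pushed down; what happens next depends on where it lands.
    static String handleSyntheticCondition(boolean hitsTableScan, boolean onPartitionColumn) {
        if (hitsTableScan && onPartitionColumn) {
            // The AM can then strip unneeded partitions before generating splits.
            return "send key events to AM";
        }
        return "remove synthetic predicate";
    }

    public static void main(String[] args) {
        System.out.println(handleSyntheticCondition(true, true));
        System.out.println(handleSyntheticCondition(true, false));
    }
}
```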





[jira] [Created] (HIVE-7887) VectorFileSinkOp does not publish the stats correctly

2014-08-26 Thread Prasanth J (JIRA)
Prasanth J created HIVE-7887:


 Summary: VectorFileSinkOp does not publish the stats correctly
 Key: HIVE-7887
 URL: https://issues.apache.org/jira/browse/HIVE-7887
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J


VectorFSOp inherits from FSOp, but the stats collection code in its processOp() 
is out of date. It needs to be updated to stay in sync with FSOp.





[jira] [Commented] (HIVE-7222) Support timestamp column statistics in ORC and extend PPD for timestamp

2014-08-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111506#comment-14111506
 ] 

Hive QA commented on HIVE-7222:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12664480/HIVE-7222.3.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6116 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge_incompat1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge_incompat2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/512/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/512/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-512/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12664480

> Support timestamp column statistics in ORC and extend PPD for timestamp
> ---
>
> Key: HIVE-7222
> URL: https://issues.apache.org/jira/browse/HIVE-7222
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Daniel Dai
>  Labels: orcfile
> Attachments: HIVE-7222-1.patch, HIVE-7222.1.patch, HIVE-7222.2.patch, 
> HIVE-7222.3.patch
>
>
> Add column statistics for timestamp columns in ORC. Also extend predicate 
> pushdown to support timestamp column evaluation.
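The core of min/max-statistics-based predicate pushdown can be sketched as follows (a hypothetical illustration under the assumption that timestamps are modeled as epoch millis; not the ORC implementation): a row group whose [min, max] range cannot contain the predicate value can be skipped without reading its data.

```java
public class TimestampPpd {
    // Hypothetical sketch: skip a row group when an equality predicate's
    // value falls outside the group's recorded [min, max] statistics.
    static boolean canSkip(long statsMin, long statsMax, long predicateValue) {
        return predicateValue < statsMin || predicateValue > statsMax;
    }

    public static void main(String[] args) {
        System.out.println(canSkip(1_000L, 2_000L, 500L));   // outside range -> skip
        System.out.println(canSkip(1_000L, 2_000L, 1_500L)); // inside range  -> must read
    }
}
```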





[jira] [Updated] (HIVE-6847) Improve / fix bugs in Hive scratch dir setup

2014-08-26 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6847:
---

Attachment: HIVE-6847.6.patch

> Improve / fix bugs in Hive scratch dir setup
> 
>
> Key: HIVE-6847
> URL: https://issues.apache.org/jira/browse/HIVE-6847
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, HiveServer2
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-6847.1.patch, HIVE-6847.2.patch, HIVE-6847.3.patch, 
> HIVE-6847.4.patch, HIVE-6847.5.patch, HIVE-6847.6.patch
>
>
> Currently, the Hive server creates the scratch directory and changes its 
> permissions to 777; this is not great with respect to security. We need to 
> create user-specific scratch directories instead. Also refer to the 1st 
> iteration of the patch on HIVE-6782 for the approach.
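The user-specific direction described above can be sketched with standard java.nio (a hypothetical illustration, not the actual patch): one scratch directory per user with owner-only permissions, instead of a single shared directory chmod'ed to 777.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

public class ScratchDirDemo {
    // Hypothetical sketch: create a per-user scratch directory that only the
    // owning user can read, write, or traverse (rwx------).
    static Path createUserScratchDir(Path root, String user) throws IOException {
        Path dir = root.resolve(user);
        return Files.createDirectories(dir,
                PosixFilePermissions.asFileAttribute(
                        PosixFilePermissions.fromString("rwx------")));
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("scratch");
        Path dir = createUserScratchDir(root, "alice");
        System.out.println(Files.getPosixFilePermissions(dir));
    }
}
```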





[jira] [Updated] (HIVE-6847) Improve / fix bugs in Hive scratch dir setup

2014-08-26 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6847:
---

Status: Open  (was: Patch Available)

> Improve / fix bugs in Hive scratch dir setup
> 
>
> Key: HIVE-6847
> URL: https://issues.apache.org/jira/browse/HIVE-6847
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, HiveServer2
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-6847.1.patch, HIVE-6847.2.patch, HIVE-6847.3.patch, 
> HIVE-6847.4.patch, HIVE-6847.5.patch, HIVE-6847.6.patch
>
>
> Currently, the Hive server creates the scratch directory and changes its 
> permissions to 777; this is not great with respect to security. We need to 
> create user-specific scratch directories instead. Also refer to the 1st 
> iteration of the patch on HIVE-6782 for the approach.





[jira] [Updated] (HIVE-6847) Improve / fix bugs in Hive scratch dir setup

2014-08-26 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6847:
---

Status: Patch Available  (was: Open)

> Improve / fix bugs in Hive scratch dir setup
> 
>
> Key: HIVE-6847
> URL: https://issues.apache.org/jira/browse/HIVE-6847
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, HiveServer2
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-6847.1.patch, HIVE-6847.2.patch, HIVE-6847.3.patch, 
> HIVE-6847.4.patch, HIVE-6847.5.patch, HIVE-6847.6.patch
>
>
> Currently, the Hive server creates the scratch directory and changes its 
> permissions to 777; this is not great with respect to security. We need to 
> create user-specific scratch directories instead. Also refer to the 1st 
> iteration of the patch on HIVE-6782 for the approach.





[jira] [Commented] (HIVE-7846) authorization api should support group, not assume case insensitive role names

2014-08-26 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111498#comment-14111498
 ] 

Thejas M Nair commented on HIVE-7846:
-

Note that this change restores the case-sensitive role name behavior for the 
default authorization mode (as was the case in Hive 0.12). 

> authorization api should support group, not assume case insensitive role names
> --
>
> Key: HIVE-7846
> URL: https://issues.apache.org/jira/browse/HIVE-7846
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-7846.1.patch
>
>
> The case insensitive behavior of roles should be specific to sql standard 
> authorization.
> Group type for principal also should be disabled at the sql std authorization 
> layer, instead of disallowing it at the API level.





Re: Review Request 25083: Handle a case where FieldTrimmer trims all fields from input

2014-08-26 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25083/
---

(Updated Aug. 26, 2014, 10:43 p.m.)


Review request for hive and Harish Butani.


Bugs: HIVE-7835
https://issues.apache.org/jira/browse/HIVE-7835


Repository: hive


Description
---

Handle a case where FieldTrimmer trims all fields from input


Diffs
-

  
branches/cbo/ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/rules/HiveRelFieldTrimmer.java
 1620376 
  
branches/cbo/ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/ASTConverter.java
 1620376 

Diff: https://reviews.apache.org/r/25083/diff/


Testing
---

Existing test in cbo_correctness.q select null from t3;


Thanks,

Ashutosh Chauhan



[jira] [Updated] (HIVE-7835) [CBO] Handle a case where FieldTrimmer trims all fields from input

2014-08-26 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7835:
---

Attachment: h-7835.patch

> [CBO] Handle a case where FieldTrimmer trims all fields from input
> -
>
> Key: HIVE-7835
> URL: https://issues.apache.org/jira/browse/HIVE-7835
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: h-7835.patch
>
>
> E.g. in queries like "select 1 from t1", where nothing needs to be projected, 
> we generate an empty project list, which results in an incorrect AST. 
> Currently we fix this while generating the AST by eliminating such AST nodes, 
> but it is better to generate a correct project list to begin with.





Review Request 25083: Handle a case where FieldTrimmer trims all fields from input

2014-08-26 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25083/
---

Review request for hive and Harish Butani.


Bugs: HIVE-7835
https://issues.apache.org/jira/browse/HIVE-7835


Repository: hive


Description
---

Handle a case where FieldTrimmer trims all fields from input


Diffs
-

  
branches/cbo/ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/rules/HiveRelFieldTrimmer.java
 1620376 
  
branches/cbo/ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/ASTConverter.java
 1620376 

Diff: https://reviews.apache.org/r/25083/diff/


Testing
---

Existing test in cbo_correctness.q select null from t3;


Thanks,

Ashutosh Chauhan



[jira] [Updated] (HIVE-7835) [CBO] Handle a case where FieldTrimmer trims all fields from input

2014-08-26 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7835:
---

Status: Patch Available  (was: Open)

> [CBO] Handle a case where FieldTrimmer trims all fields from input
> -
>
> Key: HIVE-7835
> URL: https://issues.apache.org/jira/browse/HIVE-7835
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: h-7835.patch
>
>
> E.g. in queries like "select 1 from t1", where nothing needs to be projected, 
> we generate an empty project list, which results in an incorrect AST. 
> Currently we fix this while generating the AST by eliminating such AST nodes, 
> but it is better to generate a correct project list to begin with.





[jira] [Commented] (HIVE-7604) Add Metastore API to fetch one or more partition names

2014-08-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111480#comment-14111480
 ] 

Ashutosh Chauhan commented on HIVE-7604:


Looks good to me. I couldn't completely understand {{PartitionValuesResponse}}; 
can you add a short description of it to your design doc? 

> Add Metastore API to fetch one or more partition names
> --
>
> Key: HIVE-7604
> URL: https://issues.apache.org/jira/browse/HIVE-7604
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 0.14.0
>
> Attachments: Design_HIVE_7604.txt
>
>
> We need a new API in Metastore to address the following use cases. Both use 
> cases arise from having tables with hundreds of thousands or in some cases 
> millions of partitions.
> 1. It should be quick and easy to obtain the distinct values of a partition 
> key, e.g. all dates for which partitions are available. Tools/frameworks can 
> use this programmatically to find gaps in partitions before reprocessing 
> them. Currently one has to run Hive queries (JDBC or CLI) to obtain this 
> information, which is unfriendly and heavyweight; for tables with a large 
> number of partitions, the queries take a long time to run and require a 
> large heap.
> 2. Typically users want the list of available partitions and run queries 
> that involve only partition keys (select distinct partkey1 from table), or 
> fetch the latest date partition of a dimension table to join against a fact 
> table (select * from fact_table join select max(dt) from dimension_table). 
> Such metadata-only queries can be pushed down to the metastore and need not 
> run in Hive at all. If they can be converted into database queries, clients 
> can be lightweight and need not fetch all partition names; results come back 
> much faster with fewer resources.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7826) Dynamic partition pruning on Tez

2014-08-26 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111462#comment-14111462
 ] 

Gunther Hagleitner commented on HIVE-7826:
--

[~damien.carol] thank you for your interest. This feature is Tez-only right 
now, but if you are using Tez and have a cluster with Tez 0.5 running, you can 
give this a spin. You basically need to use the Apache Tez branch and apply 
this patch. The relevant configs are:

hive.tez.dynamic.partition.pruning=true (turns the feature on or off)
hive.tez.dynamic.partition.pruning.max.event.size=size in bytes (maximum size 
of the event the task will send to the AM; if it's bigger, the feature turns 
itself off)
hive.tez.dynamic.partition.pruning.max.data.size=size in bytes (maximum total 
size of expected output in the planning stage; if the expected size is bigger, 
the feature turns itself off)

Any feedback on functionality and performance is welcome. If you describe your 
use case to me, I will make sure it's covered in the unit tests. If you're 
game: code review is also welcome.

> Dynamic partition pruning on Tez
> 
>
> Key: HIVE-7826
> URL: https://issues.apache.org/jira/browse/HIVE-7826
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
>  Labels: TODOC14, tez
> Attachments: HIVE-7826.1.patch, HIVE-7826.2.patch, HIVE-7826.3.patch
>
>
> It's natural in a star schema to map one or more dimensions to partition 
> columns. Time or location are likely candidates. 
> It can also be useful to compute the partitions one would like to scan via 
> a subquery (where p in select ... from ...).
> The resulting joins in hive require a full table scan of the large table 
> though, because partition pruning takes place before the corresponding values 
> are known.
> On Tez it's relatively straightforward to send the values needed to prune to 
> the application master - where splits are generated and tasks are submitted. 
> Using these values we can strip out any unneeded partitions dynamically, 
> while the query is running.
> The approach is straightforward:
> - Insert a synthetic condition for each join representing "x in (keys of the 
> other side of the join)"
> - These conditions will be pushed as far down as possible
> - If the condition hits a table scan and the column involved is a partition 
> column:
>- Set up an Operator to send key events to the AM
> - else:
>- Remove the synthetic predicate
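The runtime effect of the approach above can be sketched with a toy model (this is purely illustrative and not Hive's implementation; `DynamicPruningSketch` and the sample partition map are invented): the small side of the join reports its join-key values at runtime, and partitions of the big side whose partition-column value is not among them are dropped before splits are generated.

```java
import java.util.*;
import java.util.stream.*;

// Toy model of dynamic partition pruning: keep only partitions whose
// partition-column value was actually observed on the other side of the join.
public class DynamicPruningSketch {
    static List<String> prune(Map<String, String> partitionToValue,  // partition name -> partition column value
                              Set<String> joinKeyValues) {           // values observed on the dimension side
        return partitionToValue.entrySet().stream()
                .filter(e -> joinKeyValues.contains(e.getValue()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> parts = new LinkedHashMap<>();
        parts.put("sales/dt=2014-08-24", "2014-08-24");
        parts.put("sales/dt=2014-08-25", "2014-08-25");
        parts.put("sales/dt=2014-08-26", "2014-08-26");
        // Only 2014-08-26 appears on the dimension side, so only one partition survives.
        System.out.println(prune(parts, Collections.singleton("2014-08-26")));
        // prints [sales/dt=2014-08-26]
    }
}
```

In the real feature this filtering happens in the Tez application master, which is why the events carrying the key values (and the size limits on them) matter.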





[jira] [Commented] (HIVE-6847) Improve / fix bugs in Hive scratch dir setup

2014-08-26 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111444#comment-14111444
 ] 

Vaibhav Gumashta commented on HIVE-6847:


Whitespace cleanup got mixed into the last 3 patches. I'll remove it and 
upload a new patch. 

> Improve / fix bugs in Hive scratch dir setup
> 
>
> Key: HIVE-6847
> URL: https://issues.apache.org/jira/browse/HIVE-6847
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, HiveServer2
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-6847.1.patch, HIVE-6847.2.patch, HIVE-6847.3.patch, 
> HIVE-6847.4.patch, HIVE-6847.5.patch
>
>
> Currently, the hive server creates scratch directory and changes permission 
> to 777 however, this is not great with respect to security. We need to create 
> user specific scratch directories instead. Also refer to HIVE-6782 1st 
> iteration of the patch for approach.
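The safer layout being discussed can be sketched as below. This is illustrative only: the actual patch works against HDFS through Hadoop's FileSystem API, not `java.nio`, and `ScratchDirs` is an invented name. The idea is simply that instead of one shared scratch root chmod'd to 777, each user gets a subdirectory restricted to owner-only access.

```java
import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Sketch: create a per-user scratch directory with 700 permissions
// instead of relying on a world-writable shared directory.
public class ScratchDirs {
    static Path userScratchDir(Path scratchRoot, String user) throws IOException {
        // rwx for the owner only; no group or other access.
        Set<PosixFilePermission> ownerOnly = PosixFilePermissions.fromString("rwx------");
        Path dir = scratchRoot.resolve(user);
        Files.createDirectories(dir, PosixFilePermissions.asFileAttribute(ownerOnly));
        return dir;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("hive-scratch");
        Path dir = userScratchDir(root, "alice");
        System.out.println(Files.getPosixFilePermissions(dir)); // owner-only rwx
    }
}
```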





[jira] [Commented] (HIVE-7604) Add Metastore API to fetch one or more partition names

2014-08-26 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111437#comment-14111437
 ] 

Thiruvel Thirumoolan commented on HIVE-7604:


[~ashutoshc] Do you have any comments on the API?

> Add Metastore API to fetch one or more partition names
> --
>
> Key: HIVE-7604
> URL: https://issues.apache.org/jira/browse/HIVE-7604
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 0.14.0
>
> Attachments: Design_HIVE_7604.txt
>
>
> We need a new API in the Metastore to address the following use cases. Both 
> arise from having tables with hundreds of thousands, or in some cases 
> millions, of partitions.
> 1. It should be quick and easy to obtain the distinct values of a partition 
> key. E.g.: obtain all dates for which partitions are available. This can be 
> used programmatically by tools/frameworks to find gaps in partitions before 
> reprocessing them. Currently one has to run Hive queries (JDBC or CLI) to 
> obtain this information, which is unfriendly and heavyweight. For tables 
> with a large number of partitions, the queries take a long time to run and 
> also require large heap space.
> 2. Typically users would like to know the list of partitions available and 
> would run queries that involve only partition keys (select distinct 
> partkey1 from table), or obtain the latest date partition from a dimension 
> table to join against a fact table (select * from fact_table join 
> select max(dt) from dimension_table). Such metadata-only queries can be 
> pushed down to the metastore and need not be run in Hive at all. If they 
> can be converted into database queries, the clients can be lightweight and 
> need not fetch all partition names; the results can be obtained much faster 
> with fewer resources.





[jira] [Commented] (HIVE-7886) Aggregation queries fail with RCFile based Hive tables with S3 storage

2014-08-26 Thread Venkata Puneet Ravuri (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111436#comment-14111436
 ] 

Venkata Puneet Ravuri commented on HIVE-7886:
-

The same issue occurs in Hive 0.12, but there it went away when column pruning 
was disabled by setting the property 'hive.optimize.cp' to false. In Hive 0.13 
this property was disabled as part of 
[HIVE-4113|https://issues.apache.org/jira/browse/HIVE-4113].
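The column-pruning connection suggests a concrete failure mode, sketched below as an illustration (this is not Hive code; `SafeSkip` is invented): when unread columns are skipped, a skip that runs past the end of the stream can surface as the EOFException in the stack trace, whereas a defensive reader clamps the skip to the bytes actually remaining.

```java
import java.io.*;

// Sketch of EOF-safe skipping: stop quietly at end-of-stream instead of
// seeking past it and failing.
public class SafeSkip {
    // Skips up to n bytes; returns how many were actually skipped.
    static long skipAtMost(InputStream in, long n) throws IOException {
        long skipped = 0;
        while (skipped < n) {
            long s = in.skip(n - skipped);
            if (s <= 0) {
                if (in.read() < 0) break;  // true end of stream, stop here
                s = 1;                     // read() consumed one byte
            }
            skipped += s;
        }
        return skipped;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[10]);
        System.out.println(skipAtMost(in, 4));   // 4
        System.out.println(skipAtMost(in, 100)); // 6 -- only 6 bytes were left
    }
}
```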


> Aggregation queries fail with RCFile based Hive tables with S3 storage
> --
>
> Key: HIVE-7886
> URL: https://issues.apache.org/jira/browse/HIVE-7886
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.1
>Reporter: Venkata Puneet Ravuri
>
> Aggregation queries on Hive tables which use RCFile format and S3 storage are 
> failing.
> My setup is Hadoop 2.5.0 and Hive 0.13.1.
> I create a table with following schema:-
> CREATE EXTERNAL TABLE `testtable`(
>   `col1` string, 
>   `col2` tinyint, 
>   `col3` int, 
>   `col4` float, 
>   `col5` boolean, 
>   `col6` smallint)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe' 
> WITH SERDEPROPERTIES (
>   'serialization.format'='\t',
>   'line.delim'='\n',
>   'field.delim'='\t'
> )
> STORED AS INPUTFORMAT 
>   'org.apache.hadoop.hive.ql.io.RCFileInputFormat' 
> OUTPUTFORMAT 
>   'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
> LOCATION
>   's3n:///testtable';
> When I run 'select count(*) from testtable', it gives the following exception 
> stack:-
> Error: java.io.IOException: java.io.IOException: java.io.EOFException: 
> Attempted to seek or read past the end of the file
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:256)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:171)
>   at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:198)
>   at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:184)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.io.IOException: java.io.EOFException: Attempted to seek or 
> read past the end of the file
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:344)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:122)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:254)
>   ... 11 more
> Caused by: java.io.EOFException: Attempted to seek or read past the end of 
> the file
>   at 
> org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:462)
>   at 
> org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)
>   at 
> org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:234)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:601)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandle

[jira] [Commented] (HIVE-7673) Authorization api: missing privilege objects in create table/view

2014-08-26 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111442#comment-14111442
 ] 

Jason Dere commented on HIVE-7673:
--

+1
The changes here will affect a lot of .q files, so we will need to watch for 
broken q-file tests after this one goes in.

> Authorization api: missing privilege objects in create table/view
> -
>
> Key: HIVE-7673
> URL: https://issues.apache.org/jira/browse/HIVE-7673
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, SQLStandardAuthorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-7673.1.patch, HIVE-7673.2.patch, HIVE-7673.3.patch, 
> HIVE-7673.4.patch, HIVE-7673.5.patch
>
>
> Issues being addressed:
> - In case of create-table-as-select query, the database the table belongs to 
> is not among the objects to be authorized.
> - Create table has the objectName field of the table entry with the database 
> prefix - like testdb.testtable, instead of just the table name.
> - checkPrivileges(CREATEVIEW) does not include the name of the view being 
> created in outputHObjs.





[jira] [Commented] (HIVE-7353) HiveServer2 using embedded MetaStore leaks JDOPersistanceManager

2014-08-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111434#comment-14111434
 ] 

Szehon Ho commented on HIVE-7353:
-

Thanks, +1 on latest patch pending tests.

> HiveServer2 using embedded MetaStore leaks JDOPersistanceManager
> 
>
> Key: HIVE-7353
> URL: https://issues.apache.org/jira/browse/HIVE-7353
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-7353.1.patch, HIVE-7353.2.patch, HIVE-7353.3.patch, 
> HIVE-7353.4.patch, HIVE-7353.5.patch, HIVE-7353.6.patch, HIVE-7353.7.patch, 
> HIVE-7353.8.patch
>
>
> While using an embedded metastore, HiveServer2 ends up creating new 
> instances of JDOPersistanceManager for the background threads it spawns to 
> run async operations; these instances are cached in 
> JDOPersistanceManagerFactory. Even when a background thread is killed by the 
> thread pool manager, its JDOPersistanceManager is never GCed because it is 
> still cached by JDOPersistanceManagerFactory.
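The leak pattern described above can be modeled with a toy example (this is not JDO or Hive code; `PerThreadCache` is invented): a factory caches one object per thread, so the objects outlive pooled worker threads unless something explicitly evicts them, which is the direction of the fix.

```java
import java.util.Map;
import java.util.concurrent.*;

// Toy model of a per-thread cache that leaks unless entries are released
// when the background operation completes.
public class PerThreadCache {
    static final Map<Thread, Object> CACHE = new ConcurrentHashMap<>();

    static Object managerForCurrentThread() {
        // One cached "manager" per thread, created lazily.
        return CACHE.computeIfAbsent(Thread.currentThread(), t -> new Object());
    }

    // Explicit eviction, analogous to closing the persistence manager
    // when the background operation finishes.
    static void releaseCurrentThread() {
        CACHE.remove(Thread.currentThread());
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 4; i++) {
            pool.submit(() -> {
                try {
                    managerForCurrentThread();  // the async work would happen here
                } finally {
                    releaseCurrentThread();     // without this, entries pile up
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(CACHE.size()); // 0
    }
}
```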





[jira] [Commented] (HIVE-7886) Aggregation queries fail with RCFile based Hive tables with S3 storage

2014-08-26 Thread Venkata Puneet Ravuri (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111429#comment-14111429
 ] 

Venkata Puneet Ravuri commented on HIVE-7886:
-

The data files are in correct RCFile format. When I run 'select *' on this 
table, the data is returned correctly.

> Aggregation queries fail with RCFile based Hive tables with S3 storage
> --
>
> Key: HIVE-7886
> URL: https://issues.apache.org/jira/browse/HIVE-7886
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.1
>Reporter: Venkata Puneet Ravuri
>
> Aggregation queries on Hive tables which use RCFile format and S3 storage are 
> failing.
> My setup is Hadoop 2.5.0 and Hive 0.13.1.
> I create a table with following schema:-
> CREATE EXTERNAL TABLE `testtable`(
>   `col1` string, 
>   `col2` tinyint, 
>   `col3` int, 
>   `col4` float, 
>   `col5` boolean, 
>   `col6` smallint)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe' 
> WITH SERDEPROPERTIES (
>   'serialization.format'='\t',
>   'line.delim'='\n',
>   'field.delim'='\t'
> )
> STORED AS INPUTFORMAT 
>   'org.apache.hadoop.hive.ql.io.RCFileInputFormat' 
> OUTPUTFORMAT 
>   'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
> LOCATION
>   's3n:///testtable';
> When I run 'select count(*) from testtable', it gives the following exception 
> stack:-
> Error: java.io.IOException: java.io.IOException: java.io.EOFException: 
> Attempted to seek or read past the end of the file
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:256)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:171)
>   at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:198)
>   at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:184)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.io.IOException: java.io.EOFException: Attempted to seek or 
> read past the end of the file
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:344)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:122)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:254)
>   ... 11 more
> Caused by: java.io.EOFException: Attempted to seek or read past the end of 
> the file
>   at 
> org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:462)
>   at 
> org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)
>   at 
> org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:234)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:601)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHan

[jira] [Commented] (HIVE-7885) CLIServer.openSessionWithImpersonation logs as if it were openSessionW

2014-08-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111427#comment-14111427
 ] 

Szehon Ho commented on HIVE-7885:
-

+1

> CLIServer.openSessionWithImpersonation logs as if it were openSessionW
> --
>
> Key: HIVE-7885
> URL: https://issues.apache.org/jira/browse/HIVE-7885
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
> Attachments: HIVE-7885.patch
>
>






[jira] [Created] (HIVE-7886) Aggregation queries fail with RCFile based Hive tables with S3 storage

2014-08-26 Thread Venkata Puneet Ravuri (JIRA)
Venkata Puneet Ravuri created HIVE-7886:
---

 Summary: Aggregation queries fail with RCFile based Hive tables 
with S3 storage
 Key: HIVE-7886
 URL: https://issues.apache.org/jira/browse/HIVE-7886
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Venkata Puneet Ravuri


Aggregation queries on Hive tables which use RCFile format and S3 storage are 
failing.

My setup is Hadoop 2.5.0 and Hive 0.13.1.

I create a table with following schema:-
CREATE EXTERNAL TABLE `testtable`(
  `col1` string, 
  `col2` tinyint, 
  `col3` int, 
  `col4` float, 
  `col5` boolean, 
  `col6` smallint)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe' 
WITH SERDEPROPERTIES (
  'serialization.format'='\t',
  'line.delim'='\n',
  'field.delim'='\t'
)
STORED AS INPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.RCFileInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
LOCATION
  's3n:///testtable';

When I run 'select count(*) from testtable', it gives the following exception 
stack:-

Error: java.io.IOException: java.io.IOException: java.io.EOFException: 
Attempted to seek or read past the end of the file
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:256)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:171)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:198)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:184)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.io.IOException: java.io.EOFException: Attempted to seek or read 
past the end of the file
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:344)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:122)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:254)
... 11 more
Caused by: java.io.EOFException: Attempted to seek or read past the end of the 
file
at 
org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:462)
at 
org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)
at 
org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:234)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at org.apache.hadoop.fs.s3native.$Proxy17.retrieve(Unknown Source)
at 
org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:205)
at 
org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:96)
at 
org.apache.hadoop.fs.BufferedFSInputStream.skip(BufferedFSInputStream.java:67)
at java.io.DataInputStream.skipBytes(DataInputStream.java:220

[jira] [Commented] (HIVE-7885) CLIServer.openSessionWithImpersonation logs as if it were openSessionW

2014-08-26 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111416#comment-14111416
 ] 

Brock Noland commented on HIVE-7885:


FYI [~szehon]

> CLIServer.openSessionWithImpersonation logs as if it were openSessionW
> --
>
> Key: HIVE-7885
> URL: https://issues.apache.org/jira/browse/HIVE-7885
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
> Attachments: HIVE-7885.patch
>
>






[jira] [Updated] (HIVE-7885) CLIServer.openSessionWithImpersonation logs as if it were openSessionW

2014-08-26 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7885:
---

Attachment: HIVE-7885.patch

> CLIServer.openSessionWithImpersonation logs as if it were openSessionW
> --
>
> Key: HIVE-7885
> URL: https://issues.apache.org/jira/browse/HIVE-7885
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Priority: Minor
> Attachments: HIVE-7885.patch
>
>






[jira] [Updated] (HIVE-7885) CLIServer.openSessionWithImpersonation logs as if it were openSessionW

2014-08-26 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7885:
---

Assignee: Brock Noland
  Status: Patch Available  (was: Open)

> CLIServer.openSessionWithImpersonation logs as if it were openSessionW
> --
>
> Key: HIVE-7885
> URL: https://issues.apache.org/jira/browse/HIVE-7885
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
> Attachments: HIVE-7885.patch
>
>






[jira] [Created] (HIVE-7885) CLIServer.openSessionWithImpersonation logs as if it were openSessionW

2014-08-26 Thread Brock Noland (JIRA)
Brock Noland created HIVE-7885:
--

 Summary: CLIServer.openSessionWithImpersonation logs as if it were 
openSessionW
 Key: HIVE-7885
 URL: https://issues.apache.org/jira/browse/HIVE-7885
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Priority: Minor
 Attachments: HIVE-7885.patch







[jira] [Updated] (HIVE-7720) CBO: rank translation to Optiq RelNode tree failing

2014-08-26 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-7720:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> CBO: rank translation to Optiq RelNode tree failing
> ---
>
> Key: HIVE-7720
> URL: https://issues.apache.org/jira/browse/HIVE-7720
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Reporter: Harish Butani
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-7720.patch
>
>
> Following query:
> {code}
> explain select p_name
> from (select p_mfgr, p_name, p_size, rank() over(partition by p_mfgr order by 
> p_size) as r from part) a
> where r <= 2;
> {code}
> fails with 
> {quote}
> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more 
> arguments are expected.
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFRank.getEvaluator(GenericUDAFRank.java:61)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.AbstractGenericUDAFResolver.getEvaluator(AbstractGenericUDAFResolver.java:47)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.getGenericUDAFEvaluator(FunctionRegistry.java:1110)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getGenericUDAFEvaluator(SemanticAnalyzer.java:3506)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.getHiveAggInfo(SemanticAnalyzer.java:12496)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$OptiqBasedPlanner.genWindowingProj(SemanticAnalyzer.java:12858)
> {quote}





[jira] [Updated] (HIVE-7353) HiveServer2 using embedded MetaStore leaks JDOPersistanceManager

2014-08-26 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-7353:
---

Status: Patch Available  (was: Open)

> HiveServer2 using embedded MetaStore leaks JDOPersistanceManager
> 
>
> Key: HIVE-7353
> URL: https://issues.apache.org/jira/browse/HIVE-7353
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-7353.1.patch, HIVE-7353.2.patch, HIVE-7353.3.patch, 
> HIVE-7353.4.patch, HIVE-7353.5.patch, HIVE-7353.6.patch, HIVE-7353.7.patch, 
> HIVE-7353.8.patch
>
>
> While using an embedded metastore, HiveServer2 ends up creating new 
> instances of JDOPersistanceManager for the background threads it spawns to 
> run async operations; these instances are cached in 
> JDOPersistanceManagerFactory. Even when a background thread is killed by the 
> thread pool manager, its JDOPersistanceManager is never GCed because it is 
> still cached by JDOPersistanceManagerFactory.




