[jira] [Created] (HIVE-6865) Failed to load data into Hive from Pig using HCatStorer()

2014-04-08 Thread Bing Li (JIRA)
Bing Li created HIVE-6865:
-

 Summary: Failed to load data into Hive from Pig using HCatStorer()
 Key: HIVE-6865
 URL: https://issues.apache.org/jira/browse/HIVE-6865
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Bing Li
Assignee: Bing Li


Steps to reproduce:
1. create a hive table
hive> create table t1 (c1 int, c2 int, c3 int);

2. start pig shell
grunt> register $HIVE_HOME/lib/*.jar
grunt> register $HIVE_HOME/hcatalog/share/hcatalog/*.jar
grunt> A = load 'pig.txt' as (c1:int, c2:int, c3:int);
grunt> store A into 't1' using org.apache.hive.hcatalog.HCatStorer();

Error Message:
ERROR [main] org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate exception from backend error: org.apache.hcatalog.common.HCatException : 2004 : HCatOutputFormat not initialized, setOutput has to be called
    at org.apache.hcatalog.mapreduce.HCatBaseOutputFormat.getJobInfo(HCatBaseOutputFormat.java:111)
    at org.apache.hcatalog.mapreduce.HCatBaseOutputFormat.getJobInfo(HCatBaseOutputFormat.java:97)
    at org.apache.hcatalog.mapreduce.HCatBaseOutputFormat.getOutputFormat(HCatBaseOutputFormat.java:85)
    at org.apache.hcatalog.mapreduce.HCatBaseOutputFormat.checkOutputSpecs(HCatBaseOutputFormat.java:75)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecsHelper(PigOutputFormat.java:207)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:187)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:1000)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:963)
    at java.security.AccessController.doPrivileged(AccessController.java:310)
    at javax.security.auth.Subject.doAs(Subject.java:573)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1502)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:963)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:616)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    at java.lang.reflect.Method.invoke(Method.java:611)
    at org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
    at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:191)
    at java.lang.Thread.run(Thread.java:738)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:270)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6831) The job schedule in condition task could not be correct with skewed join optimization

2014-04-08 Thread william zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962630#comment-13962630
 ] 

william zhu commented on HIVE-6831:
---

1. The query can be simplified like this:
select * from
(select * from TableA union all select * from TableB) a
2. TableA and TableB each consist of other select queries.
3. And I set the Hive parameters as follows:
set hive.auto.convert.join=false;
set hive.optimize.skewjoin=true;
set hive.skewjoin.key=50;
set hive.mapjoin.smalltable.filesize=5000;



 The job schedule in condition task could not be correct with skewed join 
 optimization
 -

 Key: HIVE-6831
 URL: https://issues.apache.org/jira/browse/HIVE-6831
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
 Environment: Hive 0.11.0
Reporter: william zhu
 Attachments: 6831.patch


 Code snippet in ConditionalTask.java as below:
 // resolved task
 if (driverContext.addToRunnable(tsk)) {
   console.printInfo(tsk.getId() + " is selected by condition resolver.");
 }
 The selected task is added into the runnable queue immediately without any
 dependency checking. If the selected task is the original task and its parent
 task has not been executed, then the result will be incorrect.
 Like this:
 1. Before skew join optimization:
 Step1, Step2 --> Step3 (Step1 and Step2 are Step3's parents)
 2. After skew join optimization:
 Step1 --> Step4 (ConditionalTask), consisting of [Step3, Step10]
 Step2 --> Step5 (ConditionalTask), consisting of [Step3, Step11]
 3. Running:
 Step3 is selected in both Step4 and Step5.
 Step3 will be executed immediately after Step4; that is not correct.
 Step3 will be executed again after Step5; that is not correct either.
 4. The correct schedule is that Step3 executes only after both Step4 and Step5.
 5. So, I add a check in the snippet as below:
 if (!driverContext.getRunnable().contains(tsk)) {
   console.printInfo(tsk.getId() + " is selected by condition resolver.");
   if (DriverContext.isLaunchable(tsk)) {
     driverContext.addToRunnable(tsk);
   }
 }
 So, that works correctly in my environment. I am not sure whether it
 will have problems in some other conditions.
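The parent-dependency check in items 4 and 5 can be sketched in isolation. This is a minimal model, not Hive's actual Task/DriverContext API; the Task, Scheduler, and method names below are hypothetical stand-ins for illustration only:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Minimal model of conditional-task scheduling with a parent-completion
// check. Task and Scheduler are hypothetical stand-ins for Hive's
// Task/DriverContext; only the scheduling rule from the comment is shown.
class Task {
    final String id;
    final List<Task> parents = new ArrayList<>();
    boolean done = false;

    Task(String id) { this.id = id; }

    // A task is launchable only once every parent task has finished.
    boolean isLaunchable() {
        for (Task p : parents) {
            if (!p.done) return false;
        }
        return true;
    }
}

class Scheduler {
    final Set<Task> runnable = new LinkedHashSet<>();

    // Enqueue the resolved task only if it is not already queued and all
    // of its parents have completed -- the check the patch introduces.
    boolean addIfReady(Task t) {
        if (runnable.contains(t)) return false;
        if (!t.isLaunchable()) return false;
        return runnable.add(t);
    }
}

public class SkewJoinScheduleDemo {
    public static void main(String[] args) {
        Task step4 = new Task("Stage-4");
        Task step5 = new Task("Stage-5");
        Task step3 = new Task("Stage-3");
        step3.parents.add(step4);
        step3.parents.add(step5);

        Scheduler s = new Scheduler();
        // Step4 resolves first: Step3 must not run yet, Step5 is unfinished.
        step4.done = true;
        System.out.println(s.addIfReady(step3)); // false
        // After Step5 also resolves, Step3 becomes runnable exactly once.
        step5.done = true;
        System.out.println(s.addIfReady(step3)); // true
        System.out.println(s.addIfReady(step3)); // false (already queued)
    }
}
```

With this rule, Step3 is launched once, only after both conditional parents have resolved, which matches the schedule described in item 4.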





[jira] [Commented] (HIVE-6831) The job schedule in condition task could not be correct with skewed join optimization

2014-04-08 Thread william zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962637#comment-13962637
 ] 

william zhu commented on HIVE-6831:
---

The problem is that the union operation is resolved into two steps (A, B). The two
steps have one child step (C); C aggregates the outputs of A and B.
And in the skew join condition task, C will be selected without any check that its
parent steps (A, B) have completed.

 The job schedule in condition task could not be correct with skewed join 
 optimization
 -

 Key: HIVE-6831
 URL: https://issues.apache.org/jira/browse/HIVE-6831
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
 Environment: Hive 0.11.0
Reporter: william zhu
 Attachments: 6831.patch


 Code snippet in ConditionalTask.java as below:
 // resolved task
 if (driverContext.addToRunnable(tsk)) {
   console.printInfo(tsk.getId() + " is selected by condition resolver.");
 }
 The selected task is added into the runnable queue immediately without any
 dependency checking. If the selected task is the original task and its parent
 task has not been executed, then the result will be incorrect.
 Like this:
 1. Before skew join optimization:
 Step1, Step2 --> Step3 (Step1 and Step2 are Step3's parents)
 2. After skew join optimization:
 Step1 --> Step4 (ConditionalTask), consisting of [Step3, Step10]
 Step2 --> Step5 (ConditionalTask), consisting of [Step3, Step11]
 3. Running:
 Step3 is selected in both Step4 and Step5.
 Step3 will be executed immediately after Step4; that is not correct.
 Step3 will be executed again after Step5; that is not correct either.
 4. The correct schedule is that Step3 executes only after both Step4 and Step5.
 5. So, I add a check in the snippet as below:
 if (!driverContext.getRunnable().contains(tsk)) {
   console.printInfo(tsk.getId() + " is selected by condition resolver.");
   if (DriverContext.isLaunchable(tsk)) {
     driverContext.addToRunnable(tsk);
   }
 }
 So, that works correctly in my environment. I am not sure whether it
 will have problems in some other conditions.





[jira] [Updated] (HIVE-6857) Refactor HiveServer2 TSetIpAddressProcessor

2014-04-08 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6857:
---

Summary: Refactor HiveServer2 TSetIpAddressProcessor  (was: Refactor 
HiveServer2 threadlocals)

 Refactor HiveServer2 TSetIpAddressProcessor
 ---

 Key: HIVE-6857
 URL: https://issues.apache.org/jira/browse/HIVE-6857
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta

 Excerpt from HIVE-6837. Issues:
 1. SessionManager#openSession:
 {code}
 public SessionHandle openSession(TProtocolVersion protocol, String username, String password,
     Map<String, String> sessionConf, boolean withImpersonation, String delegationToken)
     throws HiveSQLException {
   HiveSession session;
   if (withImpersonation) {
     HiveSessionImplwithUGI hiveSessionUgi = new HiveSessionImplwithUGI(protocol, username, password,
         hiveConf, sessionConf, TSetIpAddressProcessor.getUserIpAddress(), delegationToken);
     session = HiveSessionProxy.getProxy(hiveSessionUgi, hiveSessionUgi.getSessionUgi());
     hiveSessionUgi.setProxySession(session);
   } else {
     session = new HiveSessionImpl(protocol, username, password, hiveConf, sessionConf,
         TSetIpAddressProcessor.getUserIpAddress());
   }
   session.setSessionManager(this);
   session.setOperationManager(operationManager);
   session.open();
   handleToSession.put(session.getSessionHandle(), session);
   try {
     executeSessionHooks(session);
   } catch (Exception e) {
     throw new HiveSQLException("Failed to execute session hooks", e);
   }
   return session.getSessionHandle();
 }
 {code}
 Notice that if withImpersonation is set to true, we're using TSetIpAddressProcessor.getUserIpAddress() to get the IP address, which is wrong for a kerberized setup (we should use HiveAuthFactory#getIpAddress).
 2. Also, in the case of a kerberized setup, we're wrapping the transport in a doAs (with the UGI of the HiveServer2 process), which doesn't make sense to me:
 https://github.com/apache/hive/blob/trunk/shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java#L335.
 3. The name TSetIpAddressProcessor should be replaced with something more meaningful like TPlainSASLProcessor.
 4. Consolidate the thread locals used for username and IP address.
 5. Do not use TSetIpAddressProcessor directly; get it via a factory, like here:
 https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java#L161
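Items 1 and 5 above amount to hiding the transport-specific IP lookup behind one auth-aware accessor, so callers like openSession cannot pick the wrong source. Below is a minimal sketch of that idea; AuthType, IpAddressSource, and the thread-local fields are hypothetical illustrations, not HiveServer2's actual classes:

```java
// Hypothetical sketch: route all client-IP lookups through one accessor
// chosen by authentication mode, instead of calling a transport-specific
// thread local (like TSetIpAddressProcessor.getUserIpAddress) directly.
enum AuthType { NONE, KERBEROS }

class IpAddressSource {
    // Address recorded by the plain SASL processor (non-Kerberos path).
    static final ThreadLocal<String> plainSaslIp = new ThreadLocal<>();
    // Address recorded by the Kerberos transport layer.
    static final ThreadLocal<String> kerberosIp = new ThreadLocal<>();

    // Single entry point: callers never need to know which transport
    // populated the address for the current request thread.
    static String getClientIp(AuthType auth) {
        return auth == AuthType.KERBEROS ? kerberosIp.get() : plainSaslIp.get();
    }
}

public class AuthIpDemo {
    public static void main(String[] args) {
        // Simulate the two transports recording addresses on this thread.
        IpAddressSource.plainSaslIp.set("10.0.0.5");
        IpAddressSource.kerberosIp.set("10.0.0.9");
        System.out.println(IpAddressSource.getClientIp(AuthType.NONE));
        System.out.println(IpAddressSource.getClientIp(AuthType.KERBEROS));
    }
}
```

The point of the factory indirection is that a kerberized deployment and a plain SASL deployment can both call getClientIp and get the address their own transport recorded.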





[jira] [Updated] (HIVE-6857) Consolidate HiveServer2 threadlocals

2014-04-08 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6857:
---

Description: 
Excerpt from HIVE-6837. Issues:
1. SessionManager#openSession:
{code}
public SessionHandle openSession(TProtocolVersion protocol, String username, String password,
    Map<String, String> sessionConf, boolean withImpersonation, String delegationToken)
    throws HiveSQLException {
  HiveSession session;
  if (withImpersonation) {
    HiveSessionImplwithUGI hiveSessionUgi = new HiveSessionImplwithUGI(protocol, username, password,
        hiveConf, sessionConf, TSetIpAddressProcessor.getUserIpAddress(), delegationToken);
    session = HiveSessionProxy.getProxy(hiveSessionUgi, hiveSessionUgi.getSessionUgi());
    hiveSessionUgi.setProxySession(session);
  } else {
    session = new HiveSessionImpl(protocol, username, password, hiveConf, sessionConf,
        TSetIpAddressProcessor.getUserIpAddress());
  }
  session.setSessionManager(this);
  session.setOperationManager(operationManager);
  session.open();
  handleToSession.put(session.getSessionHandle(), session);

  try {
    executeSessionHooks(session);
  } catch (Exception e) {
    throw new HiveSQLException("Failed to execute session hooks", e);
  }
  return session.getSessionHandle();
}
{code}
Notice that if withImpersonation is set to true, we're using TSetIpAddressProcessor.getUserIpAddress() to get the IP address, which is wrong for a kerberized setup (we should use HiveAuthFactory#getIpAddress).

2. Also, in the case of a kerberized setup, we're wrapping the transport in a doAs (with the UGI of the HiveServer2 process), which doesn't make sense to me:
https://github.com/apache/hive/blob/trunk/shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java#L335.

3. The name TSetIpAddressProcessor should be replaced with something more meaningful like TPlainSASLProcessor.

4. Consolidate the thread locals used for username and IP address.

5. Do not use TSetIpAddressProcessor directly; get it via a factory, like here:
https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java#L161

  was:Check the discussion here: HIVE-6837


 Consolidate HiveServer2 threadlocals
 

 Key: HIVE-6857
 URL: https://issues.apache.org/jira/browse/HIVE-6857
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta

 Excerpt from HIVE-6837. Issues:
 1. SessionManager#openSession:
 {code}
 public SessionHandle openSession(TProtocolVersion protocol, String username, String password,
     Map<String, String> sessionConf, boolean withImpersonation, String delegationToken)
     throws HiveSQLException {
   HiveSession session;
   if (withImpersonation) {
     HiveSessionImplwithUGI hiveSessionUgi = new HiveSessionImplwithUGI(protocol, username, password,
         hiveConf, sessionConf, TSetIpAddressProcessor.getUserIpAddress(), delegationToken);
     session = HiveSessionProxy.getProxy(hiveSessionUgi, hiveSessionUgi.getSessionUgi());
     hiveSessionUgi.setProxySession(session);
   } else {
     session = new HiveSessionImpl(protocol, username, password, hiveConf, sessionConf,
         TSetIpAddressProcessor.getUserIpAddress());
   }
   session.setSessionManager(this);
   session.setOperationManager(operationManager);
   session.open();
   handleToSession.put(session.getSessionHandle(), session);
   try {
     executeSessionHooks(session);
   } catch (Exception e) {
     throw new HiveSQLException("Failed to execute session hooks", e);
   }
   return session.getSessionHandle();
 }
 {code}
 Notice that if withImpersonation is set to true, we're using TSetIpAddressProcessor.getUserIpAddress() to get the IP address, which is wrong for a kerberized setup (we should use HiveAuthFactory#getIpAddress).
 2. Also, in the case of a kerberized setup, we're wrapping the transport in a doAs (with the UGI of the HiveServer2 process), which doesn't make sense to me:
 https://github.com/apache/hive/blob/trunk/shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java#L335.
 3. The name TSetIpAddressProcessor should be replaced with something more meaningful like TPlainSASLProcessor.
 4. Consolidate the thread locals used for username and IP address.
 5. Do not use TSetIpAddressProcessor directly; get it via a factory, like here:
 https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java#L161





[jira] [Updated] (HIVE-6857) Refactor HiveServer2 threadlocals

2014-04-08 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6857:
---

Summary: Refactor HiveServer2 threadlocals  (was: Consolidate HiveServer2 
threadlocals)

 Refactor HiveServer2 threadlocals
 -

 Key: HIVE-6857
 URL: https://issues.apache.org/jira/browse/HIVE-6857
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta

 Excerpt from HIVE-6837. Issues:
 1. SessionManager#openSession:
 {code}
 public SessionHandle openSession(TProtocolVersion protocol, String username, String password,
     Map<String, String> sessionConf, boolean withImpersonation, String delegationToken)
     throws HiveSQLException {
   HiveSession session;
   if (withImpersonation) {
     HiveSessionImplwithUGI hiveSessionUgi = new HiveSessionImplwithUGI(protocol, username, password,
         hiveConf, sessionConf, TSetIpAddressProcessor.getUserIpAddress(), delegationToken);
     session = HiveSessionProxy.getProxy(hiveSessionUgi, hiveSessionUgi.getSessionUgi());
     hiveSessionUgi.setProxySession(session);
   } else {
     session = new HiveSessionImpl(protocol, username, password, hiveConf, sessionConf,
         TSetIpAddressProcessor.getUserIpAddress());
   }
   session.setSessionManager(this);
   session.setOperationManager(operationManager);
   session.open();
   handleToSession.put(session.getSessionHandle(), session);
   try {
     executeSessionHooks(session);
   } catch (Exception e) {
     throw new HiveSQLException("Failed to execute session hooks", e);
   }
   return session.getSessionHandle();
 }
 {code}
 Notice that if withImpersonation is set to true, we're using TSetIpAddressProcessor.getUserIpAddress() to get the IP address, which is wrong for a kerberized setup (we should use HiveAuthFactory#getIpAddress).
 2. Also, in the case of a kerberized setup, we're wrapping the transport in a doAs (with the UGI of the HiveServer2 process), which doesn't make sense to me:
 https://github.com/apache/hive/blob/trunk/shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java#L335.
 3. The name TSetIpAddressProcessor should be replaced with something more meaningful like TPlainSASLProcessor.
 4. Consolidate the thread locals used for username and IP address.
 5. Do not use TSetIpAddressProcessor directly; get it via a factory, like here:
 https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java#L161





[jira] [Commented] (HIVE-6857) Refactor HiveServer2 TSetIpAddressProcessor

2014-04-08 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962641#comment-13962641
 ] 

Vaibhav Gumashta commented on HIVE-6857:


[~thejas] I am just using this as a placeholder for the issues I notice wrt 
TSetIpAddressProcessor and threadlocals. You might want to take a look. Some of 
these I'll resolve as part of HIVE-6864. 

 Refactor HiveServer2 TSetIpAddressProcessor
 ---

 Key: HIVE-6857
 URL: https://issues.apache.org/jira/browse/HIVE-6857
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta

 Excerpt from HIVE-6837. Issues:
 1. SessionManager#openSession:
 {code}
 public SessionHandle openSession(TProtocolVersion protocol, String username, String password,
     Map<String, String> sessionConf, boolean withImpersonation, String delegationToken)
     throws HiveSQLException {
   HiveSession session;
   if (withImpersonation) {
     HiveSessionImplwithUGI hiveSessionUgi = new HiveSessionImplwithUGI(protocol, username, password,
         hiveConf, sessionConf, TSetIpAddressProcessor.getUserIpAddress(), delegationToken);
     session = HiveSessionProxy.getProxy(hiveSessionUgi, hiveSessionUgi.getSessionUgi());
     hiveSessionUgi.setProxySession(session);
   } else {
     session = new HiveSessionImpl(protocol, username, password, hiveConf, sessionConf,
         TSetIpAddressProcessor.getUserIpAddress());
   }
   session.setSessionManager(this);
   session.setOperationManager(operationManager);
   session.open();
   handleToSession.put(session.getSessionHandle(), session);
   try {
     executeSessionHooks(session);
   } catch (Exception e) {
     throw new HiveSQLException("Failed to execute session hooks", e);
   }
   return session.getSessionHandle();
 }
 {code}
 Notice that if withImpersonation is set to true, we're using TSetIpAddressProcessor.getUserIpAddress() to get the IP address, which is wrong for a kerberized setup (we should use HiveAuthFactory#getIpAddress).
 2. Also, in the case of a kerberized setup, we're wrapping the transport in a doAs (with the UGI of the HiveServer2 process), which doesn't make sense to me:
 https://github.com/apache/hive/blob/trunk/shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java#L335.
 3. The name TSetIpAddressProcessor should be replaced with something more meaningful like TPlainSASLProcessor.
 4. Consolidate the thread locals used for username and IP address.
 5. Do not use TSetIpAddressProcessor directly; get it via a factory, like here:
 https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java#L161





[jira] [Updated] (HIVE-6857) Refactor HiveServer2 TSetIpAddressProcessor

2014-04-08 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6857:
---

Description: 
Excerpt from HIVE-6837 and related issues:
1. SessionManager#openSession:
{code}
public SessionHandle openSession(TProtocolVersion protocol, String username, String password,
    Map<String, String> sessionConf, boolean withImpersonation, String delegationToken)
    throws HiveSQLException {
  HiveSession session;
  if (withImpersonation) {
    HiveSessionImplwithUGI hiveSessionUgi = new HiveSessionImplwithUGI(protocol, username, password,
        hiveConf, sessionConf, TSetIpAddressProcessor.getUserIpAddress(), delegationToken);
    session = HiveSessionProxy.getProxy(hiveSessionUgi, hiveSessionUgi.getSessionUgi());
    hiveSessionUgi.setProxySession(session);
  } else {
    session = new HiveSessionImpl(protocol, username, password, hiveConf, sessionConf,
        TSetIpAddressProcessor.getUserIpAddress());
  }
  session.setSessionManager(this);
  session.setOperationManager(operationManager);
  session.open();
  handleToSession.put(session.getSessionHandle(), session);

  try {
    executeSessionHooks(session);
  } catch (Exception e) {
    throw new HiveSQLException("Failed to execute session hooks", e);
  }
  return session.getSessionHandle();
}
{code}
Notice that if withImpersonation is set to true, we're using TSetIpAddressProcessor.getUserIpAddress() to get the IP address, which is wrong for a kerberized setup (we should use HiveAuthFactory#getIpAddress).

2. Also, in the case of a kerberized setup, we're wrapping the transport in a doAs (with the UGI of the HiveServer2 process), which doesn't make sense to me:
https://github.com/apache/hive/blob/trunk/shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java#L335.

3. The name TSetIpAddressProcessor should be replaced with something more meaningful like TPlainSASLProcessor.

4. Consolidate the thread locals used for username and IP address.

5. Do not use TSetIpAddressProcessor directly; get it via a factory, like here:
https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java#L161

  was:
Excerpt from HIVE-6837. Issues:
1. SessionManager#openSession:
{code}
public SessionHandle openSession(TProtocolVersion protocol, String username, String password,
    Map<String, String> sessionConf, boolean withImpersonation, String delegationToken)
    throws HiveSQLException {
  HiveSession session;
  if (withImpersonation) {
    HiveSessionImplwithUGI hiveSessionUgi = new HiveSessionImplwithUGI(protocol, username, password,
        hiveConf, sessionConf, TSetIpAddressProcessor.getUserIpAddress(), delegationToken);
    session = HiveSessionProxy.getProxy(hiveSessionUgi, hiveSessionUgi.getSessionUgi());
    hiveSessionUgi.setProxySession(session);
  } else {
    session = new HiveSessionImpl(protocol, username, password, hiveConf, sessionConf,
        TSetIpAddressProcessor.getUserIpAddress());
  }
  session.setSessionManager(this);
  session.setOperationManager(operationManager);
  session.open();
  handleToSession.put(session.getSessionHandle(), session);

  try {
    executeSessionHooks(session);
  } catch (Exception e) {
    throw new HiveSQLException("Failed to execute session hooks", e);
  }
  return session.getSessionHandle();
}
{code}
Notice that if withImpersonation is set to true, we're using TSetIpAddressProcessor.getUserIpAddress() to get the IP address, which is wrong for a kerberized setup (we should use HiveAuthFactory#getIpAddress).

2. Also, in the case of a kerberized setup, we're wrapping the transport in a doAs (with the UGI of the HiveServer2 process), which doesn't make sense to me:
https://github.com/apache/hive/blob/trunk/shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java#L335.

3. The name TSetIpAddressProcessor should be replaced with something more meaningful like TPlainSASLProcessor.

4. Consolidate the thread locals used for username and IP address.

5. Do not use TSetIpAddressProcessor directly; get it via a factory, like here:
https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java#L161


 Refactor HiveServer2 TSetIpAddressProcessor
 ---

 Key: HIVE-6857
 URL: https://issues.apache.org/jira/browse/HIVE-6857
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta

 Excerpt from HIVE-6837 and related issues:
 1. SessionManager#openSession:
 {code}
 public SessionHandle openSession(TProtocolVersion protocol, String username, String password,
     Map<String, String> sessionConf, boolean 

[jira] [Updated] (HIVE-6782) HiveServer2Concurrency issue when running with tez intermittently, throwing org.apache.tez.dag.api.SessionNotRunning: Application not running error

2014-04-08 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6782:
-

Attachment: HIVE-6782.11.patch

Address Lefty's comment.

 HiveServer2Concurrency issue when running with tez intermittently, throwing 
 org.apache.tez.dag.api.SessionNotRunning: Application not running error
 -

 Key: HIVE-6782
 URL: https://issues.apache.org/jira/browse/HIVE-6782
 Project: Hive
  Issue Type: Bug
  Components: Tez
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.13.0, 0.14.0

 Attachments: HIVE-6782.1.patch, HIVE-6782.10.patch, 
 HIVE-6782.11.patch, HIVE-6782.2.patch, HIVE-6782.3.patch, HIVE-6782.4.patch, 
 HIVE-6782.5.patch, HIVE-6782.6.patch, HIVE-6782.7.patch, HIVE-6782.8.patch, 
 HIVE-6782.9.patch


 HiveServer2 concurrency is failing intermittently when using tez, throwing 
 org.apache.tez.dag.api.SessionNotRunning: Application not running error





[jira] [Commented] (HIVE-6782) HiveServer2Concurrency issue when running with tez intermittently, throwing org.apache.tez.dag.api.SessionNotRunning: Application not running error

2014-04-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13962726#comment-13962726
 ] 

Hive QA commented on HIVE-6782:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639082/HIVE-6782.10.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5549 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.security.TestMetastoreAuthorizationProvider.testSimplePrivileges
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2171/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2171/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639082

 HiveServer2Concurrency issue when running with tez intermittently, throwing 
 org.apache.tez.dag.api.SessionNotRunning: Application not running error
 -

 Key: HIVE-6782
 URL: https://issues.apache.org/jira/browse/HIVE-6782
 Project: Hive
  Issue Type: Bug
  Components: Tez
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.13.0, 0.14.0

 Attachments: HIVE-6782.1.patch, HIVE-6782.10.patch, 
 HIVE-6782.11.patch, HIVE-6782.2.patch, HIVE-6782.3.patch, HIVE-6782.4.patch, 
 HIVE-6782.5.patch, HIVE-6782.6.patch, HIVE-6782.7.patch, HIVE-6782.8.patch, 
 HIVE-6782.9.patch


 HiveServer2 concurrency is failing intermittently when using tez, throwing 
 org.apache.tez.dag.api.SessionNotRunning: Application not running error





[jira] [Created] (HIVE-6866) Hive server2 jdbc driver connection leak with namenode

2014-04-08 Thread Shengjun Xin (JIRA)
Shengjun Xin created HIVE-6866:
--

 Summary: Hive server2 jdbc driver connection leak with namenode
 Key: HIVE-6866
 URL: https://issues.apache.org/jira/browse/HIVE-6866
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Shengjun Xin


1. Set 'ipc.client.connection.maxidletime' to 360 in core-site.xml and start hive-server2.
2. Connect to hive server2 continuously.
3. It seems that hive server2 will not close the connections until they time out; the error message is as follows:
{code}
2014-03-18 23:30:36,873 ERROR ql.Driver (SessionState.java:printError(386)) - FAILED: RuntimeException java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "hdm1.hadoop.local/192.168.2.101"; destination host is: "hdm1.hadoop.local":8020;
java.lang.RuntimeException: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "hdm1.hadoop.local/192.168.2.101"; destination host is: "hdm1.hadoop.local":8020;
    at org.apache.hadoop.hive.ql.Context.getScratchDir(Context.java:190)
    at org.apache.hadoop.hive.ql.Context.getMRScratchDir(Context.java:231)
    at org.apache.hadoop.hive.ql.Context.getMRTmpFileURI(Context.java:288)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1274)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1059)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8676)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:433)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
    at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:95)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:181)
    at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:148)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:203)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:40)
    at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:37)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:524)
    at org.apache.hive.service.auth.TUGIContainingProcessor.process(TUGIContainingProcessor.java:37)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "hdm1.hadoop.local/192.168.2.101"; destination host is: "hdm1.hadoop.local":8020;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:761)
    at org.apache.hadoop.ipc.Client.call(Client.java:1239)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
    at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:483)
    at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2259)
 

[jira] [Updated] (HIVE-6866) Hive server2 jdbc driver connection leak with namenode

2014-04-08 Thread Shengjun Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengjun Xin updated HIVE-6866:
---

Description: 
1. Set 'ipc.client.connection.maxidletime' to 360 in core-site.xml and start HiveServer2.
2. Connect to HiveServer2 repeatedly in a while-true loop.
3. The TCP connection count keeps increasing until out of memory; HiveServer2 does not appear to close the connection until it times out. The error message is as follows:
{code}
2014-03-18 23:30:36,873 ERROR ql.Driver (SessionState.java:printError(386)) - FAILED: RuntimeException java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: hdm1.hadoop.local/192.168.2.101; destination host is: hdm1.hadoop.local:8020;
java.lang.RuntimeException: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: hdm1.hadoop.local/192.168.2.101; destination host is: hdm1.hadoop.local:8020;
at org.apache.hadoop.hive.ql.Context.getScratchDir(Context.java:190)
at org.apache.hadoop.hive.ql.Context.getMRScratchDir(Context.java:231)
at org.apache.hadoop.hive.ql.Context.getMRTmpFileURI(Context.java:288)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1274)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1059)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8676)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:433)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:95)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:181)
at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:148)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:203)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:40)
at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:37)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:524)
at org.apache.hive.service.auth.TUGIContainingProcessor.process(TUGIContainingProcessor.java:37)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: hdm1.hadoop.local/192.168.2.101; destination host is: hdm1.hadoop.local:8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:761)
at org.apache.hadoop.ipc.Client.call(Client.java:1239)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:483)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2259)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2230)
{code}
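For reference, the idle-timeout setting from step 1 is a plain Hadoop property; a minimal core-site.xml fragment (shown with the 360 value used in this report — Hadoop interprets it in milliseconds, so this is an aggressively short timeout) might look like:

{code}
<!-- core-site.xml: how long an idle IPC client connection is kept open.
     360 ms is the value used in this reproduction; the Hadoop default is larger. -->
<property>
  <name>ipc.client.connection.maxidletime</name>
  <value>360</value>
</property>
{code}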

[jira] [Updated] (HIVE-6866) Hive server2 jdbc driver connection leak with namenode

2014-04-08 Thread Shengjun Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengjun Xin updated HIVE-6866:
---

Description: 
1. Set 'ipc.client.connection.maxidletime' to 360 in core-site.xml and start HiveServer2.
2. Connect to HiveServer2 repeatedly in a while-true loop.
3. HiveServer2 does not appear to close the connection until it times out; the error message is as follows:
{code}
2014-03-18 23:30:36,873 ERROR ql.Driver (SessionState.java:printError(386)) - FAILED: RuntimeException java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: hdm1.hadoop.local/192.168.2.101; destination host is: hdm1.hadoop.local:8020;
java.lang.RuntimeException: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: hdm1.hadoop.local/192.168.2.101; destination host is: hdm1.hadoop.local:8020;
at org.apache.hadoop.hive.ql.Context.getScratchDir(Context.java:190)
at org.apache.hadoop.hive.ql.Context.getMRScratchDir(Context.java:231)
at org.apache.hadoop.hive.ql.Context.getMRTmpFileURI(Context.java:288)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1274)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1059)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8676)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:433)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:95)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:181)
at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:148)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:203)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:40)
at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:37)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:524)
at org.apache.hive.service.auth.TUGIContainingProcessor.process(TUGIContainingProcessor.java:37)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: hdm1.hadoop.local/192.168.2.101; destination host is: hdm1.hadoop.local:8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:761)
at org.apache.hadoop.ipc.Client.call(Client.java:1239)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:483)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2259)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2230)
at
{code}

[jira] [Commented] (HIVE-6858) Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.

2014-04-08 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962753#comment-13962753
 ] 

Jason Dere commented on HIVE-6858:
--

Would you be able to fix groupby3_map_skew.q as well, which looks like it also 
has a similar issue? For that one maybe you could replace:
SELECT dest1.* FROM dest1;
with:
SELECT c1, c2, c3, c4, c5, c6, c7, ROUND(c8, 5), ROUND(c9, 5) FROM dest1;

And hopefully the values generated do not show differences between the jdk6/7 
formatting.

 Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
 ---

 Key: HIVE-6858
 URL: https://issues.apache.org/jira/browse/HIVE-6858
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6858.1.patch


 Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
 {noformat}
  -250.0  6583411.236 1.0 6583411.236 -0.004  -0.0048
 ---
  -250.0  6583411.236 1.0 6583411.236 -0.0040 -0.0048
 {noformat}
 The following code reproduces this behavior when run in jdk-7 vs jdk-6: jdk-7 
 produces -0.004 while jdk-6 produces -0.0040.
 {code}
 public class Main {
   public static void main(String[] a) throws Exception {
     double val = 0.004;
     System.out.println("Value = " + val);
   }
 }
 {code}
 This happens to be a bug in jdk6, that has been fixed in jdk7.
 http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4511638
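 Where test output must not depend on JDK string conversion at all, the value can 
 be rendered with an explicit format string instead of relying on Double.toString. 
 This is only a sketch, not part of the proposed patch:
 {code}
public class FormatDemo {
    public static void main(String[] args) {
        double val = 0.004;
        // Double.toString(0.004) differed between jdk-6 and jdk-7; an explicit
        // precision (and Locale.ROOT, to avoid locale-specific decimal marks)
        // pins the rendered digit count on every JDK.
        System.out.println("Value = " + String.format(java.util.Locale.ROOT, "%.4f", val));
    }
}
 {code}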



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5687) Streaming support in Hive

2014-04-08 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962755#comment-13962755
 ] 

Lars Francke commented on HIVE-5687:


Thanks, could you put a new version up on RB?

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.13.0

 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, 
 HIVE-5687.v6.patch, Hive Streaming Ingest API for v3 patch.pdf, Hive 
 Streaming Ingest API for v4 patch.pdf


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6782) HiveServer2Concurrency issue when running with tez intermittently, throwing org.apache.tez.dag.api.SessionNotRunning: Application not running error

2014-04-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962778#comment-13962778
 ] 

Hive QA commented on HIVE-6782:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639155/HIVE-6782.11.patch

{color:green}SUCCESS:{color} +1 5549 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2172/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2172/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639155

 HiveServer2Concurrency issue when running with tez intermittently, throwing 
 org.apache.tez.dag.api.SessionNotRunning: Application not running error
 -

 Key: HIVE-6782
 URL: https://issues.apache.org/jira/browse/HIVE-6782
 Project: Hive
  Issue Type: Bug
  Components: Tez
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.13.0, 0.14.0

 Attachments: HIVE-6782.1.patch, HIVE-6782.10.patch, 
 HIVE-6782.11.patch, HIVE-6782.2.patch, HIVE-6782.3.patch, HIVE-6782.4.patch, 
 HIVE-6782.5.patch, HIVE-6782.6.patch, HIVE-6782.7.patch, HIVE-6782.8.patch, 
 HIVE-6782.9.patch


 HiveServer2 concurrency is failing intermittently when using tez, throwing 
 org.apache.tez.dag.api.SessionNotRunning: Application not running error



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6825) custom jars for Hive query should be uploaded to scratch dir per query; and/or versioned

2014-04-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962947#comment-13962947
 ] 

Hive QA commented on HIVE-6825:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639084/HIVE-6825.01.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5549 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2173/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2173/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639084

 custom jars for Hive query should be uploaded to scratch dir per query; 
 and/or versioned
 

 Key: HIVE-6825
 URL: https://issues.apache.org/jira/browse/HIVE-6825
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.14.0

 Attachments: HIVE-6825.01.patch, HIVE-6825.patch


 Currently the jars are uploaded to either user directory or global, whatever 
 is configured, which is a mess and can cause collisions. We can upload to 
 scratch directory, and/or version. 
 There's a tradeoff between having to upload files every time (for example, 
 for commonly used things like HBase input format) (which is what is done now, 
 into global/user path), and having a mess of one-off custom jars and files, 
 versioned, sitting in .hiveJars.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6860) Issue with FS based stats collection on Tez

2014-04-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6860:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk & 0.13

 Issue with FS based stats collection on Tez
 ---

 Key: HIVE-6860
 URL: https://issues.apache.org/jira/browse/HIVE-6860
 Project: Hive
  Issue Type: Bug
  Components: Statistics, Tez
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 0.13.0

 Attachments: HIVE-6860.patch


 Statistics from different tasks got overwritten while running on Tez.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6855) A couple of errors in MySQL db creation script for transaction tables

2014-04-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6855:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to 0.13 & trunk.

 A couple of errors in MySQL db creation script for transaction tables
 -

 Key: HIVE-6855
 URL: https://issues.apache.org/jira/browse/HIVE-6855
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-6855.patch


 There are a few small issues in the database creation scripts for mysql.  A 
 couple of the tables don't set the engine to InnoDB.  None of the tables set the 
 default character set to latin1.  And the syntax CREATE INDEX...USING HASH 
 doesn't work on older versions of MySQL.  Instead the index creation should 
 be done without specifying a method (no USING clause).
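 A sketch of the corrected DDL shape (table and column names here are 
 illustrative, not taken from the actual metastore script):
 {code}
 -- Illustrative only: engine and charset set explicitly, and the index is
 -- created without a USING clause so it also works on older MySQL versions.
 CREATE TABLE EXAMPLE_TXN_TABLE (
   TXN_ID BIGINT NOT NULL,
   TXN_STATE CHAR(1) NOT NULL,
   PRIMARY KEY (TXN_ID)
 ) ENGINE=InnoDB DEFAULT CHARSET=latin1;

 CREATE INDEX EXAMPLE_TXN_STATE_IDX ON EXAMPLE_TXN_TABLE (TXN_STATE);
 {code}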



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6113) Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

2014-04-08 Thread Eli Acherkan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963055#comment-13963055
 ] 

Eli Acherkan commented on HIVE-6113:


The exact same issue reproduces here. Hive 0.12 on MapR 3.1.0 with MySQL 
metastore. The exception appears when there are several processes working with 
Hive concurrently.

From our analysis the problem seems related to the one described here: 
http://mail-archives.apache.org/mod_mbox/hive-user/201107.mbox/%3c4f6b25afffcafe44b6259a412d5f9b1033183...@exchmbx104.netflix.com%3E

h5. Analysis:
At certain times, Hive's DataNucleus decides to create and then drop tables 
called DELETEME+timestamp in the metastore schema on MySQL (see 
[ProbeTable|http://sourceforge.net/p/datanucleus/code/HEAD/tree/platform/store.rdbms/tags/datanucleus-rdbms-3.2.2/src/java/org/datanucleus/store/rdbms/table/ProbeTable.java]).

During other flows, DataNucleus queries MySQL for the list of all the columns 
of all the tables (see 
[RDBMSSchemaHandler.refreshTableData|http://sourceforge.net/p/datanucleus/code/HEAD/tree/platform/store.rdbms/tags/datanucleus-rdbms-3.2.2/src/java/org/datanucleus/store/rdbms/schema/RDBMSSchemaHandler.java#l872]).
 MySQL's JDBC driver implements the DatabaseMetaData.getColumns method by 
querying the DB for a list of all the tables, and then iterating over that list 
and querying for each table's columns (see 
[com.mysql.jdbc.DatabaseMetaData|http://bazaar.launchpad.net/~mysql/connectorj/5.1/view/head:/src/com/mysql/jdbc/DatabaseMetaData.java#L2581]).
 If a table is deleted from the DB during this operation, 
DatabaseMetaData.getColumns will throw an exception.

This exception is interpreted by Hive to mean that the default Hive database 
doesn't exist. Hive tries to create it, inserting a row into the metastore.DBS 
table in MySQL, which triggers the "Duplicate entry 'default' for key 
'UNIQUE_DATABASE'" exception.

I'm not completely clear about the conditions for a) DataNucleus creating and 
dropping a DELETEME table, and b) DataNucleus calling 
DatabaseMetaData.getColumns, so unfortunately I can't yet provide a clear test 
case. But in our lab environment under load we were able to reproduce the 
exception once every few minutes.

h5. Workaround:
As suggested by the link above, setting the *datanucleus.fixedDatastore* 
property to *true* (e.g. in hive-site.xml or elsewhere) seems to solve the 
problem. However, it means that the metastore schema is no longer automatically 
created on-demand, and requires using Hive's schematool command to manually 
create the metastore schema.
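The workaround above amounts to a hive-site.xml fragment like the following (after which the schema must be created manually, e.g. with Hive's schematool):

{code}
<!-- Stop DataNucleus from probing/auto-creating the metastore schema,
     which avoids the DELETEME table create/drop race described above. -->
<property>
  <name>datanucleus.fixedDatastore</name>
  <value>true</value>
</property>
{code}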

 Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
 --

 Key: HIVE-6113
 URL: https://issues.apache.org/jira/browse/HIVE-6113
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 0.12.0
 Environment: hadoop-0.20.2-cdh3u3,hive-0.12.0
Reporter: William Stone
Priority: Critical
  Labels: HiveMetaStoreClient, metastore, unable_instantiate

 When I execute the SQL "use fdm; desc formatted fdm.tableName;" from Python, it 
 throws the error below, but when I try it again it succeeds.
 2013-12-25 03:01:32,290 ERROR exec.DDLTask (DDLTask.java:execute(435)) - org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
   at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1143)
   at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1128)
   at org.apache.hadoop.hive.ql.exec.DDLTask.switchDatabase(DDLTask.java:3479)
   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:237)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
   at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:260)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:217)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:507)
   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:875)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:769)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:708)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 

[jira] [Updated] (HIVE-6830) After major compaction unable to read from partition with MR job

2014-04-08 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-6830:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

The test case passed locally.

I just committed this. Thanks for the review Harish.

 After major compaction unable to read from partition with MR job
 

 Key: HIVE-6830
 URL: https://issues.apache.org/jira/browse/HIVE-6830
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Owen O'Malley
Priority: Critical
 Fix For: 0.13.0

 Attachments: HIVE-6830.patch


 After doing a major compaction, any attempt to read the data with an MR job 
 (select count(*), subsequent compaction) fails with:
 Caused by: java.lang.IllegalArgumentException: All base directories were 
 ignored, such as 
 hdfs://hdp.example.com:8020/apps/hive/warehouse/purchaselog/ds=201404031016/base_0044000
  by 5:4086:...



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6830) After major compaction unable to read from partition with MR job

2014-04-08 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963062#comment-13963062
 ] 

Owen O'Malley commented on HIVE-6830:
-

Sergey, if bestBase is defined, it adds whichever is older (either bestBase or 
child) to obsolete.

 After major compaction unable to read from partition with MR job
 

 Key: HIVE-6830
 URL: https://issues.apache.org/jira/browse/HIVE-6830
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Owen O'Malley
Priority: Critical
 Fix For: 0.13.0

 Attachments: HIVE-6830.patch


 After doing a major compaction, any attempt to read the data with an MR job 
 (select count(*), subsequent compaction) fails with:
 Caused by: java.lang.IllegalArgumentException: All base directories were 
 ignored, such as 
 hdfs://hdp.example.com:8020/apps/hive/warehouse/purchaselog/ds=201404031016/base_0044000
  by 5:4086:...



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6787) ORC+ACID assumes all missing buckets are in ACID structure

2014-04-08 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-6787:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the review, Sergey!

I just committed this.

 ORC+ACID assumes all missing buckets are in ACID structure
 --

 Key: HIVE-6787
 URL: https://issues.apache.org/jira/browse/HIVE-6787
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.0
Reporter: Gopal V
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6787.patch


 ORC+ACID creates ACID structure splits for all missing buckets in a table
 {code}
 java.lang.RuntimeException: java.io.IOException: java.io.IOException: 
 Vectorization and ACID tables are incompatible.
 at 
 org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:996)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:240)
   ... 15 more
 {code}
 The tables are normal ORC tables and are not using ACID structure at all.
 {code}
 @@ -539,7 +539,7 @@ public void run() {
  for(int b=0; b < context.numBuckets; ++b) {
if (!covered[b]) {
  context.splits.add(new OrcSplit(dir, b, 0, new String[0], null,
 -   false, false, deltas));
 +   isOriginal, false, deltas));
}
  }
 {code}
 seems to fix the issue. [~owen.omalley], please confirm if that is what I 
 should be doing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6846) allow safe set commands with sql standard authorization

2014-04-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963069#comment-13963069
 ] 

Hive QA commented on HIVE-6846:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639099/HIVE-6846.3.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5553 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2174/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2174/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639099

 allow safe set commands with sql standard authorization
 ---

 Key: HIVE-6846
 URL: https://issues.apache.org/jira/browse/HIVE-6846
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.13.0

 Attachments: HIVE-6846.1.patch, HIVE-6846.2.patch, HIVE-6846.3.patch


 HIVE-6827 disables all set commands when SQL standard authorization is turned 
 on, but not all set commands are unsafe. We should allow safe set commands.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package

2014-04-08 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-6757:


Assignee: Harish Butani  (was: Owen O'Malley)

 Remove deprecated parquet classes from outside of org.apache package
 

 Key: HIVE-6757
 URL: https://issues.apache.org/jira/browse/HIVE-6757
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Harish Butani
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6757.2.patch, HIVE-6757.patch, parquet-hive.patch


 Apache shouldn't release projects with files outside of the org.apache 
 namespace.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package

2014-04-08 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6757:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk and 0.13
thanks Owen, Xuefu, Brock, Justin.

 Remove deprecated parquet classes from outside of org.apache package
 

 Key: HIVE-6757
 URL: https://issues.apache.org/jira/browse/HIVE-6757
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Harish Butani
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6757.2.patch, HIVE-6757.patch, parquet-hive.patch


 Apache shouldn't release projects with files outside of the org.apache 
 namespace.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-4904) A little more CP crossing RS boundaries

2014-04-08 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963115#comment-13963115
 ] 

Ashutosh Chauhan commented on HIVE-4904:


+1

 A little more CP crossing RS boundaries
 ---

 Key: HIVE-4904
 URL: https://issues.apache.org/jira/browse/HIVE-4904
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-4904.3.patch, HIVE-4904.4.patch, HIVE-4904.5.patch, 
 HIVE-4904.D11757.1.patch, HIVE-4904.D11757.2.patch


 Currently, CP context cannot be propagated over RS except for JOIN/EXT. A 
 little more CP is possible.





[jira] [Commented] (HIVE-6818) Array out of bounds when ORC is used with ACID and predicate push down

2014-04-08 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963110#comment-13963110
 ] 

Owen O'Malley commented on HIVE-6818:
-

The three failures are unrelated and pass when I run it locally. I'll commit 
this after the 24 hours.

 Array out of bounds when ORC is used with ACID and predicate push down
 --

 Key: HIVE-6818
 URL: https://issues.apache.org/jira/browse/HIVE-6818
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6818.patch


 The users gets an ArrayOutOfBoundsException when using ORC, ACID, and 
 predicate push down.





[jira] [Commented] (HIVE-6858) Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.

2014-04-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963187#comment-13963187
 ] 

Hive QA commented on HIVE-6858:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639098/HIVE-6858.1.patch

{color:green}SUCCESS:{color} +1 5550 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2175/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2175/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639098

 Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
 ---

 Key: HIVE-6858
 URL: https://issues.apache.org/jira/browse/HIVE-6858
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6858.1.patch


 Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
 {noformat}
  -250.0  6583411.236 1.0 6583411.236 -0.004  -0.0048
 ---
  -250.0  6583411.236 1.0 6583411.236 -0.0040 -0.0048
 {noformat}
 The following code reproduces this behavior when run on jdk-7 vs jdk-6: jdk-7 
 produces -0.004 while jdk-6 produces -0.0040.
 {code}
 public class Main {
   public static void main(String[] a) throws Exception {
     double val = 0.004;
     System.out.println("Value = " + val);
   }
 }
 {code}
 This happens to be a bug in jdk6 that has been fixed in jdk7.
 http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4511638
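A version-stable way to sidestep the Double.toString discrepancy described above is to pin the number of fraction digits explicitly. This is a generic sketch of that idea, not the fix used in the patch:

```java
import java.util.Locale;

public class StableFormat {
    public static void main(String[] args) {
        double val = 0.004;
        // Double.toString(val) differed across JDKs for values like this one
        // (the digest above reports -0.004 on jdk-7 vs -0.0040 on jdk-6).
        // Formatting with an explicit precision and a fixed locale makes the
        // output identical regardless of JDK version.
        System.out.println(String.format(Locale.ROOT, "%.4f", val)); // 0.0040
    }
}
```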





[jira] [Commented] (HIVE-6430) MapJoin hash table has large memory overhead

2014-04-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963192#comment-13963192
 ] 

Sergey Shelukhin commented on HIVE-6430:


Tested the patch on real queries. I do see a huge memory reduction (on a modified 
TPCDS query 72, the worst map task's dump after populating hash tables goes from 
7Gb to ~1.2Gb; I'll need to download the dumps to analyze, but it's pretty clear 
cut), and the GC time counter goes down from ~1min total to a few seconds, as 
expected. However, I also see a huge wall clock time increase (without a 
corresponding CPU time increase, it looks like) during processing. I would 
expect some tradeoff, but not as much as I'm seeing... will profile more.

 MapJoin hash table has large memory overhead
 

 Key: HIVE-6430
 URL: https://issues.apache.org/jira/browse/HIVE-6430
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, 
 HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.05.patch, 
 HIVE-6430.06.patch, HIVE-6430.patch


 Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 
 for row) can take several hundred bytes, which is ridiculous. I am reducing 
 the size of MJKey and MJRowContainer in other jiras, but in general we don't 
 need a Java hash table there. We can either use a primitive-friendly 
 hashtable like the one from HPPC (Apache-licensed), or some variation, to map 
 primitive keys to a single row storage structure without an object per row 
 (similar to vectorization).
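A minimal sketch of the primitive-friendly idea described above: an HPPC-style open-addressing map from long keys to int offsets into a flat row store, with no object allocated per entry. The class and method names are illustrative only, not Hive's actual MapJoin classes:

```java
// Illustrative primitive map: long key -> int offset, linear probing,
// no per-entry objects (keys and values live in flat arrays).
public class LongToIntMap {
    private long[] keys;
    private int[] values;
    private boolean[] used;
    private int size;

    public LongToIntMap(int capacity) {
        // Round up to a power of two so we can mask instead of modulo.
        int cap = Integer.highestOneBit(Math.max(4, capacity) * 2);
        keys = new long[cap];
        values = new int[cap];
        used = new boolean[cap];
    }

    private int slot(long key) {
        int mask = keys.length - 1;
        int i = (int) (key ^ (key >>> 32)) & mask;
        // Linear probing: advance until we find the key or an empty slot.
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask;
        }
        return i;
    }

    public void put(long key, int value) {
        if (size * 2 >= keys.length) {
            grow();
        }
        int i = slot(key);
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    /** Returns the stored offset, or -1 if the key is absent. */
    public int get(long key) {
        int i = slot(key);
        return used[i] ? values[i] : -1;
    }

    private void grow() {
        LongToIntMap bigger = new LongToIntMap(keys.length * 2);
        for (int i = 0; i < keys.length; i++) {
            if (used[i]) bigger.put(keys[i], values[i]);
        }
        keys = bigger.keys;
        values = bigger.values;
        used = bigger.used;
    }
}
```

The int value would index into a separate flat byte or long array holding the serialized rows, which is how an object per row is avoided.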





[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-04-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963194#comment-13963194
 ] 

Sergey Shelukhin commented on HIVE-6809:


Can you update RB also?

 Support bulk deleting directories for partition drop with partial spec
 --

 Key: HIVE-6809
 URL: https://issues.apache.org/jira/browse/HIVE-6809
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6809.1.patch.txt, HIVE-6809.2.patch.txt, 
 HIVE-6809.3.patch.txt, HIVE-6809.4.patch.txt


 In a busy Hadoop system, dropping many partitions takes much more time than 
 expected. In hive-0.11.0, removing 1700 partitions by a single partial spec 
 took 90 minutes, which was reduced to 3 minutes when deleteData was set false. 
 I couldn't test this on recent Hive, which has HIVE-6256, but if the 
 time-consuming part is mostly removing directories, that alone does not seem 
 to help reduce the whole processing time.





[jira] [Commented] (HIVE-6825) custom jars for Hive query should be uploaded to scratch dir per query; and/or versioned

2014-04-08 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963205#comment-13963205
 ] 

Sergey Shelukhin commented on HIVE-6825:


The test failure looks unrelated and the test passes for me locally. Will 
commit today late afternoon (after 24h)

 custom jars for Hive query should be uploaded to scratch dir per query; 
 and/or versioned
 

 Key: HIVE-6825
 URL: https://issues.apache.org/jira/browse/HIVE-6825
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.14.0

 Attachments: HIVE-6825.01.patch, HIVE-6825.patch


 Currently the jars are uploaded to either user directory or global, whatever 
 is configured, which is a mess and can cause collisions. We can upload to 
 scratch directory, and/or version. 
 There's a tradeoff between having to upload files every time (for example, 
 for commonly used things like HBase input format) (which is what is done now, 
 into global/user path), and having a mess of one-off custom jars and files, 
 versioned, sitting in .hiveJars.





[jira] [Updated] (HIVE-6846) allow safe set commands with sql standard authorization

2014-04-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6846:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to 0.13 & trunk.

 allow safe set commands with sql standard authorization
 ---

 Key: HIVE-6846
 URL: https://issues.apache.org/jira/browse/HIVE-6846
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.13.0

 Attachments: HIVE-6846.1.patch, HIVE-6846.2.patch, HIVE-6846.3.patch


 HIVE-6827 disables all set commands when SQL standard authorization is turned 
 on, but not all set commands are unsafe. We should allow safe set commands.





[jira] [Updated] (HIVE-6604) Fix vectorized input to work with ACID

2014-04-08 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-6604:


Attachment: HIVE-6604.patch

This patch:
* Adds support for Decimal to VectorizedBatchUtil.addRowToBatch.
* Makes addRowToBatch copy the bytes for Strings to avoid them being overwritten 
by the next value.
* Adds a unit test case with ACID, vectorization, and all of the handled types.
* Fixes some of the method names to use proper capitalization.
* Removes the unused parameter to setNullColIsNullValue.
* Adds tracking of the number of inserts, updates, and deletes.
* Fixes WriterImpl.writeIntermediateFooter to notify the callback.
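The byte-copying point above guards against a classic aliasing hazard: if a writer stores a reference to a buffer that the reader reuses for every row, the next row silently overwrites earlier ones. A self-contained illustration of that hazard (not Hive code):

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BufferReuseDemo {
    public static void main(String[] args) {
        byte[] reused = new byte[8];  // buffer the "reader" reuses per row
        List<byte[]> byReference = new ArrayList<>();
        List<byte[]> byCopy = new ArrayList<>();

        for (String s : new String[] {"aaaa", "bbbb"}) {
            byte[] src = s.getBytes(StandardCharsets.UTF_8);
            System.arraycopy(src, 0, reused, 0, src.length);
            byReference.add(reused);                        // aliases the buffer
            byCopy.add(Arrays.copyOf(reused, src.length));  // snapshot of bytes
        }
        // The aliased first entry was overwritten by the second row:
        System.out.println(
            new String(byReference.get(0), 0, 4, StandardCharsets.UTF_8)); // bbbb
        // The copied first entry kept its original contents:
        System.out.println(
            new String(byCopy.get(0), StandardCharsets.UTF_8));            // aaaa
    }
}
```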


 Fix vectorized input to work with ACID
 --

 Key: HIVE-6604
 URL: https://issues.apache.org/jira/browse/HIVE-6604
 Project: Hive
  Issue Type: Sub-task
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6604.patch, HIVE-6604.patch


 Fix the VectorizedOrcInputFormat to work with the ACID directories.





[jira] [Commented] (HIVE-6604) Fix vectorized input to work with ACID

2014-04-08 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963277#comment-13963277
 ] 

Owen O'Malley commented on HIVE-6604:
-

Jitendra, regarding your other comments:
* Each row returned by next is added to the batch.
* Until we add ACID update and delete into Hive's SQL, we can't make a qfile 
test for this.

 Fix vectorized input to work with ACID
 --

 Key: HIVE-6604
 URL: https://issues.apache.org/jira/browse/HIVE-6604
 Project: Hive
  Issue Type: Sub-task
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6604.patch, HIVE-6604.patch


 Fix the VectorizedOrcInputFormat to work with the ACID directories.





Re: Review Request 19754: Defines a api for streaming data into Hive using ACID support.

2014-04-08 Thread Roshan Naik

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19754/
---

(Updated April 8, 2014, 6:27 p.m.)


Review request for hive.


Changes
---

addressing review comments.
 - move to hcatalog
 - expose HiveConf to client API


Bugs: HIVE-5687
https://issues.apache.org/jira/browse/HIVE-5687


Repository: hive-git


Description
---

Defines an API for streaming data into Hive using ACID support.


Diffs (updated)
-

  hcatalog/pom.xml 50ce296 
  hcatalog/streaming/pom.xml PRE-CREATION 
  hcatalog/streaming/src/docs/package.html PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/ConnectionError.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HeartBeatFailure.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/ImpersonationFailed.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/InvalidColumn.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/InvalidPartition.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/InvalidTable.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/InvalidTrasactionState.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/PartitionCreationFailed.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/QueryFailedException.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/RecordWriter.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/SerializationError.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/StreamingConnection.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/StreamingException.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/StreamingIOFailure.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/StrictJsonWriter.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/TransactionBatch.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/TransactionBatchUnAvailable.java
 PRE-CREATION 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/TransactionError.java
 PRE-CREATION 
  
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/StreamingIntegrationTester.java
 PRE-CREATION 
  
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestDelimitedInputWriter.java
 PRE-CREATION 
  
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java
 PRE-CREATION 
  hcatalog/streaming/src/test/sit PRE-CREATION 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
1bbe02e 
  packaging/pom.xml de9b002 
  packaging/src/main/assembly/src.xml bdaa47b 

Diff: https://reviews.apache.org/r/19754/diff/


Testing
---

Unit tests included. Manual testing was also done by streaming data using Flume.


Thanks,

Roshan Naik



[jira] [Updated] (HIVE-6782) HiveServer2Concurrency issue when running with tez intermittently, throwing org.apache.tez.dag.api.SessionNotRunning: Application not running error

2014-04-08 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6782:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to both trunk and branch-0.13

 HiveServer2Concurrency issue when running with tez intermittently, throwing 
 org.apache.tez.dag.api.SessionNotRunning: Application not running error
 -

 Key: HIVE-6782
 URL: https://issues.apache.org/jira/browse/HIVE-6782
 Project: Hive
  Issue Type: Bug
  Components: Tez
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.13.0, 0.14.0

 Attachments: HIVE-6782.1.patch, HIVE-6782.10.patch, 
 HIVE-6782.11.patch, HIVE-6782.2.patch, HIVE-6782.3.patch, HIVE-6782.4.patch, 
 HIVE-6782.5.patch, HIVE-6782.6.patch, HIVE-6782.7.patch, HIVE-6782.8.patch, 
 HIVE-6782.9.patch


 HiveServer2 concurrency is failing intermittently when using tez, throwing 
 org.apache.tez.dag.api.SessionNotRunning: Application not running error





[jira] [Updated] (HIVE-6759) Fix reading partial ORC files while they are being written

2014-04-08 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-6759:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the review, Sergey and Harish. Thejas ran the unit tests internally 
and they passed.

I just committed this.

 Fix reading partial ORC files while they are being written
 --

 Key: HIVE-6759
 URL: https://issues.apache.org/jira/browse/HIVE-6759
 Project: Hive
  Issue Type: Sub-task
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Critical
 Fix For: 0.13.0

 Attachments: HIVE-6759.patch


 HDFS with the hflush ensures the bytes are visible, but doesn't update the 
 file length on the NameNode. Currently the Orc reader will only read up to 
 the length on the NameNode. If the user specified a length from a 
 flush_length file, the Orc reader should trust it to be right.
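A sketch of the side-file idea described above: trust a separately recorded flushed length over the (possibly stale) length the filesystem reports. The file naming and single-long layout here are assumptions for illustration, not ORC's actual on-disk format, and plain java.io stands in for the HDFS API:

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class SideFileLength {
    /**
     * Returns the number of readable bytes for a data file that may still be
     * open for writing. If a side file recording the flushed length exists,
     * trust it; otherwise fall back to the length the filesystem reports
     * (which, as described above, can lag behind hflush'ed bytes on HDFS).
     */
    public static long readableLength(File data, File sideFile) throws IOException {
        if (sideFile.exists()) {
            try (DataInputStream in =
                     new DataInputStream(new FileInputStream(sideFile))) {
                return in.readLong();  // illustrative single-long layout
            }
        }
        return data.length();
    }
}
```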





[jira] [Commented] (HIVE-6822) TestAvroSerdeUtils fails with -Phadoop-2

2014-04-08 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963294#comment-13963294
 ] 

Ashutosh Chauhan commented on HIVE-6822:


+1

 TestAvroSerdeUtils fails with -Phadoop-2
 

 Key: HIVE-6822
 URL: https://issues.apache.org/jira/browse/HIVE-6822
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6822.1.patch


 Works fine with -Phadoop-1, but with -Phadoop-2 hits the following error:
 {noformat}
 Running org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils
 Tests run: 10, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.603 sec 
  FAILURE! - in org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils
 determineSchemaCanReadSchemaFromHDFS(org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils)
   Time elapsed: 0.688 sec   ERROR!
 java.lang.NoClassDefFoundError: 
 com/sun/jersey/spi/container/servlet/ServletContainer
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
   at 
 org.apache.hadoop.http.HttpServer2.addJerseyResourcePackage(HttpServer2.java:564)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.initWebHdfs(NameNodeHttpServer.java:84)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:121)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:601)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:500)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:658)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:643)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1259)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:914)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:805)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:663)
   at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:603)
   at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:474)
   at 
 org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils.determineSchemaCanReadSchemaFromHDFS(TestAvroSerdeUtils.java:189)
 {noformat}





[jira] [Updated] (HIVE-6845) TestJdbcDriver.testShowRoleGrant can fail if TestJdbcDriver/TestJdbcDriver2 run together

2014-04-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6845:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to 0.13 & trunk.

 TestJdbcDriver.testShowRoleGrant can fail if TestJdbcDriver/TestJdbcDriver2 
 run together
 

 Key: HIVE-6845
 URL: https://issues.apache.org/jira/browse/HIVE-6845
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Jason Dere
Assignee: Jason Dere
 Fix For: 0.13.0

 Attachments: HIVE-6845.1.patch


 Running both TestJdbcDriver/TestJdbcDriver2 together in the same run gives an 
 error in testShowRoleGrant() because both tests create the role "role1". 
 When the 2nd test tries to create the role it fails:
 {noformat}
 testShowRoleGrant(org.apache.hive.jdbc.TestJdbcDriver2)  Time elapsed: 1.801 
 sec   ERROR!
 java.sql.SQLException: Error while processing statement: FAILED: Execution 
 Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:275)
   at 
 org.apache.hive.jdbc.TestJdbcDriver2.testShowRoleGrant(TestJdbcDriver2.java:2000)
 {noformat}





[jira] [Commented] (HIVE-6853) show create table for hbase tables should exclude LOCATION

2014-04-08 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963316#comment-13963316
 ] 

Szehon Ho commented on HIVE-6853:
-

Thanks for the fix. Only one minor comment: is it needed to make a 
StringBuilder when there is only one string to return?

Also, can you upload the patch with the right name format? The precommit test 
takes patches in the form HIVE-.patch or HIVE-.n.patch only.

 show create table for hbase tables should exclude LOCATION 
 ---

 Key: HIVE-6853
 URL: https://issues.apache.org/jira/browse/HIVE-6853
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Affects Versions: 0.10.0
Reporter: Miklos Christine
 Attachments: HIVE-6853-0.patch


 If you create a table on top of HBase in Hive and issue "show create table 
 hbase_table", it gives a bad DDL. It should not show LOCATION:  
   
 
 [hive]$ cat /tmp/test_create.sql
 CREATE EXTERNAL TABLE nba_twitter.hbase2(
 key string COMMENT 'from deserializer',
 name string COMMENT 'from deserializer',
 pdt string COMMENT 'from deserializer',
 service string COMMENT 'from deserializer',
 term string COMMENT 'from deserializer',
 update1 string COMMENT 'from deserializer')
 ROW FORMAT SERDE
 'org.apache.hadoop.hive.hbase.HBaseSerDe'
 STORED BY
 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
 WITH SERDEPROPERTIES (
 'serialization.format'='1',
 'hbase.columns.mapping'=':key,srv:name,srv:pdt,srv:service,srv:term,srv:update')
 LOCATION
 'hdfs://nameservice1/user/hive/warehouse/nba_twitter.db/hbase'
 TBLPROPERTIES (
 'hbase.table.name'='NBATwitter',
 'transient_lastDdlTime'='1386172188')
 Trying to create a table using the above fails:
 [hive]$ hive -f /tmp/test_create.sql
 cli -f /tmp/test_create.sql
 Logging initialized using configuration in 
 jar:file:/opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hive/lib/hive-common-0.10.0-cdh4.4.0.jar!/hive-log4j.properties
 FAILED: Error in metadata: MetaException(message:LOCATION may not be 
 specified for HBase.)
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 However, if I remove the LOCATION, then the DDL is valid.





[jira] [Updated] (HIVE-6812) show compactions returns error when there are no compactions

2014-04-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6812:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to 0.13 & trunk.

 show compactions returns error when there are no compactions
 

 Key: HIVE-6812
 URL: https://issues.apache.org/jira/browse/HIVE-6812
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-6812.patch


 Doing show compactions when there are no current transactions in process or 
 in the queue results in: 
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. null





[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-04-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1396#comment-1396
 ] 

Hive QA commented on HIVE-6809:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639100/HIVE-6809.4.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5551 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_dyn_part
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2176/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2176/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639100

 Support bulk deleting directories for partition drop with partial spec
 --

 Key: HIVE-6809
 URL: https://issues.apache.org/jira/browse/HIVE-6809
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6809.1.patch.txt, HIVE-6809.2.patch.txt, 
 HIVE-6809.3.patch.txt, HIVE-6809.4.patch.txt


 In a busy Hadoop system, dropping many partitions takes much more time than 
 expected. In hive-0.11.0, removing 1700 partitions by a single partial spec 
 took 90 minutes, which was reduced to 3 minutes when deleteData was set false. 
 I couldn't test this on recent Hive, which has HIVE-6256, but if the 
 time-consuming part is mostly removing directories, that alone does not seem 
 to help reduce the whole processing time.





[jira] [Commented] (HIVE-6604) Fix vectorized input to work with ACID

2014-04-08 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963335#comment-13963335
 ] 

Jitendra Nath Pandey commented on HIVE-6604:


+1

 Fix vectorized input to work with ACID
 --

 Key: HIVE-6604
 URL: https://issues.apache.org/jira/browse/HIVE-6604
 Project: Hive
  Issue Type: Sub-task
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6604.patch, HIVE-6604.patch


 Fix the VectorizedOrcInputFormat to work with the ACID directories.





[jira] [Updated] (HIVE-6773) Update readme for ptest2 framework

2014-04-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6773:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

 Update readme for ptest2 framework
 --

 Key: HIVE-6773
 URL: https://issues.apache.org/jira/browse/HIVE-6773
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Szehon Ho
Assignee: Szehon Ho
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-6773.patch


 Approvals dependency is needed for testing.  Need to add instructions.
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6844) support separate configuration param for enabling authorization using new interface

2014-04-08 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963364#comment-13963364
 ] 

Thejas M Nair commented on HIVE-6844:
-

I think the configuration doc is fine. It was just me not RTFM ! :)



 support separate configuration param for enabling authorization using new 
 interface
 ---

 Key: HIVE-6844
 URL: https://issues.apache.org/jira/browse/HIVE-6844
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair

 The existing configuration parameter *hive.security.authorization.enabled* is 
 used both for SQL query level authorization at query compilation and for 
 metastore API authorization of the thrift metastore API calls. This 
 makes it hard to configure the security settings flexibly and correctly.
 It should be possible to enable SQL query level authorization and 
 metastore api authorization independently of each other.
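For illustration, the current single flag and the proposed split might look like this in hive-site.xml. The second property name below is invented purely to show the intent of independent metastore-side control; it is not an actual Hive parameter:

```xml
<!-- Today, a single flag gates both the SQL compile-time layer
     and the metastore API layer: -->
<property>
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
</property>

<!-- Hypothetical name, for illustration only: a separate switch would let
     the metastore API layer be enabled or disabled independently. -->
<property>
  <name>hive.security.metastore.authorization.enabled</name>
  <value>false</value>
</property>
```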





[jira] [Commented] (HIVE-6863) HiveServer2 binary mode throws exception with PAM

2014-04-08 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963387#comment-13963387
 ] 

Thejas M Nair commented on HIVE-6863:
-

+1

 HiveServer2 binary mode throws exception with PAM
 -

 Key: HIVE-6863
 URL: https://issues.apache.org/jira/browse/HIVE-6863
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6863.1.patch


 Works fine in http mode





[jira] [Commented] (HIVE-4629) HS2 should support an API to retrieve query logs

2014-04-08 Thread Ashu Pachauri (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963416#comment-13963416
 ] 

Ashu Pachauri commented on HIVE-4629:
-

Any estimate on when this will be accepted into trunk?

 HS2 should support an API to retrieve query logs
 

 Key: HIVE-4629
 URL: https://issues.apache.org/jira/browse/HIVE-4629
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Reporter: Shreepadma Venugopalan
Assignee: Shreepadma Venugopalan
 Attachments: HIVE-4629-no_thrift.1.patch, HIVE-4629.1.patch, 
 HIVE-4629.2.patch


 HiveServer2 should support an API to retrieve query logs. This is 
 particularly relevant because HiveServer2 supports async execution but 
 doesn't provide a way to report progress. Providing an API to retrieve query 
 logs will help report progress to the client.





[jira] [Commented] (HIVE-5888) group by after join operation product no result when hive.optimize.skewjoin = true

2014-04-08 Thread Muthu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963454#comment-13963454
 ] 

Muthu commented on HIVE-5888:
-

[~navis] After applying the patch from HIVE-6041 to hive 0.12, queries with 
auto MAPJOIN fail with the following error. Any workarounds?
set hive.optimize.skewjoin=true; set hive.auto.convert.join=true; SELECT 
ru.userid, SUM(ru.total_count) FROM BIGTABLE ru JOIN SMALLTABLE c on 
c.creative_id = ru.creative_id JOIN placement_dapi p ON p.placement_id = 
c.placement_id WHERE ru.realdate = '2014-01-02' AND ru.userid  0 GROUP BY 
ru.userid;

Stage-1 is selected by condition resolver.
java.io.FileNotFoundException: java.io.FileNotFoundException: File does not 
exist: 
/tmp/hive-muthu.nivas/tmp/hive-muthu.nivas/hive_2014-02-26_18-17-04_075_3879899075227148508-1/-mr-10002
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at 
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:96)
at 
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:58)
at org.apache.hadoop.hdfs.DFSClient.getContentSummary(DFSClient.java:917)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:232)
at 
org.apache.hadoop.hive.ql.plan.ConditionalResolverCommonJoin.resolveMapJoinTask(ConditionalResolverCommonJoin.java:185)
at 
org.apache.hadoop.hive.ql.plan.ConditionalResolverCommonJoin.getTasks(ConditionalResolverCommonJoin.java:117)
at 
org.apache.hadoop.hive.ql.exec.ConditionalTask.execute(ConditionalTask.java:81)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:55)


 group by after join operation product no result when  hive.optimize.skewjoin 
 = true 
 

 Key: HIVE-5888
 URL: https://issues.apache.org/jira/browse/HIVE-5888
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0, 0.12.0
Reporter: cyril liao
Priority: Critical





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6861) more hadoop2 only golden files to fix

2014-04-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963462#comment-13963462
 ] 

Hive QA commented on HIVE-6861:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639124/HIVE-6861.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s),  tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_map_operators
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2179/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2179/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639124

 more hadoop2 only golden files to fix
 -

 Key: HIVE-6861
 URL: https://issues.apache.org/jira/browse/HIVE-6861
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6861.1.patch


 More hadoop2 golden files to fix due to HIVE-6643, HIVE-6642, HIVE-6808, 
 HIVE-6144.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6858) Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.

2014-04-08 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6858:
---

Attachment: HIVE-6858.2.patch

Updated patch with the fix to groupby3_map_skew.q as well, as suggested by 
[~jdere]. Verified that it passes on both JDK 6 and JDK 7.

 Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
 ---

 Key: HIVE-6858
 URL: https://issues.apache.org/jira/browse/HIVE-6858
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6858.1.patch, HIVE-6858.2.patch


 Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
 {noformat}
  -250.0  6583411.236 1.0 6583411.236 -0.004  -0.0048
 ---
  -250.0  6583411.236 1.0 6583411.236 -0.0040 -0.0048
 {noformat}
 The following code reproduces this behavior when run on JDK 7 vs JDK 6: JDK 7 
 produces -0.004 while JDK 6 produces -0.0040.
 {code}
 public class Main {
   public static void main(String[] a) throws Exception {
     double val = 0.004;
     System.out.println("Value = " + val);
   }
 }
 {code}
 This happens to be a bug in JDK 6 that has been fixed in JDK 7:
 http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4511638
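For golden files that must match on both JDK 6 and JDK 7, one workaround (a minimal sketch, not part of the attached patch) is to print doubles through an explicit format pattern, which is deterministic on both JDKs, instead of relying on Double.toString:

```java
import java.util.Locale;

public class StableFormat {
    public static void main(String[] args) {
        double val = 0.004;
        // Double.toString chooses a short decimal representation, and its
        // output for this value differs between JDK 6 ("0.0040") and
        // JDK 7 ("0.004"), per the diff above. An explicit pattern with a
        // fixed locale produces the same text on both:
        String fixed = String.format(Locale.ROOT, "%.4f", val);
        System.out.println("Value = " + fixed); // prints "Value = 0.0040"
    }
}
```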



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6867) Bucketized Table feature fails in some cases

2014-04-08 Thread Laljo John Pullokkaran (JIRA)
Laljo John Pullokkaran created HIVE-6867:


 Summary: Bucketized Table feature fails in some cases
 Key: HIVE-6867
 URL: https://issues.apache.org/jira/browse/HIVE-6867
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran


Bucketized Table feature fails in some cases: if src & destination are bucketed 
on the same key, and the actual data in the src is not bucketed (because the 
data was loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed 
while writing to the destination.
Example
--
CREATE TABLE P1(key STRING, val STRING)
CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/Users/jpullokkaran/apache-hive1/data/files/P1.txt' 
INTO TABLE P1;
-- perform an insert to make sure there are 2 files
INSERT OVERWRITE TABLE P1 select key, val from P1;
--
This is not a regression; it has never worked. It was only discovered due to 
Hadoop2 changes.
In Hadoop1, in local mode, the number of reducers is always 1, regardless of 
what is requested by the app. Hadoop2 now honors the number-of-reducers setting 
in local mode (by spawning threads).
The long-term solution seems to be to prevent LOAD DATA for bucketed tables.
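The failure mode is easier to see against the bucketing contract itself. A standalone sketch of the bucket-assignment rule (hypothetical code; Hive's real hashing lives in ObjectInspectorUtils and the ReduceSink path) shows the computation that LOAD DATA skips, since LOAD DATA only copies files and never hashes rows into buckets:

```java
public class BucketSketch {
    // Bucketing contract (sketch): a row with a given key belongs in bucket
    // (hash(key) & Integer.MAX_VALUE) % numBuckets. INSERT ... SELECT routes
    // every row through this computation; LOAD DATA LOCAL INPATH bypasses it,
    // so the file layout of P1 above carries no bucketing guarantee.
    static int bucketFor(String key, int numBuckets) {
        return (key.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        for (String key : new String[] {"238", "86", "311"}) {
            System.out.println(key + " -> bucket " + bucketFor(key, 2));
        }
    }
}
```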



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6867) Bucketized Table feature fails in some cases

2014-04-08 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963472#comment-13963472
 ] 

Laljo John Pullokkaran commented on HIVE-6867:
--

BucketingSortingReduceSinkOptimizer removes the RS op if src & destination are 
bucketed on the same key.

 Bucketized Table feature fails in some cases
 

 Key: HIVE-6867
 URL: https://issues.apache.org/jira/browse/HIVE-6867
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran

 Bucketized Table feature fails in some cases: if src & destination are 
 bucketed on the same key, and the actual data in the src is not bucketed 
 (because the data was loaded using LOAD DATA LOCAL INPATH), then the data 
 won't be bucketed while writing to the destination.
 Example
 --
 CREATE TABLE P1(key STRING, val STRING)
 CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
 LOAD DATA LOCAL INPATH '/Users/jpullokkaran/apache-hive1/data/files/P1.txt' 
 INTO TABLE P1;
 -- perform an insert to make sure there are 2 files
 INSERT OVERWRITE TABLE P1 select key, val from P1;
 --
 This is not a regression; it has never worked. It was only discovered due to 
 Hadoop2 changes.
 In Hadoop1, in local mode, the number of reducers is always 1, regardless of 
 what is requested by the app. Hadoop2 now honors the number-of-reducers 
 setting in local mode (by spawning threads).
 The long-term solution seems to be to prevent LOAD DATA for bucketed tables.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6858) Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.

2014-04-08 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6858:
---

Status: Open  (was: Patch Available)

 Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
 ---

 Key: HIVE-6858
 URL: https://issues.apache.org/jira/browse/HIVE-6858
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6858.1.patch, HIVE-6858.2.patch


 Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
 {noformat}
  -250.0  6583411.236 1.0 6583411.236 -0.004  -0.0048
 ---
  -250.0  6583411.236 1.0 6583411.236 -0.0040 -0.0048
 {noformat}
 The following code reproduces this behavior when run on JDK 7 vs JDK 6: JDK 7 
 produces -0.004 while JDK 6 produces -0.0040.
 {code}
 public class Main {
   public static void main(String[] a) throws Exception {
     double val = 0.004;
     System.out.println("Value = " + val);
   }
 }
 {code}
 This happens to be a bug in JDK 6 that has been fixed in JDK 7:
 http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4511638



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6858) Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.

2014-04-08 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6858:
---

Status: Patch Available  (was: Open)

 Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
 ---

 Key: HIVE-6858
 URL: https://issues.apache.org/jira/browse/HIVE-6858
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6858.1.patch, HIVE-6858.2.patch


 Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
 {noformat}
  -250.0  6583411.236 1.0 6583411.236 -0.004  -0.0048
 ---
  -250.0  6583411.236 1.0 6583411.236 -0.0040 -0.0048
 {noformat}
 The following code reproduces this behavior when run on JDK 7 vs JDK 6: JDK 7 
 produces -0.004 while JDK 6 produces -0.0040.
 {code}
 public class Main {
   public static void main(String[] a) throws Exception {
     double val = 0.004;
     System.out.println("Value = " + val);
   }
 }
 {code}
 This happens to be a bug in JDK 6 that has been fixed in JDK 7:
 http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4511638



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6858) Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.

2014-04-08 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963470#comment-13963470
 ] 

Jason Dere commented on HIVE-6858:
--

Thanks for tracking this one down. 
+1

 Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
 ---

 Key: HIVE-6858
 URL: https://issues.apache.org/jira/browse/HIVE-6858
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6858.1.patch, HIVE-6858.2.patch


 Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
 {noformat}
  -250.0  6583411.236 1.0 6583411.236 -0.004  -0.0048
 ---
  -250.0  6583411.236 1.0 6583411.236 -0.0040 -0.0048
 {noformat}
 The following code reproduces this behavior when run on JDK 7 vs JDK 6: JDK 7 
 produces -0.004 while JDK 6 produces -0.0040.
 {code}
 public class Main {
   public static void main(String[] a) throws Exception {
     double val = 0.004;
     System.out.println("Value = " + val);
   }
 }
 {code}
 This happens to be a bug in JDK 6 that has been fixed in JDK 7:
 http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4511638



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6862) add DB schema DDL and upgrade 12to13 scripts for MS SQL Server

2014-04-08 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6862:
-

Summary: add DB schema DDL and upgrade 12to13 scripts for MS SQL Server  
(was: add DB schema DDL statements for MS SQL Server)

 add DB schema DDL and upgrade 12to13 scripts for MS SQL Server
 --

 Key: HIVE-6862
 URL: https://issues.apache.org/jira/browse/HIVE-6862
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman

 need to add a unified 0.13 script and a separate script for ACID support



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6867) Bucketized Table feature fails in some cases

2014-04-08 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-6867:
-

Description: 
Bucketized Table feature fails in some cases: if src & destination are bucketed 
on the same key, and the actual data in the src is not bucketed (because the 
data was loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed 
while writing to the destination.
Example
--
CREATE TABLE P1(key STRING, val STRING)
CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE P1;
-- perform an insert to make sure there are 2 files
INSERT OVERWRITE TABLE P1 select key, val from P1;
--
This is not a regression; it has never worked. It was only discovered due to 
Hadoop2 changes.
In Hadoop1, in local mode, the number of reducers is always 1, regardless of 
what is requested by the app. Hadoop2 now honors the number-of-reducers setting 
in local mode (by spawning threads).
The long-term solution seems to be to prevent LOAD DATA for bucketed tables.

  was:
Bucketized Table feature fails in some cases: if src & destination are bucketed 
on the same key, and the actual data in the src is not bucketed (because the 
data was loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed 
while writing to the destination.
Example
--
CREATE TABLE P1(key STRING, val STRING)
CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/Users/jpullokkaran/apache-hive1/data/files/P1.txt' 
INTO TABLE P1;
-- perform an insert to make sure there are 2 files
INSERT OVERWRITE TABLE P1 select key, val from P1;
--
This is not a regression; it has never worked. It was only discovered due to 
Hadoop2 changes.
In Hadoop1, in local mode, the number of reducers is always 1, regardless of 
what is requested by the app. Hadoop2 now honors the number-of-reducers setting 
in local mode (by spawning threads).
The long-term solution seems to be to prevent LOAD DATA for bucketed tables.


 Bucketized Table feature fails in some cases
 

 Key: HIVE-6867
 URL: https://issues.apache.org/jira/browse/HIVE-6867
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran

 Bucketized Table feature fails in some cases: if src & destination are 
 bucketed on the same key, and the actual data in the src is not bucketed 
 (because the data was loaded using LOAD DATA LOCAL INPATH), then the data 
 won't be bucketed while writing to the destination.
 Example
 --
 CREATE TABLE P1(key STRING, val STRING)
 CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
 LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE 
 P1;
 -- perform an insert to make sure there are 2 files
 INSERT OVERWRITE TABLE P1 select key, val from P1;
 --
 This is not a regression; it has never worked. It was only discovered due to 
 Hadoop2 changes.
 In Hadoop1, in local mode, the number of reducers is always 1, regardless of 
 what is requested by the app. Hadoop2 now honors the number-of-reducers 
 setting in local mode (by spawning threads).
 The long-term solution seems to be to prevent LOAD DATA for bucketed tables.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-04-08 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-5687:


Attachment: package.html

Remove the hcatalog/streaming/src/docs/package.html and put this file into 
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/package.html.

I've removed all of the complicated formatting and non-standard characters that 
Microsoft Word added. It is important for open source projects to have 
documentation that can be edited. It is also better to include this 
documentation as part of the javadoc and have links to the API's javadoc rather 
than reproduce it.

Other than replacing the documentation, it is ok. +1

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.13.0

 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, 
 HIVE-5687.v6.patch, Hive Streaming Ingest API for v3 patch.pdf, Hive 
 Streaming Ingest API for v4 patch.pdf, package.html


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6861) more hadoop2 only golden files to fix

2014-04-08 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963515#comment-13963515
 ] 

Ashutosh Chauhan commented on HIVE-6861:


+1

 more hadoop2 only golden files to fix
 -

 Key: HIVE-6861
 URL: https://issues.apache.org/jira/browse/HIVE-6861
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6861.1.patch


 More hadoop2 golden files to fix due to HIVE-6643, HIVE-6642, HIVE-6808, 
 HIVE-6144.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6868) Create table in HCatalog sets different SerDe defaults than what is set through the CLI

2014-04-08 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6868:


Attachment: HIVE-6868.1.patch

 Create table in HCatalog sets different SerDe defaults than what is set 
 through the CLI
 ---

 Key: HIVE-6868
 URL: https://issues.apache.org/jira/browse/HIVE-6868
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Reporter: Harish Butani
 Attachments: HIVE-6868.1.patch


 HCatCreateTableDesc doesn't invoke the getEmptyTable function on 
 org.apache.hadoop.hive.ql.metadata.Table



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6863) HiveServer2 binary mode throws exception with PAM

2014-04-08 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963545#comment-13963545
 ] 

Harish Butani commented on HIVE-6863:
-

+1 for 0.13

 HiveServer2 binary mode throws exception with PAM
 -

 Key: HIVE-6863
 URL: https://issues.apache.org/jira/browse/HIVE-6863
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6863.1.patch


 Works fine in http mode



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6822) TestAvroSerdeUtils fails with -Phadoop-2

2014-04-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6822:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk.

 TestAvroSerdeUtils fails with -Phadoop-2
 

 Key: HIVE-6822
 URL: https://issues.apache.org/jira/browse/HIVE-6822
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Jason Dere
Assignee: Jason Dere
 Fix For: 0.14.0

 Attachments: HIVE-6822.1.patch


 Works fine with -Phadoop-1, but with -Phadoop-2 hits the following error:
 {noformat}
 Running org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils
 Tests run: 10, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.603 sec 
  FAILURE! - in org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils
 determineSchemaCanReadSchemaFromHDFS(org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils)
   Time elapsed: 0.688 sec   ERROR!
 java.lang.NoClassDefFoundError: 
 com/sun/jersey/spi/container/servlet/ServletContainer
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
   at 
 org.apache.hadoop.http.HttpServer2.addJerseyResourcePackage(HttpServer2.java:564)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.initWebHdfs(NameNodeHttpServer.java:84)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:121)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:601)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:500)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:658)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:643)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1259)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:914)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:805)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:663)
   at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:603)
   at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:474)
   at 
 org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils.determineSchemaCanReadSchemaFromHDFS(TestAvroSerdeUtils.java:189)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5687) Streaming support in Hive

2014-04-08 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963560#comment-13963560
 ] 

Roshan Naik commented on HIVE-5687:
---

Owen: Thanks a lot for revising package.html

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.13.0

 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, 
 HIVE-5687.v6.patch, Hive Streaming Ingest API for v3 patch.pdf, Hive 
 Streaming Ingest API for v4 patch.pdf, package.html


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6868) Create table in HCatalog sets different SerDe defaults than what is set through the CLI

2014-04-08 Thread Harish Butani (JIRA)
Harish Butani created HIVE-6868:
---

 Summary: Create table in HCatalog sets different SerDe defaults 
than what is set through the CLI
 Key: HIVE-6868
 URL: https://issues.apache.org/jira/browse/HIVE-6868
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Reporter: Harish Butani


HCatCreateTableDesc doesn't invoke the getEmptyTable function on 
org.apache.hadoop.hive.ql.metadata.Table



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6869) Golden file updates for tez tests.

2014-04-08 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-6869:
--

 Summary: Golden file updates for tez tests.
 Key: HIVE-6869
 URL: https://issues.apache.org/jira/browse/HIVE-6869
 Project: Hive
  Issue Type: Task
  Components: Tests, Tez
Affects Versions: 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6869) Golden file updates for tez tests.

2014-04-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6869:
---

Attachment: HIVE-6869.patch

 Golden file updates for tez tests.
 --

 Key: HIVE-6869
 URL: https://issues.apache.org/jira/browse/HIVE-6869
 Project: Hive
  Issue Type: Task
  Components: Tests, Tez
Affects Versions: 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6869.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6869) Golden file updates for tez tests.

2014-04-08 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6869:
---

Status: Patch Available  (was: Open)

 Golden file updates for tez tests.
 --

 Key: HIVE-6869
 URL: https://issues.apache.org/jira/browse/HIVE-6869
 Project: Hive
  Issue Type: Task
  Components: Tests, Tez
Affects Versions: 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6869.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-4790) MapredLocalTask task does not make virtual columns

2014-04-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963612#comment-13963612
 ] 

Hive QA commented on HIVE-4790:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639138/HIVE-4790.7.patch.txt

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s),  tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketizedhiveinputformat_auto
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats11
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin7
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2180/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2180/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639138

 MapredLocalTask task does not make virtual columns
 --

 Key: HIVE-4790
 URL: https://issues.apache.org/jira/browse/HIVE-4790
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: D11511.3.patch, D11511.4.patch, HIVE-4790.5.patch.txt, 
 HIVE-4790.6.patch.txt, HIVE-4790.7.patch.txt, HIVE-4790.D11511.1.patch, 
 HIVE-4790.D11511.2.patch


 From mailing list, 
 http://www.mail-archive.com/user@hive.apache.org/msg08264.html
 {noformat}
 SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON 
 b.rownumber = a.number;
 fails with this error:
  
  SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON b.rownumber = 
 a.number;
 Automatically selecting local only mode for query
 Total MapReduce jobs = 1
 setting HADOOP_USER_NAME pmarron
 13/06/25 10:52:56 WARN conf.HiveConf: DEPRECATED: Configuration property 
 hive.metastore.local no longer has any effect. Make sure to provide a valid 
 value for hive.metastore.uris if you are connecting to a remote metastore.
 Execution log at: /tmp/pmarron/.log
 2013-06-25 10:52:56 Starting to launch local task to process map join;
   maximum memory = 932118528
 java.lang.RuntimeException: cannot find field block__offset__inside__file 
 from [0:rownumber, 1:offset]
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:366)
 at 
 org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldRef(LazySimpleStructObjectInspector.java:168)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.DelegatedStructObjectInspector.getStructFieldRef(DelegatedStructObjectInspector.java:74)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
 at 
 org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:68)
   

[jira] [Updated] (HIVE-5072) [WebHCat]Enable directly invoke Sqoop job through Templeton

2014-04-08 Thread Shuaishuai Nie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuaishuai Nie updated HIVE-5072:
-

Attachment: HIVE-5072.3.patch

Rebased the patch and added an e2e test for the Templeton-Sqoop action in 
HIVE-5072.3.patch

 [WebHCat]Enable directly invoke Sqoop job through Templeton
 ---

 Key: HIVE-5072
 URL: https://issues.apache.org/jira/browse/HIVE-5072
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie
 Attachments: HIVE-5072.1.patch, HIVE-5072.2.patch, HIVE-5072.3.patch, 
 Templeton-Sqoop-Action.pdf


 Now it is hard to invoke a Sqoop job through Templeton. The only way is to 
 use the classpath jar generated by a Sqoop job and use the jar delegator in 
 Templeton. We should implement a Sqoop delegator to enable directly invoking a 
 Sqoop job through Templeton.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6131) New columns after table alter result in null values despite data

2014-04-08 Thread Pala M Muthaia (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963641#comment-13963641
 ] 

Pala M Muthaia commented on HIVE-6131:
--

I looked into the failures above and revisited HIVE-3833 with more context now:
1. LazyBinaryColumnarSerde requires partition-level metadata to read existing 
data; it needs the exact metadata that was used when serializing the data, so it 
cannot use table-level metadata, which could have changed.
2. Other serdes/formats that support schema change need the updated schema to 
read newly appended data with new columns.

So it seems we should pass the table metadata or partition metadata 
selectively, depending on what the storage/serde supports. Is there a way to 
programmatically identify the serdes/formats that do not support a newer schema? 
I don't see anything obvious. The alternatives are to
a.  Add such metadata to the serde info and populate it for all serdes. This may 
have been discussed briefly in HIVE-3833, and it looks like this would be a large 
change because it essentially modifies the interface of a plugin.
b.  Hardcode a whitelist or blacklist of serdes and pass table- or partition-level 
metadata accordingly.

[~ashutoshc], [~szehon], any thoughts on the above, particularly are there 
other alternatives?



 New columns after table alter result in null values despite data
 

 Key: HIVE-6131
 URL: https://issues.apache.org/jira/browse/HIVE-6131
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0, 0.12.0, 0.13.0
Reporter: James Vaughan
Priority: Minor
 Attachments: HIVE-6131.1.patch


 Hi folks,
 I found and verified a bug on our CDH 4.0.3 install of Hive when adding 
 columns to tables with Partitions using 'REPLACE COLUMNS'.  I dug through the 
 Jira a little bit and didn't see anything for it so hopefully this isn't just 
 noise on the radar.
 Basically, when you alter a table with partitions and then reupload data to 
 that partition, it doesn't seem to recognize the extra data that actually 
 exists in HDFS; it returns NULL values for the new column despite having 
 the data and recognizing the new column in the metadata.
 Here's some steps to reproduce using a basic table:
 1.  Run this hive command:  CREATE TABLE jvaughan_test (col1 string) 
 partitioned by (day string);
 2.  Create a simple file on the system with a couple of entries, something 
 like "hi" and "hi2" separated by newlines.
 3.  Run this hive command, pointing it at the file:  LOAD DATA LOCAL INPATH 
 'FILEDIR' OVERWRITE INTO TABLE jvaughan_test PARTITION (day = '2014-01-02');
 4.  Confirm the data with:  SELECT * FROM jvaughan_test WHERE day = 
 '2014-01-02';
 5.  Alter the column definitions:  ALTER TABLE jvaughan_test REPLACE COLUMNS 
 (col1 string, col2 string);
 6.  Edit your file and add a second column using the default separator 
 (ctrl+v, then ctrl+a in Vim) and add two more entries, such as hi3 on the 
 first row and hi4 on the second
 7.  Run step 3 again
 8.  Check the data again like in step 4
 For me, these are the results that get returned:
 hive> select * from jvaughan_test where day = '2014-01-02';
 OK
 hi      NULL    2014-01-02
 hi2     NULL    2014-01-02
 This is despite the fact that there is data in the file stored by the 
 partition in HDFS.
 Let me know if you need any other information.  The only workaround for me 
 currently is to drop partitions for any I'm replacing data in and THEN 
 reupload the new data file.
 Thanks,
 -James



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6869) Golden file updates for tez tests.

2014-04-08 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963689#comment-13963689
 ] 

Vikram Dixit K commented on HIVE-6869:
--

Thanks Ashutosh! LGTM +1

 Golden file updates for tez tests.
 --

 Key: HIVE-6869
 URL: https://issues.apache.org/jira/browse/HIVE-6869
 Project: Hive
  Issue Type: Task
  Components: Tests, Tez
Affects Versions: 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6869.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6765) ASTNodeOrigin unserializable leads to fail when join with view

2014-04-08 Thread Adrian Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrian Wang updated HIVE-6765:
--

Status: Patch Available  (was: Open)

 ASTNodeOrigin unserializable leads to fail when join with view
 --

 Key: HIVE-6765
 URL: https://issues.apache.org/jira/browse/HIVE-6765
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Adrian Wang
 Fix For: 0.13.0

 Attachments: HIVE-6765.patch.1


 When a view contains a UDF and the view is used in a JOIN operation, Hive 
 encounters a bug with a stack trace like:
 Caused by: java.lang.InstantiationException: 
 org.apache.hadoop.hive.ql.parse.ASTNodeOrigin
   at java.lang.Class.newInstance0(Class.java:359)
   at java.lang.Class.newInstance(Class.java:327)
   at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:616)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6863) HiveServer2 binary mode throws exception with PAM

2014-04-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963699#comment-13963699
 ] 

Hive QA commented on HIVE-6863:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639141/HIVE-6863.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s),  tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2181/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2181/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639141

 HiveServer2 binary mode throws exception with PAM
 -

 Key: HIVE-6863
 URL: https://issues.apache.org/jira/browse/HIVE-6863
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6863.1.patch


 Works fine in http mode



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 20145: HIVE-6648 - Permissions are not inherited correctly when tables have multiple partition columns

2014-04-08 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20145/
---

Review request for hive.


Repository: hive-git


Description
---

Hive.copyFiles behaves correctly for subdirectory permission inheritance only 
in the case of a one-level insert.

To handle static partitions (or any multi-directory case), I keep track of the 
permission of the first existing parent, and then apply it to the entire sub-tree. 
I had to do this manually, as FileSystem.mkdir(child, perm) will only apply perm 
to the child itself, and not to the other intermediate parents created.
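The approach described above can be sketched as follows. This is a minimal, self-contained model of the technique (not Hive's actual `Hive.copyFiles` code): a path-to-permission map stands in for the real FileSystem, and it assumes at least one ancestor of the target path already exists.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.NavigableMap;
import java.util.TreeMap;

// Minimal model of the fix described above (not Hive's actual code):
// find the first ancestor that already exists, remember its permission,
// then create the missing chain and stamp every newly created directory
// with that permission -- not just the leaf.
public class PermInherit {
    // 'existing' maps a directory path to its permission string.
    // Assumes some ancestor of 'path' is already present in the map.
    static void mkdirsWithInheritance(NavigableMap<String, String> existing, String path) {
        Deque<String> toCreate = new ArrayDeque<>();
        String cur = path;
        // Walk up until we hit a directory that already exists.
        while (!existing.containsKey(cur)) {
            toCreate.push(cur);
            cur = cur.substring(0, cur.lastIndexOf('/'));
        }
        String inherited = existing.get(cur); // permission of first existing parent
        // Create each missing directory with the inherited permission.
        while (!toCreate.isEmpty()) {
            existing.put(toCreate.pop(), inherited);
        }
    }

    public static void main(String[] args) {
        NavigableMap<String, String> fs = new TreeMap<>();
        fs.put("/warehouse/tbl", "rwxrwx---");
        mkdirsWithInheritance(fs, "/warehouse/tbl/p1=1/p2=2");
        // Both the intermediate and the leaf directory inherit the table dir's permission.
        System.out.println(fs.get("/warehouse/tbl/p1=1"));      // rwxrwx---
        System.out.println(fs.get("/warehouse/tbl/p1=1/p2=2")); // rwxrwx---
    }
}
```

The buggy version quoted in the JIRA instead reads the permission of `f.getParent()`, which may itself have just been created with default permissions.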


Diffs
-

  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/TestFolderPermissions.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java e6cb70f 

Diff: https://reviews.apache.org/r/20145/diff/


Testing
---

Fortunately, copyFiles uses the same code for the HDFS and local cases, so I was 
able to write a unit test to reproduce the issue.

I tried to write a qfile test, but it did not work because 'dfs -ls' output is 
masked and cannot be compared, so I ended up writing a JUnit test.


Thanks,

Szehon Ho



[jira] [Updated] (HIVE-6648) Permissions are not inherited correctly when tables have multiple partition columns

2014-04-08 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6648:


Attachment: HIVE-6648.patch

This patch fixes the issue described.  After creating static partitions (insert 
into ... partition ...), the partition directory and its intermediate created 
directories now have the parent's permission.  See the newly added test case.

One note: the fix is in Hive.copyFile().  This JIRA also describes another 
problematic method (Warehouse.mkdirs()).  It is actually not invoked in this 
particular code path, and I can probably look at it in a follow-up JIRA to fix 
the cases where it is invoked.

 Permissions are not inherited correctly when tables have multiple partition 
 columns
 ---

 Key: HIVE-6648
 URL: https://issues.apache.org/jira/browse/HIVE-6648
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Henry Robinson
Assignee: Szehon Ho
 Attachments: HIVE-6648.patch


 {{Warehouse.mkdirs()}} always looks at the immediate parent of the path that 
 it creates when determining what permissions to inherit. However, it may have 
 created that parent directory as well, in which case it will have the default 
 permissions and will not have inherited them.
 This is a problem when performing an {{INSERT}} into a table with more than 
 one partition column. E.g., in an empty table:
 {{INSERT INTO TABLE tbl PARTITION(p1=1, p2=2) ... }}
 A new subdirectory /p1=1/p2=2  will be created, and with permission 
 inheritance (per HIVE-2504) enabled, the intention is presumably for both new 
 directories to inherit the root table dir's permissions. However, 
 {{mkdirs()}} will only set the permission of the leaf directory (i.e. 
 /p2=2/), and then only to the permissions of /p1=1/, which was just created.
 {code}
 public boolean mkdirs(Path f) throws MetaException {
 FileSystem fs = null;
 try {
   fs = getFs(f);
   LOG.debug("Creating directory if it doesn't exist: " + f);
   //Check if the directory already exists. We want to change the 
 permission
   //to that of the parent directory only for newly created directories.
   if (this.inheritPerms) {
 try {
   return fs.getFileStatus(f).isDir();
 } catch (FileNotFoundException ignore) {
 }
   }
   boolean success = fs.mkdirs(f);
   if (this.inheritPerms && success) {
 // Set the permission of parent directory.
 // HNR: This is the bug - getParent() may refer to a just-created 
 directory.
 fs.setPermission(f, fs.getFileStatus(f.getParent()).getPermission());
   }
   return success;
 } catch (IOException e) {
   closeFs(fs);
   MetaStoreUtils.logAndThrowMetaException(e);
 }
 return false;
   }
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6648) Permissions are not inherited correctly when tables have multiple partition columns

2014-04-08 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6648:


Affects Version/s: 0.13.0
   Status: Patch Available  (was: Open)

 Permissions are not inherited correctly when tables have multiple partition 
 columns
 ---

 Key: HIVE-6648
 URL: https://issues.apache.org/jira/browse/HIVE-6648
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 0.13.0
Reporter: Henry Robinson
Assignee: Szehon Ho
 Attachments: HIVE-6648.patch


 {{Warehouse.mkdirs()}} always looks at the immediate parent of the path that 
 it creates when determining what permissions to inherit. However, it may have 
 created that parent directory as well, in which case it will have the default 
 permissions and will not have inherited them.
 This is a problem when performing an {{INSERT}} into a table with more than 
 one partition column. E.g., in an empty table:
 {{INSERT INTO TABLE tbl PARTITION(p1=1, p2=2) ... }}
 A new subdirectory /p1=1/p2=2  will be created, and with permission 
 inheritance (per HIVE-2504) enabled, the intention is presumably for both new 
 directories to inherit the root table dir's permissions. However, 
 {{mkdirs()}} will only set the permission of the leaf directory (i.e. 
 /p2=2/), and then only to the permissions of /p1=1/, which was just created.
 {code}
 public boolean mkdirs(Path f) throws MetaException {
 FileSystem fs = null;
 try {
   fs = getFs(f);
   LOG.debug("Creating directory if it doesn't exist: " + f);
   //Check if the directory already exists. We want to change the 
 permission
   //to that of the parent directory only for newly created directories.
   if (this.inheritPerms) {
 try {
   return fs.getFileStatus(f).isDir();
 } catch (FileNotFoundException ignore) {
 }
   }
   boolean success = fs.mkdirs(f);
   if (this.inheritPerms && success) {
 // Set the permission of parent directory.
 // HNR: This is the bug - getParent() may refer to a just-created 
 directory.
 fs.setPermission(f, fs.getFileStatus(f.getParent()).getPermission());
   }
   return success;
 } catch (IOException e) {
   closeFs(fs);
   MetaStoreUtils.logAndThrowMetaException(e);
 }
 return false;
   }
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6825) custom jars for Hive query should be uploaded to scratch dir per query; and/or versioned

2014-04-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6825:
---

Fix Version/s: (was: 0.14.0)
   0.13.0

 custom jars for Hive query should be uploaded to scratch dir per query; 
 and/or versioned
 

 Key: HIVE-6825
 URL: https://issues.apache.org/jira/browse/HIVE-6825
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.13.0

 Attachments: HIVE-6825.01.patch, HIVE-6825.patch


 Currently the jars are uploaded to either user directory or global, whatever 
 is configured, which is a mess and can cause collisions. We can upload to 
 scratch directory, and/or version. 
 There's a tradeoff between having to upload files every time (for example, 
 for commonly used things like HBase input format) (which is what is done now, 
 into global/user path), and having a mess of one-off custom jars and files, 
 versioned, sitting in .hiveJars.
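The "versioned" idea mentioned above can be illustrated with content-addressed naming: identical jars map to the same name (so re-uploads are reused) while different builds of the same jar can never collide. This is a hypothetical sketch, not the naming scheme the patch actually uses:

```java
import java.security.MessageDigest;

// Hypothetical sketch of content-addressed jar naming (not the patch's actual
// scheme): the upload path is derived from the jar bytes, so re-uploading the
// same jar maps to the same path, and two different builds of "udf.jar"
// cannot collide.
public class VersionedJar {
    static String versionedJarPath(byte[] jarBytes, String scratchDir, String jarName)
            throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(jarBytes);
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        // Use a short prefix of the hash as the version component.
        return scratchDir + "/" + hex.substring(0, 12) + "-" + jarName;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(versionedJarPath("abc".getBytes("UTF-8"), "/tmp/scratch", "udf.jar"));
        // -> /tmp/scratch/ba7816bf8f01-udf.jar
    }
}
```

This is the tradeoff the description mentions: hashing makes the per-query upload cheap to skip when the jar is unchanged, at the cost of hash-named one-off files accumulating in the scratch directory.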



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 19754: Defines a api for streaming data into Hive using ACID support.

2014-04-08 Thread Lefty Leverenz

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19754/#review39817
---



hcatalog/streaming/pom.xml
https://reviews.apache.org/r/19754/#comment72461

typo:  artifectId should be artifactId



hcatalog/streaming/pom.xml
https://reviews.apache.org/r/19754/#comment72462

typo:  artifectId should be artifactId



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
https://reviews.apache.org/r/19754/#comment72463

suggestion for Txnid:  either spell out transaction (transaction ID -- 
preferable) or use capital I like the parameter (TxnId)



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
https://reviews.apache.org/r/19754/#comment72464

Why does the parameter name have both-caps ID for maxTxnID while it's 
init-cap Id for minTxnId?  Are parameter names case-sensitive?

Also a suggestion for Txnid in description:  either spell out transaction 
(transaction ID -- preferable) or use capital ID like the parameter (TxnID).



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
https://reviews.apache.org/r/19754/#comment72465

Same question as line 108 about minTxnId vs maxTxnID capitalization



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java
https://reviews.apache.org/r/19754/#comment72520

Nit:  period at the end (next line too)



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java
https://reviews.apache.org/r/19754/#comment72466

Editorial nits:  Please capitalize nulls and end the second sentence with 
a period (next line) just for consistency with the first sentence.



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java
https://reviews.apache.org/r/19754/#comment72467

 Grammar nit:  Remove the "s" from "indicates" because the subjects are plural.



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java
https://reviews.apache.org/r/19754/#comment72468

Consistency nit:  Since other param descriptions are capitalized on the 
first word, please do the same here.

Bonus points if you capitalize all the param descriptions in this patch, 
but I'm not going to comment on all of them.  You could argue for a rule that 
only capitalizes full sentences and proper nouns like Hive, in which case [pun 
alert] it's okay to leave input uncapitalized.  But I favor visual 
consistency over rule consistency, except when I'm inconsistent.

Terminal periods aren't essential (given the typical style of javadocs) but 
they're recommended when a description has multiple sentences.  Hm, but that's 
inconsistent with my visual consistency preference.  Why am I wasting your time 
with this trivia?



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java
https://reviews.apache.org/r/19754/#comment72517

should endpoint be explained? (your call)



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java
https://reviews.apache.org/r/19754/#comment72469

 Editorial nit:  "non existing" seems okay in this context, but 
 "nonexistent" is the real word (your choice).

Consistency nit again:  Since other exception descriptions are capitalized 
on the first word, please do the same here.



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java
https://reviews.apache.org/r/19754/#comment72470

ditto line 57



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java
https://reviews.apache.org/r/19754/#comment72471

ditto line 58



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java
https://reviews.apache.org/r/19754/#comment72472

ditto line 59 (capitalization)



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java
https://reviews.apache.org/r/19754/#comment72474

ditto line 60



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java
https://reviews.apache.org/r/19754/#comment72473

 Hive nit:  please capitalize "hive"
 
 Editorial nits:  please capitalize "a" and perhaps spell out "configuration" 
 in "conf object" unless "conf" is the proper term for the object



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java
https://reviews.apache.org/r/19754/#comment72475

ditto line 65



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java
https://reviews.apache.org/r/19754/#comment72516

ditto line 59



hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java

[jira] [Updated] (HIVE-6862) add DB schema DDL and upgrade 12to13 scripts for MS SQL Server

2014-04-08 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6862:
-

Description: 
need to add a unified 0.13 script and a separate script for ACID support

NO PRECOMMIT TESTS


  was:need to add a unified 0.13 script and a separate script for ACID support


 add DB schema DDL and upgrade 12to13 scripts for MS SQL Server
 --

 Key: HIVE-6862
 URL: https://issues.apache.org/jira/browse/HIVE-6862
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-6862.patch


 need to add a unified 0.13 script and a separate script for ACID support
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6862) add DB schema DDL and upgrade 12to13 scripts for MS SQL Server

2014-04-08 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6862:
-

Attachment: HIVE-6862.patch

 add DB schema DDL and upgrade 12to13 scripts for MS SQL Server
 --

 Key: HIVE-6862
 URL: https://issues.apache.org/jira/browse/HIVE-6862
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-6862.patch


 need to add a unified 0.13 script and a separate script for ACID support



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6862) add DB schema DDL and upgrade 12to13 scripts for MS SQL Server

2014-04-08 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6862:
-

Status: Patch Available  (was: Open)

 add DB schema DDL and upgrade 12to13 scripts for MS SQL Server
 --

 Key: HIVE-6862
 URL: https://issues.apache.org/jira/browse/HIVE-6862
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-6862.patch


 need to add a unified 0.13 script and a separate script for ACID support
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6825) custom jars for Hive query should be uploaded to scratch dir per query; and/or versioned

2014-04-08 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6825:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk and 0.13. Resolved a trivial conflict on commit.

 custom jars for Hive query should be uploaded to scratch dir per query; 
 and/or versioned
 

 Key: HIVE-6825
 URL: https://issues.apache.org/jira/browse/HIVE-6825
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.13.0

 Attachments: HIVE-6825.01.patch, HIVE-6825.patch


 Currently the jars are uploaded to either user directory or global, whatever 
 is configured, which is a mess and can cause collisions. We can upload to 
 scratch directory, and/or version. 
 There's a tradeoff between having to upload files every time (for example, 
 for commonly used things like HBase input format) (which is what is done now, 
 into global/user path), and having a mess of one-off custom jars and files, 
 versioned, sitting in .hiveJars.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6131) New columns after table alter result in null values despite data

2014-04-08 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963728#comment-13963728
 ] 

Szehon Ho commented on HIVE-6131:
-

Hm, I understand the file format may differ between partition and table; that was 
the point of HIVE-3833.  But just for my understanding, did you find any use 
for the partition columns being different from the table columns (being the 
original ones)?

In my experience, I had seen that LazyBinaryColumnarSerde (and other serdes) can 
use a schema with more columns to deserialize data than what the data was 
written with.  If that's the case, can't we make the column set the same for 
partition and table during 'alter table'?
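The behavior described, reading rows against a schema wider than the one they were written with, can be sketched like this. It is a simplified model of delimited deserialization (not the serde's actual code), with the default Ctrl-A separator assumed:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Simplified model (not LazyBinaryColumnarSerde itself): deserialize a
// delimited row against a schema that has more columns than the row was
// written with; the missing trailing columns come back as NULL instead
// of causing an error.
public class WideSchemaRead {
    static List<String> deserialize(String row, int schemaWidth, char sep) {
        String[] fields = row.split(Pattern.quote(String.valueOf(sep)), -1);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < schemaWidth; i++) {
            // Pad columns the row does not have with NULL.
            out.add(i < fields.length ? fields[i] : null);
        }
        return out;
    }

    public static void main(String[] args) {
        // A row written with 1 column, read with a 2-column schema
        // (mirrors the REPLACE COLUMNS repro in this thread).
        System.out.println(deserialize("hi", 2, '\u0001')); // [hi, null]
    }
}
```

Under this model, updating the partition schema to match the widened table schema would be enough for the old data to read cleanly, which is the question posed above.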

 New columns after table alter result in null values despite data
 

 Key: HIVE-6131
 URL: https://issues.apache.org/jira/browse/HIVE-6131
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0, 0.12.0, 0.13.0
Reporter: James Vaughan
Priority: Minor
 Attachments: HIVE-6131.1.patch


 Hi folks,
 I found and verified a bug on our CDH 4.0.3 install of Hive when adding 
 columns to tables with Partitions using 'REPLACE COLUMNS'.  I dug through the 
 Jira a little bit and didn't see anything for it so hopefully this isn't just 
 noise on the radar.
 Basically, when you alter a table with partitions and then reupload data to 
 that partition, it doesn't seem to recognize the extra data that actually 
 exists in HDFS; it returns NULL values for the new column despite having 
 the data and recognizing the new column in the metadata.
 Here's some steps to reproduce using a basic table:
 1.  Run this hive command:  CREATE TABLE jvaughan_test (col1 string) 
 partitioned by (day string);
 2.  Create a simple file on the system with a couple of entries, something 
 like "hi" and "hi2" separated by newlines.
 3.  Run this hive command, pointing it at the file:  LOAD DATA LOCAL INPATH 
 'FILEDIR' OVERWRITE INTO TABLE jvaughan_test PARTITION (day = '2014-01-02');
 4.  Confirm the data with:  SELECT * FROM jvaughan_test WHERE day = 
 '2014-01-02';
 5.  Alter the column definitions:  ALTER TABLE jvaughan_test REPLACE COLUMNS 
 (col1 string, col2 string);
 6.  Edit your file and add a second column using the default separator 
 (ctrl+v, then ctrl+a in Vim) and add two more entries, such as hi3 on the 
 first row and hi4 on the second
 7.  Run step 3 again
 8.  Check the data again like in step 4
 For me, these are the results that get returned:
 hive> select * from jvaughan_test where day = '2014-01-02';
 OK
 hi      NULL    2014-01-02
 hi2     NULL    2014-01-02
 This is despite the fact that there is data in the file stored by the 
 partition in HDFS.
 Let me know if you need any other information.  The only workaround for me 
 currently is to drop partitions for any I'm replacing data in and THEN 
 reupload the new data file.
 Thanks,
 -James



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 19903: Support bulk deleting directories for partition drop with partial spec

2014-04-08 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19903/
---

(Updated April 9, 2014, 3:26 a.m.)


Review request for hive.


Changes
---

Fixed test fail


Bugs: HIVE-6809
https://issues.apache.org/jira/browse/HIVE-6809


Repository: hive-git


Description
---

In a busy Hadoop system, dropping many partitions takes much more time than 
expected. In Hive 0.11.0, removing 1700 partitions with a single partial spec took 
90 minutes, which was reduced to 3 minutes when deleteData was set to false. I 
couldn't test this on recent Hive, which has HIVE-6256, but if the time-consuming 
part is mostly removing directories, that change does not seem to help reduce the 
overall processing time.


Diffs (updated)
-

  
hcatalog/core/src/main/java/org/apache/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzer.java
 d348b9b 
  metastore/if/hive_metastore.thrift eef1b80 
  metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 2a1b4d7 
  metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 9567874 
  metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 
b18009c 
  
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
 4f051af 
  metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php c79624f 
  metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 
fdedb57 
  metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 23679be 
  metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 56c23e6 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
18e62d8 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
664dccd 
  metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
0c2209b 
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 6a0eabe 
  metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java e0de0e0 
  metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java f731dab 
  
metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
 5c00aa1 
  
metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
 5025b83 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 5cb030c 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java e6cb70f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java a40a88d 
  ql/src/java/org/apache/hadoop/hive/ql/plan/DropTableDesc.java ba30e1f 
  ql/src/test/queries/clientpositive/drop_partitions_partialspec.q PRE-CREATION 
  ql/src/test/results/clientnegative/drop_partition_failure.q.out cde0abb 
  ql/src/test/results/clientnegative/drop_partition_filter_failure.q.out 
c4f533b 
  ql/src/test/results/clientpositive/drop_multi_partitions.q.out eae57f3 
  ql/src/test/results/clientpositive/drop_partitions_partialspec.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/19903/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Commented] (HIVE-3972) Support using multiple reducer for fetching order by results

2014-04-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963757#comment-13963757
 ] 

Hive QA commented on HIVE-3972:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639142/HIVE-3972.8.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5556 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orderby_query_bucketing
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hive.service.cli.thrift.TestThriftBinaryCLIService.testExecuteStatementAsync
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2182/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2182/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639142

 Support using multiple reducer for fetching order by results
 

 Key: HIVE-3972
 URL: https://issues.apache.org/jira/browse/HIVE-3972
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: D8349.5.patch, D8349.6.patch, D8349.7.patch, 
 HIVE-3972.8.patch.txt, HIVE-3972.D8349.1.patch, HIVE-3972.D8349.2.patch, 
 HIVE-3972.D8349.3.patch, HIVE-3972.D8349.4.patch


 Queries that end with an ORDER BY clause force the final MR stage to run 
 with a single reducer, which can be too much of a bottleneck. For example, 
 {code}
 select value, sum(key) as sum from src group by value order by sum;
 {code}
 If the number of reducers is reasonable, the multiple result files could be 
 merged into a single sorted stream at the fetcher level.
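The mechanism proposed here, combining several already-sorted reducer output files into one globally sorted stream, is a classic k-way merge. A minimal sketch in Python (illustrative only, not Hive code):

```python
import heapq

# Each "reducer output" stands in for one already-sorted result file
# produced by an ORDER BY job that ran with multiple reducers.
reducer_outputs = [
    [(1, 'a'), (4, 'd'), (7, 'g')],
    [(2, 'b'), (5, 'e'), (8, 'h')],
    [(3, 'c'), (6, 'f'), (9, 'i')],
]

# heapq.merge lazily merges any number of sorted iterables into a
# single sorted stream, the fetcher-level merge the issue describes.
merged = list(heapq.merge(*reducer_outputs))
print([k for k, _ in merged])  # sort keys come out globally ordered
```

Because the merge is lazy, the fetcher never needs to buffer more than one row per input file at a time.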



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6853) show create table for hbase tables should exclude LOCATION

2014-04-08 Thread Miklos Christine (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Christine updated HIVE-6853:
---

Attachment: HIVE-6853.patch

bq: is it needed to make a StringBuilder when there is only one string to 
return?
Fixed. I removed it and just returned the string. 


 show create table for hbase tables should exclude LOCATION 
 ---

 Key: HIVE-6853
 URL: https://issues.apache.org/jira/browse/HIVE-6853
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Affects Versions: 0.10.0
Reporter: Miklos Christine
 Attachments: HIVE-6853-0.patch, HIVE-6853.patch


 If you create a table on top of HBase in Hive and issue show create table 
 hbase_table, it emits invalid DDL. It should not include LOCATION:  
   
 
 [hive]$ cat /tmp/test_create.sql
 CREATE EXTERNAL TABLE nba_twitter.hbase2(
 key string COMMENT 'from deserializer',
 name string COMMENT 'from deserializer',
 pdt string COMMENT 'from deserializer',
 service string COMMENT 'from deserializer',
 term string COMMENT 'from deserializer',
 update1 string COMMENT 'from deserializer')
 ROW FORMAT SERDE
 'org.apache.hadoop.hive.hbase.HBaseSerDe'
 STORED BY
 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
 WITH SERDEPROPERTIES (
 'serialization.format'='1',
 'hbase.columns.mapping'=':key,srv:name,srv:pdt,srv:service,srv:term,srv:update')
 LOCATION
 'hdfs://nameservice1/user/hive/warehouse/nba_twitter.db/hbase'
 TBLPROPERTIES (
 'hbase.table.name'='NBATwitter',
 'transient_lastDdlTime'='1386172188')
 Trying to create a table using the above fails:
 [hive]$ hive -f /tmp/test_create.sql
 cli -f /tmp/test_create.sql
 Logging initialized using configuration in 
 jar:file:/opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hive/lib/hive-common-0.10.0-cdh4.4.0.jar!/hive-log4j.properties
 FAILED: Error in metadata: MetaException(message:LOCATION may not be 
 specified for HBase.)
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 However, if I remove the LOCATION, then the DDL is valid.
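Until SHOW CREATE TABLE itself omits the clause, stripping it before re-running the DDL is one workaround. A hypothetical post-processing sketch in Python (the function name is invented; this is not the HIVE-6853 patch):

```python
def strip_location(ddl: str) -> str:
    """Remove the LOCATION clause (the keyword line plus the quoted
    path on the following line) from a SHOW CREATE TABLE dump."""
    out = []
    skip_next = False
    for line in ddl.splitlines():
        if skip_next:
            skip_next = False
            continue
        if line.strip() == 'LOCATION':
            skip_next = True  # drop the quoted HDFS path that follows
            continue
        out.append(line)
    return '\n'.join(out)

ddl = """STORED BY
'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
LOCATION
'hdfs://nameservice1/user/hive/warehouse/nba_twitter.db/hbase'
TBLPROPERTIES (
'hbase.table.name'='NBATwitter')"""
print(strip_location(ddl))
```

The cleaner fix, as the issue title says, is to suppress the clause at generation time for storage-handler tables.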





[jira] [Updated] (HIVE-3972) Support using multiple reducer for fetching order by results

2014-04-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3972:


Attachment: HIVE-3972.9.patch.txt

 Support using multiple reducer for fetching order by results
 

 Key: HIVE-3972
 URL: https://issues.apache.org/jira/browse/HIVE-3972
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: D8349.5.patch, D8349.6.patch, D8349.7.patch, 
 HIVE-3972.8.patch.txt, HIVE-3972.9.patch.txt, HIVE-3972.D8349.1.patch, 
 HIVE-3972.D8349.2.patch, HIVE-3972.D8349.3.patch, HIVE-3972.D8349.4.patch







[jira] [Updated] (HIVE-4790) MapredLocalTask task does not make virtual columns

2014-04-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4790:


Status: Open  (was: Patch Available)

The rebase on trunk went badly; we should skip this in hive-0.13.0.

 MapredLocalTask task does not make virtual columns
 --

 Key: HIVE-4790
 URL: https://issues.apache.org/jira/browse/HIVE-4790
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: D11511.3.patch, D11511.4.patch, HIVE-4790.5.patch.txt, 
 HIVE-4790.6.patch.txt, HIVE-4790.7.patch.txt, HIVE-4790.D11511.1.patch, 
 HIVE-4790.D11511.2.patch


 From mailing list, 
 http://www.mail-archive.com/user@hive.apache.org/msg08264.html
 {noformat}
 SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON 
 b.rownumber = a.number;
 fails with this error:
  
  SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON b.rownumber = 
 a.number;
 Automatically selecting local only mode for query
 Total MapReduce jobs = 1
 setting HADOOP_USER_NAME pmarron
 13/06/25 10:52:56 WARN conf.HiveConf: DEPRECATED: Configuration property 
 hive.metastore.local no longer has any effect. Make sure to provide a valid 
 value for hive.metastore.uris if you are connecting to a remote metastore.
 Execution log at: /tmp/pmarron/.log
 2013-06-25 10:52:56 Starting to launch local task to process map join;
   maximum memory = 932118528
 java.lang.RuntimeException: cannot find field block__offset__inside__file 
 from [0:rownumber, 1:offset]
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:366)
 at 
 org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldRef(LazySimpleStructObjectInspector.java:168)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.DelegatedStructObjectInspector.getStructFieldRef(DelegatedStructObjectInspector.java:74)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
 at 
 org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:68)
 at 
 org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.initializeOp(HashTableSinkOperator.java:222)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:186)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
 at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:394)
 at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
 at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 Execution failed with exit status: 2
 {noformat}
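The stack trace boils down to an evaluator asking the row schema for a field the local task never materialized. A toy Python sketch of that lookup (names illustrative; this is not Hive's object-inspector code):

```python
def get_field_ref(schema, name):
    # Mirrors the getStandardStructFieldRef behavior: raise when the
    # requested field is absent from the row schema.
    if name not in schema:
        fields = ', '.join(f'{i}:{c}' for i, c in enumerate(schema))
        raise RuntimeError(f'cannot find field {name} from [{fields}]')
    return schema.index(name)

# The local task's rows carry only the table's physical columns, so
# the lookup for the virtual column fails exactly as in the trace.
physical_schema = ['rownumber', 'offset']
virtual_columns = ['block__offset__inside__file', 'input__file__name']

# Direction of the fix: extend the schema with virtual columns before
# evaluators initialize.
full_schema = physical_schema + virtual_columns
print(get_field_ref(full_schema, 'block__offset__inside__file'))  # → 2
```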





[jira] [Commented] (HIVE-3972) Support using multiple reducer for fetching order by results

2014-04-08 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963766#comment-13963766
 ] 

Brock Noland commented on HIVE-3972:


Looks like the .out file contains a ^A or something:

diff --git ql/src/test/results/clientpositive/orderby_query_bucketing.q.out 
ql/src/test/results/clientpositive/orderby_query_bucketing.q.out
new file mode 100644
index 000..c02b1c9
Binary files /dev/null and 
ql/src/test/results/clientpositive/orderby_query_bucketing.q.out differ
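A stray control byte like that is easy to locate. A small Python sketch for scanning a file's bytes (illustrative; git's exact binary-detection heuristic may differ):

```python
def find_control_bytes(data: bytes):
    # Report (offset, byte) for control bytes other than tab, LF, CR;
    # a stray ^A (0x01) in a .q.out file shows up here.
    return [(i, hex(b)) for i, b in enumerate(data)
            if b < 0x20 and b not in (0x09, 0x0A, 0x0D)]

print(find_control_bytes(b'clean line\n'))  # → []
print(find_control_bytes(b'bad\x01byte'))   # → [(3, '0x1')]
```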

 Support using multiple reducer for fetching order by results
 

 Key: HIVE-3972
 URL: https://issues.apache.org/jira/browse/HIVE-3972
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: D8349.5.patch, D8349.6.patch, D8349.7.patch, 
 HIVE-3972.8.patch.txt, HIVE-3972.9.patch.txt, HIVE-3972.D8349.1.patch, 
 HIVE-3972.D8349.2.patch, HIVE-3972.D8349.3.patch, HIVE-3972.D8349.4.patch







[jira] [Created] (HIVE-6870) Fix maven.repo.local setting in Hive build

2014-04-08 Thread Jason Dere (JIRA)
Jason Dere created HIVE-6870:


 Summary: Fix maven.repo.local setting in Hive build
 Key: HIVE-6870
 URL: https://issues.apache.org/jira/browse/HIVE-6870
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Jason Dere
Assignee: Jason Dere


The pom.xml currently assumes maven.repo.local should be 
${user.home}/.m2/repository.  If the user has overridden the local repository 
through Maven settings, tests which assume the hive-exec JAR is at 
${user.home}/.m2/repository will fail because the artifacts will not be 
installed at that location.





[jira] [Updated] (HIVE-6870) Fix maven.repo.local setting in Hive build

2014-04-08 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6870:
-

Status: Patch Available  (was: Open)

 Fix maven.repo.local setting in Hive build
 --

 Key: HIVE-6870
 URL: https://issues.apache.org/jira/browse/HIVE-6870
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6870.1.patch







[jira] [Updated] (HIVE-6870) Fix maven.repo.local setting in Hive build

2014-04-08 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6870:
-

Attachment: HIVE-6870.1.patch

Use ${settings.localRepository} for the maven.repo.local property.
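As a sketch, the change amounts to deriving the property from the user's effective Maven settings instead of a hard-coded path (pom.xml fragment for illustration only; the surrounding pom structure is assumed, not copied from the patch):

```xml
<properties>
  <!-- Follow the local repository configured in the user's Maven
       settings instead of assuming ${user.home}/.m2/repository. -->
  <maven.repo.local>${settings.localRepository}</maven.repo.local>
</properties>
```

${settings.localRepository} resolves to whatever localRepository the effective settings.xml specifies, so tests keep working when the default is overridden.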

 Fix maven.repo.local setting in Hive build
 --

 Key: HIVE-6870
 URL: https://issues.apache.org/jira/browse/HIVE-6870
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6870.1.patch







[jira] [Commented] (HIVE-6853) show create table for hbase tables should exclude LOCATION

2014-04-08 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963806#comment-13963806
 ] 

Szehon Ho commented on HIVE-6853:
-

Thanks, for the most part it LGTM. I guess it's not the cleanest, as it's 
breaking the StorageHandler abstraction. It would probably be cleaner to add 
a hook to the StorageHandler interface, but due to backward compatibility, 
it's probably not worth it for this use-case.

+1 (non-binding)

 show create table for hbase tables should exclude LOCATION 
 ---

 Key: HIVE-6853
 URL: https://issues.apache.org/jira/browse/HIVE-6853
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Affects Versions: 0.10.0
Reporter: Miklos Christine
 Attachments: HIVE-6853-0.patch, HIVE-6853.patch







[jira] [Commented] (HIVE-5687) Streaming support in Hive

2014-04-08 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963820#comment-13963820
 ] 

Roshan Naik commented on HIVE-5687:
---

I had posted the revised patch on RB

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.13.0

 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, 
 HIVE-5687.v6.patch, Hive Streaming Ingest API for v3 patch.pdf, Hive 
 Streaming Ingest API for v4 patch.pdf, package.html


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm
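The requirements above, buffered writes that become visible atomically on commit, can be sketched with a toy transaction-batch class (Python; all names are invented for illustration and this is not the actual streaming API):

```python
class TxnBatch:
    """Toy stand-in for a streaming transaction batch: buffered records
    become visible atomically on commit, avoiding one file per record."""
    def __init__(self):
        self.pending = []    # written but not yet visible
        self.committed = []  # visible to queries

    def write(self, record):
        self.pending.append(record)

    def commit(self):
        # The whole batch becomes visible in one atomic step.
        self.committed.extend(self.pending)
        self.pending = []

batch = TxnBatch()
batch.write('log line 1')
batch.write('log line 2')
print(batch.committed)  # nothing visible before commit
batch.commit()
print(batch.committed)  # both records visible together
```

A real client would acquire batches against a table/partition and commit periodically, which is how a Flume or Storm feed avoids flooding HDFS with tiny files.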





Re: Timeline for the Hive 0.13 release?

2014-04-08 Thread Harish Butani
Hi,

We are getting close to having all the issues resolved.
I have the following list of open jiras as needing to go into 0.13:
6863, 5687, 6604, 6850, 6818, 6732, 4904, 5376, and 6319.

These are all close to being committed. Some are waiting out the 24-hour 
period; a couple have been reviewed and +1ed and are waiting for tests to 
pass.

So let's shoot for closing out all these issues by Thursday 12pm PST. I would 
like to cut an RC by Thursday afternoon.

regards,
Harish.


On Mar 26, 2014, at 7:14 PM, Prasanth Jayachandran 
pjayachand...@hortonworks.com wrote:

 Hi Harish
 
 Can we have the following bugs for 0.13? These bugs are related to feature 
 HIVE-6455 added as part of 0.13.
 https://issues.apache.org/jira/browse/HIVE-6748 (Resource leak bug)
 https://issues.apache.org/jira/browse/HIVE-6760 (Bug in handling list 
 bucketing)
 https://issues.apache.org/jira/browse/HIVE-6761 (Bug with hashcodes 
 generation)
 
 Thanks
 Prasanth Jayachandran
 
 On Mar 26, 2014, at 1:22 PM, Hari Subramaniyan 
 hsubramani...@hortonworks.com wrote:
 
 Hi Harish
 Can you include HIVE-6708. It covers quite a number of issues associated
 with Vectorization(including some correctness issues and exceptions).
 
 Thanks
 Hari
 
 
 On Tue, Mar 25, 2014 at 12:01 PM, Xuefu Zhang xzh...@cloudera.com wrote:
 
 Harish,
 
 Could we include HIVE-6740?
 
 Thanks,
 Xuefu
 
 
 On Thu, Mar 20, 2014 at 7:27 PM, Prasanth Jayachandran 
 pjayachand...@hortonworks.com wrote:
 
 Harish,
 
 Could you add the following bugs as well?
 Following are related to LazyMap bug
 https://issues.apache.org/jira/browse/HIVE-6707
 https://issues.apache.org/jira/browse/HIVE-6714
 https://issues.apache.org/jira/browse/HIVE-6711
 
 Following is NPE bug with orc struct
 https://issues.apache.org/jira/browse/HIVE-6716
 
 Thanks
 Prasanth Jayachandran
 
 On Mar 14, 2014, at 6:26 PM, Eugene Koifman ekoif...@hortonworks.com
 wrote:
 
 could you add https://issues.apache.org/jira/browse/HIVE-6676 please. It's 
 a blocker as well.
 
 Thanks,
 Eugene
 
 
 On Fri, Mar 14, 2014 at 5:30 PM, Vaibhav Gumashta 
 vgumas...@hortonworks.com
 wrote:
 
 Harish,
 
 Can we have this in as well:
 https://issues.apache.org/jira/browse/HIVE-6660.
 Blocker bug in my opinion.
 
 Thanks,
 --Vaibhav
 
 
 On Fri, Mar 14, 2014 at 2:21 PM, Thejas Nair the...@hortonworks.com
 wrote:
 
 Harish,
 Can you also include  HIVE-6673
 https://issues.apache.org/jira/browse/HIVE-6673
 -  show grant statement for all principals throws NPE
 This variant of 'show grant' is very useful, and the fix for NPE is
 straightforward. It is patch available now.
 
 
 
 On Fri, Mar 14, 2014 at 10:25 AM, Yin Huai huaiyin@gmail.com
 wrote:
 
 Guys,
 
 Seems ConditionalResolverCommonJoin is not working correctly? I
 created
 https://issues.apache.org/jira/browse/HIVE-6668 and set it as a
 blocker.
 
 thanks,
 
 Yin
 
 
 On Fri, Mar 14, 2014 at 11:34 AM, Thejas Nair 
 the...@hortonworks.com
 wrote:
 
 Can you also add HIVE-6647 
 https://issues.apache.org/jira/browse/HIVE-6647 to
 the list? It is marked as a blocker for 0.13.
 It has a necessary version number upgrade for HS2. It is ready to
 be
 committed.
 
 
 On Fri, Mar 14, 2014 at 12:38 AM, Prasanth Jayachandran 
 pjayachand...@hortonworks.com wrote:
 
 Harish
 
 Can you please make the following changes to my earlier request?
 
  HIVE-4177 is not required; instead, the same work is tracked under 
  HIVE-6578.
  
  Can you also consider HIVE-6656?
  HIVE-6656 is a bug fix for the ORC reader when reading timestamp 
  nanoseconds. This bug exists in earlier versions as well, so it will be 
  good to have this fixed in 0.13.0.
 
 Thanks
 Prasanth Jayachandran
 
 On Mar 13, 2014, at 8:52 AM, Thejas Nair the...@hortonworks.com
 wrote:
 
 Harish,
  I think we should include the following:
  HIVE-6547 - This is a cleanup of metastore API changes introduced in 
  0.13. This can't be done post-release. I will get a patch out in a few 
  hours.
  HIVE-6567 - fixes an NPE in 'show grant .. on all'
  HIVE-6629 - change in syntax for 'set role none'. Marked as a blocker 
  bug.
 
 
 On Tue, Mar 11, 2014 at 8:39 AM, Harish Butani 
 hbut...@hortonworks.com
 wrote:
 
 yes sure.
 
 
 On Mar 10, 2014, at 3:55 PM, Gopal V gop...@apache.org wrote:
 
  Can I add HIVE-6518 as well to the merge queue on 
  https://cwiki.apache.org/confluence/display/Hive/Hive+0.13+release+status
 
 It is a relatively simple OOM safety patch to vectorized
 group-by.
 
  Tests pass locally for vec group-by, but the pre-commit tests haven't 
  fired even though it's been Patch Available for a while now.
 
 Cheers,
 Gopal
 
 