[jira] [Commented] (HIVE-3709) Stop storing default ConfVars in temp file

2012-11-28 Thread Chris McConnell (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505455#comment-13505455
 ] 

Chris McConnell commented on HIVE-3709:
---

I was also looking into this as part of HIVE-3596. I was able to fix it by 
using a location similar to the one Carl suggested above; however, I think 
that pushes the problem to another location rather than addressing the actual 
issue. I like where Kevin is going with this fix. I had thought about checking 
the confVarURL in the copy constructor and removing and re-creating it if the 
file did not exist, but even that would not be perfect, depending on timing. 
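
A minimal sketch of the check-and-re-create idea mentioned above, for illustration only. The class name DefaultsFile, the ensureExists method, and the temp-file prefix are hypothetical and are not taken from HiveConf or from any attached patch. The comment on ensureExists marks the timing gap: the file can still be deleted between the existence check and the moment a caller reads it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not HiveConf: keeps the default settings in memory and
// re-creates the backing temp file when it has gone missing.
final class DefaultsFile {
    private final byte[] defaultsXml;
    private Path file;

    DefaultsFile(byte[] defaultsXml) throws IOException {
        this.defaultsXml = defaultsXml.clone();
        this.file = write();
    }

    // Writes the in-memory defaults to a fresh temp file.
    private Path write() throws IOException {
        Path p = Files.createTempFile("hive-default-", ".xml");
        Files.write(p, defaultsXml);
        return p;
    }

    // Re-creates the file if it was deleted. This is NOT race-free: a cleanup
    // job can still remove the file after this check and before it is read.
    synchronized Path ensureExists() throws IOException {
        if (!Files.exists(file)) {
            file = write();
        }
        return file;
    }
}
```

Calling ensureExists() before each use narrows the window in which a missing file causes a failure, but it does not close it, which is the timing concern raised in the comment.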

 Stop storing default ConfVars in temp file
 --

 Key: HIVE-3709
 URL: https://issues.apache.org/jira/browse/HIVE-3709
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-3709.1.patch.txt, HIVE-3709.2.patch.txt, 
 HIVE-3709.3.patch.txt


 To work around issues with Hadoop's Configuration object, specifically its 
 addResource(InputStream), default configurations are written to a temp file 
 (I think HIVE-2362 introduced this).
 This, however, introduces the problem that once that file is deleted from 
 /tmp the client crashes. This is particularly problematic for long-running 
 services like the metastore server.
 Writing a custom InputStream to deal with the problems in the Configuration 
 object should provide a workaround that does not introduce a time bomb 
 into Hive.
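
 One way to picture the in-memory approach is sketched below. It is not the attached patch. It assumes the difficulty is that a plain InputStream can be consumed only once, so anything that re-reads the resource later finds it empty. The class name RewindingInputStream and the readAll helper are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Hive's implementation: an in-memory stream that
// rewinds itself after reporting end-of-stream, so a later reader sees the
// same bytes again without any file on disk.
final class RewindingInputStream extends InputStream {
    private final byte[] data;
    private ByteArrayInputStream current;

    RewindingInputStream(byte[] data) {
        this.data = data.clone();
        this.current = new ByteArrayInputStream(this.data);
    }

    @Override
    public int read() {
        int b = current.read();
        if (b == -1) {
            rewind(); // EOF is reported once, then the next read starts over
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) {
        int n = current.read(buf, off, len);
        if (n == -1) {
            rewind();
        }
        return n;
    }

    private void rewind() {
        current = new ByteArrayInputStream(data);
    }

    // Demo helper: drains the stream until it reports end-of-stream.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

 Because the bytes live in the JVM heap, nothing under /tmp can be cleaned out from under a long-running service. Whether a stream like this behaves well with every code path in Configuration is not verified here.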

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3697) External JAR files on HDFS can lead to race condition with hive.downloaded.resources.dir

2012-11-12 Thread Chris McConnell (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495261#comment-13495261
 ] 

Chris McConnell commented on HIVE-3697:
---

Hey Joe,

Thanks for the workaround; it seems to work well for me too. I'll look into 
the possibility of changing the parent directory within HiveConf to do 
something similar. 

Cheers,
Chris

 External JAR files on HDFS can lead to race condition with 
 hive.downloaded.resources.dir
 

 Key: HIVE-3697
 URL: https://issues.apache.org/jira/browse/HIVE-3697
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Chris McConnell


[jira] [Created] (HIVE-3697) External JAR files on HDFS can lead to race condition with hive.downloaded.resources.dir

2012-11-09 Thread Chris McConnell (JIRA)
Chris McConnell created HIVE-3697:
-

 Summary: External JAR files on HDFS can lead to race condition 
with hive.downloaded.resources.dir
 Key: HIVE-3697
 URL: https://issues.apache.org/jira/browse/HIVE-3697
 Project: Hive
  Issue Type: Bug
Reporter: Chris McConnell


I've seen situations where using JAR files on HDFS can cause job failures via 
ClassNotFoundException (CNFE) or JVM crashes. 

This is difficult to replicate and seems to be related to JAR size and latency 
between the client and the HDFS cluster, but I've got some example stack traces 
below. It seems that the calls made to FileSystem (copyToLocal), which are 
static and delete the current local copy when executed, can cause the file(s) 
to be removed during job processing.

We should consider changing the default for hive.downloaded.resources.dir to 
include some level of uniqueness per job. We should not use hive.session.id 
for this, however, because multiple statements executed in the same user 
session, which might access the same JAR files, share the same session ID.

A proposal might be to use System.nanoTime() as part of the default 
(/tmp/${user.name}/resources/System.nanoTime()/). That might be enough to 
avoid the issue, although it's not perfect (the level of precision depends on 
the JVM and system). 
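
A rough sketch of what a per-job directory could look like, assuming a base path is passed in. The class and method names (ResourceDirs, nanoTimeDir, uniqueDir) and the "job-" prefix are made up for illustration and are not part of Hive. The second method is an alternative to nanoTime(), not something proposed above: it lets the JDK pick and create a fresh name, which sidesteps the precision concern.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Hive.
final class ResourceDirs {
    // The proposal above: append System.nanoTime() to the base directory.
    // Two processes could still get the same value on a coarse clock.
    static Path nanoTimeDir(Path base) {
        return base.resolve(Long.toString(System.nanoTime()));
    }

    // Alternative (an assumption, not from the issue): let the JDK generate
    // a new directory name under the base and create it in the same call.
    static Path uniqueDir(Path base) throws IOException {
        Files.createDirectories(base);
        return Files.createTempDirectory(base, "job-");
    }
}
```

With either variant each job would copy its JARs into its own directory, so one job deleting its local copy should not affect a JAR another job has open.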

If anyone else has hit this, I would like to capture environment information 
as well. Perhaps there is something else at play here. 

Here are some examples of the errors:

for i in {0..2}; do hive -S -f query.q & done
[2] 48405
[3] 48406
[4] 48407
% #
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0x7) at pc=0x7fb10bd931f0, pid=48407, tid=140398456698624
#
# JRE version: 6.0_31-b04
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.6-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libzip.so+0xb1f0]  __int128+0x60
#
# An error report file with more information is saved as:
# /home/.../hs_err_pid48407.log
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
java.lang.NoClassDefFoundError: com/example/udf/Lower
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.hive.ql.exec.FunctionTask.getUdfClass(FunctionTask.java:105)
    at org.apache.hadoop.hive.ql.exec.FunctionTask.createFunction(FunctionTask.java:75)
    at org.apache.hadoop.hive.ql.exec.FunctionTask.execute(FunctionTask.java:63)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1331)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1117)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:950)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:341)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:439)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:449)
    at org.apache.hadoop.hive.cli.CliDriver.processInitFiles(CliDriver.java:485)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:692)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.ClassNotFoundException: com.example.udf.Lower
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 24 more
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.FunctionTask

Another:
for i in {0..2}; do hive -S -f query.q & done
[1] 16294 
[2] 16295 
[3] 16296 
[]$ Couldn't create directory /tmp/ctm/resources/
Couldn't create directory /tmp/ctm/resources/


[jira] [Updated] (HIVE-3697) External JAR files on HDFS can lead to race condition with hive.downloaded.resources.dir

2012-11-09 Thread Chris McConnell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris McConnell updated HIVE-3697:
--

Affects Version/s: 0.9.0

 External JAR files on HDFS can lead to race condition with 
 hive.downloaded.resources.dir
 

 Key: HIVE-3697
 URL: https://issues.apache.org/jira/browse/HIVE-3697
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Chris McConnell


[jira] [Created] (HIVE-3656) Fix ERROR logging related to hive.metastore.uris and javax.jdo.option.ConnectionURL being set

2012-11-02 Thread Chris McConnell (JIRA)
Chris McConnell created HIVE-3656:
-

 Summary: Fix ERROR logging related to hive.metastore.uris and 
javax.jdo.option.ConnectionURL being set
 Key: HIVE-3656
 URL: https://issues.apache.org/jira/browse/HIVE-3656
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.9.0
Reporter: Chris McConnell
Priority: Minor


To ensure the defaults are set, HiveConf sets javax.jdo.option.ConnectionURL 
to the default embedded Derby DB. 

When hive.metastore.uris is used (for example, thrift://host:port) to connect 
to an external metastore, both keys are therefore non-null, so the code block 
below logs an ERROR even though only hive.metastore.uris was configured:

{code}
if (null != this.get(ConfVars.METASTOREURIS.varname, null) &&
    null != this.get(ConfVars.METASTORECONNECTURLKEY.varname, null)) {
  l4j.error("Found both " + ConfVars.METASTOREURIS.varname + " and " +
      ConfVars.METASTORECONNECTURLKEY + " Recommended to have exactly one of those config key" +
      "in configuration");
}
{code}

A quick way to test is with HiveServer2:

hive --config /path/to/hive-site.xml --service hiveserver2 

with a hive-site.xml that sets hive.metastore.uris.
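
One possible direction for a fix, sketched under assumptions and not taken from any patch: treat the two keys as conflicting only when the connection URL differs from the built-in default. The class MetastoreConfCheck, the hasConflict method, and the use of a plain Map in place of HiveConf are hypothetical. The Derby URL constant is assumed to match the shipped default and should be checked against the Hive version in use.

```java
import java.util.Map;

// Hypothetical sketch, not HiveConf: decides whether both a remote metastore
// URI and an explicitly overridden JDO connection URL are configured.
final class MetastoreConfCheck {
    static final String URIS_KEY = "hive.metastore.uris";
    static final String CONNECT_URL_KEY = "javax.jdo.option.ConnectionURL";
    // Assumed default value; verify against the Hive release in use.
    static final String DEFAULT_CONNECT_URL =
        "jdbc:derby:;databaseName=metastore_db;create=true";

    static boolean hasConflict(Map<String, String> conf) {
        String uris = conf.get(URIS_KEY);
        String url = conf.get(CONNECT_URL_KEY);
        boolean remoteConfigured = uris != null && !uris.trim().isEmpty();
        // A URL equal to the default is treated as "not set by the user".
        boolean urlOverridden = url != null && !url.equals(DEFAULT_CONNECT_URL);
        return remoteConfigured && urlOverridden;
    }
}
```

With a check like this, a hive-site.xml that only sets hive.metastore.uris would not trigger the message, while a configuration that sets both keys explicitly still would.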



[jira] [Commented] (HIVE-3596) Regression - HiveConf static variable causes issues in long running JVM instances with /tmp/ data

2012-10-26 Thread Chris McConnell (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485028#comment-13485028
 ] 

Chris McConnell commented on HIVE-3596:
---

Good catch, Carl -- I believe that when I tested, my rm of the data files 
required confirmation and perhaps I did not say yes. 

I am digging deeper into the issue and will update when I have a clear option. 

 Regression - HiveConf static variable causes issues in long running JVM 
 instances with /tmp/ data
 -

 Key: HIVE-3596
 URL: https://issues.apache.org/jira/browse/HIVE-3596
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.8.0, 0.8.1, 0.9.0
Reporter: Chris McConnell
Assignee: Chris McConnell
 Fix For: 0.8.1, 0.9.0, 0.10.0

 Attachments: HIVE-3596.patch





[jira] [Commented] (HIVE-3596) Regression - HiveConf static variable causes issues in long running JVM instances with /tmp/ data

2012-10-19 Thread Chris McConnell (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480209#comment-13480209
 ] 

Chris McConnell commented on HIVE-3596:
---

Review: https://reviews.facebook.net/D6093 

 Regression - HiveConf static variable causes issues in long running JVM 
 instances with /tmp/ data
 -

 Key: HIVE-3596
 URL: https://issues.apache.org/jira/browse/HIVE-3596
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.8.0, 0.8.1, 0.9.0
Reporter: Chris McConnell
 Fix For: 0.8.1, 0.9.0, 0.10.0

 Attachments: HIVE-3596.patch





[jira] [Created] (HIVE-3596) Regression - HiveConf static variable causes issues in long running JVM instances with /tmp/ data

2012-10-18 Thread Chris McConnell (JIRA)
Chris McConnell created HIVE-3596:
-

 Summary: Regression - HiveConf static variable causes issues in 
long running JVM instances with /tmp/ data
 Key: HIVE-3596
 URL: https://issues.apache.org/jira/browse/HIVE-3596
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.9.0, 0.8.1, 0.8.0
Reporter: Chris McConnell


With Hive 0.8.x, HiveConf was changed to use the private static member 
confVarURL, which points to /tmp/hive-user-tmp_number.xml for job 
configuration settings. 

In long-running JVMs such as a Beeswax server, which create multiple HiveConf 
objects over time, this variable is not properly updated between jobs, which 
can cause job failures if the OS cleans /tmp/ during a cron job. 



[jira] [Updated] (HIVE-3596) Regression - HiveConf static variable causes issues in long running JVM instances with /tmp/ data

2012-10-18 Thread Chris McConnell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris McConnell updated HIVE-3596:
--

Attachment: HIVE-3596.patch

Patch to remove static attached.

 Regression - HiveConf static variable causes issues in long running JVM 
 instances with /tmp/ data
 -

 Key: HIVE-3596
 URL: https://issues.apache.org/jira/browse/HIVE-3596
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.8.0, 0.8.1, 0.9.0
Reporter: Chris McConnell
 Attachments: HIVE-3596.patch





[jira] [Updated] (HIVE-3596) Regression - HiveConf static variable causes issues in long running JVM instances with /tmp/ data

2012-10-18 Thread Chris McConnell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris McConnell updated HIVE-3596:
--

Attachment: HIVE-3596.patch

New patch, added comments for change as well.

 Regression - HiveConf static variable causes issues in long running JVM 
 instances with /tmp/ data
 -

 Key: HIVE-3596
 URL: https://issues.apache.org/jira/browse/HIVE-3596
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.8.0, 0.8.1, 0.9.0
Reporter: Chris McConnell
 Attachments: HIVE-3596.patch





[jira] [Created] (HIVE-3540) Non-local Hive query with custom InputFormat via CombineFileInputFormat fails with zipped data

2012-10-05 Thread Chris McConnell (JIRA)
Chris McConnell created HIVE-3540:
-

 Summary: Non-local Hive query with custom InputFormat via 
CombineFileInputFormat fails with zipped data
 Key: HIVE-3540
 URL: https://issues.apache.org/jira/browse/HIVE-3540
 Project: Hive
  Issue Type: Bug
  Components: CLI, Query Processor
Affects Versions: 0.8.1
Reporter: Chris McConnell
Priority: Minor


When accessing a table with zipped data using a custom InputFormat that 
extends CombineFileInputFormat, any non-local Hive execution will pick up the 
default org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.

The issue stems from the fact that (by default) InputFormat cannot handle 
concatenated zip files. 

To reproduce:

1. Create text-based data and zip it
2. Create a table with a custom input format that extends CombineFileInputFormat
3. Execute a query that is not local (select * from table limit 1;)


