[
https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290798#comment-13290798
]
Koji Noguchi commented on PIG-2741:
-----------------------------------
bq. But the weird thing is that the script fails with the same error message if I
apply the PIG-2665 patch.
This issue was failing because python.cachedir was set to an incorrect path
during initialization.
With PIG-2665 and jython-standalone-2.5.2.jar, the failure seems to come from
'python.cachedir.skip' somehow being set to true by default.
The error message is the same, but the cause is different.
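To illustrate the first cause, here is a minimal sketch in plain Python of choosing a cache directory that is actually usable, falling back to a temporary one (as the "created tmp python.cachedir=/tmp/pig_jython_..." log line in the report suggests Pig does). The helper name resolve_cachedir is hypothetical; this is not the code in JythonScriptEngine, only an outline of the idea.

```python
import os
import tempfile


def resolve_cachedir(configured):
    """Return a usable cache directory, preferring `configured`.

    Hypothetical helper for illustration; not Pig's or Jython's actual code.
    """
    if configured:
        try:
            os.makedirs(configured, exist_ok=True)
            # Probe by really creating a file: permission bits alone can
            # mislead (for example when the directory has another owner).
            with tempfile.TemporaryFile(dir=configured):
                pass
            return configured
        except OSError:
            pass  # unusable, fall through to a temporary directory
    # The prefix mirrors the directory name seen in the log output.
    return tempfile.mkdtemp(prefix="pig_jython_")
```

The returned path would then have to be set as python.cachedir before the interpreter is initialized, which is the ordering this issue is about.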
bq. To make sure a later patch does not break this script, please add a test
case.
Adding a test case for the PIG-2665 failure is probably easy. As for this
issue, I don't know of a good way. The owner of jython-2.5.0.jar (dir) and the
user running the test need to be different for this problem to occur. When I
tested manually, I just ran mkdir ./build/ivy/lib/Pig/cachedir followed by
chmod 000 ./build/ivy/lib/Pig/cachedir.
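For reference, the manual setup above could be scripted roughly as follows. This is only a sketch in plain Python: the helper names are made up for illustration, and it is not a proposed Pig test case.

```python
import os
import stat
import tempfile


def make_unusable_cachedir(parent):
    """Create parent/cachedir with mode 000, like the mkdir + chmod 000 above."""
    path = os.path.join(parent, "cachedir")
    os.mkdir(path)
    os.chmod(path, 0)  # no read, write or execute permission for anyone
    return path


def restore_and_remove(path):
    """Give the owner access again so the directory can be cleaned up."""
    os.chmod(path, stat.S_IRWXU)
    os.rmdir(path)


if __name__ == "__main__":
    parent = tempfile.mkdtemp()
    cachedir = make_unusable_cachedir(parent)
    print("mode of %s: %03o" % (cachedir, stat.S_IMODE(os.stat(cachedir).st_mode)))
    restore_and_remove(cachedir)
    os.rmdir(parent)
```

Note that mode 000 does not stop root from writing into the directory, so a test built on this only reproduces the problem when run as a non-privileged user.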
> Python script throws an NameError: name 'Configuration' is not defined in
> case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
> Key: PIG-2741
> URL: https://issues.apache.org/jira/browse/PIG-2741
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.10.0
> Environment: Pig 0.10
> Reporter: Viraj Bhat
> Attachments: pig-2741-no-test-yet-v1.patch.txt
>
>
> I have a Python script which writes out data to HDFS:
> {code}
> from org.apache.hadoop.conf import *
> from org.apache.hadoop.fs import *
> config = Configuration()
> hdfs = FileSystem.get(config)
> out = hdfs.create(Path("/user/viraj/junk.txt"))
> out.write("Hello World!")
> {code}
> When I run this I get the following error:
> {quote}
> 2012-06-06 01:20:43,101 [main] INFO org.apache.pig.Main - Logging error
> messages to: /home/viraj/pig_1338945643097.log
> 2012-06-06 01:20:43,502 [main] INFO org.apache.pig.Main - Run embedded
> script: jython
> 2012-06-06 01:20:43,603 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting
> to hadoop file system at: hdfs://namenode:8020
> 2012-06-06 01:20:44,069 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting
> to map-reduce job tracker at: jobtracker:50300
> *sys-package-mgr*: can't create package cache dir, '/mydir/xx'
> 2012-06-06 01:20:45,815 [main] INFO
> org.apache.pig.scripting.jython.JythonScriptEngine - created tmp
> python.cachedir=/tmp/pig_jython_7126458276821733512
> 2012-06-06 01:20:45,904 [main] ERROR org.apache.pig.Main - ERROR 1121: Python
> Error. Traceback (most recent call last):
> File "/homes/viraj/test.py", line 4, in <module>
> config = Configuration()
> NameError: name 'Configuration' is not defined
> {quote}
> I tried to solve it in various ways:
> 1) Overriding pig.properties to specify python.cachedir.skip=false, but this
> does not seem to work.
> 2) The only workaround is to specify -Dpython.cachedir=/mydirectory/tmp on
> the command line.
> Viraj