[
https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Koji Noguchi updated PIG-2741:
------------------------------
Attachment: pig-2741-testfailing-pig2665-v5.patch.txt
Thanks Daniel for adding e2e testcase.
Added 1 line to the testcase so that it now fails without this patch.
{noformat}
#jython uses 'python.home'/cachedir when python.cachedir is not specified.
#To test python.cachedir is set correctly by the framework,
#setting python.home to a random path
'java_params' => ['-Dpython.home=/dev/null/fake'],
{noformat}
Confirmed that this test case
i) Fails without the patch (the cache dir falls under /dev/null/fake and cannot be created)
ii) Succeeds with the patch (the cache dir set by the framework is used)
iii) Fails with the current PIG-2665 patch, because python.cachedir.skip is set to true
in standalone mode.
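For reference, the three outcomes above can be modelled with a small sketch. This is only a simplified illustration of the lookup order described in the testcase comment, not Jython's actual code. The function name effective_cachedir, the plain dict standing in for the JVM system properties, and the /tmp/pig_jython_example path are all hypothetical.

```python
import posixpath


def effective_cachedir(props):
    """Return the package cache dir a Jython-like runtime would pick.

    `props` is a plain dict standing in for JVM system properties
    (an assumption made for this sketch). Returns None when caching
    is skipped.
    """
    # python.cachedir.skip=true disables the package cache entirely.
    if props.get("python.cachedir.skip", "false").lower() == "true":
        return None
    # An explicitly set python.cachedir takes precedence.
    cachedir = props.get("python.cachedir")
    if cachedir:
        return cachedir
    # Otherwise fall back to 'python.home'/cachedir.
    return posixpath.join(props.get("python.home", ""), "cachedir")


# i) without the patch: only the fake python.home from the testcase is set
without_patch = {"python.home": "/dev/null/fake"}
# ii) with the patch: the framework also sets python.cachedir (path is hypothetical)
with_patch = dict(without_patch, **{"python.cachedir": "/tmp/pig_jython_example"})
# iii) cache skipped, as described for the current PIG-2665 patch
with_skip = dict(with_patch, **{"python.cachedir.skip": "true"})

for name, props in [("i", without_patch), ("ii", with_patch), ("iii", with_skip)]:
    print(name, effective_cachedir(props))
```

In case i) the result sits under /dev/null/fake, where no directory can be created. In case iii) there is no cache dir at all, which matches the failure seen with the wildcard imports in the reporter's script.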
> Python script throws an NameError: name 'Configuration' is not defined in
> case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
> Key: PIG-2741
> URL: https://issues.apache.org/jira/browse/PIG-2741
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.10.0
> Environment: Pig 0.10
> Reporter: Viraj Bhat
> Attachments: pig-2741-no-test-yet-v1.patch.txt,
> pig-2741-testfailing-pig2665-v2.patch.txt,
> pig-2741-testfailing-pig2665-v3.patch.txt,
> pig-2741-testfailing-pig2665-v4.patch.txt,
> pig-2741-testfailing-pig2665-v5.patch.txt,
> pig-2741-testfailing-pig2665-v5.patch.txt
>
>
> I have a Python script which writes out data to HDFS
> {code}
> from org.apache.hadoop.conf import *
> from org.apache.hadoop.fs import *
> config = Configuration()
> hdfs = FileSystem.get(config)
> out = hdfs.create(Path("/user/viraj/junk.txt"))
> out.write("Hello World!")
> {code}
> When I run this I get the following error:
> {quote}
> 2012-06-06 01:20:43,101 [main] INFO org.apache.pig.Main - Logging error
> messages to: /home/viraj/pig_1338945643097.log
> 2012-06-06 01:20:43,502 [main] INFO org.apache.pig.Main - Run embedded
> script: jython
> 2012-06-06 01:20:43,603 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting
> to hadoop file system at: hdfs://namenode:8020
> 2012-06-06 01:20:44,069 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting
> to map-reduce job tracker at: jobtracker:50300
> *sys-package-mgr*: can't create package cache dir, '/mydir/xx'
> 2012-06-06 01:20:45,815 [main] INFO
> org.apache.pig.scripting.jython.JythonScriptEngine - created tmp
> python.cachedir=/tmp/pig_jython_7126458276821733512
> 2012-06-06 01:20:45,904 [main] ERROR org.apache.pig.Main - ERROR 1121: Python
> Error. Traceback (most recent call last):
> File "/homes/viraj/test.py", line 4, in <module>
> config = Configuration()
> NameError: name 'Configuration' is not defined
> {quote}
> I tried to solve it in various ways:
> 1) Overriding pig.properties to specify python.cachedir.skip=false does not
> seem to work
> 2) The only workaround is to specify -Dpython.cachedir=/mydirectory/tmp on
> the command line
> Viraj