[
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025007#comment-13025007
]
Woody Anderson commented on PIG-1824:
-------------------------------------
Agreed.
Re: PYTHON_CACHEDIR: the code already behaves as you wish, in that it only
deletes the dir if it (pig) created it.
Sorry for not being clear in the comments about that, but if you read the
code you'll see it.
If the dir isn't writable, I (pig) was creating an alternate directory. It may
be possible to pre-populate the cache dir, and I understand (and shared) the
desire to have an error instead of a new directory, but I was initially
experiencing this error:
{code}
*sys-package-mgr*: can't create package cache dir,
'/grid/0/Releases/pig-0.8.0..1103222002-20110401-000/share/pig-0.8.0..1103222002/lib/cachedir/packages'
{code}
which is why I added the 'is writable' check. After reviewing (per your
comment), though, it seems that cachedir is not set on the grid (at least at
the point when the static block runs). If left as null, it seems to default to
some grid location that is not writable (and thus doesn't work), but if I set
it to a writable tmp dir first, it works.
So I can safely agree that raising an error when the dir isn't writable is
both desirable and workable.
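For illustration, the behaviour described above could be sketched roughly as follows. This is not the code from the patch: `CacheDirSketch`, `resolve` and `cleanup` are hypothetical names, and surfacing the "not writable" case as an `IllegalStateException` is an assumption. `python.cachedir` is the Jython property for the package cache directory.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical sketch, not the class from the patch.
class CacheDirSketch {
    static final String CACHEDIR_PROP = "python.cachedir";

    final File dir;
    final boolean createdHere; // true only if this code created the directory

    private CacheDirSketch(File dir, boolean createdHere) {
        this.dir = dir;
        this.createdHere = createdHere;
    }

    // configured: the current cachedir setting, or null if none is set.
    static CacheDirSketch resolve(String configured) {
        if (configured == null) {
            try {
                // Nothing configured: point Jython at a writable tmp dir up front.
                File tmp = Files.createTempDirectory("pig_jython_").toFile();
                System.setProperty(CACHEDIR_PROP, tmp.getAbsolutePath());
                return new CacheDirSketch(tmp, true);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        File dir = new File(configured);
        if (!dir.isDirectory() || !dir.canWrite()) {
            // Assumption: fail loudly instead of silently picking another dir.
            throw new IllegalStateException(
                    "cachedir is not a writable directory: " + configured);
        }
        return new CacheDirSketch(dir, false);
    }

    // Deletes the directory tree, but only if resolve() created it.
    void cleanup() {
        if (!createdHere) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir.toPath())) {
            // Reverse order visits children before their parent directories.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would pass `System.getProperty("python.cachedir")` to `resolve` before the interpreter is initialized, and call `cleanup()` on shutdown. A user-supplied dir is never deleted because `createdHere` is false for it.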
As for getScriptAsStream():
I followed the existing code convention on that one, though I didn't like it
either.
Again, if you read down a bit you'll see that the impl of getScriptAsStream()
is:
{code}
..
if (is == null) {
    throw new IllegalStateException(
            "Could not initialize interpreter (from file system or classpath) with "
            + scriptPath);
}
return is;
{code}
So the null check is superfluous, but it does quiet the "not null check"
warnings. I didn't add an additional throw statement in this case because,
essentially, my code wouldn't add any _new_ errors that the existing code
didn't already exhibit if the impl of getScriptAsStream somehow changed and
could return null.
Anyway, I'll upload a new patch to address the writable issue. If you think
it's a big deal, we can add an 'else throw' statement around getScriptAsStream.
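To make the 'else throw' option concrete, here is a minimal sketch. It is not the Pig source: `ScriptLoaderSketch` and `readScript` are hypothetical names, and the in-memory map is only a stand-in for the real file system / classpath lookup. The point is that the caller's null check is redundant today, and the `else` branch only matters if getScriptAsStream were ever changed to return null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical sketch, not the class from the patch.
class ScriptLoaderSketch {
    // Stand-in for the file system / classpath: script path -> script text.
    private final Map<String, String> scripts;

    ScriptLoaderSketch(Map<String, String> scripts) {
        this.scripts = scripts;
    }

    // Mirrors the quoted behaviour: never returns null, throws instead.
    InputStream getScriptAsStream(String scriptPath) {
        String body = scripts.get(scriptPath);
        InputStream is = (body == null)
                ? null
                : new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
        if (is == null) {
            throw new IllegalStateException(
                    "Could not initialize interpreter (from file system or classpath) with "
                    + scriptPath);
        }
        return is;
    }

    // Caller with the explicit 'else throw' guard.
    String readScript(String scriptPath) {
        InputStream is = getScriptAsStream(scriptPath);
        if (is != null) {
            try (InputStream in = is) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        } else {
            // Unreachable with the impl above; guards against a future change.
            throw new IllegalStateException(
                    "getScriptAsStream returned null for " + scriptPath);
        }
    }
}
```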
> Support import modules in Jython UDF
> ------------------------------------
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
> Issue Type: Improvement
> Affects Versions: 0.8.0, 0.9.0
> Reporter: Richard Ding
> Assignee: Woody Anderson
> Fix For: 0.9.0
>
> Attachments: 1824.patch, 1824a.patch, 1824b.patch
>
>
> Currently, a Jython UDF script doesn't support the Jython import statement,
> as in the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("word:chararray")
> def resplit(content, regex, index):
>     return re.compile(regex).split(content)[index]
> {code}
> Can Pig automatically locate the Jython module file and ship it to the
> backend? Or should we add a ship clause to let the user explicitly specify
> the module to ship?