[
https://issues.apache.org/jira/browse/PIG-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Daniel Dai updated PIG-3441:
----------------------------
Attachment: PIG-3441-2.patch
Finally got a chance to look into it. What I found is that HExecutionEngine.init
instantiates a Configuration with defaults and puts all of its entries into
PigContext.properties. Every time Pig needs a Configuration object, it
creates one with no defaults and injects PigContext.properties, so the
resulting Configuration should still contain the entries from the default
resources. My guess for this issue is that at the time Pig invokes
HExecutionEngine.init, Configuration.addDefaultResource("myfs-site.xml") has
not been called yet, so its entries are missing from PigContext.properties.
I'd like to fix the issue by updating PigContext.properties after the new
default resources are added.
Attached is a patch which updates PigContext.properties again before we launch
the Pig job (the patch also includes several cleanups for the configuration
part). Can you check whether it works for your issue?
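To illustrate the suspected ordering problem, here is a minimal, self-contained
sketch. It does not use Hadoop or Pig classes and is not the attached patch:
DefaultResourceTiming, DEFAULT_RESOURCES, snapshotDefaults and the property
keys are all hypothetical stand-ins for the static default-resource list, for
"a Configuration with defaults flattened into properties", and for the
contents of myfs-site.xml.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

class DefaultResourceTiming {
    // Hypothetical stand-in for the static list of default resources.
    static final List<Map<String, String>> DEFAULT_RESOURCES = new ArrayList<>();

    // Models a static addDefaultResource call: registers one more resource.
    static void addDefaultResource(Map<String, String> resource) {
        DEFAULT_RESOURCES.add(resource);
    }

    // Models building a config with defaults and copying its entries out.
    static Properties snapshotDefaults() {
        Properties props = new Properties();
        for (Map<String, String> resource : DEFAULT_RESOURCES) {
            props.putAll(resource);
        }
        return props;
    }

    public static void main(String[] args) {
        addDefaultResource(Collections.singletonMap("fs.defaultFS", "file:///"));

        // Snapshot taken early, as at engine init time.
        Properties pigProps = snapshotDefaults();

        // The filesystem class is loaded later, so its static block runs later.
        addDefaultResource(Collections.singletonMap("myfs.endpoint", "example-host"));

        // The early snapshot does not see the late resource.
        System.out.println(pigProps.getProperty("myfs.endpoint")); // prints "null"

        // Refreshing the snapshot before job launch picks it up.
        pigProps.putAll(snapshotDefaults());
        System.out.println(pigProps.getProperty("myfs.endpoint")); // prints "example-host"
    }
}
```

The sketch only models the timing: a snapshot taken before the static
initializer runs misses the late entries, and taking it again afterwards
recovers them.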
> Allow Pig to use default resources from Configuration objects
> -------------------------------------------------------------
>
> Key: PIG-3441
> URL: https://issues.apache.org/jira/browse/PIG-3441
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.11.1
> Reporter: Bhooshan Mogal
> Attachments: PIG-3441-2.patch, PIG-3441.patch, PIG-3441_1.patch
>
>
> Pig currently ignores parameters from configuration files added statically to
> Configuration objects via Configuration.addDefaultResource("filename.xml").
> Consider the following scenario:
> In a Hadoop FileSystem driver for a non-HDFS filesystem, you load properties
> specific to that FileSystem in a static initializer block in the class that
> extends org.apache.hadoop.fs.FileSystem, as shown below:
> {code}
> class MyFileSystem extends FileSystem {
>     static {
>         Configuration.addDefaultResource("myfs-default.xml");
>         Configuration.addDefaultResource("myfs-site.xml");
>     }
> }
> {code}
> Interfaces like the Hadoop CLI, Hive, Hadoop M/R can find configuration
> parameters defined in these configuration files as long as they are on the
> classpath.
> However, Pig cannot find parameters from these files, because it ignores
> configuration files added statically.
> Pig should allow users to specify whether they would like Pig to read
> parameters from resources loaded statically.