[jira] Updated: (PIG-1381) Need a way for Pig to take an alternative property file
[ https://issues.apache.org/jira/browse/PIG-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
V.V.Chaitanya Krishna updated PIG-1381:
Status: Open (was: Patch Available)

Need a way for Pig to take an alternative property file
Key: PIG-1381
URL: https://issues.apache.org/jira/browse/PIG-1381
Project: Pig
Issue Type: Improvement
Components: impl
Affects Versions: 0.7.0
Reporter: Daniel Dai
Assignee: V.V.Chaitanya Krishna
Fix For: 0.8.0
Attachments: PIG-1381-1.patch, PIG-1381-2.patch, PIG-1381-3.patch, PIG-1381-4.patch, PIG-1381-5.patch, PIG-1381_cli_1.patch

Currently, Pig reads only the first pig.properties it finds in the CLASSPATH. Pig ships a default pig.properties, so if a user has a different pig.properties the two conflict, since only one can be read. There are a couple of ways to solve this:
1. Give the user a command-line option to pass an additional property file.
2. Rename the default pig.properties to pig-default.properties, so that a user-supplied pig.properties can override it.
3. Further, consider using pig-default.xml/pig-site.xml, which seems more natural for the Hadoop community. If so, we should provide backward compatibility by also reading pig.properties and pig-cluster-hadoop-site.xml.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1381) Need a way for Pig to take an alternative property file
[ https://issues.apache.org/jira/browse/PIG-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
V.V.Chaitanya Krishna updated PIG-1381:
Attachment: PIG-1381_cli_1.patch

Uploading a patch that implements option 1: a command-line option that lets the user provide a properties file. The patch makes the following changes:
* Renames PropertiesUtil.loadPropertiesFromFile to PropertiesUtil.loadDefaultProperties.
* Refactors the code that loads properties from pig-default.properties and pig.properties, to avoid code duplication.
* Extracts the code that loads properties from the deprecated .pigrc file into its own method. This makes it easier to reuse that method to load properties from the user-specified properties file.
* Loads the properties from the deprecated .pigrc file _before_ the other default files (i.e., pig-default.properties and pig.properties). This simplifies the code: we don't need to check whether a property already exists before loading it from .pigrc, because any value set there is later overridden by the value in pig-default.properties or pig.properties.
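For illustration, the loading order described in this patch can be sketched with plain java.util.Properties. This is a minimal sketch, not the patch itself: the class name LayeredProperties, the helper loadInOrder, and the example.* keys are hypothetical, the file contents are inlined as strings, and placing the command-line file last is an assumption based on the stated goal that it should override the defaults.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Pig: merges property sources in order.
final class LayeredProperties {

    /** Loads each source in order; a key in a later source replaces the earlier value. */
    static Properties loadInOrder(String... sources) {
        Properties merged = new Properties();
        for (String source : sources) {
            Properties layer = new Properties();
            try {
                layer.load(new StringReader(source));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            merged.putAll(layer);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical file contents; the keys are made up for this sketch.
        String pigrc = "example.mode=legacy\nexample.verbose=true\n";
        String pigDefault = "example.mode=default\n";
        String pigProperties = "";
        String commandLineFile = "example.verbose=false\n";

        // Assumed order: .pigrc first, then the default files, then the command-line file.
        Properties props = loadInOrder(pigrc, pigDefault, pigProperties, commandLineFile);
        System.out.println(props.getProperty("example.mode"));    // default
        System.out.println(props.getProperty("example.verbose")); // false
    }
}
```

Because later layers simply overwrite earlier ones, the .pigrc layer needs no "is this key already set?" check, which is the simplification the last bullet describes.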
[jira] Updated: (PIG-1381) Need a way for Pig to take an alternative property file
[ https://issues.apache.org/jira/browse/PIG-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
V.V.Chaitanya Krishna updated PIG-1381:
Status: Patch Available (was: Open)
Running through Hudson.
[jira] Updated: (PIG-928) UDFs in scripting languages
[ https://issues.apache.org/jira/browse/PIG-928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arnab Nandi updated PIG-928:
Attachment: pig.scripting.patch.arnab, test.zip, calltrace.png

Building on Julien's and Woody's code, this patch provides pluggable scripting support in native Pig.

Syntax:
    register 'test.py' USING org.apache.pig.scripting.jython.JythonScriptEngine;
This makes all functions inside test.py available as Pig functions.

Things in this patch:
1. Modifications to the parser .jjt file.
2. ScriptEngine abstract class and Jython instantiation.
3. Ability to ship .py files similar to .jars, loaded on demand.
4. Input checking and Schema support.

Things NOT in this patch:
1. Inline code support (replace 'test.py' with `multiline inline code`; I'd prefer to submit this as a separate bug).
2. Scripting engines and examples other than Jython (e.g. beanshell and rhino).
3. JUnit-based test harness (a test harness is provided as test.zip).
4. Python-Pig object transforms are not very efficient (see calltrace.png). I preferred the cleaner implementation first; non-obvious optimizations such as object reuse can be introduced as a separate bug.

Notes:
1. I went with register instead of define since files can contain multiple functions, similar to .jars. IMHO this makes more sense: using define would introduce the concept of codeblock aliases, and function names would look like alias.functionName(), which is possible but inconsistent since we cannot have alias2.functionName() (which would require separate interpreter instances, etc.).
2. This has been tested both locally and in mapred mode.
3. We assume .py files are simply a list of functions. Since the entire file is loaded, you can have dependent functions. No effort is made to resolve imports, though.
4. You'll need to add jython.jar to the classpath, or compile it into pig.jar.

Would love comments and code follow-ups!

UDFs in scripting languages
Key: PIG-928
URL: https://issues.apache.org/jira/browse/PIG-928
Project: Pig
Issue Type: New Feature
Reporter: Alan Gates
Fix For: 0.8.0
Attachments: calltrace.png, package.zip, pig-greek.tgz, pig.scripting.patch.arnab, pyg.tgz, scripting.tgz, scripting.tgz, test.zip

It should be possible to write UDFs in scripting languages such as Python, Ruby, etc. This frees users from needing to compile Java, generate a jar, etc. It also opens Pig to programmers who prefer scripting languages over Java.
[jira] Updated: (PIG-1249) Safe-guards against misconfigured Pig scripts without PARALLEL keyword
[ https://issues.apache.org/jira/browse/PIG-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Zhang updated PIG-1249:
Status: Open (was: Patch Available)

Safe-guards against misconfigured Pig scripts without PARALLEL keyword
Key: PIG-1249
URL: https://issues.apache.org/jira/browse/PIG-1249
Project: Pig
Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Arun C Murthy
Assignee: Jeff Zhang
Priority: Critical
Fix For: 0.8.0
Attachments: PIG-1249.patch, PIG_1249_2.patch

It would be *very* useful for Pig to have safe-guards against naive scripts which process a *lot* of data without using the PARALLEL keyword. We've seen a fair number of instances where naive users process huge data-sets (10TB) with a badly misconfigured number of reduces, e.g. 1 reduce.
Re: About PigPen
The one on the JIRA is more up to date. However, be aware that PigPen has not been updated since Pig 0.2 and does not work with new versions of Pig.

Alan.

On May 23, 2010, at 11:25 PM, Renato Marroquín Mogrovejo wrote:
> Hi, does anybody know which one is the PigPen release? I found two links. The first one is from the wiki and the second one is from the JIRA.
> http://issues.apache.org/jira/secure/attachment/12393772/org.apache.pig.pigpen_0.0.1.jar
> https://issues.apache.org/jira/secure/attachment/12400858/PigPen.tgz
> Thanks in advance.
> Renato M.
[jira] Commented: (PIG-928) UDFs in scripting languages
[ https://issues.apache.org/jira/browse/PIG-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870739#action_12870739 ]
Dmitriy V. Ryaboy commented on PIG-928:
I've found that converting objects to tuples lazily can save significant amounts of time when records are later filtered out, when only parts of the output are used, etc. Perhaps this is something to try if you say pythonToPig is slow? Here's what I did with Protocol Buffers: http://github.com/dvryaboy/elephant-bird/blob/master/src/java/com/twitter/elephantbird/pig/util/ProtobufTuple.java
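To make the lazy-conversion idea concrete, here is a minimal sketch under stated assumptions. It is not the ProtobufTuple code linked above and does not implement Pig's Tuple interface; LazyRecord and its converters are hypothetical names used only to show that a field is converted on first access and never converted if it is never read.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical wrapper: holds the source object and converts fields only when asked.
final class LazyRecord<S> {
    private final S source;
    private final List<Function<S, Object>> converters;
    private final Object[] cache;
    private final boolean[] converted;

    LazyRecord(S source, List<Function<S, Object>> converters) {
        this.source = source;
        this.converters = converters;
        this.cache = new Object[converters.size()];
        this.converted = new boolean[converters.size()];
    }

    int size() {
        return converters.size();
    }

    /** Converts field i on first access and returns the cached value afterwards. */
    Object get(int i) {
        if (!converted[i]) {
            cache[i] = converters.get(i).apply(source);
            converted[i] = true;
        }
        return cache[i];
    }

    public static void main(String[] args) {
        AtomicInteger conversions = new AtomicInteger();
        // One counted converter, so the number of actual conversions is visible.
        Function<String, Object> length = s -> {
            conversions.incrementAndGet();
            return s.length();
        };
        LazyRecord<String> record = new LazyRecord<>("pig", Collections.singletonList(length));

        System.out.println(conversions.get()); // 0: nothing converted at construction
        record.get(0);
        record.get(0);
        System.out.println(conversions.get()); // 1: converted once, then served from cache
    }
}
```

A record that is filtered out before any field is read costs no conversions at all in this scheme, which is where the savings described above would come from.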
[jira] Commented: (PIG-1381) Need a way for Pig to take an alternative property file
[ https://issues.apache.org/jira/browse/PIG-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870796#action_12870796 ]
Daniel Dai commented on PIG-1381:
I reviewed the patch. The command-line properties file overrides the default properties, and we can have multiple -propertyFile entries on the command line. The command-line switch is -P or -propertyFile. That's good. I have a comment on this line:
    opts.registerOpt('P', "propertyFile", CmdLineParser.ValueExpected.OPTIONAL);
I think the value of the propertyFile option is not OPTIONAL; it should be changed to REQUIRED.
[jira] Commented: (PIG-1249) Safe-guards against misconfigured Pig scripts without PARALLEL keyword
[ https://issues.apache.org/jira/browse/PIG-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870896#action_12870896 ]
Alan Gates commented on PIG-1249:
Questions/Comments:
1. In this code, what happens if a loader is not loading from a file (like an HBase loader)? It looks to me like it will end up throwing an IOException when it tries to stat the 'file' which won't exist and that will cause Pig to die. Ideally in this case it should decide that it cannot make a rational estimate and not try to estimate.
2. I'm curious where the values of ~1GB per reducer and 999 reducers came from.
3. Does this estimate apply only to the first job or to all jobs?
4. How does this work in the case of joins, where there are multiple inputs to a job?
[jira] Commented: (PIG-1426) Change the size of Tuple from Int to VInt when Serialize Tuple
[ https://issues.apache.org/jira/browse/PIG-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870899#action_12870899 ]
Alan Gates commented on PIG-1426:
This looks cool. Eventually we could extend it to the lengths of strings, bags, data byte arrays, etc. One question: Zebra and BinStorage use the code in DataReaderWriter to read this data off disk. Is WritableUtils.readVInt compatible with a regular integer as well? If not, it seems we're introducing a data incompatibility for data stored using these formats.

Change the size of Tuple from Int to VInt when Serialize Tuple
Key: PIG-1426
URL: https://issues.apache.org/jira/browse/PIG-1426
Project: Pig
Issue Type: Improvement
Components: data
Affects Versions: 0.8.0
Reporter: Jeff Zhang
Assignee: Jeff Zhang
Fix For: 0.8.0
Attachments: PIG_1426.patch

Most of the time the size of a tuple is not very large, so one byte is enough to store it. I therefore suggest using VInt instead of Int for the size of a tuple during serialization. Because the key type of the map output is Tuple, this can reduce the amount of data transferred from mapper to reducer.
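The size saving and the compatibility concern raised above can both be seen in a small sketch. Note the assumptions: this uses a simplified 7-bits-per-byte variable-length encoding written for this example, not Hadoop's WritableUtils format, and the class and method names are hypothetical. The point it illustrates is general: a variable-length reader stops at the first byte that says "no more bytes follow", so it misreads data that was written as a fixed 4-byte int.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical demo class; the encoding is a simplified varint, not Hadoop's VInt.
final class VarIntSizeDemo {

    /** Encodes a non-negative int as 7-bit groups, low group first; the high bit means "more bytes follow". */
    static byte[] encodeVarInt(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    /** Decodes one varint from the start of the array, ignoring any bytes after it. */
    static int decodeVarInt(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated varint");
    }

    /** Encodes an int as four big-endian bytes, as a fixed-width writer would. */
    static byte[] encodeFixedInt(int value) {
        return new byte[] {
            (byte) (value >>> 24), (byte) (value >>> 16), (byte) (value >>> 8), (byte) value
        };
    }

    public static void main(String[] args) {
        System.out.println(encodeVarInt(3).length);   // 1
        System.out.println(encodeFixedInt(3).length); // 4
        // Fixed-width data fed to the variable-length reader yields the wrong size.
        System.out.println(decodeVarInt(encodeFixedInt(3))); // 0, not 3
    }
}
```

So a small tuple size shrinks from four bytes to one, but data already on disk in the fixed-width form would need either a format marker or a separate read path to stay readable.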
[jira] Commented: (PIG-1426) Change the size of Tuple from Int to VInt when Serialize Tuple
[ https://issues.apache.org/jira/browse/PIG-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870937#action_12870937 ]
Jeff Zhang commented on PIG-1426:
Alan, it won't affect Zebra and BinStorage. I only apply VInt to the tuple size: we now always write the size as a VInt and always read it back as a VInt. It won't affect other data types.
[jira] Commented: (PIG-1249) Safe-guards against misconfigured Pig scripts without PARALLEL keyword
[ https://issues.apache.org/jira/browse/PIG-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870948#action_12870948 ]
Jeff Zhang commented on PIG-1249:
Response to Alan's questions:

> 1. In this code, what happens if a loader is not loading from a file (like an HBase loader)? It looks to me like it will end up throwing an IOException when it tries to stat the 'file' which won't exist and that will cause Pig to die. Ideally in this case it should decide that it cannot make a rational estimate and not try to estimate.
It won't throw an IOException when the file doesn't exist. getTotalInputFileSize returns 0 if the loader is not loading from a file or the file doesn't exist, and the final estimated number of reducers is then 1.

> 2. I'm curious where the values of ~1GB per reducer and 999 reducers came from.
These two numbers are what Hive uses. I'm not sure where they came from; maybe from their experience.

> 3. Does this estimate apply only to the first job or to all jobs?
It applies to all jobs.

> 4. How does this work in the case of joins, where there are multiple inputs to a job?
It estimates the number of reducers from the total size of all the input files.
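The estimate described in these answers can be sketched as follows. This is a minimal sketch, not the code in the patch: the class and method names are hypothetical, and the exact byte count behind "~1GB" is an assumption (1,000,000,000 is used here). It sums all input sizes, divides by the per-reducer budget rounding up, and clamps the result between 1 and 999.

```java
// Hypothetical sketch of the estimate discussed in this thread; not the PIG-1249 patch.
final class ReducerEstimate {
    // "~1GB per reducer" in the thread; the exact value is an assumption.
    static final long BYTES_PER_REDUCER = 1_000_000_000L;
    // Upper bound mentioned in the thread.
    static final int MAX_REDUCERS = 999;

    /** Estimates reducers from the combined size of all inputs (several inputs in the case of a join). */
    static int estimate(long... inputBytes) {
        long total = 0;
        for (long size : inputBytes) {
            total += size;
        }
        // Ceiling division: any partial gigabyte still needs a reducer.
        long needed = (total + BYTES_PER_REDUCER - 1) / BYTES_PER_REDUCER;
        // An unknown or zero size falls back to 1 reducer; very large inputs are capped.
        return (int) Math.max(1, Math.min(MAX_REDUCERS, needed));
    }

    public static void main(String[] args) {
        System.out.println(estimate(0L));                       // 1
        System.out.println(estimate(600_000_000L, 600_000_000L)); // 2
        System.out.println(estimate(10_000_000_000_000L));      // 999
    }
}
```

With these numbers, the 10TB case from the issue description would be capped at 999 reducers instead of running with a single one.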
[jira] Commented: (PIG-1381) Need a way for Pig to take an alternative property file
[ https://issues.apache.org/jira/browse/PIG-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870970#action_12870970 ]
V.V.Chaitanya Krishna commented on PIG-1381:
> I think the value of the propertyFile option is not OPTIONAL; it should be changed to REQUIRED.
If the option is made mandatory for the user, then the following scenarios might occur:
1. A user who wants to run with just the default properties, and does not need to set any properties, would still be forced to submit a blank properties file.
2. If this is made mandatory, then the presence of pig.properties might not make much sense. I believe this option should be an alternative to pig.properties as a way for the user to provide properties.
Ideally, I think users should be able to work without submitting their own set of properties.
[jira] Commented: (PIG-1381) Need a way for Pig to take an alternative property file
[ https://issues.apache.org/jira/browse/PIG-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870972#action_12870972 ]
Daniel Dai commented on PIG-1381:
Hi V.V. Chaitanya, it is ValueExpected.REQUIRED. That does not require the option to appear on the command line. But if it does appear, then you must give a value after -propertyFile/-P.
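The distinction between "the option is required" and "the option's value is required" can be shown with a tiny stand-alone parser. This is a hypothetical sketch, not Pig's CmdLineParser: PropertyFileArgs and parse are made-up names, and only the -P and -propertyFile switches mentioned in this thread are handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini parser illustrating "value required" semantics; not Pig's CmdLineParser.
final class PropertyFileArgs {

    /** Collects the value after each -P or -propertyFile; the switch itself may be absent. */
    static List<String> parse(String... args) {
        List<String> files = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-P") || args[i].equals("-propertyFile")) {
                // The switch is optional, but once given it must be followed by a value.
                if (i + 1 >= args.length || args[i + 1].startsWith("-")) {
                    throw new IllegalArgumentException(args[i] + " requires a file name");
                }
                files.add(args[++i]);
            }
        }
        return files;
    }

    public static void main(String[] args) {
        System.out.println(parse());                                            // []
        System.out.println(parse("-P", "a.properties", "-propertyFile", "b.properties")); // [a.properties, b.properties]
        try {
            parse("-P");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // -P requires a file name
        }
    }
}
```

Running with no switch at all is accepted, so users who only want the default properties are unaffected; only a dangling -P is rejected.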
[jira] Commented: (PIG-1381) Need a way for Pig to take an alternative property file
[ https://issues.apache.org/jira/browse/PIG-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870975#action_12870975 ]
V.V.Chaitanya Krishna commented on PIG-1381:
My bad. Got carried away for a while :) Yes, it makes sense to have it as REQUIRED.
[jira] Updated: (PIG-1381) Need a way for Pig to take an alternative property file
[ https://issues.apache.org/jira/browse/PIG-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
V.V.Chaitanya Krishna updated PIG-1381:
Attachment: PIG-1381_cli_2.patch
Uploading patch with Daniel's review comments incorporated.
[jira] Updated: (PIG-1381) Need a way for Pig to take an alternative property file
[ https://issues.apache.org/jira/browse/PIG-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
V.V.Chaitanya Krishna updated PIG-1381:
Status: Patch Available (was: Open)
Submitting the new patch to Hudson.
[jira] Updated: (PIG-1381) Need a way for Pig to take an alternative property file
[ https://issues.apache.org/jira/browse/PIG-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
V.V.Chaitanya Krishna updated PIG-1381:
Status: Open (was: Patch Available)