[jira] [Commented] (PIG-3135) HExecutionEngine should look for resources in user passed Properties
[ https://issues.apache.org/jira/browse/PIG-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585479#comment-13585479 ]

Prashant Kommireddi commented on PIG-3135:

Hi [~cheolsoo], I don't see the new tests (test/org/apache/pig/test/TestHExecutionEngine.java) making it in with this commit. Can you please take a look when you get a chance?

{code}
$ svn diff --summarize -c r1449436
M conf/pig.properties
M CHANGES.txt
M src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java
{code}

HExecutionEngine should look for resources in user passed Properties

Key: PIG-3135
URL: https://issues.apache.org/jira/browse/PIG-3135
Project: Pig
Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Prashant Kommireddi
Assignee: Prashant Kommireddi
Fix For: 0.12
Attachments: PIG-3135_1.patch, PIG-3135.patch

Looking at this snippet:

{code}
private void init(Properties properties) throws ExecException {
    . . .
    // Check existence of hadoop-site.xml or core-site.xml
    Configuration testConf = new Configuration();
    ClassLoader cl = testConf.getClassLoader();
    URL hadoop_site = cl.getResource( HADOOP_SITE );
    URL core_site = cl.getResource( CORE_SITE );

    if( hadoop_site == null && core_site == null ) {
        throw new ExecException("Cannot find hadoop configurations in classpath (neither hadoop-site.xml nor core-site.xml was found in the classpath)." +
                " If you plan to use local mode, please put -x local option in command line",
                4010);
    }
{code}

This assumes the resources (*-site.xml) are on the classpath, but that will not always be the case when Pig is run through its Java APIs. One may want to set the resources programmatically, and the code here should additionally check whether they are available there.

Example: a Configuration object is created and resources are added before it is passed on to Pig.

{code}
Configuration conf = new Configuration(false);
conf.addResource("foo/core-site.xml");
conf.addResource("bar/hadoop-site.xml");
PigServer pServer = new PigServer(ExecType.MAPREDUCE, conf);
{code}

The above conf is not used right now to obtain resources.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
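The check requested in the issue description above could look roughly like the following. This is a minimal sketch that uses only JDK classes so it runs without Hadoop or Pig on the classpath. The class name `HadoopConfigCheck`, its method names, and the choice of `fs.default.name` and `mapred.job.tracker` as marker keys for "the user supplied Hadoop settings" are assumptions made for illustration; they are not taken from the Pig source or from the attached patches.

```java
import java.net.URL;
import java.util.Properties;

// Hypothetical helper, not part of Pig: accepts Hadoop settings from either
// the classpath or the user-passed Properties.
final class HadoopConfigCheck {
    static final String HADOOP_SITE = "hadoop-site.xml";
    static final String CORE_SITE = "core-site.xml";

    // Assumption: the presence of either key means the caller supplied
    // Hadoop settings programmatically.
    static final String[] MARKER_KEYS = {"fs.default.name", "mapred.job.tracker"};

    // Mirrors the existing classpath check: at least one site file is visible.
    static boolean onClasspath(ClassLoader cl) {
        URL hadoopSite = cl.getResource(HADOOP_SITE);
        URL coreSite = cl.getResource(CORE_SITE);
        return hadoopSite != null || coreSite != null;
    }

    // The additional check asked for in the issue: look in the user's Properties.
    static boolean inUserProperties(Properties props) {
        if (props == null) {
            return false;
        }
        for (String key : MARKER_KEYS) {
            if (props.getProperty(key) != null) {
                return true;
            }
        }
        return false;
    }

    // Only when this returns false would the "Cannot find hadoop
    // configurations" error be appropriate.
    static boolean hasHadoopConfig(ClassLoader cl, Properties props) {
        return onClasspath(cl) || inUserProperties(props);
    }

    public static void main(String[] args) {
        Properties userProps = new Properties();
        userProps.setProperty("fs.default.name", "hdfs://example:8020");
        ClassLoader cl = HadoopConfigCheck.class.getClassLoader();
        System.out.println("config available: " + hasHadoopConfig(cl, userProps));
    }
}
```

The point of the sketch is the ordering: the classpath lookup stays as the default, and the user-passed Properties are consulted before the error is raised.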
[jira] [Commented] (PIG-3135) HExecutionEngine should look for resources in user passed Properties
[ https://issues.apache.org/jira/browse/PIG-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585501#comment-13585501 ]

Cheolsoo Park commented on PIG-3135:

My bad. I forgot to add it. It is committed now: http://svn.apache.org/viewvc?view=revision&revision=1449553

Thanks!
[jira] [Commented] (PIG-3135) HExecutionEngine should look for resources in user passed Properties
[ https://issues.apache.org/jira/browse/PIG-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585503#comment-13585503 ]

Prashant Kommireddi commented on PIG-3135:

Thanks Cheolsoo.
[jira] [Commented] (PIG-3135) HExecutionEngine should look for resources in user passed Properties
[ https://issues.apache.org/jira/browse/PIG-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582366#comment-13582366 ]

Cheolsoo Park commented on PIG-3135:

[~prkommireddi], I totally agree with you. Can we fix PIG-3200 first then?
[jira] [Commented] (PIG-3135) HExecutionEngine should look for resources in user passed Properties
[ https://issues.apache.org/jira/browse/PIG-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581825#comment-13581825 ]

Prashant Kommireddi commented on PIG-3135:

Hey [~cheolsoo], I think the issue is that MiniCluster.buildCluster() generates a hadoop-site.xml file under build/classes, but MiniCluster.shutDown() does not delete it. Ideally, hadoop-site.xml should be deleted on cluster shutdown.

With respect to TestHExecutionEngine, the problem is that a previous test run that uses the mini-cluster generates hadoop-site.xml and does not delete it. By the time TestHExecutionEngine runs, this file is present on the classpath, but mini-dfs and mini-mr were shut down at the completion of the previous test. pigContext.connect() in TestHExecutionEngine then tries to establish a connection with a mini-cluster that is no longer up.

The fix would be to:
1. Delete hadoop-site.xml on mini-cluster shutdown (to be done in another JIRA)
2. Check if this patch stabilizes (and nothing else breaks)

Does that make sense?
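The cleanup proposed in step 1 of the comment above can be sketched as follows. This is a stand-in written against the JDK only, not the real MiniCluster class: `MiniClusterSketch`, `inTempDir`, and `confFile` are hypothetical names, and a temporary directory is used in place of build/classes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a test mini-cluster that writes a generated
// hadoop-site.xml on startup and removes it again on shutdown.
final class MiniClusterSketch {
    private final Path confFile;

    MiniClusterSketch(Path confDir) {
        this.confFile = confDir.resolve("hadoop-site.xml");
    }

    // Convenience factory; a temp directory plays the role of build/classes.
    static MiniClusterSketch inTempDir() {
        try {
            return new MiniClusterSketch(Files.createTempDirectory("minicluster-sketch"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Startup: write the generated configuration file.
    void buildCluster() {
        try {
            Files.write(confFile, "<configuration/>".getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Shutdown: delete the generated file so later tests do not pick up a
    // configuration that points at a cluster that is no longer running.
    void shutDown() {
        try {
            Files.deleteIfExists(confFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Path confFile() {
        return confFile;
    }

    public static void main(String[] args) {
        MiniClusterSketch cluster = inTempDir();
        cluster.buildCluster();
        System.out.println("after build:    " + Files.exists(cluster.confFile()));
        cluster.shutDown();
        System.out.println("after shutdown: " + Files.exists(cluster.confFile()));
    }
}
```

Using `Files.deleteIfExists` keeps shutdown safe to call more than once, which matters if several tests share teardown code.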
[jira] [Commented] (PIG-3135) HExecutionEngine should look for resources in user passed Properties
[ https://issues.apache.org/jira/browse/PIG-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575580#comment-13575580 ]

Prashant Kommireddi commented on PIG-3135:

Sure, I will take a look.
[jira] [Commented] (PIG-3135) HExecutionEngine should look for resources in user passed Properties
[ https://issues.apache.org/jira/browse/PIG-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570739#comment-13570739 ]

Prashant Kommireddi commented on PIG-3135:

Thanks for the review and commit, Cheolsoo.
[jira] [Commented] (PIG-3135) HExecutionEngine should look for resources in user passed Properties
[ https://issues.apache.org/jira/browse/PIG-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570012#comment-13570012 ]

Cheolsoo Park commented on PIG-3135:

+1. I will commit it after running tests.
[jira] [Commented] (PIG-3135) HExecutionEngine should look for resources in user passed Properties
[ https://issues.apache.org/jira/browse/PIG-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566853#comment-13566853 ]

Cheolsoo Park commented on PIG-3135:

[~prkommireddi] Overall looks good to me. I have 3 questions.

* Can we call the property pig.use.overriden.hadoop.configs? What we're overriding here are basically Hadoop confs, so I think that a more specific name is better. Do you agree?
* On a related note, can you update the following comment in testJobConfGeneration? From:
{code}
// This should fail as pig expects classpath to be set
{code}
To something like:
{code}
// This should fail as pig expects Hadoop configs are present in classpath.
{code}
* Can you add this new property to conf/pig.properties with some explanation, so people can know about it? It would be nice if we could mention that this is an MR-mode-specific property too. Please let me know if you have a better suggestion regarding how to document this.

Thanks!
[jira] [Commented] (PIG-3135) HExecutionEngine should look for resources in user passed Properties
[ https://issues.apache.org/jira/browse/PIG-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13567030#comment-13567030 ]

Prashant Kommireddi commented on PIG-3135:

Thanks for the quick review. Your comments make sense. I will make the above changes and upload a new patch.
[jira] [Commented] (PIG-3135) HExecutionEngine should look for resources in user passed Properties
[ https://issues.apache.org/jira/browse/PIG-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561272#comment-13561272 ]

Prashant Kommireddi commented on PIG-3135:

To get around the current limitation of depending on *-site.xml files being on the classpath, we could add a property, pig.use.override.configs, that lets you provide your own configs and have JobConf built from them.

Currently:
{code}
jc = new JobConf();
{code}

Proposal:
1. If pig.use.override.configs is present, generate JobConf using properties
{code}
jc = new JobConf(ConfigurationUtil.toConfiguration(properties));
{code}

This change would be backward compatible, and those who wish to bypass the classpath limitation can do so by setting the override property. Thoughts?
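The branching in the proposal above can be illustrated without Hadoop on the classpath. In this minimal sketch a plain `java.util.Properties` object stands in for JobConf, and `JobConfSourceSketch` and its methods are hypothetical names. It also assumes the property is read as a boolean flag, whereas the comment only says "if present", so treat that detail as an assumption rather than the behavior of the patch.

```java
import java.util.Properties;

// Hypothetical illustration of the proposed override switch. Properties
// stands in for JobConf so the example needs nothing beyond the JDK.
final class JobConfSourceSketch {
    // Property name as first proposed in the thread.
    static final String OVERRIDE_KEY = "pig.use.override.configs";

    // Assumption: the property is treated as a boolean flag, default false.
    static boolean useOverride(Properties userProps) {
        return Boolean.parseBoolean(userProps.getProperty(OVERRIDE_KEY, "false"));
    }

    // Without the flag, settings come from the classpath defaults (today's
    // behavior); with the flag, they come from the user-passed Properties.
    static Properties settingsForJobConf(Properties classpathDefaults, Properties userProps) {
        Properties result = new Properties();
        result.putAll(useOverride(userProps) ? userProps : classpathDefaults);
        return result;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("fs.default.name", "file:///");

        Properties user = new Properties();
        user.setProperty("fs.default.name", "hdfs://example:8020");
        user.setProperty(OVERRIDE_KEY, "true");

        System.out.println(settingsForJobConf(defaults, user).getProperty("fs.default.name"));
    }
}
```

Because the default is false, callers who never set the property keep the existing classpath-based behavior, which is the backward-compatibility argument made in the comment.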