[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792295#action_12792295 ]

Olga Natkovich commented on PIG-948:

+1. Changes look good.

[Usability] Relating pig script with MR jobs
Key: PIG-948
URL: https://issues.apache.org/jira/browse/PIG-948
Project: Pig
Issue Type: Improvement
Components: impl
Affects Versions: 0.4.0
Reporter: Ashutosh Chauhan
Assignee: Daniel Dai
Priority: Minor
Fix For: 0.7.0
Attachments: pig-948-2.patch, pig-948-3.patch, PIG-948-4.patch, PIG-948-5.patch, PIG-948-6.patch, pig-948.patch

Currently it's hard to find a way to relate a Pig script to a specific MR job. In a loaded cluster with multiple simultaneous job submissions, it's not easy to figure out which specific MR jobs were launched for a given Pig script. If Pig can provide this info, it will be useful for debugging and monitoring the jobs resulting from a Pig script.

At the very least, Pig should be able to provide the user with the following information:
1) Job ID of the launched job.
2) Complete web URL of the jobtracker running this job.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792340#action_12792340 ]

Hadoop QA commented on PIG-948:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12428337/PIG-948-6.patch
against trunk revision 891499.

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 1 new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/138/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/138/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/138/console

This message is automatically generated.

Affects Versions: 0.4.0
Assignee: Daniel Dai
Fix For: 0.7.0
Attachments: pig-948-2.patch, pig-948-3.patch, PIG-948-4.patch, PIG-948-5.patch, PIG-948-6.patch, pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791299#action_12791299 ]

Hadoop QA commented on PIG-948:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12428111/PIG-948-5.patch
against trunk revision 890596.

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/128/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/128/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/128/console

This message is automatically generated.

Affects Versions: 0.4.0
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-948-2.patch, pig-948-3.patch, PIG-948-4.patch, PIG-948-5.patch, pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763593#action_12763593 ]

Olga Natkovich commented on PIG-948:

+1, please commit.

Affects Versions: 0.4.0
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-948-2.patch, pig-948-3.patch, PIG-948-4.patch, pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763169#action_12763169 ]

Ashutosh Chauhan commented on PIG-948:

+1. The change looks good. It should be log.info instead of log.error. In local Hadoop mode, since everything runs in one Java process, there is no job tracker port address to get.

Affects Versions: 0.4.0
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-948-2.patch, pig-948-3.patch, PIG-948-4.patch, pig-948.patch
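The two review points above (report at info level, and no job tracker address in local mode) can be sketched as follows. This is only an illustration, not the code from the patch: the class name, the method names, the message wording, and the convention that local mode shows up as a job tracker setting of "local" are all assumptions made for the example, and java.util.logging stands in for whatever logger Pig uses.

```java
import java.util.logging.Logger;

/** Hypothetical reporter; class and method names are illustrative, not Pig's. */
final class JobLocationReporter {
    private static final Logger LOG = Logger.getLogger(JobLocationReporter.class.getName());

    private JobLocationReporter() {}

    /** Builds the user-facing message; the URL is omitted when no job tracker exists. */
    static String message(String jobId, String jobTracker, String jobUrl) {
        // Assumption: local mode is signalled by a job tracker setting of "local" (or none).
        boolean localMode = jobTracker == null || "local".equals(jobTracker);
        if (localMode) {
            return "HadoopJobId: " + jobId;
        }
        return "HadoopJobId: " + jobId + ", more information at: " + jobUrl;
    }

    /** Progress reporting is informational, so it is logged at INFO rather than as an error. */
    static void report(String jobId, String jobTracker, String jobUrl) {
        LOG.info(message(jobId, jobTracker, jobUrl));
    }
}
```

Keeping the message construction separate from the logging call makes the local-mode branch easy to check without a cluster.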
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762861#action_12762861 ]

Hadoop QA commented on PIG-948:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12421472/PIG-948-4.patch
against trunk revision 822382.

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/62/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/62/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/62/console

This message is automatically generated.

Affects Versions: 0.4.0
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-948-2.patch, pig-948-3.patch, PIG-948-4.patch, pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760909#action_12760909 ]

Ashutosh Chauhan commented on PIG-948:

+1 for the patch.

Affects Versions: 0.4.0
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-948-2.patch, pig-948-3.patch, pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760981#action_12760981 ]

Pradeep Kamath commented on PIG-948:

+1

Affects Versions: 0.4.0
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-948-2.patch, pig-948-3.patch, pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760986#action_12760986 ]

Daniel Dai commented on PIG-948:

Patch committed. Thanks Ashutosh for contributing!

Affects Versions: 0.4.0
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-948-2.patch, pig-948-3.patch, pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760749#action_12760749 ]

Daniel Dai commented on PIG-948:

Since most comments are in favor of this change, I am going to commit this patch, including the URL construction part. I will change sleepTime from 500 to 1000. In all the cases I have experimented with, I can get the correct job ID after 1000 ms.

Affects Versions: 0.4.0
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-948-2.patch, pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760808#action_12760808 ]

Hadoop QA commented on PIG-948:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12420851/pig-948-3.patch
against trunk revision 820111.

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The applied patch generated 405 javac compiler warnings (more than the trunk's current 403 warnings).
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/11/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/11/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/11/console

This message is automatically generated.

Affects Versions: 0.4.0
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-948-2.patch, pig-948-3.patch, pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760121#action_12760121 ]

Ashutosh Chauhan commented on PIG-948:

@Daniel
bq. Also, I notice that in many cases we cannot get the first job ID correctly (the job ID is null in this case). If I change sleepTime (MapReduceLauncher.java:100) from 500 to 1000 (ms), things look fine. Does anyone else see that?

The reason is that JobControlCompiler compiles a set of inter-dependent MR jobs and generates a job-control object, which is then submitted asynchronously to Hadoop for execution. Since we don't block on that thread, it's possible that job IDs are not yet assigned when we ask for them. Setting the sleep time to a higher value such as 1000 ms should be sufficient for most cases. Note that increasing this sleep time doesn't affect execution in any way, since we are sleeping in a thread that only does reporting.

Another approach, foolproof though more complicated, is to sleep for a shorter duration, then check whether the ID is assigned, and if not, sleep again in a while loop until the IDs are assigned.

Affects Versions: 0.4.0
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-948-2.patch, pig-948.patch
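The poll-until-assigned alternative described in this comment might look roughly like the sketch below. It is not taken from any of the attached patches: JobIdPoller, awaitJobId, and the Supplier-based lookup are hypothetical names introduced for illustration, and the supplier stands in for whatever call returns the assigned job ID (or null while the job is still being submitted). A timeout is added so that the reporting thread cannot loop forever.

```java
import java.util.function.Supplier;

/** Hypothetical helper: polls for an ID that is assigned asynchronously. */
final class JobIdPoller {
    private JobIdPoller() {}

    /**
     * Asks idSource for the job ID every pollMillis until it returns a non-null value.
     * Returns null if timeoutMillis elapses first or the thread is interrupted.
     */
    static String awaitJobId(Supplier<String> idSource, long pollMillis, long timeoutMillis) {
        final long start = System.nanoTime();
        final long timeoutNanos = timeoutMillis * 1_000_000L;
        while (true) {
            String id = idSource.get();
            if (id != null) {
                return id; // the ID has been assigned
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return null; // give up; the caller can report that the ID is unknown
            }
            try {
                Thread.sleep(pollMillis); // short sleep, then check again
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return null;
            }
        }
    }
}
```

Compared with a single fixed sleep of 1000 ms, this returns as soon as the ID is available and tolerates slower clusters, at the cost of a little more code in the reporting thread.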
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759926#action_12759926 ]

Kevin Weil commented on PIG-948:

FWIW, I'd +1 Dmitriy's comment. Yes, it's a shame this isn't programmatically available via Hadoop, but come on. It's a single-line string concatenation. And it's FAR more convenient to print out the full URL than to expect people to memorize the jobtracker URL and sub in a parameter. One of these options is strictly correct, and the other has the overhead of a single line of code and is far more convenient to the end user :)

Affects Versions: 0.4.0
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-948-2.patch, pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757960#action_12757960 ]

Daniel Dai commented on PIG-948:

I am going to commit the job ID part of the patch first, since it is a very useful feature. I will leave the issue open until we reach consensus on the rest.

Assignee: Ashutosh Chauhan
Attachments: pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758060#action_12758060 ]

Hadoop QA commented on PIG-948:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12420207/pig-948-2.patch
against trunk revision 817319.

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The applied patch generated 405 javac compiler warnings (more than the trunk's current 403 warnings).
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/41/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/41/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/41/console

This message is automatically generated.

Affects Versions: 0.4.0
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-948-2.patch, pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758066#action_12758066 ]

Daniel Dai commented on PIG-948:

The javac warnings are due to the following two statements:

line 33: import org.apache.hadoop.mapred.JobConf;
line 142: JobConf jConf = job.getJobConf();

JobConf is marked deprecated. However, we are in the process of moving to the new Hadoop API, so we do not aim to fix it in this patch.

As for tests, it is hard to add a unit test for this. I tested it manually and it works.

Affects Versions: 0.4.0
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-948-2.patch, pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758085#action_12758085 ]

Daniel Dai commented on PIG-948:

Also, I notice that in many cases we cannot get the first job ID correctly (the job ID is null in this case). If I change sleepTime (MapReduceLauncher.java:100) from 500 to 1000 (ms), things look fine. Does anyone else see that?

Affects Versions: 0.4.0
Assignee: Ashutosh Chauhan
Fix For: 0.6.0
Attachments: pig-948-2.patch, pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757310#action_12757310 ]

Dmitriy V. Ryaboy commented on PIG-948:

I don't see a problem with URL construction in Pig code. If Hadoop exposed this, then sure, it would be better to use such a feature. Since Hadoop does not expose it (afaik), it's more useful for the end user to have this URL than to have a job ID.

Maintenance on this piece of code is minimal -- after all, it's just a simple string concatenation we are talking about. If Hadoop changes how this URL is constructed, it will take about 3 minutes to fix, 2.5 of which will be spent opening a Jira ticket. In the meantime, users will have a more usable product than they would without this one line of code.

Assignee: Ashutosh Chauhan
Attachments: pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757330#action_12757330 ]

Daniel Dai commented on PIG-948:

If Hadoop exposed it, there would be no problem including it. But I don't think Hadoop exposes this construction. I agree it is minimal to maintain and useful to many users. My concern is putting undocumented Hadoop features into Pig code. I do not object to that, but I feel I need more input because we would be breaking a convention. How do other developers feel?

Assignee: Ashutosh Chauhan
Attachments: pig-948.patch
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757587#action_12757587 ] Pradeep Kamath commented on PIG-948: In my opinion, pig should only print the job id. Users of pig would most likely already be using the hadoop job UI and should be able to track down the job given the job id. Giving the job id, I think, addresses the original issue of relating a pig script to the corresponding MR jobs. While constructing the url is not complicated, embedding it in pig code seems ugly, since we will most likely not track changes to this url until a user notices it is broken. Giving just the job id is useful in itself, I think.
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753173#action_12753173 ] Daniel Dai commented on PIG-948: One thing I am not sure about is the way you build the job tracker url: {code} "http://" + jobTrackerAdd + port + "/jobdetails.jsp?jobid=" + job.getAssignedJobID(); {code} I am not sure we should have this logic in pig; it looks hacky to me. The other parts are good.
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753260#action_12753260 ] Ashutosh Chauhan commented on PIG-948: In this string, we determine the job-tracker address, port number and job id through APIs, so that's fine. I agree that hardcoding the other parts of the url ( jobdetails.jsp?jobid= ) is not the best way to do it, as the link will break if that web url changes in later hadoop releases. But since there is no way to get that url programmatically, I went ahead with this. If there is a way to get that url programmatically, let me know. If not, I think it's useful enough to have it like this and update it if it changes in later hadoop releases.
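For readers following the thread, the trade-off under discussion can be sketched roughly as below. This is only an illustration, not the code from pig-948.patch: the class name JobUrlSketch, the method jobDetailsUrl, and the host, port and job id values are all hypothetical, and the real patch would obtain those three values from Hadoop APIs rather than take them as plain arguments. The one piece taken from the thread is the hardcoded path jobdetails.jsp?jobid=, kept in a single constant so that a change in a later hadoop release would need a one-line update.

```java
// Illustrative sketch only; JobUrlSketch and jobDetailsUrl are hypothetical
// names and do not come from the attached patch.
final class JobUrlSketch {

    // The only hardcoded piece: the jobtracker web UI page named in this thread.
    // If a later hadoop release renames it, this is the one place to update.
    private static final String JOB_DETAILS_PATH = "/jobdetails.jsp?jobid=";

    private JobUrlSketch() {}

    // Builds the job details link from values that, in Pig, would be
    // obtained through Hadoop APIs (assumed here to be passed in directly).
    static String jobDetailsUrl(String jobTrackerHost, int httpPort, String jobId) {
        return "http://" + jobTrackerHost + ":" + httpPort + JOB_DETAILS_PATH + jobId;
    }

    public static void main(String[] args) {
        // Made-up host, port and job id, for demonstration only.
        System.out.println(
            jobDetailsUrl("jobtracker.example.com", 50030, "job_200909101234_0001"));
    }
}
```

Printing the job id next to the url, rather than the url alone, would keep the output useful even if the hardcoded path ever went stale, which is the concern raised in the comments above.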