[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
[ https://issues.apache.org/jira/browse/HADOOP-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HADOOP-12325:
-------------------------------
       Resolution: Fixed
    Fix Version/s: 2.7.4
           Status: Resolved  (was: Patch Available)

I verified test failures and pushed to branch-2.7.

> RPC Metrics : Add the ability track and log slow RPCs
> -----------------------------------------------------
>
>                 Key: HADOOP-12325
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12325
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ipc, metrics
>    Affects Versions: 2.7.1
>            Reporter: Anu Engineer
>            Assignee: Anu Engineer
>             Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1
>
>         Attachments: Callers of WritableRpcEngine.call.png,
> HADOOP-12325-branch-2.7.00.patch, HADOOP-12325.001.patch,
> HADOOP-12325.002.patch, HADOOP-12325.003.patch, HADOOP-12325.004.patch,
> HADOOP-12325.005.patch, HADOOP-12325.005.test.patch, HADOOP-12325.006.patch
>
>
> This JIRA proposes to add a counter called RpcSlowCalls and also a
> configuration setting that allows users to log really slow RPCs. Slow RPCs
> are RPCs that fall at the 99th percentile. This is useful for troubleshooting
> why certain services, such as the NameNode, freeze under heavy load.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Zhe Zhang updated HADOOP-12325:
-------------------------------
    Status: Patch Available  (was: Reopened)
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Zhe Zhang updated HADOOP-12325:
-------------------------------
    Attachment: HADOOP-12325-branch-2.7.00.patch
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Anu Engineer updated HADOOP-12325:
----------------------------------
    Attachment: HADOOP-12325.006.patch

[~ajisakaa] Thanks for your review and for the changes to the test file. Please see my comments below.

bq. 1. Would you add a whitespace before took in the log message?
Fixed.

bq. 2. After running the regression test locally, I can't see any logs about sleep RPC.
On my machine, if I open the file org.apache.hadoop.ipc.TestProtoBufRpc-output.txt in the surefire-reports directory, I can see the following line:
{code}
2015-08-24 10:52:16,713 WARN ipc.Server (Server.java:logSlowRpcCalls(438)) - Slow RPC : sleep took 3004 milliseconds to process from client 10.0.1.35:57223
{code}

bq. Attaching a patch to verify that the slow call is logged. Now the test fails.
With the new call
{code}
long after = getLongCounter("RpcSlowCalls", rpcMetrics);
{code}
the mocking layer somehow still returns the old snapshotted value. I have modified the tests to call the server layer directly, and they now behave as expected.
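When checking a test output file for the warning quoted above, the fields of the log line can be pulled out with a regular expression. The following is a minimal sketch that assumes the message format is exactly the one shown in this message; the class SlowRpcLogLine and its members are hypothetical names for illustration and are not part of Hadoop.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not a Hadoop class. */
class SlowRpcLogLine {
    // Assumes the message layout quoted in the thread:
    // "Slow RPC : <method> took <n> milliseconds to process from client <host:port>"
    private static final Pattern SLOW_RPC = Pattern.compile(
        "Slow RPC : (\\S+) took (\\d+) milliseconds to process from client (\\S+)");

    final String method;
    final long millis;
    final String client;

    private SlowRpcLogLine(String method, long millis, String client) {
        this.method = method;
        this.millis = millis;
        this.client = client;
    }

    /** Returns the parsed fields, or null if the line is not a slow-RPC warning. */
    static SlowRpcLogLine parse(String line) {
        Matcher m = SLOW_RPC.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new SlowRpcLogLine(m.group(1), Long.parseLong(m.group(2)), m.group(3));
    }

    public static void main(String[] args) {
        String line = "2015-08-24 10:52:16,713 WARN ipc.Server "
            + "(Server.java:logSlowRpcCalls(438)) - Slow RPC : sleep took 3004 "
            + "milliseconds to process from client 10.0.1.35:57223";
        SlowRpcLogLine parsed = parse(line);
        System.out.println(parsed.method + " " + parsed.millis + " " + parsed.client);
    }
}
```

A test could use such a helper to assert that at least one line in the captured output parses and names the expected method, instead of relying on a manual inspection of the report file.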
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Akira AJISAKA updated HADOOP-12325:
-----------------------------------
    Attachment: HADOOP-12325.005.test.patch

Attaching a patch to verify that the slow call is logged. Now the test fails.
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Xiaoyu Yao updated HADOOP-12325:
--------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 2.8.0
           Status: Resolved  (was: Patch Available)

Thanks [~anu] for the contribution and [~ajisakaa] for the review. I've committed the change to trunk and branch-2.
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Jitendra Nath Pandey updated HADOOP-12325:
------------------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Jitendra Nath Pandey updated HADOOP-12325:
------------------------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Anu Engineer updated HADOOP-12325:
----------------------------------
    Attachment: HADOOP-12325.003.patch

Fixed the Checkstyle issues. Two issues still remain; please ignore them:
* {{Server.java}} - File is too long
* Variable 'logSlowRPC' must be private - it is a Hadoop metric and follows the general pattern in the file.
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Xiaoyu Yao updated HADOOP-12325:
--------------------------------
    Attachment: Callers of WritableRpcEngine.call.png
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Anu Engineer updated HADOOP-12325:
----------------------------------
    Attachment: HADOOP-12325.004.patch

Fix Javadoc.
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Anu Engineer updated HADOOP-12325:
----------------------------------
    Attachment: HADOOP-12325.005.patch

Also support slow RPC logging for WritableRpcEngine.
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Anu Engineer updated HADOOP-12325:
----------------------------------
    Attachment: HADOOP-12325.002.patch

Thanks for the detailed review, [~xyao]. I have attached a revised patch. Please see my detailed comments below.

bq. 1. Do you miss updating all the caller of ProtobufRpcEngine.call() to pass receiveTime using Time.monotonicNow() instead of the Time.now()?
Fixed. I have reverted to using Time.now() in this patch.

bq. 2. Do we need update WritableRpcEngine.java class with logSlowRpcCalls()?
I could not find any place where WritableRpcEngine is actually used, so I did not make that change.

bq. 3. NIT: Can you put the magic number 1024 as final variable like
Fixed.

bq. 4. Can you change the following from
Fixed.

bq. 5. NIT: Rpc - RPC to be consistent
Fixed.

bq. 6. Can you make the SleepRequestProto accepting a duration parameter instead of the fixed SLEEP_DURATION (1000ms)?
Done.

bq. 7. Is it possible to test with 1K fast calls instead of 10K calls to save test resources without affecting the results?
I benchmarked these calls, and even with 10K calls the run time is in milliseconds. I make 10K calls so that the test exercises the computation and works from a statistically meaningful sample.
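Review item 1 above turns on which clock is used to measure how long a call took. As a rough illustration of the difference, the sketch below times a piece of work with a monotonic clock and also exposes a wall-clock reading. It uses only plain JDK calls (System.nanoTime and System.currentTimeMillis) as stand-ins; the class ProcessingTimer is a hypothetical name and is not Hadoop's Time utility.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only; not Hadoop's Time class. */
class ProcessingTimer {
    /** Wall-clock milliseconds; may jump if the system clock is adjusted. */
    static long wallClockNow() {
        return System.currentTimeMillis();
    }

    /** Monotonic milliseconds; only differences between two readings are meaningful. */
    static long monotonicNow() {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
    }

    /** Runs the work and returns its elapsed time in milliseconds. */
    static long timeMillis(Runnable work) {
        long start = monotonicNow();
        work.run();
        return monotonicNow() - start;
    }

    public static void main(String[] args) {
        long elapsed = timeMillis(() -> {
            try {
                Thread.sleep(50); // stand-in for handling an RPC
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("processing took about " + elapsed + " ms");
    }
}
```

Whichever clock is chosen, the start and end readings of one measurement need to come from the same clock; mixing a wall-clock receive time with a monotonic end time would produce meaningless durations.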
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Anu Engineer updated HADOOP-12325:
----------------------------------
    Attachment: HADOOP-12325.001.patch

This patch adds:
* A metric called RpcSlowCalls
* The ability to log slow calls if ipc.server.log.slow.rpc is set to true
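The two additions listed in this message, a slow-call counter and optional logging behind a flag, can be sketched in a self-contained way as follows. This is not the patch itself: the class SlowRpcSketch is hypothetical, the cutoff of mean plus three standard deviations is an assumption used here to approximate "really slow", and the minimum sample size of 1024 is borrowed from the constant mentioned in the review discussion.

```java
import java.util.logging.Logger;

/** Hypothetical stand-in for the slow-RPC bookkeeping; not Hadoop's Server or RpcMetrics. */
class SlowRpcSketch {
    private static final Logger LOG = Logger.getLogger("SlowRpcSketch");

    // Assumed values for this sketch, not confirmed by the thread.
    static final int MIN_SAMPLE_SIZE = 1024;
    static final int DEVIATION = 3;

    private final boolean logSlowRpc; // plays the role of ipc.server.log.slow.rpc
    private long count;               // number of samples seen so far
    private double mean;              // running mean of processing time
    private double m2;                // running sum of squared deviations
    private long slowCalls;           // plays the role of the RpcSlowCalls counter

    SlowRpcSketch(boolean logSlowRpc) {
        this.logSlowRpc = logSlowRpc;
    }

    /** Records one call and reports whether it was counted as slow. */
    boolean record(String method, long processingMillis) {
        boolean slow = false;
        // Only judge calls once enough samples exist for stable statistics.
        if (count > MIN_SAMPLE_SIZE) {
            double stdDev = Math.sqrt(m2 / (count - 1));
            double threshold = mean + DEVIATION * stdDev;
            if (processingMillis > threshold) {
                slow = true;
                slowCalls++;
                if (logSlowRpc) {
                    LOG.warning("Slow RPC : " + method + " took "
                        + processingMillis + " milliseconds to process");
                }
            }
        }
        // Welford's online update of mean and variance.
        count++;
        double delta = processingMillis - mean;
        mean += delta / count;
        m2 += delta * (processingMillis - mean);
        return slow;
    }

    long getSlowCalls() {
        return slowCalls;
    }

    public static void main(String[] args) {
        SlowRpcSketch tracker = new SlowRpcSketch(true);
        for (int i = 0; i < 10_000; i++) {
            tracker.record("ping", 1 + (i % 3)); // fast calls of 1 to 3 ms
        }
        tracker.record("sleep", 3000);           // one clearly slow call
        System.out.println("slow calls: " + tracker.getSlowCalls());
    }
}
```

Keeping the counter increment independent of the logging flag means the metric stays available even when logging is switched off, which matches the split between the two bullet points above.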
[jira] [Updated] (HADOOP-12325) RPC Metrics : Add the ability track and log slow RPCs
Anu Engineer updated HADOOP-12325:
----------------------------------
    Status: Patch Available  (was: Open)