[ https://issues.apache.org/jira/browse/KNOX-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sandeep More updated KNOX-1436:
-------------------------------
Fix Version/s: 1.2.0
> AbstractHdfsHaDispatch failoverRequest - Improve Failover Logging
> -----------------------------------------------------------------
>
> Key: KNOX-1436
> URL: https://issues.apache.org/jira/browse/KNOX-1436
> Project: Apache Knox
> Issue Type: Bug
> Reporter: Matthew Sharp
> Priority: Minor
> Fix For: 1.2.0
>
> Attachments: KNOX-1436.patch
>
>
> The log output of the current WebHDFS failoverRequest method makes it
> difficult to tell which host a request failed on and which host will be
> retried next.
> Example:
> {code:java}
> 2018-09-06 07:49:07,245 INFO knox.gateway
> (AbstractHdfsHaDispatch.java:executeRequest(85)) - Received an error from a
> node in SafeMode: org.apache.knox.gateway.hdfs.dispatch.SafeModeException
> 2018-09-06 07:49:07,246 INFO knox.gateway
> (AbstractHdfsHaDispatch.java:failoverRequest(115)) - Failing over request to
> a different server:
> http://host1.example.com:50070/webhdfs/v1/user/matt/test.txt?op=CREATE&doAs=matt
> 2018-09-06 07:49:08,278 INFO knox.gateway
> (AbstractHdfsHaDispatch.java:executeRequest(82)) - Received an error from a
> node in Standby: org.apache.knox.gateway.hdfs.dispatch.StandbyException
> 2018-09-06 07:49:08,279 INFO knox.gateway
> (AbstractHdfsHaDispatch.java:failoverRequest(115)) - Failing over request to
> a different server:
> http://host2.example.com:50070/webhdfs/v1/user/matt/test.txt?op=CREATE&doAs=matt
> 2018-09-06 07:49:09,291 INFO knox.gateway
> (AbstractHdfsHaDispatch.java:executeRequest(85)) - Received an error from a
> node in SafeMode: org.apache.knox.gateway.hdfs.dispatch.SafeModeException
> 2018-09-06 07:49:09,291 INFO knox.gateway
> (AbstractHdfsHaDispatch.java:failoverRequest(115)) - Failing over request to
> a different server:
> http://host1.example.com:50070/webhdfs/v1/user/matt/test.txt?op=CREATE&doAs=matt
> 2018-09-06 07:49:10,366 INFO knox.gateway
> (AbstractHdfsHaDispatch.java:executeRequest(82)) - Received an error from a
> node in Standby: org.apache.knox.gateway.hdfs.dispatch.StandbyException
> 2018-09-06 07:49:10,367 INFO knox.gateway
> (AbstractHdfsHaDispatch.java:failoverRequest(115)) - Failing over request to
> a different server:
> http://host2.example.com:50070/webhdfs/v1/user/matt/test.txt?op=CREATE&doAs=matt
> 2018-09-06 07:49:10,368 INFO knox.gateway
> (AbstractHdfsHaDispatch.java:failoverRequest(136)) - Maximum attempts 3 to
> failover reached for service: WEBHDFS
> {code}
> In the example above, host1.example.com has already failed, yet the message
> says the request is failing over to a different server while still printing
> the host1.example.com URL.
> Suggestion:
> The HaDispatchMessages failingOverRequest log call should be moved below the
> markFailedURL call, so that it logs the next URI the request will fail over
> to, not the URI that has just failed.
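
The reordering suggested above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the Knox source: UrlRotation, its rotation policy, and failoverMessage are hypothetical stand-ins for the HA provider and the dispatch logic. Only the markFailedURL name and the log text are taken from the report.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Illustrative sketch only; every type here is a hypothetical stand-in, not Knox code.
class FailoverLoggingSketch {

    // Hypothetical HA URL holder: the head of the deque is the active URL.
    static class UrlRotation {
        private final Deque<String> urls = new ArrayDeque<>();

        UrlRotation(List<String> configured) {
            urls.addAll(configured);
        }

        String activeUrl() {
            return urls.peekFirst();
        }

        // Assumed policy: a failed URL moves to the back, so the next one becomes active.
        void markFailedURL(String failed) {
            if (urls.remove(failed)) {
                urls.addLast(failed);
            }
        }
    }

    // Returns the text that would be logged for one failover step.
    static String failoverMessage(UrlRotation rotation, boolean logAfterMark) {
        String prefix = "Failing over request to a different server: ";
        String failed = rotation.activeUrl();
        if (!logAfterMark) {
            // Reported behavior: the message is built before rotation and names the failed URL.
            String message = prefix + failed;
            rotation.markFailedURL(failed);
            return message;
        }
        // Suggested behavior: rotate first, then report the URL that will be tried next.
        rotation.markFailedURL(failed);
        return prefix + rotation.activeUrl();
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList(
                "http://host1.example.com:50070", "http://host2.example.com:50070");
        System.out.println(failoverMessage(new UrlRotation(hosts), false));
        System.out.println(failoverMessage(new UrlRotation(hosts), true));
    }
}
```

Under these assumptions, the first call reproduces the confusing message (it names the host that just failed), while the second names the host that will actually be tried next.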