[
https://issues.apache.org/jira/browse/HDFS-13399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464160#comment-16464160
]
Plamen Jeliazkov edited comment on HDFS-13399 at 5/4/18 5:21 PM:
-----------------------------------------------------------------
Yes, I propose removing it from {{DFSClient}}. For now I will create a new
{{ProxyProvider}} that is used only in tests and makes use of
{{AlignmentContext}}. I will be able to retrieve the {{AlignmentContext}}
because my tests have access to the provider instance, similar to how others
created their own {{RpcEngine}} implementations within unit tests. This should
be enough to showcase the stateId transfer. We can remove my class, if we want,
once {{StandbyReadsProxyProvider}} is working.
Regarding the transactionId issue, let me clarify what I am talking about:
Imagine a fresh HA-enabled DFS is initialized at transactionId 0. A client
connects and makes a single directory. We should now expect to be at
transactionId 1, and expect the client to have received a stateId of 1 in the
RPC response header. However, this is not the case, because HA-enabled
NameNodes use {{FSEditLogAsync}}, which updates the txid field (the field we
rely on in {{FSNamesystem.getLastWrittenTransactionId}}) asynchronously from
the client call. As a result, the client receives a stateId of 0, not 1, in the
RPC response header. This is clearly incorrect: we do not want a client to
connect to a NameNode that is behind in state.
This is just a race condition, but it has already appeared in my unit tests.
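To make the ordering concrete, here is a minimal, single-threaded sketch of the
race described above. It is not Hadoop code: {{AsyncEditLogSketch}} and all of
its methods are hypothetical stand-ins, and the background writer is modeled by
an explicit {{drainOne()}} call so that the interleaving is deterministic.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical stand-in for an asynchronous edit log; not Hadoop code. */
class AsyncEditLogSketch {
  private long lastWrittenTxId = 0;
  private final Queue<String> pending = new ArrayDeque<>();

  /** The RPC handler only enqueues the edit and returns immediately. */
  void logEdit(String op) {
    pending.add(op);
  }

  /** Models the background writer: the txid advances only when it runs. */
  void drainOne() {
    if (pending.poll() != null) {
      lastWrittenTxId++;
    }
  }

  long getLastWrittenTxId() {
    return lastWrittenTxId;
  }

  /** Returns {stateId placed in the response, txid after the writer ran}. */
  static long[] demo() {
    AsyncEditLogSketch log = new AsyncEditLogSketch();
    log.logEdit("mkdir /a");
    // The response header is built before the writer has processed the edit.
    long stateIdInResponse = log.getLastWrittenTxId();
    log.drainOne();
    long txIdAfterDrain = log.getLastWrittenTxId();
    return new long[] {stateIdInResponse, txIdAfterDrain};
  }
}
```

In this model the response carries a stateId of 0 even though the edit lands at
txid 1 once the writer runs, which matches the symptom seen in the unit tests.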
One idea is to modify {{FSEditLogAsync}} like so:
{code:java}
@Override
long getLastWrittenTxIdWithoutLock() {
  // Count edits that are still queued (pending or awaiting sync) as written.
  return super.getLastWrittenTxIdWithoutLock() + editPendingQ.size() +
      syncWaitQ.size();
}
{code}
However, I am unsure whether this would be correct or safe to do. Input from
others would be appreciated.
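One possible concern can be sketched in a simplified model. The class below is
hypothetical and is not the real {{FSEditLogAsync}}; it only reuses the queue
names for readability and assumes an edit moves from the pending queue to the
sync-wait queue before the txid advances. Whether the torn read shown here can
actually occur depends on the locking in the real class, which this sketch does
not model.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of the queue-adjusted txid idea; not Hadoop code. */
class QueueAdjustedTxIdSketch {
  private long lastWrittenTxId = 0;
  private final Deque<String> editPendingQ = new ArrayDeque<>();
  private final Deque<String> syncWaitQ = new ArrayDeque<>();

  void enqueue(String op) {
    editPendingQ.add(op);
  }

  /** Assumed step: an edit leaves the pending queue and waits for sync. */
  void moveToSyncWait() {
    String edit = editPendingQ.poll();
    if (edit != null) {
      syncWaitQ.add(edit);
    }
  }

  /** The proposed adjustment, read while nothing else is changing. */
  long adjustedTxId() {
    return lastWrittenTxId + editPendingQ.size() + syncWaitQ.size();
  }

  /** Returns {quiescent read, read torn by a move between the two sizes}. */
  static long[] demo() {
    QueueAdjustedTxIdSketch log = new QueueAdjustedTxIdSketch();
    log.enqueue("mkdir /a");
    long quiescent = log.adjustedTxId();

    // Same sum, but the edit changes queues between the two size() reads.
    long torn = log.lastWrittenTxId + log.editPendingQ.size();
    log.moveToSyncWait();
    torn += log.syncWaitQ.size();
    return new long[] {quiescent, torn};
  }
}
```

In this model the quiescent read gives the expected value of 1, while the torn
read counts the same edit twice and gives 2. A reader that is not synchronized
with the queue transfer could therefore overstate the txid.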
> Make Client field AlignmentContext non-static.
> ----------------------------------------------
>
> Key: HDFS-13399
> URL: https://issues.apache.org/jira/browse/HDFS-13399
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Affects Versions: HDFS-12943
> Reporter: Plamen Jeliazkov
> Assignee: Plamen Jeliazkov
> Priority: Major
> Attachments: HDFS-13399-HDFS-12943.000.patch,
> HDFS-13399-HDFS-12943.001.patch, HDFS-13399-HDFS-12943.002.patch,
> HDFS-13399-HDFS-12943.003.patch, HDFS-13399-HDFS-12943.004.patch,
> HDFS-13399-HDFS-12943.005.patch, HDFS-13399-HDFS-12943.006.patch
>
>
> In HDFS-12977, DFSClient's constructor was altered to make use of a new
> static method in Client that allowed one to set an AlignmentContext. This
> work is to remove that static field and make each DFSClient pass its
> AlignmentContext down to the proxy Call level.