[ 
https://issues.apache.org/jira/browse/RATIS-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17256865#comment-17256865
 ] 

runzhiwang edited comment on RATIS-1277 at 12/31/20, 7:29 AM:
--------------------------------------------------------------

[~szetszwo] This problem is caused by the key of NettyClientStreamRpc#replies: 
it should not use streamId alone as the key, it should use clientId + streamId. 
In my previous test, different clients chose different primaries, and each 
primary connected directly to the two peers, so the 
NettyClientStreamRpc#replies of a primary never accepted requests from 
different clients, and this problem did not happen. But now we use 
primary -> peer1 -> peer2. Although different clients still choose different 
primaries, peer1 can accept requests from different clients, so replies and 
requests get out of order. We can prove this from the following message: the 
reply and the request have the same streamId but different clientIds.

{code:java}
succ:true reply written:0 expected:1000000 
clientId:client-6C3D34F237DA,type:STREAM_HEADER,streamId4,offset:0,datalength:0 
localWrite:1927248929 remoteWrites:789787006 
request:DataStreamRequestByteBuf:clientId=client-F1F2794DEFF0,type=STREAM_DATA,id=4,offset=48000000,length=1000000
{code}
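
To illustrate the collision, here is a minimal, self-contained sketch. It is not the Ratis code: ReplyKeySketch, ClientStreamKey and Pending are hypothetical names, and requests are modelled as plain strings. It only shows that two clients using the same streamId share one queue when the map is keyed by streamId alone, and get separate queues when the key is clientId + streamId.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Queue;

/** Hypothetical sketch of the keying problem; not the NettyClientStreamRpc implementation. */
final class ReplyKeySketch {
  /** Composite key: a stream is identified by (clientId, streamId), not by streamId alone. */
  static final class ClientStreamKey {
    private final String clientId;
    private final long streamId;

    ClientStreamKey(String clientId, long streamId) {
      this.clientId = Objects.requireNonNull(clientId);
      this.streamId = streamId;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof ClientStreamKey)) return false;
      final ClientStreamKey that = (ClientStreamKey) o;
      return streamId == that.streamId && clientId.equals(that.clientId);
    }

    @Override
    public int hashCode() {
      return Objects.hash(clientId, streamId);
    }
  }

  /** Pending requests per key, in arrival order. Requests are plain strings in this sketch. */
  static final class Pending<K> {
    private final Map<K, Queue<String>> queues = new HashMap<>();

    void add(K key, String request) {
      queues.computeIfAbsent(key, k -> new ArrayDeque<>()).add(request);
    }

    /** Returns the oldest pending request for the key, or null if there is none. */
    String poll(K key) {
      final Queue<String> q = queues.get(key);
      return q == null ? null : q.poll();
    }
  }

  public static void main(String[] args) {
    final String a = "client-6C3D34F237DA";
    final String b = "client-F1F2794DEFF0";

    // Keyed by streamId only: both clients use streamId 4 and share one queue,
    // so a reply for client a is matched with client b's request.
    final Pending<Long> byStreamId = new Pending<>();
    byStreamId.add(4L, b + ":STREAM_DATA");
    byStreamId.add(4L, a + ":STREAM_HEADER");
    System.out.println("streamId only: " + byStreamId.poll(4L));

    // Keyed by clientId + streamId: each client has its own queue.
    final Pending<ClientStreamKey> byClientAndStream = new Pending<>();
    byClientAndStream.add(new ClientStreamKey(b, 4), b + ":STREAM_DATA");
    byClientAndStream.add(new ClientStreamKey(a, 4), a + ":STREAM_HEADER");
    System.out.println("clientId + streamId: "
        + byClientAndStream.poll(new ClientStreamKey(a, 4)));
  }
}
```

With the streamId-only key, the reply for client-6C3D34F237DA is paired with the request from client-F1F2794DEFF0, which is the same mismatch as in the message above. With the composite key, the lookup returns the request from the same client.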



was (Author: yjxxtd):
[~szetszwo] This problem is caused by the key of NettyClientStreamRpc#replies: 
it should not use streamId alone as the key, it should use clientId + streamId. 
In my previous test, different clients chose different primaries, and each 
primary connected directly to the two peers, so the 
NettyClientStreamRpc#replies of a primary never accepted requests from 
different clients, and this problem did not happen. But now we use 
primary -> peer1 -> peer2. Although different clients still choose different 
primaries, peer1 can accept requests from different clients, so replies and 
requests get out of order. We can see from the following message that the 
reply and the request have the same streamId but different clientIds.

{code:java}
succ:true reply written:0 expected:1000000 
clientId:client-6C3D34F237DA,type:STREAM_HEADER,streamId4,offset:0,datalength:0 
localWrite:1927248929 remoteWrites:789787006 
request:DataStreamRequestByteBuf:clientId=client-F1F2794DEFF0,type=STREAM_DATA,id=4,offset=48000000,length=1000000
{code}


> FileStore write failed because out of order
> -------------------------------------------
>
>                 Key: RATIS-1277
>                 URL: https://issues.apache.org/jira/browse/RATIS-1277
>             Project: Ratis
>          Issue Type: Sub-task
>            Reporter: runzhiwang
>            Assignee: runzhiwang
>            Priority: Major
>         Attachments: screenshot-2.png, screenshot-3.png
>
>
>  !screenshot-3.png! 
> As the following image and code show, the code checks whether the 
> bytesWritten of the STREAM_HEADER reply, i.e. 0, equals 10000, which of 
> course fails.
>  !screenshot-2.png! 
> {code:java}
> static boolean checkSuccessRemoteWrite(
>     List<CompletableFuture<DataStreamReply>> replyFutures, long bytesWritten) {
>   for (CompletableFuture<DataStreamReply> replyFuture : replyFutures) {
>     final DataStreamReply reply = replyFuture.join();
>     if (!reply.isSuccess() || reply.getBytesWritten() != bytesWritten) {
>       // Debug output added to diagnose the failure.
>       System.err.println("succ:" + reply.isSuccess()
>           + " reply written:" + reply.getBytesWritten()
>           + " expected:" + bytesWritten
>           + " clientId:" + reply.getClientId()
>           + ",type:" + reply.getType()
>           + ",streamId" + reply.getStreamId()
>           + ",offset:" + reply.getStreamOffset()
>           + ",datalength:" + reply.getDataLength());
>       return false;
>     }
>   }
>   return true;
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
