[ 
https://issues.apache.org/jira/browse/RATIS-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17256778#comment-17256778
 ] 

runzhiwang edited comment on RATIS-1277 at 12/31/20, 1:50 AM:
--------------------------------------------------------------

Start 3 servers:

{code:java}
BIN=ratis-examples/src/main/bin
PEERS=n0:ip1:6000:7000,n1:ip2:6001:7001,n2:ip3:6002:7002
nohup ${BIN}/server.sh filestore server --id n0 --storage /data/ratis/n0 \
  --storage /data1/ratis/n0 --storage /data2/ratis/n0 --storage /data3/ratis/n0 \
  --storage /data4/ratis/n0 --storage /data5/ratis/n0 --storage /data6/ratis/n0 \
  --storage /data7/ratis/n0 --storage /data8/ratis/n0 --storage /data9/ratis/n0 \
  --storage /data10/ratis/n0 --storage /data11/ratis/n0 --peers ${PEERS} \
  --writeThreadNum 100 --readThreadNum 100 --commitThreadNum 20 \
  --deleteThreadNum 20 >> n0.log 2>&1 &

nohup ${BIN}/server.sh filestore server --id n1 --storage /data/ratis/n1 \
  --storage /data1/ratis/n1 --storage /data2/ratis/n1 --storage /data3/ratis/n1 \
  --storage /data4/ratis/n1 --storage /data5/ratis/n1 --storage /data6/ratis/n1 \
  --storage /data7/ratis/n1 --storage /data8/ratis/n1 --storage /data9/ratis/n1 \
  --storage /data10/ratis/n1 --storage /data11/ratis/n1 --peers ${PEERS} \
  --writeThreadNum 100 --readThreadNum 100 --commitThreadNum 20 \
  --deleteThreadNum 20 >> n1.log 2>&1 &

nohup ${BIN}/server.sh filestore server --id n2 --storage /data/ratis/n2 \
  --storage /data1/ratis/n2 --storage /data2/ratis/n2 --storage /data3/ratis/n2 \
  --storage /data4/ratis/n2 --storage /data5/ratis/n2 --storage /data6/ratis/n2 \
  --storage /data7/ratis/n2 --storage /data8/ratis/n2 --storage /data9/ratis/n2 \
  --storage /data10/ratis/n2 --storage /data11/ratis/n2 --peers ${PEERS} \
  --writeThreadNum 100 --readThreadNum 100 --commitThreadNum 20 \
  --deleteThreadNum 20 >> n2.log 2>&1 &
{code}



Start 3 clients on 3 machines:
 
{code:java}
${BIN}/client.sh filestore datastream --size 128000000 --numFiles 600 \
  --bufferSize 1000000 --syncSize 0 --type DirectByteBuffer --peers ${PEERS} \
  --storage /data/ratis/n2 --storage /data1/ratis/n2 --storage /data2/ratis/n2 \
  --storage /data3/ratis/n2 --storage /data4/ratis/n2 --storage /data5/ratis/n2 \
  --storage /data6/ratis/n2 --storage /data7/ratis/n2 --storage /data8/ratis/n2 \
  --storage /data9/ratis/n2 --storage /data10/ratis/n2 --storage /data11/ratis/n2
{code}



This reproduces the failure. It did not happen in my previous test, and I am not sure why.



> FileStore write failed because out of order
> -------------------------------------------
>
>                 Key: RATIS-1277
>                 URL: https://issues.apache.org/jira/browse/RATIS-1277
>             Project: Ratis
>          Issue Type: Sub-task
>            Reporter: runzhiwang
>            Assignee: runzhiwang
>            Priority: Major
>         Attachments: screenshot-2.png, screenshot-3.png
>
>
>  !screenshot-3.png! 
> As the following image and code show, the code checks whether the
> bytesWritten of the STREAM_HEADER reply, i.e. 0, equals 10000, which of
> course fails.
>  !screenshot-2.png! 
> {code:java}
> static boolean checkSuccessRemoteWrite(
>     List<CompletableFuture<DataStreamReply>> replyFutures, long bytesWritten) {
>   for (CompletableFuture<DataStreamReply> replyFuture : replyFutures) {
>     final DataStreamReply reply = replyFuture.join();
>     if (!reply.isSuccess() || reply.getBytesWritten() != bytesWritten) {
>       // Debug output added to diagnose the failure.
>       System.err.println("succ:" + reply.isSuccess()
>           + " reply written:" + reply.getBytesWritten()
>           + " expected:" + bytesWritten + " clientId:" + reply.getClientId()
>           + ",type:" + reply.getType() + ",streamId" + reply.getStreamId()
>           + ",offset:" + reply.getStreamOffset()
>           + ",datalength:" + reply.getDataLength());
>       return false;
>     }
>   }
>   return true;
> }
> {code}
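
The mismatch described in the quoted issue can be shown in isolation. The following is only a minimal sketch, not Ratis code: the Reply class is a hypothetical stand-in for DataStreamReply that carries just the fields the check reads, and the type strings are illustrative labels. It assumes, as the issue describes, that a STREAM_HEADER reply reporting 0 bytes written ends up being compared against the 10000 bytes expected for a data packet.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class RemoteWriteCheckSketch {
  /** Hypothetical stand-in for DataStreamReply (assumption, not the Ratis class). */
  static final class Reply {
    final String type;        // illustrative label only
    final boolean success;
    final long bytesWritten;

    Reply(String type, boolean success, long bytesWritten) {
      this.type = type;
      this.success = success;
      this.bytesWritten = bytesWritten;
    }
  }

  /** Same comparison as the quoted check, expressed over the stand-in type. */
  static boolean checkSuccessRemoteWrite(
      List<CompletableFuture<Reply>> replyFutures, long bytesWritten) {
    for (CompletableFuture<Reply> replyFuture : replyFutures) {
      final Reply reply = replyFuture.join();
      if (!reply.success || reply.bytesWritten != bytesWritten) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    final long expected = 10000;

    // Reply that matches the data packet: the check passes.
    final List<CompletableFuture<Reply>> dataReply = Collections.singletonList(
        CompletableFuture.completedFuture(new Reply("STREAM_DATA", true, 10000)));

    // Successful header reply reporting 0 bytes written: the check fails.
    final List<CompletableFuture<Reply>> headerReply = Collections.singletonList(
        CompletableFuture.completedFuture(new Reply("STREAM_HEADER", true, 0)));

    System.out.println(checkSuccessRemoteWrite(dataReply, expected));   // prints "true"
    System.out.println(checkSuccessRemoteWrite(headerReply, expected)); // prints "false"
  }
}
```

The header reply is itself successful; the check rejects it only because its byte count is compared against the size expected for a different packet.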



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
