[ 
https://issues.apache.org/jira/browse/HDDS-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Shao Hong updated HDDS-6203:
-------------------------------
    Description: 
Currently, container re-replication transfers containers over gRPC as gz files 
into a temporary dir. If the temporary dir is small, the concurrent download 
threads compete for the remaining space as each one writes its downloaded byte 
buffers to its file through a FileOutputStream. 

When a thread fails to write a buffer to its file, the current logic does not 
clean up the incomplete file; it only completes the response exceptionally, as 
the code below shows.

 
{code:java}
// In GrpcReplicationClient:
@Override
public void onNext(CopyContainerResponseProto chunk) {
  try {
    chunk.getData().writeTo(stream);
  } catch (IOException e) {
    // The partially written file is left on disk at this point.
    response.completeExceptionally(e);
  }
}
{code}
The exception is caught in {{getContainerDataFromReplicas}}, where it is only 
logged as an error.

It is therefore necessary to clean up any incomplete files left behind by such 
a failure.
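
One possible shape for that cleanup is sketched below. It is an illustration under stated assumptions, not the actual patch: the class name CleaningChunkWriter and the outputPath field are hypothetical, and a plain byte[] stands in for CopyContainerResponseProto so that the example needs no Ozone or gRPC dependencies. The idea is simply to close the stream and delete the partial file before completing the future exceptionally.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for the stream observer inside GrpcReplicationClient.
class CleaningChunkWriter {
  private final Path outputPath;      // assumed: path of the file being downloaded
  private final OutputStream stream;  // stream that writes to outputPath
  private final CompletableFuture<Path> response = new CompletableFuture<>();

  CleaningChunkWriter(Path outputPath, OutputStream stream) {
    this.outputPath = outputPath;
    this.stream = stream;
  }

  CompletableFuture<Path> getResponse() {
    return response;
  }

  // byte[] replaces CopyContainerResponseProto in this sketch.
  void onNext(byte[] chunk) {
    if (response.isDone()) {
      return; // a previous chunk already failed; ignore the rest
    }
    try {
      stream.write(chunk);
    } catch (IOException e) {
      deletePartialFile();
      response.completeExceptionally(e);
    }
  }

  void onCompleted() {
    if (response.isDone()) {
      return;
    }
    try {
      stream.close();
      response.complete(outputPath);
    } catch (IOException e) {
      deletePartialFile();
      response.completeExceptionally(e);
    }
  }

  // Best-effort removal of the incomplete file.
  private void deletePartialFile() {
    try {
      stream.close();
    } catch (IOException ignored) {
      // closing can fail for the same reason the write failed
    }
    try {
      Files.deleteIfExists(outputPath);
    } catch (IOException ignored) {
      // real code would log this instead of swallowing it
    }
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("container", ".tar.gz");
    // Simulate a full disk: every write fails.
    OutputStream full = new OutputStream() {
      @Override
      public void write(int b) throws IOException {
        throw new IOException("No space left on device");
      }
    };
    CleaningChunkWriter writer = new CleaningChunkWriter(tmp, full);
    writer.onNext(new byte[] {1, 2, 3});
    System.out.println("failed: " + writer.getResponse().isCompletedExceptionally());
    System.out.println("file still exists: " + Files.exists(tmp));
  }
}
```

Deleting inside the observer keeps the cleanup next to the failure, so the caller that logs the error does not need to know which file was being written. Whether the real fix belongs there or in the caller is a design choice this sketch does not settle.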

 

 

Following https://issues.apache.org/jira/browse/HDDS-5188, we may also want to 
improve the protocol in the future. 

 

In addition, I manually reproduced this contention and confirmed that the 
incomplete files remain; an example can be seen in the attachment. For the 
test, I mounted a 5 GB disk as the temp file dir.

 



> CleanUp gz files failed to be fully written during Container move
> -----------------------------------------------------------------
>
>                 Key: HDDS-6203
>                 URL: https://issues.apache.org/jira/browse/HDDS-6203
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Xu Shao Hong
>            Assignee: Xu Shao Hong
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
