[ 
https://issues.apache.org/jira/browse/HDFS-15079?focusedWorklogId=787609&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-787609
 ]

ASF GitHub Bot logged work on HDFS-15079:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Jul/22 14:04
            Start Date: 04/Jul/22 14:04
    Worklog Time Spent: 10m 
      Work Description: ZanderXu opened a new pull request, #4530:
URL: https://github.com/apache/hadoop/pull/4530

   ### Description of PR
   Jira: [HDFS-15079](https://issues.apache.org/jira/browse/HDFS-15079)
   
   Similar to 
[HDFS-13248](https://issues.apache.org/jira/browse/HDFS-13248), RBF adds the 
actual client ID and client call ID to the CallerContext and carries them to 
the NameNode. The NameNode then reads the actual client ID and client call ID 
so that the CacheEntry mechanism still works for calls forwarded by a Router.
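
A minimal sketch of the idea described above, written as a toy Python model rather than Hadoop code. `Call`, `ToyNameNode`, `real_client_id` and `real_call_id` are hypothetical names standing in for the RPC header, the NameNode, and the ids carried in the CallerContext. The sketch also assumes that the client reuses the same client ID and call ID when it fails over to another Router.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Call:
    # Ids as seen on the wire between a Router and the NameNode.
    client_id: str
    call_id: int
    # Hypothetical stand-ins for the ids carried in the CallerContext;
    # None when the Router did not forward them.
    real_client_id: Optional[str] = None
    real_call_id: Optional[int] = None


@dataclass
class ToyNameNode:
    files: Dict[str, bytes] = field(default_factory=dict)
    cache: Dict[Tuple[str, int], str] = field(default_factory=dict)

    def _cache_key(self, call: Call) -> Tuple[str, int]:
        # Prefer the forwarded ids so that the same client call arriving
        # through a different Router maps to the same cache entry.
        if call.real_client_id is not None and call.real_call_id is not None:
            return (call.real_client_id, call.real_call_id)
        return (call.client_id, call.call_id)

    def create(self, call: Call, path: str) -> str:
        key = self._cache_key(call)
        if key in self.cache:
            return self.cache[key]  # replay the earlier response, no side effect
        self.files[path] = b""      # overwrite=true: truncate any existing file
        self.cache[key] = "created"
        return "created"

    def write_and_close(self, path: str, data: bytes) -> None:
        self.files[path] = data


nn = ToyNameNode()
nn.create(Call("r1", 7, "client-A", 42), "/f")  # 2nd create, via r1
nn.write_and_close("/f", b"data")               # 3rd close
nn.create(Call("r0", 3, "client-A", 42), "/f")  # delayed 1st create, via r0
assert nn.files["/f"] == b"data"                # the late create was a cache hit
```

Without the forwarded ids, the two creates would be keyed by the Routers' own ids (`("r1", 7)` and `("r0", 3)` in this model), so the delayed create would miss the cache and truncate the file.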
   
   




Issue Time Tracking
-------------------

            Worklog Id:     (was: 787609)
    Remaining Estimate: 0h
            Time Spent: 10m

> RBF: Client maybe get an unexpected result with network anomaly 
> ----------------------------------------------------------------
>
>                 Key: HDFS-15079
>                 URL: https://issues.apache.org/jira/browse/HDFS-15079
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: rbf
>    Affects Versions: 3.3.0
>            Reporter: Hui Fei
>            Priority: Critical
>         Attachments: HDFS-15079.001.patch, HDFS-15079.002.patch, 
> UnexpectedOverWriteUT.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I found a critical problem in RBF. HDFS-15078 can resolve it in some 
> scenarios, but I have no idea about an overall resolution.
> The problem is as follows:
> A client configured with two Routers (r0, r1) creates an HDFS file via r0, 
> gets an exception, and fails over to r1.
> r0 has already sent the create RPC to the NameNode (1st create).
> The client creates the HDFS file via r1 (2nd create).
> The client writes the HDFS file and finally closes it (3rd close).
> The NameNode may receive the RPCs in the following order:
> 2nd create
> 3rd close
> 1st create
> Because overwrite is true by default, the delayed 1st create replaces the 
> already written file with an empty file. This is a critical problem.
> We have encountered this problem. Many Hive and Spark jobs run on our 
> cluster, and it occurs occasionally.
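
The reordering described in the quoted report can be reproduced with a small toy model. This is only an illustrative sketch in Python, not Hadoop code: `replay` and the path `/user/job/out` are hypothetical, and the model assumes that a create with overwrite=true simply truncates the file.

```python
from typing import Dict, List, Optional, Tuple

# (operation, path, data that was written before close)
Rpc = Tuple[str, str, Optional[bytes]]


def replay(rpcs: List[Rpc]) -> Dict[str, bytes]:
    """Apply RPCs in arrival order to a toy namespace."""
    files: Dict[str, bytes] = {}
    for op, path, data in rpcs:
        if op == "create":
            files[path] = b""          # overwrite=true: start from an empty file
        elif op == "close":
            files[path] = data or b""  # persist what the client wrote
    return files


arrival_order: List[Rpc] = [
    ("create", "/user/job/out", None),       # 2nd create, via r1
    ("close", "/user/job/out", b"results"),  # 3rd close, via r1
    ("create", "/user/job/out", None),       # 1st create, delayed behind r0
]
final_state = replay(arrival_order)
assert final_state["/user/job/out"] == b""   # the written data is lost
```

Dropping the last entry from `arrival_order` leaves the file with its written contents, which shows that the delayed 1st create alone is what empties it.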



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
