[
https://issues.apache.org/jira/browse/HDFS-15887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312087#comment-17312087
]
JiangHua Zhu commented on HDFS-15887:
-------------------------------------
In our cluster, when a checkpoint occurs, the write lock is held for tens of
seconds:
2021-03-31 09:47:29,719 [674150524] - INFO [Edit log
tailer:FSNamesystemLock@261] - FSNamesystem write lock held for 83843 ms via
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1021)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:261)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:220)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1596)
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:278)
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:395)
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:348)
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:365)
java.security.AccessController.doPrivileged(Native Method)
javax.security.auth.Subject.doAs(Subject.java:360)
Under normal circumstances, the lock is only held for a few seconds:
2021-03-31 09:17:09,459 [672330264] - INFO [Edit log
tailer:FSNamesystemLock@261] - FSNamesystem write lock held for 5833 ms via
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1021)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:261)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:220)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1596)
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:278)
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:395)
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:348)
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:365)
java.security.AccessController.doPrivileged(Native Method)
javax.security.auth.Subject.doAs(Subject.java:360)
> Make LogRoll and TailEdits execute in parallel
> ----------------------------------------------
>
> Key: HDFS-15887
> URL: https://issues.apache.org/jira/browse/HDFS-15887
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: JiangHua Zhu
> Assignee: JiangHua Zhu
> Priority: Major
> Labels: pull-request-available
> Attachments: edit_files.jpg
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> In the EditLogTailer class, LogRoll and TailEdits are executed in a single
> thread, and when a checkpoint occurs, it competes with TailEdits for a lock
> (FSNamesystem#cpLock).
> A checkpoint usually takes a long time to execute, which causes the
> generated edit log files to become relatively large.
> For example, here is an actual case (see edit_files.jpg); the
> StandbyCheckpointer log shows the following:
> 2021-03-11 09:18:42,513 [769071096]-INFO [Standby State
> Checkpointer:StandbyCheckpointer$CheckpointerThread@335]-Triggering
> checkpoint because there have been 5142154 txns since the last checkpoint,
> which exceeds the configured threshold 1000000
> Loading an edit log that contains a large amount of data takes longer. We
> should keep the edit log file sizes as even as possible, which is good for
> the operation of the system.
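
A minimal sketch of the idea, in case it helps the discussion. This is not the
actual patch; the class and method names below are illustrative placeholders,
not from the HDFS code. The point is to submit the rollEditLog request to its
own executor so that the tailing loop is not blocked while the active NameNode
rolls its edit log.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelTailerSketch {
  // Dedicated thread for log-roll requests, separate from the tailing loop.
  private final ExecutorService rollExecutor =
      Executors.newSingleThreadExecutor();
  private volatile Future<?> pendingRoll;

  /** Ask the active NameNode to roll its edit log without blocking tailing. */
  void triggerLogRollAsync(Runnable rollEditLogRpc) {
    // Skip if a previous roll request is still in flight.
    if (pendingRoll == null || pendingRoll.isDone()) {
      pendingRoll = rollExecutor.submit(rollEditLogRpc);
    }
  }

  /** The tailing loop keeps applying edits while the roll runs in parallel. */
  void tailLoop(Runnable tailEdits, Runnable rollEditLogRpc)
      throws InterruptedException {
    while (!Thread.currentThread().isInterrupted()) {
      triggerLogRollAsync(rollEditLogRpc);  // non-blocking roll request
      tailEdits.run();                      // apply newly available edits
      Thread.sleep(60_000L);                // tail interval (placeholder value)
    }
  }
}

With the roll request off the tailing thread, edits keep being applied at the
configured interval even while a checkpoint or roll is in progress, so the
edit log segments should stay smaller and more even in size.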
--
This message was sent by Atlassian Jira
(v8.3.4#803005)