yyar opened a new issue, #7472:
URL: https://github.com/apache/hudi/issues/7472

   **_Tips before filing an issue_**
   
   - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
   
   - Join the mailing list to engage in conversations and get faster support at 
[email protected].
   
   - If you have triaged this as a bug, then file an 
[issue](https://issues.apache.org/jira/projects/HUDI/issues) directly.
   
   **Describe the problem you faced**
   
   After upgrading Hudi from 0.8.0 to 0.11.1, my Spark application became very slow after running for a few days. I found the main thread hanging with the stacktrace below. The hang was caused by too many files accumulating in the `.hoodie/metadata/.hoodie` directory, which in turn was caused by an old rollback instant lingering in the timeline.
   In my active timeline files (in the `.hoodie` dir), the latest deltacommit/commit instants were recent, but the latest rollback instant was old. The comment linked below explains that rollback archiving and commit archiving are handled separately. However, in the metadata table's archiving implementation, the metadata timeline is only archived up to instants older than the earliest instant in the data table's active timeline.
   Since rollback actions occur frequently, I think the earliest active instant used as the archival boundary for the metadata timeline should not be a rollback instant.
   
   https://github.com/apache/hudi/issues/4892#issuecomment-1050306086
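   To make the interaction concrete, here is a toy Python model of the archival condition as I understand it (my reading of the behavior, not the actual Hudi code; timestamps and actions are made up):

   ```python
   # Toy model of the archival condition described above: metadata-table
   # instants are archivable only if they are older than the earliest
   # instant still present in the data table's active timeline.
   def archivable_metadata_instants(metadata_timeline, active_timeline):
       """Each timeline is a list of (timestamp, action) tuples, with
       timestamps as sortable strings like Hudi's yyyyMMddHHmmss format."""
       earliest_active = min(ts for ts, _ in active_timeline)
       return [(ts, a) for ts, a in metadata_timeline if ts < earliest_active]

   # An old rollback is the earliest instant in the active timeline, so none
   # of the newer metadata deltacommits ever qualify for archival.
   active = [
       ("20221101000000", "rollback"),    # old rollback, archived separately
       ("20221215000000", "commit"),
       ("20221216000000", "deltacommit"),
   ]
   metadata = [(f"202212{day:02d}000000", "deltacommit") for day in range(1, 16)]
   blocked = archivable_metadata_instants(metadata, active)   # empty list
   ```

   Once the old rollback instant is gone from the active timeline, the boundary jumps forward and most of the metadata instants become archivable, which matches what I expected to happen.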
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Launch a Spark application using Hudi with the `hoodie.metadata.enable` option turned on. The application writes and appends files continuously.
   2. Cause a rollback (e.g., force-kill the application while a Hudi commit or deltacommit is in progress).
   3. After a few days, there are too many timeline files in the `.hoodie/metadata/.hoodie` directory, and operations on the metadata timeline become slow.
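   For reference, step 1 in my setup looks roughly like the following sketch (the table name, fields, and path are placeholders; the option keys are standard Hudi write configs):

   ```python
   # Hypothetical sketch of the writer options used in step 1; the table
   # name, record key, precombine field, and base path are placeholders.
   hudi_options = {
       "hoodie.table.name": "example_table",
       "hoodie.datasource.write.recordkey.field": "id",
       "hoodie.datasource.write.precombine.field": "ts",
       "hoodie.datasource.write.operation": "upsert",
       "hoodie.metadata.enable": "true",  # creates the .hoodie/metadata table
   }

   # In the Spark app (df is the batch being appended):
   # df.write.format("hudi").options(**hudi_options).mode("append").save(base_path)
   ```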
   
   
   **Expected behavior**
   
   The metadata table timeline should be archived correctly even when there are old rollback instants in the `.hoodie` directory.
   
   **Environment Description**
   
   * Hudi version : 0.11.1
   
   * Spark version : 3.1.3
   
   * Hive version : - 
   
   * Hadoop version : 2.7.4, 2.10.0
   
   * Storage (HDFS/S3/GCS..) : HDFS
   
   * Running on Docker? (yes/no) : Both k8s and YARN.
   
   
   **Stacktrace**
   
   ```
   [email protected]/java.lang.Object.wait(Native Method)
   [email protected]/java.lang.Object.wait(Unknown Source)
   app//org.apache.hadoop.ipc.Client.call(Client.java:1467)
   app//org.apache.hadoop.ipc.Client.call(Client.java:1413)
   app//org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
   app//com.sun.proxy.$Proxy13.getBlockLocations(Unknown Source)
   app//org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:255)
   jdk.internal.reflect.GeneratedMethodAccessor194.invoke(Unknown Source)
   [email protected]/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
   [email protected]/java.lang.reflect.Method.invoke(Unknown Source)
   app//org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
   app//org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
   app//com.sun.proxy.$Proxy14.getBlockLocations(Unknown Source)
   app//org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1226)
   app//org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1213)
   app//org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1201)
   app//org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:306)
   app//org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:272) => holding Monitor(java.lang.Object@1437628452})
   app//org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:264)
   app//org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1526)
   app//org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:304)
   app//org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:299)
   app//org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
   app//org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:312)
   app//org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
   org.apache.hudi.common.fs.HoodieWrapperFileSystem.open(HoodieWrapperFileSystem.java:460)
   org.apache.hudi.common.table.timeline.HoodieActiveTimeline.readDataFromPath(HoodieActiveTimeline.java:760)
   ```
   
   

