cbomgit opened a new issue, #11821:
URL: https://github.com/apache/hudi/issues/11821

   **_Tips before filing an issue_**
   
   - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
   
   - Join the mailing list to engage in conversations and get faster support at 
[email protected].
   
   - If you have triaged this as a bug, then file an 
[issue](https://issues.apache.org/jira/projects/HUDI/issues) directly.
   
   We're experiencing a strange issue when deleting records from a table with an UPSERT command. We notice that, occasionally, our job gets stuck on the `Doing partition and writing data` stage for multiple hours. This stage generally takes between 5 and 30 minutes, but will sometimes take several hours. For example, in this screenshot, one task took roughly 5 hours (the remaining tasks were skipped).
   
   
![image](https://github.com/user-attachments/assets/7f3de682-680b-498a-b186-3d631ed1ef86)
   
   Our job runs in a loop and works off batches of data, so it is a long-running job. We have not yet identified commonalities in the data batches when we run into this. We have not observed high CPU, OOM, loss of data nodes, or high datanode space usage during this time.
   
   We do not see any errors in the driver application logs or the executor/task logs that indicate something is going wrong. We do see a lot of heartbeat files written during this time. Digging into the task logs for an example task that took a long time, we see something like:
   
   ```
   2024-08-23 09:26:32,501 [INFO] (Executor task launch worker for task 4.0 in 
stage 5965.0 (TID 2033103)) org.apache.hudi.io.HoodieMergeHandle: Number of 
entries in MemoryBasedMap => 170, Total size in bytes of MemoryBasedMap => 
165668, Number of entries in BitCaskDiskMap => 0, Size of file spilled to disk 
=> 0
   2024-08-23 09:26:32,501 [INFO] (Executor task launch worker for task 4.0 in 
stage 5965.0 (TID 2033103)) org.apache.hudi.io.HoodieMergeHandle: 
partitionPath:region=NA/year=2024/month=02/day=14/hour=00, fileId to be 
merged:b8973e4d-5bc9-4a3a-ba99-8181c9831056-0
   2024-08-23 09:26:32,575 [INFO] (Executor task launch worker for task 4.0 in 
stage 5965.0 (TID 2033103)) org.apache.hudi.io.HoodieMergeHandle: Merging new 
data into oldPath 
s3://bucket-name/region=NA/year=2024/month=02/day=14/hour=00/b8973e4d-5bc9-4a3a-ba99-8181c9831056-0_4-4995-1712252_20240815132438223.parquet,
 as newPath 
s3://bucket-name/region=NA/year=2024/month=02/day=14/hour=00/bucket-name
   2024-08-23 09:26:33,241 [INFO] (Executor task launch worker for task 4.0 in 
stage 5965.0 (TID 2033103)) org.apache.hudi.table.marker.DirectWriteMarkers: 
Creating Marker 
Path=s3://bucket-name/.hoodie/.temp/20240823091953364/region=NA/year=2024/month=02/day=14/hour=00/bucket-name.marker.MERGE
   2024-08-23 09:26:33,275 [INFO] (Executor task launch worker for task 4.0 in 
stage 5965.0 (TID 2033103)) 
com.amazon.ws.emr.hadoop.fs.cse.CSEMultipartUploadOutputStream: close 
closed:false 
s3://bucket-name/.hoodie/.temp/20240823091953364/region=NA/year=2024/month=02/day=14/hour=00/bucket-name.marker.MERGE
   2024-08-23 09:26:33,460 [INFO] (Executor task launch worker for task 4.0 in 
stage 5965.0 (TID 2033103)) 
com.amazon.ws.emr.hadoop.fs.s3.upload.dispatch.DefaultMultipartUploadDispatcher:
 Completed multipart upload of 1 parts 0 bytes
   2024-08-23 09:26:33,460 [INFO] (Executor task launch worker for task 4.0 in 
stage 5965.0 (TID 2033103)) 
com.amazon.ws.emr.hadoop.fs.cse.CSEMultipartUploadOutputStream: Finished 
uploading 
bucket-name/.hoodie/.temp/20240823091953364/region=NA/year=2024/month=02/day=14/hour=00/bucket-name.marker.MERGE.
 Elapsed seconds: 0.
   2024-08-23 09:26:33,460 [INFO] (Executor task launch worker for task 4.0 in 
stage 5965.0 (TID 2033103)) org.apache.hudi.table.marker.DirectWriteMarkers: 
[direct] Created marker file 
s3://bucket-name/.hoodie/.temp/20240823091953364/region=NA/year=2024/month=02/day=14/hour=00/bucket-name.marker.MERGE
 in 885 ms
   ```
   
   The next time we see this task id in the logs, we see:
   
   ```
   2024-08-23 11:18:09,558 [INFO] (producer-thread-1) 
org.apache.hudi.common.util.queue.IteratorBasedQueueProducer: finished 
buffering records
   2024-08-23 11:18:09,560 [INFO] (consumer-thread-1) 
org.apache.hudi.common.util.queue.BoundedInMemoryExecutor: Queue Consumption is 
done; notifying producer threads
   2024-08-23 11:18:12,007 [INFO] (Executor task launch worker for task 4.0 in 
stage 5965.0 (TID 2033103)) 
com.amazon.ws.emr.hadoop.fs.cse.CSEMultipartUploadOutputStream: close 
closed:false 
s3://bucket-name/table_name/region=NA/year=2024/month=02/day=14/hour=00/b8973e4d-5bc9-4a3a-ba99-8181c9831056-0_4-5965-2033103_20240823091953364.parquet
   2024-08-23 11:18:13,383 [INFO] (Executor task launch worker for task 4.0 in 
stage 5965.0 (TID 2033103)) 
com.amazon.ws.emr.hadoop.fs.s3.upload.dispatch.DefaultMultipartUploadDispatcher:
 Completed multipart upload of 6 parts 766624225 bytes
   2024-08-23 11:18:13,383 [INFO] (Executor task launch worker for task 4.0 in 
stage 5965.0 (TID 2033103)) 
com.amazon.ws.emr.hadoop.fs.cse.CSEMultipartUploadOutputStream: Finished 
uploading 
bucket-name/table_name/region=NA/year=2024/month=02/day=14/hour=00/b8973e4d-5bc9-4a3a-ba99-8181c9831056-0_4-5965-2033103_20240823091953364.parquet.
 Elapsed seconds: 6699.
   2024-08-23 11:18:13,513 [INFO] (Executor task launch worker for task 4.0 in 
stage 5965.0 (TID 2033103)) org.apache.hudi.io.HoodieMergeHandle: MergeHandle 
for partitionPath region=NA/year=2024/month=02/day=14/hour=00 fileID 
b8973e4d-5bc9-4a3a-ba99-8181c9831056-0, took 6701018 ms.
   ```
   
   Note the line reporting `Elapsed seconds: 6699` (about 1.86 hours). Looking at the uploaded file in question, I notice that it is about 800 MB in size. Is it expected that a file of this size would bottleneck the job like this?
   
   Here's the config we're using (with sensitive info redacted):
   
   ```
   hoodie.parquet.small.file.limit -> 104857600
   hoodie.datasource.write.precombine.field -> eventVersion
   hoodie.datasource.write.payload.class -> 
org.apache.hudi.common.model.EmptyHoodieRecordPayload
   hoodie.bloom.index.filter.dynamic.max.entries -> 1106137
   hoodie.cleaner.fileversions.retained -> 2
   hoodie.aws.secret.key -> <redacted>
   hoodie.parquet.max.file.size -> 134217728
   hoodie.cleaner.parallelism -> 1500
   hoodie.write.lock.client.num_retries -> 10
   hoodie.delete.shuffle.parallelism -> 1500
   hoodie.bloom.index.prune.by.ranges -> true
   hoodie.metadata.enable -> false
   hoodie.clean.automatic -> false
   hoodie.datasource.write.operation -> upsert
   hoodie.write.lock.wait_time_ms -> 600000
   hoodie.metrics.reporter.type -> CLOUDWATCH
   hoodie.datasource.write.recordkey.field -> 
timestamp,eventId,subType,trackedItem
   hoodie.table.name -> my_table_name
   hoodie.datasource.write.table.type -> COPY_ON_WRITE
   hoodie.datasource.write.hive_style_partitioning -> true
   hoodie.datasource.write.partitions.to.delete -> 
   hoodie.write.lock.dynamodb.partition_key -> my_table_name_key
   hoodie.cleaner.policy -> KEEP_LATEST_FILE_VERSIONS
   hoodie.write.markers.type -> DIRECT
   hoodie.metrics.on -> false
   hoodie.datasource.write.reconcile.schema -> true
   hoodie.datasource.write.keygenerator.class -> 
org.apache.hudi.keygen.ComplexKeyGenerator
   hoodie.cleaner.policy.failed.writes -> LAZY
   hoodie.upsert.shuffle.parallelism -> 1500
   hoodie.write.lock.dynamodb.table -> HoodieLockTable
   hoodie.write.lock.provider -> 
org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider
   hoodie.datasource.write.partitionpath.field -> region,year,month,day,hour
   hoodie.bloom.index.filter.type -> DYNAMIC_V0
   hoodie.write.lock.wait_time_ms_between_retry -> 30000
   hoodie.write.concurrency.mode -> optimistic_concurrency_control
   hoodie.write.lock.dynamodb.region -> us-east-1
   ```
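For reference, the two byte-valued size settings above decode to round MiB values. Notably, the ~800 MB file observed in the logs is far larger than the configured 128 MiB `hoodie.parquet.max.file.size`, which may itself be worth investigating (a sketch of the arithmetic, using only the values from the config above):

```python
# Byte values copied from the config dump above, converted to MiB.
small_file_limit = 104_857_600   # hoodie.parquet.small.file.limit
max_file_size    = 134_217_728   # hoodie.parquet.max.file.size

MIB = 1024 * 1024
print(small_file_limit // MIB)   # 100 MiB
print(max_file_size // MIB)      # 128 MiB
```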
   
   **Environment Description**
   
   * Hudi version : 0.11.1
   
   * Spark version : 3.2.1
   
   * Hive version : 3.1.3
   
   * Hadoop version : 3.2.1
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : no
   
   Happy to provide more info. Thanks!
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
