guanziyue commented on pull request #4264:
URL: https://github.com/apache/hudi/pull/4264#issuecomment-1065290731


   > subsume
   I will rebase onto master in the next commit. The master branch changed 
shortly after my last rebase. 
   To summarize, this PR was originally meant to fix concurrent use of 
mergeHandle, which is equivalent to using the parquet writer concurrently. I 
have tried two approaches so far. 
   1. Add a graceful exit to BoundedInMemoryExecutor and change the order of 
method calls in SparkMergeHelper.
       Cons: a. It adds a thread-status check on the hot path.  b. Any 
future use of the Parquet writer still risks hitting the same problem.
   2. Add a lock in ParquetWriter.
       Pros: It reduces the risk of misuse in the future.
       Cons: It adds a lock on the hot path, which may affect performance. (I 
did some simple profiling and could hardly observe any negative impact.)
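
   As a rough illustration of approach 1, here is a minimal sketch of the "stop the consumer, wait for it, then close the handle" ordering. It is not the actual Hudi code: `GracefulExitSketch`, `Handle`, and `run` are hypothetical names, and the real BoundedInMemoryExecutor and SparkMergeHelper are more involved. The interrupt check in the loop corresponds to the thread-status check on the hot path mentioned in the cons.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: all names here are illustrative, not Hudi APIs.
class GracefulExitSketch {

    // Stand-in for a merge handle; it counts writes that arrive after close().
    static class Handle {
        final AtomicBoolean closed = new AtomicBoolean(false);
        final AtomicLong writesAfterClose = new AtomicLong();

        void write() {
            if (closed.get()) {
                writesAfterClose.incrementAndGet();
            }
        }

        void close() {
            closed.set(true);
        }
    }

    // Returns the number of writes observed after close(); expected to be 0.
    static long run() {
        Handle handle = new Handle();
        ExecutorService consumer = Executors.newSingleThreadExecutor();

        consumer.submit(() -> {
            // Thread-status check on the hot path (the cost noted under "Cons").
            while (!Thread.currentThread().isInterrupted()) {
                handle.write();
            }
        });

        // Graceful exit: interrupt the consumer, then wait until it has stopped.
        consumer.shutdownNow();
        boolean terminated;
        try {
            terminated = consumer.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            terminated = false;
        }
        if (!terminated) {
            throw new IllegalStateException("consumer did not stop in time");
        }

        // Only now is it safe to close the handle: no other thread is writing.
        handle.close();
        return handle.writesAfterClose.get();
    }
}
```

   The point of the sketch is the ordering: close() runs only after awaitTermination has confirmed that the consumer thread is gone.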
   Either one fully solves this problem, but each has its 
drawbacks.
   I will keep this PR focused on this problem and move the executor rename to 
a separate PR if needed.
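
   For comparison, approach 2 could look roughly like the sketch below. It assumes a hypothetical non-thread-safe `UnsafeRecordWriter` as a stand-in for the parquet writer and wraps it with a `ReentrantLock`. None of these class names exist in Hudi or Parquet; they are only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a writer that is not safe for concurrent use.
class UnsafeRecordWriter {
    private final List<String> records = new ArrayList<>();

    void write(String record) {
        records.add(record);
    }

    int size() {
        return records.size();
    }
}

// Hypothetical wrapper: every access to the delegate goes through one lock.
class LockedRecordWriter {
    private final UnsafeRecordWriter delegate = new UnsafeRecordWriter();
    private final ReentrantLock lock = new ReentrantLock();

    void write(String record) {
        lock.lock(); // lock on the hot path (the cost noted under "Cons")
        try {
            delegate.write(record);
        } finally {
            lock.unlock();
        }
    }

    int size() {
        lock.lock();
        try {
            return delegate.size();
        } finally {
            lock.unlock();
        }
    }

    // Four threads write 1000 records each through the locked wrapper.
    static int demo() {
        LockedRecordWriter writer = new LockedRecordWriter();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    writer.write("record-" + i);
                }
            });
            threads[t].start();
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return writer.size();
    }
}
```

   With the lock in place no writes are lost, so demo() returns 4000. The lock is usually uncontended in the normal single-writer case, which may explain why the profiling showed little impact.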
   

