bkosuru edited a comment on issue #4864:
URL: https://github.com/apache/hudi/issues/4864#issuecomment-1073231293


   Hi @nsivabalan,
   
   Could you please suggest how to tune the bloom index configs? Our data is 
immutable, but it contains duplicates, and we want to insert only unique rows. 
We have allocated what should be enough resources (400 executors, 50G), and it 
still fails. Do you think we should allocate more resources? Is there a way to 
run insert_drop_dup against a single partition to make it more efficient? We 
know that the data we are going to insert belongs to a single partition. 
   
   Thanks!
   Bindu


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

