Dear All,

I am using DeltaStreamer to stream data from a Kafka topic and write it into a Hudi dataset. This use case has no upserts; every write is an insert, so each ingestion job creates a new Parquet file. As a result, a large number of small files is accumulating. How can I merge these files from the DeltaStreamer job using the available configurations?
I think compactionSmallFileSize may be useful for this case, but I am not sure whether it applies to DeltaStreamer. I tried it in DeltaStreamer, but it did not work. Please assist with this and, if possible, give an example.

Thanks & Regards,
Rahul
