maheshguptags commented on issue #10609:
URL: https://github.com/apache/hudi/issues/10609#issuecomment-2167275346

   Hi @michael1991, thank you for solving this. I can now run the DeltaStreamer 
with RLI. Out of curiosity, how did you figure out that we need to pass the jar 
in extraPath?
   ```
   spark/bin/spark-submit \
   --name customer-event-hudideltaStream \
   --num-executors 10 \
   --executor-memory 2g \
   --driver-memory 3g \
   --packages org.apache.hadoop:hadoop-aws:3.3.4 \
   --jars /home/mahesh.gupta/aws-msk-iam-auth-1.1.9-all.jar \
   --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer /home/mahesh.gupta/hudi-utilities-bundle_2.12-0.14.1.jar \
   --checkpoint s3a://cdp-offline-store-perf2/checkpointing/eks/sparkhudipoc/hudistream_rli_4 \
   --target-base-path s3a://cdp-offline-store-perf2/customer_event_temp_hudi_delta/ \
   --target-table customer_event_temp \
   --table-type COPY_ON_WRITE \
   --base-file-format PARQUET \
   --props /home/mahesh.gupta/deltaHoodie.properties \
   --source-class org.apache.hudi.utilities.sources.JsonKafkaSource \
   --source-ordering-field updated_date \
   --payload-class org.apache.hudi.common.model.DefaultHoodieRecordPayload \
   --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider \
   --hoodie-conf hoodie.streamer.schemaprovider.source.schema.file=/home/mahesh.gupta/source.avsc \
   --hoodie-conf hoodie.streamer.schemaprovider.target.schema.file=/home/mahesh.gupta/source.avsc \
   --op UPSERT \
   --hoodie-conf hoodie.streamer.source.kafka.topic=cdp_track_temp_perf \
   --hoodie-conf hoodie.datasource.write.partitionpath.field=client_id \
   --continuous
   ```
   @ad1happy2go I will need some help with memory tuning for the DeltaStreamer. 
Please let me know if there is any doc for it.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.