bhushanamk edited a comment on issue #2294:
URL: https://github.com/apache/hudi/issues/2294#issuecomment-738542767


   @bvaradar 
   Below are the configs used to submit the job.
   Props file:
   hoodie.datasource.write.recordkey.field=id
   hoodie.datasource.write.partitionpath.field=
   hoodie.deltastreamer.source.dfs.root=s3://XXX-4-rawzone/mysql/marketplace/spree_variants
   hoodie.datasource.hive_sync.enable=true
   hoodie.datasource.hive_sync.database=marketplace
   hoodie.datasource.hive_sync.table=spree_variants
   hoodie.datasource.hive_sync.username=
   hoodie.datasource.hive_sync.password=
   hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://localhost:10000
   hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.NonPartitionedExtractor
   hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.NonpartitionedKeyGenerator
   hoodie.upsert.shuffle.parallelism=3
   hoodie.consistency.check.enabled=true
   hoodie.copyonwrite.record.size.estimate=100000
   #hoodie.compact.inline=false
   hoodie.parquet.max.file.size=128MB
   hoodie.parquet.small.file.limit=100MB
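
   For reference, a rough Scala sketch (not part of the actual job) of how the same record-key / non-partitioned keygen settings would look as Spark DataSource write options; the base path and the toy DataFrame below are placeholders:

   // Sketch only: the key-related props above expressed as Spark DataSource write options.
   // basePath and the toy DataFrame are placeholders, not taken from the real pipeline.
   import org.apache.spark.sql.{SaveMode, SparkSession}

   val spark = SparkSession.builder().appName("hudi-write-sketch").getOrCreate()
   val df = spark.range(3).selectExpr("id", "current_timestamp() AS cdc_ts")
   val basePath = "s3://<bucket>/prod/mysql/marketplace/spree_variants"

   df.write.format("org.apache.hudi")
     .option("hoodie.table.name", "spree_variants")
     .option("hoodie.datasource.write.operation", "upsert")
     .option("hoodie.datasource.write.recordkey.field", "id")
     .option("hoodie.datasource.write.partitionpath.field", "")
     .option("hoodie.datasource.write.precombine.field", "cdc_ts")
     .option("hoodie.datasource.write.keygenerator.class", "org.apache.hudi.keygen.NonpartitionedKeyGenerator")
     .option("hoodie.upsert.shuffle.parallelism", "3")
     .mode(SaveMode.Append)
     .save(basePath)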
   
   
   
   
   Spark submit: 
   
   spark-submit \
     --num-executors 1 --executor-memory 2g --driver-memory 2g \
     --conf spark.executor.memoryOverhead=1000 \
     --conf spark.driver.memoryOverhead=1000 \
     --conf spark.dynamicAllocation.enabled=false \
     --conf spark.yarn.submit.waitAppCompletion=false \
     --conf spark.task.cpus=1 \
     --conf spark.executor.cores=1 \
     --conf spark.task.maxFailures=10 \
     --conf spark.memory.fraction=0.4 \
     --conf spark.rdd.compress=true \
     --conf spark.kryoserializer.buffer.max=512m \
     --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
     --conf spark.sql.hive.convertMetastoreParquet=false \
     --jars /home/hadoop/hudi-spark-bundle_2.11-0.6.0.jar,/usr/lib/spark/external/lib/spark-avro.jar,/home/hadoop/locuz-module-1.0.jar \
     --deploy-mode client \
     --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
     /home/hadoop/hudi-utilities-bundle_2.11-0.6.0.jar \
     --table-type COPY_ON_WRITE \
     --source-ordering-field cdc_ts \
     --source-class org.apache.hudi.utilities.sources.ParquetDFSSource \
     --target-base-path s3://xxxx-dataplatform-4/prod/mysql/marketplace/spree_variants \
     --target-table spree_variants \
     --enable-hive-sync \
     --props s3://shopx-dataplatform-4/properties/spree_variants.properties \
     --transformer-class org.apache.hudi.utilities.transform.AWSDmsTransformer \
     --payload-class org.apache.hudi.payload.AWSDmsAvroPayload \
     --source-limit 1073741824 \
     --min-sync-interval-seconds 300 \
     --continuous
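
   As a quick sanity check after a commit, a minimal Scala sketch (assumptions: spark-shell started with the same hudi-spark-bundle on the classpath and --conf spark.sql.hive.convertMetastoreParquet=false, and Hive sync has created marketplace.spree_variants as configured above):

   // Check the latest Hudi commit time and row count through the Hive-synced table.
   import org.apache.spark.sql.SparkSession

   val spark = SparkSession.builder()
     .appName("spree_variants-check")
     .enableHiveSupport()
     .getOrCreate()

   spark.sql(
     "SELECT max(_hoodie_commit_time) AS last_commit, count(*) AS cnt " +
     "FROM marketplace.spree_variants"
   ).show(false)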

