ashishmgofficial edited a comment on issue #2149: URL: https://github.com/apache/hudi/issues/2149#issuecomment-706057675
@bvaradar Please find the details:

```
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  --jars s3://xxxx/hudi/jars/hudi-spark-bundle_2.11-0.6.1-SNAPSHOT.jar \
  --packages org.apache.spark:spark-avro_2.11:2.4.4,org.apache.hadoop:hadoop-aws:2.7.3 \
  --master yarn \
  --deploy-mode client \
  s3://xxxx/hudi/jars/hudi-utilities-bundle_2.11-0.6.1-SNAPSHOT.jar \
  --table-type COPY_ON_WRITE \
  --source-ordering-field last_modified_ts \
  --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
  --target-base-path s3a://xxxx/warehouse/hudi_dms_acc_kafka \
  --target-table hudi_dms_acc_kafka \
  --props s3://xxxx/hudi/conf/hudi-kafka.properties \
  --schemaprovider-class org.apache.hudi.utilities.schema.DebeziumRegistryProvider \
  --payload-class org.apache.hudi.common.model.DebeziumAvroPayload \
  --transformer-class org.apache.hudi.utilities.transform.DebeziumCustomTransformer
```

hudi-kafka.properties

```
hoodie.upsert.shuffle.parallelism=10
hoodie.insert.shuffle.parallelism=10
hoodie.delete.shuffle.parallelism=10
hoodie.bulkinsert.shuffle.parallelism=10
hoodie.embed.timeline.server=true
hoodie.filesystem.view.type=EMBEDDED_KV_STORE
hoodie.compact.inline=false

# Key fields, for kafka example
hoodie.datasource.write.recordkey.field=inc_id
hoodie.datasource.write.precombine.field=last_modified_ts
hoodie.datasource.write.partitionpath.field=violation_code
hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.NonpartitionedKeyGenerator

# Schema provider props (change to absolute path based on your installation)
#hoodie.deltastreamer.schemaprovider.source.schema.file=/var/demo/config/schema.avsc
#hoodie.deltastreamer.schemaprovider.target.schema.file=/var/demo/config/schema.avsc

# Kafka Source
hoodie.deltastreamer.source.kafka.topic=airflow.public.motor_crash_violation_incidents

#Kafka props
bootstrap.servers=http://xxxx:29092
auto.offset.reset=earliest
hoodie.deltastreamer.schemaprovider.registry.url=http://xxxx:8081/subjects/airflow.public.motor_crash_violation_incidents-value/versions/latest
#hoodie.deltastreamer.schemaprovider.registry.targetUrl=http://xxxx:8081/subjects/airflow.public.motor_crash_violation_incidents-value/versions/latest
schema.registry.url=http://xxxx:8081
validate.non.null = false
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
