alberttwong commented on issue #11797: URL: https://github.com/apache/hudi/issues/11797#issuecomment-2315947928
Raw updated commands:

```
# https://github.com/apache/spark/blob/v3.4.3/pom.xml#L125

cat /opt/demo/data/batch_1.json | kafkacat -b kafka:29092 -t stock_ticks -P

kafkacat -b kafka -L -J | jq .

spark-submit \
  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:0.15.0,org.apache.hudi:hudi-spark3.4-bundle_2.12:0.15.0,org.apache.hadoop:hadoop-aws:3.3.4,com.amazonaws:aws-java-sdk-bundle:1.12.262 \
  --class org.apache.hudi.utilities.streamer.HoodieStreamer org.apache.hudi_hudi-utilities-bundle_2.12-0.15.0.jar \
  --table-type COPY_ON_WRITE \
  --source-class org.apache.hudi.utilities.sources.JsonKafkaSource \
  --source-ordering-field ts \
  --target-base-path s3a://warehouse/stock_ticks_cow \
  --target-table stock_ticks_cow \
  --props file:///opt/demo/config/kafka-source.properties \
  --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider

spark-submit \
  --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:0.15.0,org.apache.hudi:hudi-spark3.4-bundle_2.12:0.15.0,org.apache.hadoop:hadoop-aws:3.3.4,com.amazonaws:aws-java-sdk-bundle:1.12.262 \
  --class org.apache.hudi.utilities.streamer.HoodieStreamer org.apache.hudi_hudi-utilities-bundle_2.12-0.15.0.jar \
  --table-type MERGE_ON_READ \
  --source-class org.apache.hudi.utilities.sources.JsonKafkaSource \
  --source-ordering-field ts \
  --target-base-path s3a://warehouse/stock_ticks_mor \
  --target-table stock_ticks_mor \
  --props file:///opt/demo/config/kafka-source.properties \
  --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider \
  --disable-compaction

spark-shell --packages org.apache.hudi:hudi-hive-sync-bundle:0.15.0,org.apache.thrift:libthrift:0.15.0

/opt/hudi/hudi-sync/hudi-hive-sync/run_sync_tool.sh \
  --metastore-uris 'thrift://hive-metastore:9083' \
  --partitioned-by dt \
  --base-path 's3a://warehouse/stock_ticks_cow' \
  --database default \
  --table stock_ticks_cow \
  --sync-mode hms \
  --partition-value-extractor org.apache.hudi.hive.SlashEncodedDayPartitionValueExtractor

/opt/hudi/hudi-sync/hudi-hive-sync/run_sync_tool.sh \
  --metastore-uris 'thrift://hive-metastore:9083' \
  --partitioned-by dt \
  --base-path 's3a://warehouse/stock_ticks_mor' \
  --database default \
  --table stock_ticks_mor \
  --sync-mode hms \
  --partition-value-extractor org.apache.hudi.hive.SlashEncodedDayPartitionValueExtractor

spark-sql --packages org.apache.hudi:hudi-utilities-slim-bundle_2.12:0.15.0,org.apache.hudi:hudi-spark3.4-bundle_2.12:0.15.0,org.apache.hadoop:hadoop-aws:3.3.4,com.amazonaws:aws-java-sdk-bundle:1.12.262
```
