soumilshah1995 commented on issue #10499:
URL: https://github.com/apache/hudi/issues/10499#issuecomment-1892837784

   The following works:
   
   ```
   spark-submit \
       --class org.apache.hudi.utilities.streamer.HoodieStreamer \
       --packages org.apache.hudi:hudi-spark3.4-bundle_2.12:0.14.0 \
       --properties-file spark-config.properties \
       --master 'local[*]' \
       --executor-memory 1g \
       /Users/soumilshah/IdeaProjects/SparkProject/apache-hudi-delta-streamer-labs/E11/jar/hudi-utilities-slim-bundle_2.12-0.14.0.jar \
       --table-type COPY_ON_WRITE \
       --op UPSERT \
       --source-ordering-field ts \
       --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
       --target-base-path file:///Users/soumilshah/IdeaProjects/SparkProject/apache-hudi-delta-streamer-labs/E11/hudidb/orders \
       --target-table orders \
       --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
       --transformer-class org.apache.hudi.utilities.transform.SqlQueryBasedTransformer \
       --props hudi_tbl.props
   ```
   
   #### hudi_tbl.props
   ```
   hoodie.datasource.write.recordkey.field=order_id
   hoodie.datasource.write.partitionpath.field=order_date
   hoodie.datasource.write.precombine.field=ts
   
   bootstrap.servers=localhost:7092
   auto.offset.reset=earliest
   
   hoodie.streamer.source.kafka.topic=orders_complex
   
   hoodie.streamer.source.kafka.value.deserializer.class=org.apache.hudi.utilities.deser.KafkaAvroSchemaDeserializer
   
   schema.registry.url=http://localhost:8081/
   hoodie.streamer.schemaprovider.registry.schemaconverter=
   
   hoodie.streamer.schemaprovider.registry.url=http://localhost:8081/subjects/orders_complex-value/versions/latest
   
   hoodie.streamer.transformer.sql=SELECT * FROM <SRC> a
   
   
   ```
   
   ### Schema of the Hudi table created
   ```
   
   root
    |-- _hoodie_commit_time: string (nullable = true)
    |-- _hoodie_commit_seqno: string (nullable = true)
    |-- _hoodie_record_key: string (nullable = true)
    |-- _hoodie_partition_path: string (nullable = true)
    |-- _hoodie_file_name: string (nullable = true)
    |-- order_id: string (nullable = false)
    |-- name: string (nullable = false)
    |-- order_value: string (nullable = false)
    |-- priority: string (nullable = false)
    |-- ts: string (nullable = false)
    |-- customer: struct (nullable = false)
    |    |-- customer_id: string (nullable = false)
    |-- order_date: string (nullable = false)
   
   
   ```
   
   I would love to learn how to use the flatten transformer :D
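
In the meantime, the nested `customer` struct could probably be flattened by hand through the SQL transformer already used above. The sketch below is only an illustration and has not been run against this setup: it walks a schema dict that mirrors the printed table schema (Hudi meta columns omitted) and builds a `SELECT` that could be pasted into `hoodie.streamer.transformer.sql`. The helper names `flatten_columns` and `build_transformer_sql` are hypothetical, and joining nested names with an underscore is an assumed naming choice.

```python
# Sketch: generate a flattening SELECT for hoodie.streamer.transformer.sql.
# SCHEMA mirrors the table schema printed above; nested structs are dicts.
# Helper names are hypothetical, not part of Hudi.

SCHEMA = {
    "order_id": "string",
    "name": "string",
    "order_value": "string",
    "priority": "string",
    "ts": "string",
    "customer": {"customer_id": "string"},
    "order_date": "string",
}


def flatten_columns(schema, prefix=None):
    """Return one 'a.b AS a_b' select expression per leaf field."""
    exprs = []
    for name, dtype in schema.items():
        path = name if prefix is None else f"{prefix}.{name}"
        if isinstance(dtype, dict):
            # Nested struct: recurse and carry the dotted path down.
            exprs.extend(flatten_columns(dtype, path))
        else:
            # Leaf field: alias the dotted path to an underscore-joined name.
            exprs.append(f"{path} AS {path.replace('.', '_')}")
    return exprs


def build_transformer_sql(schema, src="<SRC>"):
    """Build the SELECT statement; <SRC> is the streamer's source placeholder."""
    return f"SELECT {', '.join(flatten_columns(schema))} FROM {src}"


if __name__ == "__main__":
    print("hoodie.streamer.transformer.sql=" + build_transformer_sql(SCHEMA))
```

With this query, `customer.customer_id` would come out as a top-level `customer_customer_id` column. Hudi also ships `org.apache.hudi.utilities.transform.FlatteningTransformer`, which should be usable via `--transformer-class`, but I have not tried it here.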
   

