soumilshah1995 opened a new issue, #11258:
URL: https://github.com/apache/hudi/issues/11258
Here is Delta Streamer
```
spark-submit \
--class org.apache.hudi.utilities.streamer.HoodieStreamer \
--packages org.apache.hudi:hudi-spark3.4-bundle_2.12:0.14.0 \
--properties-file spark-config.properties \
--master 'local[*]' \
--executor-memory 1g \
/Users/soumilshah/IdeaProjects/SparkProject/apache-hudi-delta-streamer-labs/E1/jar/hudi-utilities-slim-bundle_2.12-0.14.0.jar
\
--table-type COPY_ON_WRITE \
--op UPSERT \
--transformer-class
org.apache.hudi.utilities.transform.SqlQueryBasedTransformer \
--source-ordering-field replicadmstimestamp \
--source-class org.apache.hudi.utilities.sources.ParquetDFSSource \
--target-base-path
file:///Users/soumilshah/IdeaProjects/SparkProject/apache-hudi-delta-streamer-labs/E1/silver/
\
--target-table invoice \
--props hudi_tbl.props
```
# Hudi prop
```
hoodie.streamer.source.hoodieincr.missing.checkpoint.strategy=READ_UPTO_LATEST_COMMIT
hoodie.streamer.source.hoodieincr.path=s3a://warehouse/default/table_name=orders
hoodie.datasource.write.recordkey.field=order_id
hoodie.datasource.write.partitionpath.field=
hoodie.datasource.write.precombine.field=ts
```
Tried following options
```
hoodie.streamer.transformer.sql.file=join.sql
OR
hoodie.streamer.transformer.sql.file=file:///Users/soumilshah/IdeaProjects/SparkProject/deltastreamerBroadcastJoins/join.sql
OR
hoodie.streamer.transformer.sql.file=/Users/soumilshah/IdeaProjects/SparkProject/deltastreamerBroadcastJoins/join.sql
```
# Error Message
```
FO BaseHoodieTableFileIndex: Refresh table orders, spent: 15 ms
24/05/19 12:38:37 ERROR HoodieStreamer: Shutting down delta-sync due to
exception
java.lang.IllegalArgumentException: Property hoodie.streamer.transformer.sql
not found
at
org.apache.hudi.common.util.ConfigUtils.getStringWithAltKeys(ConfigUtils.java:334)
at
org.apache.hudi.common.util.ConfigUtils.getStringWithAltKeys(ConfigUtils.java:308)
at
org.apache.hudi.utilities.transform.SqlQueryBasedTransformer.apply(SqlQueryBasedTransformer.java:52)
at
org.apache.hudi.utilities.transform.ChainedTransformer.apply(ChainedTransformer.java:105)
at
org.apache.hudi.utilities.streamer.StreamSync.lambda$fetchFromSource$0(StreamSync.java:530)
at org.apache.hudi.common.util.Option.map(Option.java:108)
at
org.apache.hudi.utilities.streamer.StreamSync.fetchFromSource(StreamSync.java:530)
at
org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:495)
at
org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:405)
at
org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.lambda$startService$1(HoodieStreamer.java:757)
at
java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
24/05/19 12:38:37 INFO HoodieStreamer: Delta Sync shutdown. Error ?true
24/05/19 12:38:37 INFO HoodieStreamer: Ingestion completed. Has error: true
24/05/19 12:38:37 INFO StreamSync: Shutting down embedded timeline server
24/05/19 12:38:37 ERROR HoodieAsyncService: Service shutdown with error
java.util.concurrent.ExecutionException:
org.apache.hudi.exception.HoodieException: Property
hoodie.streamer.transformer.sql not found
at
java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
at
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2005)
at
org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:103)
at
org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:65)
at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
at
org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:205)
at
org.apache.hudi.utilities.streamer.HoodieStreamer.main(HoodieStreamer.java:584)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020)
at
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
at
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.hudi.exception.HoodieException: Property
hoodie.streamer.transformer.sql not found
at
org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.lambda$startService$1(HoodieStreamer.java:796)
at
java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.IllegalArgumentException: Property
hoodie.streamer.transformer.sql not found
at
org.apache.hudi.common.util.ConfigUtils.getStringWithAltKeys(ConfigUtils.java:334)
at
org.apache.hudi.common.util.ConfigUtils.getStringWithAltKeys(ConfigUtils.java:308)
at
org.apache.hudi.utilities.transform.SqlQueryBasedTransformer.apply(SqlQueryBasedTransformer.java:52)
at
org.apache.hudi.utilities.transform.ChainedTransformer.apply(ChainedTransformer.java:105)
at
org.apache.hudi.utilities.streamer.StreamSync.lambda$fetchFromSource$0(StreamSync.java:530)
at org.apache.hudi.common.util.Option.map(Option.java:108)
at
org.apache.hudi.utilities.streamer.StreamSync.fetchFromSource(StreamSync.java:530)
at
org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:495)
at
org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:405)
at
org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.lambda$startService$1(HoodieStreamer.java:757)
... 4 more
24/05/19 12:38:37 INFO SparkContext: SparkContext is stopping with exitCode
0.
24/05/19 12:38:37 INFO SparkUI: Stopped Spark web UI at
http://soumils-mbp:8090
24/05/19 12:38:37 INFO MapOutputTrackerMasterEndpoint:
MapOutputTrackerMasterEndpoint stopped!
24/05/19 12:38:37 INFO MemoryStore: MemoryStore cleared
24/05/19 12:38:37 INFO BlockManager: BlockManager stopped
24/05/19 12:38:37 INFO BlockManagerMaster: BlockManagerMaster stopped
24/05/19 12:38:37 INFO
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:
OutputCommitCoordinator stopped!
24/05/19 12:38:37 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main"
org.apache.hudi.utilities.ingestion.HoodieIngestionException: Ingestion service
was shut down with exception.
at
org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:67)
at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
at
org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:205)
at
org.apache.hudi.utilities.streamer.HoodieStreamer.main(HoodieStreamer.java:584)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020)
at org.apache.spark.de
```
Delta Streamer works fine
```
hoodie.streamer.transformer.sql=SELECT a.customer_id, c.name AS
customer_name, c.state AS state, c.city, c.email, a.order_id, a.name AS
order_name, a.order_value, a.priority, a.order_date, a.ts FROM <SRC> a JOIN
hudi_db.customers c ON c.customer_id = a.customer_id
```
tried all option it fails when I am proving .sql file

Here is my Joined.sql
```
SELECT
a.customer_id,
c.name AS customer_name,
c.state AS state,
c.city,
c.email,
a.order_id,
a.name AS order_name,
a.order_value,
a.priority,
a.order_date,
a.ts
FROM
<SRC> a
JOIN
hudi_db.customers c
ON
c.customer_id = a.customer_id;
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]