soumilshah1995 opened a new issue, #10644:
URL: https://github.com/apache/hudi/issues/10644
Hello,
I have a question about the Hudi snapshot exporter (HoodieSnapshotExporter).
I have some experience working with it, and I would like to know whether it
supports being combined with the SQL transformer.
The idea is to use the Hudi export utility to export Hudi data. However,
customers sometimes need to export only a filtered subset of the data, for
instance all records related to a specific stock such as AAPL.
Do we have plans to incorporate a filtering mechanism into the Hudi exporter?
Here is the spark-submit command I am using:
```
spark-submit \
--class org.apache.hudi.utilities.HoodieSnapshotExporter \
--packages 'org.apache.hudi:hudi-spark3.4-bundle_2.12:0.14.0' \
--master 'local[*]' \
--executor-memory 1g \
/Users/soumilshah/IdeaProjects/SparkProject/DeltaStreamer/jar/hudi-utilities-slim-bundle_2.12-0.14.0.jar \
--source-base-path 'file:///Users/soumilshah/IdeaProjects/SparkProject/DeltaStreamer/hudi/bronze_orders' \
--target-output-path 'file:///Users/soumilshah/IdeaProjects/SparkProject/DeltaStreamer/hudi/json/' \
--output-format 'json'
```
I tried adding these flags:
```
--transformer-class org.apache.hudi.utilities.transform.SqlQueryBasedTransformer \
--hoodie-conf hoodie.deltastreamer.transformer.sql='SELECT *, extract(year from order_date) as year, extract(month from order_date) as month FROM <SRC> a'
```
It looks like these flags are not supported by HoodieSnapshotExporter.
Reference: https://hudi.apache.org/docs/snapshot_exporter
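
For context, here is a rough sketch of the behaviour I am after, written as a plain PySpark job rather than with the exporter. It is only an illustration under assumptions: the function names `render_sql` and `export_filtered` are made up for this example and are not part of Hudi, and the `<SRC>` placeholder is simply replaced with a temporary view name, which is my understanding of what the SQL transformer does. It also assumes a SparkSession that already has the Hudi bundle on its classpath.

```python
import uuid


def render_sql(sql_template: str, view_name: str) -> str:
    """Replace the <SRC> placeholder with a concrete temp view name.

    Hypothetical helper for this sketch, not a Hudi API.
    """
    if "<SRC>" not in sql_template:
        raise ValueError("query must reference the source table as <SRC>")
    return sql_template.replace("<SRC>", view_name)


def export_filtered(spark, source_base_path: str,
                    target_output_path: str, sql_template: str) -> None:
    """Read a Hudi table snapshot, apply a SQL query, write the result as JSON.

    Hypothetical helper: `spark` is assumed to be a SparkSession started
    with the Hudi Spark bundle available.
    """
    # Unique view name so concurrent runs in one session do not collide.
    view_name = "hudi_src_" + uuid.uuid4().hex
    df = spark.read.format("hudi").load(source_base_path)
    df.createOrReplaceTempView(view_name)
    try:
        result = spark.sql(render_sql(sql_template, view_name))
        result.write.mode("overwrite").json(target_output_path)
    finally:
        # Clean up the temp view even if the query or the write fails.
        spark.catalog.dropTempView(view_name)


# Example call (paths and query taken from the commands above):
# export_filtered(
#     spark,
#     "file:///Users/soumilshah/IdeaProjects/SparkProject/DeltaStreamer/hudi/bronze_orders",
#     "file:///Users/soumilshah/IdeaProjects/SparkProject/DeltaStreamer/hudi/json/",
#     "SELECT *, extract(year from order_date) as year, "
#     "extract(month from order_date) as month FROM <SRC> a",
# )
```

A filter would then just be a WHERE clause in the query. Having the same thing available directly in HoodieSnapshotExporter would avoid maintaining a separate job like this.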