soumilshah1995 opened a new issue, #10644:
URL: https://github.com/apache/hudi/issues/10644

   Hello,
   
   I have a question about the Hudi snapshot exporter 
(HoodieSnapshotExporter). I've worked with it a bit, and I'd like to know 
whether it supports a SQL transformer.
   The idea is to use the export utility to export Hudi data, but some 
customers need to export only a filtered subset, for example all records for a 
specific stock such as AAPL.
   Are there plans to add a filtering mechanism to the Hudi exporter?
   Here's an example of the spark-submit command:
   
   ```
   
   spark-submit \
       --class org.apache.hudi.utilities.HoodieSnapshotExporter \
       --packages 'org.apache.hudi:hudi-spark3.4-bundle_2.12:0.14.0' \
       --master 'local[*]' \
       --executor-memory 1g \
       /Users/soumilshah/IdeaProjects/SparkProject/DeltaStreamer/jar/hudi-utilities-slim-bundle_2.12-0.14.0.jar \
       --source-base-path 'file:///Users/soumilshah/IdeaProjects/SparkProject/DeltaStreamer/hudi/bronze_orders' \
       --target-output-path 'file:///Users/soumilshah/IdeaProjects/SparkProject/DeltaStreamer/hudi/json/' \
       --output-format 'json'
   
   ```
   
   I tried these flags:
   ```
   
   --transformer-class org.apache.hudi.utilities.transform.SqlQueryBasedTransformer \
   --hoodie-conf hoodie.deltastreamer.transformer.sql='SELECT *, extract(year from order_date) as year, extract(month from order_date) as month FROM <SRC> a' \
   ```
   
   
   It looks like this is not supported by HoodieSnapshotExporter.
   Reference: https://hudi.apache.org/docs/snapshot_exporter
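
To make the request concrete, here is a minimal sketch of the behavior I have in mind, written as a plain PySpark job rather than as an exporter feature. It reads the Hudi table, registers it as a temporary view, runs a SQL template that uses a `<SRC>` placeholder (mirroring the transformer flag above), and writes the result as JSON. The function names `render_sql` and `export_filtered`, the generated view name, and the `symbol` column in the usage note are all hypothetical; only the standard Spark reader, SQL, and writer calls are assumed to exist.

```python
import uuid

# Placeholder token, mirroring the <SRC> convention used in the
# transformer SQL above.
SRC_PLACEHOLDER = "<SRC>"


def render_sql(sql_template: str, view_name: str) -> str:
    """Replace every <SRC> placeholder with a concrete temp-view name."""
    return sql_template.replace(SRC_PLACEHOLDER, view_name)


def export_filtered(spark, source_base_path, target_output_path, sql_template):
    """Hypothetical helper: export a SQL-filtered snapshot of a Hudi table as JSON.

    `spark` is an existing SparkSession with the Hudi Spark bundle on its
    classpath. This is not part of HoodieSnapshotExporter.
    """
    # Snapshot read of the Hudi table.
    df = spark.read.format("hudi").load(source_base_path)

    # Register the snapshot under a unique name so the SQL can refer to it.
    view_name = "hudi_src_" + uuid.uuid4().hex
    df.createOrReplaceTempView(view_name)

    # Apply the user-supplied filter or projection, then write JSON.
    result = spark.sql(render_sql(sql_template, view_name))
    result.write.mode("errorifexists").json(target_output_path)
```

With a (hypothetical) `symbol` column, the AAPL case would be `export_filtered(spark, source_base_path, target_output_path, "SELECT * FROM <SRC> a WHERE a.symbol = 'AAPL'")`. Having the exporter accept an equivalent SQL option would avoid maintaining a separate job like this.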
   

