soumilshah1995 opened a new issue, #10287:
URL: https://github.com/apache/hudi/issues/10287
Hey community,
I hope you're doing well. I recently launched a Thrift server using Spark,
incorporating the Hudi library. The server runs smoothly, and I can interact
with it using Beeline to query data successfully.
```
spark-submit \
--master 'local[*]' \
--conf spark.executor.extraJavaOptions=-Duser.timezone=Etc/UTC \
--conf spark.eventLog.enabled=false \
--conf spark.sql.warehouse.dir=file:///Users/soumilshah/Desktop/soumil/sparkwarehouse \
--packages 'org.apache.hudi:hudi-spark3-bundle_2.12:0.14.0,org.apache.spark:spark-sql_2.12:3.4.0,org.apache.spark:spark-hive_2.12:3.4.0' \
--class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 \
--name "Thrift JDBC/ODBC Server" \
--executor-memory 512m \
--conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
--conf spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension \
--conf spark.kryo.registrator=org.apache.spark.HoodieSparkKryoRegistrar
```
I then connected with Beeline:
```
beeline -u jdbc:hive2://localhost:10000/default
```
In Beeline, I created a Hudi table:
```
CREATE TABLE hudi_table (
ts BIGINT,
uuid STRING,
rider STRING,
driver STRING,
fare DOUBLE,
city STRING
) USING HUDI
PARTITIONED BY (city);
```
This works fine. I then inserted data:
```
INSERT INTO hudi_table
VALUES
(1695159649087,'334e26e9-8355-45cc-97c6-c31daf0df330','rider-A','driver-K',19.10,'san_francisco'),
(1695091554788,'e96c4396-3fad-413a-a942-4cb36106d721','rider-C','driver-M',27.70,'san_francisco'),
(1695046462179,'9909a8b1-2d15-4d3d-8ec9-efc48c536a00','rider-D','driver-L',33.90,'san_francisco'),
(1695332066204,'1dced545-862b-4ceb-8b43-d2a568f6616b','rider-E','driver-O',93.50,'san_francisco'),
(1695516137016,'e3cf430c-889d-4015-bc98-59bdce1e530c','rider-F','driver-P',34.15,'sao_paulo'),
(1695376420876,'7a84095f-737f-40bc-b62f-6b69664712d2','rider-G','driver-Q',43.40,'sao_paulo'),
(1695173887231,'3eeb61f7-c2b0-4636-99bd-5d7a5a1d2c04','rider-I','driver-S',41.06,'chennai'),
(1695115999911,'c8abbe79-8d89-47ea-b4ce-4d224bae5bfa','rider-J','driver-T',17.85,'chennai');
```
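A read-back from the command line can confirm that the rows are visible through the Thrift server before a different client is tried. The snippet below is only a sketch: the column list and the `WHERE` filter are illustrative choices, the JDBC URL is the one from the Beeline command above, and the fallback message for a missing `beeline` binary is an addition for convenience.

```shell
#!/usr/bin/env bash
# Sketch: run one query non-interactively against the same Thrift server.
# JDBC_URL matches the Beeline connection string used earlier in this issue.
JDBC_URL="jdbc:hive2://localhost:10000/default"

# Illustrative query (an assumption, not taken from the original session).
QUERY="SELECT uuid, rider, driver, fare, city FROM hudi_table WHERE city = 'san_francisco'"

if command -v beeline >/dev/null 2>&1; then
  # -u sets the JDBC URL, -e executes a single statement and exits.
  beeline -u "$JDBC_URL" -e "$QUERY"
else
  # beeline is not installed here; show the command that would be run.
  echo "beeline not found on PATH; would run: beeline -u $JDBC_URL -e \"$QUERY\""
fi
```

Running the same statement from the other client against the same host and port gives a like-for-like comparison.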
Now, when I connect with DBT or DBeaver and run a SQL query against the table,
I see the following error:
```
SQL Error: org.apache.hive.service.cli.HiveSQLException: Error running query:
java.util.concurrent.ExecutionException:
org.apache.spark.SparkClassNotFoundException: [DATA_SOURCE_NOT_FOUND] Failed to
find the data source: hudi. Please find packages at
`https://spark.apache.org/third-party-projects.html`.
```
To summarize: I can create a table and insert data into it through Beeline
without problems. The error appears only when I query the Hudi table from
tools such as DBT or DBeaver.
Any insights or guidance on resolving this issue would be greatly
appreciated! If you have any experience with integrating Hudi into Spark Thrift
Server and overcoming similar challenges, your expertise would be invaluable.
Thanks in advance for your help!
Regards
Soumil