Akshay2Agarwal opened a new issue #2913:
URL: https://github.com/apache/hudi/issues/2913
Is HiveServer2 strictly required for running Hive sync? I tried setting
`hoodie.datasource.hive_sync.jdbcurl` to the metastore's MySQL JDBC URL, and
syncing the Hudi table failed with a SQL syntax error.
**To Reproduce**
Steps to reproduce the behavior:
1.
```
spark-shell \
  --packages org.apache.hudi:hudi-spark-bundle_2.11:0.8.0,org.apache.spark:spark-avro_2.11:2.4.4 \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
```
2.
```
val df = Seq(
  (1, 213213213, "2014/01/01"),
  (2, 343432434, "2014/11/30"),
  (3, 343242323, "2016/12/29"),
  (4, 344234242, "2016/05/09")
).toDF("typeId", "eventTime", "partition")
```
3.
```
// Imports from the Hudi quickstart that the option keys below rely on
import org.apache.hudi.QuickstartUtils._
import scala.collection.JavaConversions._
import org.apache.spark.sql.SaveMode._
import org.apache.hudi.DataSourceWriteOptions._
import org.apache.hudi.config.HoodieWriteConfig._

df.write.format("hudi").
  options(getQuickstartWriteConfigs).
  option(PRECOMBINE_FIELD_OPT_KEY, "eventTime").
  option(RECORDKEY_FIELD_OPT_KEY, "typeId").
  option(PARTITIONPATH_FIELD_OPT_KEY, "partition").
  option(HIVE_PARTITION_FIELDS_OPT_KEY, "partition").
  option(HIVE_STYLE_PARTITIONING_OPT_KEY, false).
  option(HIVE_SYNC_ENABLED_OPT_KEY, true).
  option(HIVE_TABLE_OPT_KEY, "hive_test_data").
  option(HIVE_USER_OPT_KEY, "hive").
  option(HIVE_PASS_OPT_KEY, "XXXXXXX").
  option(HIVE_URL_OPT_KEY, "jdbc:mysql://XXXXXXXX.compute.internal:3306").
  option(TABLE_NAME, "hudi_events_test").
  mode(Overwrite).
  save("s3a://XXXXXXXX/test-lake-data/hudi_events_test/")
```
**Expected behavior**
I expected the Hudi table to be synced to the Hive metastore.
**Environment Description**
* Hudi version : 0.8.0
* Spark version : 2.4.4
* Hive version : 2
* Hadoop version : 2
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : no
**Stacktrace**
```
Caused by: java.sql.SQLSyntaxErrorException: (conn=47) You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'EXTERNAL TABLE IF NOT EXISTS `default`.`hive_test_data`( `_hoodie_commit_time` ' at line 1
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.get(ExceptionMapper.java:243)
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.getException(ExceptionMapper.java:164)
    at org.mariadb.jdbc.MariaDbStatement.executeExceptionEpilogue(MariaDbStatement.java:258)
    at org.mariadb.jdbc.MariaDbStatement.executeInternal(MariaDbStatement.java:349)
    at org.mariadb.jdbc.MariaDbStatement.execute(MariaDbStatement.java:484)
    at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQL(HoodieHiveClient.java:367)
    ... 110 more
Caused by: java.sql.SQLException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'EXTERNAL TABLE IF NOT EXISTS `default`.`hive_test_data`( `_hoodie_commit_time` ' at line 1
Query is: CREATE EXTERNAL TABLE IF NOT EXISTS `default`.`hive_test_data`( `_hoodie_commit_time` string, `_hoodie_commit_seqno` string, `_hoodie_record_key` string, `_hoodie_partition_path` string, `_hoodie_file_name` string, `typeId` int, `eventTime` int) PARTITIONED BY (`partition` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT 'org.apache.hudi.hadoop.HoodieParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' LOCATION 's3a://XXXXXXXXX/test-lake-data/hudi_events_test'
java thread: main
    at org.mariadb.jdbc.internal.util.LogQueryTool.exceptionWithQuery(LogQueryTool.java:134)
    at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.executeQuery(AbstractQueryProtocol.java:184)
    at org.mariadb.jdbc.MariaDbStatement.executeInternal(MariaDbStatement.java:343)
    ... 112 more
```
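
The trace shows the HiveQL `CREATE EXTERNAL TABLE` statement being executed through the MariaDB JDBC driver, which is consistent with the `jdbc:mysql://` URL from step 3 being used as the sync endpoint. As a rough sketch only, a pre-flight check like the one below could catch this before the write. `HiveSyncUrlCheck` and `isHiveServer2Url` are hypothetical names for illustration and are not part of Hudi; the only fact assumed about Hudi is that the documented default for `hoodie.datasource.hive_sync.jdbcurl` is `jdbc:hive2://localhost:10000`.

```scala
// Hypothetical helper, not part of Hudi: checks whether a JDBC URL uses the
// HiveServer2 scheme ("jdbc:hive2://") rather than pointing at the database
// that backs the metastore.
object HiveSyncUrlCheck {
  def isHiveServer2Url(url: String): Boolean =
    url.trim.toLowerCase.startsWith("jdbc:hive2://")

  def main(args: Array[String]): Unit = {
    // URL used in step 3 of this report (host redacted)
    val mysqlUrl = "jdbc:mysql://XXXXXXXX.compute.internal:3306"
    // Documented default value of hoodie.datasource.hive_sync.jdbcurl
    val hiveUrl = "jdbc:hive2://localhost:10000"

    Seq(mysqlUrl, hiveUrl).foreach { url =>
      val verdict = if (isHiveServer2Url(url)) "HiveServer2 URL" else "not a HiveServer2 URL"
      println(s"$url -> $verdict")
    }
  }
}
```

The check only looks at the URL scheme; it does not verify that a HiveServer2 instance is actually reachable at that address.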