xushiyan commented on issue #6808: URL: https://github.com/apache/hudi/issues/6808#issuecomment-1324172279
@schlichtanders the Derby URL does not follow the pattern specified here: https://db.apache.org/derby/docs/10.14/ref/rrefjdbc10889.html. If you use a named attribute like `databaseName=xxx`, it should go after `;`. Otherwise, it should just be `jdbc:derby:memory:default;create=true`.

I used the settings below to test in-memory Derby, which is working.

`hive-site.xml`:

```xml
<configuration>
  <property>
    <name>system:user.name</name>
    <value>${user.name}</value>
  </property>
  <property>
    <name>system:java.io.tmpdir</name>
    <value>file:///tmp/hudi-bundles/hive/java</value>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>file:///tmp/hudi-bundles/hive/exec</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>file:///tmp/hudi-bundles/hive/warehouse</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
  </property>
  <property>
    <name>datanucleus.schema.autoCreateAll</name>
    <value>true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:memory:default;create=true</value>
  </property>
</configuration>
```

Also copy it to Spark:

```sh
cp $HIVE_HOME/conf/hive-site.xml $SPARK_HOME/conf/hive-site.xml
```

Then start the HMS:

```sh
$HIVE_HOME/bin/hive --service metastore
```

Then start spark-shell:

```sh
spark-shell --jars hudi-spark3.1-bundle_2.12-0.13.0-SNAPSHOT.jar \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  --conf spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension
```

Run the quickstart example:

```scala
import org.apache.hudi.QuickstartUtils._
import scala.collection.JavaConversions._
import org.apache.spark.sql.SaveMode._
import org.apache.hudi.DataSourceReadOptions._
import org.apache.hudi.DataSourceWriteOptions._
import org.apache.hudi.config.HoodieWriteConfig._
import org.apache.hudi.common.model.HoodieRecord

val expected = 10
val database = "default"
val tableName = "trips"
val basePath = "file:///tmp/hudi-bundles/tests/" + tableName
val dataGen = new DataGenerator
val inserts = convertToStringList(dataGen.generateInserts(expected))
val df = spark.read.json(spark.sparkContext.parallelize(inserts, 2))

df.write.format("hudi").
  options(getQuickstartWriteConfigs).
  option(PRECOMBINE_FIELD_OPT_KEY, "ts").
  option(RECORDKEY_FIELD_OPT_KEY, "uuid").
  option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").
  option(TABLE_NAME, tableName).
  option("hoodie.datasource.meta.sync.enable", "true").
  option("hoodie.datasource.hive_sync.database", database).
  option("hoodie.datasource.hive_sync.table", tableName).
  option("hoodie.datasource.hive_sync.partition_extractor_class", "org.apache.hudi.hive.SinglePartPartitionValueExtractor").
  option("hoodie.datasource.hive_sync.mode", "hms").
  option("hoodie.datasource.hive_sync.metastore.uris", "thrift://localhost:9083/").
  mode(Overwrite).
  save(basePath)

spark.sql("desc " + tableName).show
```

```
+--------------------+---------+-------+
|            col_name|data_type|comment|
+--------------------+---------+-------+
| _hoodie_commit_time|   string|   null|
|_hoodie_commit_seqno|   string|   null|
|  _hoodie_record_key|   string|   null|
|_hoodie_partition...|   string|   null|
|   _hoodie_file_name|   string|   null|
|           begin_lat|   double|   null|
|           begin_lon|   double|   null|
|              driver|   string|   null|
|             end_lat|   double|   null|
|             end_lon|   double|   null|
|                fare|   double|   null|
|               rider|   string|   null|
|                  ts|   bigint|   null|
|                uuid|   string|   null|
|       partitionpath|   string|   null|
|# Partition Infor...|         |       |
|          # col_name|data_type|comment|
|       partitionpath|   string|   null|
+--------------------+---------+-------+
```
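As a footnote on the Derby URL format discussed at the top: the split between the database specification and named attributes is purely positional, with everything after the first `;` being a `name=value` attribute pair. A minimal plain-Scala sketch of that structure (illustrative only, not a Derby or Hudi API):

```scala
// Illustrative only: where named attributes sit in a Derby connection URL.
// Everything after the first ';' is a name=value attribute like create=true,
// which is why "jdbc:derby:memory:default;create=true" parses correctly while
// putting the attribute before the ';' would not.
val url = "jdbc:derby:memory:default;create=true"
val parts = url.split(";")
val base = parts.head                  // the database spec: jdbc:derby:memory:default
val attrs = parts.tail.map { kv =>
  val Array(k, v) = kv.split("=", 2)  // split each attribute into name and value
  k -> v
}.toMap                                // Map(create -> true)
println(base)
println(attrs)
```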
