abhishekshenoy commented on issue #10629: URL: https://github.com/apache/hudi/issues/10629#issuecomment-1953462323
@ad1happy2go @the-other-tim-brown

But shouldn't that be called internally when we provide the Hudi BQ configs and enable META_SYNC_ENABLED? In our case we use `df.write.options(hudiAndHiveAndBQConfigs).save()`, and `hudiAndHiveAndBQConfigs` contains both the Hive and the BQ related configs. **But still only the Hive sync happens implicitly.** Is it by design that, as part of our write function, we need to perform both of the following?

```
df.write.options(hudiAndHiveAndBQConfigs).save()
new BigQuerySyncTool(getBigQueryProps).syncHoodieTable()
```

**Also, I am seeing an issue with BQ sync: the partition fields are not reflected correctly.** Our output looks something like this:

```
gs://bucket/folder/tablename/partition_column=partition_value_1
gs://bucket/folder/tablename/partition_column=partition_value_2
```

We have set the configs below, and the table in BQ is not partitioned:

```
hoodie.gcp.bigquery.sync.base_path=gs://bucket/folder/tablename
hoodie.datasource.meta.sync.base.path=gs://bucket/folder/tablename
hoodie.gcp.bigquery.sync.source_uri_prefix=gs://bucket/folder/tablename
hoodie.datasource.hive_sync.partition_fields=partition_column
hoodie.datasource.hive_sync.partition_fields=partition_column
hoodie.gcp.bigquery.sync.partition_fields=partition_column
hoodie.gcp.bigquery.sync.assume_date_partitioning=false
hoodie.gcp.bigquery.sync.require_partition_filter=false
```

If we pass the source URI prefix as either of the values below, the sync fails:

```
hoodie.gcp.bigquery.sync.source_uri_prefix=gs://bucket/folder/tablename/partion_column=
hoodie.gcp.bigquery.sync.source_uri_prefix=gs://bucket/folder/tablename/partion_column= *
```

How can this be resolved?
