Re: [I] [SUPPORT] BQ synch tool not working with HUDI bundle jar [hudi]

via GitHub Mon, 19 Feb 2024 20:16:47 -0800


abhishekshenoy commented on issue #10629:
URL: https://github.com/apache/hudi/issues/10629#issuecomment-1953462323


   @ad1happy2go @the-other-tim-brown  
   
   ```
   But shouldn't that be called internally when we provide the Hudi BQ configs
   and enable META_SYNC_ENABLED?
   
   In my case we use df.write.options(hudiAndHiveAndBQConfigs).save(), and
   hudiAndHiveAndBQConfigs has both Hive- and BQ-related configs. **But still
   only Hive sync happens implicitly.**
   
   Is it by design that our write function needs to perform both of these calls?
   
   df.write.options(hudiAndHiveAndBQConfigs).save()
   new BigQuerySyncTool(getBigQueryProps).syncHoodieTable()
   ```
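   
   For context, a minimal sketch of one way the single-write setup might be
   expressed. The class `MetaSyncOptions` and its method are hypothetical
   helpers, not part of Hudi. The sketch assumes that META_SYNC_ENABLED maps to
   the key `hoodie.datasource.meta.sync.enable` and that
   `hoodie.meta.sync.client.tool.class` accepts a comma-separated list of sync
   tool classes; neither point is confirmed in this comment.
   
   ```java
   import java.util.LinkedHashMap;
   import java.util.Map;
   
   // Hypothetical helper (not part of Hudi): adds meta sync settings to writer options.
   final class MetaSyncOptions {
   
       // Returns a copy of the given options with both sync tools requested.
       static Map<String, String> withBothSyncTools(Map<String, String> existing) {
           Map<String, String> opts = new LinkedHashMap<>(existing);
           // Assumption: this is the key behind META_SYNC_ENABLED.
           opts.put("hoodie.datasource.meta.sync.enable", "true");
           // Assumption: a comma-separated list runs each listed tool after the write.
           opts.put("hoodie.meta.sync.client.tool.class",
                   "org.apache.hudi.hive.HiveSyncTool,"
                           + "org.apache.hudi.gcp.bigquery.BigQuerySyncTool");
           return opts;
       }
   }
   ```
   
   If those assumptions hold, the returned map could be passed to
   df.write.options(...) in place of hudiAndHiveAndBQConfigs, and the separate
   BigQuerySyncTool call may not be needed.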
   
   
   ```
   **I am also seeing an issue with BQ sync: the partition fields are not
   reflected correctly.**
   
   Our output looks something like this:
   
   gs://bucket/folder/tablename/partition_column=partition_value_1
   gs://bucket/folder/tablename/partition_column=partition_value_2
   
   We have set the following configs, yet the table in BQ is not partitioned:
   
   hoodie.gcp.bigquery.sync.base_path=gs://bucket/folder/tablename
   hoodie.datasource.meta.sync.base.path=gs://bucket/folder/tablename
   hoodie.gcp.bigquery.sync.source_uri_prefix=gs://bucket/folder/tablename
   hoodie.datasource.hive_sync.partition_fields=partition_column
   hoodie.gcp.bigquery.sync.partition_fields=partition_column
   hoodie.gcp.bigquery.sync.assume_date_partitioning=false
   hoodie.gcp.bigquery.sync.require_partition_filter=false
   ```
   
   
   ```
   If we pass sourceUriPrefix as either of the values below, sync fails:
   
   hoodie.gcp.bigquery.sync.source_uri_prefix=gs://bucket/folder/tablename/partion_column=
   
   hoodie.gcp.bigquery.sync.source_uri_prefix=gs://bucket/folder/tablename/partion_column=*
   
   How can this be resolved?
   ```
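   
   To make the layout question concrete, here is a small sketch that splits a
   Hive-style URI at its first `key=value` segment. `HivePartitionPaths` is a
   hypothetical helper, not part of Hudi or BigQuery. It assumes that the source
   URI prefix is meant to be the common part of the URI before the partition
   encoding begins; whether that explains the failures above is not established
   here.
   
   ```java
   // Hypothetical helper (not part of Hudi or BigQuery) for Hive-style paths.
   final class HivePartitionPaths {
   
       // Returns the part of the URI before the first "key=value" path segment.
       static String prefixBeforePartitions(String uri) {
           int schemeEnd = uri.indexOf("://");
           int searchFrom = schemeEnd < 0 ? 0 : schemeEnd + 3;
           int eq = uri.indexOf('=', searchFrom);
           if (eq < 0) {
               return stripTrailingSlash(uri); // no partition segment present
           }
           int slash = uri.lastIndexOf('/', eq);
           return slash < searchFrom ? uri : uri.substring(0, slash);
       }
   
       // True when the candidate prefix already includes a partition segment.
       static boolean containsPartitionSegment(String prefix) {
           return !prefixBeforePartitions(prefix).equals(stripTrailingSlash(prefix));
       }
   
       private static String stripTrailingSlash(String s) {
           return s.endsWith("/") ? s.substring(0, s.length() - 1) : s;
       }
   }
   ```
   
   Under that assumption, both output paths listed earlier reduce to
   gs://bucket/folder/tablename, while the two values that fail are flagged as
   already containing a partition segment.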
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
