vingov commented on PR #5201:
URL: https://github.com/apache/hudi/pull/5201#issuecomment-1099237144

   > > > Shouldn't the BigQuery input format adapt the dataset with partition 
columns? And why bring these complexities into the Spark writer/reader for 
such little gain?
   > > 
   > > Hey @danny0405, BQ does not support custom input formats as such. We had 
some previous discussion in the RFC with @vingov here [#4503 
(comment)](https://github.com/apache/hudi/pull/4503#discussion_r786981999). 
There are big companies that rely heavily on BQ, so it's worth the effort. 
Besides, this is an existing config that defaults to false, so people who don't 
need it don't have to use it. Hope this helps clarify the context.
   > 
   > I know we are fixing a critical bug for BigQuery, but the way the patch 
fixes it does not seem right. For example, I can imagine that after this patch, 
a table written by Flink still cannot be read with BigQuery.
   > 
   > We should not rush in a fix that only partially works; we should work in 
the direction of having the BigQuery input format solve the problem. This patch 
also causes confusion because even though we drop the partition fields 
from the dataset columns, we still keep them in the metadata fields.
   > 
   > @vinothchandar  WDYT ?
   
   Hi @danny0405 - I understand your concern. At Uber and Wal-Mart there are 
already use cases that read Hudi tables from BigQuery for ML, and this 
patch is meant to support them.
   
   At Uber, we already reached out to the BigQuery team and asked them 
to support custom input formats. They have no roadmap for adding it, so 
we need this patch until then to support existing customers.
   
   I understand this is a stop-gap solution until we have the leverage to nudge 
the BigQuery team to add custom input format support. @vinothchandar is 
already working with some of the BigQuery folks to make it happen.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
