vingov commented on PR #5201: URL: https://github.com/apache/hudi/pull/5201#issuecomment-1099237144
> > > Shouldn't the BigQuery inputformat adapt the dataset with partition columns? And why bring these complexities into the Spark writer/reader for such little gain?
> >
> > Hey @danny0405 BQ does not support custom inputformat as such. We had some previous discussion in the RFC with @vingov here [#4503 (comment)](https://github.com/apache/hudi/pull/4503#discussion_r786981999). There are big companies who rely heavily on BQ so it's worth the effort. Besides, this is an existing config that defaults to false, so people who don't need it don't have to use it. Hope this helps clarify the context.
>
> I know we are fixing a critical bug for BigQuery, but the way the patch fixes it does not seem right. For example, I can imagine that after this patch, a table written by Flink still cannot be read with BigQuery.
>
> We should not rush in a fix that only partially works; we should work in the direction where the BigQuery inputformat solves the problem. There is also confusion with this patch because even though we drop the partition fields from the data set columns, we still keep them in the metadata fields.
>
> @vinothchandar WDYT?

Hi @danny0405, I understand your concern. At Uber and Wal-Mart there are already use cases that read Hudi tables from BigQuery for ML, and this patch is meant to support them.

From Uber, we have already reached out to the BigQuery team and asked them to support custom input formats. They have no roadmap for adding it, so we need this patch until then to support existing customers.

I understand this is a stop-gap solution until we have the leverage to nudge the BigQuery team into adding custom input format support. @vinothchandar is already working with some of the BigQuery folks to make it happen.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
