[ https://issues.apache.org/jira/browse/BEAM-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17001198#comment-17001198 ]

Reuven Lax commented on BEAM-9010:
----------------------------------

Could you check to see whether the output of TableRowJsonCoder has changed as well? If it's the same then we're ok - and the test should be using TableRowJsonCoder instead of toString.

> BigQuery TableRow's size is toString().length() ?
> -------------------------------------------------
>
>                 Key: BEAM-9010
>                 URL: https://issues.apache.org/jira/browse/BEAM-9010
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow
>            Reporter: Tomo Suzuki
>            Priority: Minor
>
> The following tests failed when I tried to upgrade google-http-client 1.34.0
> from 1.28.0:
> {noformat}
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryIOReadTest.testEstimatedSizeWithoutStreamingBuffer
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryIOReadTest.testEstimatedSizeWithStreamingBuffer
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtilTest.testInsertAll
> {noformat}
> [https://builds.apache.org/job/beam_PreCommit_Java_Commit/9288/#showFailuresLink]
> h3. Reason of the test failures
> [org.apache.beam.sdk.io.gcp.testing.TableContainer|https://github.com/apache/beam/blob/6fa94c9/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/testing/TableContainer.java#L43]
> and
> [org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl|https://github.com/apache/beam/blob/c2f0d28/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L758]
> rely on {{TableRow.toString().length()}} to calculate the size. Example:
> {code:java}
>         dataSize += row.toString().length();
>         if (dataSize >= maxRowBatchSize
>             || rows.size() >= maxRowsPerBatch
>             || i == rowsToPublish.size() - 1) {
> {code}
> However, with [google-http-client's
> PR#589|https://github.com/googleapis/google-http-java-client/pull/589/files#diff-914cd7ff18143b3d2398149e1cfb4f45R218],
> the GenericData.toString output has changed since v1.29.0.
> In old google-http-client 1.28.0, an example row's toString returned:
> {noformat}
> {f=[{v=foo}, {v=1234}]}
> {noformat}
> In new google-http-client 1.29.0 and higher, the same row's toString returns:
> {noformat}
> GenericData{classInfo=[f], {f=[GenericData{classInfo=[v], {v=foo}},
> GenericData{classInfo=[v], {v=1234}}]}}
> {noformat}
> h1. Question:
> Is this right thing to rely on {{toString().length()}} in the BigQuery
> classes?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
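
To illustrate why a size derived from toString() is fragile while a size derived from a serialized form is not, here is a minimal stdlib-only sketch. It does not use Beam or google-http-client: the class names (RowSizeSketch, VerboseRow), the helper methods, and the hand-rolled compact JSON writer are all hypothetical stand-ins. In Beam itself the encoding would presumably come from TableRowJsonCoder, as suggested in the comment above. VerboseRow only mimics the newer "GenericData{classInfo=...}" toString format quoted in the issue, and the cell values are assumed to be strings.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RowSizeSketch {

    /** Hypothetical stand-in for a row whose toString() uses the newer, longer format. */
    static class VerboseRow extends LinkedHashMap<String, Object> {
        @Override
        public String toString() {
            return "GenericData{classInfo=" + keySet() + ", " + super.toString() + "}";
        }
    }

    /** Appends a compact JSON rendering of maps, lists, strings, numbers, booleans, and null. */
    static void appendJson(StringBuilder out, Object value) {
        if (value instanceof Map) {
            out.append('{');
            boolean first = true;
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                if (!first) {
                    out.append(',');
                }
                first = false;
                appendString(out, String.valueOf(e.getKey()));
                out.append(':');
                appendJson(out, e.getValue());
            }
            out.append('}');
        } else if (value instanceof List) {
            out.append('[');
            boolean first = true;
            for (Object item : (List<?>) value) {
                if (!first) {
                    out.append(',');
                }
                first = false;
                appendJson(out, item);
            }
            out.append(']');
        } else if (value instanceof String) {
            appendString(out, (String) value);
        } else {
            out.append(String.valueOf(value)); // numbers, booleans, null
        }
    }

    /** Appends a quoted JSON string, escaping quotes, backslashes, and control characters. */
    static void appendString(StringBuilder out, String s) {
        out.append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        out.append('"');
    }

    /** Size in bytes of the UTF-8 encoded JSON form; independent of toString(). */
    static long encodedSize(Map<String, Object> row) {
        StringBuilder out = new StringBuilder();
        appendJson(out, row);
        return out.toString().getBytes(StandardCharsets.UTF_8).length;
    }

    static Map<String, Object> emptyRow(boolean verbose) {
        if (verbose) {
            return new VerboseRow();
        }
        return new LinkedHashMap<>();
    }

    /** Builds the example row from the issue: f = [{v=foo}, {v=1234}]. */
    static Map<String, Object> sampleRow(boolean verbose) {
        Map<String, Object> foo = emptyRow(verbose);
        foo.put("v", "foo");
        Map<String, Object> num = emptyRow(verbose);
        num.put("v", "1234");
        Map<String, Object> row = emptyRow(verbose);
        row.put("f", Arrays.asList(foo, num));
        return row;
    }

    public static void main(String[] args) {
        for (boolean verbose : new boolean[] {false, true}) {
            Map<String, Object> row = sampleRow(verbose);
            System.out.println("toString length: " + row.toString().length()
                + ", encoded size: " + encodedSize(row));
        }
    }
}
```

In this sketch the two rows hold identical data, so the encoded size is the same for both, while the toString() length differs between the two formats. Counting encoded bytes rather than characters also keeps the estimate meaningful for non-ASCII values.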