[ https://issues.apache.org/jira/browse/BEAM-11061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonas Grabber updated BEAM-11061:
---------------------------------
    Description: 
When reading several million rows (over 100 GB) from BigQuery Storage ({{DIRECT_READ}}) with Dataflow using more than 8 vCPUs (e.g. 4x n1-standard-4 = 16 vCPUs), the exception below is thrown about once per vCPU.

The issue seems to be that the value of {{fraction_consumed}} in the [StreamStatus|https://cloud.google.com/bigquery/docs/reference/storage/rpc/google.cloud.bigquery.storage.v1beta1#streamstatus] object returned by the Storage API decreases between responses.
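
For context, the exception originates from a monotonicity precondition in the Beam reader (per the stack trace below: {{Preconditions.checkArgument}} at {{BigQueryStorageStreamSource.java:242}} inside {{readNextRecord}}). A minimal sketch of that kind of check, with assumed variable names and the observed values plugged in:

{code:java}
import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions;

public class FractionConsumedCheckSketch {
  public static void main(String[] args) {
    // Assumed variable names; the values are the ones from the exception below.
    double previousFraction = 0.8467302322387695; // fraction_consumed from the prior response
    double currentFraction = 0.7945484519004822;  // fraction_consumed from the current response

    // Throws IllegalArgumentException whenever fraction_consumed goes backwards.
    Preconditions.checkArgument(
        currentFraction >= previousFraction,
        "Fraction consumed from the current response (%s) has to be larger than or equal "
            + "to the fraction consumed from the previous response (%s).",
        currentFraction, previousFraction);
  }
}
{code}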

I tested this with varying amounts of input data, worker counts, and machine types, and was able to reproduce the issue with every configuration above 8 vCPUs (16, 32, and 128 vCPUs on n1-standard-4, n1-standard-8, and n1-standard-16 machines).

So far, jobs with 8 vCPUs have run fine.
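
For reference, the failing read is of roughly this shape (a sketch only: project, dataset, and table names are placeholders, and the pipeline is run on Dataflow with the worker configurations above):

{code:java}
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.TypedRead;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class StorageApiReadSketch {
  public static void main(String[] args) {
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    Pipeline p = Pipeline.create(options);

    // Read via the BigQuery Storage API (DIRECT_READ); table name is a placeholder.
    p.apply(
        "ReadFromBigQueryStorage",
        BigQueryIO.readTableRows()
            .from("my-project:my_dataset.large_table")
            .withMethod(TypedRead.Method.DIRECT_READ));

    p.run().waitUntilFinish();
  }
}
{code}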


{code}
Error message from worker: java.io.IOException: Failed to advance reader of source: name: "projects/REDACTED/locations/eu/streams/REDACTED"
    org.apache.beam.runners.dataflow.worker.WorkerCustomSources$BoundedReaderIterator.advance(WorkerCustomSources.java:620)
    org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation$SynchronizedReaderIterator.advance(ReadOperation.java:399)
    org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:194)
    org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:159)
    org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
    org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:417)
    org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:386)
    org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:311)
    org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:140)
    org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:120)
    org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:107)
    java.util.concurrent.FutureTask.run(FutureTask.java:266)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Fraction consumed from the current response (0.7945484519004822) has to be larger than or equal to the fraction consumed from the previous response (0.8467302322387695).
    org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:440)
    org.apache.beam.sdk.io.gcp.bigquery.BigQueryStorageStreamSource$BigQueryStorageStreamReader.readNextRecord(BigQueryStorageStreamSource.java:242)
    org.apache.beam.sdk.io.gcp.bigquery.BigQueryStorageStreamSource$BigQueryStorageStreamReader.advance(BigQueryStorageStreamSource.java:211)
    org.apache.beam.runners.dataflow.worker.WorkerCustomSources$BoundedReaderIterator.advance(WorkerCustomSources.java:617)
    ... 14 more
{code}


> BigQuery IO - Storage: Fraction consumed shrinking between responses
> --------------------------------------------------------------------
>
>                 Key: BEAM-11061
>                 URL: https://issues.apache.org/jira/browse/BEAM-11061
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.23.0
>            Reporter: Jonas Grabber
>            Priority: P2
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
