Re: Questions about writing to BigQuery using storage api

2023-12-07 Thread hsy...@gmail.com
The "Caused by" is in the error message.

On Thu, Dec 7, 2023 at 10:47 AM Reuven Lax via user 
wrote:

> This is the stack trace of the rethrown exception. The log should also
> contain a "caused by" log somewhere detailing the original exception. Do
> you happen to have that?
>
> On Thu, Dec 7, 2023 at 8:46 AM hsy...@gmail.com  wrote:
>
>> Here is the complete stack trace. It doesn't even hit my code, and it
>> happens consistently!
>>
>> Error message from worker: java.lang.RuntimeException:
>> java.lang.IllegalStateException
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:573)
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
>> Caused by: java.lang.IllegalStateException
>> org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkState(Preconditions.java:496)
>> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl$1.pin(BigQueryServicesImpl.java:1403)
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:565)
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn$DoFnInvoker.invokeProcessElement(Unknown
>> Source)
>> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:211)
>> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:185)
>> org.apache.beam.runners.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:340)
>> org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
>> org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:54)
>> org.apache.beam.runners.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:285)
>> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:275)
>> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.access$900(SimpleDoFnRunner.java:85)
>> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.outputWithTimestamp(SimpleDoFnRunner.java:430)
>> org.apache.beam.sdk.transforms.DoFnOutputReceivers$WindowedContextOutputReceiver.outputWithTimestamp(DoFnOutputReceivers.java:85)
>> org.apache.beam.sdk.transforms.GroupIntoBatches$GroupIntoBatchesDoFn.flushBatch(GroupIntoBatches.java:660)
>> org.apache.beam.sdk.transforms.GroupIntoBatches$GroupIntoBatchesDoFn.processElement(GroupIntoBatches.java:518)
>> org.apache.beam.sdk.transforms.GroupIntoBatches$GroupIntoBatchesDoFn$DoFnInvoker.invokeProcessElement(Unknown
>> Source)
>> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:211)
>> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:185)
>> org.apache.beam.runners.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:340)
>> org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
>> org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:54)
>> org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:218)
>> org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:169)
>> org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:83)
>> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1433)
>> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$800(StreamingDataflowWorker.java:155)
>> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$4.run(StreamingDataflowWorker.java:1056)
>> org.apache.beam.runners.dataflow.worker.util.BoundedQueueExecutor.lambda$executeLockHeld$0(BoundedQueueExecutor.java:163)
>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>> java.base/java.lang.Thread.run(Thread.java:834)
>>
>> Regards,
>> Siyuan
>>
>> On Wed, Dec 6, 2023 at 10:12 AM Ahmed Abualsaud <
>> ahmedabuals...@google.com> wrote:
>>
>>> Hey, can you provide the full stack trace for the error you're seeing?
>>> Also, is this happening consistently?
>>>

Re: Questions about writing to BigQuery using storage api

2023-12-07 Thread Reuven Lax via user
This is the stack trace of the rethrown exception. The log should also
contain a "caused by" log somewhere detailing the original exception. Do
you happen to have that?
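
For illustration, the wrap-and-rethrow pattern that produces a log like this is roughly
the following (a hypothetical sketch with made-up names, not the actual worker code): the
original exception is caught and rethrown wrapped in a RuntimeException, so the wrapper's
stack trace is printed first and the original exception appears under "Caused by:".

public class RethrowSketch {
  // Stand-in for the failing precondition check inside the client.
  static void pinClient() {
    throw new IllegalStateException();
  }

  public static void main(String[] args) {
    try {
      pinClient();
    } catch (IllegalStateException e) {
      // The rethrown exception; its trace is logged first, followed by
      // "Caused by: java.lang.IllegalStateException" for the original.
      throw new RuntimeException(e);
    }
  }
}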

On Thu, Dec 7, 2023 at 8:46 AM hsy...@gmail.com  wrote:

> Here is the complete stack trace. It doesn't even hit my code, and it
> happens consistently!
>
> Error message from worker: java.lang.RuntimeException:
> java.lang.IllegalStateException
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:573)
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
> Caused by: java.lang.IllegalStateException
> org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkState(Preconditions.java:496)
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl$1.pin(BigQueryServicesImpl.java:1403)
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:565)
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn$DoFnInvoker.invokeProcessElement(Unknown
> Source)
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:211)
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:185)
> org.apache.beam.runners.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:340)
> org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
> org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:54)
> org.apache.beam.runners.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:285)
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:275)
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.access$900(SimpleDoFnRunner.java:85)
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.outputWithTimestamp(SimpleDoFnRunner.java:430)
> org.apache.beam.sdk.transforms.DoFnOutputReceivers$WindowedContextOutputReceiver.outputWithTimestamp(DoFnOutputReceivers.java:85)
> org.apache.beam.sdk.transforms.GroupIntoBatches$GroupIntoBatchesDoFn.flushBatch(GroupIntoBatches.java:660)
> org.apache.beam.sdk.transforms.GroupIntoBatches$GroupIntoBatchesDoFn.processElement(GroupIntoBatches.java:518)
> org.apache.beam.sdk.transforms.GroupIntoBatches$GroupIntoBatchesDoFn$DoFnInvoker.invokeProcessElement(Unknown
> Source)
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:211)
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:185)
> org.apache.beam.runners.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:340)
> org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
> org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:54)
> org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:218)
> org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:169)
> org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:83)
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1433)
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$800(StreamingDataflowWorker.java:155)
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$4.run(StreamingDataflowWorker.java:1056)
> org.apache.beam.runners.dataflow.worker.util.BoundedQueueExecutor.lambda$executeLockHeld$0(BoundedQueueExecutor.java:163)
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> java.base/java.lang.Thread.run(Thread.java:834)
>
> Regards,
> Siyuan
>
> On Wed, Dec 6, 2023 at 10:12 AM Ahmed Abualsaud 
> wrote:
>
>> Hey, can you provide the full stack trace for the error you're seeing?
>> Also, is this happening consistently?
>>
>> *+1* to raising a Google ticket where we'll have more visibility.
>>
>> On Wed, Dec 6, 2023 at 11:33 AM John Casey 
>> wrote:
>>
>>> Hmm. It may be best if you raise a ticket with 

Re: Questions about writing to BigQuery using storage api

2023-12-07 Thread hsy...@gmail.com
Here is the complete stack trace. It doesn't even hit my code, and it happens
consistently!

Error message from worker: java.lang.RuntimeException:
java.lang.IllegalStateException
org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:573)
org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
Caused by: java.lang.IllegalStateException
org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkState(Preconditions.java:496)
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl$1.pin(BigQueryServicesImpl.java:1403)
org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:565)
org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn$DoFnInvoker.invokeProcessElement(Unknown
Source)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:211)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:185)
org.apache.beam.runners.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:340)
org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:54)
org.apache.beam.runners.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:285)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:275)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.access$900(SimpleDoFnRunner.java:85)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.outputWithTimestamp(SimpleDoFnRunner.java:430)
org.apache.beam.sdk.transforms.DoFnOutputReceivers$WindowedContextOutputReceiver.outputWithTimestamp(DoFnOutputReceivers.java:85)
org.apache.beam.sdk.transforms.GroupIntoBatches$GroupIntoBatchesDoFn.flushBatch(GroupIntoBatches.java:660)
org.apache.beam.sdk.transforms.GroupIntoBatches$GroupIntoBatchesDoFn.processElement(GroupIntoBatches.java:518)
org.apache.beam.sdk.transforms.GroupIntoBatches$GroupIntoBatchesDoFn$DoFnInvoker.invokeProcessElement(Unknown
Source)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:211)
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:185)
org.apache.beam.runners.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:340)
org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:54)
org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:218)
org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:169)
org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:83)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1433)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$800(StreamingDataflowWorker.java:155)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$4.run(StreamingDataflowWorker.java:1056)
org.apache.beam.runners.dataflow.worker.util.BoundedQueueExecutor.lambda$executeLockHeld$0(BoundedQueueExecutor.java:163)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base/java.lang.Thread.run(Thread.java:834)

Regards,
Siyuan

On Wed, Dec 6, 2023 at 10:12 AM Ahmed Abualsaud 
wrote:

> Hey, can you provide the full stack trace for the error you're seeing?
> Also, is this happening consistently?
>
> *+1* to raising a Google ticket where we'll have more visibility.
>
> On Wed, Dec 6, 2023 at 11:33 AM John Casey 
> wrote:
>
>> Hmm. It may be best if you raise a ticket with Google support for this. I
>> can inspect your job directly if you do that, and that will make this more
>> straightforward.
>>
>> On Wed, Dec 6, 2023 at 11:24 AM hsy...@gmail.com 
>> wrote:
>>
>>> I’m just using the Dataflow engine.
>>> On Wed, Dec 6, 2023 at 08:23 John Casey via user 
>>> wrote:
>>>
 Well, that is odd. It looks like 

Re: Questions about writing to BigQuery using storage api

2023-12-06 Thread Ahmed Abualsaud via user
Hey, can you provide the full stack trace for the error you're seeing? Also,
is this happening consistently?

*+1* to raising a Google ticket where we'll have more visibility.

On Wed, Dec 6, 2023 at 11:33 AM John Casey  wrote:

> Hmm. It may be best if you raise a ticket with Google support for this. I
> can inspect your job directly if you do that, and that will make this more
> straightforward.
>
> On Wed, Dec 6, 2023 at 11:24 AM hsy...@gmail.com  wrote:
>
>> I’m just using the Dataflow engine.
>> On Wed, Dec 6, 2023 at 08:23 John Casey via user 
>> wrote:
>>
>>> Well, that is odd. It looks like the underlying client is closed, which
>>> is unexpected.
>>>
>>> Do you see any retries in your pipeline? Also, what runner are you using?
>>>
>>> @Ahmed Abualsaud  this might be interesting
>>> to you too
>>>
>>> On Tue, Dec 5, 2023 at 9:39 PM hsy...@gmail.com 
>>> wrote:
>>>
 I'm using version 2.51.0, and the configuration is like this:

 write
 .withoutValidation()
 .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
 .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
 .withExtendedErrorInfo()
 .withMethod(Write.Method.STORAGE_WRITE_API)
 .withTriggeringFrequency(Duration.standardSeconds(10))
 .withAutoSharding().optimizedWrites()
 .withFailedInsertRetryPolicy(retryTransientErrors());


 On Tue, Dec 5, 2023 at 11:20 AM John Casey via user <
 user@beam.apache.org> wrote:

> Hi,
>
> Could you add some more detail? Which beam version are you using?
>
>
> On Tue, Dec 5, 2023 at 1:52 PM hsy...@gmail.com 
> wrote:
>
>> Does anyone have experience writing to BQ using the Storage API?
>>
>> I tried to use it because, according to the documentation, it is more
>> efficient, but I got the error below:
>>
>> 2023-12-05 04:01:29.741 PST
>> Error message from worker: java.lang.RuntimeException:
>> java.lang.IllegalStateException
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:573)
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
>> Caused by: java.lang.IllegalStateException
>> org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkState(Preconditions.java:496)
>> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl$1.pin(BigQueryServicesImpl.java:1403)
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:565)
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn$DoFnInvoker.invokeProcessElement(Unknown
>> Source)
>>
>


Re: Questions about writing to BigQuery using storage api

2023-12-06 Thread John Casey via user
Hmm. It may be best if you raise a ticket with Google support for this. I
can inspect your job directly if you do that, and that will make this more
straightforward.

On Wed, Dec 6, 2023 at 11:24 AM hsy...@gmail.com  wrote:

> I’m just using the Dataflow engine.
> On Wed, Dec 6, 2023 at 08:23 John Casey via user 
> wrote:
>
>> Well, that is odd. It looks like the underlying client is closed, which
>> is unexpected.
>>
>> Do you see any retries in your pipeline? Also, what runner are you using?
>>
>> @Ahmed Abualsaud  this might be interesting
>> to you too
>>
>> On Tue, Dec 5, 2023 at 9:39 PM hsy...@gmail.com  wrote:
>>
>>> I'm using version 2.51.0, and the configuration is like this:
>>>
>>> write
>>> .withoutValidation()
>>> .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
>>> .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
>>> .withExtendedErrorInfo()
>>> .withMethod(Write.Method.STORAGE_WRITE_API)
>>> .withTriggeringFrequency(Duration.standardSeconds(10))
>>> .withAutoSharding().optimizedWrites()
>>> .withFailedInsertRetryPolicy(retryTransientErrors());
>>>
>>>
>>> On Tue, Dec 5, 2023 at 11:20 AM John Casey via user <
>>> user@beam.apache.org> wrote:
>>>
 Hi,

 Could you add some more detail? Which beam version are you using?


 On Tue, Dec 5, 2023 at 1:52 PM hsy...@gmail.com 
 wrote:

> Does anyone have experience writing to BQ using the Storage API?
>
> I tried to use it because, according to the documentation, it is more
> efficient, but I got the error below:
>
> 2023-12-05 04:01:29.741 PST
> Error message from worker: java.lang.RuntimeException:
> java.lang.IllegalStateException
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:573)
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
> Caused by: java.lang.IllegalStateException
> org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkState(Preconditions.java:496)
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl$1.pin(BigQueryServicesImpl.java:1403)
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:565)
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn$DoFnInvoker.invokeProcessElement(Unknown
> Source)
>



Re: Questions about writing to BigQuery using storage api

2023-12-06 Thread hsy...@gmail.com
I’m just using the Dataflow engine.
On Wed, Dec 6, 2023 at 08:23 John Casey via user 
wrote:

> Well, that is odd. It looks like the underlying client is closed, which is
> unexpected.
>
> Do you see any retries in your pipeline? Also, what runner are you using?
>
> @Ahmed Abualsaud  this might be interesting to
> you too
>
> On Tue, Dec 5, 2023 at 9:39 PM hsy...@gmail.com  wrote:
>
>> I'm using version 2.51.0, and the configuration is like this:
>>
>> write
>> .withoutValidation()
>> .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
>> .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
>> .withExtendedErrorInfo()
>> .withMethod(Write.Method.STORAGE_WRITE_API)
>> .withTriggeringFrequency(Duration.standardSeconds(10))
>> .withAutoSharding().optimizedWrites()
>> .withFailedInsertRetryPolicy(retryTransientErrors());
>>
>>
>> On Tue, Dec 5, 2023 at 11:20 AM John Casey via user 
>> wrote:
>>
>>> Hi,
>>>
>>> Could you add some more detail? Which beam version are you using?
>>>
>>>
>>> On Tue, Dec 5, 2023 at 1:52 PM hsy...@gmail.com 
>>> wrote:
>>>
 Does anyone have experience writing to BQ using the Storage API?

 I tried to use it because, according to the documentation, it is more efficient,
 but I got the error below:

 2023-12-05 04:01:29.741 PST
 Error message from worker: java.lang.RuntimeException:
 java.lang.IllegalStateException
 org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:573)
 org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
 Caused by: java.lang.IllegalStateException
 org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkState(Preconditions.java:496)
 org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl$1.pin(BigQueryServicesImpl.java:1403)
 org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:565)
 org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
 org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn$DoFnInvoker.invokeProcessElement(Unknown
 Source)

>>>


Re: Questions about writing to BigQuery using storage api

2023-12-06 Thread John Casey via user
Well, that is odd. It looks like the underlying client is closed, which is
unexpected.
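
For context, the failing frame is a Preconditions.checkState guard inside
BigQueryServicesImpl's pin() method, and a bare checkState(...) failure throws exactly
this kind of message-less IllegalStateException. A rough sketch of that sort of
reference-counted guard (hypothetical names, not Beam's actual implementation) looks
like this:

import com.google.common.base.Preconditions; // Beam uses a vendored copy of Guava

class PinnedClientSketch {
  private int refCount = 1; // starts pinned by its owner

  // Fails with IllegalStateException if the wrapped client was already closed,
  // i.e. the reference count has already dropped to zero.
  synchronized void pin() {
    Preconditions.checkState(refCount > 0);
    refCount++;
  }

  synchronized void unpin() {
    refCount--;
    if (refCount == 0) {
      close(); // release the underlying stream/connection
    }
  }

  private void close() {}
}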

Do you see any retries in your pipeline? Also, what runner are you using?

@Ahmed Abualsaud  this might be interesting to
you too

On Tue, Dec 5, 2023 at 9:39 PM hsy...@gmail.com  wrote:

> I'm using version 2.51.0, and the configuration is like this:
>
> write
> .withoutValidation()
> .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
> .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
> .withExtendedErrorInfo()
> .withMethod(Write.Method.STORAGE_WRITE_API)
> .withTriggeringFrequency(Duration.standardSeconds(10))
> .withAutoSharding().optimizedWrites()
> .withFailedInsertRetryPolicy(retryTransientErrors());
>
>
> On Tue, Dec 5, 2023 at 11:20 AM John Casey via user 
> wrote:
>
>> Hi,
>>
>> Could you add some more detail? Which beam version are you using?
>>
>>
>> On Tue, Dec 5, 2023 at 1:52 PM hsy...@gmail.com  wrote:
>>
>>> Does anyone have experience writing to BQ using the Storage API?
>>>
>>> I tried to use it because, according to the documentation, it is more efficient,
>>> but I got the error below:
>>>
>>> 2023-12-05 04:01:29.741 PST
>>> Error message from worker: java.lang.RuntimeException:
>>> java.lang.IllegalStateException
>>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:573)
>>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
>>> Caused by: java.lang.IllegalStateException
>>> org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkState(Preconditions.java:496)
>>> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl$1.pin(BigQueryServicesImpl.java:1403)
>>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:565)
>>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
>>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn$DoFnInvoker.invokeProcessElement(Unknown
>>> Source)
>>>
>>


Re: Questions about writing to BigQuery using storage api

2023-12-05 Thread hsy...@gmail.com
I'm using version 2.51.0, and the configuration is like this:

write
    .withoutValidation()
    .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
    .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
    .withExtendedErrorInfo()
    .withMethod(Write.Method.STORAGE_WRITE_API)
    .withTriggeringFrequency(Duration.standardSeconds(10))
    .withAutoSharding()
    .optimizedWrites()
    .withFailedInsertRetryPolicy(retryTransientErrors());
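
For reference, a fuller compilable sketch of the same configuration is below; the
destination table spec and the helper class around it are placeholder assumptions for
illustration, not the actual job's values:

import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.InsertRetryPolicy;
import org.joda.time.Duration;

public class StorageWriteConfigSketch {

  // Builds a BigQueryIO write matching the configuration above.
  static BigQueryIO.Write<TableRow> buildWrite() {
    return BigQueryIO.writeTableRows()
        .to("my-project:my_dataset.my_table") // assumed destination table
        .withoutValidation()
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
        .withExtendedErrorInfo()
        .withMethod(BigQueryIO.Write.Method.STORAGE_WRITE_API)
        .withTriggeringFrequency(Duration.standardSeconds(10))
        .withAutoSharding()
        .optimizedWrites()
        .withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors());
  }

  // In the streaming pipeline this is then applied to the PCollection of rows:
  //   rows.apply("WriteToBigQuery", buildWrite());
}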


On Tue, Dec 5, 2023 at 11:20 AM John Casey via user 
wrote:

> Hi,
>
> Could you add some more detail? Which beam version are you using?
>
>
> On Tue, Dec 5, 2023 at 1:52 PM hsy...@gmail.com  wrote:
>
>> Does anyone have experience writing to BQ using the Storage API?
>>
>> I tried to use it because, according to the documentation, it is more efficient,
>> but I got the error below:
>>
>> 2023-12-05 04:01:29.741 PST
>> Error message from worker: java.lang.RuntimeException:
>> java.lang.IllegalStateException
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:573)
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
>> Caused by: java.lang.IllegalStateException
>> org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkState(Preconditions.java:496)
>> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl$1.pin(BigQueryServicesImpl.java:1403)
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:565)
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
>> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn$DoFnInvoker.invokeProcessElement(Unknown
>> Source)
>>
>


Re: Questions about writing to BigQuery using storage api

2023-12-05 Thread John Casey via user
Hi,

Could you add some more detail? Which beam version are you using?


On Tue, Dec 5, 2023 at 1:52 PM hsy...@gmail.com  wrote:

> Does anyone have experience writing to BQ using the Storage API?
>
> I tried to use it because, according to the documentation, it is more efficient,
> but I got the error below:
>
> 2023-12-05 04:01:29.741 PST
> Error message from worker: java.lang.RuntimeException:
> java.lang.IllegalStateException
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:573)
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
> Caused by: java.lang.IllegalStateException
> org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkState(Preconditions.java:496)
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl$1.pin(BigQueryServicesImpl.java:1403)
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.lambda$process$12(StorageApiWritesShardedRecords.java:565)
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn.process(StorageApiWritesShardedRecords.java:790)
> org.apache.beam.sdk.io.gcp.bigquery.StorageApiWritesShardedRecords$WriteRecordsDoFn$DoFnInvoker.invokeProcessElement(Unknown
> Source)
>