Satya Vadapalli created NIFI-14453:
--------------------------------------
Summary: PutBigQuery Creates too many active streams
Key: NIFI-14453
URL: https://issues.apache.org/jira/browse/NIFI-14453
Project: Apache NiFi
Issue Type: Bug
Components: Core Framework
Affects Versions: 2.3.0
Reporter: Satya Vadapalli
*The Issue:*
We recently migrated from NiFi 1.x to 2.x and have been having an issue with
the PutBigQuery processor. The older processor (PutBigQueryStreaming) worked
well, I believe because it used the BigQuery REST API, whereas the new one uses
the Storage Write API. The processor is now opening over 10k streams instead of
reusing an existing stream.
Here's the error message:
{code:java}
PutBigQuery[id=01bc2d0f-0196-1000-0000-0000541e88b5] Processing halted:
yielding [1 sec]: com.google.api.gax.rpc.FailedPreconditionException:
io.grpc.StatusRuntimeException: FAILED_PRECONDITION: Table has too many active
streams with too little traffic. Please send more traffic through existing
streams or finalize unused streams. Table=1087270590813:xxxxxx.xxxxxxxx.
ActiveStreamCount=10139. ActualPerStreamBytesPerSec=0.0333333.
RequiredPerStreamBytesPerSecForMoreStreams=300000. If you have already
terminated all the traffic, the error will go away in two hours. To avoid this
problem in the long term, please use less streams for the same amount of data
Entity: projects/xxx/datasets/xxx/tables/xxx - Caused by:
io.grpc.StatusRuntimeException: FAILED_PRECONDITION: Table has too many active
streams with too little traffic. Please send more traffic through existing
streams or finalize unused streams. Table=1087270590813:xxxxxxx.xxxxxxxxxx.
ActiveStreamCount=10139. ActualPerStreamBytesPerSec=0.0333333.
RequiredPerStreamBytesPerSecForMoreStreams=300000. If you have already
terminated all the traffic, the error will go away in two hours. To avoid this
problem in the long term, please use less streams for the same amount of data
Entity: projects/xxx/datasets/xxx/tables/xxx
{code}
*What is Expected:*
PutBigQuery should be able to stream data into BigQuery over a single
stream, using the Storage Write API as described in the documentation:
[https://cloud.google.com/bigquery/docs/write-api-streaming]
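To illustrate the reuse I would expect, here is a minimal sketch of caching one stream per table rather than opening a new one for every write. It is not NiFi or BigQuery client code: the class {{StreamReuseSketch}}, its methods {{streamFor}}, {{openStream}} and {{openedCount}}, and the String used as a stream handle are all hypothetical stand-ins.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only: not NiFi or BigQuery client code.
class StreamReuseSketch {
    // One cached stream handle per fully qualified table path.
    private final Map<String, String> streamsByTable = new ConcurrentHashMap<>();
    // Counts how many streams were actually opened.
    private final AtomicInteger opened = new AtomicInteger();

    // Returns the cached handle for the table, opening one only on first use.
    String streamFor(String tablePath) {
        return streamsByTable.computeIfAbsent(tablePath, this::openStream);
    }

    // Stand-in for whatever call would open a real write stream (assumption).
    private String openStream(String tablePath) {
        return tablePath + "/streams/sketch-" + opened.incrementAndGet();
    }

    int openedCount() {
        return opened.get();
    }

    public static void main(String[] args) {
        StreamReuseSketch cache = new StreamReuseSketch();
        String table = "projects/xxx/datasets/xxx/tables/xxx";
        // Simulate many writes against the same table.
        for (int i = 0; i < 10_000; i++) {
            cache.streamFor(table);
        }
        System.out.println("streams opened: " + cache.openedCount());
    }
}
```

With this pattern, 10,000 writes to the same table open one stream, where the behavior reported above appears to open a new stream each time.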