Satya Vadapalli created NIFI-14453:
--------------------------------------

             Summary: PutBigQuery Creates too many active streams
                 Key: NIFI-14453
                 URL: https://issues.apache.org/jira/browse/NIFI-14453
             Project: Apache NiFi
          Issue Type: Bug
          Components: Core Framework
    Affects Versions: 2.3.0
            Reporter: Satya Vadapalli


*The Issue:*
We recently migrated from NiFi 1.x to 2.x and have been having an issue with 
the PutBigQuery processor. Streaming worked well on the older version 
(PutBigQueryStreaming), I believe because it used the BigQuery REST API, 
whereas the new processor uses the Storage Write API. The processor is now 
opening over 10k streams instead of reusing an existing stream. 
Here is the error message:
 
{code:java}
PutBigQuery[id=01bc2d0f-0196-1000-0000-0000541e88b5] Processing halted: 
yielding [1 sec]: com.google.api.gax.rpc.FailedPreconditionException: 
io.grpc.StatusRuntimeException: FAILED_PRECONDITION: Table has too many active 
streams with too little traffic. Please send more traffic through existing 
streams or finalize unused streams. Table=1087270590813:xxxxxx.xxxxxxxx. 
ActiveStreamCount=10139. ActualPerStreamBytesPerSec=0.0333333. 
RequiredPerStreamBytesPerSecForMoreStreams=300000. If you have already 
terminated all the traffic, the error will go away in two hours. To avoid this 
problem in the long term, please use less streams for the same amount of data 
Entity: projects/xxx/datasets/xxx/tables/xxx - Caused by: 
io.grpc.StatusRuntimeException: FAILED_PRECONDITION: Table has too many active 
streams with too little traffic. Please send more traffic through existing 
streams or finalize unused streams. Table=1087270590813:xxxxxxx.xxxxxxxxxx. 
ActiveStreamCount=10139. ActualPerStreamBytesPerSec=0.0333333. 
RequiredPerStreamBytesPerSecForMoreStreams=300000. If you have already 
terminated all the traffic, the error will go away in two hours. To avoid this 
problem in the long term, please use less streams for the same amount of data 
Entity: projects/xxx/datasets/xxx/tables/xxx
{code}
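
To put the numbers in the error into perspective, here is a rough sketch that pulls the three counters out of the message and multiplies them out. The class name StreamErrorStats and the parsing are made up for illustration (this is not NiFi or BigQuery client code), and it assumes ActualPerStreamBytesPerSec is an average across all active streams.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: field names are copied from the error text,
// everything else is illustrative only.
final class StreamErrorStats {
    private static final Pattern ACTIVE =
            Pattern.compile("ActiveStreamCount=(\\d+)");
    private static final Pattern ACTUAL =
            Pattern.compile("ActualPerStreamBytesPerSec=(\\d+(?:\\.\\d+)?)");
    private static final Pattern REQUIRED =
            Pattern.compile("RequiredPerStreamBytesPerSecForMoreStreams=(\\d+(?:\\.\\d+)?)");

    final long activeStreams;
    final double actualPerStream;
    final double requiredPerStream;

    private StreamErrorStats(long activeStreams, double actualPerStream, double requiredPerStream) {
        this.activeStreams = activeStreams;
        this.actualPerStream = actualPerStream;
        this.requiredPerStream = requiredPerStream;
    }

    // Returns the first capture group of the pattern, or fails loudly if the field is missing.
    private static String first(Pattern p, String message) {
        Matcher m = p.matcher(message);
        if (!m.find()) {
            throw new IllegalArgumentException("field not found: " + p.pattern());
        }
        return m.group(1);
    }

    static StreamErrorStats parse(String message) {
        return new StreamErrorStats(
                Long.parseLong(first(ACTIVE, message)),
                Double.parseDouble(first(ACTUAL, message)),
                Double.parseDouble(first(REQUIRED, message)));
    }

    // Assumption: the per-stream figure is an average, so the product
    // approximates the total traffic reaching the table.
    double aggregateBytesPerSec() {
        return activeStreams * actualPerStream;
    }

    public static void main(String[] args) {
        // Excerpt of the error above, shortened to the three counters.
        String excerpt = "ActiveStreamCount=10139. ActualPerStreamBytesPerSec=0.0333333. "
                + "RequiredPerStreamBytesPerSecForMoreStreams=300000. If you have already";
        StreamErrorStats stats = parse(excerpt);
        System.out.printf("aggregate bytes/sec: %.1f%n", stats.aggregateBytesPerSec());
        System.out.printf("required per stream: %.0f%n", stats.requiredPerStream);
    }
}
```

Under that assumption the whole table is receiving roughly 338 bytes/sec spread over 10139 streams, which is far below the 300000 bytes/sec the service asks for on each stream before it allows more.
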
*What is Expected:*
PutBigQuery should be able to stream data into BigQuery over a single 
stream, using the Storage Write API as described in the documentation: 
[https://cloud.google.com/bigquery/docs/write-api-streaming]
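
The kind of reuse I would expect can be sketched as follows. This is only an illustration of the pattern: StreamRegistry is a hypothetical class, the StringBuilder stands in for a real write stream, and none of it reflects how PutBigQuery or the BigQuery client library is actually implemented.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical registry: keeps one stream per table and hands the same
// instance back on every call instead of opening a new one.
final class StreamRegistry<S> {
    private final Map<String, S> streamsByTable = new ConcurrentHashMap<>();
    private final Function<String, S> opener;
    private final AtomicInteger opened = new AtomicInteger();

    StreamRegistry(Function<String, S> opener) {
        this.opener = opener;
    }

    // computeIfAbsent runs the opener at most once per table name.
    S streamFor(String table) {
        return streamsByTable.computeIfAbsent(table, t -> {
            opened.incrementAndGet();
            return opener.apply(t);
        });
    }

    int openedCount() {
        return opened.get();
    }

    public static void main(String[] args) {
        // StringBuilder is a stand-in for a real stream handle.
        StreamRegistry<StringBuilder> registry = new StreamRegistry<>(t -> new StringBuilder());
        String table = "projects/xxx/datasets/xxx/tables/xxx";
        for (int i = 0; i < 10_000; i++) {
            registry.streamFor(table).append("row ").append(i).append('\n');
        }
        System.out.println("streams opened: " + registry.openedCount());
    }
}
```

With this pattern, 10,000 writes to the same table open one stream rather than one stream per write.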



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
