[
https://issues.apache.org/jira/browse/BEAM-12449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Beam JIRA Bot updated BEAM-12449:
---------------------------------
Labels: (was: stale-P2)
> OutOfMemoryError in Google Dataflow pipeline in 2.29.0
> ------------------------------------------------------
>
> Key: BEAM-12449
> URL: https://issues.apache.org/jira/browse/BEAM-12449
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Affects Versions: 2.29.0
> Reporter: Nikolai Romanov
> Priority: P3
>
> Our pipeline reads data from Pub/Sub and writes it to BigQuery.
> After we upgraded Beam to 2.29.0, we started seeing many errors
> like
> {quote}"*~*~*~ Channel ManagedChannelImpl\{logId=59,
> target=bigquerystorage.googleapis.com:443} was not shutdown properly!!! ~*~*~*
> {quote}
> which is already reported in these tickets:
> https://issues.apache.org/jira/browse/BEAM-12365
> https://issues.apache.org/jira/browse/BEAM-12356
> But the worst part for us was that we also started getting many
> OutOfMemoryError errors:
> {code:JSON}
> {
>   "insertId": "3218987470810364545:19344:0:33525266",
>   "jsonPayload": {
>     "thread": "36781",
>     "job": "2021-06-02_05_56_45-14040068175423437671",
>     "stage": "P2",
>     "exception": "java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached\n\tat java.base/java.lang.Thread.start0(Native Method)\n\tat java.base/java.lang.Thread.start(Thread.java:803)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:937)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1005)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:834)\n",
>     "work": "ac872063622e9bea-1fc5c3cfe1c5e604",
>     "logger": "org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker",
>     "worker": "text-events-processor-06020556-7d78-harness-b3z1",
>     "message": "Uncaught exception in main thread. Exiting with status code 1."
>   },
>   "resource": {
>     "type": "dataflow_step",
>     "labels": {
>       "project_id": "bolcom-stg-trex-c7d",
>       "job_name": "text-events-processor",
>       "region": "europe-west1",
>       "job_id": "2021-06-02_05_56_45-14040068175423437671",
>       "step_id": ""
>     }
>   },
>   "timestamp": "2021-06-02T22:28:07.363Z",
>   "severity": "ERROR",
>   "labels": {
>     "dataflow.googleapis.com/job_id": "2021-06-02_05_56_45-14040068175423437671",
>     "compute.googleapis.com/resource_id": "3218987470810364545",
>     "compute.googleapis.com/resource_name": "text-events-processor-06020556-7d78-harness-b3z1",
>     "dataflow.googleapis.com/region": "europe-west1",
>     "dataflow.googleapis.com/log_type": "supportability",
>     "compute.googleapis.com/resource_type": "instance",
>     "dataflow.googleapis.com/job_name": "text-events-processor"
>   },
>   "logName": "projects/bolcom-stg-trex-c7d/logs/dataflow.googleapis.com%2Fworker",
>   "receiveTimestamp": "2021-06-02T22:28:09.656481597Z"
> }
> {code}
> I suspect these two errors are related, but in any case the
> OutOfMemoryError should not be ignored.
> We had to roll back to 2.28.0; the pipeline now runs fine, without any errors.
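> To illustrate the failure mode I suspect: if each leaked ManagedChannel keeps its executor's worker threads alive, the process eventually hits the native-thread limit and Thread.start0 fails exactly as in the stack trace above. This is a stdlib-only sketch of that mechanism, not actual Beam or gRPC code; the class name and thread counts are made up for the demo:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLeakDemo {
    // Starts n single-thread pools and never shuts them down, mimicking
    // channels that "were not shutdown properly". Returns how many live
    // threads that added to the current thread group.
    static int leakThreads(int n) {
        int before = Thread.activeCount();
        CountDownLatch started = new CountDownLatch(n);
        List<ExecutorService> leaked = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            ExecutorService pool = Executors.newFixedThreadPool(1);
            pool.submit(started::countDown); // force the worker thread to start
            leaked.add(pool);                // no shutdown(): the thread stays alive
        }
        try {
            started.await();                 // wait until every worker has run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return Thread.activeCount() - before;
    }

    public static void main(String[] args) {
        // Every batch of leaked pools permanently raises the thread count;
        // repeat this long enough and Thread.start0 throws OutOfMemoryError.
        System.out.println("threads added by 50 leaked pools: " + leakThreads(50));
    }
}
```

> With one leaked pool per channel recreation, the thread count grows monotonically until the OS refuses to create more, which matches the "unable to create native thread" message.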
--
This message was sent by Atlassian Jira
(v8.3.4#803005)