[ 
https://issues.apache.org/jira/browse/BEAM-6923?focusedWorklogId=318656&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-318656
 ]

ASF GitHub Bot logged work on BEAM-6923:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Sep/19 23:18
            Start Date: 25/Sep/19 23:18
    Worklog Time Spent: 10m 
      Work Description: angoenka commented on issue #9647: [BEAM-6923] limit 
number of concurrent artifact write to 8
URL: https://github.com/apache/beam/pull/9647#issuecomment-535261778
 
 
   > > > It may be much simpler to set the GCS upload buffer size to 1MiB to 
solve BEAM-6923 via setting 
[GcsUploadBufferSize](https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/options/GcsOptions.java#L91).
 I think we have used 1MiB in the past, while the default is 64MiB.
   > > 
   > > 
   > > This makes sense. Do you think we should apply this configuration to 
all GCS writes?
   > > I am a bit concerned, as it can impact the performance of the data path 
of the GCS sink. We can enhance the filesystems configuration to allow passing 
filesystem-specific options.
   > 
   > In portable pipeline execution we don't expect to run user transforms for 
reading from or writing to GCS; this would only cover runner interactions with 
GCS, so I believe we could make it 1MiB for the GcsFileSystem by default.
   
   Makes sense. 
   Made the change to limit the buffer usage to the Flink and Spark job servers only.
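   The approach named in the PR title, capping concurrent artifact writes at 8, 
can be sketched roughly as below. All class and method names here are 
hypothetical illustrations, not Beam's actual artifact staging API; the point is 
that a fixed-size pool bounds how many GCS upload buffers are live at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BoundedArtifactUploader {
    // Hypothetical sketch: at most 8 artifact writes run concurrently, so at
    // most 8 per-upload buffers (64 MiB each by default for GCS resumable
    // uploads) are allocated at any moment, regardless of artifact count.
    private static final int MAX_CONCURRENT_UPLOADS = 8;

    private final ExecutorService pool =
        Executors.newFixedThreadPool(MAX_CONCURRENT_UPLOADS);

    public List<String> stageAll(List<String> artifacts) {
        // Submit every upload; the fixed pool queues the excess.
        List<Future<String>> pending = new ArrayList<>();
        for (String artifact : artifacts) {
            pending.add(pool.submit(() -> stageOne(artifact)));
        }
        // Collect results, propagating any upload failure.
        List<String> staged = new ArrayList<>();
        for (Future<String> f : pending) {
            try {
                staged.add(f.get());
            } catch (Exception e) {
                throw new RuntimeException("artifact upload failed", e);
            }
        }
        pool.shutdown();
        return staged;
    }

    // Stand-in for the real GCS write channel; returns the staged name.
    private String stageOne(String artifact) {
        return artifact + ".staged";
    }
}
```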
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 318656)
    Time Spent: 2h 40m  (was: 2.5h)

> OOM errors in jobServer when using GCS artifactDir
> --------------------------------------------------
>
>                 Key: BEAM-6923
>                 URL: https://issues.apache.org/jira/browse/BEAM-6923
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-harness
>            Reporter: Lukasz Gajowy
>            Assignee: Ankur Goenka
>            Priority: Major
>         Attachments: Instance counts.png, Paths to GC root.png, 
> Telemetries.png, beam6923-flink156.m4v, beam6923flink182.m4v, heapdump 
> size-sorted.png
>
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> When starting jobServer with artifactDir pointing to a GCS bucket: 
> {code:java}
> ./gradlew :beam-runners-flink_2.11-job-server:runShadow 
> -PflinkMasterUrl=localhost:8081 -PartifactsDir=gs://the-bucket{code}
> and running a Java portable pipeline with the following, portability related 
> pipeline options: 
> {code:java}
> --runner=PortableRunner --jobEndpoint=localhost:8099 
> --defaultEnvironmentType=DOCKER 
> --defaultEnvironmentConfig=gcr.io/<my-freshly-built-sdk-harness-image>/java:latest{code}
>  
> I'm facing a series of OOM errors, like this: 
> {code:java}
> Exception in thread "grpc-default-executor-3" java.lang.OutOfMemoryError: 
> Java heap space
> at 
> com.google.api.client.googleapis.media.MediaHttpUploader.buildContentChunk(MediaHttpUploader.java:606)
> at 
> com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:408)
> at 
> com.google.api.client.googleapis.media.MediaHttpUploader.upload(MediaHttpUploader.java:336)
> at 
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:508)
> at 
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:432)
> at 
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:549)
> at 
> com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:301)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745){code}
>  
> This does not happen when I'm using a local filesystem for the artifact 
> staging location. 
>  
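The heap pressure behind this stack trace follows directly from the buffer 
discussion in the worklog above: each in-flight resumable upload allocates its 
own buffer (64 MiB by default), so staging many artifacts in parallel 
multiplies that. A back-of-the-envelope sketch (the artifact count of 50 is an 
illustrative assumption, not taken from the report):

```java
public class UploadBufferMath {
    public static long mib(long n) {
        return n * 1024 * 1024;
    }

    // Peak heap held in upload buffers = concurrent uploads x buffer size.
    public static long peakBufferBytes(int concurrentUploads, long bufferBytes) {
        return concurrentUploads * bufferBytes;
    }

    public static void main(String[] args) {
        // Unbounded: one upload per artifact (say 50), 64 MiB buffers each.
        System.out.println(peakBufferBytes(50, mib(64)) / mib(1)); // 3200 MiB
        // Mitigated: at most 8 concurrent uploads with 1 MiB buffers.
        System.out.println(peakBufferBytes(8, mib(1)) / mib(1));   // 8 MiB
    }
}
```

At 50 artifacts the default configuration can demand over 3 GiB of buffer 
space, which readily exceeds a job server heap; bounding concurrency and 
shrinking the buffer brings the peak down by orders of magnitude.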



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
