[ https://issues.apache.org/jira/browse/BEAM-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Braden Bassingthwaite closed BEAM-6117.
---------------------------------------
Resolution: Information Provided
Fix Version/s: Not applicable
> Dataflow Slowness
> -----------------
>
> Key: BEAM-6117
> URL: https://issues.apache.org/jira/browse/BEAM-6117
> Project: Beam
> Issue Type: Bug
> Components: sdk-go
> Reporter: Braden Bassingthwaite
> Assignee: Robert Burke
> Priority: Major
> Fix For: Not applicable
>
> Attachments: Screen Shot 2018-11-22 at 7.08.08 PM.png, Screen Shot
> 2018-11-22 at 7.08.32 PM.png, Screen Shot 2018-11-22 at 7.11.33 PM.png
>
>
> This is a fairly open-ended ticket, but we've been struggling with this for
> quite some time and are hoping to get some help resolving it.
>
> We wrote and contributed the datastore reader earlier this year and have been
> using it successfully in our project in a couple of scenarios. The problem we
> are facing is that our Dataflow jobs take a long time: we have datastore
> kinds with 100M+ entities, and they take 2-3 days to read through. We've
> tried adjusting all of the knobs available to us (datastore splits, CPUs,
> turning off autoscaling, scope changes, updating libraries, etc.) and can't
> seem to make it go faster.
> My only hunch comes from the datastore reader step. When viewing its status
> in the Dataflow UI, we see:
> Output collections: DailyListingScore/main.queryFn.out0
> Elements added: –
> Estimated size: –
> I am assuming that Dataflow uses these numbers to gauge the progress the
> step is making and scales up or down depending on them. Is this right, or
> do these numbers have no bearing? We've tried starting the job with 32+
> workers, and it always scales down to 1-2 nodes after a couple of minutes.
> It seems as though Dataflow isn't scaling up when it should. Any direction or
> assistance in getting this issue solved would be great!
>
> Thanks
>