OK, it seems that beam.BatchElements(max_batch_size=x) will do the trick for me, and it runs fine in Dataflow!
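
For anyone who finds this thread later, here is a minimal sketch of the kind of thing I mean. The helper names (transform_batch, batch_and_transform) and the upper-casing step are made up for illustration and are not from my real pipeline; the only Beam calls used are beam.BatchElements and beam.FlatMap. As far as I understand, BatchElements buffers elements within a bundle rather than using user state, which would explain why it avoids the stateful DoFn error, and it also means a batch can come out smaller than max_batch_size.

```python
def transform_batch(rows):
    """Hypothetical batch transform: handles a whole list of rows in one call.

    Stand-in for whatever is faster to do on many rows at once
    (for example, a single multi-row insert in a custom sink).
    """
    return [row.upper() for row in rows]


def batch_and_transform(pcoll, max_batch_size=100):
    """Group the elements of `pcoll` into lists, then transform each list.

    Assumes `pcoll` is a PCollection of strings. Batches may hold fewer
    than `max_batch_size` elements.
    """
    # Imported here so transform_batch can be tried without Beam installed.
    import apache_beam as beam

    return (
        pcoll
        | "BatchRows" >> beam.BatchElements(
            min_batch_size=1, max_batch_size=max_batch_size)
        # Each input element is now a list of rows; FlatMap flattens the
        # transformed list back into individual elements.
        | "TransformBatch" >> beam.FlatMap(transform_batch)
    )
```

To try it locally, something like batch_and_transform(p | beam.Create(["a", "b", "c"])) inside a "with beam.Pipeline() as p:" block should be enough. For a sink, replacing the FlatMap step with a ParDo that writes each list in one call would probably be the natural variation.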

On Wed, Feb 5, 2020 at 7:38 AM Alan Krumholz <[email protected]> wrote:

> Actually beam.GroupIntoBatches() gives me the same error as
> beam.util.GroupIntoBatches() :(
> back to square one.
>
> Any other ideas?
>
> Thank you!
>
>
> On Wed, Feb 5, 2020 at 7:32 AM Alan Krumholz <[email protected]>
> wrote:
>
>> Never mind there seems to be a beam.GroupIntoBatches() that I
>> should have originally used instead of beam.util.GroupIntoBatches()....
>>
>> On Wed, Feb 5, 2020 at 7:19 AM Alan Krumholz <[email protected]>
>> wrote:
>>
>>> Hello, I'm having issues running beam.util.GroupIntoBatches() in
>>> DataFlow.
>>>
>>> I get the following error message:
>>>
>>> Exception: Requested execution of a stateful DoFn, but no user state
>>>> context is available. This likely means that the current runner does not
>>>> support the execution of stateful DoFns
>>>
>>>
>>> Seems to be related to:
>>>
>>> https://stackoverflow.com/questions/56403572/no-userstate-context-is-available-google-cloud-dataflow
>>>
>>> Is there another way I can achieve the same using other beam function?
>>>
>>> I basically want to batch rows into groups of 100 as it is a lot faster
>>> to transform all at once than doing it 1 by 1.
>>>
>>> I also was planning to use this function for a custom snowflake sink (so
>>> I could insert many rows at once)
>>>
>>> I'm sure there must be another way to do this in DataFlow but not sure
>>> how?
>>>
>>> Thanks so much!
>>>
>>
