OK, it looks like beam.BatchElements(max_batch_size=x) will do the trick for
me, and it runs fine in Dataflow!
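For anyone finding this thread later: beam.BatchElements buffers elements into
lists of up to max_batch_size, without requiring the stateful-DoFn support that
beam.util.GroupIntoBatches needs. The pure-Python sketch below (no Beam
dependency, just an illustration of the batching behavior, not the SDK's
implementation) shows what the transform does to a stream of elements:

```python
def batch_elements(elements, max_batch_size=100):
    """Yield lists of up to max_batch_size elements, roughly
    mimicking what beam.BatchElements does to a PCollection."""
    batch = []
    for element in elements:
        batch.append(element)
        if len(batch) == max_batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch

batches = list(batch_elements(range(250), max_batch_size=100))
# 250 elements -> batches of 100, 100, and 50
```

In an actual pipeline the equivalent step would look something like
`rows | beam.BatchElements(max_batch_size=100) | beam.Map(insert_batch)`,
where insert_batch is whatever per-batch transform (or Snowflake insert) you
need.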

On Wed, Feb 5, 2020 at 7:38 AM Alan Krumholz <[email protected]>
wrote:

> Actually, beam.GroupIntoBatches() gives me the same error as
> beam.util.GroupIntoBatches() :(
> Back to square one.
>
> Any other ideas?
>
> Thank you!
>
>
> On Wed, Feb 5, 2020 at 7:32 AM Alan Krumholz <[email protected]>
> wrote:
>
>> Never mind, there seems to be a beam.GroupIntoBatches() that I
>> should have used originally instead of beam.util.GroupIntoBatches()....
>>
>> On Wed, Feb 5, 2020 at 7:19 AM Alan Krumholz <[email protected]>
>> wrote:
>>
>>> Hello, I'm having issues running beam.util.GroupIntoBatches() in
>>> Dataflow.
>>>
>>> I get the following error message:
>>>
>>> Exception: Requested execution of a stateful DoFn, but no user state
>>>> context is available. This likely means that the current runner does not
>>>> support the execution of stateful DoFns
>>>
>>>
>>> It seems to be related to:
>>>
>>> https://stackoverflow.com/questions/56403572/no-userstate-context-is-available-google-cloud-dataflow
>>>
>>> Is there another way I can achieve the same thing using another Beam
>>> function?
>>>
>>> I basically want to batch rows into groups of 100, as it is a lot faster
>>> to transform them all at once than one by one.
>>>
>>> I was also planning to use this function for a custom Snowflake sink (so
>>> I could insert many rows at once).
>>>
>>> I'm sure there must be another way to do this in Dataflow, but I'm not
>>> sure how?
>>>
>>> Thanks so much!
>>>
>>
