Passing parameter to DoFn in Python

2018-04-10 Thread OrielResearch Eila Arich-Landkof
Hi all, Is it possible to pass a string parameter with DoFn function and what would be the syntax? The call should look something like that: beam.ParDo(SampleFn(samplePath)) how would the class definition be updated? class SampleFn(beam.DoFn): def process(self,element): Thanks, -- Eila

Re: Passing parameter to DoFn in Python

2018-04-10 Thread Robert Bradshaw
Yes, DoFns are normal Python classes. To do this you would write class SampleFn(beam.DoFn): def __init__(self, samplePath): self.samplePath = samplePath def process(self, element): # use self.samplePath here, will get to remote workers via pickling On Tue, Apr 10, 2018

How to implement @SplitRestriction for Splittable DoFn

2018-04-10 Thread Jiayuan Ma
Hi all, I'm trying to use ReplicateFn mentioned in this doc in my pipeline to speed up a nested for loop. The use case is exactly the same as "*Counting friends in common (cross join by key)*" section. However, I have trouble to make it work with beam

Re: Passing parameter to DoFn in Python

2018-04-10 Thread OrielResearch Eila Arich-Landkof
great! thanks On Tue, Apr 10, 2018 at 7:31 PM, Robert Bradshaw wrote: > Yes, DoFns are normal Python classes. To do this you would write > > class SampleFn(beam.DoFn): > def __init__(self, samplePath): > self.samplePath = samplePath > > def process(self,