Fix cc to correct Holden. On Fri, Mar 30, 2018 at 5:05 PM, 8 Gianfortoni <8...@tokentransit.com> wrote:
> Hi dev team, > > I'm having a lot of trouble running any pipeline that calls GroupByKey. > Maybe I'm doing something wrong, but for some reason I cannot get > GroupByKey not to crash the program. > > I have edited wordcount.go and minimal_wordcount.go to work similarly to > my own program, and it crashes for those as well. > > Here is the snippet of code I added to minimal_wordcount (full source > attached): > > // Concept #3: Invoke the stats.Count transform on our > PCollection of > > // individual words. The Count transform returns a new > PCollection of > > // key/value pairs, where each key represents a unique word in > the text. > > // The associated value is the occurrence count for that word. > > singles := beam.ParDo(s, func(word string) (string, int) { > > return word, 1 > > }, words) > > > grouped := beam.GroupByKey(s, singles) > > > counted := beam.ParDo(s, func(word string, values func(*int) > bool) (string, int) { > > sum := 0 > > for { > > var i int > > if values(&i) { > > sum = sum + i > > } else { > > break > > } > > } > > return word, sum > > }, grouped) > > > // Use a ParDo to format our PCollection of word counts into a > printable > > // string, suitable for writing to an output file. When each > element > > // produces exactly one element, the DoFn can simply return it. > > formatted := beam.ParDo(s, func(w string, c int) string { > > return fmt.Sprintf("%s: %v", w, c) > > }, counted) > > > > I also attached the full source code and output that happens when I run > both wordcount and minimal_wordcount. > > Am I just doing something wrong here? In any case, it seems inappropriate > to panic during runtime without any debugging information (save a stack > trace, but only if you call beamx.Run() as opposed to direct.Execute(), > which just dies without any info. > > Thank you so much, > 8 >