Fixing the cc to point at the correct Holden.
On Fri, Mar 30, 2018 at 5:05 PM, 8 Gianfortoni <[email protected]> wrote:
> Hi dev team,
>
> I'm having a lot of trouble running any pipeline that calls GroupByKey.
> Maybe I'm doing something wrong, but for some reason I cannot get
> GroupByKey not to crash the program.
>
> I have edited wordcount.go and minimal_wordcount.go to work similarly to
> my own program, and it crashes for those as well.
>
> Here is the snippet of code I added to minimal_wordcount (full source
> attached):
>
>   // Concept #3: Invoke the stats.Count transform on our PCollection of
>   // individual words. The Count transform returns a new PCollection of
>   // key/value pairs, where each key represents a unique word in the text.
>   // The associated value is the occurrence count for that word.
>   singles := beam.ParDo(s, func(word string) (string, int) {
>       return word, 1
>   }, words)
>
>   grouped := beam.GroupByKey(s, singles)
>
>   counted := beam.ParDo(s, func(word string, values func(*int) bool) (string, int) {
>       sum := 0
>       for {
>           var i int
>           if values(&i) {
>               sum = sum + i
>           } else {
>               break
>           }
>       }
>       return word, sum
>   }, grouped)
>
>   // Use a ParDo to format our PCollection of word counts into a printable
>   // string, suitable for writing to an output file. When each element
>   // produces exactly one element, the DoFn can simply return it.
>   formatted := beam.ParDo(s, func(w string, c int) string {
>       return fmt.Sprintf("%s: %v", w, c)
>   }, counted)
>
> I also attached the full source code and output that happens when I run
> both wordcount and minimal_wordcount.
>
> Am I just doing something wrong here? In any case, it seems inappropriate
> to panic at runtime without any debugging information (save for a stack
> trace, and only if you call beamx.Run() as opposed to direct.Execute(),
> which just dies without any info).
>
> Thank you so much,
> 8
>