Oh, I forgot to mention that I pulled from master with this commit as
latest:
https://github.com/apache/beam/commit/95a524e52606de1467b5d8b2cc99263b8a111a8d



On Fri, Mar 30, 2018, 5:09 PM 8 Gianfortoni <8...@tokentransit.com> wrote:

> Fix cc to correct Holden.
>
> On Fri, Mar 30, 2018 at 5:05 PM, 8 Gianfortoni <8...@tokentransit.com> wrote:
>
>> Hi dev team,
>>
>> I'm having a lot of trouble running any pipeline that calls GroupByKey.
>> Maybe I'm doing something wrong, but for some reason I cannot get
>> GroupByKey not to crash the program.
>>
>> I have edited wordcount.go and minimal_wordcount.go to work similarly to
>> my own program, and it crashes for those as well.
>>
>> Here is the snippet of code I added to minimal_wordcount (full source
>> attached):
>>
>>         // Concept #3: Invoke the stats.Count transform on our
>> PCollection of
>>
>>         // individual words. The Count transform returns a new
>> PCollection of
>>
>>         // key/value pairs, where each key represents a unique word in
>> the text.
>>
>>         // The associated value is the occurrence count for that word.
>>
>>         singles := beam.ParDo(s, func(word string) (string, int) {
>>
>>                 return word, 1
>>
>>         }, words)
>>
>>
>>         grouped := beam.GroupByKey(s, singles)
>>
>>
>>         counted := beam.ParDo(s, func(word string, values func(*int)
>> bool) (string, int) {
>>
>>                 sum := 0
>>
>>                 for {
>>
>>                         var i int
>>
>>                         if values(&i) {
>>
>>                                 sum = sum + i
>>
>>                         } else {
>>
>>                                 break
>>
>>                         }
>>
>>                 }
>>
>>                 return word, sum
>>
>>         }, grouped)
>>
>>
>>         // Use a ParDo to format our PCollection of word counts into a
>> printable
>>
>>         // string, suitable for writing to an output file. When each
>> element
>>
>>         // produces exactly one element, the DoFn can simply return it.
>>
>>         formatted := beam.ParDo(s, func(w string, c int) string {
>>
>>                 return fmt.Sprintf("%s: %v", w, c)
>>
>>         }, counted)
>>
>>
>>
>> I also attached the full source code and output that happens when I run
>> both wordcount and minimal_wordcount.
>>
>> Am I just doing something wrong here? In any case, it seems inappropriate
>> to panic during runtime without any debugging information (save a stack
>> trace, but only if you call beamx.Run() as opposed to direct.Execute(),
>> which just dies without any info.
>>
>> Thank you so much,
>> 8
>>
>
>

Reply via email to