[ 
https://issues.apache.org/jira/browse/BEAM-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada updated BEAM-4518:
--------------------------------
    Fix Version/s:     (was: 2.6.0)
                   2.7.0

> Errors when running Python Game stats with a low fixed_window_duration in the 
> DirectRunner
> ------------------------------------------------------------------------------------------
>
>                 Key: BEAM-4518
>                 URL: https://issues.apache.org/jira/browse/BEAM-4518
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>    Affects Versions: 2.5.0, 2.6.0
>            Reporter: Ahmet Altay
>            Priority: Major
>             Fix For: 2.7.0
>
>
> Using the injector and the following command to start the DirectRunner 
> pipeline:
> python -m apache_beam.examples.complete.game.game_stats \
> --project=google.com:clouddfe \
> --topic projects/google.com:clouddfe/topics/leader_board-$USER-topic-1 \
> --dataset ${USER}_test --fixed_window_duration 1
> Fails with:
> ValueError: PCollection of size 2 with more than one element accessed as a 
> singleton view. First two elements encountered are "13.93", "11.7777777778". 
> [while running 'CalculateSpammyUsers/ProcessAndFilter']
> Offending code is here:
> global_mean_score = (
>  sum_scores
>  | beam.Values()
>  | beam.CombineGlobally(beam.combiners.MeanCombineFn())\
>  .as_singleton_view())
>  # Filter the user sums using the global mean.
>  filtered = (
>  sum_scores
>  # Use the derived mean total score (global_mean_score) as a side input.
>  | 'ProcessAndFilter' >> beam.Filter(
>  lambda key_score, global_mean:\
>  key_score[1] > global_mean * self.SCORE_WEIGHT,
>  global_mean_score))
> Since global_mean_score is the result of CombineGlobally, this is either an 
> issue with CombineGlobally or side inputs implementation in DirectRunner. The 
> latter is more likely since it works on DataflowRunner.
> cc: [~mariagh]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to