Hey,

I would vote -0. Here is the explanation:

I looked at the Nexmark dashboards for output size and performance, for all the
runners in all the modes, around the date of the release cut to search for
regressions.

I noted a performance regression on the Spark runner. The running times of
Query4, Query6, Query8 and Query9 were multiplied by 2 to 3 around 10/05/18. See
https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712

So I searched the commit history of the Spark runner module for what happened
around 10/05/18, and I found this commit:

e4a1ccbaa10808d88c6ad2a687fe9f6d52392d90: Merge pull request #6181: [BEAM-4783]
Add bundleSize for splitting BoundedSources

I don't know if it should be considered a blocker, but we should definitely take
another look at pull request #6181, which seems to change the way we split on
the Spark runner.

Best,
Etienne

On Friday, 26 October 2018 at 18:20 +0200, Maximilian Michels wrote:
> +1 (binding)
> On 26.10.18 17:45, Kenneth Knowles wrote:
> Nice. Thanks.
> +1
> 
> On Fri, Oct 26, 2018 at 8:44 AM Robert Bradshaw <[email protected]> wrote:
>     Thanks Tim!
>     This was my only hesitation, and sounds like we're in the clear here.
>     +1 (binding)
>
>     On Fri, Oct 26, 2018 at 5:05 PM Tim Robertson <[email protected]> wrote:
>     >
>     > A colleague and I tested on 2.7.0 and 2.8.0RC1:
>     >
>     > 1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in
>     >    spreadsheet)
>     > 2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we
>     >    backport the un-merged BEAM-5036 fix in our code)
>     > 3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS
>     >
>     > Everything worked, and performance was similar on both.
>     > We built using maven pointing at
>     > https://repository.apache.org/content/repositories/orgapachebeam-1049/
>     >
>     > Based on this limited testing: +1
>     >
>     > Thank you to the release managers,
>     > Tim
>     >
>     >
>     > On Thu, Oct 25, 2018 at 7:21 PM Tim <[email protected]> wrote:
>     >>
>     >> I can do some tests on Spark / YARN tomorrow (CEST timezone).
>     >> Sorry I’ve just been too busy to assist.
>     >>
>     >> Tim
>     >>
>     >> On 25 Oct 2018, at 18:59, Kenneth Knowles <[email protected]> wrote:
>     >>
>     >> I tried to do a more thorough job on this.
>     >>
>     >>  - I could not reproduce the slowdown in Query 9. I believe the
>     >>    variance was simply high given the parameters and environment
>     >>  - I saw the same slowdown in Query 8 when running as part of
>     >>    the suite, but it vanished when I ran repeatedly on its own, so
>     >>    again it is not good methodology probably
>     >>
>     >> We do have the dashboard at
>     >> https://apache-beam-testing.appspot.com/dashboard-admin though no
>     >> anomaly detection set up AFAIK.
>     >>
>     >>  - There is no issue easily visible in DirectRunner:
>     >>    https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
>     >>  - There is a notable degradation in Spark runner on 10/5 for
>     >>    many queries.
>     >>    https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>     >>  - Something minor happened for Dataflow around 10/1:
>     >>    https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
>     >>  - Flink runner seems to have had some fantastic improvements :-)
>     >>    https://apache-beam-testing.appspot.com/explore?dashboard=56992575877283844
>     >>
>     >> So if there is a blocker it would really be the Spark runner
>     >> perf changes. Of course, all these except Dataflow are using local
>     >> instances so may not be representative of larger scale AFAIK.
>     >>
>     >> Kenn
>     >>
>     >> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels <[email protected]> wrote:
>     >>>
>     >>> I've run WordCount using Quickstart with the FlinkRunner
>     >>> (locally and against a Flink cluster).
>     >>>
>     >>> Would give a +1 but waiting what Kenn finds.
>     >>>
>     >>> -Max
>     >>>
>     >>> On 23.10.18 07:11, Ahmet Altay wrote:
>     >>> >
>     >>> >
>     >>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles <[email protected]> wrote:
>     >>> >
>     >>> >     You two did so much verification I had a hard time finding something
>     >>> >     where my help was meaningful! :-)
>     >>> >
>     >>> >     I did run the Nexmark suite on the DirectRunner against 2.7.0 and
>     >>> >     2.8.0 following
>     >>> >     <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
>     >>> >
>     >>> >     It is admittedly a very silly test - the instructions leave
>     >>> >     immutability enforcement on, etc. But it does appear that there is a
>     >>> >     30% degradation in query 8 and 15% in query 9. These are the pure
>     >>> >     Java tests, not the SQL variants. The rest of the queries are close
>     >>> >     enough that differences are not meaningful.
>     >>> >
>     >>> >
>     >>> > (It would be a good improvement for us to have alerts on daily
>     >>> > benchmarks if we do not have such a concept already.)
>     >>> >
>     >>> >
>     >>> >     I would ask a little more time to see what is going on here - is it
>     >>> >     a real performance issue or an artifact of how the tests are
>     >>> >     invoked, or ...?
>     >>> >
>     >>> >
>     >>> > Thank you! Much appreciated. Please let us know when you are done with
>     >>> > your investigation.
>     >>> >
>     >>> >
>     >>> >     Kenn
>     >>> >
>     >>> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay <[email protected]> wrote:
>     >>> >
>     >>> >         Hi all,
>     >>> >
>     >>> >         Did you have a chance to review this RC? Between me and Robert
>     >>> >         we ran a significant chunk of the validations. Let me know if
>     >>> >         you have any questions.
>     >>> >
>     >>> >         Ahmet
>     >>> >
>     >>> >         On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay <[email protected]> wrote:
>     >>> >
>     >>> >             Hi everyone,
>     >>> >
>     >>> >             Please review and vote on the release candidate #1 for the
>     >>> >             version 2.8.0, as follows:
>     >>> >             [ ] +1, Approve the release
>     >>> >             [ ] -1, Do not approve the release (please provide specific
>     >>> >             comments)
>     >>> >
>     >>> >             The complete staging area is available for your review,
>     >>> >             which includes:
>     >>> >             * JIRA release notes [1],
>     >>> >             * the official Apache source release to be deployed to
>     >>> >             dist.apache.org [2], which is
>     >>> >             signed with the key with fingerprint 6096FA00 [3],
>     >>> >             * all artifacts to be deployed to the Maven Central
>     >>> >             Repository [4],
>     >>> >             * source code tag "v2.8.0-RC1" [5],
>     >>> >             * website pull request listing the release and publishing
>     >>> >             the API reference manual [6].
>     >>> >             * Python artifacts are deployed along with the source
>     >>> >             release to the dist.apache.org [2].
>     >>> >             * Validation sheet with a tab for 2.8.0 release to help with
>     >>> >             validation [7].
>     >>> >
>     >>> >             The vote will be open for at least 72 hours. It is adopted
>     >>> >             by majority approval, with at least 3 PMC affirmative votes.
>     >>> >
>     >>> >             Thanks,
>     >>> >             Ahmet
>     >>> >
>     >>> >             [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
>     >>> >             [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
>     >>> >             [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>     >>> >             [4] https://repository.apache.org/content/repositories/orgapachebeam-1049/
>     >>> >             [5] https://github.com/apache/beam/tree/v2.8.0-RC1
>     >>> >             [6] https://github.com/apache/beam-site/pull/583 and
>     >>> >             https://github.com/apache/beam/pull/6745
>     >>> >             [7] https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
>     >>> >
