Looking at the dates in the Spark runner git log, I see a PR was merged to
change the Spark translation from classes to URNs. I cannot see how this
could impact performance. Looking at the other queries in the dashboards,
there seems to be great variability in the executions of the Spark runner,
to the point that it feels like we no longer have guarantees. I wonder
whether this is because of other loads sharing the server(s), or because
our sample is too small relative to the standard deviation.

I would proceed with the release; the real question is whether we can
somehow constrain the execution of these tests to get more consistent output.
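
To make the sample-size question concrete, here is a minimal sketch of how
one might check whether a new run time falls outside the noise of earlier
runs. Everything in it is an assumption for illustration: the function names,
the threshold k=3 and the sample numbers are hypothetical and do not come
from the Nexmark dashboards.

```python
from statistics import mean, stdev


def summarize(runtimes):
    """Return (mean, sample standard deviation, coefficient of variation)."""
    m = mean(runtimes)
    s = stdev(runtimes)  # sample standard deviation, needs at least 2 values
    return m, s, s / m


def is_outside_noise(baseline, candidate, k=3.0):
    """True if candidate is more than k standard deviations from the baseline mean.

    k=3.0 is an arbitrary illustrative threshold, not a project convention.
    """
    m, s, _ = summarize(baseline)
    return abs(candidate - m) > k * s


# Hypothetical run times in seconds, NOT taken from the real dashboards.
baseline = [10.0, 12.0, 11.0, 9.0, 13.0]
print(summarize(baseline))
print(is_outside_noise(baseline, 22.0))  # a 2x slowdown
print(is_outside_noise(baseline, 14.0))  # a run within normal variation
```

With only a handful of runs the standard deviation estimate is itself noisy,
so a check like this is only as trustworthy as the number of baseline samples
behind it.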


On Fri, Dec 7, 2018 at 4:10 PM Etienne Chauchot <echauc...@apache.org>
wrote:

> Hi all,
> Regarding query7 on Spark:
> - there doesn't seem to be a functional regression: the query passes and
> the output size is still the same
>
> - Also, the performance degradation seems to be only on Spark; the other
> runners do not seem to suffer from it.
>
> - The performance degradation seems to be constant since 11/12, so we can
> rule out a temporary load on the Jenkins server that would cause delays
> in the Max transform.
>
> => query7 uses the Max transform, fanout and side inputs; has any of these
> parts changed recently (11/12/18) in Spark?
>
> Etienne
>
> On Thursday, 6 December 2018 at 21:32 -0800, Chamikara Jayalath wrote:
>
> Udi or anybody else who is familiar with Nexmark, please -1 the vote
> thread if you think this particular performance regression for Spark/Direct
> runners is a blocker. Otherwise I think we can continue the vote.
>
> Thanks,
> Cham
>
> On Thu, Dec 6, 2018 at 6:19 PM Chamikara Jayalath <chamik...@google.com>
> wrote:
>
> Are either of these regressions due to known issues? If not, should they
> be considered release blockers?
>
> Thanks,
> Cham
>
> On Thu, Dec 6, 2018 at 6:11 PM Udi Meiri <eh...@google.com> wrote:
>
> For DirectRunner there are regressions in query 7 sql direct runner batch
> mode
> <https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424&widget=732741424&container=411089194>
>  (2x)
> and streaming mode (5x).
>
>
> On Thu, Dec 6, 2018 at 5:59 PM Udi Meiri <eh...@google.com> wrote:
>
> I see a regression for query 7 spark runner batch mode
> <https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712&widget=1782465104&container=462502368>
>  on
> about 2018-11-13.
>
> On Thu, Dec 6, 2018 at 2:46 AM Chamikara Jayalath <chamik...@google.com>
> wrote:
>
> Hi everyone,
>
> Please review and vote on the release candidate #1 for the version 2.9.0,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release to be deployed to dist.apache.org
> [2], which is signed with the key with fingerprint EEAC70DF3D0BC23B [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "v2.9.0-RC1" [5],
> * website pull request listing the release [6] and publishing the API
> reference manual [7].
> * Python artifacts are deployed along with the source release to the
> dist.apache.org [2].
> * Validation sheet with a tab for 2.9.0 release to help with validation
> [8].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Cham
>
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12344258
> [2] https://dist.apache.org/repos/dist/dev/beam/2.9.0/
> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> [4] https://repository.apache.org/content/repositories/orgapachebeam-1054/
> [5] https://github.com/apache/beam/tree/v2.9.0-RC1
> [6] https://github.com/apache/beam/pull/7215
> [7] https://github.com/apache/beam-site/pull/584
> [8]
> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=2053422529
>
>
