[
https://issues.apache.org/jira/browse/BEAM-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570847#comment-16570847
]
Reuven Lax commented on BEAM-5092:
----------------------------------
Looking at the PR there are only two things that could have affected Nexmark:
1. I changed various longs to DateTime objects. This might be slower, but is
a good change here.
2. I added a schema to Nexmark types for SQL nexmark, but it looks like the
schema is taking priority over the hard-coded Nexmark Coder. This is easily fixed.
Also, SchemaCoder isn't expected to be slow, but on investigation it appears
that SchemaCoder does not implement structuralValue. This means that Beam will
encode elements every time it wants to check them for equality, which is quite
expensive. This is also easy to fix.
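To illustrate why a missing structuralValue is costly, here is a minimal,
self-contained sketch. It is not Beam code: MiniCoder, EncodedBytes, Bid,
SlowBidCoder and FastBidCoder are hypothetical stand-ins, and the assumption
is only that a coder without its own structuralValue falls back to comparing
encoded bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

public class StructuralValueSketch {

  /** Hypothetical stand-in for a coder; NOT Beam's actual Coder class. */
  interface MiniCoder<T> {
    void encode(T value, DataOutputStream out) throws IOException;

    /** Assumed fallback: encode the value and compare the resulting bytes. */
    default Object structuralValue(T value) {
      try {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        encode(value, new DataOutputStream(bytes));
        return new EncodedBytes(bytes.toByteArray());
      } catch (IOException e) {
        throw new IllegalStateException(e);
      }
    }
  }

  /** Wraps a byte[] so that equals() compares contents, not identity. */
  static final class EncodedBytes {
    private final byte[] bytes;

    EncodedBytes(byte[] bytes) {
      this.bytes = bytes;
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof EncodedBytes && Arrays.equals(bytes, ((EncodedBytes) o).bytes);
    }

    @Override
    public int hashCode() {
      return Arrays.hashCode(bytes);
    }
  }

  /** Hypothetical element type with two fields. */
  static final class Bid {
    final long auction;
    final long price;

    Bid(long auction, long price) {
      this.auction = auction;
      this.price = price;
    }
  }

  /** Counts encode() calls so the cost of each equality check is visible. */
  static int encodeCalls = 0;

  /** Relies on the fallback, so every equality check pays for encoding. */
  static class SlowBidCoder implements MiniCoder<Bid> {
    @Override
    public void encode(Bid value, DataOutputStream out) throws IOException {
      encodeCalls++;
      out.writeLong(value.auction);
      out.writeLong(value.price);
    }
  }

  /** Overrides structuralValue with a cheap, equals()-comparable view. */
  static class FastBidCoder extends SlowBidCoder {
    @Override
    public Object structuralValue(Bid value) {
      return Arrays.asList(value.auction, value.price);
    }
  }

  /** Equality check routed through the coder's structural value. */
  static <T> boolean sameElement(MiniCoder<T> coder, T a, T b) {
    return coder.structuralValue(a).equals(coder.structuralValue(b));
  }

  public static void main(String[] args) {
    Bid a = new Bid(1, 100);
    Bid b = new Bid(1, 100);

    encodeCalls = 0;
    boolean slowEqual = sameElement(new SlowBidCoder(), a, b);
    System.out.println("fallback: equal=" + slowEqual + " encodes=" + encodeCalls);

    encodeCalls = 0;
    boolean fastEqual = sameElement(new FastBidCoder(), a, b);
    System.out.println("override: equal=" + fastEqual + " encodes=" + encodeCalls);
  }
}
```

In this sketch the fallback path encodes both elements on every comparison,
while the override compares field values directly and never calls encode.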
> Nexmark 10x performance regression
> ----------------------------------
>
> Key: BEAM-5092
> URL: https://issues.apache.org/jira/browse/BEAM-5092
> Project: Beam
> Issue Type: New Feature
> Components: sdk-java-core
> Reporter: Andrew Pilloud
> Assignee: Reuven Lax
> Priority: Critical
>
> There looks to be a 10x performance hit on the DirectRunner and Flink nexmark
> jobs. It first showed up in this build:
> [https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Direct/151/changes]
> [https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424]
> [https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384]