GitHub user Xazax-hun opened a pull request:
https://github.com/apache/flink/pull/2211
[WIP][FLINK-3599] Code generation for PojoSerializer and PojoComparator
The current implementation of the serializers can be a performance
bottleneck in some scenarios. These problems were also reported on the
mailing list recently [1].
For example, the PojoSerializer uses reflection to access the fields,
which is slow [2].
For the complete proposal see [3].
This pull request implements code generation support for PojoComparators
and PojoSerializers. On my machine I measured a performance improvement
of about 10% for the WordCountPojo example. This pull request does not
yet implement distribution of the generated code to the task managers.
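To illustrate the difference between reflective and generated field access,
here is a minimal sketch. It is not the code from this pull request: the
class FieldAccessSketch, the WordCount POJO and both helper methods are
hypothetical names made up for this example, and the "generated" variant is
simply written by hand to show the kind of direct access a generator could
emit.

```java
import java.lang.reflect.Field;
import java.util.Arrays;

public class FieldAccessSketch {

    // Hypothetical POJO with public fields, used only for this illustration.
    public static class WordCount {
        public String word;
        public int count;
    }

    // Reflective access: every read goes through Field.get, which performs
    // access checks and boxes the primitive int into an Integer.
    static Object[] readReflectively(WordCount value) {
        try {
            Field[] fields = {
                WordCount.class.getField("word"),
                WordCount.class.getField("count")
            };
            Object[] out = new Object[fields.length];
            for (int i = 0; i < fields.length; i++) {
                out[i] = fields[i].get(value);
            }
            return out;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Direct access: what type-specific code could look like (assumption).
    // The fields are read with plain field loads that the JIT can inline.
    static Object[] readDirectly(WordCount value) {
        return new Object[] { value.word, value.count };
    }

    public static void main(String[] args) {
        WordCount wc = new WordCount();
        wc.word = "flink";
        wc.count = 3;
        // Both variants yield the same field values.
        System.out.println(Arrays.equals(readReflectively(wc), readDirectly(wc)));
    }
}
```

Both methods return the same values; the difference is only in how the
fields are reached, which is where the reflection overhead mentioned above
comes from.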
[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Tuple-performance-and-the-curious-JIT-compiler-td10666.html
[2]
https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/PojoSerializer.java#L369
[3]
https://docs.google.com/document/d/1VC8lCeErx9kI5lCMPiUn625PO0rxR-iKlVqtt3hkVnk
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/Xazax-hun/flink serializer_codegen
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/2211.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2211
----
commit 6263ebe496ed7a0ac9ca9df35ffcdb8633519944
Author: Gabor Horvath <[email protected]>
Date: 2016-04-17T13:40:33Z
Implement PojoSerializer and PojoComparator generators.
commit be698b44453f10add284db1c5dee24f719a87902
Author: Gabor Horvath <[email protected]>
Date: 2016-07-03T13:58:41Z
Migrate code generation templates from string literals to files.
commit d8c63a1749a439907ef6bfbdb2da1962df7b61d3
Author: Gabor Horvath <[email protected]>
Date: 2016-07-06T11:23:29Z
Fix a bunch of test failures.
----