[jira] [Commented] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691914#comment-15691914
 ] 

ASF GitHub Bot commented on KAFKA-4424:
---

Github user MatthiasBechtold closed the pull request at:

https://github.com/apache/kafka/pull/2152


> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
> Attachments: FinalTest.java, FinalTestReversed.java
>
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Matthias Bechtold (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691904#comment-15691904
 ] 

Matthias Bechtold commented on KAFKA-4424:
--

I see, thank you for your time.

I did not think right away that the order of tests would matter.

But this was very interesting to say the least, I think now that I was wrong 
about the final keyword there (interesting article from IBM there).

> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
> Attachments: FinalTest.java, FinalTestReversed.java
>
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691817#comment-15691817
 ] 

Michael Andre Pearce (IG) commented on KAFKA-4424:
--

The link you reference re virtual calls.

This is much more about monomorphic call or polymorphic calls. Making a class 
that implements an interface final, where the method invocation is by interface 
methods, does not change this.

This is more to do with the number of class's loaded that implement the 
interface. 

So in case of single implementation being used and loaded your jvm you have a 
monomorphic case for the interface, the JVM will inline this (final or not).

If you happen to have two implementations being used and loaded the jvm will 
still be able to inline but will create a branch case, the second loaded 
implementation will be slower if invoked due to the branch.

If you have more than two implementations loaded the JVM will on loading these 
do on stack replacement of the previously loaded inlined, and more to using 
visual tables.


You'll see this occur if you turn on -XX:+PrintCompilation

A classical implementation and write up showing this is:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note taking the code, and running it with or without final 
implementations makes no difference.

Also i've taken this classical test, for your final and non final cases 
(attached to this jira), if you note I've loaded two versions, one with the 
final being declared and loaded by the JVM first and vice versa. As you not in 
both the implementation loaded first due to the inlined branch will be more 
performant.

On checking your original test case we noted that the FinalByteArraySerializer 
version runs first (due to alphabetic ordering that test are run in) , as such 
it would be always the first in the inline branch benefitting from this, this 
would explain why it seems always final was negligible faster when running your 
benchmark test case.


> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
> Attachments: FinalTest.java, FinalTestReversed.java
>
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691736#comment-15691736
 ] 

Michael Andre Pearce (IG) commented on KAFKA-4424:
--

Just reading the link you reference.

Virtual calls and jump tables is monomorphic vs polymorphic calls.

If only one implementation(this implementation being final or not) of an 
interface is loaded then you will get a monomorphic implementation and it can 
be fully inlined. 

If you load another, your method is inlined but with a branch, once you load 
further hotspot will get on stack replacement and this is when jump tables are 
needed.

You can see this occurring with on stack replacement with 
-XX:+PrintCompilation, as classes are loaded.

A classical test case showing this:
http://mechanical-sympathy.blogspot.co.uk/2012/04/invoke-interface-optimisations.html

You'll note that making the class's final or not in the test, makes no 
difference to the outcome.

In case of serialisers/deserialisers this is our case as we have an interface 
that is implemented, as such making these implementations final doesn't do much.

> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691571#comment-15691571
 ] 

Michael Andre Pearce (IG) commented on KAFKA-4424:
--

Hi Matthias,

When we took your test and ran it this evening (with the same parameters) we 
get:

# Run complete. Total time: 01:40:11

Benchmark   Mode   CntScore  Error  
Units
KafkaBenchmark.testFinalSerializer thrpt  1000 9378.179 ±   26.059  
ops/s
KafkaBenchmark.testFinalSerializerNoFlush  thrpt  1000  1283796.450 ± 4976.711  
ops/s
KafkaBenchmark.testSerializer  thrpt  1000 9325.273 ±   26.581  
ops/s
KafkaBenchmark.testSerializerNoFlush   thrpt  1000  1289296.549 ± 5127.774  
ops/s

The performance difference we are seeing is very negligible at best. We have 
run this across a few machines (1 macbook, 1 cisco blade server, 1 nutanix vm) 
within our company and get similar results between final and non-final. (we 
actually had one result come in from the linux vm running on our nutanix 
clusters where the non-final was negligible faster, we repeated and it reversed 
to have the other negligible faster, but it does show that this seems to be 
negligible).

We have run on all the machines using the latest Kakfa 0.10.1.0 version, on JDK 
1.8.0_112, VM 25.112-b16 and kafka setup locally (as per we understand you have 
done), the macbook was on El Capitan, and the cisco blade and nutanix vm are 
running RHEL7.

The above stats copied are from running in particular on a laptop (but are 
inline with what we've seen also in our server environment) just easier to copy 
from as our server environments are protected.

MacBook Pro (Retina, 15-inch, Mid 2015)
2.2 GHz Intel Core i7
16 GB 1600 MHz DDR3


This is what we have come to expect, essentially making a class final doesn't 
give as much a speed boost as people come to think, as the modern jvm compilers 
like hotspot (there is another commercial one which I'm sure we all know of ;) 
but as its propriety/commercial shall avoid it for this discussion ), they 
really do a lot of magic under the hood for us.

Whilst this article from way back when is very dated and now not relevant in 
regards to precise jvm internals as a lot has moved on. 
http://www.ibm.com/developerworks/java/library/j-jtp1029/index.html

There is one core take away i think is important, and is more important when 
making a decision to use final or not.

"
final classes and methods can be a significant inconvenience when programming 
-- they limit your options for reusing existing code and extending the 
functionality of existing classes. While sometimes a class is made final for a 
good reason, such as to enforce immutability, the benefits of using final 
should outweigh the inconvenience. Performance enhancement is almost always a 
bad reason to compromise good object-oriented design principles, and when the 
performance enhancement is small or nonexistent, this is a bad trade-off indeed.
"

As stated, this is a change to API level, then i think really this should need 
a KIP, and no doubt some further tests and arguments on a KIP discussion for 
the pros and con's.

Its worth noting also very recently the ProducerRecord and ConsumerRecord, for 
extensibility reasons were made non-final, if you take the current master.

> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4424) Make serializer classes final

2016-11-23 Thread Matthias Bechtold (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690654#comment-15690654
 ] 

Matthias Bechtold commented on KAFKA-4424:
--

Thank you for your feedback, I can totally see your point.

So I did some tests concerning the *producer* throughput, using JMH:

https://github.com/MatthiasBechtold/kafka-throughput-test

I think performance somewhat varies based on test envrionment/method.

Anyways I experienced a small performance gain (between 0,26% - 5,5%) , which 
is of course somewhat greater in warmup phase (this also helps, doesn't it?).

This is not an end-to-end test as proposed, but I did not figure out how to 
quickly implement such test. This test should also provide at least some 
evidence towards possible performance gains.

> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4424) Make serializer classes final

2016-11-20 Thread Michael Andre Pearce (IG) (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15681447#comment-15681447
 ] 

Michael Andre Pearce (IG) commented on KAFKA-4424:
--

Is there a supporting test to prove a significant performance improvement in an 
end 2 end setup after jit/jvm warm up? 

Obvious reasons is that this would break any code in organisations thats 
extended this as its part of the client api, so to make this change needs to 
prove a significant performance improvement % to warrant breaking these.

> Make serializer classes final
> -
>
> Key: KAFKA-4424
> URL: https://issues.apache.org/jira/browse/KAFKA-4424
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Matthias Bechtold
>Priority: Minor
>
> Implementations of simple serializers / deserializers should be final to 
> prevent JVM method call overhead.
> See also:
> https://wiki.openjdk.java.net/display/HotSpot/VirtualCalls
> This breaks the API slightly, inheritors must change to generic interfaces 
> Serializer / Deserializer. But architecture-wise final serialization classes 
> make the most sense to me.
> So what do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)