[ https://issues.apache.org/jira/browse/CAMEL-16354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17619667#comment-17619667 ]
Nicolas Filotto commented on CAMEL-16354:
-----------------------------------------
*JMH Settings/Environment*
{noformat}
# JMH version: 1.35
# VM version: JDK 11, OpenJDK 64-Bit Server VM, 11+28
# VM invoker: /usr/local/jdk-11/bin/java
# VM options: <none>
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
{noformat}
*Literal Separator*
{noformat}
Benchmark                                         (type)  Mode  Cnt       Score       Error  Units
MyBenchmark.newSplitterWithSpecificSeparator        TINY  avgt   15      98.854 ±     0.592  ns/op
MyBenchmark.newSplitterWithSpecificSeparator       SMALL  avgt   15     321.351 ±    13.413  ns/op
MyBenchmark.newSplitterWithSpecificSeparator      MEDIUM  avgt   15    2905.660 ±    17.381  ns/op
MyBenchmark.newSplitterWithSpecificSeparator         BIG  avgt   15   28820.933 ±    81.483  ns/op
MyBenchmark.currentSplitterWithSpecificSeparator    TINY  avgt   15    1722.947 ±     7.436  ns/op
MyBenchmark.currentSplitterWithSpecificSeparator   SMALL  avgt   15    3795.835 ±    22.725  ns/op
MyBenchmark.currentSplitterWithSpecificSeparator  MEDIUM  avgt   15   29253.195 ±   118.111  ns/op
MyBenchmark.currentSplitterWithSpecificSeparator     BIG  avgt   15  281669.154 ±  4146.332  ns/op
{noformat}
=> The new approach is about 10 times faster (from roughly 17x for TINY down to roughly 10x for BIG), with a more predictable average time regardless of the number of tokens to extract.
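To illustrate why a literal separator can be split much more cheaply than through a regex-driven scanner, here is a minimal sketch. It is not the code from the patch: the class and method names are hypothetical, and {{java.util.Scanner}} is used only as a stand-in for {{org.apache.camel.util.Scanner}}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical sketch, not the actual Camel implementation.
final class LiteralSplitSketch {

    // Splits on a literal separator with indexOf only: no regex engine involved.
    static List<String> splitLiteral(String input, String separator) {
        if (separator.isEmpty()) {
            throw new IllegalArgumentException("separator must not be empty");
        }
        List<String> tokens = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = input.indexOf(separator, start)) >= 0) {
            tokens.add(input.substring(start, idx));
            start = idx + separator.length();
        }
        tokens.add(input.substring(start)); // last token, possibly empty
        return tokens;
    }

    // Baseline for comparison: the scanner treats its delimiter as a regex,
    // so the literal has to be quoted and matched by the regex engine.
    static List<String> splitWithScanner(String input, String separator) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(Pattern.quote(separator));
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(splitLiteral("a,b,c", ","));
        System.out.println(splitWithScanner("a,b,c", ","));
    }
}
```

Both methods return the same tokens for simple inputs such as {{a,b,c}}; the difference the benchmark measures is the cost of getting there.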
*Pattern Separator*
{noformat}
Benchmark                                         (type)  Mode  Cnt       Score       Error  Units
MyBenchmark.newSplitterWithSpecificSeparator        TINY  avgt   15     419.647 ±     6.646  ns/op
MyBenchmark.newSplitterWithSpecificSeparator       SMALL  avgt   15     956.858 ±     8.325  ns/op
MyBenchmark.newSplitterWithSpecificSeparator      MEDIUM  avgt   15    7799.689 ±   111.446  ns/op
MyBenchmark.newSplitterWithSpecificSeparator         BIG  avgt   15   76060.763 ±   216.367  ns/op
MyBenchmark.currentSplitterWithSpecificSeparator    TINY  avgt   15    2085.248 ±     3.928  ns/op
MyBenchmark.currentSplitterWithSpecificSeparator   SMALL  avgt   15    5187.362 ±    17.942  ns/op
MyBenchmark.currentSplitterWithSpecificSeparator  MEDIUM  avgt   15   44677.586 ±   771.814  ns/op
MyBenchmark.currentSplitterWithSpecificSeparator     BIG  avgt   15  444297.273 ± 10360.311  ns/op
{noformat}
=> The new approach is about 5 times faster (from roughly 5x for TINY up to roughly 6x for BIG), with a more predictable average time regardless of the number of tokens to extract.
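For the pattern case, one plausible way to split without a scanner is to compile the pattern once and walk its matches directly. The sketch below only illustrates that general technique; this comment does not say how the new splitter handles patterns, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the actual Camel implementation.
final class PatternSplitSketch {

    // Splits on a precompiled pattern by iterating over its matches.
    static List<String> splitPattern(String input, Pattern separator) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = separator.matcher(input);
        int start = 0;
        while (matcher.find()) {
            if (matcher.end() == matcher.start()) {
                continue; // ignore zero-length matches, they separate nothing
            }
            tokens.add(input.substring(start, matcher.start()));
            start = matcher.end();
        }
        tokens.add(input.substring(start)); // last token, possibly empty
        return tokens;
    }

    public static void main(String[] args) {
        Pattern separator = Pattern.compile("[,;]\\s*"); // compiled once, reusable
        System.out.println(splitPattern("a, b;c", separator));
    }
}
```

Compiling the {{Pattern}} once and reusing it across calls keeps the regex compilation cost out of the per-split path.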
Where:
- _TINY_ has 3 tokens to extract
- _SMALL_ has 10 tokens to extract
- _MEDIUM_ has 100 tokens to extract
- _BIG_ has 1000 tokens to extract
NB: A lower score is better, as scores are expressed in nanoseconds per operation.
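Since a lower score is better, the speedup for each input size is simply the current score divided by the new one, and dividing a score by the token count gives a rough cost per token. The sketch below redoes that arithmetic with the scores copied from the two tables above; the class and helper names are made up for illustration.

```java
import java.util.Locale;

// Hypothetical helper that only re-derives ratios from the published scores.
final class SplitterScoreSketch {

    static final String[] TYPES = {"TINY", "SMALL", "MEDIUM", "BIG"};
    static final int[] TOKENS = {3, 10, 100, 1000};

    // Scores in ns/op, copied from the Literal Separator table.
    static final double[] NEW_LITERAL = {98.854, 321.351, 2905.660, 28820.933};
    static final double[] CURRENT_LITERAL = {1722.947, 3795.835, 29253.195, 281669.154};

    // Scores in ns/op, copied from the Pattern Separator table.
    static final double[] NEW_PATTERN = {419.647, 956.858, 7799.689, 76060.763};
    static final double[] CURRENT_PATTERN = {2085.248, 5187.362, 44677.586, 444297.273};

    // How many times faster the new splitter is for one input size.
    static double speedup(double currentNs, double newNs) {
        return currentNs / newNs;
    }

    // Average cost of extracting a single token.
    static double nsPerToken(double scoreNs, int tokens) {
        return scoreNs / tokens;
    }

    static void report(String label, double[] current, double[] updated) {
        for (int i = 0; i < TYPES.length; i++) {
            System.out.printf(Locale.ROOT,
                    "%-8s %-6s speedup=%.1fx new=%.1f ns/token current=%.1f ns/token%n",
                    label, TYPES[i],
                    speedup(current[i], updated[i]),
                    nsPerToken(updated[i], TOKENS[i]),
                    nsPerToken(current[i], TOKENS[i]));
        }
    }

    public static void main(String[] args) {
        report("literal", CURRENT_LITERAL, NEW_LITERAL);
        report("pattern", CURRENT_PATTERN, NEW_PATTERN);
    }
}
```

For the literal separator, the new splitter stays at roughly 29 to 33 ns per token across all sizes, whereas the current one drops from roughly 574 ns per token for TINY to roughly 282 ns per token for BIG. That would be consistent with a sizeable fixed setup cost in the current approach, although the benchmark itself does not isolate that cost.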
> camel-core - Optimize Splitters using Scanner
> ---------------------------------------------
>
> Key: CAMEL-16354
> URL: https://issues.apache.org/jira/browse/CAMEL-16354
> Project: Camel
> Issue Type: Improvement
> Components: camel-core
> Reporter: Claus Ibsen
> Assignee: Nicolas Filotto
> Priority: Major
> Fix For: 3.x
>
>
> Using {{org.apache.camel.util.Scanner}} for splitting could potentially be
> optimized for more basic splitting by a single char, as we do for commas.
> The Scanner creates a lot of object allocations with regexp patterns and
> whatnot, which is way overkill.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)