[
https://issues.apache.org/jira/browse/NIFI-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129540#comment-15129540
]
ASF GitHub Bot commented on NIFI-1460:
--------------------------------------
Github user trkurc commented on the pull request:
https://github.com/apache/nifi/pull/202#issuecomment-178942251
I did a quick experiment (code below) and generated 2^20 "UUIDs" using a
few mechanisms:
1. as you did in this pr (~120ms)
2. a slightly different version of what is in the PR (~55ms)
3. using UUID with the atomic long as the low order bits (~800ms)
4. what was there originally (~3400ms)
5. Using JUG [1] to generate a time-based UUID (~800ms)
So, I think changing the method you created to return a String rather than
a Long is a net win (based on 1 vs 2). As for a long as a string vs a UUID
as a string: although the long is faster by a factor of 6, I do think
returning a UUID as a string may better fit the contract of the UUID core
attribute. Although it is stored as a String pretty much everywhere, the
comments clearly say UUID [2]. I'm not sure the technical debt accrued by
using a Long instead of a UUID is a good trade for the performance increase
in this instance.
Side note - the JUG github page is an interesting read. There are some good
benchmarks talking about how awesome it is.
[1] https://github.com/cowtowncoder/java-uuid-generator
[2] https://github.com/apache/nifi/blob/master/nifi-commons/nifi-utils/src/main/java/org/apache/nifi/flowfile/attributes/CoreAttributes.java#L34
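```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// JUG (java-uuid-generator), see [1]
import com.fasterxml.uuid.Generators;
import com.fasterxml.uuid.impl.TimeBasedGenerator;

public class UuidTiming {
    public static void main(String[] args) {
        long iter = 1 << 20;
        AtomicLong ai = new AtomicLong(0);

        // 1. As in the PR: box the counter value as a Long, then convert it to a String.
        long now = System.currentTimeMillis();
        for (int i = 0; i < iter; i++) {
            Long x = ai.getAndIncrement();
            String x2 = x.toString();
        }
        System.out.printf("%d elapsed\n", (System.currentTimeMillis() - now));

        // 2. Slight variation: convert the primitive long directly, without boxing.
        now = System.currentTimeMillis();
        for (int i = 0; i < iter; i++) {
            String x2 = Long.toString(ai.getAndIncrement());
        }
        System.out.printf("%d elapsed\n", (System.currentTimeMillis() - now));

        // 3. UUID with the atomic long as the low-order bits.
        now = System.currentTimeMillis();
        for (int i = 0; i < iter; i++) {
            UUID u = new UUID(0, ai.getAndIncrement());
            String x2 = u.toString();
        }
        System.out.printf("%d elapsed\n", (System.currentTimeMillis() - now));

        // 4. What was there originally: random UUIDs.
        now = System.currentTimeMillis();
        for (int i = 0; i < iter; i++) {
            UUID u = UUID.randomUUID();
            String x2 = u.toString();
        }
        System.out.printf("%d elapsed\n", (System.currentTimeMillis() - now));

        // 5. JUG time-based UUIDs (the timing includes creating the generator).
        now = System.currentTimeMillis();
        TimeBasedGenerator g = Generators.timeBasedGenerator();
        for (int i = 0; i < iter; i++) {
            UUID u = g.generate();
            String x2 = u.toString();
        }
        System.out.printf("%d elapsed\n", (System.currentTimeMillis() - now));
    }
}
```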
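To make the contract point about option 3 concrete, here is a minimal sketch of how a counter could be wrapped so that the result still parses as a UUID. The class and method names (`CounterUuidSketch`, `nextUuidString`, `isUuidString`) are hypothetical and are not the method from the PR; note also that a UUID built this way has no version bits set, so it is only UUID-formatted, not a random or time-based UUID.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; not the helper from the PR.
public class CounterUuidSketch {

    // Builds a UUID-formatted string with the counter value as the low-order bits.
    static String nextUuidString(AtomicLong counter) {
        return new UUID(0L, counter.getAndIncrement()).toString();
    }

    // Returns true if the string can be parsed back by java.util.UUID.
    static boolean isUuidString(String s) {
        try {
            UUID.fromString(s);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        AtomicLong counter = new AtomicLong(5);
        String asLong = Long.toString(counter.get()); // plain "5"
        String asUuid = nextUuidString(counter);      // same value, UUID layout
        System.out.println(asLong + " parses as UUID: " + isUuidString(asLong));
        System.out.println(asUuid + " parses as UUID: " + isUuidString(asUuid));
    }
}
```

Any code that reads the attribute back with `UUID.fromString` would keep working with the second form but would fail on the bare long string, which is the kind of debt I had in mind.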
> Test Performance improvement. Test Timeout Mitigation.
> ------------------------------------------------------
>
> Key: NIFI-1460
> URL: https://issues.apache.org/jira/browse/NIFI-1460
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework, Tools and Build
> Affects Versions: 0.4.1
> Environment: linux, unix with true random number generator.
> Reporter: Puspendu Banerjee
> Priority: Minor
> Labels: performance
> Fix For: 0.5.0
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> The existing test case
> nifi-framework-core/src/test/java/org/apache/nifi/controller/TestStandardFlowFileQueue.java
> makes a huge number of calls to UUID.randomUUID(), which is very slow in Linux
> and Unix environments when there is not much activity (mouse movement, etc.). In
> addition, UUID.randomUUID() depends on /dev/(u)random to get a random
> number; such a system call costs I/O, and /dev/random is bandwidth/rate limited,
> which further slows down overall performance.
> A workaround is the rngd daemon (ref: http://linux.die.net/man/8/rngd)