Re: 8058779: Faster implementation of String.replace(CharSequence, CharSequence)
On 31.05.2015 13:28, Ulf Zibis wrote: On 31.05.2015 at 08:38, Peter Levart wrote: Hi, Yes, this one is much easier to grasp. As I understand, the check is to avoid overflow in the calculation of the StringBuilder initial capacity (newLenHint). If overflow happened, newLenHint would be negative and the StringBuilder construction would fail with NegativeArraySizeException. But the calculation of newLenHint: int newLenHint = value.length - targLen + replValue.length; ...considering the following: value.length >= 0, targLength >= 0, replValue.length >= 0, targLength <= value.length ...in case of overflow, can only produce a negative value. So the check could simply be: if (newLenHint < 0) { throw new OutOfMemoryError(); } Hm, what has this situation to do with Out-Of-Memory? That should indicate that we were about to try to allocate an array of size greater than Integer.MAX_VALUE. E.g. java.util.AbstractCollection.java: private static int hugeCapacity(int minCapacity) { if (minCapacity < 0) // overflow throw new OutOfMemoryError("Required array size too large"); In other words, IMHO NegativeArraySizeException is much better here and additionally saves performance. Additionally you could propagate it to InvalidArgumentException. I don't think it is an InvalidArgument case. It just happens that the result is larger than allowed, so let's throw OOM! Sincerely yours, Ivan -Ulf
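The argument above can be checked in a small standalone sketch (an illustration, not the actual String.replace source; the parameter names mirror the snippet quoted in the mail): given the stated invariants, the sum only goes negative when the true result exceeds Integer.MAX_VALUE.

```java
public class NewLenHintDemo {
    // Sketch of the proposed check: with value.length >= 0, targLen >= 0,
    // replValue.length >= 0 and targLen <= value.length, the int sum can
    // only be negative if it overflowed.
    static int newLenHint(int valueLen, int targLen, int replLen) {
        int newLenHint = valueLen - targLen + replLen;
        if (newLenHint < 0) {   // overflow: true result exceeds Integer.MAX_VALUE
            throw new OutOfMemoryError();
        }
        return newLenHint;
    }

    public static void main(String[] args) {
        System.out.println(newLenHint(10, 3, 5));   // prints 12
        try {
            newLenHint(Integer.MAX_VALUE, 0, 1);    // wraps to Integer.MIN_VALUE
        } catch (OutOfMemoryError e) {
            System.out.println("overflow detected");
        }
    }
}
```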
Re: 8058779: Faster implementation of String.replace(CharSequence, CharSequence)
Thanks! I'll simplify this check for overflow, as you're suggesting. Sincerely yours, Ivan On 31.05.2015 10:01, Xueming Shen wrote: On 5/30/2015 11:38 PM, Peter Levart wrote: Hi, Yes, this one is much easier to grasp. As I understand, the check is to avoid overflow in the calculation of the StringBuilder initial capacity (newLenHint). If overflow happened, newLenHint would be negative and the StringBuilder construction would fail with NegativeArraySizeException. But the calculation of newLenHint: int newLenHint = value.length - targLen + replValue.length; ...considering the following: value.length >= 0, targLength >= 0, replValue.length >= 0, targLength <= value.length ...in case of overflow, can only produce a negative value. So the check could simply be: if (newLenHint < 0) { throw new OutOfMemoryError(); } Right? Regards, Peter agreed.
Re: 8058779: Faster implementation of String.replace(CharSequence, CharSequence)
On 31.05.2015 18:54, Ivan Gerasimov wrote: Hi Remi! On 31.05.2015 11:43, Remi Forax wrote: I agree with the others, the code is more readable. There is a slight difference between the current behavior and the behavior of the proposed code: foo.replace(bar, null) should throw an NPE because replacement is null, but the proposed code will return foo, because replacement.toString() is called after the short-circuit test (j < 0). Yes, you're right, thanks for catching it! And the regression test should have caught that, but I only had a.replace(a, null) there, which passed. I fixed the code and added the case to the regression test in the new webrev. Which is right here: http://cr.openjdk.java.net/~igerasim/8058779/05/webrev/ so the first part of the code should be: String starget = target.toString(); // implicit null check String srepl = replacement.toString(); // implicit null check int j = indexOf(starget); if (j < 0) { return this; } ... also 'i' can be initialized just before the 'do'; there is no point in initializing it before. To finish, the two 'final' are useless here, but I suppose it's a matter of style :) I moved the declaration of i just to save a line. I don't think it decreased performance. Declaring the 'value' variable final was suggested by Martin, and I think it is reasonable (see http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-March/032601.html). The other variable is final just for symmetry :) Sincerely yours, Ivan cheers, Rémi On 05/31/2015 04:19 AM, Ivan Gerasimov wrote: Hi everyone! Here's another webrev, in which replace() is implemented with a StringBuilder. On my benchmark it is almost as fast as the version backed with arrays, but this variant is much shorter. Credits to Sherman for combining the general algorithm with the case of an empty target. Comments and further suggestions are welcome! BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058779 WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/04/webrev/ Sincerely yours, Ivan
RE: JEP 132: More-prompt finalization
The problems start with the ReferenceQueue object and the heavy synchronization in it. Consider structures like j.u.WeakHashMap that need to expunge entries. If such structures are made somewhat concurrent, the lock in ReferenceQueue starts to become a serious problem:
- In structures that are concurrent at the entry level (like JDK 8's ConcurrentHashMap), if methods like get() or put() try to expunge, the lock in ReferenceQueue renders the structures non-concurrent.
- In structures that are multi-reader-single-writer locked, read methods cannot expunge (because they would have to promote to a writer), but they can't even check the queue, because that turns the multi-reader structure into a synchronized one.
- In addition to expunge calls contending on the ReferenceQueue lock, the ReferenceHandler thread can also contend on the same lock.
- There is no fast clear() method on ReferenceQueue. That would be quite useful on a resize event.
The above issues forced us to have a dedicated thread that does periodic expunging of References. This works OK under light load, but can fall behind under heavy load. Thanks, Moh @Moh Can you elaborate some more on what twists were necessary or what problems you had?
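The workaround Moh describes, a dedicated thread draining the queue instead of expunging from get()/put(), might look roughly like this (a minimal sketch under my own assumptions, not the code Moh's team actually wrote; put/get are elided):

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: a weak-keyed cache whose stale entries are removed by one
// dedicated thread, so readers and writers never touch the
// ReferenceQueue lock on their hot paths.
public class ExpungingCache<V> {
    static final class Key extends WeakReference<Object> {
        final int hash;   // cached so lookup works after the referent is cleared
        Key(Object referent, ReferenceQueue<Object> q) {
            super(referent, q);
            hash = System.identityHashCode(referent);
        }
    }

    private final Map<Key, V> map = new ConcurrentHashMap<>();
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    public ExpungingCache() {
        Thread expunger = new Thread(() -> {
            try {
                for (;;) {
                    // remove() blocks on the queue's internal lock, but
                    // only this single thread ever contends on it
                    Reference<?> ref = queue.remove();
                    map.remove(ref);
                }
            } catch (InterruptedException e) {
                // exit on interrupt
            }
        }, "cache-expunger");
        expunger.setDaemon(true);
        expunger.start();
    }
    // put/get elided; they would wrap keys as new Key(k, queue)
}
```

As Moh notes, this keeps the common case fast but lets the queue build up if the expunger falls behind under heavy load.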
Re: 8058779: Faster implementation of String.replace(CharSequence, CharSequence)
Hi Remi! On 31.05.2015 11:43, Remi Forax wrote: I agree with the others, the code is more readable. There is a slight difference between the current behavior and the behavior of the proposed code: foo.replace(bar, null) should throw an NPE because replacement is null, but the proposed code will return foo, because replacement.toString() is called after the short-circuit test (j < 0). Yes, you're right, thanks for catching it! And the regression test should have caught that, but I only had a.replace(a, null) there, which passed. I fixed the code and added the case to the regression test in the new webrev. so the first part of the code should be: String starget = target.toString(); // implicit null check String srepl = replacement.toString(); // implicit null check int j = indexOf(starget); if (j < 0) { return this; } ... also 'i' can be initialized just before the 'do'; there is no point in initializing it before. To finish, the two 'final' are useless here, but I suppose it's a matter of style :) I moved the declaration of i just to save a line. I don't think it decreased performance. Declaring the 'value' variable final was suggested by Martin, and I think it is reasonable (see http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-March/032601.html). The other variable is final just for symmetry :) Sincerely yours, Ivan cheers, Rémi On 05/31/2015 04:19 AM, Ivan Gerasimov wrote: Hi everyone! Here's another webrev, in which replace() is implemented with a StringBuilder. On my benchmark it is almost as fast as the version backed with arrays, but this variant is much shorter. Credits to Sherman for combining the general algorithm with the case of an empty target. Comments and further suggestions are welcome! BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058779 WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/04/webrev/ Sincerely yours, Ivan
Re: JEP 132: More-prompt finalization
Hi, Thanks for the views and opinions. I'll try to address them in-line... On 05/29/2015 04:18 AM, David Holmes wrote: Hi Peter, I guess I'm very concerned about the premise that finalization should scale to millions of objects and be performed highly concurrently. To me that's sending the wrong message about finalization. It also isn't the most effective use of CPU resources - most people would want to do useful work on most CPUs most of the time. Cheers, David @David Ok, fair enough. It shouldn't be necessary for finalization to scale to millions of objects or be performed concurrently. Normal programs don't need this. But there is a diagnostic command being developed at this moment that displays the finalization queue. The utility of such a command, as I understand it, is precisely to show when the finalization thread cannot cope and Finalizer(s) accumulate. So it must be that some hypothetical programs (ab)use finalization or are buggy (deadlock), so that the queue builds up. To diagnose this, a diagnostic command is helpful. To fix it, one has to fix the code. But what if the problem is not so much the allocation/death rate of finalizable instances as the heavy code of the finalize() methods of those instances? I agree that such programs have a smell and should be rewritten not to use finalization but other means of cleanup, such as multiple threads removing WeakReferences from a queue, for example, or something completely different and not based on Reference(s). But wouldn't it be nice if one could simply set a system property for the max. number of threads processing Finalizer(s)? I have prepared an improved variant of the prototype that employs a single ReferenceHandler thread and adds a ForkJoinPool that by default has a single worker thread, which replaces the single finalization thread. So by default, no more threads are used than currently. If one wants, (s)he can increase the concurrency of finalization with a system property.
I have also improved the benchmarks, which now focus on CPU overhead when processing references at more typical rates, rather than on maximum throughput. They show that all changes taken together practically halve the CPU-time overhead of finalization processing. So the freed CPU time can be used for more useful work. I have also benchmarked the typical asynchronous WeakReference processing scenario, where one thread removes enqueued WeakReferences from the queue. Results show about a 25% decrease in CPU-time overhead. Why does the prototype reduce overhead more for finalization than for WeakReference processing? The main improvement in the change is the use of multiple doubly-linked lists for registration of Finalizer(s) and the use of a lock-less algorithm for the lists. The WeakReference processing benchmark also uses such lists internally to handle registration/deregistration of WeakReferences, so the impact of this part is minimal and the difference in processing overheads between the original and changed JDK code more obvious. (De)registration of Finalizer(s), OTOH, is part of the JDK infrastructure, so the improvement to the registration list(s) also shows in the results. The results of the WeakReference processing benchmark also indicate that reverting to the use of a single finalization thread that just removes Finalizer(s) from the ReferenceQueue could lower the overhead even a bit further, but then it would not be possible to leverage the FJ pool to simply configure the parallelism of finalization. If parallel processing of Finalizer(s) is an undesirable feature, I could restore the single finalization thread, and the CPU overhead of finalization would be reduced to about 40% of the current overhead with just the changes to data structures.
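The configurable-parallelism idea could be set up along these lines (a sketch only; the property name used here is hypothetical, not taken from Peter's prototype):

```java
import java.util.concurrent.ForkJoinPool;

// Sketch: sizing a ForkJoinPool for Finalizer processing from a system
// property, defaulting to a single worker thread (matching today's one
// finalization thread). "jdk.finalizer.parallelism" is a made-up name.
public class FinalizerPoolConfig {
    static ForkJoinPool createPool() {
        int parallelism = Integer.getInteger("jdk.finalizer.parallelism", 1);
        if (parallelism < 1) parallelism = 1;   // never fewer than one worker
        return new ForkJoinPool(parallelism);
    }

    public static void main(String[] args) {
        ForkJoinPool pool = createPool();
        System.out.println(pool.getParallelism()); // 1 unless the property is set
    }
}
```

With this shape, leaving the property unset keeps exactly the current thread count, and raising it only affects programs that opt in.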
So, for the curious, here's the improved prototype: http://cr.openjdk.java.net/~plevart/misc/JEP132/ReferenceHandling/webrev.02/ And here are the improved benchmarks (with some results inline): http://cr.openjdk.java.net/~plevart/misc/JEP132/ReferenceHandling/refproc/ The benchmark results in ThroughputBench.java show the output of the test(s) when run with the Linux time command, which shows the elapsed real time and the consumed user and system CPU times. I think this is relevant for measuring CPU overhead. So my question is: is it or is it not desirable to have a configurable means to parallelize finalization processing? The reduction of CPU overhead of infrastructure code should always be desirable, right? On 05/29/2015 05:57 AM, Kirk Pepperdine wrote: Hi Peter, It is a very interesting proposal, but to further David's comments, the life-cycle costs of reference objects are horrendous, of which the actual process of finalizing an object is only a fraction. Unfortunately your micro-benchmark only focuses on one aspect of that cost. In other words, it isn't very representative of a real concern. In the real world the finalizer *must
Re: 8058779: Faster implementation of String.replace(CharSequence, CharSequence)
On 5/30/2015 11:38 PM, Peter Levart wrote: Hi, Yes, this one is much easier to grasp. As I understand, the check is to avoid overflow in the calculation of the StringBuilder initial capacity (newLenHint). If overflow happened, newLenHint would be negative and the StringBuilder construction would fail with NegativeArraySizeException. But the calculation of newLenHint: int newLenHint = value.length - targLen + replValue.length; ...considering the following: value.length >= 0, targLength >= 0, replValue.length >= 0, targLength <= value.length ...in case of overflow, can only produce a negative value. So the check could simply be: if (newLenHint < 0) { throw new OutOfMemoryError(); } Right? Regards, Peter agreed.
Re: 8058779: Faster implementation of String.replace(CharSequence, CharSequence)
On 05/31/2015 06:34 AM, Xueming Shen wrote: On 5/30/15 7:19 PM, Ivan Gerasimov wrote: Hi everyone! Here's another webrev, in which replace() is implemented with a StringBuilder. On my benchmark it is almost as fast as the version backed with arrays, but this variant is much shorter. Credits to Sherman for combining the general algorithm with the case of an empty target. Comments and further suggestions are welcome! BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058779 WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/04/webrev/ Sincerely yours, Ivan This one is much better :-) I would suggest leaving the overflow-conscious code to the StringBuilder. The same range check is done inside ABS every time the repl string is appended into the buffer, so it does not appear to be very helpful to have a redundant check here. And it seems this check only works for a single appearance of the target string. 2260 // overflow-conscious code 2261 if (value.length - targLen > Integer.MAX_VALUE - replValue.length) { 2262 throw new OutOfMemoryError(); 2263 } Hi, Yes, this one is much easier to grasp. As I understand, the check is to avoid overflow in the calculation of the StringBuilder initial capacity (newLenHint). If overflow happened, newLenHint would be negative and the StringBuilder construction would fail with NegativeArraySizeException. But the calculation of newLenHint: int newLenHint = value.length - targLen + replValue.length; ...considering the following: value.length >= 0, targLength >= 0, replValue.length >= 0, targLength <= value.length ...in case of overflow, can only produce a negative value. So the check could simply be: if (newLenHint < 0) { throw new OutOfMemoryError(); } Right? Regards, Peter
Re: 8058779: Faster implementation of String.replace(CharSequence, CharSequence)
I agree with the others, the code is more readable. There is a slight difference between the current behavior and the behavior of the proposed code: foo.replace(bar, null) should throw an NPE because replacement is null, but the proposed code will return foo, because replacement.toString() is called after the short-circuit test (j < 0). so the first part of the code should be: String starget = target.toString(); // implicit null check String srepl = replacement.toString(); // implicit null check int j = indexOf(starget); if (j < 0) { return this; } ... also 'i' can be initialized just before the 'do'; there is no point in initializing it before. To finish, the two 'final' are useless here, but I suppose it's a matter of style :) cheers, Rémi On 05/31/2015 04:19 AM, Ivan Gerasimov wrote: Hi everyone! Here's another webrev, in which replace() is implemented with a StringBuilder. On my benchmark it is almost as fast as the version backed with arrays, but this variant is much shorter. Credits to Sherman for combining the general algorithm with the case of an empty target. Comments and further suggestions are welcome! BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058779 WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/04/webrev/ Sincerely yours, Ivan
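Rémi's ordering point can be seen in a small standalone sketch (a hypothetical static method mirroring the snippet above, not the actual String.replace implementation): calling toString() on both arguments up front makes the null checks fire before the short-circuit return.

```java
// Sketch of the ordering Rémi suggests: toString() on both arguments
// first (implicit null checks), then the short-circuit test.
public class ReplaceOrderDemo {
    static String replace(String self, CharSequence target, CharSequence replacement) {
        String starget = target.toString();      // implicit null check
        String srepl = replacement.toString();   // implicit null check
        int j = self.indexOf(starget);
        if (j < 0) {
            return self;                         // target not found
        }
        return self.replace(starget, srepl);     // delegate for the demo
    }

    public static void main(String[] args) {
        System.out.println(replace("foo", "bar", "baz")); // prints foo
        try {
            replace("foo", "bar", null);   // NPE even though "bar" is absent
        } catch (NullPointerException e) {
            System.out.println("NPE as required");
        }
    }
}
```

With the original ordering (indexOf before replacement.toString()), the second call would silently return "foo" instead of throwing, which is exactly the behavioral difference Rémi flags.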
Re: 8058779: Faster implementation of String.replace(CharSequence, CharSequence)
On 31.05.2015 at 08:38, Peter Levart wrote: Hi, Yes, this one is much easier to grasp. As I understand, the check is to avoid overflow in the calculation of the StringBuilder initial capacity (newLenHint). If overflow happened, newLenHint would be negative and the StringBuilder construction would fail with NegativeArraySizeException. But the calculation of newLenHint: int newLenHint = value.length - targLen + replValue.length; ...considering the following: value.length >= 0, targLength >= 0, replValue.length >= 0, targLength <= value.length ...in case of overflow, can only produce a negative value. So the check could simply be: if (newLenHint < 0) { throw new OutOfMemoryError(); } Hm, what has this situation to do with Out-Of-Memory? In other words, IMHO NegativeArraySizeException is much better here and additionally saves performance. Additionally you could propagate it to InvalidArgumentException. -Ulf
Re: RFR(M, v10): JDK-8059036 : Implement Diagnostic Commands for heap and finalizerinfo
Everyone, Please take a look at the new version of the fix. http://cr.openjdk.java.net/~dsamersoff/JDK-8059036/webrev.10/ Changes (relative to the previous version) are in Finalizer.java and diagnosticCommand.cpp. This version copies data from the Map.Entry objects into an array of FinalizerHistogramEntry instances; then the VM prints the content of this array. -Dmitry On 2015-05-28 21:06, Mandy Chung wrote: On 05/28/2015 07:35 AM, Peter Levart wrote: Hi Mandy, On 05/27/2015 03:32 PM, Mandy Chung wrote: Taking it further - is it simpler to return a String[] of all class names, including the duplicated ones, and have the VM do the count? Are you concerned with the size of the String[]? Yes, the histogram is much smaller than the list of all instances. There can be millions of instances waiting in the finalizer queue, but only a few distinct classes. Do you happen to know which libraries are the major contributors to these millions of finalizers? It has been strongly recommended to avoid finalizers (see Effective Java, Item 7). I'm not surprised that existing code is still using finalizers, but we should really encourage them to migrate away from it. What could be done in Java to simplify things in native code, but still not format the output, is to convert the array of Map.Entry(s) into an Object[] array of alternating {String, int[], String, int[], ...} Would this simplify the native code for the price of a little extra work in Java? The sizes of the old and new arrays are not big (# of distinct classes of finalizable objects in the queue). I also prefer writing in Java and writing less native code (much simplified). It's an interface that we have to agree upon and keep simple. Having the returned Object[] as alternating String and int[] is a reasonable compromise. ReferenceQueue.java - you can move @SuppressWarnings from the method to just the field variable rn: @SuppressWarnings("unchecked") Reference<? extends T> rn = r.next; Finalizer.java - It's better to rename printFinalizationQueue, as it inspects the finalizer reference queue (maybe getFinalizerHistogram?). Can you move this method to the end of this class, add a comment saying that this is invoked by the VM for jcmd -finalizerinfo support, and add @return to describe the returned content. I also think you can remove @SuppressWarnings for this method. Mandy -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources.
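The alternating {String, int[]} layout discussed above could be produced roughly like this (a sketch under my own assumptions; the real code lives in Finalizer.java and takes its input from the finalizer reference queue, not from a plain array of names):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: building an Object[] of alternating class name and count, the
// count wrapped in an int[1] so the native side reads a plain int array.
public class HistogramSketch {
    static Object[] toAlternatingArray(String[] classNames) {
        Map<String, int[]> counts = new HashMap<>();
        for (String name : classNames) {
            counts.computeIfAbsent(name, k -> new int[1])[0]++;
        }
        Object[] result = new Object[counts.size() * 2];
        int i = 0;
        for (Map.Entry<String, int[]> e : counts.entrySet()) {
            result[i++] = e.getKey();    // class name
            result[i++] = e.getValue();  // int[1] holding the count
        }
        return result;
    }
}
```

The native caller then walks the array two slots at a time, which avoids having to iterate a Map via JNI.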