Re: RFR: 8065172: More core reflection final and volatile annotations
Hi, Signature.hashCode() (Signature being the static class in MethodUtil) should be eagerly calculated (i.e. a final field too), since its only usage is as a key in the hashmap. I will add a bit more later. Stanimir On Wed, Nov 19, 2014 at 2:42 AM, Martin Buchholz marti...@google.com wrote: Hi Joe, Peter, Paul This is the followup on thread safety I promised Peter. As I said before, I'm willing to take the risk of using fields written via a race if we have a reflective test proving it correct, and the bug contains the 0.1 version of such a test. https://bugs.openjdk.java.net/browse/JDK-8065172 http://cr.openjdk.java.net/~martin/webrevs/openjdk9/core-reflection-more-safety/
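The suggestion above can be sketched as follows. This is a hypothetical stand-in for the Signature class discussed, not the actual MethodUtil code: the hash is computed once in the constructor and stored in a final field, so hashCode() is a trivial read and the object is safe to publish even via a data race.

```java
import java.util.Arrays;

// Hypothetical sketch: a map key whose hash is eagerly computed in the
// constructor and stored in a final field (safe under racy publication).
final class Signature {
    private final String name;
    private final Class<?>[] parameterTypes;
    private final int hash; // computed once, never recomputed

    Signature(String name, Class<?>[] parameterTypes) {
        this.name = name;
        this.parameterTypes = parameterTypes.clone();
        this.hash = name.hashCode() ^ Arrays.hashCode(parameterTypes);
    }

    @Override public int hashCode() { return hash; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Signature)) return false;
        Signature s = (Signature) o;
        return name.equals(s.name)
            && Arrays.equals(parameterTypes, s.parameterTypes);
    }
}
```

Since the only use of such an object is as a HashMap key, there is no scenario in which the hash is computed but never needed, so laziness buys nothing.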
Re: RFR: 8061950: Class.getMethods() exhibits quadratic time complexity
HashArray.java:

82 Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1))

This effectively multiplies expectedMaxSize by 3, I guess you meant >> instead. The intention of capacity(int expectedMaxSize) is to return the smallest power of 2 number that is greater than 3 * expectedMaxSize / 2. The method is copied from IdentityHashMap, so I'm sure it does it right ;-) Indeed it does; still, it can lead to almost 3x (i.e. a fill factor of .33) waste for some numbers (11, 22, 43). But I suppose it's a good tradeoff. Perhaps extra a method about effective capacity? Don't know what you mean by effective capacity? The capacity() is meant to size the initial array. And it behaves correctly even for large requested expectedMaxSize values that would cause an overflow in naively written code. I meant to extract (not extra) a method for s + (s << 1), but it's used only once, so please disregard this part. There is some code duplication in the different implementations of consolidate(), but trying to refactor it (into a static method) would cause some performance degradation (I don't think there is an intrinsic for the native Class.isInterface()). So I guess it can be left that way, as it's not too much. Stanimir On Wed, Nov 5, 2014 at 6:58 PM, Peter Levart peter.lev...@gmail.com wrote: On 11/01/2014 06:49 PM, Peter Levart wrote: On 10/30/2014 07:13 PM, Martin Buchholz wrote: Which is why I was expecting to have to write an adaptive data structure that switches from linear search to hash-based when some threshold is exceeded. ...or two data structures. Since I have already arranged so that construction of MethodTable gets an upper limit on the total number of possible Method(s) that can be added to it, we can have two MethodTable implementations and switch based on this number: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.05/ Hi, I made some progress on this front.
I created a new MethodTable implementation (the third one) with similar performance to the HashMap-based one, but producing half the garbage. I also added some optimizations to getMethod(), getMethods() and MethodTable in general that deal with special cases / parameter types comparison and have a measurable impact on performance (for example, loading all rt.jar classes and calling getMethods() for them):

Original:
19658 classes loaded in 2.003623648 seconds.
494392 methods obtained in 0.932151066 seconds.
494392 methods obtained in 0.968247076 seconds.
494392 methods obtained in 0.989218185 seconds.
494392 methods obtained in 1.018700179 seconds.
494392 methods obtained in 0.945078767 seconds.
494392 methods obtained in 0.930856897 seconds.
494392 methods obtained in 0.905420555 seconds.
494392 methods obtained in 0.89253006 seconds.
494392 methods obtained in 0.914590141 seconds.
494392 methods obtained in 0.891616716 seconds.
494392 methods obtained in 0.891839815 seconds.
494392 methods obtained in 0.892863985 seconds.
494392 methods obtained in 0.960643274 seconds.
494392 methods obtained in 0.959266392 seconds.
494392 methods obtained in 0.894688322 seconds.
494392 methods obtained in 0.892751891 seconds.
494392 methods obtained in 0.912542838 seconds.
494392 methods obtained in 0.912219636 seconds.
494392 methods obtained in 0.895791468 seconds.
494392 methods obtained in 0.891673959 seconds.

Patched:
19658 classes loaded in 2.145270948 seconds.
494398 methods obtained in 0.722630874 seconds.
494398 methods obtained in 0.697521755 seconds.
494398 methods obtained in 0.689642554 seconds.
494398 methods obtained in 0.742724992 seconds.
494398 methods obtained in 0.695693313 seconds.
494398 methods obtained in 0.685169108 seconds.
494398 methods obtained in 0.663012634 seconds.
494398 methods obtained in 0.666446935 seconds.
494398 methods obtained in 0.675662884 seconds.
494398 methods obtained in 0.65369404 seconds.
494398 methods obtained in 0.656608178 seconds.
494398 methods obtained in 0.668384051 seconds.
494398 methods obtained in 0.657548957 seconds.
494398 methods obtained in 0.672332234 seconds.
494398 methods obtained in 0.655699295 seconds.
494398 methods obtained in 0.671819628 seconds.
494398 methods obtained in 0.657232183 seconds.
494398 methods obtained in 0.654462137 seconds.
494398 methods obtained in 0.659630473 seconds.
494398 methods obtained in 0.674219391 seconds.

(the difference in method count is expected - patched code contains new methods in existing rt.jar classes ;-)

Here's the new webrev: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.06/

The optimizations made since webrev.05 are:
- getMethod() skips construction of MethodTable if there are no (super)interfaces.
- getMethods() returns just the declared public methods if there is no superclass and no (super)interfaces.
- comparing method parameter types is optimized by adding two methods to
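The capacity() sizing trick debated earlier in this thread can be illustrated with a small sketch. The expression mirrors the one quoted from HashArray.java (itself taken from IdentityHashMap); the wrapper class name is just for illustration:

```java
// Sketch of the sizing expression discussed above:
// expectedMaxSize + (expectedMaxSize << 1) == 3 * expectedMaxSize, and
// highestOneBit rounds that down to a power of two, which is always at
// least 3 * expectedMaxSize / 2. The resulting fill factor is therefore
// at most 2/3, and as low as ~1/3 for unlucky sizes such as 11, 22, 43.
final class Capacity {
    static int capacity(int expectedMaxSize) {
        return Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1));
    }
}
```

For example, expectedMaxSize = 10 gives highestOneBit(30) = 16 (fill factor .625), while expectedMaxSize = 11 gives highestOneBit(33) = 32 (fill factor .34), which is the "almost 3x waste" case mentioned in the review.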
Re: RFR JDK-6321472: Add CRC-32C API
Hi Staffan, Given the tables in the static class init:

66 private transient final static int[] byteTable0 = byteTables[0];
67 private transient final static int[] byteTable1 = byteTables[1];
68 private transient final static int[] byteTable2 = byteTables[2];
69 private transient final static int[] byteTable3 = byteTables[3];
70 private transient final static int[] byteTable4 = byteTables[4];
71 private transient final static int[] byteTable5 = byteTables[5];
72 private transient final static int[] byteTable6 = byteTables[6];
73 private transient final static int[] byteTable7 = byteTables[7];

I was wondering if assigning the tables in the static part in a similar fashion would make the code much more readable:

66 private transient final static int[] byteTable0 = byteTables[idx(0)];
67 private transient final static int[] byteTable1 = byteTables[idx(1)];
68 private transient final static int[] byteTable2 = byteTables[idx(2)];
69 private transient final static int[] byteTable3 = byteTables[idx(3)];
70 private transient final static int[] byteTable4 = byteTables[idx(4)];
71 private transient final static int[] byteTable5 = byteTables[idx(5)];
72 private transient final static int[] byteTable6 = byteTables[idx(6)];
73 private transient final static int[] byteTable7 = byteTables[idx(7)];

private static int idx(int i) { return ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN ? 7 - i : i; }

Then in the loops you can skip checking the condition, something like:

296- if (ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN) {
297-     crc = byteTable7[crc & 0xFF]
298-         ^ byteTable6[(crc >>> 8) & 0xFF]
299-         ^ byteTable5[(crc >>> 16) & 0xFF]
300-         ^ byteTable4[crc >>> 24]
301-         ^ byteTable3[secondHalf & 0xFF]
302-         ^ byteTable2[(secondHalf >>> 8) & 0xFF]
303-         ^ byteTable1[(secondHalf >>> 16) & 0xFF]
304-         ^ byteTable0[secondHalf >>> 24];
305- } else { // ByteOrder.BIG_ENDIAN
306      crc = byteTable0[secondHalf & 0xFF]
307          ^ byteTable1[(secondHalf >>> 8) & 0xFF]
308          ^ byteTable2[(secondHalf >>> 16) & 0xFF]
309          ^ byteTable3[secondHalf >>> 24]
310          ^ byteTable4[crc & 0xFF]
311          ^ byteTable5[(crc >>> 8) & 0xFF]
312          ^ byteTable6[(crc >>> 16) & 0xFF]
313          ^ byteTable7[crc >>> 24];
314- }

since the byteTableN tables are properly assigned during class init based on the endianness of the system. Of course, there won't be any performance benefit since hotspot should constant-fold the code, but it'll be easier to maintain and read. Cheers Stanimir On Thu, Nov 6, 2014 at 12:18 PM, Staffan Friberg staffan.frib...@oracle.com wrote: Anyone have time to be the second reviewer? http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.07 The CCC request has been approved for the new API. Regards, Staffan On 10/22/2014 12:16 AM, Staffan Friberg wrote: Thanks for the review. Updated the @implSpec. Also includes Peter's good catch. Webrev: http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.06 //Staffan On 10/21/2014 02:09 PM, Xueming Shen wrote: Staffan, Thanks for the package.html update. Just wonder if it would be better to use buffer.remaining(), instead of buffer.limit() - buffer.position(), in Checksum.update(ByteBuffer)'s @implSpec. The rest looks fine to me. -Sherman On 10/21/2014 01:11 PM, Staffan Friberg wrote: Converted. http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.05 //Staffan On 10/21/2014 12:34 PM, Joe Darcy wrote: Hi Staffan, If you are updating package.html, please also hg mv the file to be a package-info.java file with the equivalent javadoc. Thanks, -Joe On 10/21/2014 11:49 AM, Staffan Friberg wrote: Hi, Got an offline comment that the package.html should be updated as well to cover CRC-32C. Otherwise there are no code changes in this new webrev. http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.04 //Staffan On 10/21/2014 10:28 AM, Staffan Friberg wrote: Hi Peter, Thanks for the comments.
217 if (Unsafe.ADDRESS_SIZE == 4) {
218     // On 32 bit platforms read two ints instead of a single 64bit long

When you're reading from byte[] using Unsafe (updateBytes), you have the option of reading 64bit values on 64bit platforms. When you're reading from DirectByteBuffer memory (updateDirectByteBuffer), you're only using 32bit reads. I will add a comment in the code for this decision. The reason is that reading a long results in slightly worse performance in this
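The table-selection suggestion earlier in this thread can be sketched minimally as below. The table contents here are placeholders rather than real CRC-32C tables, and only two of the eight fields are shown; the point is that idx() folds the endianness decision into class initialization, so the hot loop needs no branch:

```java
import java.nio.ByteOrder;

// Sketch of the idx() idea: pick each lookup table once, at class-init
// time, according to native byte order. Placeholder tables, not CRC-32C.
final class TableSelect {
    private static final int[][] byteTables = new int[8][256];

    // On little-endian systems table 0 and table 7 swap roles, etc.
    static final int[] byteTable0 = byteTables[idx(0)];
    static final int[] byteTable7 = byteTables[idx(7)];

    static int idx(int i) {
        return ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN ? 7 - i : i;
    }
}
```

Whichever byte order the host has, idx(i) and idx(7 - i) always pick complementary tables, which is what lets the loop body be written once.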
Re: RFR: 8061950: Class.getMethods() exhibits quadratic time complexity
A very small issue: MethodTable.java:

697 while (i < hwm && next == null) next = methods[i++];

In my opinion the condition should be while (next == null && i < hwm), as next will be non-null on each invocation of next() and generally it's more likely to exit the loop. I suppose HashMapImpl is unused. HashArray.java:

82 Integer.highestOneBit(expectedMaxSize + (expectedMaxSize << 1))

This effectively multiplies expectedMaxSize by 3, I guess you meant >> instead. Perhaps extra a method about effective capacity? I will take a deeper look a bit later. Stanimir On Wed, Nov 5, 2014 at 6:58 PM, Peter Levart peter.lev...@gmail.com wrote: On 11/01/2014 06:49 PM, Peter Levart wrote: On 10/30/2014 07:13 PM, Martin Buchholz wrote: Which is why I was expecting to have to write an adaptive data structure that switches from linear search to hash-based when some threshold is exceeded. ...or two data structures. Since I have already arranged so that construction of MethodTable gets an upper limit on the total number of possible Method(s) that can be added to it, we can have two MethodTable implementations and switch based on this number: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.05/ Hi, I made some progress on this front. I created a new MethodTable implementation (the third one) with similar performance to the HashMap-based one, but producing half the garbage. I also added some optimizations to getMethod(), getMethods() and MethodTable in general that deal with special cases / parameter types comparison and have a measurable impact on performance (for example, loading all rt.jar classes and calling getMethods() for them):

Original:
19658 classes loaded in 2.003623648 seconds.
494392 methods obtained in 0.932151066 seconds.
494392 methods obtained in 0.968247076 seconds.
494392 methods obtained in 0.989218185 seconds.
494392 methods obtained in 1.018700179 seconds.
494392 methods obtained in 0.945078767 seconds.
494392 methods obtained in 0.930856897 seconds.
494392 methods obtained in 0.905420555 seconds.
494392 methods obtained in 0.89253006 seconds.
494392 methods obtained in 0.914590141 seconds.
494392 methods obtained in 0.891616716 seconds.
494392 methods obtained in 0.891839815 seconds.
494392 methods obtained in 0.892863985 seconds.
494392 methods obtained in 0.960643274 seconds.
494392 methods obtained in 0.959266392 seconds.
494392 methods obtained in 0.894688322 seconds.
494392 methods obtained in 0.892751891 seconds.
494392 methods obtained in 0.912542838 seconds.
494392 methods obtained in 0.912219636 seconds.
494392 methods obtained in 0.895791468 seconds.
494392 methods obtained in 0.891673959 seconds.

Patched:
19658 classes loaded in 2.145270948 seconds.
494398 methods obtained in 0.722630874 seconds.
494398 methods obtained in 0.697521755 seconds.
494398 methods obtained in 0.689642554 seconds.
494398 methods obtained in 0.742724992 seconds.
494398 methods obtained in 0.695693313 seconds.
494398 methods obtained in 0.685169108 seconds.
494398 methods obtained in 0.663012634 seconds.
494398 methods obtained in 0.666446935 seconds.
494398 methods obtained in 0.675662884 seconds.
494398 methods obtained in 0.65369404 seconds.
494398 methods obtained in 0.656608178 seconds.
494398 methods obtained in 0.668384051 seconds.
494398 methods obtained in 0.657548957 seconds.
494398 methods obtained in 0.672332234 seconds.
494398 methods obtained in 0.655699295 seconds.
494398 methods obtained in 0.671819628 seconds.
494398 methods obtained in 0.657232183 seconds.
494398 methods obtained in 0.654462137 seconds.
494398 methods obtained in 0.659630473 seconds.
494398 methods obtained in 0.674219391 seconds.
(the difference in method count is expected - patched code contains new methods in existing rt.jar classes ;-)

Here's the new webrev: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.06/

The optimizations made since webrev.05 are:
- getMethod() skips construction of MethodTable if there are no (super)interfaces.
- getMethods() returns just the declared public methods if there is no superclass and no (super)interfaces.
- comparing method parameter types is optimized by adding two methods to Method/LangReflectAccess/ReflectionFactory.

The new MethodTable implementation, based on a linear-probe hash table, is a space/garbage improvement. I took IdentityHashMap, removed unneeded stuff and modified its API. The result is HashArray. Its API is similar in function and form to java.util.Map, but it doesn't use separate keys and values: an element of a HashArray is a key and a value at the same time. Elements are always non-null, so the method return values are unambiguous. As HashArray is a linear-probe hash table and there are no Map.Entry objects involved, the underlying data structure is very simple and memory efficient. It is just a sparse array of elements with length
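A minimal sketch of the linear-probe "element is its own key" idea described above (this is illustrative code, not the actual HashArray; resizing and removal are omitted for brevity):

```java
// Sketch of a linear-probe table where each stored element is both key and
// value: put() replaces an equal element and returns the old one, get()
// returns the stored element equal to the probe. Elements are non-null,
// so null returns are unambiguous, as the mail above notes.
final class ProbeTable<E> {
    private final Object[] table = new Object[16]; // power-of-two length; no resizing here

    @SuppressWarnings("unchecked")
    E get(E probe) {
        int i = probe.hashCode() & (table.length - 1);
        for (Object e; (e = table[i]) != null; i = (i + 1) & (table.length - 1)) {
            if (e.equals(probe)) return (E) e;
        }
        return null;
    }

    @SuppressWarnings("unchecked")
    E put(E element) {
        int i = element.hashCode() & (table.length - 1);
        for (Object e; (e = table[i]) != null; i = (i + 1) & (table.length - 1)) {
            if (e.equals(element)) { table[i] = element; return (E) e; }
        }
        table[i] = element;
        return null;
    }
}
```

With no Map.Entry wrappers, the only allocation is the backing array itself, which is the space/garbage advantage described in the mail.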
Re: RFR: 8060132: Handlers configured on abstract nodes in logging.properties are not always properly closed
945 } catch (Exception ex) {
946     System.err.println("Can't load log handler \"" + type + "\"");
947     System.err.println("" + ex);
948     ex.printStackTrace();
949 }

I'm not quite sure if handling Error (in particular any subclass of LinkageError) would make sense here (similar to the invocation of listeners). Of course there are other cases handling just Exception and using word instead of 'type', but that's a matter for another clean up. Stanimir On Wed, Nov 5, 2014 at 6:15 PM, Daniel Fuchs daniel.fu...@oracle.com wrote: Hi Mandy, Stanimir, Here is the new webrev: http://cr.openjdk.java.net/~dfuchs/webrev_8060132/webrev.03/index.html best regards, -- daniel On 05/11/14 12:47, Daniel Fuchs wrote: On 04/11/14 23:40, Mandy Chung wrote: On 11/4/14 8:48 AM, Daniel Fuchs wrote: Hi, Here is a revised patch that introduces a new logger.handlers.ensureCloseOnReset=true|false property. You have explored several different options and, as you have found, none of them is ideal. I think having a property to specify ensureCloseOnReset would be reasonable to address the memory leak concern I have. Would it be better to name it logger.handlers.closeOnReset? Well - closeOnReset is the normal behavior - if the logger is still around. So I believe ensureCloseOnReset is a better name. I am less sure if we need to have a global switch to revert the default - do we expect the number of logger.handlers to be large? I would expect there would be a small number. I would expect the cases where you need to specify ensureCloseOnReset will be rare. So maybe we should wait until the case arises before implementing a global switch: we can add that later if it's needed. The local switches will provide a workable (even though possibly cumbersome) workaround. This property can be used as a fallback to turn the new behavior off in the improbable case where an application may need to do so. I have updated the LogManager class-level API documentation.
http://cr.openjdk.java.net/~dfuchs/webrev_8060132/webrev.02/ line 104-114: s/true/{@code true}/ {@code reset} should be {@link ...} to the reset method (the first occurrence) Will do. It doesn't seem necessary to specify strongly referenced. Maybe it's good to say it will ensure the handlers associated with the loggers will be closed at shutdown or at reset. OK If closeOnReset is false, the application should ensure the handlers are closed before the logger gets garbage collected. I will try to find something. - something along that line. typos: 186: explicitely - explicitly I keep on making this same mistake again and again :-( Perhaps PersistentLogger can be renamed to CloseOnResetLogger to make it clear what these loggers are kept for. Similarly, the variable name persistentLogger may be renamed too. OK. 917: final might be adding noise I like final ;-) 940: persitentLoggers OK line 917: you don't need (names.length == 0) to determine the default, right? if names.length == 0, it wouldn't add to the persistentLoggers list as there is no iteration in the for loop. I will use the syntax that Stanimir suggested. It's much simpler than what I originally wrote. If names.length == 0 we don't need to look up the ensureCloseOnReset property. Thanks Mandy, I will send a new webrev shortly - with both your and Stanimir's comments integrated (barring the global switch - as we could do that in a followup revision if that's really needed). best regards, -- daniel Mandy
Re: RFR: 8060132: Handlers configured on abstract nodes in logging.properties are not always properly closed
Hi,

917 final boolean ensureCloseOnReset = names.length == 0 ? false
918     : getBooleanProperty(handlersPropertyName + ".ensureCloseOnReset",
919         names.length > 0);

I think the statement would be a lot easier to read as:

final boolean ensureCloseOnReset = names.length > 0
    && getBooleanProperty(handlersPropertyName + ".ensureCloseOnReset", true);

Also, probably adding a default for ensureCloseOnReset (via a system property, -Djava.util.logging.ensureCloseOnReset) would be less daunting to revert to the default behavior than setting it in the concrete properties files (that might be bundled in a jar already). Stanimir On Tue, Nov 4, 2014 at 6:48 PM, Daniel Fuchs daniel.fu...@oracle.com wrote: Hi, Here is a revised patch that introduces a new logger.handlers.ensureCloseOnReset=true|false property. This property can be used as a fallback to turn the new behavior off in the improbable case where an application may need to do so. I have updated the LogManager class-level API documentation. http://cr.openjdk.java.net/~dfuchs/webrev_8060132/webrev.02/ best regards, -- daniel On 24/10/14 23:50, Daniel Fuchs wrote: Hi Mandy, Is this only a problem for abstract nodes of Loggers (i.e. the ancestor is in the logging configuration but the library/app never creates such a logger)? No - the problem is with any logger that has handlers configured from the logging.properties, and which the application creates and releases before the Cleaner thread is run. Your patch will keep all loggers defined in the configuration strongly referenced. Not all loggers defined in the configuration. Only those loggers for which a handler is explicitly defined in the configuration and which have been created at least once - directly or indirectly (= through a child) - by the application. Would that cause a memory leak? The main source of memory leak - and probably the only one we should care about - is when a Logger that we strongly reference has a reference to an object whose class would be otherwise garbage collected.
In other words - if the logger is the only thing that prevents an object, and its class, and its class loader, from being garbage collected. If I'm not mistaken - there are only three kinds of objects that could cause such an issue: - Instances of Level - Instances of Handler - Instances of ResourceBundle. Instances of Level are already held by Level.knownLevel - this is a known bug, and until that is fixed, Logger won't be the only thing preventing a custom instance of Level from being garbage collected. In addition, such a custom instance of Level (an instance of a custom Level subclass) can't be set directly from the configuration - so this can only happen if the application has set this level programmatically on the Logger. If I remember correctly, Handlers created from the configuration are looked up from the System ClassLoader - so that shouldn't cause a memory leak either. That leaves resource bundles - which might be an issue. However - the set of loggers for which a Handler is configured from the configuration file is by nature bounded. Therefore - it is my opinion that the proposed patch does not create a memory leak per se: A memory leak can only happen if an application constantly (and programmatically) adds new Handlers to the existing logger, believing that it will get a new instance of Logger each time because the old one should have been gc'ed. In that case - I believe the fault would be in the application, not in the Logger... If you think that we should still leave the door open for an application to explicitly request the old behavior - we could define a new property for the configuration file, e.g. something like:

<logger-name>.handlers = <handler list> // no changes here
<logger-name>.handlers.ensureCloseOnReset = false // new property
// True by default when a logger has handlers configured from logging.properties.
// Can be explicitly turned off for a given logger, in the configuration.
// Ignored if the logger has no handler configured.
// Not inherited by child loggers

If <logger-name>.handlers.ensureCloseOnReset is explicitly set to false, then the LogManager will not add the logger to the persistentLoggers list. Do you think we should do that? best regards, -- daniel On 10/24/14 8:31 PM, Mandy Chung wrote: On 10/10/2014 8:39 AM, Daniel Fuchs wrote: http://cr.openjdk.java.net/~dfuchs/webrev_8060132/webrev.01 Sorry for the delay. I have been pondering whether there is a better alternative, and it looks like you have considered a few other options and none of them is a good one. typos: 174 explicitely 925: persitentLoggers Is this only a problem for abstract nodes of Loggers (i.e. the ancestor is in the logging configuration but the library/app never creates such a logger)? Your patch will keep all loggers defined in the
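Putting the proposal above together, a logging.properties fragment might look like the following. The logger and handler names are hypothetical examples, and the per-logger property is the one proposed in this thread, not (at this point) an existing JDK property:

```properties
# Hypothetical logging.properties fragment illustrating the proposal above.
com.example.MyLogger.handlers = java.util.logging.FileHandler

# Opt this one logger out of the proposed keep-alive-until-reset behavior.
# Ignored if the logger has no handler configured; not inherited by children.
com.example.MyLogger.handlers.ensureCloseOnReset = false
```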
Re: RFR: 8061950: Class.getMethods() exhibits quadratic time complexity
Hi, Personally I have no objections to the latest webrev. It looks like you are creating a more sophisticated data structure with more garbage, which won't show up as much on our small-memory benchmarks. Which is why I was expecting to have to write an adaptive data structure that switches from linear search to hash-based when some threshold is exceeded. There is a hidden cost in calling Method.getParameterTypes() even with linear search, although it can be optimized to always first check Method.getName() for reference equality. So the new approach creates 2 more short-lived objects per Method - the HashMap.Node and the MethodList. MethodTable.Impl and its underlying Object[] are a one-time cost as they are properly sized. I have no objection if the instances die trivially in the young gen. A possible optimization would be using a short-circuit in case of no interfaces and extending java.lang.Object directly (quite a common case). But even then Object has quite a few public methods [getClass(), notify(), notifyAll(), wait(long, int), wait(long), toString(), hashCode(), equals(Object)], so it requires another HashMap and checking how many of them are overridden. That also reminds me: Object methods should always be cached, regardless of the config flag. The Object class cannot be redefined and the memory is constant. Stanimir --- (The introduction of default methods into openjdk makes method lookup even more confusing, and scares me, especially since I am uncertain we have enough tests...) --- If your changes to StarInheritance.java are general improvements, we could/should just check them in now (TDD?). --- Use javadoc comments /** even for private methods:

+/* This method returns 'root' Constructor object. It MUST be copied with ReflectionFactory
+ * before handed to any code outside java.lang.Class or modified.
+ */
private Constructor<T> getConstructor0(Class<?>[] parameterTypes,

--- The Java way is to spell intfcsMethods interfacesMethods On Thu, Oct 30, 2014 at 9:53 AM, Peter Levart peter.lev...@gmail.com wrote: Hi, Here's a patch: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.04/ for the following issue: https://bugs.openjdk.java.net/browse/JDK-8061950 For those following the thread Loading classes with many methods is very expensive, this is the 4th iteration of this patch. I stepped back from the approach taken in the 3rd webrev and rather refined the approach taken in the 2nd webrev. Instead of using a LinkedHashMap with values being linked lists of nodes holding Method objects with the same signature, which couldn't be made so that iteration would follow insertion order, I used a plain HashMap this time. MethodNode(s) holding Method objects form a linked list of those sharing the same signature. In addition, MethodNode(s) form a separate doubly-linked list of all nodes which is used to keep insertion order. The space use should be the same, as I traded two object pointers in MethodNode for two fewer pointers in HashMap.Entry (vs. LinkedHashMap.Entry that was used before). So insertion order is not tracked by the Map but by the MethodTable now. I also track the size of MethodTable now, so there's no pre-traversing pass to allocate the Method[] when requested. This version seems to be the fastest of all tried so far (measured by http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/GetAllRtMethods.java using -Dsun.reflect.noCaches=true):

Original:
19658 classes loaded in 2.027207954 seconds.
494392 methods obtained in 0.945519006 seconds.
494392 methods obtained in 0.95250834 seconds.
494392 methods obtained in 0.917841393 seconds.
494392 methods obtained in 0.971922977 seconds.
494392 methods obtained in 0.947145131 seconds.
494392 methods obtained in 0.937122672 seconds.
494392 methods obtained in 0.893975586 seconds.
494392 methods obtained in 0.90307736 seconds.
494392 methods obtained in 0.918782446 seconds.
494392 methods obtained in 0.881968668 seconds.

Patched:
19658 classes loaded in 2.034081706 seconds.
494392 methods obtained in 0.8082686 seconds.
494392 methods obtained in 0.743307034 seconds.
494392 methods obtained in 0.751591979 seconds.
494392 methods obtained in 0.726954964 seconds.
494392 methods obtained in 0.761067189 seconds.
494392 methods obtained in 0.70825252 seconds.
494392 methods obtained in 0.696489363 seconds.
494392 methods obtained in 0.692555238 seconds.
494392 methods obtained in 0.695150116 seconds.
494392 methods obtained in 0.697665068 seconds.

I also updated the test that checks multiple inheritance of abstract methods, as I found that my 1st iteration of the algorithm was not getting the same results as original code although all jtreg tests were passing. I added 4 cases that include abstract classes as well as interfaces to the mix. Some of them would fail with my
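The "check the name first" idea mentioned in this thread can be sketched as follows. sameSignature() is a hypothetical helper, not JDK code; it simply orders the checks from cheapest to most expensive, so getParameterTypes() (which defensively clones an array) is only called when the names already match:

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Sketch of the ordering discussed above: compare cheap parts first and
// only fall back to the array comparison when necessary. A reference
// comparison of names would only be a valid fast path if both strings
// are known to be interned, so equals() is used here.
final class SignatureCompare {
    static boolean sameSignature(Method a, Method b) {
        if (a == b) return true;                       // trivially equal
        if (!a.getName().equals(b.getName())) return false;
        return Arrays.equals(a.getParameterTypes(), b.getParameterTypes());
    }
}
```

The hidden cost the mail refers to is exactly the getParameterTypes() clone, which this ordering avoids for the common case of differing names.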
Re: RFR (s): 4354680: Runtime.runFinalization() silently clears interrupted flag in the calling thread
Aside, I noticed in this code:

146 forkSecondaryFinalizer(new Runnable() {
147     private volatile boolean running;
148     public void run() {
149         if (running)
150             return;
151         final JavaLangAccess jla = SharedSecrets.getJavaLangAccess();
152         running = true;

that as we create a new Runnable on every call, the running field serves absolutely no purpose. Cheers, David It could be a defense against Thread.currentThread().run() in a finalizer, which would cause calling run() of Thread.target. I am not sure if that would cause the invocation, though. But I have done similar nonsense myself... Stanimir
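A small sketch makes David's point concrete: because a fresh Runnable is created per call, its per-instance running flag starts out false every time, so the guard can only ever trip if the same instance's run() is entered twice. The names below are illustrative, not the actual Finalizer code:

```java
// Sketch of the guard pattern under discussion. The 'running' flag is
// per-instance, so it never fires for a freshly created Runnable; it only
// matters on re-entry into the *same* instance.
final class GuardDemo {
    static int bodyRuns = 0;

    static Runnable newTask() {
        return new Runnable() {
            private volatile boolean running;
            public void run() {
                if (running) return;   // never true on first entry of a fresh instance
                running = true;
                bodyRuns++;            // the "real work" stands in here
            }
        };
    }
}
```

Calling run() twice on one instance executes the body once; creating a new instance per call (as the quoted code does) executes it every time, which is why the field is useless there.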
Re: Loading classes with many methods is very expensive
Hi Peter, The removal of the value wrapper is a clever approach to reduce the number of new instances created, although it feels very unnatural (at least to me). Small optimization: eagerly calculate the hash in the c'tor:

hash =
149 method.getReturnType().hashCode() ^
150 System.identityHashCode(method.getName()) ^
151 Arrays.hashCode(method.getParameterTypes())

and so the real int hashCode() { return seq + hash; }. Also, I am not sure if method.getName().hashCode() would be better or not: if identityHashCode has not previously been invoked on that string, then depending on the configured hash code generator (-XX:hashCode), System.identityHashCode might be slow or, worse, call C random(), which doesn't scale. Setting the identity hash code requires a CAS too. The CAS is of course a one-time cost, but still... About the order of the returned methods: if you remove and then put from/to the LHM, the order of iteration is going to be greatly altered. Is that ok? Stanimir On Wed, Oct 29, 2014 at 7:16 PM, Peter Levart peter.lev...@gmail.com wrote: On 10/29/2014 02:08 PM, Joel Borggrén-Franck wrote: Hi Peter, I'm not entirely convinced this is a bug. The lookup order for getMethod has for a long time been walk up superclasses and return what you find there first, without even looking at interfaces. It might be desirable to change that but I'm not sure. Hi Joel, It has been like that for a long time, as you say, but for a long time Java did not have default methods. It's unexpected for getMethod() to return a method that is not contained in the getMethods() result. Anyway, I have created a bug to track this: https://bugs.openjdk.java.net/browse/JDK-8062389 For the next iteration of the getMethods() O(n^2) fix, I used a slightly different approach, which you might like more or not. Instead of using linked lists of Method objects as values of a LinkedHashMap, I created a special Key object, holding a reference to the Method object and an additional 'seq' int field, which discriminates among methods with the same signature.
Values of the LinkedHashMap are the Method objects themselves: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.03/ I have encapsulated this functionality into a package-private java.lang.MethodTable. The implementation of this API can be easily changed to using linked lists as values of a LinkedHashMap if desired. The performance characteristics are similar, with a hardly measurable advantage for this latest approach, as can be seen from the http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/GetAllRtMethods.java benchmark:

Original code:
19658 classes loaded in 2.071013902 seconds.
494392 methods obtained in 1.089983418 seconds.
494392 methods obtained in 0.952488497 seconds.
494392 methods obtained in 0.912878317 seconds.
494392 methods obtained in 0.940293784 seconds.
494392 methods obtained in 0.987640733 seconds.
494392 methods obtained in 0.925393355 seconds.
494392 methods obtained in 0.89397002 seconds.
494392 methods obtained in 0.915042463 seconds.
494392 methods obtained in 0.897669082 seconds.
494392 methods obtained in 0.878140502 seconds.

Patched code:
19658 classes loaded in 2.153024197 seconds.
494392 methods obtained in 0.875651469 seconds.
494392 methods obtained in 0.791937742 seconds.
494392 methods obtained in 0.780995693 seconds.
494392 methods obtained in 0.759593461 seconds.
494392 methods obtained in 0.766528355 seconds.
494392 methods obtained in 0.756567663 seconds.
494392 methods obtained in 0.739177848 seconds.
494392 methods obtained in 0.729245613 seconds.
494392 methods obtained in 0.74081083 seconds.
494392 methods obtained in 0.731749505 seconds.
Martin's ManyMethodsBenchmark shows this algorithm has O(n) time complexity too: Original: Base class load time: 131.95 ms getDeclaredMethods: 65521 methods, 32.00 ms total time, 0.0005 ms per method getMethods: 65530 methods, 44.24 ms total time, 0.0007 ms per method Derived class load time: 32525.23 ms getDeclaredMethods: 65521 methods, 30.37 ms total time, 0.0005 ms per method getMethods: 65530 methods, 7897.03 ms total time, 0.1205 ms per method Patched: Base class load time: 129.72 ms getDeclaredMethods: 65521 methods, 32.76 ms total time, 0.0005 ms per method getMethods: 65530 methods, 42.68 ms total time, 0.0007 ms per method Derived class load time: 31620.47 ms getDeclaredMethods: 65521 methods, 30.49 ms total time, 0.0005 ms per method getMethods: 65530 methods, 88.23 ms total time, 0.0013 ms per method I have also run Martin's LoadAllClassesAndMethods test (Thanks Martin, I changed it slightly so that exceptions are collected and reported at the end instead of bailing-out on first exception). Original LoadAllClassesAndMethods write classlist.txt: class load: 23052 classes, 1563.75 ms total time, 0.0678 ms per class 4 exceptions encountered:
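The eager-hash suggestion at the top of this thread could look roughly like the sketch below. The class name SignatureKey and the equals() logic are mine, not from the webrev, and the webrev's key also carries a 'seq' discriminator field that is omitted here; this only illustrates making the hash field final by computing it in the constructor.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical signature key: the hash is computed eagerly in the
// constructor, so the field can be final and hashCode() is a plain read.
final class SignatureKey {
    private final Method method;
    private final int hash; // eagerly computed, no lazy init, no race

    SignatureKey(Method method) {
        this.method = method;
        // same ingredients as the quoted webrev lines 149-151, but using
        // String.hashCode() (cached by String) instead of identityHashCode
        this.hash = method.getReturnType().hashCode()
                ^ method.getName().hashCode()
                ^ Arrays.hashCode(method.getParameterTypes());
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SignatureKey)) return false;
        Method m = ((SignatureKey) o).method;
        // compare signature only: name, return type, parameter types
        return method.getReturnType() == m.getReturnType()
                && method.getName().equals(m.getName())
                && Arrays.equals(method.getParameterTypes(), m.getParameterTypes());
    }
}
```

Two keys built from an overriding and an overridden method then compare equal, which is what makes them collapse to one map entry.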
Re: Loading classes with many methods is very expensive
On Mon, Oct 27, 2014 at 1:23 PM, Aleksey Shipilev aleksey.shipi...@oracle.com wrote: On 10/27/2014 01:53 AM, Peter Levart wrote: As you might have noticed, the number of methods obtained from patched code differed from original code. I have investigated this and found that original code treats abstract class methods the same as abstract interface methods as far as multiple inheritance is concerned (it keeps them together in the returned array). So I fixed this and here's a new webrev which behaves the same as the original code: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.02/ This seems to be a candidate fix for https://bugs.openjdk.java.net/browse/JDK-8061950, right? If so, please assign yourself there. It might be advisable to start a separate RFR thread for the fix? My comments from the first reading of the patch: * Why does MethodList maintain a linked list? Doesn't O(n) MethodList.add() lead to quadratic complexity again? Should we instead use an ArrayList<Method> for storing method synonyms? But N would be the number of exactly the same overridden methods from superclasses and declared in interfaces. Usually there would be zero or 1-2. So the in-place linked list conserves memory; the extra indirection of ArrayList for n=1 would also be slower. IMO, it's a well designed approach. The only case where it'd actually matter is hundreds of superclasses each overriding all of their methods. Stanimir * Please convert some of the indexed loops into for-each loops for better readability? E.g.: for (int i = 0; i < intfcsMethods.length; i++) { for (Method m : intfcsMethods[i]) { ...to: for (Method[] ms : intfcsMethods) { for (Method m : ms) { * I would think the code would be more readable if MethodList.m had a more readable name, e.g. MethodList.base or MethodList.image or MethodList.primary? * Formatting and logic in MethodList.consolidateWith gives me chills.
I think at the very least introducing variables for m.getDeclaringClass() and ml.m.getDeclaringClass() would improve readability. Also, moving the comments to the end of the line should improve readability and better highlight the boolean expression you are spelling out. Below is the straight-forward rewrite:

MethodList consolidateWith(MethodList ml) {
    Method mineM = m;
    Method otherM = ml.m;
    Class<?> mine = mineM.getDeclaringClass();
    Class<?> other = otherM.getDeclaringClass();
    if (other == mine                                      // other is same method as mine
        || (!Modifier.isAbstract(mineM.getModifiers())     // OR, mine is non-abstract method...
            && (other.isAssignableFrom(mine)               // ...which overrides other
                || (other.isInterface()
                    && !mine.isInterface())))) {           // ...class method which wins over interface
        // just return
        return this;
    }
    if (!Modifier.isAbstract(otherM.getModifiers())        // other is non-abstract method...
        && (mine.isAssignableFrom(other)                   // ...which overrides mine
            || (mine.isInterface()
                && !other.isInterface()))) {               // ...class method which wins over interface
        // knock m out and continue
        return (next == null) ? ml : next.consolidateWith(ml);
    }
    // else leave m in and continue
    next = (next == null) ? ml : next.consolidateWith(ml);
    return this;
}

* Should we use just methods.put() here? 2851 MethodList ml0 = methods.putIfAbsent(ml, ml); * I wonder if the code would be simpler if we push the Map<MethodList, MethodList> maintenance to some internal utility class? That would probably simplify the loops in privateGetPublicMethods? Thanks, -Aleksey.
Re: Loading classes with many methods is very expensive
On Mon, Oct 27, 2014 at 12:36 AM, Peter Levart peter.lev...@gmail.com wrote: On 10/26/2014 11:21 PM, Stanimir Simeonoff wrote: Great effort. From first glance: the hashCode and equals of MethodList use m.getParameterTypes(), which is cloned. I'd rather pay the collision costs and rely on the default hashCode/equals of Method itself (hashCode ignores the parameters). Possibly hashCode can include the returnType, but equals should be a direct call to Method.equals. Can't do that as Method.equals also compares declaringClass, which is not part of the method signature. We could use SharedSecrets to access the internal array of parameterTypes directly. Oops, very true. If SharedSecrets is not available, get the parameters in the constructor, so the clone happens once only. The cost of Method.clone() should be comparable to that of a LHM node, so the overhead is not so high. I doubt the JIT can EA clone(), which would be the best case scenario. I was wondering if the computation of size is better than just an array copy (as LHM.iterator is effectively a linked list); however for a small number of methods it would keep all the references in L1, so that would be better in the common case. Stanimir Regards, Peter Stanimir On Sun, Oct 26, 2014 at 10:25 PM, Peter Levart peter.lev...@gmail.com wrote: Hi all, I revamped the Class.getMethods() implementation so that it is faster for common cases and O(n) at the same time, which makes Martin's test happy (at least in the part that measures getMethods() speed - the class loading / linkage in VM is a separate issue). With the following test that loads all classes from rt.jar and calls getMethods() on each of them: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/GetAllRtMethods.java And system property 'sun.reflect.noCaches=true' (so that we exercise the logic in every loop - not just the 1st), original code prints: 19657 classes loaded in 1.987373401 seconds. 494141 methods obtained in 1.02493941 seconds. 494141 methods obtained in 0.905235658 seconds.
494141 methods obtained in 0.914434303 seconds. 494141 methods obtained in 0.887212805 seconds. 494141 methods obtained in 0.888929483 seconds. 494141 methods obtained in 0.883309141 seconds. 494141 methods obtained in 0.88341098 seconds. 494141 methods obtained in 0.897397146 seconds. 494141 methods obtained in 0.885677466 seconds. 494141 methods obtained in 0.895834176 seconds. Patched code does the same about 10% faster: 19657 classes loaded in 2.084409717 seconds. 494124 methods obtained in 0.915928578 seconds. 494124 methods obtained in 0.785342465 seconds. 494124 methods obtained in 0.784852619 seconds. 494124 methods obtained in 0.793450205 seconds. 494124 methods obtained in 0.849915078 seconds. 494124 methods obtained in 0.77835511 seconds. 494124 methods obtained in 0.764144701 seconds. 494124 methods obtained in 0.754122383 seconds. 494124 methods obtained in 0.747961897 seconds. 494124 methods obtained in 0.7489937 seconds. Martin's test prints on my computer with original code the following: Base class load time: 177.80 ms getDeclaredMethods: Methods: 65521, Total time: 35.79 ms, Time per method: 0.0005 ms getMethods: Methods: 65530, Total time: 50.15 ms, Time per method: 0.0008 ms Derived class load time: 34015.70 ms getDeclaredMethods: Methods: 65521, Total time: 33.82 ms, Time per method: 0.0005 ms getMethods: Methods: 65530, Total time: 8122.00 ms, Time per method: 0.1239 ms And with patched code this: Base class load time: 157.16 ms getDeclaredMethods: Methods: 65521, Total time: 65.77 ms, Time per method: 0.0010 ms getMethods: Methods: 65530, Total time: 44.64 ms, Time per method: 0.0007 ms Derived class load time: 33996.69 ms getDeclaredMethods: Methods: 65521, Total time: 32.63 ms, Time per method: 0.0005 ms getMethods: Methods: 65530, Total time: 92.12 ms, Time per method: 0.0014 ms Here's a webrev of the patch: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.01/ Patched code is simpler (65 lines gone) and I hope, easier 
to understand and change (I think a change in spec is coming in JDK9 which will handle abstract interface methods the same way as default methods, right Joel?) I also took the liberty to eliminate some redundant array and Field/Method/Constructor copies. get[Method0,Field0,Constructor0] now return 'root' objects. Copying is performed in methods that call them and expose the objects to any code outside java.lang.Class. Also, findFields() and findMethods() don't do copying of Field/Method objects themselves, but rather leave that to the methods that call them. The getInterfaces() method now delegates to getInterfaces(boolean copyArray) so that internally no array copying is performed. All 55 java/lang/reflect jtreg tests pass with this patch. Regards, Peter
Re: Loading classes with many methods is very expensive
55 java/lang/reflect jtreg tests still pass. As they did before, which means that we don't have coverage for such cases. I'll see where I can add such a case (EnumSet for example, which inherits from the Set interface and the AbstractCollection class via two different paths, so Set.size()/iterator() and AbstractCollection.size()/iterator() are both returned from getMethods())... This reminds me of the case when a package-private class implementing a public interface cannot have its methods invoked via object.getClass().getMethod(name, classes).invoke(object, args) outside the package. Also, I can't find any spec on how multiple equivalent methods should be returned. Regards, Peter
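The behavior discussed above (several equivalent methods in the getMethods() result) is easy to observe by grouping the returned array by signature. This is a diagnostic sketch of mine, not part of the patch; the class and method names are made up.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Group Class.getMethods() by (name, parameter types) so that equivalent
// methods inherited through different paths end up in the same bucket.
class EquivalentMethods {
    static Map<String, List<Method>> bySignature(Class<?> c) {
        Map<String, List<Method>> map = new LinkedHashMap<>();
        for (Method m : c.getMethods()) {
            // return type is ignored here on purpose: bridge methods differ
            // only in return type and would otherwise hide the duplication
            String key = m.getName() + Arrays.toString(m.getParameterTypes());
            map.computeIfAbsent(key, k -> new ArrayList<>()).add(m);
        }
        return map;
    }
}
```

Running this over a diamond-inheriting class like EnumSet shows which signatures come back more than once; whether a given JDK version returns one or several Method objects per signature is exactly the unspecified behavior Peter mentions.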
Re: RFR JDK-6321472: Add CRC-32C API
Unsafe is available, so the fields (array, offset) can be read directly: UNSAFE.getObject(buffer, hbOffset), UNSAFE.getObject(buffer, offsetOffset). No need for MethodHandles. During class init the offsets have to be resolved, pretty much like in any CAS-utilizing algorithm. I didn't propose it as read-only buffers are very, very rarely used and even more unlikely to be used to calculate checksums. It just makes the code ugly. Stanimir On Thu, Oct 23, 2014 at 11:05 AM, Peter Levart peter.lev...@gmail.com wrote: On 10/23/2014 03:52 AM, Staffan Friberg wrote: Webrev with these last updates. Added more tests to make sure CRC32C, CRC32 and Checksum default methods all are covered. http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.07 Hi Staffan, Regarding the default case: 168 } else { 169 byte[] b = new byte[Math.min(buffer.remaining(), 4096)]; 170 while (buffer.hasRemaining()) { 171 int length = Math.min(buffer.remaining(), b.length); 172 buffer.get(b, 0, length); 173 update(b, 0, length); 174 } 175 } Have you tried using get()/getInt() directly on the (ro) ByteBuffer instead of copying to byte[] chunks? Intuitively one would expect it to perform faster if a redundant copy is avoided. Ah, you already told us that you plan to use an intrinsic for CRC32C in the future, so you want to have addresses at hand. A hackish way to avoid copying in this case is to access the byte[] and offset using reflection. But this would have to be wrapped with doPrivileged() which would worsen performance for small buffers. A way to avoid repeated access checks is to do them at class initialization time, using MethodHandle(s). For example, something like: private static final MethodHandle bbArrayGetter; private static final MethodHandle bbArrayOffsetGetter; static { MethodHandle hbGetter; MethodHandle offsetGetter; try { Field hbField = ByteBuffer.class.getDeclaredField("hb"); Field offsetField = ByteBuffer.class.
getDeclaredField("offset"); AccessController.doPrivileged(new PrivilegedAction<Void>() { @Override public Void run() { hbField.setAccessible(true); offsetField.setAccessible(true); return null; } }); hbGetter = MethodHandles.lookup().unreflectGetter(hbField); offsetGetter = MethodHandles.lookup().unreflectGetter(offsetField); } catch (NoSuchFieldException | IllegalAccessException e) { hbGetter = null; offsetGetter = null; } bbArrayGetter = hbGetter; bbArrayOffsetGetter = offsetGetter; } private static byte[] getArrayOrNull(ByteBuffer bb) { if (bb.hasArray()) return bb.array(); if (bbArrayGetter != null) { try { return (byte[]) bbArrayGetter.invokeExact(bb); } catch (Throwable e) { throw new InternalError(e); } } return null; } private static int getArrayOffset(ByteBuffer bb) { if (bb.hasArray()) return bb.arrayOffset(); if (bbArrayOffsetGetter != null) { try { return (int) bbArrayOffsetGetter.invokeExact(bb); } catch (Throwable e) { throw new InternalError(e); } } throw new UnsupportedOperationException(); } Regards, Peter //Staffan On 10/22/2014 05:37 PM, Stanimir Simeonoff wrote: Hi Staffan, The read-only buffer (ByteBuffer.asReadOnlyBuffer()) doesn't have array() working. You can use int length = Math.min(buffer.remaining(), b.length) instead, same with new byte[Math.min(4096, buffer.remaining())]. Using smaller chunks will be more performance friendly than allocating/eating up a huge byte[]. If you feel like it, add a test with a heap bytebuffer.asReadOnlyBuffer(). Stanimir On Thu, Oct 23, 2014 at 3:06 AM, Staffan Friberg staffan.frib...@oracle.com mailto:staffan.frib...@oracle.com wrote: Hi, I was thinking about this earlier when I started writing the patch and then I forgot about it again. I haven't been able to figure out when the code will be executed.
ByteBuffer is implemented in such a way that only the JDK can extend it, and as far as I can tell you can only create 3 types of ByteBuffers (Direct, Mapped and Heap), all of which will be handled by the more efficient calls above. That said, just to make the code a bit safer from OOM it is probably best to update the default method and all current implementations, which all use the same pattern. A reasonable solution should be the following code: byte[] b = new byte[(buffer.remaining() < 4096) ? buffer.remaining() : 4096];
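The chunked-copy pattern proposed above can be sketched as a standalone class. The `ChunkedUpdate` class and its toy `sum` checksum are mine, standing in for a `Checksum` implementation; only the buffer-draining loop mirrors the thread's suggestion.

```java
import java.nio.ByteBuffer;

// Update a checksum from a (possibly read-only) ByteBuffer through a bounded
// scratch array, instead of allocating buffer.remaining() bytes at once.
class ChunkedUpdate {
    static final int CHUNK = 4096;
    long sum; // toy checksum, just to make the example observable

    void update(byte[] b, int off, int len) {
        for (int i = off; i < off + len; i++) {
            sum += b[i] & 0xFF;
        }
    }

    void update(ByteBuffer buffer) {
        // never allocate more than CHUNK bytes, even for huge buffers
        byte[] b = new byte[Math.min(buffer.remaining(), CHUNK)];
        while (buffer.hasRemaining()) {
            int length = Math.min(buffer.remaining(), b.length);
            buffer.get(b, 0, length); // works for read-only buffers too
            buffer.get(b, 0, 0);      // no-op; get(byte[],int,int) is the only array path needed
            update(b, 0, length);
        }
    }
}
```

Because only relative bulk `get` is used, the same loop works for `asReadOnlyBuffer()` views, which is exactly the case where `array()` throws.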
Re: RFR JDK-6321472: Add CRC-32C API
UNSAFE.getObject(buffer, offsetOffset) obviously should be Unsafe.getInt(buffer, offsetOffset). On Thu, Oct 23, 2014 at 11:16 AM, Stanimir Simeonoff stani...@riflexo.com wrote: Unsafe is available, so the fields (array, offset) can be read directly: UNSAFE.getObject(buffer, hbOffset), UNSAFE.getObject(buffer, offsetOffset). No need for MethodHandles. During class init the offsets have to be resolved, pretty much like in any CAS-utilizing algorithm. I didn't propose it as read-only buffers are very, very rarely used and even more unlikely to be used to calculate checksums. It just makes the code ugly. Stanimir
Re: Loading classes with many methods is very expensive
I was under the impression the declaring order had been abandoned in 1.7, which I considered quite a letdown, as it allowed ordering the function calls in JMX in a non-chaotic way. On Thu, Oct 23, 2014 at 4:37 PM, Peter Levart peter.lev...@gmail.com wrote: On 10/23/2014 01:57 AM, Stanimir Simeonoff wrote: Class.java can easily be improved by adding an auxiliary HashMap<String, int[]> indexing the positions in the array by name and then checking only a few overloaded signatures in case of a match. The auxiliary map can be deployed only past some threshold of methods, so it should not affect the common case. Alternatively, sorting the array and using binary search is quite trivial to implement as well. The Java Class.getMethods() implementation is complicated by the fact that, although not specified, the order of methods in the returned array is important. Once it changed, if I remember correctly, and broke many programs, so it had to be restored... Peter Btw, really nice benchmark. Stanimir On Thu, Oct 23, 2014 at 1:53 AM, Martin Buchholz marti...@google.com wrote: Here at Google we have both kinds of scalability problems - loading classes from a classpath with 10,000 jars, and loading a single class file packed with the maximal number of methods. This message is about the latter. If you have a class with ~64k methods with a superclass that also has ~64k methods, class loading that single class will cost you ~30sec and calling Class.getMethods another ~10sec. Both are unacceptably slow. I think both are due to O(N^2) algorithms, the first in hotspot, and the second in Class.java. I have the start of a fix for Class.java, but it makes the common case slower. A really good fix is harder to find. In general, I think Class.java could benefit from some performance-oriented rework. Is anyone else working on class loading performance, especially in hotspot?
Here's the benchmark (that could perhaps be integrated into openjdk even without a fix): http://cr.openjdk.java.net/~martin/webrevs/openjdk9/Class.getMethods-benchmark/test/java/lang/Class/getMethods/ManyMethodsBenchmark.java.html Base class load time: 186.44 ms getDeclaredMethods: Methods: 65521, Total time: 43.27 ms, Time per method: 0.0007 ms getMethods: Methods: 65530, Total time: 60.82 ms, Time per method: 0.0009 ms Derived class load time: 33440.13 ms getDeclaredMethods: Methods: 65521, Total time: 39.71 ms, Time per method: 0.0006 ms getMethods: Methods: 65530, Total time: 11582.54 ms, Time per method: 0.1768 ms
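Stanimir's auxiliary-index idea quoted above keeps the order-sensitive `Method[]` untouched and adds a name-to-positions map on the side. The sketch below is my illustration of that idea, not code from any webrev; `MethodIndex` and `find` are hypothetical names.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Keep the original Method[] (order preserved for getMethods()) and index
// positions by name so a lookup scans only the overloads of one name
// instead of the whole array.
class MethodIndex {
    private final Method[] methods;           // original order preserved
    private final Map<String, int[]> byName;  // name -> positions in `methods`

    MethodIndex(Method[] methods) {
        this.methods = methods;
        Map<String, List<Integer>> tmp = new HashMap<>();
        for (int i = 0; i < methods.length; i++) {
            tmp.computeIfAbsent(methods[i].getName(), n -> new ArrayList<>()).add(i);
        }
        byName = new HashMap<>();
        for (Map.Entry<String, List<Integer>> e : tmp.entrySet()) {
            byName.put(e.getKey(),
                       e.getValue().stream().mapToInt(Integer::intValue).toArray());
        }
    }

    Method find(String name, Class<?>... paramTypes) {
        int[] positions = byName.get(name);
        if (positions == null) return null;
        for (int i : positions) {
            Method m = methods[i];
            if (Arrays.equals(m.getParameterTypes(), paramTypes)) return m;
        }
        return null;
    }
}
```

As the thread suggests, such a map would only be built past some method-count threshold so small classes pay nothing.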
Re: RFR JDK-6321472: Add CRC-32C API
On Thu, Oct 23, 2014 at 12:10 AM, Bernd Eckenfels e...@zusammenkunft.net wrote: Hello, just a question in the default impl: +} else { +byte[] b = new byte[rem]; +buffer.get(b); +update(b, 0, b.length); +} would it be a good idea to actually put a ceiling on the size of the array which is processed at once? This is an excellent catch. Should not be too large, probably 4k or so. Stanimir Am Tue, 21 Oct 2014 10:28:50 -0700 schrieb Staffan Friberg staffan.frib...@oracle.com: Hi Peter, Thanks for the comments.. 217 if (Unsafe.ADDRESS_SIZE == 4) { 218 // On 32 bit platforms read two ints instead of a single 64bit long When you're reading from byte[] using Unsafe (updateBytes), you have the option of reading 64bit values on 64bit platforms. When you're reading from DirectByteBuffer memory (updateDirectByteBuffer), you're only using 32bit reads. I will add a comment in the code for this decision. The reason is that read a long results in slightly worse performance in this case, in updateBytes it is faster. I was able to get it to run slightly faster by working directly with the address instead of always adding address + off, but this makes things worse in the 32bit case since all calculation will now be using long variables. So using the getInt as in the current code feels like the best solution as it strikes the best balance between 32 and 64bit. Below is how updateByteBuffer looked with the rewrite I mentioned. 
long address = ((DirectBuffer) buffer).address();
crc = updateDirectByteBuffer(crc, address + pos, address + limit);

private static int updateDirectByteBuffer(int crc, long adr, long end) {
    // Do only byte reads for arrays so short they can't be aligned
    if (end - adr >= 8) {
        // align on 8 bytes
        int alignLength = (8 - (int) (adr & 0x7)) & 0x7;
        for (long alignEnd = adr + alignLength; adr < alignEnd; adr++) {
            crc = (crc >>> 8) ^ byteTable[(crc ^ UNSAFE.getByte(adr)) & 0xFF];
        }
        if (ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN) {
            crc = Integer.reverseBytes(crc);
        }
        // slicing-by-8
        for (; adr < (end - Long.BYTES); adr += Long.BYTES) {
            int firstHalf;
            int secondHalf;
            if (Unsafe.ADDRESS_SIZE == 4) {
                // On 32 bit platforms read two ints instead of a single 64bit long
                firstHalf = UNSAFE.getInt(adr);
                secondHalf = UNSAFE.getInt(adr + Integer.BYTES);
            } else {
                long value = UNSAFE.getLong(adr);
                if (ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN) {
                    firstHalf = (int) value;
                    secondHalf = (int) (value >>> 32);
                } else { // ByteOrder.BIG_ENDIAN
                    firstHalf = (int) (value >>> 32);
                    secondHalf = (int) value;
                }
            }
            crc ^= firstHalf;
            if (ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN) {
                crc = byteTable7[crc & 0xFF] ^ byteTable6[(crc >>> 8) & 0xFF]
                        ^ byteTable5[(crc >>> 16) & 0xFF] ^ byteTable4[crc >>> 24]
                        ^ byteTable3[secondHalf & 0xFF] ^ byteTable2[(secondHalf >>> 8) & 0xFF]
                        ^ byteTable1[(secondHalf >>> 16) & 0xFF] ^ byteTable0[secondHalf >>> 24];
            } else { // ByteOrder.BIG_ENDIAN
                crc = byteTable0[secondHalf & 0xFF] ^ byteTable1[(secondHalf >>> 8) & 0xFF]
                        ^ byteTable2[(secondHalf >>> 16) & 0xFF] ^ byteTable3[secondHalf >>> 24]
                        ^ byteTable4[crc & 0xFF] ^ byteTable5[(crc >>> 8) & 0xFF]
                        ^ byteTable6[(crc >>> 16) & 0xFF] ^ byteTable7[crc >>> 24];
            }
        }
        if (ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN) {
            crc = Integer.reverseBytes(crc);
        }
    }
    // Tail
    for (; adr < end; adr++) {
        crc = (crc >>> 8) ^ byteTable[(crc ^ UNSAFE.getByte(adr)) & 0xFF];
    }
    return crc;
}

Also, in updateBytes, the usage of Unsafe.ARRAY_INT_INDEX_SCALE/ARRAY_LONG_INDEX_SCALE to index a byte array sounds a little scary.
To be ultra portable you could check that ARRAY_BYTE_INDEX_SCALE == 1 first and refuse to use Unsafe for byte arrays if it is not
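A handy way to validate a table-driven (slicing-by-8) CRC-32C implementation like the one quoted above is a bit-at-a-time reference. This sketch is mine, not part of the webrev; `0x82F63B78` is the reflected Castagnoli polynomial, and the standard check value for the ASCII string "123456789" is 0xE3069283.

```java
// Minimal byte-at-a-time CRC-32C (Castagnoli) reference implementation,
// useful as an oracle when testing optimized (sliced, intrinsic) versions.
class Crc32cReference {
    private static final int POLY = 0x82F63B78; // reflected CRC-32C polynomial

    static int update(int crc, byte[] b, int off, int len) {
        crc = ~crc; // pre-invert, matching the usual 0xFFFFFFFF init
        for (int i = off; i < off + len; i++) {
            crc ^= b[i] & 0xFF;
            for (int k = 0; k < 8; k++) {
                // branch-free: -(crc & 1) is all-ones when the LSB is set
                crc = (crc >>> 1) ^ (POLY & -(crc & 1));
            }
        }
        return ~crc; // post-invert
    }
}
```

Feeding the same input to this reference and to the optimized path on both little- and big-endian machines would catch exactly the byte-order bugs the quoted code guards against with `Integer.reverseBytes`.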
Re: Loading classes with many methods is very expensive
Class.java can easily be improved by adding an auxiliary HashMap<String, int[]> indexing the positions in the array by name and then checking only a few overloaded signatures in case of a match. The auxiliary map can be deployed only past some threshold of methods, so it should not affect the common case. Alternatively, sorting the array and using binary search is quite trivial to implement as well. Btw, really nice benchmark. Stanimir
Re: RFR JDK-6321472: Add CRC-32C API
Hi Staffan, The read-only buffer (ByteBuffer.asReadOnlyBuffer()) doesn't have array() working. You can use int length = Math.min(buffer.remaining(), b.length) instead, same with new byte[Math.min(4096, buffer.remaining())]. Using smaller chunks will be more performance friendly than allocating/eating up a huge byte[]. If you feel like it, add a test with a heap bytebuffer.asReadOnlyBuffer(). Stanimir On Thu, Oct 23, 2014 at 3:06 AM, Staffan Friberg staffan.frib...@oracle.com wrote: Hi, I was thinking about this earlier when I started writing the patch and then I forgot about it again. I haven't been able to figure out when the code will be executed. ByteBuffer is implemented in such a way that only the JDK can extend it, and as far as I can tell you can only create 3 types of ByteBuffers (Direct, Mapped and Heap), all of which will be handled by the more efficient calls above. That said, just to make the code a bit safer from OOM it is probably best to update the default method and all current implementations, which all use the same pattern. A reasonable solution should be the following code:

byte[] b = new byte[(buffer.remaining() < 4096) ? buffer.remaining() : 4096];
while (buffer.hasRemaining()) {
    int length = (buffer.remaining() < b.length) ? buffer.remaining() : b.length;
    buffer.get(b, 0, length);
    update(b, 0, length);
}

Xueming, do you have any further comment? Regards, Staffan
Re: RFR: 8061244 Use of stack-less ClassNotFoundException thrown from (URL)ClassLoader.findClass()
to super.findClass(name) does the job. So here's an updated webrev that also includes more descriptive message: http://cr.openjdk.java.net/~plevart/jdk9-dev/ClassLoader.CNFE/webrev.02/ Regards, Peter On 10/16/2014 09:35 AM, Peter Levart wrote: On 10/16/2014 01:49 AM, Stanimir Simeonoff wrote: First, really nice tests. As for hiding: missing the real classloader impl. might be quite a bumper for some middleware implementations. That would make pretty hard to trace dependency issues without explicit logging, the latter is usually available but still. Also it'd depend if the classloaders actually use super.findClass() or not. IMO, the option should be switchable via some system property. With a little tweak, the message of the stack-less exception thrown from findClass() methods can be extended to include the classloader's runtime class name and this message can be inherited by a replacement stack-full exception. So the stack-trace would look like: Exception in thread main java.lang.ClassNotFoundException: XXX (thrown by: sun.misc.Launcher$AppClassLoader) at java.lang.ClassLoader.loadClass(ClassLoader.java:366) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:265) at Test.doIt(Test.java:7) at Test.main(Test.java:11) Instead of what we have now: Exception in thread main java.lang.ClassNotFoundException: XXX at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:426) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:317) at java.lang.ClassLoader.loadClass(ClassLoader.java:359) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:265) at Test.doIt(Test.java:7) at Test.main(Test.java:11) Would this be enough to cover your concern? Regards, Peter I am not sure about the need to hide the stackless c-tor as the effect can be achieved by overriding fillInStackTrace(); w/o the extra baggage of JavaLangAccess. Overall very decent improvement. 
Cheers Stanimir
Re: RFR 9: 8048124 Read hijrah-config-umalqura.properties as a resource
Hi, A minor issue: there are quite a few instances of parsing integers via Integer.valueOf(String) instead of just Integer.parseInt(String). In HijrahChronology.java:

    864  int year = Integer.valueOf(key);
    972  months[i] = Integer.valueOf(numbers[i]);

These should also be optimized not to create substrings:

    994  ymd[0] = Integer.valueOf(string.substring(0, 4));
    995  ymd[1] = Integer.valueOf(string.substring(5, 7));
    996  ymd[2] = Integer.valueOf(string.substring(8, 10));

Stanimir On Mon, Oct 20, 2014 at 9:17 PM, roger riggs roger.ri...@oracle.com wrote: Hi, Updated with recommendations. Webrev: http://cr.openjdk.java.net/~rriggs/webrev-hijrah-config-8049376/ Issue: https://bugs.openjdk.java.net/browse/JDK-8048124 Thanks, Roger On 10/20/2014 11:25 AM, Alan Bateman wrote: On 20/10/2014 16:11, roger riggs wrote: What permission would be needed to read the resource? The limited doPrivileged should include the minimum permission. The resources will be in resources.jar, so I think read access to that should be sufficient. If you run a small test with -Djava.security.manager that triggers the config file to load, then it would help verify that. When we move to the modular image then the resources will be elsewhere in the runtime image, so if you really want to use limited doPrivileged here then access to ${java.home}/** should do it. Of course not using limited doPrivileged is a possibility too. -Alan
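As a side note, since Java 9 the intermediate substrings can be avoided entirely with the CharSequence overload of Integer.parseInt, which parses a slice of the string in place. A minimal sketch (class and method names here are made up for illustration, not from the Hijrah patch):

```java
public class YmdParse {
    // Parses a "yyyy-MM-dd" style string into {year, month, day}
    // without allocating substrings or boxed Integers.
    static int[] parseYmd(String s) {
        return new int[] {
            Integer.parseInt(s, 0, 4, 10),   // year digits [0,4)
            Integer.parseInt(s, 5, 7, 10),   // month digits [5,7)
            Integer.parseInt(s, 8, 10, 10)   // day digits [8,10)
        };
    }

    public static void main(String[] args) {
        int[] ymd = parseYmd("1435-03-29");
        System.out.println(ymd[0] + " " + ymd[1] + " " + ymd[2]);
    }
}
```

On a JDK without that overload, plain Integer.parseInt(s.substring(...)) still beats Integer.valueOf by skipping the boxing.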
Re: RFR JDK-6321472: Add CRC-32C API
Hi, I don't quite see the point of this line:

    private transient final static ByteOrder ENDIAN = ByteOrder.nativeOrder().equals(ByteOrder.BIG_ENDIAN) ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;

It should be just:

    private static final ByteOrder ENDIAN = ByteOrder.nativeOrder();

Also you don't need equals() for enum-like constants; just use reference comparison (==), as they are singletons. Alternatively you can just replace it with a boolean, as there are no other endianness options. Stanimir On Fri, Oct 17, 2014 at 9:58 PM, Staffan Friberg staffan.frib...@oracle.com wrote: On 10/17/2014 04:05 AM, Alan Bateman wrote: On 17/10/2014 02:42, Staffan Friberg wrote: Hi, This RFE adds a CRC-32C class. It implements Checksum so it will have the same API as CRC-32, but use a different polynomial when calculating the CRC checksum. The CRC-32C implementation uses slicing-by-8 to achieve high performance when calculating the CRC value. Apart from adding the new class, java.util.zip.CRC32C, I have also added two default methods to Checksum. These are methods that were added to Adler32 and CRC32 in JDK 8, but before default methods were added, which was why they were only added to the implementors and not the interface. Bug: https://bugs.openjdk.java.net/browse/JDK-6321472 Webrev: http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.00 I looked over the javadoc; I haven't found time to study the implementation in CRC32C closely yet. Hopefully Sherman will be able to review, as I think he has prototyped several CRC32C implementations. On Checksum#update(ByteBuffer) then a suggestion for "The checksum is updated using buffer.remaining, starting at buffer.position" is to replace it with: "The checksum is updated with the remaining bytes in the buffer, starting at the buffer's position." Yes, that reads much better. Updated CRC32 and Adler32 as well since they have the same text.
In the @implNote then I wonder if it would be better to just leave out the note about when the invariants are broken; we don't do that in other places where breakage like this is detected. Also I wonder if the note about "For best performance, ..." should be an @apiNote. Done, removed the assert comment and changed the performance note to an @apiNote. Should CRC32C be final, unless we have a good reason for it not to be final? I simply followed what was used for CRC32 and Adler32, but I don't see a reason for not making it final. Guess it is too late to make those two classes final though? In CRC32C then I assume you don't need the p/p for the first statement. The @see Checksum might not be too interesting here given that it will be linked to anyway by way of implementing Checksum. Removed @see in all three classes. I think it's more usual to list the @param tags before the @throws. I removed the @param tags as they are already described in the Checksum interface and will be picked up from there by Javadoc. Will do the same in CRC32 and Adler32 as well. For the bounds check then you might want to look at the wording in other areas (in java.lang or java.io, for example) to get better wording (as off+len can overflow). Done, I updated CRC32 and Adler32 as well to keep them as similar as possible. A minor comment on CRC32C is that you might want to keep the line lengths consistent with the rest of the code. I only mention it for future side-by-side reviews, to make it a bit easier and avoid too much horizontal scrolling. Done, lines are now 80 or very close in a few cases where breaking them made the code harder to read. -Alan Here is a new webrev with the updates from Alan's and Peter's suggestions. http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.01 Thanks, Staffan
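The ENDIAN simplification suggested above can be shown as a small standalone sketch (class and helper names are made up; only the field initialization and the == comparison reflect the review comment):

```java
import java.nio.ByteOrder;

public class EndianCheck {
    // The ternary in the reviewed line collapses to a direct assignment:
    // nativeOrder() already returns one of the two canonical constants.
    static final ByteOrder ENDIAN = ByteOrder.nativeOrder();

    // ByteOrder instances are singletons, so reference comparison suffices.
    static boolean isBigEndian() {
        return ENDIAN == ByteOrder.BIG_ENDIAN;
    }

    public static void main(String[] args) {
        System.out.println(ENDIAN + ", big-endian = " + isBigEndian());
    }
}
```

As noted in the review, a plain boolean field would also do, since there are only two possible byte orders.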
Re: RFR JDK-6321472: Add CRC-32C API
Also, re Unsafe.ADDRESS_SIZE == 4: that doesn't imply a 32-bit system, as with less than 32GiB of heap (on 64-bit) the default is to use compressed oops and the address size is still only 4 bytes. I usually use something like this (in user-space code) to detect the architecture:

    private static final boolean is64Bit;
    static {
        final String p = java.security.AccessController.doPrivileged(new PrivilegedAction<String>() {
            @Override
            public String run() {
                return System.getProperty("os.arch", "x64") + System.getProperty("sun.arch.data.model", "");
            }
        });
        is64Bit = p.indexOf("64") >= 0;
    }

Stanimir
Re: RFR JDK-6321472: Add CRC-32C API
Thanks Staffan, You're absolutely right. Funny enough, I was absolutely sure compressed oops affected it. Thanks, Stanimir On Sat, Oct 18, 2014 at 12:00 AM, Staffan Friberg staffan.frib...@oracle.com wrote: Hi Stanimir, Unsafe.ADDRESS_SIZE is the size of a native pointer, and is not affected by how the JVM handles Java references on the heap.

    import sun.misc.Unsafe;
    public class AddressSize {
        public static void main(String[] args) {
            System.out.println("Address Size: " + Unsafe.ADDRESS_SIZE);
        }
    }

    $ java -showversion -XX:-UseCompressedOops AddressSize
    java version "1.8.0_25"
    Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
    Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
    Address Size: 8

    $ java -showversion -XX:+UseCompressedOops AddressSize
    java version "1.8.0_25"
    Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
    Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
    Address Size: 8

    $ ./jdk1.8.0_20-b26_32bit/bin/java -showversion AddressSize
    java version "1.8.0_20"
    Java(TM) SE Runtime Environment (build 1.8.0_20-b26)
    Java HotSpot(TM) Server VM (build 25.20-b23, mixed mode)
    Address Size: 4

//Staffan
Re: Preview: JDK-8055854 Examine overhead in java.net.URLClassLoader (stack-less exceptions)
First, really nice tests. As for hiding: missing the real classloader impl. might be quite a bummer for some middleware implementations. That would make it pretty hard to trace dependency issues without explicit logging; the latter is usually available, but still. Also it'd depend on whether the classloaders actually use super.findClass() or not. IMO, the option should be switchable via some system property. I am not sure about the need to hide the stack-less c-tor, as the effect can be achieved by overriding fillInStackTrace(), w/o the extra baggage of JavaLangAccess. Overall a very decent improvement. Cheers Stanimir On Thu, Oct 16, 2014 at 1:30 AM, Peter Levart peter.lev...@gmail.com wrote: Hi, Recently a patch was applied to JDK9 for issue JDK-8057936: java.net.URLClassLoader.findClass uses exceptions in control flow, which optimizes the frequent exceptional path out of the doPrivileged block in URLClassLoader.findClass() and replaces it with a null return (which is not wrapped with PrivilegedActionException), and hence eliminates one exception construction. This was the low-hanging fruit. Here I present a similar-sized fruit which hangs a little higher... The most time-consuming part of exception construction is remembering the stack trace - the Throwable.fillInStackTrace() method. Do we always need stack traces? The ClassLoader API consists of two protected methods that are meant to be used for communication among ClassLoaders:

    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException
    protected Class<?> findClass(String name) throws ClassNotFoundException

loadClass() is meant to call findClass() and the parent loader's loadClass(), and both of them are meant to be arranged in subclass overrides which call super methods. A ClassNotFoundException thrown in that arrangement is used only to signal unsuccessful loading of a particular class from one ClassLoader to the other, or from a super method to an overriding method, where stack traces are not needed.
ClassLoader.loadClass(String name) is the public API for programmatic access, and the Class.forName() public API is wired through it. The VM upcall also appears to use the public loadClass(String) method. I searched for callers of the protected loadClass(String name, boolean resolve) method and, apart from some tests, all JDK code seems to call this method either from the overriding protected method of a subclass (the super call) or from the protected loadClass(String name, boolean resolve) method of another ClassLoader that happens to have access to it, either because the caller is in the same package as the callee or nested in the same enclosing class. The notable exceptions are: sun.applet.AppletClassLoader (a subclass of URLClassLoader): overrides the method with a public method (potentially accessible as public API), calls super and propagates unchanged CNFE; but it also overrides findClass(), calls super, and never propagates exceptions from super (it throws its own CNF exceptions with full stacks). java.net.FactoryURLClassLoader (a subclass of URLClassLoader): overrides the method with a public method (probably a mistake), calls super and propagates CNFE, but the class is package-private, so there's no public access to this method. sun.misc.Launcher.AppClassLoader (a subclass of URLClassLoader): overrides the method with a public method (probably a mistake), calls super and propagates CNFE, but the class is package-private, so there's no public access to this method. And of course: java.lang.ClassLoader: delegates to the method from the public loadClass(String name) method. That appears to be it.
So here's a preview of a patch that uses stack-less ClassNotFoundException(s) for intra/inter-class-loader communication and replaces them with full-stack variants on the public API border: http://cr.openjdk.java.net/~plevart/jdk9-dev/ClassLoader.CNFE/webrev.01/ I also prepared a benchmark which shows the improvement (results attached as comments): http://cr.openjdk.java.net/~plevart/jdk9-dev/ClassLoader.CNFE/CLBench.java The loadSameClassFailure and loadSameClassSuccess are throughput benchmarks which constantly call loadClass() on the AppClassLoader with the same nonexistent and existent class names, respectively. loadSameClassFailure shows improvement because the CNF exception thrown from ExtClassLoader (the parent of AppClassLoader) has no stack. The loadNewClassFailure and loadNewClassSuccess are one-shot benchmarks which run each shot on a fresh VM instance. A shot loads 100,000 different classes. Both benchmarks show improvement because intra/inter-class-loader communication uses stack-less CNF exceptions. I guess there exist other internal Oracle benchmarks that could be run with this patch to get some more evidence of possible improvement. So what is the effect on stack traces? The following simple test:

    package test;
    public class Test {
        static void doIt() throws Exception {
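The stack-less exception idiom discussed in this thread (overriding fillInStackTrace() so the Throwable constructor skips the expensive stack capture) can be sketched as follows. This is an illustration of the technique, not the webrev's actual code; the class names are made up:

```java
public class StacklessDemo {
    // A CNFE variant that records no stack frames, suitable only for
    // intra/inter-class-loader signalling where the trace is never shown.
    static class StacklessCNFE extends ClassNotFoundException {
        StacklessCNFE(String msg) {
            super(msg);
        }

        @Override
        public synchronized Throwable fillInStackTrace() {
            return this; // skip the native stack walk; construction becomes cheap
        }
    }

    public static void main(String[] args) {
        ClassNotFoundException e = new StacklessCNFE("test.Missing");
        // Because the constructor's fillInStackTrace() call dispatches to the
        // override, no frames are recorded.
        System.out.println("frames recorded: " + e.getStackTrace().length);
    }
}
```

The patch's extra step, as described above, is to replace such a stack-less exception with a full-stack one at the public API border so user-visible traces stay intact.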
Re: RFR (XS) 8060485: (str) contentEquals checks the String contents twice on mismatch
Hi, This is an unrelated issue, yet is there any reason for the inner loop of equals to be written in such a (confusing) way?

    if (n == anotherString.value.length) {
        char v1[] = value;
        char v2[] = anotherString.value;
        int i = 0;
        while (n-- != 0) {
            if (v1[i] != v2[i])
                return false;
            i++;
        }
        return true;
    }

instead of just:

    for (int i = 0; i < n; i++)
        if (v1[i] != v2[i]) return false;
    return true;

Stanimir On Tue, Oct 14, 2014 at 8:13 PM, Chris Hegarty chris.hega...@oracle.com wrote: On 14 Oct 2014, at 17:33, Martin Buchholz marti...@google.com wrote: Looks good to me! +1 'noreg-hard' -Chris. On Tue, Oct 14, 2014 at 9:05 AM, Aleksey Shipilev aleksey.shipi...@oracle.com wrote: Hi, Please review a trivial change in String.contentEquals: https://bugs.openjdk.java.net/browse/JDK-8060485 http://cr.openjdk.java.net/~shade/8060485/webrev.00/ It improves the performance drastically: http://cr.openjdk.java.net/~shade/8060485/perf.txt ...not to mention it improves the code readability, and protects us from rogue CharSequence-s. Testing: microbenchmarks, jdk/test/String* jtreg. Thanks, -Aleksey.
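The suggested plain indexed form can be written as a standalone sketch (class and method names are made up for illustration; the real code compares the private char[] value arrays directly):

```java
public class RegionEquals {
    // The while/n-- loop above, rewritten as the straightforward for loop
    // the review suggests; same result, easier to read.
    static boolean sameChars(char[] v1, char[] v2) {
        if (v1.length != v2.length) return false;
        for (int i = 0; i < v1.length; i++)
            if (v1[i] != v2[i]) return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameChars("abc".toCharArray(), "abc".toCharArray()));
    }
}
```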
Re: RFR (XS) 8060485: (str) contentEquals checks the String contents twice on mismatch
On Tue, Oct 14, 2014 at 10:55 PM, Alan Bateman alan.bate...@oracle.com wrote: On 14/10/2014 18:38, Aleksey Shipilev wrote: Thanks guys! And of course, I managed to make two minor mistakes in a two-line change: the indentation is a bit wrong, and the cast to String is redundant. Here is the updated webrev and the changeset (need a sponsor!): http://cr.openjdk.java.net/~shade/8060485/webrev.01/ http://cr.openjdk.java.net/~shade/8060485/8060485.changeset -Aleksey. Updated version looks okay. I wonder how common it might be to call this method with a String vs. a StringBuilder; just wondering if it should check for a String first (although the type check should be really quick and probably wouldn't make a difference). I was wondering exactly the same, but usually allowing CharSequence in the code means likely no String, so while there is no statistical evidence, it seems reasonable. Another minor issue is the legacy contentEquals(StringBuffer sb), which could directly call synchronized (sb) { nonSyncContentEquals(sb); }, but C2 should be able to inline it pretty well just in case. Stanimir
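The "check for a String first" idea can be sketched as a standalone helper (names are hypothetical; the real String.contentEquals also synchronizes on StringBuffer arguments before comparing, which is omitted here):

```java
public class ContentEq {
    // Compares a String against an arbitrary CharSequence, taking a fast
    // path through the intrinsified String.equals when the argument is a String.
    static boolean contentEquals(String s, CharSequence cs) {
        if (cs instanceof String) {
            return s.equals(cs); // cheap type check, then the optimized equals
        }
        if (s.length() != cs.length()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) != cs.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(contentEquals("abc", new StringBuilder("abc")));
    }
}
```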
Re: RFR: 8060130: Simplify the synchronization of defining and getting java.lang.Package
The commented-out annotation in ClassLoader.java must be removed (line 268): // @GuardedBy("itself") Stanimir On Fri, Oct 10, 2014 at 6:10 PM, Claes Redestad claes.redes...@oracle.com wrote: Hi all, please review this patch which attempts to clean up synchronization and improve scalability when defining and getting java.lang.Package objects. webrev: http://cr.openjdk.java.net/~redestad/8060130/webrev.02/ bug: https://bugs.openjdk.java.net/browse/JDK-8060130 Testing: jtreg, UTE vm.parallel_class_loading.testlist, various benchmarks. Torturing the retrieval code with a simple microbenchmark[1] shows that the existing code could cause bottlenecks, but also that the proposed patch works slightly faster even in uncontended cases:

    Benchmark                        Mode  Samples   Score  Score error  Units
    baseline, 1 thread:
    o.s.SimpleBench.getClassPackage  thrpt      10  11.621        0.618  ops/us
    o.s.SimpleBench.getPackage       thrpt      10  41.754        3.381  ops/us
    o.s.SimpleBench.getPackages      thrpt      10   0.009        0.000  ops/us
    baseline, 2 threads:
    o.s.SimpleBench.getClassPackage  thrpt      10   7.884        1.977  ops/us
    o.s.SimpleBench.getPackage       thrpt      10  17.013        8.079  ops/us
    o.s.SimpleBench.getPackages      thrpt      10   0.004        0.001  ops/us
    patch applied, 1 thread:
    o.s.SimpleBench.getClassPackage  thrpt      10  13.519        0.170  ops/us
    o.s.SimpleBench.getPackage       thrpt      10  59.999        0.988  ops/us
    o.s.SimpleBench.getPackages      thrpt      10   0.019        0.001  ops/us
    patch applied, 2 threads:
    o.s.SimpleBench.getClassPackage  thrpt      10  19.181        3.688  ops/us
    o.s.SimpleBench.getPackage       thrpt      10  79.708       31.220  ops/us
    o.s.SimpleBench.getPackages      thrpt      10   0.025        0.006  ops/us

/Claes [1]

    package org.sample;
    import org.openjdk.jmh.annotations.*;

    @State(Scope.Thread)
    public class SimpleBench {
        @Benchmark
        public Package[] getPackages() {
            return Package.getPackages();
        }
        @Benchmark
        public Package getClassPackage() {
            return this.getClass().getPackage();
        }
        @Benchmark
        public Package getPackage() {
            return Package.getPackage("java.util.zip");
        }
    }
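The lock-free read path this kind of cleanup typically targets can be sketched with ConcurrentHashMap.computeIfAbsent. This is an illustration of the general technique under assumed names, not the actual webrev code (Package is stood in for by a plain Object value):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PackageCache {
    // One concurrent map replaces a synchronized HashMap guarded by a coarse lock.
    private final Map<String, Object> packages = new ConcurrentHashMap<>();

    // computeIfAbsent guarantees exactly one value per name even under races,
    // and the common "already defined" lookup takes no lock at all.
    public Object getOrDefine(String name) {
        return packages.computeIfAbsent(name, n -> new Object());
    }

    public static void main(String[] args) {
        PackageCache cache = new PackageCache();
        System.out.println(cache.getOrDefine("java.util.zip") == cache.getOrDefine("java.util.zip"));
    }
}
```

The benchmark numbers above are consistent with this shape of change: uncontended lookups get slightly cheaper, and the two-thread case stops serializing on a shared monitor.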
Re: JDK-6774110 lock file is not deleted when child logger is used
Hi, LogManager.reset() should invoke a package-private method to delete all lock files. However, that would require holding the FileHandler.locks monitor during the resetting of FileHandlers, not just the deletion process. Something like this, plus a new PrivilegedAction():

    static void deleteAllLocks() {
        synchronized (locks) {
            for (String file : locks)
                new File(file).delete();
            locks.clear();
        }
    }

Alternatively the deletion could just be part of the Cleaner shutdown hook, with another sun.misc.Cleaner per FileHandler that deletes the file. (Handlers can be shared amongst loggers, so they cannot be closed explicitly.) There is a certain risk as file.delete() can be a very slow operation, though (ext3 [concurrently] deleting large files, for example). Stanimir On Thu, Oct 9, 2014 at 4:59 PM, Daniel Fuchs daniel.fu...@oracle.com wrote: Thanks Jason. I wonder if that may be another issue. Interesting. I'll see if I can work out a test case for that tomorrow. With the test case provided in the bug - tested on 7, 8, and 9 - the only file that remained at the end was 'log' (which is as it should be, and I ran the test case several times with each JDK), which lets me think that maybe the issue was different. Now what you describe looks indeed like a bug that should still be present in the code base. I didn't think about that scenario, thanks for pointing it out! If I can write a reproducer (which should not be too difficult), it will be a good incentive to attempt a fix :-) Thanks again, -- daniel On 10/9/14 9:56 PM, Jason Mehrens wrote: Daniel, The evaluation on this bug is not quite correct. What is going on here is that the child logger is garbage collected, which makes the FileHandler unreachable from the LogManager$Cleaner, which would have closed the attached FileHandler. In the example, there is no hard reference that escapes the 'execute' method.
Prior to fixing JDK-6274920: "JDK logger holds strong reference to java.util.logging.Logger instances", the LogManager$Cleaner would have deleted the lock file on shutdown. Now that the loggers are GC'able, one possible fix would be to change the FileHandler.locks static field to a Map<String,FileHandler>, where the key is the file name and the value is the FileHandler that is open. Then the LogManager$Cleaner could close any entries in that map after LogManager.reset() is executed. Jason
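Jason's Map<String,FileHandler> suggestion can be sketched as a toy registry (types simplified to Closeable; class and method names are made up). The cleaner would walk the map on shutdown, close each open handler, and delete its lock file:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class LockRegistry {
    // Key: lock-file name; value: the open handler that owns it.
    private final Map<String, Closeable> locks = new HashMap<>();

    public synchronized void register(String lockFile, Closeable handler) {
        locks.put(lockFile, handler);
    }

    // Called from the shutdown cleaner after reset(): closes every handler
    // still registered. In the real fix, the lock file itself would also be
    // deleted here.
    public synchronized void closeAll() {
        for (Closeable h : locks.values()) {
            try {
                h.close();
            } catch (IOException ignore) {
                // best effort during shutdown
            }
        }
        locks.clear();
    }

    public synchronized int size() {
        return locks.size();
    }
}
```

As the follow-up discussion notes, this only works cleanly if handlers are not shared among loggers; a shared handler could be closed prematurely.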
Re: JDK-6774110 lock file is not deleted when child logger is used
On Fri, Oct 10, 2014 at 5:13 PM, Jason Mehrens jason_mehr...@hotmail.com wrote: Hi Daniel, Stanimir, Closing the Handler is the main goal, which takes care of the lock files. As far as a strong reference to the logger goes, you don't need that. What you need to do is store a reference to the Logger.handlers list in the LogManager$LoggerWeakRef and reap the handlers inside of dispose. Now the only other issue is if one handler has been added to multiple loggers, which could close a handler early. That is so uncommon I don't think it is worth the effort to correct. The caller can just fix the code to use a strong reference. Actually we have a framework that uses shared handlers quite liberally (like writing lots of stuff into a common file), but it also keeps strong references to any logger configured. The framework doesn't use the properties file but a custom xml-based config. My point is: shared handlers may not be that uncommon, and the change may break existing code, as any shared handler could be prematurely closed. The automatic closure of Handlers remains problematic, as they don't have a defined lifecycle: close() should be invoked after there are no references, and it requires calling a potentially overridden method. It can be carried out by a PhantomReference+WeakReference combo, though. Date: Fri, 10 Oct 2014 11:39:41 +0200 From: daniel.fu...@oracle.com To: stani...@riflexo.com CC: jason_mehr...@hotmail.com; core-libs-dev@openjdk.java.net Subject: Re: JDK-6774110 lock file is not deleted when child logger is used Hi Stanimir, Jason, On 10/10/14 10:02, Stanimir Simeonoff wrote: Hi, LogManager.reset() should invoke a package-private method to delete all lock files. However, that would require holding the FileHandler.locks monitor during the resetting of FileHandlers, not just the deletion process. Something like that, plus a new PrivilegedAction().
    static void deleteAllLocks() {
        synchronized (locks) {
            for (String file : locks)
                new File(file).delete();
            locks.clear();
        }
    }

There's more than the deletion of the lock file, unfortunately. I believe the handlers should be properly closed. A handler with an XMLFormatter, for instance, needs to write the tail of the file. Alternatively the deletion could just be part of the Cleaner shutdown hook, with another sun.misc.Cleaner per FileHandler that deletes the file. (Handlers can be shared amongst loggers, so they cannot be closed explicitly.) There is a certain risk as file.delete() can be a very slow operation, though (ext3 [concurrently] deleting large files, for example). That's a solution I envisaged and rejected because of the constraints we have when running in the ReferenceHandler thread. I don't think it would be appropriate to close a Handler in that thread. I'm leaning towards suggesting that the LogManager should hold a strong reference on the loggers for which a Handler is explicitly configured in the configuration file. It would ensure that these loggers are still around when reset() is called. best regards, -- daniel Stanimir On Thu, Oct 9, 2014 at 4:59 PM, Daniel Fuchs daniel.fu...@oracle.com mailto:daniel.fu...@oracle.com wrote: Thanks Jason. I wonder if that may be another issue. Interesting. I'll see if I can work out a test case for that tomorrow. With the test case provided in the bug - tested on 7, 8, and 9 - the only file that remained at the end was 'log' (which is as it should be, and I ran the test case several times with each JDK), which lets me think that maybe the issue was different. Now what you describe looks indeed like a bug that should still be present in the code base. I didn't think about that scenario, thanks for pointing it out!
If I can write a reproducer (which should not be too difficult), it will be a good incentive to attempt a fix :-) Thanks again, -- daniel On 10/9/14 9:56 PM, Jason Mehrens wrote: Daniel, The evaluation on this bug is not quite correct. What is going on here is that the child logger is garbage collected, which makes the FileHandler unreachable from the LogManager$Cleaner, which would have closed the attached FileHandler. In the example, there is no hard reference that escapes the 'execute' method. Prior to fixing JDK-6274920: "JDK logger holds strong reference to java.util.logging.Logger instances", the LogManager$Cleaner would have deleted the lock file on shutdown. Now that the loggers are GC'able, one possible fix would be to change the FileHandler.locks static field to a Map<String,FileHandler>, where the key is the file name and the value is the FileHandler that is open. Then the LogManager$Cleaner could close any entries in that map after LogManager.reset() is executed. Jason
Re: RFR: 8060132: Handlers configured on abstract nodes in logging.properties are not always properly closed
persistentLoggers should be cleared on reset(), imo. Also, using a count and then a for-in loop over an array looks a bit ugly; it should be just:

    for (int i = 0; i < names.length; i++) {
        final String className = names[i];
        ...
    }

Calling the class name 'word' is weird as well. (I do understand that's not new, but it's a good time to get it straight.) Cheers Stanimir On Fri, Oct 10, 2014 at 5:51 PM, Daniel Fuchs daniel.fu...@oracle.com wrote: Hi, Please find below a possible fix for: 8060132: Handlers configured on abstract nodes in logging.properties are not always properly closed https://bugs.openjdk.java.net/browse/JDK-8060132 webrev: http://cr.openjdk.java.net/~dfuchs/webrev_8060132/webrev.00/ Other options have been discussed in this other email thread: Subject: JDK-6774110 lock file is not deleted when child logger is used http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-October/029038.html best regards, -- daniel
Re: RFR: 8060132: Handlers configured on abstract nodes in logging.properties are not always properly closed
Please disregard the remark about count, as any exception would make the use of the index variable incorrect. Stanimir On Fri, Oct 10, 2014 at 6:10 PM, Stanimir Simeonoff stani...@riflexo.com wrote: persistentLoggers should be cleared on reset(), imo. Also, using a count and then a for-in loop over an array looks a bit ugly; it should be just for (int i = 0; i < names.length; i++) { final String className = names[i]; ... }. Calling the class name 'word' is weird as well. (I do understand that's not new, but it's a good time to get it straight.) Cheers Stanimir On Fri, Oct 10, 2014 at 5:51 PM, Daniel Fuchs daniel.fu...@oracle.com wrote: Hi, Please find below a possible fix for: 8060132: Handlers configured on abstract nodes in logging.properties are not always properly closed https://bugs.openjdk.java.net/browse/JDK-8060132 webrev: http://cr.openjdk.java.net/~dfuchs/webrev_8060132/webrev.00/ Other options have been discussed in this other email thread: Subject: JDK-6774110 lock file is not deleted when child logger is used http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-October/029038.html best regards, -- daniel
Re: RFR: 8043306 - Provide a replacement for the API that allowed to listen for LogManager configuration changes
Hello, Just a note: acc is going to leak the context class loader until the listener is removed. That should be noted in the documentation. Also, if there is a runtime exception during run() of a listener, it will prevent any other listeners from being invoked, and the exception is going to be propagated past readConfiguration(). I suppose that's ok; however, the listeners are invoked in a 'random' order depending on their identityHashCode, so if there is an exception in the last-registered listener, there is no guarantee that even the 1st-added listener is invoked. The entire process is not deterministic, which is not ok for listener invocation. Stanimir
Re: RFR: 8043306 - Provide a replacement for the API that allowed to listen for LogManager configuration changes
Good point. But I expect the concrete listener class will also usually be loaded by the context class loader - so isn't that kind of expected anyway? I didn't want to stuff too many implementation details into the spec. Maybe that could be an API note, however. Would anyone be able to suggest a good wording? On leaks. The c-tor of LogManager leaks the thread group and the current AccessControlContext till JVM shutdown. The constructor code should be something like the one below, with Cleaner getting the ThreadGroup in its c-tor as well.

    java.security.AccessController.doPrivileged( // zip thread.inheritedAccessControlContext
        new java.security.PrivilegedAction<Void>() {
            public Void run() {
                // put the cleaner in the system thread group
                ThreadGroup grp = Thread.currentThread().getThreadGroup();
                for (ThreadGroup parent; null != (parent = grp.getParent());) {
                    grp = parent;
                }
                Cleaner cleaner = new Cleaner(grp);
                cleaner.setContextClassLoader(null); // zip the classloader
                // Add a shutdown hook to close the global handlers.
                try {
                    Runtime.getRuntime().addShutdownHook(cleaner);
                } catch (IllegalStateException e) {
                    // If the VM is already shutting down,
                    // we do not need to register the shutdownHook.
                }
                return null;
            }
        });

Also if there is a runtime exception during run() of a listener, it will prevent any other listeners from being invoked, and the exception is going to be propagated past readConfiguration(). I suppose that's ok; however, the listeners are invoked in a 'random' order depending on their identityHashCode. So if there is an exception in the last-registered listener, there is no guarantee that even the 1st-added listener is invoked. The entire process is not deterministic, which is not ok for listener invocation. The 'old' deprecated code was already like that. This doesn't mean we shouldn't try to make it better - but I have to ask whether it's worth the trouble. However the listeners are to be invoked in the same order they have been added. Are there applications where there will be several listeners?
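The shape of the suggested constructor fix can be tried out standalone. This is a hedged sketch (CleanShutdownHook and newRootGroupHook are my names, not LogManager's, and the task stands in for the real Cleaner logic): the hook thread is created in the root thread group with a nulled context class loader, so neither the creating thread's group nor its loader is pinned until VM exit.

```java
// Sketch: build a shutdown-hook thread that pins neither the creating
// thread's ThreadGroup nor its context class loader.
public class CleanShutdownHook {

    public static Thread newRootGroupHook(Runnable task) {
        // walk up to the root ("system") thread group
        ThreadGroup grp = Thread.currentThread().getThreadGroup();
        for (ThreadGroup parent; (parent = grp.getParent()) != null; ) {
            grp = parent;
        }
        Thread hook = new Thread(grp, task, "cleaner");
        hook.setContextClassLoader(null); // zip the classloader
        return hook;
    }

    public static void main(String[] args) {
        Thread hook = newRootGroupHook(() -> { /* close handlers here */ });
        // the hook lives in the root group and carries no context loader
        System.out.println("root group: " + (hook.getThreadGroup().getParent() == null));
        System.out.println("null TCCL: " + (hook.getContextClassLoader() == null));
    }
}
```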
Not sure; however, listeners can be added at any time, and readConfiguration(InputStream) is public too, so it's plausible. And if there are - shouldn't these listeners run their code in a try { } catch ( ) { } to ensure that the exception will be logged - or rather reported - where it belongs? IMO, the API can't demand any cooperation from user-space code. IMO the only thing we could do here is to silently drop any runtime exception or error, which I am a bit reluctant to do; I believe it's the concrete implementation of the listener that should take care of that. I'd rather run the listeners in try/catch and throw the 1st exception at the end. Defining 'the 1st', however, requires obeying the order in which the listeners were added. As far as I can see, there is no documentation on how the listeners are supposed to behave or be invoked. Stanimir
Re: RFR: 8043306 - Provide a replacement for the API that allowed to listen for LogManager configuration changes
Hi Peter, I like the LHM (personal favorite data structure) approach with the dual-purpose key. Stanimir On Fri, Sep 12, 2014 at 2:55 PM, Peter Levart peter.lev...@gmail.com wrote: On 09/12/2014 12:45 PM, Daniel Fuchs wrote: However the listeners are to be invoked in the same order they have been added. I am still unconvinced this is worth the additional complexity it would bring to the implementation. The deprecated methods were using HashMap to store listeners, and therefore the order in which listeners were invoked was also undefined. Nobody has ever complained. What about using a synchronized LinkedHashMap<ListenerAction, ListenerAction>() with the following double-purpose inner class:

    private static final class ListenerAction implements PrivilegedAction<Void> {
        private final Runnable listener;
        private final AccessControlContext acc;

        ListenerAction(Runnable listener, AccessControlContext acc) {
            this.listener = listener;
            this.acc = acc;
        }

        void invoke() {
            if (acc == null) {
                run();
            } else {
                AccessController.doPrivileged(this, acc);
            }
        }

        // not to be run() directly - use invoke() instead
        @Override
        public Void run() {
            listener.run();
            return null;
        }

        @Override
        public int hashCode() {
            return System.identityHashCode(listener);
        }

        @Override
        public boolean equals(Object obj) {
            return (obj instanceof ListenerAction)
                && ((ListenerAction) obj).listener == listener;
        }
    }

(http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/3ae82f0c6b31/src/share/classes/java/util/logging/LogManager.java#l1431) I'd rather run the listeners in try/catch and throw the 1st exception at the end. Defining 'the 1st', however, requires obeying the order in which the listeners were added. As far as I can see, there is no documentation on how the listeners are supposed to behave or be invoked.
We could do that - invoke all listeners and rethrow the exception caught from the first listener that failed (on the basis that if any other listener subsequently fails, it may be because the previous listener has left the system in an inconsistent state, and that therefore the first exception caught has 'more value' than the others). I am not completely convinced this is a better behavior than simply stopping invoking listeners at the first exception. I wonder if there would be a good wording to say that listeners are not supposed to throw, but that if they do, the exceptions will be propagated... You're probably right. But in case you change your mind, you could rethrow the 1st exception thrown, with possible further exceptions added to it as suppressed exceptions... Regards, Peter best regards, -- daniel
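The invoke-all-and-rethrow-the-first idea above can be sketched in a few lines. This is a hedged, illustrative sketch (ListenerInvoker and invokeAll are made-up names, not LogManager API): listeners run in registration order, and later failures are attached to the first one as suppressed exceptions.

```java
import java.util.ArrayList;
import java.util.List;

public class ListenerInvoker {

    // Invoke listeners in registration order; rethrow the first failure,
    // with any later failures attached as suppressed exceptions.
    public static void invokeAll(List<Runnable> listeners) {
        RuntimeException first = null;
        for (Runnable l : listeners) {
            try {
                l.run();
            } catch (RuntimeException e) {
                if (first == null) first = e; else first.addSuppressed(e);
            }
        }
        if (first != null) throw first;
    }

    public static void main(String[] args) {
        List<String> order = new ArrayList<>();
        List<Runnable> listeners = List.of(
            () -> order.add("a"),
            () -> { order.add("b"); throw new IllegalStateException("b failed"); },
            () -> order.add("c"));
        try {
            invokeAll(listeners);
        } catch (RuntimeException e) {
            // all three ran despite the failure in the middle
            System.out.println(order + " -> " + e.getMessage());
        }
    }
}
```

Note this keeps the deterministic ordering that an insertion-ordered store (e.g. a LinkedHashMap of ListenerAction keys) provides.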
Re: Why is finalize wrong?
Hi Andrew, On Wed, Sep 3, 2014 at 12:58 PM, Andrew Haley a...@redhat.com wrote: On 03/09/14 01:15, Stanimir Simeonoff wrote: Like David Lloyd mentioned, finalize() can be invoked concurrently with some methods (if there is no reachability to 'this'). OTOH I see finalize useful mostly for managing native resources and possibly stopping (signaling) threads left unmanaged. In that particular case synchronization is a must, so it comes naturally. But finalize isn't useful for managing native resources because finalizers may or may not run, as you note. People will test their programs with one implementation and then discover, when their programs are deployed, that they sometimes mysteriously fail. In particular, you might run out of file handles. I meant cleaning up rather than managing per se. To make it clear: finalization is no substitute for a proper lifecycle - anything that has open() should have a following close(), otherwise it's a bug. Finalization still helps as a safety net vs. leaving native resources unreleased and leaking. Yet finalizers have to run somehow, since the direct (and mapped) buffers rely on them very heavily. The direct/mapped buffers are ones that don't have a proper lifecycle, esp. the mapped ones: the latter don't offer munmap. I believe java.nio.Bits still includes a notorious call in

    static void reserveMemory(long size, int cap) {
        // check if above the direct memory threshold and...
        System.gc();
        try {
            Thread.sleep(100);
        } catch (InterruptedException x) {
            // Restore interrupt status
            Thread.currentThread().interrupt();
        }
    }

in an attempt to cope with the lack of proper and timely finalization. Prior to the fix, an OOM when using DirectBuffers was pretty much guaranteed unless you pooled the buffers on your own. On a flip note, I think it's kind of a bug that it doesn't manually call System.runFinalization() prior to throwing the OOM, but opts to sleep 100ms instead.
If the finalizers don't run prior to process termination, that's ok, as the OS should be able to clean up any native resources allocated on its own. However, finalizers not running is not that different from a weak/phantom reference not being enqueued; practically the odds are quite the same. The impl. of finalization is actually a high-priority dedicated thread that processes a ReferenceQueue. As Jaroslav put it: I don't think this is explainable to regular Java developers. As David Lloyd put it: If you still think that finalize is a good idea, given that it's basically defective *and* there is almost always a better solution, then I will be quite astounded. As I put it: finalize is broken. I'd not say broken, as it works exactly as specified. Still, I utterly agree it requires pretty deep knowledge of the inner workings to make it somewhat useful. Stanimir
Re: Why is finalize wrong?
I meant cleaning up rather than managing per se. To make it clear: finalization is no substitute for a proper lifecycle - anything that has open() should have a following close(), otherwise it's a bug. Finalization still helps as a safety net vs. leaving native resources unreleased and leaking. Yet finalizers have to run somehow, since the direct (and mapped) buffers rely on them very heavily. DirectByteBuffers rely on Cleaner(s), which are PhantomReferences and are managed (invoked) by the ReferenceHandler thread directly. Finalizers are just enqueued by the ReferenceHandler thread, but they are invoked by the special FinalizerThread. That's because Cleaner(s) are an internal JDK mechanism, while finalizers invoke client code, so they can stall the FinalizerThread. A bug in client code can bring the FinalizerThread to a halt. Cleaners totally slipped my mind regarding DirectBuffers (I almost feel ashamed about it). Yes, all of nio is using them, unlike java.io.FileInputStream and java.util.zip.Deflater/Inflater. I remember a process getting justice from the oom_killer because a Deflater was not end()'d properly. I am not sure if Deflater (which is a memory hog) could use Cleaner, though. Again, halting the Finalizer is a slow leak. The memory will fill with tons of java.lang.ref.Finalizer instances, so either way it'd be OOM; it just takes more time and blows up unexpectedly. As said, DirectByteBuffer(s) use Cleaner(s), which are more predictable. But even with Cleaner(s), a multithreaded test could be constructed that reserves direct memory at a greater rate than the single ReferenceHandler thread can unreserve it. Until this was fixed: https://bugs.openjdk.java.net/browse/JDK-6857566 Thanks again for the Cleaner reminder. @David, Pooling doesn't help 32bit processes running out of address space due to the lack of munmap. FileChannelImpl employs the same tactic of calling System.gc() and waiting. Stanimir
Re: Why is finalize wrong?
Hi Jaroslav, In my opinion the greatest downside of finalize() is that it involves enlisting in a shared queue prior to the object creation. The queue is shared amongst the entire VM, and that could be a potential bottleneck. In practical terms, having a finalize method involves a full memory fence when a new instance is created (or cloned), and removing a finalize() method can break some racy code. Like David Lloyd mentioned, finalize() can be invoked concurrently with some methods (if there is no reachability to 'this'). OTOH I see finalize useful mostly for managing native resources and possibly stopping (signaling) threads left unmanaged. In that particular case synchronization is a must, so it comes naturally. Personally I never use dedicated threads with ReferenceQueue, but I poll (expunge) the queue on add/put. If there are too many successful ReferenceQueue.poll() calls, hijacking the Finalizer thread is an option. Something along the lines of:

    public static interface Cleaner<T, Ref extends Reference<? extends T>> {
        void clean(Ref ref);
    }

    public static <T, Ref extends Reference<? extends T>> void expunge(
            Cleaner<T, Ref> cleaner, final ReferenceQueue<T> queue, int expungeThreshold) {
        Reference<? extends T> w = queue.poll();
        if (w == null)
            return; // keep the method short, so it's inlined; queue being empty is the fast lane
        expungeImpl(cleaner, w, queue, expungeThreshold);
    }

    static <T, Ref extends Reference<? extends T>> void expungeImpl(
            final Cleaner<T, Ref> cleaner, Reference<? extends T> ref,
            final ReferenceQueue<T> queue, int expungeThreshold) {
        expunge(cleaner, ref);
        for (; --expungeThreshold > 0 && null != (ref = queue.poll());) {
            expunge(cleaner, ref);
        }
        if (expungeThreshold == 0) { // delegate a full expunge to the finalizer
            // the anon. class stores cleaner and queue in fields, and the c-tor
            // has happens-before with finalize(), unlike any other method
            new Object() { // feel free to extend/wrap WeakReference(cleaner) and also keep
                           // a weak ref to the queue - useful for non-native resources
                protected void finalize() {
                    // keep in mind: the execution of cleaner.clean(Ref) will not have the
                    // right ContextClassLoader, ThreadGroup, AccessControlContext
                    expunge(cleaner, queue, Integer.MAX_VALUE);
                }
            };
        }
    }

    @SuppressWarnings("unchecked")
    private static <T, Ref extends Reference<? extends T>> void expunge(
            Cleaner<T, Ref> cleaner, Reference<? extends T> ref) {
        cleaner.clean((Ref) ref);
    }

Some colorization: http://pastebin.com/1V9Jhavz Of course the finalize method is not guaranteed to execute at all, but if expungeThreshold is greater than 1 and expunge is called on each add/put operation, it should still be ok. Cheers Stanimir On Mon, Aug 4, 2014 at 4:46 PM, Jaroslav Tulach jaroslav.tul...@oracle.com wrote: Hi. Last week we touched on the topic of finalization and what is wrong with it. I proposed three reasons why Object.finalize is bad. Is it a correct and extensive list, or would you change or expand it? Thanks as ... # 1 - Because of the automatic JDK management thread? # 2 - Or because once finalize() is called, one still has a reference to the gone object and can re-activate it? # 3 - Or because the finalizer thread is shared between completely unrelated objects and misbehavior of one can cause problems for others? ...I consider #2 the worst idea: My personal top three starts with #2. That is logically invalid. #3 can be a problem, but so can any other misbehaving thread in an application. #1 is completely OK from my perspective, and I think I got some support... Weak references and reference queues were introduced to address some of the shortcomings of finalization as well as providing other capabilities within a GC'd environment. ...I just have to say that Reference/ReferenceQueue as of JDK8 is not enough.
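The poll-on-access idea described above (expunge the queue on each put, no dedicated thread) can be shown as a tiny self-contained cache. This is an illustrative sketch, assuming made-up names (PollingCache, ValueRef), not code from the thread:

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// A minimal weak-value cache that reclaims dead entries by polling its
// ReferenceQueue on every put()/size(), instead of a dedicated thread.
public class PollingCache<K, V> {
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();
    private final Map<K, ValueRef<K, V>> map = new HashMap<>();

    private static final class ValueRef<K, V> extends WeakReference<V> {
        final K key; // remembered so the mapping can be removed after clearing
        ValueRef(K key, V value, ReferenceQueue<V> q) { super(value, q); this.key = key; }
    }

    public void put(K key, V value) {
        expunge();
        map.put(key, new ValueRef<>(key, value, queue));
    }

    public V get(K key) {
        ValueRef<K, V> r = map.get(key);
        return r == null ? null : r.get();
    }

    public int size() { expunge(); return map.size(); }

    @SuppressWarnings("unchecked")
    private void expunge() {
        Reference<? extends V> r;
        while ((r = queue.poll()) != null) {
            ValueRef<K, V> vr = (ValueRef<K, V>) r;
            map.remove(vr.key, vr); // only remove if still the current mapping
        }
    }

    public static void main(String[] args) {
        PollingCache<String, byte[]> cache = new PollingCache<>();
        byte[] payload = new byte[16]; // strong ref keeps the entry alive
        cache.put("a", payload);
        System.out.println(cache.size());
    }
}
```

As in the email, nothing here guarantees timely cleanup; it only amortizes it over cache mutations.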
As the case of activeReferenceQueue shows, the fact that each instance of ReferenceQueue where one uses remove() needs one dedicated thread is a big flaw. Fixing the flaw in an external library is not easy (see the summary from last week: https://bugs.openjdk.java.net/browse/JDK-8051843?focusedCommentId=13531670&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13531670). Moreover, David Holmes finds introductions of new system-level threads uneasy: The threading aspect is somewhat tangential. Any time you choose to introduce a new thread to perform a task, you have to be concerned about the lifetime of the task and thus the thread. But when thinking about it, we can have the functionality of activeReferenceQueue in the JDK without introducing any new thread! We can reuse the existing finalizer one! I plan to propose a patch that will look at the activeReferenceQueue problem from a completely new perspective and offer a lightweight finalize solution that will have the above discussed properties #1 and #3, but will not suffer from #2 (the biggest problem with
Re: Impact of code difference in Collection#contains() worth improving?
Somewhat off topic: the linked Android implementation lacks integer-overflow handling on size. That might actually be irrelevant on Android, due to OOM happening earlier than integer overflow. That made me check the JDK impl. I know most concurrent collections handle integer overflow (CLQ, CHM for instance), but I didn't expect LinkedList would be an integer-overflow victim: there is no prevention of size going negative. It's unlikely to cause security issues, as LinkedList is pretty much shunned (rightfully) and seldom used. Stanimir On Sat, Aug 30, 2014 at 8:04 PM, Fabian Lange lange.fab...@gmail.com wrote: Hello dear list, it was not my intention to steal precious developer time by annoying you guys with my finding. I really wanted to understand why there is a source difference. So I went ahead and looked at it from the only aspect I could estimate contributing to it. I am well aware of the fact that JIT does an amazing job optimizing code, so it surprises me that in some replies there is a negative undertone and randomness attributed to JIT. Right now I assume that my question upset some people, because it is of course not nice when people question code that was written over a decade ago. If we're not discussing the question on a technical level, I do not know why the argument of time-wasting on a micro level is made in multiple-page-long e-mails, which for sure also took precious time to write. So thanks to everybody who taught me a bit about the inner workings of JIT, and special thanks to Martin who had the courage to submit a bug and webrev! Fabian: PS: Out of curiosity I looked at the contains implementation here: https://android.googlesource.com/platform/libcore/+/master/luni/src/main/java/java/util/LinkedList.java it is manually inlined to avoid counting the position at all. I wonder if JIT would do that at any point as well.
On Wed, Aug 27, 2014 at 4:02 PM, Fabian Lange lange.fab...@gmail.com wrote: Hi all, I have been involved recently in some theoretical or nonsensical discussions about microbenchmarking, jit-compiled assemblies and so forth. One example was LinkedList vs ArrayList. What I noticed is that those two have a different implementation for contains():

    ArrayList:  public boolean contains(Object o) { return indexOf(o) >= 0; }
    LinkedList: public boolean contains(Object o) { return indexOf(o) != -1; }

Logically this is of course identical due to the contract of indexOf, which returns either -1 or the >= 0 index of the element. This code has been like this almost forever, and I was wondering if this actually makes a difference in CPU cycles. And in fact this code compiles into different assembler instructions. The ArrayList does a test against 0 and a conditional move, while the LinkedList does a jump-equals on -1. Again, that is not surprising, because the actual Java source is different. But I wonder if both options are equally good in cold performance and when jitted, based on parameter values. Wouldn't one implementation be better than the other? And why is the better implementation not used in both classes (and maybe other Collections that use indexOf)? Is the answer that this has always been like this and the benefit is not worth the risk of touching ancient code? And if not for performance, would code clarity and similarity be an argument? (this message was posted to jdk8-dev initially; thanks to Dalibor Topic for the pointer to this list) Fabian
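For what it's worth, the two idioms are interchangeable given indexOf's contract (it returns -1 or a non-negative index, never any other negative value). A tiny check of that equivalence (class and method names are mine):

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ContainsEquivalence {

    // true iff the ArrayList-style and LinkedList-style tests agree
    static boolean idiomsAgree(List<String> list, String probe) {
        int idx = list.indexOf(probe);
        return (idx >= 0) == (idx != -1); // both encode "found"
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(List.of("x", "y"));
        List<String> l = new LinkedList<>(a);
        for (String probe : new String[] {"x", "z"}) {
            if (!idiomsAgree(a, probe) || !idiomsAgree(l, probe)) {
                throw new AssertionError("idioms diverge for " + probe);
            }
        }
        System.out.println("equivalent");
    }
}
```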
Re: Exposing LZ77 sliding window size in the java.util.zip.[Inflator|Defaltor] API
Hi, If there are going to be any changes to Deflater/Inflater, I'd strongly suggest enabling them to operate on direct buffers. Working with byte[] forces a copy in the native code, which is suboptimal. Probably there should be a concern about memLevel for Deflater as well; iirc it uses 8 by default (max being 9). memLevel is different from the sliding window but still allocates 2^(7+memLevel+1) bytes. Several years ago I found that a Java implementation of Deflate outperformed the provided native code, so I abandoned the built-in library. Plus, it's possible to implement Z_SYNC_FLUSH naturally instead of changing the compression level back and forth to achieve a similar effect. Also, reusing a pooled Java Deflater doesn't need to free the allocated memory, unlike calling end() on the zlib one. The only downside is probably the increased heap usage, which could be an issue for huge heaps (although it's mostly primitive arrays). Stanimir On Tue, Aug 26, 2014 at 11:45 AM, Mark Thomas ma...@apache.org wrote: Hi, I'm currently working on the implementation of the WebSocket permessage-deflate extension for Apache Tomcat. I am using the JRE-provided classes java.util.zip.[Inflater|Deflater] to do the compression and decompression. I have a working implementation, but there is one feature I can't implement because the necessary API isn't available. The WebSocket permessage-deflate specification [1] allows both the client and server to specify a number between 8 and 15 inclusive, indicating the base-2 logarithm of the LZ77 sliding window size. The purpose of this feature is to enable both the client and the server to limit the size of the buffer they need to maintain, to reduce their memory footprint per connection. Would it be possible for the java.util.zip.Inflater and java.util.zip.Deflater API to be extended to expose the LZ77 window size?
My suggestion would be an additional constructor for each class that takes an additional int parameter for the window size, but I'd be happy with any API that provided control of the window size. Cheers, Mark [1] http://tools.ietf.org/html/draft-ietf-hybi-permessage-compression-18
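As an aside on the Z_SYNC_FLUSH point above: since JDK 7, java.util.zip.Deflater does expose a flush mode via deflate(byte[], int, int, int) with Deflater.SYNC_FLUSH, which lets the receiver decode a message without finishing the stream. A minimal round-trip sketch (SyncFlushDemo and roundTrip are my names; buffer sizes are ad hoc for a demo):

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class SyncFlushDemo {

    // Compress one message with SYNC_FLUSH so it can be fully inflated
    // without the deflate stream being finished (as in permessage-deflate).
    public static byte[] roundTrip(byte[] message) throws DataFormatException {
        Deflater def = new Deflater();
        def.setInput(message);
        byte[] compressed = new byte[message.length + 128]; // ample room for a demo
        int clen = def.deflate(compressed, 0, compressed.length, Deflater.SYNC_FLUSH);

        Inflater inf = new Inflater();
        inf.setInput(compressed, 0, clen);
        byte[] out = new byte[message.length];
        int n = inf.inflate(out); // SYNC_FLUSH ended on a byte boundary, so all data is available
        def.end(); // releases native zlib memory; a pooled pure-Java impl could skip this
        inf.end();
        if (n != message.length) throw new DataFormatException("short read: " + n);
        return out;
    }

    public static void main(String[] args) throws Exception {
        byte[] msg = "hello sync flush".getBytes(StandardCharsets.UTF_8);
        System.out.println(java.util.Arrays.equals(msg, roundTrip(msg)));
    }
}
```

What the constructors still do not expose, as Mark notes, is the windowBits parameter itself.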
java.util.logging.FileHandler integer overflow prevents file rotation
java.util.logging.FileHandler uses ints to track the written bytes, which can cause java.util.logging.FileHandler.MeteredStream.written to go negative, as there is no overflow check in any of the write methods. The overflow prevents the check in FileHandler.publish(LogRecord), meter.written >= limit, from ever triggering; in particular, setting limit to Integer.MAX_VALUE effectively always prevents file rotation. On a flip note, MeteredStream should be a static class. On another flip note, FileHandler should be using long as part of the API and internal implementation. ints are limited to 2GiB; even when properly implemented, that's a relatively low limit for certain types of log files. So I propose adding overloaded c-tors FileHandler(String pattern, long limit, int count, boolean append) and FileHandler(String pattern, long limit, int count). LogManager should have a method long getLongProperty(String name, long defaultValue) to initialize limit. Thanks Stanimir
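A long-based metering wrapper along the proposed lines could look like the sketch below. This is illustrative only (LongMeteredStream and limitReached are made-up names, not the proposed JDK API): with a long counter, the rotation check cannot be defeated by int overflow past 2 GiB.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Sketch: a byte-metering OutputStream wrapper using a long counter,
// so "written >= limit" stays monotonic instead of wrapping negative.
public class LongMeteredStream extends OutputStream {
    private final OutputStream out;
    long written; // long, not int - the point of the proposal

    public LongMeteredStream(OutputStream out) { this.out = out; }

    @Override public void write(int b) throws IOException {
        out.write(b);
        written++;
    }

    @Override public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        written += len;
    }

    public boolean limitReached(long limit) {
        return limit > 0 && written >= limit;
    }

    public static void main(String[] args) throws IOException {
        LongMeteredStream s = new LongMeteredStream(new ByteArrayOutputStream());
        s.written = Integer.MAX_VALUE; // simulate 2 GiB already written
        s.write(1);                    // an int counter would go negative here
        System.out.println(s.written > Integer.MAX_VALUE);
        System.out.println(s.limitReached(Integer.MAX_VALUE));
    }
}
```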
sun.net.www.protocol.https.HttpsURLConnectionImpl doesn't equal to self
Hi, As the title says, there is a major bug in HttpsURLConnection: it breaks the contract of equals, so an instance cannot be reliably contained in (and, mostly, removed from) any collection. ((Concurrent)HashMap has a provision to test equality by reference prior to calling equals, though.) Finding the bug in production is quite a ride, as the http variant doesn't exhibit the problem. Here is a simple test case.

    public static void main(String[] args) throws Throwable {
        URLConnection c = new URL("https://oracle.com").openConnection();
        System.out.println(c.getClass().getName() + " equals self: " + c.equals(c));
        c.getInputStream().close();
        System.out.println(c.getClass().getName() + " equals self: " + c.equals(c));
    }

The culprit is HttpsURLConnectionImpl.equals, which blindly calls delegate.equals:

    public boolean equals(Object obj) {
        return delegate.equals(obj);
    }

It should be changed to:

    public boolean equals(Object obj) {
        return this == obj
            || (obj instanceof HttpsURLConnectionImpl)
                && delegate.equals(((HttpsURLConnectionImpl) obj).delegate);
    }

The class has another issue: it declares a finalize method simply to call the delegate's dispose. The finalize method is unneeded and just creates an unnecessary link in the finalization queue plus a Finalizer object. Thanks Stanimir
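The same unwrap-before-delegating pattern, shown on a self-contained wrapper so it can be run offline (Wrapper is a made-up class, not the JDK's): the key points are the reflexive this == obj short-circuit, unwrapping the other side's delegate, and keeping hashCode consistent with equals.

```java
// Sketch of the delegate-unwrapping equals pattern for wrapper classes.
public class Wrapper {
    private final Object delegate;

    public Wrapper(Object delegate) { this.delegate = delegate; }

    @Override public boolean equals(Object obj) {
        return this == obj
            || (obj instanceof Wrapper)
                && delegate.equals(((Wrapper) obj).delegate); // compare delegates, never obj itself
    }

    @Override public int hashCode() {
        return delegate.hashCode(); // must stay in sync with equals
    }

    public static void main(String[] args) {
        Object d = "delegate";
        Wrapper w1 = new Wrapper(d), w2 = new Wrapper(d);
        System.out.println(w1.equals(w1)); // reflexive, unlike the buggy version
        System.out.println(w1.equals(w2)); // same delegate => equal
        System.out.println(w1.equals(d));  // never equal to the bare delegate
    }
}
```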
Re: Does this look like a race?
Hi, The real problem would be the fields strikelist, graybits and the like, not inUse per se. They are not volatile either, and there might not be proper happens-before edges, so they could easily be updated out of order. For instance, getGrayBits may throw an NPE. IMO, inUse is overall a poor choice to implement the shared singleton, esp. since it introduces a synchronized block that can easily be avoided. inUse and the sync block can be replaced by an AtomicReference + CAS on get. Something like this:

    private static final AtomicReference<GlyphList> shared = new AtomicReference<>(reusableGL);

    public static GlyphList getInstance() {
        GlyphList result = shared.get();
        return result != null && shared.compareAndSet(result, null) ? result : new GlyphList();
    }

    public void dispose() {
        if (this == reusableGL) {
            strikelist = null;
            shared.set(this); // or: if (!shared.compareAndSet(null, this)) throw new AssertionError();
        }
    }

Regards Stanimir On Tue, Aug 12, 2014 at 6:01 PM, Andrew Haley a...@redhat.com wrote: (Please forgive me if this should go to the AWT list or somesuch. It seems to me like this is not really a graphics-related issue, but one of Java concurrency.) This is in JDK9, sun.font.GlyphList. There is a non-volatile boolean inUse, and it is not always written in a synchronized block. It is used to allow exclusive access to a singleton instance. It seems to me that, at a minimum, inUse should be volatile, or e.g. strikelist might be overwritten to null after some other thread has started using this GlyphList. Do you agree? Thanks, Andrew. /* This scheme creates a singleton GlyphList which is checked out * for use. Callers who find it's checked out create one that after use * is discarded. This means that in a MT-rendering environment, * there's no need to synchronise except for that one instance. * Fewer threads will then need to synchronise, perhaps helping * throughput on a MP system. If for some reason the reusable * GlyphList is checked out for a long time (or never returned?)
then * we would end up always creating new ones. That situation should not * occur and if it did, it would just lead to some extra garbage being * created. */

    private static GlyphList reusableGL = new GlyphList();
    private static boolean inUse;
    ...
    public static GlyphList getInstance() {
        /* The following heuristic is that if the reusable instance is
         * in use, it probably still will be in a micro-second, so avoid
         * synchronising on the class and just allocate a new instance.
         * The cost is one extra boolean test for the normal case, and some
         * small number of cases where we allocate an extra object when
         * in fact the reusable one would be freed very soon.
         */
        if (inUse) {
            return new GlyphList();
        } else {
            synchronized (GlyphList.class) {
                if (inUse) {
                    return new GlyphList();
                } else {
                    inUse = true;
                    return reusableGL;
                }
            }
        }
    }
    ...
    /* There's a reference equality test overhead here, but it allows us
     * to avoid synchronizing for GL's that will just be GC'd. This
     * helps MP throughput. */
    public void dispose() {
        if (this == reusableGL) {
            if (graybits != null && graybits.length > MAXGRAYLENGTH) {
                graybits = null;
            }
            usePositions = false;
            strikelist = null; // remove reference to the strike list
            inUse = false;
        }
    }
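The AtomicReference checkout suggested above can be tried as a standalone runnable sketch (Checkout, Resource and isShared are my names; GlyphList-specific state is omitted): one reusable instance is claimed with compareAndSet and published back on dispose, with no synchronized block and no racy boolean.

```java
import java.util.concurrent.atomic.AtomicReference;

// Standalone sketch of the CAS-based singleton checkout pattern.
public class Checkout {

    static final class Resource { /* stands in for GlyphList state */ }

    private static final Resource REUSABLE = new Resource();
    // holds the reusable instance when it is free, null when checked out
    private static final AtomicReference<Resource> shared = new AtomicReference<>(REUSABLE);

    public static Resource getInstance() {
        Resource r = shared.get();
        // claim the shared instance atomically, or fall back to a fresh one
        return (r != null && shared.compareAndSet(r, null)) ? r : new Resource();
    }

    public static void dispose(Resource r) {
        if (r == REUSABLE) {
            shared.set(r); // publish it back for reuse (with proper happens-before)
        }
        // non-shared instances are simply left to the GC
    }

    public static boolean isShared(Resource r) { return r == REUSABLE; }

    public static void main(String[] args) {
        Resource a = getInstance();
        Resource b = getInstance(); // shared one is checked out, so this is fresh
        System.out.println(isShared(a) + " " + isShared(b));
        dispose(a);
        System.out.println(isShared(getInstance())); // reusable is claimable again
    }
}
```

Unlike the inUse boolean, the AtomicReference handoff gives release/acquire ordering for the instance's fields, addressing the strikelist/graybits visibility concern directly.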
Re: Lightweight finalization for the JDK without any drawbacks was: Why is finalize wrong?
Hi Jaroslav, The snippet is likely to just result in an NPE at obj.toString() rather than anything else, as obj must be a member field and volatile to even compile. Stanimir On Fri, Aug 8, 2014 at 11:01 AM, Jaroslav Tulach jaroslav.tul...@oracle.com wrote: Hello Doug, thanks for your reply and clarification about finalization. May I ask for one more clarification, this time about WeakReference behavior? There has been a claim made in this thread that an object which is really gone (from the point of view of a WeakReference) may have an instance method still executing. Something like:

    obj = new Object();
    ref = new WeakReference<Object>(obj);
    class OtherThread extends Thread {
        public void run() {
            obj.toString();
            obj = null;
        }
    }
    new OtherThread().start();
    obj = null;
    while (ref.get() != null) {
        // wait
    }
    assert ref.get() == null; // can the OtherThread be calling toString() on obj now!?

From my point of view, this is impossible. Once ref.get() returns null, the obj is no longer reachable and is really gone. Would you be so kind as to confirm this observation, or is there a chance Andrew could be right? On Aug 7, 2014 Andrew Haley wrote: It's not a bug: it's in the specification, has been since 1996. Given the fact that WeakReference & co. were introduced in JDK 1.2, and that version was released at the end of 1998, I bet it was just a communication problem, as Andrew must have been talking about something else than WeakReference + ReferenceQueue behavior. -jt On 08/07/2014 03:43 AM, Jaroslav Tulach wrote: On Thu, 7 August 2014 08:17:16, Andrew Haley wrote: I'm wondering whether it's really worth your time applying band-aids to a broken mechanism. Under the assumption WeakReference and ReferenceQueue are flawless, there is no reason to call a mechanism based on WeakReference and ReferenceQueue synergy broken. An object can be really gone even when one of its instance methods is still executing. That would probably signal a major flaw in the JVM.
Finalization is not broken in the sense of not meeting its specs (see http://docs.oracle.com/javase/specs/jls/se7/html/jls-12.html#jls-12.6), but some surprising and undesirable cases are allowed, and actually occur, in highly concurrent programs - including cases where finalizers may run concurrently with the final invocation of a method on the target. Because the specs interact with the details of the memory model, we have been discussing (on the jmm-dev list) possible changes and improvements. But in the meantime (or in any case), the best practice is to consider finalization a last resort, after ruling out all other possible cleanup strategies. Finalization is easy to spec, implement, and control in purely sequential systems, but is a questionable idea in concurrent ones. -Doug On Thu, 7 August 2014 16:50:39, Andrew Haley wrote: On 07/08/14 08:43, Jaroslav Tulach wrote: On 06/08/14 11:17, Jaroslav Tulach wrote: Sure, but that cannot happen with activeQueue() as the referent is really gone before the appropriate Reference.run() is executed. I'm sure that it can happen, in just the same way. You are more than welcome to prove your claim and report a bug to the JDK, I'd say. It's not a bug: it's in the specification, has been since 1996. Andrew.
Re: [concurrency-interest] ThreadLocalRandom clinit troubles
Hi Peter, IIRC, ServiceLoader.load(NameServiceDescriptor.class) uses Thread.contextClassLoader to load the services. Depending on how late NameServices gets initialized, perhaps it can be used to circumvent the loader available at clinit of InetAddress. I am not sure it could be a real security concern (as the caller has to be authorized to create custom classloaders), yet the behavior is not exactly the same. Storing Thread.currentThread().getContextClassLoader() during clinit in a WeakReference (and clearing it on use) should be closest to the original code. Cheers, Stanimir On Fri, Jun 20, 2014 at 2:20 PM, Peter Levart peter.lev...@gmail.com wrote: On 06/20/2014 11:10 AM, Peter Levart wrote: Perhaps a more lazy initialization of the NetworkInterface class that does not trigger initialization of NS providers could help. We just need to invoke two methods on NetworkInterface: - static NetworkInterface.getNetworkInterfaces() - instance NetworkInterface.getHardwareAddress() both of which could provide the result without the need of NS providers, I think. This would solve the most general case of using TLR - the case that doesn't involve SecureRandom's help, which I think is rarely needed and not the default. Regards, Peter Hi, A patch that solves this is a patch to InetAddress. We can't suppress initialization of InetAddress as part of NetworkInterface initialization, but we can suppress initialization of NameService providers as part of InetAddress initialization by moving them into a nested class called NameServices: http://cr.openjdk.java.net/~plevart/jdk9-dev/InetAddress.NameServices/webrev.01/ This should solve Martin's issue, but the patch to TLR initialization could be applied nevertheless, since it might help overcome possible issues when using SecureRandom for TLR's seed. Regards, Peter ___ Concurrency-interest mailing list concurrency-inter...@cs.oswego.edu http://cs.oswego.edu/mailman/listinfo/concurrency-interest
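The capture-the-TCCL-at-clinit suggestion can be sketched like this (ClinitLoaderCapture and initialContextClassLoader are my names; this assumes the holder class is initialized on a thread whose context class loader is the one you want to remember). The WeakReference means the holder class itself does not pin the loader.

```java
import java.lang.ref.WeakReference;

// Sketch: remember the context class loader seen during <clinit> in a
// WeakReference, so later lookups can use it without strongly pinning it.
public class ClinitLoaderCapture {

    private static final WeakReference<ClassLoader> INITIAL_TCCL =
        new WeakReference<>(Thread.currentThread().getContextClassLoader());

    // Returns the captured loader, or null if it has been collected
    // (callers would fall back to another loader in that case).
    public static ClassLoader initialContextClassLoader() {
        return INITIAL_TCCL.get();
    }

    public static void main(String[] args) {
        ClassLoader captured = initialContextClassLoader();
        System.out.println("captured matches current TCCL: "
            + (captured == Thread.currentThread().getContextClassLoader()));
    }
}
```

The email's "clearing on use" refinement would additionally call INITIAL_TCCL.clear() once the loader has been consumed.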