DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG-RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://issues.apache.org/bugzilla/show_bug.cgi?id=35052>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=35052

------- Additional Comments From [EMAIL PROTECTED] 2005-07-12 01:27 -------
(In reply to comment #10)
> Logger.getLogger("logger1");
>
> Using a string constant here, "logger1" was already interned true
> "logger1".equals("logger1") is fast. No need to explicitly do an intern()

I agree that literal strings in code are automatically interned, and that the String.equals method should check for identity before bothering to loop over the string content [1]. There is still the overhead of a (non-virtual) method invocation and return, though.

However, the main point of this is to optimise the comparisons executed as part of Hashtable.get when looking up the logger from the map of all loggers. And Hashtable.get may well do several *failed* comparisons before finding the right logger to return.

The operation
  "string1" == "string2"
is always fast; if the references are different then the result is false.

However
  "string1".equals("string2")
has a fair bit of overhead. The String.equals method needs to do:
 * if (param instanceof String)
 * cast param to String
 * if string lengths differ: return false
 * otherwise compare each character [2]

[2] or potentially compare hashcodes.

The reference comparison starts to look nice..

How many such non-equal comparisons occur in the Hashtable lookup? Well, that depends upon the hashcode collision rate inside the hashtable; obviously two strings whose hashcodes cause them to be allocated to different buckets never get compared. The java.util.HashMap class has a default load factor of 0.75. So at a wild guess I would think only about 30% of lookups would hit a bucket with more than one entry, and in 50% of those cases the right entry would be first.

So I agree this optimisation is not critical, but it could give a measurable improvement.

> 2. You're evaluating the string, you're doing:
>
> int i = 1;
> Logger.getLogger("logger" + i);
> ...
> Logger.getLogger("logger" + i);
>
> In which case, the cost of intern() is MUCH more expensive than intern anyway,
> so why bother?

True. However such usage really is a little bizarre.

>
> 3. You're storing the evaluated string and a combination of above. Again,
> intern() is more expensive than equals() so why bother.
>
> I don't understand the need for intern() at all. Sounds like a premature
> optimization bug.

A very common pattern is:
  Logger.getLogger(someClass.getName());

I've checked with Sun Java 1.3.1 and 1.5, and the string returned by someClass.getName does get interned automatically, like literal strings in the code. The question is whether that behaviour should be relied on or not. I'm not aware of anything in the Java specs that mandates this string going into the intern pool (unlike literals in the code, which *must* go there), but I think it likely that all Java implementations *will* do this; when the JVM is loading the raw .class file into memory it needs to store the literal strings somewhere, and putting them in the intern pool seems as good a place as any.

So: first of all, string comparisons in the logger Hashtable lookup will not be common (maybe 30% of lookups). And in all the scenarios where interning makes sense, the strings seem to already be interned; comparing two equal interned strings with .equals should be close to the speed of comparing them with ==.

Where the two strings are actually different (maybe 15% of lookups) there are two cases:
 * strings have different length (most of the time)
   --> minor win for the interned comparison
 * strings have same length (pretty rare)
   --> major win for the interned comparison

Summary:

From a theoretical point of view I tend to agree that this use of String.intern won't have much effect and could have been omitted. Of course actually testing this might be wise -- real stats win over theory any day :-).
But on the other hand, String.intern only does active harm when the user is generating large numbers of logger names dynamically - and I can't see any sane reason for anyone to do that.

[1] Actually, GNU Classpath doesn't! Currently comparison of a string object with itself does a reasonable amount of work (though not a scan of the actual string data). I've sent off an email querying this.
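
For anyone who wants to try the interning behaviour discussed in this comment, here is a minimal sketch. The class and method names (InternDemo, contentEquals, dynamicName) are hypothetical and only for illustration; contentEquals is a rough approximation of the steps listed for String.equals, not the source of any particular JDK or of GNU Classpath. Whether Class.getName() returns an interned string is implementation-dependent, so the last line just prints what the running JVM does.

```java
public class InternDemo {

    /** Rough sketch of the checks described for String.equals (not real JDK source). */
    public static boolean contentEquals(String a, Object param) {
        if (a == param) {
            return true;                    // identity shortcut: same reference
        }
        if (!(param instanceof String)) {
            return false;                   // type check
        }
        String b = (String) param;          // cast
        if (a.length() != b.length()) {
            return false;                   // cheap exit when lengths differ
        }
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return false;               // full scan only for same-length strings
            }
        }
        return true;
    }

    /** Builds a name at run time, so the result is not a compile-time constant. */
    public static String dynamicName(String prefix, int i) {
        return prefix + i;
    }

    public static void main(String[] args) {
        String literal = "logger1";
        String built = dynamicName("logger", 1);

        // Literals are interned, so both sides are the same object.
        System.out.println(literal == "logger1");           // true

        // A string concatenated at run time is a new object.
        System.out.println(literal == built);               // false

        // intern() returns the canonical instance from the pool.
        System.out.println(literal == built.intern());      // true

        // Content comparison succeeds, but only after the full scan.
        System.out.println(contentEquals(literal, built));  // true

        // Implementation-dependent: is the class name already interned?
        String name = InternDemo.class.getName();
        System.out.println(name == name.intern());
    }
}
```

The second println is the "logger" + i case from the quoted comment: the name is equal by content but is a different object, so a lookup keyed on it only benefits from reference comparison after an explicit intern().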
