[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance
    [ https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857098#comment-13857098 ]

Dawid Weiss commented on LUCENE-5372:
-------------------------------------

I wanted to apply these patches, but then I looked deeper and it seems we can't just apply them without some consideration. For example, while reviewing, I noticed things such as this one:

{code}
- * All use of StringBuffers has been refactored to StringBuilder for speed.
+ * All use of StringBuilders has been refactored to StringBuilder for speed.
{code}

which seems to be an auto-replacement artifact. While that one is harmless, this one may not be:

{code}
+++ b/lucene/core/src/java/org/apache/lucene/analysis/tokenattributes/CharTermAttributeImpl.java
@@ -144,8 +144,8 @@ public class CharTermAttributeImpl extends AttributeImpl implements CharTermAttr
     } else if (csq instanceof CharBuffer && ((CharBuffer) csq).hasArray()) {
       final CharBuffer cb = (CharBuffer) csq;
       System.arraycopy(cb.array(), cb.arrayOffset() + cb.position() + start, termBuffer, termLength, len);
-    } else if (csq instanceof StringBuffer) {
-      ((StringBuffer) csq).getChars(start, end, termBuffer, termLength);
+    } else if (csq instanceof StringBuilder) {
+      ((StringBuilder) csq).getChars(start, end, termBuffer, termLength);
{code}

but CharTermAttributeImpl already has an if clause for StringBuilder. The full code is:

{code}
if (csq instanceof String) {
  ((String) csq).getChars(start, end, termBuffer, termLength);
} else if (csq instanceof StringBuilder) {
  ((StringBuilder) csq).getChars(start, end, termBuffer, termLength);
} else if (csq instanceof CharTermAttribute) {
  System.arraycopy(((CharTermAttribute) csq).buffer(), start, termBuffer, termLength, len);
} else if (csq instanceof CharBuffer && ((CharBuffer) csq).hasArray()) {
  final CharBuffer cb = (CharBuffer) csq;
  System.arraycopy(cb.array(), cb.arrayOffset() + cb.position() + start, termBuffer, termLength, len);
} else if (csq instanceof StringBuffer) {
  ((StringBuffer) csq).getChars(start, end, termBuffer, termLength);
} else {
  while (start < end)
    termBuffer[termLength++] = csq.charAt(start++);
  // no fall-through here, as termLength is updated!
  return this;
}
{code}

I would actually leave it for Uwe to modify the API checker rules and then hand-pick offenders. Your contribution won't be lost, Joshua, it'll just go in via another route (not a direct patch, rather a good suggestion :).

IntArray toString has O(n^2) performance
----------------------------------------

Key: LUCENE-5372
URL: https://issues.apache.org/jira/browse/LUCENE-5372
Project: Lucene - Core
Issue Type: Bug
Components: core/index
Reporter: Joshua Hartman
Assignee: Dawid Weiss
Priority: Minor
Fix For: 5.0, 4.7
Attachments: 5372-lucene5339.patch, 5372-v2.patch, 5372.patch, LUCENE-5372-forbidden.patch

This is pretty minor, but I found a few issues with the toString implementations while looking through the facet data structures. The most egregious is the use of string concatenation in the IntArray class. I have fixed that using StringBuilders. I also noticed that other classes were using StringBuffer instead of StringBuilder. According to the javadoc,

"This class is designed for use as a drop-in replacement for StringBuffer in places where the string buffer was being used by a single thread (as is generally the case). Where possible, it is recommended that this class be used in preference to StringBuffer as it will be faster under most implementations."

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
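To make the concern in Dawid's comment concrete, here is a minimal, self-contained sketch of what a blind StringBuffer-to-StringBuilder replacement does to an instanceof chain of this shape. The class and method names (DispatchOrderSketch, original, afterBlindReplace) are hypothetical, the Lucene-specific CharTermAttribute branch is left out so the sketch compiles without Lucene, and each method returns a label instead of copying characters. It is not the Lucene code, only an illustration of the branch order.

```java
import java.nio.CharBuffer;

class DispatchOrderSketch {

    // Same branch order as the quoted code (minus the CharTermAttribute branch).
    static String original(CharSequence csq) {
        if (csq instanceof String) {
            return "String";
        } else if (csq instanceof StringBuilder) {
            return "StringBuilder";
        } else if (csq instanceof CharBuffer && ((CharBuffer) csq).hasArray()) {
            return "CharBuffer";
        } else if (csq instanceof StringBuffer) {
            return "StringBuffer";
        } else {
            return "charAt loop";
        }
    }

    // The same chain after replacing every "StringBuffer" with "StringBuilder".
    static String afterBlindReplace(CharSequence csq) {
        if (csq instanceof String) {
            return "String";
        } else if (csq instanceof StringBuilder) {
            return "StringBuilder";
        } else if (csq instanceof CharBuffer && ((CharBuffer) csq).hasArray()) {
            return "CharBuffer";
        } else if (csq instanceof StringBuilder) {
            // Duplicate test: the second branch already caught every StringBuilder,
            // so this branch can never be taken.
            return "StringBuilder (duplicate)";
        } else {
            return "charAt loop";
        }
    }

    public static void main(String[] args) {
        CharSequence legacy = new StringBuffer("abc");
        System.out.println(original(legacy));
        System.out.println(afterBlindReplace(legacy));
    }
}
```

In this sketch the replaced chain still compiles, but a StringBuffer argument now falls through to the character-by-character loop and the duplicated branch is dead code, which is why the replacement needs hand-picking rather than a global search and replace.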
[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance
    [ https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852717#comment-13852717 ]

Uwe Schindler commented on LUCENE-5372:
---------------------------------------

Oh oh, we should put StringBuffer on the forbidden-apis list. It is one entry in our base.txt signatures file (please don't add a new one). We already replaced all StringBuffers earlier, when we moved to Lucene 3.0 (the first release for Java 5), so we can disallow StringBuffer everywhere!
[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance
    [ https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852728#comment-13852728 ]

Uwe Schindler commented on LUCENE-5372:
---------------------------------------

Should we put the remaining StringBuffers on a separate issue or fix them here?
[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance
    [ https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852727#comment-13852727 ]

Uwe Schindler commented on LUCENE-5372:
---------------------------------------

This one is wanted, but mostly obsolete; maybe put it on the exclusion list:

{noformat}
[forbidden-apis] Forbidden class/interface use: java.lang.StringBuffer [Use StringBuilder instead, which has no synchronization]
[forbidden-apis]   in org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl (CharTermAttributeImpl.java:148)
{noformat}
[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance
    [ https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852732#comment-13852732 ]

Dawid Weiss commented on LUCENE-5372:
-------------------------------------

I think we can fix it here since Josh brought that up.
[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance
    [ https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852836#comment-13852836 ]

Michael McCandless commented on LUCENE-5372:
--------------------------------------------

bq. Do you want me to hold it, Mike?

Yes, please!

+1 to fix StringBuffers here!

Thanks, Joshua.
[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance
    [ https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13853260#comment-13853260 ]

Joshua Hartman commented on LUCENE-5372:
----------------------------------------

I'll make the StringBuffer -> StringBuilder fix on trunk and on the branch mentioned by Mike, and attach new patches for each. Expect the patches this evening or tomorrow.
[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance
    [ https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851794#comment-13851794 ]

Mark Miller commented on LUCENE-5372:
-------------------------------------

Yes, 4x=6, 5x=7
[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance
    [ https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852199#comment-13852199 ]

Joshua Hartman commented on LUCENE-5372:
----------------------------------------

I can also optimize memory usage by precalculating the maximum size of the StringBuilder for each collection in advance. It may be overkill; what are your thoughts?

I am new here. http://wiki.apache.org/lucene-java/HowToContribute implies I should wait for the patch to be pulled in by a Lucene dev. Is this a correct interpretation?
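The presizing idea Joshua floats above could look roughly like the following. This is only a sketch under assumptions: the class name PresizedToString, the (values, length) signature, and the "[1, 2, 3]" output format are all hypothetical and not taken from the patch. The capacity is an upper bound based on the longest possible int rendering, "-2147483648", which is 11 characters.

```java
class PresizedToString {

    // Hypothetical helper: renders the first `length` entries as "[a, b, c]".
    static String toString(int[] values, int length) {
        // Upper bound on the output size: 2 chars for the brackets, plus per element
        // at most 11 chars for the number and 2 chars for the ", " separator.
        // (For very large lengths this arithmetic would overflow; ignored in this sketch.)
        StringBuilder sb = new StringBuilder(2 + length * 13);
        sb.append('[');
        for (int i = 0; i < length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(values[i]);
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(toString(new int[] {1, 2, 3}, 3));
    }
}
```

The trade-off is that the bound over-allocates for small values in exchange for never regrowing the buffer; StringBuilder's default growth is already amortized linear, so for a toString method the gain is likely small.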
[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance
    [ https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852214#comment-13852214 ]

Dawid Weiss commented on LUCENE-5372:
-------------------------------------

No need to be paranoid about performance here, Josh. The patch is fine, I'll apply it, although tomorrow because it's gotten really late over here.
[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance
    [ https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852216#comment-13852216 ]

Michael McCandless commented on LUCENE-5372:
--------------------------------------------

Hi Joshua, thank you for the patch here, but we are in the process of simplifying the facet APIs (on this branch: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5339 ), and I believe most (all?) of the affected code here has been removed.

So I would hold off for now, or maybe check out the branch and see if any of these O(N^2) problems still remain. Thanks!
[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance
    [ https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852707#comment-13852707 ]

Dawid Weiss commented on LUCENE-5372:
-------------------------------------

Do you want me to hold it, Mike? Josh -- perhaps you could rebase your patch against the branch Mike pointed to (if there's anything left)?
[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance
    [ https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851447#comment-13851447 ]

Dawid Weiss commented on LUCENE-5372:
-------------------------------------

Looks good to me, and I think it's applicable to 4.x and 5.x (StringBuilder requires Java >= 1.5, but both of these branches do, right?).
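For readers who have not seen the patch, the quadratic behaviour named in the issue title can be sketched as follows. The class ToStringSketch, its two methods, and the "[1, 2, 3]" format are assumptions made for illustration; they are not the IntArray code from the facet module. The first method rebuilds the whole string on every iteration, so the total number of characters copied grows with the square of the element count; the second appends into one buffer.

```java
class ToStringSketch {

    // Repeated concatenation: every += allocates a new String and copies
    // everything built so far, so n elements cost O(n^2) character copies.
    static String concatToString(int[] data, int size) {
        String s = "[";
        for (int i = 0; i < size; i++) {
            if (i > 0) {
                s += ", ";
            }
            s += data[i];
        }
        return s + "]";
    }

    // Single StringBuilder: appends are amortized constant time per character,
    // so the whole method is linear in the length of the output.
    static String builderToString(int[] data, int size) {
        StringBuilder sb = new StringBuilder();
        sb.append('[');
        for (int i = 0; i < size; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(data[i]);
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        int[] data = {4, 8, 15, 16, 23, 42};
        System.out.println(concatToString(data, data.length));
        System.out.println(builderToString(data, data.length));
    }
}
```

Both methods produce the same text; only the cost differs, which is why the change is safe to make in a toString implementation.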