[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance

2013-12-26 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857098#comment-13857098
 ] 

Dawid Weiss commented on LUCENE-5372:
-

I wanted to apply these patches but then looked deeper and it seems we can't 
just apply them without some consideration. For example, while reviewing, I 
noticed things such as this one:
{code}
- * All use of StringBuffers has been refactored to StringBuilder for speed.
+ * All use of StringBuilders has been refactored to StringBuilder for speed.
{code}
which seems to be an auto-replacement artifact. While this one is harmless, the 
following may be a problem:
{code}
+++ b/lucene/core/src/java/org/apache/lucene/analysis/tokenattributes/CharTermAttributeImpl.java
@@ -144,8 +144,8 @@ public class CharTermAttributeImpl extends AttributeImpl implements CharTermAttr
     } else if (csq instanceof CharBuffer && ((CharBuffer) csq).hasArray()) {
       final CharBuffer cb = (CharBuffer) csq;
       System.arraycopy(cb.array(), cb.arrayOffset() + cb.position() + start, termBuffer, termLength, len);
-    } else if (csq instanceof StringBuffer) {
-      ((StringBuffer) csq).getChars(start, end, termBuffer, termLength);
+    } else if (csq instanceof StringBuilder) {
+      ((StringBuilder) csq).getChars(start, end, termBuffer, termLength);
{code}
but CharTermAttributeImpl already has an if clause for StringBuilder; the full 
code is:
{code}
    if (csq instanceof String) {
      ((String) csq).getChars(start, end, termBuffer, termLength);
    } else if (csq instanceof StringBuilder) {
      ((StringBuilder) csq).getChars(start, end, termBuffer, termLength);
    } else if (csq instanceof CharTermAttribute) {
      System.arraycopy(((CharTermAttribute) csq).buffer(), start, termBuffer, termLength, len);
    } else if (csq instanceof CharBuffer && ((CharBuffer) csq).hasArray()) {
      final CharBuffer cb = (CharBuffer) csq;
      System.arraycopy(cb.array(), cb.arrayOffset() + cb.position() + start, termBuffer, termLength, len);
    } else if (csq instanceof StringBuffer) {
      ((StringBuffer) csq).getChars(start, end, termBuffer, termLength);
    } else {
      while (start < end)
        termBuffer[termLength++] = csq.charAt(start++);
      // no fall-through here, as termLength is updated!
      return this;
    }
{code}
I would actually leave it for Uwe to modify the API checker rules and then 
hand-pick the offenders. Your contribution won't be lost, Joshua, it'll just go in 
via another route (not as a direct patch, rather as a good suggestion :).
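
To make the concern concrete, here is a minimal sketch of what the chain would look like after the auto-replacement. It is not the Lucene code: the class and method names are hypothetical, the CharTermAttribute branch is left out so the example compiles without Lucene, and each branch only reports its name instead of copying characters.

```java
import java.nio.CharBuffer;

// Hypothetical stand-in for the instanceof chain quoted above, as it would
// look AFTER the StringBuffer -> StringBuilder auto-replacement.
final class AppendDispatchSketch {
    // Returns the name of the branch that would handle the given sequence.
    static String patchedBranch(CharSequence csq) {
        if (csq instanceof String) {
            return "String";
        } else if (csq instanceof StringBuilder) {
            return "StringBuilder";
        } else if (csq instanceof CharBuffer && ((CharBuffer) csq).hasArray()) {
            return "CharBuffer";
        } else if (csq instanceof StringBuilder) {
            // Formerly the StringBuffer branch; now a duplicate that can never
            // be reached because the earlier StringBuilder test wins.
            return "StringBuilder (duplicate)";
        } else {
            // Generic char-by-char fallback.
            return "charAt loop";
        }
    }
}
```

Under these assumptions a StringBuffer argument still works, but it silently drops to the generic charAt loop, and the rewritten branch becomes dead code. That seems to be why a blind search-and-replace is not enough here.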

 IntArray toString has O(n^2) performance
 

 Key: LUCENE-5372
 URL: https://issues.apache.org/jira/browse/LUCENE-5372
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Joshua Hartman
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 5.0, 4.7

 Attachments: 5372-lucene5339.patch, 5372-v2.patch, 5372.patch, 
 LUCENE-5372-forbidden.patch


 This is pretty minor, but I found a few issues with the toString 
 implementations while looking through the facet data structures.
 The most egregious is the use of string concatenation in the IntArray class. 
 I have fixed that using StringBuilders. I also noticed that other classes 
 were using StringBuffer instead of StringBuilder. According to the javadoc,
 "This class is designed for use as a drop-in replacement for StringBuffer in 
 places where the string buffer was being used by a single thread (as is 
 generally the case). Where possible, it is recommended that this class be 
 used in preference to StringBuffer as it will be faster under most 
 implementations."
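
For readers unfamiliar with the problem, the difference can be sketched as follows. This is an illustration only, not the actual IntArray code: the class name, method names, and output format are assumptions.

```java
// Hypothetical illustration of the two toString strategies discussed in the issue.
final class ToStringSketch {
    // Quadratic: every += allocates a new String and copies everything
    // built so far, so n elements cost O(n^2) character copies in total.
    static String concatToString(int[] data, int size) {
        String s = "[";
        for (int i = 0; i < size; i++) {
            if (i > 0) {
                s += ", ";
            }
            s += data[i];
        }
        return s + "]";
    }

    // Linear (amortized): StringBuilder appends into a growable buffer
    // and only materializes the String once at the end.
    static String builderToString(int[] data, int size) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < size; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(data[i]);
        }
        return sb.append(']').toString();
    }
}
```

Both methods produce the same text; only the amount of copying differs, which is why the change is safe to make in a toString implementation.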



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance

2013-12-19 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852717#comment-13852717
 ] 

Uwe Schindler commented on LUCENE-5372:
---

Oh oh, we should put StringBuffer on the forbidden-apis list. It is one entry 
in our base.txt signatures file (please don't add a new one).

We already replaced all StringBuffers earlier, when we moved to Lucene 3.0 
(the first release for Java 5), so we can disallow StringBuffer everywhere!




[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance

2013-12-19 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852728#comment-13852728
 ] 

Uwe Schindler commented on LUCENE-5372:
---

Should we put the remaining StringBuffers on a separate issue or fix it here?




[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance

2013-12-19 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852727#comment-13852727
 ] 

Uwe Schindler commented on LUCENE-5372:
---

This one is wanted, but mostly obsolete; maybe put it on the exclusion list:

{noformat}
[forbidden-apis] Forbidden class/interface use: java.lang.StringBuffer [Use StringBuilder instead, which has no synchronization]
[forbidden-apis]   in org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl (CharTermAttributeImpl.java:148)
{noformat}




[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance

2013-12-19 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852732#comment-13852732
 ] 

Dawid Weiss commented on LUCENE-5372:
-

I think we can fix it here since Josh brought that up.




[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance

2013-12-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852836#comment-13852836
 ] 

Michael McCandless commented on LUCENE-5372:


bq. Do you want me to hold it, Mike?

Yes, please!

+1 to fix StringBuffers here!  Thanks, Joshua.




[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance

2013-12-19 Thread Joshua Hartman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13853260#comment-13853260
 ] 

Joshua Hartman commented on LUCENE-5372:


I'll make the StringBuffer -> StringBuilder fix on trunk and on the branch 
mentioned by Mike, and attach new patches for each. Expect the patches this 
evening or tomorrow.




[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance

2013-12-18 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851794#comment-13851794
 ] 

Mark Miller commented on LUCENE-5372:
-

Yes, 4x=6, 5x=7




[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance

2013-12-18 Thread Joshua Hartman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852199#comment-13852199
 ] 

Joshua Hartman commented on LUCENE-5372:


I can also optimize memory usage by precalculating the maximum size in advance 
for the StringBuilder for each collection. May be overkill - what are your 
thoughts? I am new here.
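
As a rough sketch of what presizing could look like (the class name, method name, and output format are hypothetical, not taken from the patch):

```java
// Hypothetical sketch: reserve a worst-case capacity up front so the
// StringBuilder never has to grow while the elements are appended.
final class PresizedToStringSketch {
    // Longest int is "-2147483648" (11 chars) plus a ", " separator (2 chars).
    private static final int MAX_CHARS_PER_INT = 13;

    static String render(int[] data, int size) {
        // Compute the bound in long arithmetic and clamp it, so very large
        // sizes cannot overflow into a negative capacity.
        long bound = 2L + (long) size * MAX_CHARS_PER_INT;
        StringBuilder sb = new StringBuilder((int) Math.min(bound, Integer.MAX_VALUE - 8));
        sb.append('[');
        for (int i = 0; i < size; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(data[i]);
        }
        return sb.append(']').toString();
    }
}
```

The trade-off is that the worst-case bound can over-allocate substantially for small values, while an unsized StringBuilder already grows in amortized linear time.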

http://wiki.apache.org/lucene-java/HowToContribute implies I should wait for 
the patch to be pulled in by a lucene dev. Is this a correct interpretation?




[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance

2013-12-18 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852214#comment-13852214
 ] 

Dawid Weiss commented on LUCENE-5372:
-

No need to be paranoid about performance here, Josh. The patch is fine, I'll 
apply it, although tomorrow because it's gotten really late over here.




[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance

2013-12-18 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852216#comment-13852216
 ] 

Michael McCandless commented on LUCENE-5372:


Hi Joshua, thank you for the patch here, but we are in the process of 
simplifying the facet APIs (on this branch: 
https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5339 ), and I 
believe most (all?) of the affected code here has been removed.  So I would 
hold off for now, or maybe check out the branch and see if any of these O(N^2) 
problems still remain.

Thanks!




[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance

2013-12-18 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852707#comment-13852707
 ] 

Dawid Weiss commented on LUCENE-5372:
-

Do you want me to hold it, Mike? Josh -- perhaps you could rebase your patch 
against the branch Mike pointed to (if there's anything left)?




[jira] [Commented] (LUCENE-5372) IntArray toString has O(n^2) performance

2013-12-17 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851447#comment-13851447
 ] 

Dawid Weiss commented on LUCENE-5372:
-

Looks good to me and I think it's applicable to 4.x and 5.x (StringBuilder 
requires Java >= 1.5, but both of these branches do, right?).
