[jira] [Commented] (LUCENE-4656) Fix IndexWriter working together with EmptyTokenizer and EmptyTokenStream (without CharTermAttribute), fix BaseTokenStreamTestCase

2013-01-05 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544830#comment-13544830
 ] 

Commit Tag Bot commented on LUCENE-4656:


[branch_4x commit] Uwe Schindler
http://svn.apache.org/viewvc?view=revision&revision=1428675

Merged revision(s) 1428671 from lucene/dev/trunk:
LUCENE-4656: Fix regression in IndexWriter to work with empty TokenStreams that 
have no TermToBytesRefAttribute (commonly provided by CharTermAttribute), e.g., 
oal.analysis.miscellaneous.EmptyTokenStream. Remove EmptyTokenizer from 
test-framework.


 Fix IndexWriter working together with EmptyTokenizer and EmptyTokenStream 
 (without CharTermAttribute), fix BaseTokenStreamTestCase
 --

 Key: LUCENE-4656
 URL: https://issues.apache.org/jira/browse/LUCENE-4656
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/analysis
Affects Versions: 4.0
Reporter: Adrien Grand
Assignee: Uwe Schindler
Priority: Trivial
 Fix For: 4.1, 5.0

 Attachments: LUCENE-4656_bttc.patch, LUCENE-4656-IW-bug.patch, 
 LUCENE-4656-IW-fix.patch, LUCENE-4656-IW-fix.patch, LUCENE-4656.patch, 
 LUCENE-4656.patch, LUCENE-4656.patch, LUCENE-4656.patch, LUCENE-4656.patch, 
 LUCENE-4656.patch


 TestRandomChains can fail because EmptyTokenizer doesn't have a 
 CharTermAttribute and doesn't compute the end offset (if the offset attribute 
 was added by a filter).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4656) Fix IndexWriter working together with EmptyTokenizer and EmptyTokenStream (without CharTermAttribute), fix BaseTokenStreamTestCase

2013-01-05 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544860#comment-13544860
 ] 

Commit Tag Bot commented on LUCENE-4656:


[trunk commit] Uwe Schindler
http://svn.apache.org/viewvc?view=revision&revision=1428671

LUCENE-4656: Fix regression in IndexWriter to work with empty TokenStreams that 
have no TermToBytesRefAttribute (commonly provided by CharTermAttribute), e.g., 
oal.analysis.miscellaneous.EmptyTokenStream. Remove EmptyTokenizer from 
test-framework.





[jira] [Commented] (LUCENE-4656) Fix IndexWriter working together with EmptyTokenizer and EmptyTokenStream (without CharTermAttribute), fix BaseTokenStreamTestCase

2013-01-03 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543125#comment-13543125
 ] 

Robert Muir commented on LUCENE-4656:
-

I would also say that we don't need EmptyTokenizer in test-framework. 
It's only there because two places use it, and both in a bogus way (in my opinion):
1. core/TestDocument
2. queryparsers

We should first fix TestDocument; its test does not care whether the tokenstream is 
empty or anything:
{noformat}
Index: src/test/org/apache/lucene/document/TestDocument.java
===================================================================
--- src/test/org/apache/lucene/document/TestDocument.java   (revision 1428441)
+++ src/test/org/apache/lucene/document/TestDocument.java   (working copy)
@@ -20,7 +20,7 @@
 import java.io.StringReader;
 import java.util.List;
 
-import org.apache.lucene.analysis.EmptyTokenizer;
+import org.apache.lucene.analysis.CannedTokenStream;
 import org.apache.lucene.analysis.MockAnalyzer;
 import org.apache.lucene.index.DirectoryReader;
 import org.apache.lucene.index.IndexReader;
@@ -318,7 +318,7 @@
   // LUCENE-3616
   public void testInvalidFields() {
     try {
-      new Field("foo", new EmptyTokenizer(new StringReader("")), StringField.TYPE_STORED);
+      new Field("foo", new CannedTokenStream(), StringField.TYPE_STORED);
       fail("did not hit expected exc");
     } catch (IllegalArgumentException iae) {
       // expected
{noformat}

The queryparser test looks outdated, as if it's some test about when an Analyzer 
returns null?
Maybe the test can just be removed, but if we apply this patch, we could at least 
move EmptyTokenizer from test-framework/src/java to queryparser/src/test as an 
improvement, since it is kinda funky.





[jira] [Commented] (LUCENE-4656) Fix IndexWriter working together with EmptyTokenizer and EmptyTokenStream (without CharTermAttribute), fix BaseTokenStreamTestCase

2013-01-03 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543127#comment-13543127
 ] 

Uwe Schindler commented on LUCENE-4656:
---

I would really like to remove that horrible piece of sh* :-)




[jira] [Commented] (LUCENE-4656) Fix IndexWriter working together with EmptyTokenizer and EmptyTokenStream (without CharTermAttribute), fix BaseTokenStreamTestCase

2013-01-03 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543133#comment-13543133
 ] 

Adrien Grand commented on LUCENE-4656:
--

Uwe, I just ran all Lucene tests with your patch and they passed, so +1. +1 to 
removing EmptyTokenizer too.




[jira] [Commented] (LUCENE-4656) Fix IndexWriter working together with EmptyTokenizer and EmptyTokenStream (without CharTermAttribute), fix BaseTokenStreamTestCase

2013-01-03 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543134#comment-13543134
 ] 

Uwe Schindler commented on LUCENE-4656:
---

In trunk it is not even used in core! Only in 4.x's TestDocument!




[jira] [Commented] (LUCENE-4656) Fix IndexWriter working together with EmptyTokenizer and EmptyTokenStream (without CharTermAttribute), fix BaseTokenStreamTestCase

2013-01-03 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543144#comment-13543144
 ] 

Uwe Schindler commented on LUCENE-4656:
---

Sorry, that was the wrong patch.




[jira] [Commented] (LUCENE-4656) Fix IndexWriter working together with EmptyTokenizer and EmptyTokenStream (without CharTermAttribute), fix BaseTokenStreamTestCase

2013-01-03 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543161#comment-13543161
 ] 

Uwe Schindler commented on LUCENE-4656:
---

I also cleaned up some imports in affected files. I will commit this later and 
backport.
