Using the 2_3_2 git version with StandardAnalyzer: it looks like if I
index document1 and then document2, at least some of document1's tokens
are indexed for document2 as well. If I then index document3, it picks
up document2's tokens too, but not document1's. The attached patch
modifies a test case to show this. The test works with SimpleAnalyzer
but breaks with StandardAnalyzer. The StandardAnalyzer.cpp code itself
looks rather simple. Any ideas where the bug could be?
diff --git a/src/test/index/TestIndexWriter.cpp b/src/test/index/TestIndexWriter.cpp
index 94b9990..5f243c8 100644
--- a/src/test/index/TestIndexWriter.cpp
+++ b/src/test/index/TestIndexWriter.cpp
@@ -12,7 +12,7 @@
void testIWmergePhraseSegments(CuTest *tc){
char fsdir[CL_MAX_PATH];
_snprintf(fsdir, CL_MAX_PATH, "%s/%s",cl_tempDir, "test.indexwriter");
- SimpleAnalyzer a;
+ StandardAnalyzer a;
Directory* dir = FSDirectory::getDirectory(fsdir);
IndexWriter ndx2(dir,&a,true);
@@ -35,7 +35,7 @@ void testIWmergePhraseSegments(CuTest *tc){
doc1.add(
*_CLNEW Field(
_T("field0"),
- _T("value1 value0"),
+ _T("value2 value3"),
Field::STORE_YES | Field::INDEX_TOKENIZED
)
);
@@ -46,12 +46,12 @@ void testIWmergePhraseSegments(CuTest *tc){
//test the index querying
IndexSearcher searcher(fsdir);
Query* query0 = QueryParser::parse(
- _T("\"value0 value1\""),
+ _T("value0"),
_T("field0"),
&a
);
Hits* hits0 = searcher.search(query0);
- CLUCENE_ASSERT(hits0->length() > 0);
+ CLUCENE_ASSERT(hits0->length() != 1);
Query* query1 = QueryParser::parse(
_T("\"value1 value0\""),
_T("field0"),
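
For illustration only, here is a minimal standalone sketch of the kind of state leak that would produce the pattern described at the top of this mail (each document picks up tokens from the document indexed immediately before it, but not from earlier ones). This is not CLucene code and is not a diagnosis: `LeakyAnalyzer`, `splitWords`, `contains` and `reproduceSymptom` are hypothetical names, and the assumption is simply that some per-document buffer is reused without being cleared.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

// Whitespace split; a stand-in for real tokenization.
static Tokens splitWords(const std::string& text) {
    std::istringstream in(text);
    Tokens out;
    std::string word;
    while (in >> word) out.push_back(word);
    return out;
}

// Hypothetical analyzer (NOT CLucene's StandardAnalyzer) that keeps the
// previous document's tokens around and emits them again on the next call.
class LeakyAnalyzer {
public:
    Tokens analyze(const std::string& text) {
        Tokens fresh = splitWords(text);
        Tokens result = previous_;  // stale tokens from the last document
        result.insert(result.end(), fresh.begin(), fresh.end());
        previous_ = fresh;          // only the latest document is retained
        return result;
    }

private:
    Tokens previous_;
};

static bool contains(const Tokens& tokens, const std::string& word) {
    return std::find(tokens.begin(), tokens.end(), word) != tokens.end();
}

// Runs three documents through one analyzer instance and checks that the
// leak reaches exactly one document forward.
void reproduceSymptom() {
    LeakyAnalyzer analyzer;
    Tokens doc1 = analyzer.analyze("value0 value1");
    Tokens doc2 = analyzer.analyze("value2 value3");
    Tokens doc3 = analyzer.analyze("value4 value5");

    assert(!contains(doc1, "value2"));  // first document is clean
    assert(contains(doc2, "value0"));   // document1 leaked into document2
    assert(contains(doc3, "value2"));   // document2 leaked into document3
    assert(!contains(doc3, "value0"));  // document1 did not reach document3
}
```

If the real cause has this shape, analyzing each document with a freshly constructed analyzer should make the extra tokens disappear, which might be a quick way to narrow down where the state is kept.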
_______________________________________________
CLucene-developers mailing list
CLucene-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/clucene-developers