));
IndexWriter writer = new IndexWriter(dir, iwc);
Anything suspicious here?
Thanks
Ilya Zavorin
-Original Message-
From: Steven A Rowe [mailto:sar...@syr.edu]
Sent: Monday, March 26, 2012 1:48 PM
To: java-user@lucene.apache.org
Subject: RE: can't find common
On 3/26/2012 at 12:21 PM, Ilya Zavorin wrote:
> I am not seeing anything suspicious. Here's what I see in the HEX:
>
> "n.e" from "pain.electricity": 6E-2E-0D-0A-0D-0A-65 (n-.-CR-LF-CR-LF-e)
> "e.H" from "sentence.He": 65-2E-0D-0A-48
I agree, standard DOS/Windows line endings.
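
For anyone who wants to repeat the check quoted above without a hex editor, here is a minimal sketch in plain Java (no Lucene needed). The class name HexDump and the helper toHex are made up for illustration and are not part of any library; the two sample strings are rebuilt from the byte sequences quoted above and assume UTF-8 encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only: dumps the bytes of a short
// string so that line endings (0D = CR, 0A = LF) become visible.
class HexDump {

    // Renders each byte of the UTF-8 encoding as two uppercase hex digits,
    // joined by '-'.
    static String toHex(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append('-');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "n.e" from "pain.electricity": two CRLF pairs between '.' and 'e'
        System.out.println(toHex("n.\r\n\r\ne")); // 6E-2E-0D-0A-0D-0A-65
        // "e.H" from "sentence.He": one CRLF pair between '.' and 'H'
        System.out.println(toHex("e.\r\nH"));     // 65-2E-0D-0A-48
    }
}
```

If the dump of the real file shows 0D-0A between the period and the next word, the line endings themselves are ordinary DOS/Windows ones, as noted above.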
> I am pretty sure
Thanks,
Ilya
-Original Message-
From: Steven A Rowe [mailto:sar...@syr.edu]
Sent: Monday, March 26, 2012 11:41 AM
To: java-user@lucene.apache.org
Subject: RE: can't find common words -- using Lucene 3.4.0
Ilya,
StandardAnalyzer treats all forms of newline as whitespace,
From: Ilya Zavorin [mailto:izavo...@caci.com]
Sent: Monday, March 26, 2012 11:21 AM
To: java-user@lucene.apache.org
Subject: RE: can't find common words -- using Lucene 3.4.0
Steve,
Thanks much for the link: very useful!
I looked at the index and found that it contains terms like
electricitythis -- from D
analyzers for respective foreign texts
Thanks,
Ilya
-Original Message-
From: Steven A Rowe [mailto:sar...@syr.edu]
Sent: Monday, March 26, 2012 10:59 AM
To: java-user@lucene.apache.org
Subject: RE: can't find common words -- using Lucene 3.4.0
Hi Ilya,
What analyzers are you
nce" will not match.
Luke can tell you what's in your index: <http://code.google.com/p/luke/>
Steve
-Original Message-
From: Ilya Zavorin [mailto:izavo...@caci.com]
Sent: Monday, March 26, 2012 10:11 AM
To: java-user@lucene.apache.org
Subject: can't find common
I am writing a Lucene-based indexing and search app and testing it using some
simple docs and queries. I have 3 simple docs that are shown at the bottom of
this email between pairs of "==="s, and about a dozen terms.
One of them is "electricity". As you can see, it appears in al