RE: can't find common words -- using Lucene 3.4.0

2012-03-28 Thread Ilya Zavorin
)); IndexWriter writer = new IndexWriter(dir, iwc); Anything suspicious here? Thanks Ilya Zavorin -Original Message- From: Steven A Rowe [mailto:sar...@syr.edu] Sent: Monday, March 26, 2012 1:48 PM To: java-user@lucene.apache.org Subject: RE: can't find common

RE: can't find common words -- using Lucene 3.4.0

2012-03-26 Thread Steven A Rowe
On 3/26/2012 at 12:21 PM, Ilya Zavorin wrote: > I am not seeing anything suspicious. Here's what I see in the HEX: > > "n.e" from "pain.electricity": 6E-2E-0D-0A-0D-0A-65 > (n-.-CR-LF-CR-LF-e) "e.H" from "sentence.He": 65-2E-0D-0A-48 I agree, standard DOS/Windows line endings. > I am pretty sure
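
The byte sequences quoted above can be reproduced with a small stand-alone check. This is only a sketch, not code from the thread: the class name HexDump and the helper toHex are hypothetical, and it assumes the text is plain ASCII. It prints each byte as a hex pair so that CR (0D) and LF (0A) are visible between the period and the next word.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Lucene or of the original poster's app.
class HexDump {
    // Formats bytes as upper-case hex pairs joined by '-', e.g. 6E-2E-0D-0A.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append('-');
            }
            // Mask to 0..255 so negative byte values are not sign-extended.
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "n.e" from "pain.electricity" with two DOS/Windows line endings in between.
        String twoBreaks = "n.\r\n\r\ne";
        // "e.H" from "sentence.He" with a single DOS/Windows line ending.
        String oneBreak = "e.\r\nH";
        System.out.println(toHex(twoBreaks.getBytes(StandardCharsets.US_ASCII)));
        System.out.println(toHex(oneBreak.getBytes(StandardCharsets.US_ASCII)));
    }
}
```

A 0D-0A pair after the 2E (period) indicates a DOS/Windows line ending, while a bare 0A would indicate a Unix one.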

RE: can't find common words -- using Lucene 3.4.0

2012-03-26 Thread Ilya Zavorin
anks, Ilya -Original Message- From: Steven A Rowe [mailto:sar...@syr.edu] Sent: Monday, March 26, 2012 11:41 AM To: java-user@lucene.apache.org Subject: RE: can't find common words -- using Lucene 3.4.0 Ilya, StandardAnalyzer treats all forms of newline as whitespace,

RE: can't find common words -- using Lucene 3.4.0

2012-03-26 Thread Steven A Rowe
orin [mailto:izavo...@caci.com] Sent: Monday, March 26, 2012 11:21 AM To: java-user@lucene.apache.org Subject: RE: can't find common words -- using Lucene 3.4.0 Steve, Thanks much for the link: very useful! I looked at the index and found that it contains terms like electricitythis -- from D

RE: can't find common words -- using Lucene 3.4.0

2012-03-26 Thread Ilya Zavorin
analyzers for respective foreign texts Thanks, Ilya -Original Message- From: Steven A Rowe [mailto:sar...@syr.edu] Sent: Monday, March 26, 2012 10:59 AM To: java-user@lucene.apache.org Subject: RE: can't find common words -- using Lucene 3.4.0 Hi Ilya, What analyzers are you

RE: can't find common words -- using Lucene 3.4.0

2012-03-26 Thread Steven A Rowe
nce" will not match. Luke can tell you what's in your index: <http://code.google.com/p/luke/> Steve -Original Message- From: Ilya Zavorin [mailto:izavo...@caci.com] Sent: Monday, March 26, 2012 10:11 AM To: java-user@lucene.apache.org Subject: can't find common

can't find common words -- using Lucene 3.4.0

2012-03-26 Thread Ilya Zavorin
I am writing a Lucene-based indexing and search app and testing it using some simple docs and queries. I have 3 simple docs that are shown at the bottom of this email between pairs of "==="s and about a dozen terms. One of them is "electricity". As you can see, it appears in al