Hi,
I am trying to index information in some proprietary-formatted files.
In particular, these files contain some IP addresses in dotted notation, e.g.,
aa.bb.cc.dd.
For my initial test, I have a Document implementation, and after I extract what
I need into a String named "Info", I do:
doc.add(new Field("contents", Info, Field.Store.YES, Field.Index.ANALYZED));
From looking at the resulting index using Luke, it appears that I am getting
terms for the full IP address string (e.g., "aa.bb.cc.dd"), but I am also
getting terms for each octet of each IP address string, e.g.:
aa
bb
cc
dd
I'm still just getting started with Lucene, but from the research that I've
done, it seems like Lucene is treating the "." in the dotted notation strings
as "noise". Is that correct?
If so, is there a way to get it not to do that?
Thanks,
Jim