Hi again, everyone. First of all, thank you all for the extremely helpful
replies so far.
Also, I just started reading the book "Lucene in Action" last night. So far
it's an awesome book, so a big thanks to the authors.

Anyhow, on to my question. As I've mentioned in several of my previous
messages, I am indexing different pieces of information about servers - in
particular, my question is about indexing the IP address and MAC address.

Using the StandardAnalyzer, an IP is kept as a single token ("192.168.1.100"),
and a MAC is broken up into one token per octet ("00", "17", "fd", "14",
"d3", "2a"). Many searches will be for partial IPs or MACs ("192.168",
"00:17:fd", etc.).
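
To make the two token shapes concrete, here is a minimal sketch in plain Java.
It does not call Lucene at all; the class AddressTokens and its method names
are hypothetical, made up for illustration. It only models what each scheme
stores and what a partial search has to compare against, assuming partial
searches always start at the beginning of the address.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: it models the two token shapes only.
class AddressTokens {
    // Single-token scheme: the whole address is one token, e.g. ["192.168.1.100"].
    static List<String> single(String address) {
        return Collections.singletonList(address.toLowerCase(Locale.ROOT));
    }

    // Per-octet scheme: split on '.', ':' or '-', e.g. ["00", "17", "fd", ...].
    static List<String> perOctet(String address) {
        return Arrays.asList(address.toLowerCase(Locale.ROOT).split("[.:-]"));
    }

    // Partial search against the single token: a plain string-prefix test.
    static boolean matchesPrefix(List<String> tokens, String partial) {
        String p = partial.toLowerCase(Locale.ROOT);
        for (String t : tokens) {
            if (t.startsWith(p)) {
                return true;
            }
        }
        return false;
    }

    // Partial search against per-octet tokens: the leading octets must match
    // exactly and in order.
    static boolean matchesLeadingOctets(List<String> tokens, String partial) {
        List<String> wanted = perOctet(partial);
        return wanted.size() <= tokens.size()
                && tokens.subList(0, wanted.size()).equals(wanted);
    }
}
```

One difference this shows: a prefix test on the single token accepts "192.16"
for "192.168.1.100", while the per-octet comparison only matches on whole
octets.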

Is either of these methods of indexing the addresses (single token vs.
per-octet tokens) more efficient than the other when indexing large
numbers of them?

-- 
Joe Attardi
[EMAIL PROTECTED]
http://thinksincode.blogspot.com/
