Hi again, everyone. First of all, I want to thank everyone for their extremely helpful replies so far. Also, I just started reading the book "Lucene in Action" last night. So far it's an awesome book, so a big thanks to the authors.
Anyhow, on to my question. As I've mentioned in several of my previous messages, I am indexing different pieces of information about servers. This question is about indexing the IP address and MAC address. Using the StandardAnalyzer, an IP is kept as a single token ("192.168.1.100"), while a MAC is broken up into one token per octet ("00", "17", "fd", "14", "d3", "2a"). Many searches will be for partial IPs or MACs ("192.168", "00:17:fd", etc). Is either of these methods of indexing the addresses (single token vs. per-octet tokens) more efficient than the other, for indexing or for searching, when there are large numbers of addresses?

--
Joe Attardi
http://thinksincode.blogspot.com/
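
P.S. To make the two cases concrete, here is a small sketch of what I mean by each tokenization and how a partial search would behave against it. This is plain Python, not Lucene code: the function names are made up for illustration, and it only models the tokens, not how Lucene actually stores or queries them. It also says nothing about which approach is faster.

```python
import re


def single_token(address):
    """Hypothetical model of the single-token case: the whole address is one term."""
    return [address.lower()]


def per_octet_tokens(address):
    """Hypothetical model of the per-octet case: split on '.' or ':'."""
    return [part for part in re.split(r"[.:]", address.lower()) if part]


def matches_prefix(query, address):
    """Single-token case: a partial address is a string prefix of the one term."""
    return single_token(address)[0].startswith(query.lower())


def matches_leading_octets(query, address):
    """Per-octet case: the query's octets must equal the address's first octets, in order."""
    wanted = per_octet_tokens(query)
    return per_octet_tokens(address)[:len(wanted)] == wanted


if __name__ == "__main__":
    print(single_token("192.168.1.100"))
    print(per_octet_tokens("00:17:fd:14:d3:2a"))
    # Both models accept a partial address that ends on an octet boundary.
    print(matches_prefix("192.168", "192.168.1.100"))
    print(matches_leading_octets("00:17:fd", "00:17:fd:14:d3:2a"))
    # They differ when the partial address stops in the middle of an octet.
    print(matches_prefix("192.16", "192.168.1.100"))
    print(matches_leading_octets("192.16", "192.168.1.100"))
```

One difference the sketch shows: with a single token, a search for "192.16" also matches "192.168.1.100", because it is a plain string prefix. With per-octet tokens it does not, because "16" and "168" are different terms. My assumption is that in Lucene the first case would look something like a prefix query on one term and the second something like a phrase-style query over consecutive terms, but I have not verified that.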