Mark,

This is simple enough that it should be easy to put together. If you search the ML archives you'll see that one of the common "tricks" is to "flip" the host name parts, so that www.sematext.com is indexed as com.sematext.www. The details have been discussed on the list before, so have a look there.
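As a rough illustration of the "flip" trick, here is a minimal sketch in plain Java with no Lucene dependency. The class name UrlTokens, the methods reverseHost and tokenize, and the example.com URL are all made up for this example, and the choice to also emit each host label and each path word as its own token is an assumption about what "match on any of the words in the domain or path" would need. In a real setup this logic would live inside an analyzer or be applied to the field value before indexing.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene.
class UrlTokens {

    // Reverse the dot-separated labels of a host name,
    // e.g. www.sematext.com -> com.sematext.www
    static String reverseHost(String host) {
        List<String> labels = Arrays.asList(host.toLowerCase(Locale.ROOT).split("\\."));
        Collections.reverse(labels);
        return String.join(".", labels);
    }

    // Emit the flipped host, then each host label, then each word in the path.
    static List<String> tokenize(String url) {
        URI uri = URI.create(url);
        List<String> tokens = new ArrayList<>();

        String host = uri.getHost();
        if (host != null) {
            tokens.add(reverseHost(host));
            for (String label : host.toLowerCase(Locale.ROOT).split("\\.")) {
                tokens.add(label);
            }
        }

        String path = uri.getPath();
        if (path != null) {
            // Split on anything that is not a letter or digit; skip empty pieces.
            for (String word : path.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                if (!word.isEmpty()) {
                    tokens.add(word);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(reverseHost("www.sematext.com")); // prints com.sematext.www
        System.out.println(tokenize("http://www.example.com/docs/url-analyzer.html"));
    }
}
```

The point of flipping is that the most general part of the host comes first, so a prefix query such as com.sematext.* should match every host under that domain, provided the flipped host is indexed as a single untokenized term.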
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Mark Ferguson <mark.a.fergu...@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Friday, December 19, 2008 4:28:10 PM
> Subject: Url Analyzer
>
> Hello,
>
> I was wondering if there had been any work done out there on an analyzer for
> URL strings. I'm looking for something which will match on any of the words
> in the domain or path of the URL. I am considering using a PatternAnalyzer
> but I wanted to ask this group to see if this was something which has been
> discussed here before. Thanks very much in advance,
>
> Mark Ferguson