Sorry for the duplicated mail :-(. Any advice on a configuration for searching for email addresses in a field that does not contain only email addresses, i.e. where the addresses are embedded in larger text messages?
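
A possible starting point, as an untested sketch rather than a known-good setup: copy the page text into a second field whose tokenizer keeps each address as one token, and index reversed tokens so that a leading wildcard such as *@gmail.com can be used. The names "text_email" and "content_email" are made up for illustration, and "content" is only assumed to be the name of the field that holds the extracted page text; the ReversedWildcardFilterFactory attribute values are the ones used in the Solr example schema.

```xml
<!-- Sketch only: "text_email" and "content_email" are hypothetical names,
     and "content" stands in for whatever field holds the page text. -->
<fieldType name="text_email" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Emits each email address (and URL) as a single token -->
    <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Also indexes reversed tokens, so a leading wildcard can be run as a prefix query -->
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
            maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Search-only copy of the page text, analyzed with the type above -->
<field name="content_email" type="text_email" indexed="true" stored="false"/>
<copyField source="content" dest="content_email"/>
```

With something like this in place, a query such as q=content_email:*@gmail.com would be the thing to try, and the analysis page of the admin UI should show whether an address really survives as a single token.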

----- Original message -----
From: "Ahmet Arslan" <iori...@yahoo.com>
To: solr-user@lucene.apache.org
Sent: Thursday, March 14, 2013 11:23:47
Subject: Re: Question about email search

Hi,

Since you have a word delimiter filter in your analysis chain, I am not sure that e-mail addresses are recognised as single tokens. You can check that on the analysis page of the Solr admin UI.

If e-mail addresses are kept as one token, I would use a leading wildcard query:

&q=*@gmail.com

There was a similar question recently: http://search-lucene.com/m/XF2ejnM6Vi2

--- On Thu, 3/14/13, Jorge Luis Betancourt Gonzalez <jlbetanco...@uci.cu> wrote:

> From: Jorge Luis Betancourt Gonzalez <jlbetanco...@uci.cu>
> Subject: Question about email search
> To: solr-user@lucene.apache.org
> Date: Thursday, March 14, 2013, 5:11 PM
>
> I'm using Solr 3.6.2 to index data crawled with Nutch. In my schema
> I have one field with all the content extracted from the page, which
> may include email addresses. This is the configuration in my schema:
>
> <fieldType name="text" class="solr.TextField"
>            positionIncrementGap="100" autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StandardFilterFactory"/>
>     <filter class="solr.ISOLatin1AccentFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1"
>             catenateWords="1" catenateNumbers="1" catenateAll="0"
>             splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> The thing is that I'm trying to search against a field of this type
> (text) with a value like "@gmail.com", and I expect to get the
> documents that contain that text. Any advice?
>
> Regards
> --
> "It is only in the mysterious equation of love that any
> logical reasons can be found."
> "Good programmers often confuse halloween (31 OCT) with
> christmas (25 DEC)"