Re: Question about email search

Jorge Luis Betancourt Gonzalez Thu, 14 Mar 2013 11:34:03 -0700

Sorry for the duplicate mail :-(. Any advice on a configuration for searching 
for email addresses in a field that contains more than just addresses, i.e. 
where the addresses are embedded in larger blocks of text?


----- Original Message -----
From: "Ahmet Arslan" <iori...@yahoo.com>
To: solr-user@lucene.apache.org
Sent: Thursday, March 14, 2013 11:23:47
Subject: Re: Question about email search

Hi,

Since you have a word delimiter filter in your analysis chain, I am not sure 
e-mail addresses are kept intact as tokens. You can check that on the analysis 
page of the Solr admin UI.

If e-mail addresses are kept as a single token, I would use a leading wildcard 
query:
&q=*@gmail.com
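
As a rough sketch of what "kept as a single token" could look like in a schema: the field type below uses solr.UAX29URLEmailTokenizerFactory, which emits e-mail addresses and URLs as single tokens, and solr.ReversedWildcardFilterFactory, which also indexes a reversed copy of each token so that a leading wildcard does not have to scan every term. The names "text_email", "content" and "content_email" are assumptions made up for illustration, the filter parameters are the ones used in the stock example schema, and none of this has been tested against the schema quoted below.

```xml
<!-- Hypothetical field type; "text_email" is an illustrative name, not from this thread -->
<fieldType name="text_email" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Emits e-mail addresses and URLs as single tokens instead of splitting them -->
    <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Indexes each token together with a reversed copy to support leading wildcards -->
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
            maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field names: copy the page text into a separate field of the new type -->
<field name="content_email" type="text_email" indexed="true" stored="false"/>
<copyField source="content" dest="content_email"/>
```

With something like this in place, the query would target the new field, e.g. &q=content_email:*@gmail.com, while the existing stemmed "text" field stays available for ordinary full-text search. The analysis page is still the place to confirm that an address really comes out as one token.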

There was a similar question recently:
http://search-lucene.com/m/XF2ejnM6Vi2

--- On Thu, 3/14/13, Jorge Luis Betancourt Gonzalez <jlbetanco...@uci.cu> wrote:

> From: Jorge Luis Betancourt Gonzalez <jlbetanco...@uci.cu>
> Subject: Question about email search
> To: solr-user@lucene.apache.org
> Date: Thursday, March 14, 2013, 5:11 PM
> I'm using Solr 3.6.2 to index data crawled with Nutch. In my
> schema I have one field with all the content extracted from
> the page, which may include email addresses. This is the
> configuration of that field's type in my schema:
>
>         <fieldType name="text" class="solr.TextField"
>                    positionIncrementGap="100" autoGeneratePhraseQueries="true">
>             <analyzer type="index">
>                 <tokenizer class="solr.StandardTokenizerFactory"/>
>                 <filter class="solr.StandardFilterFactory"/>
>                 <filter class="solr.ISOLatin1AccentFilterFactory"/>
>                 <filter class="solr.SnowballPorterFilterFactory" languange="Spanish"/>
>                 <charFilter class="solr.HTMLStripCharFilterFactory"/>
>                 <filter class="solr.StopFilterFactory"
>                         ignoreCase="true" words="stopwords.txt"/>
>                 <filter class="solr.WordDelimiterFilterFactory"
>                         generateWordParts="1" generateNumberParts="1"
>                         catenateWords="1" catenateNumbers="1" catenateAll="0"
>                         splitOnCaseChange="1"/>
>                 <filter class="solr.LowerCaseFilterFactory"/>
>                 <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>             </analyzer>
>         </fieldType>
>
> The thing is that I'm trying to search against a field of
> this type (text) with a value like "@gmail.com", and I
> expect to get the documents containing that text. Any advice?
>
> Regards
> --
> "It is only in the mysterious equation of love that any
> logical reasons can be found."
> "Good programmers often confuse halloween (31 OCT) with
> christmas (25 DEC)"
>
>
