You could use a PatternReplaceCharFilter before your tokenizer to replace
the dot with a space character.
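
As a rough sketch of what I mean, based on the field type from your mail: the charFilter goes first inside each analyzer, so it runs before the tokenizer sees the text. The pattern and replacement values below are my assumption of what fits your "Co.Ltd" example, not something I have tested against your data. Note that this pattern replaces every dot, including those in decimal numbers and domain names, so a narrower regex may be needed if such values must stay intact.

```xml
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Assumed pattern: replace each literal dot with a space before tokenizing -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\." replacement=" "/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- Same char filter at query time so that query text is split the same way -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\." replacement=" "/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Existing documents would need to be reindexed for the index-time change to take effect, and the Analysis screen in the Solr admin UI is a quick way to check how "Co.Ltd" is tokenized afterwards.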

Derek Poh <d...@globalsources.com> wrote on Wed, 12 Oct 2016 at 11:38:

> It seems LetterTokenizerFactory also splits on and discards numbers. The
> field does have values with numbers in them, so it is not applicable.
> Thank you.
>
>
> On 10/12/2016 4:22 PM, Dheerendra Kulkarni wrote:
> > You can use LetterTokenizerFactory instead.
> >
> > Regards,
> > Dheerendra Kulkarni
> >
> > On Wed, Oct 12, 2016 at 6:24 AM, Derek Poh <d...@globalsources.com> wrote:
> >
> >> Hi
> >>
> >> How can I split words with a period in between into separate tokens?
> >> E.g. "Co.Ltd" => "Co" "Ltd".
> >>
> >> I am using StandardTokenizerFactory, and it does not split on such
> >> periods: periods (dots) that are not followed by whitespace are kept as
> >> part of the token, including Internet domain names.
> >>
> >> This is the field definition,
> >>
> >> <fieldType name="text_general" class="solr.TextField"
> >> positionIncrementGap="100">
> >>        <analyzer type="index">
> >>          <tokenizer class="solr.StandardTokenizerFactory"/>
> >>          <filter class="solr.StopFilterFactory" ignoreCase="true"
> >> words="stopwords.txt" />
> >>          <filter class="solr.LowerCaseFilterFactory"/>
> >>        </analyzer>
> >>        <analyzer type="query">
> >>          <tokenizer class="solr.StandardTokenizerFactory"/>
> >>          <filter class="solr.StopFilterFactory" ignoreCase="true"
> >> words="stopwords.txt" />
> >>          <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >> ignoreCase="true" expand="true"/>
> >>          <filter class="solr.LowerCaseFilterFactory"/>
> >>        </analyzer>
> >> </fieldType>
> >>
> >> Solr version is 10.4.10.
> >>
> >> Derek
> >>
> >> ----------------------
> >> CONFIDENTIALITY NOTICE
> >> This e-mail (including any attachments) may contain confidential and/or
> >> privileged information. If you are not the intended recipient or have
> >> received this e-mail in error, please inform the sender immediately and
> >> delete this e-mail (including any attachments) from your computer, and
> >> you must not use, disclose to anyone else or copy this e-mail (including
> >> any attachments), whether in whole or in part.
> >> This e-mail and any reply to it may be monitored for security, legal,
> >> regulatory compliance and/or other appropriate reasons.
> >
> >
> >
>
