Re: custome index rule

Ian Lea Mon, 24 Oct 2011 03:01:37 -0700

You can achieve pretty much anything by customizing parsers and
tokenizers but for your simple case I'd just use String.split() and
add the phrases one by one.  Something like


Document d = ...
String[] phrases = sentence.split(",");
for (String phrase : phrases) {
  // one "phrase" field per comma-separated piece
  d.add(new Field("phrase", phrase, ...));
}

I think that would achieve what you want.
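
The splitting step on its own can be tried without Lucene on the classpath. The sketch below is only an illustration: the class name PhraseSplitter and the helper splitPhrases are made up for this example, and trimming whitespace and dropping empty pieces are assumptions that go beyond what was asked.

```java
import java.util.ArrayList;
import java.util.List;

final class PhraseSplitter {
    // Splits on commas, trims surrounding whitespace, and drops empty pieces.
    // Trimming and dropping empties are assumptions, not part of the original rule.
    static List<String> splitPhrases(String sentence) {
        List<String> phrases = new ArrayList<String>();
        for (String piece : sentence.split(",")) {
            String phrase = piece.trim();
            if (!phrase.isEmpty()) {
                phrases.add(phrase);
            }
        }
        return phrases;
    }

    public static void main(String[] args) {
        for (String phrase : splitPhrases("I am in China,I am in USA,I am in UK")) {
            // In the indexing code, each phrase would be added to the
            // Document as its own "phrase" field at this point.
            System.out.println(phrase);
        }
    }
}
```

Each returned phrase would then be added as its own field in the loop shown earlier.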


On special characters, see
http://lucene.apache.org/java/3_4_0/queryparsersyntax.html#Escaping
Special Characters and QueryParser.escape(String s).
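
To show what escaping means in practice, here is a rough stand-in written with plain Java only. It is not the Lucene implementation: the class name QueryEscapeSketch is made up, and the set of special characters is an assumption based on my reading of the syntax page above. In real code, call QueryParser.escape(String s) instead.

```java
final class QueryEscapeSketch {
    // Characters treated as query syntax. This set is an assumption taken
    // from the query parser syntax page, not copied from the Lucene source.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Puts a backslash in front of every character listed in SPECIAL.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a^b")); // prints a\^b
    }
}
```

The escaped string can then be handed to the query parser so that ^ is matched as a literal character rather than read as query syntax.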


--
Ian.

On Mon, Oct 24, 2011 at 10:12 AM, janwen <[email protected]> wrote:
> Hi,
>  I want to implement a custom index rule.
>  Assume a sentence like the following (note the commas):
>   I am in China,I am in USA,I am in UK
>
>  I would like Lucene to index the above sentence based on this rule:
> 1) Split the sentence on commas (,), so we get (I am in China)(I am in USA)(I am
> in UK)
> 2) Then Lucene just stores the short sentences from step 1, NOT_ANALYZED
>
> P.S. Which characters does Lucene not support, and how many are there?
> I input a^b  and get exception:
>  org.apache.lucene.queryParser.ParseException: Cannot parse 'a^b: Lexical 
> error at line 1, column 4.  Encountered: "\u671d" (26397), after : ""
>
> thanks
>
> 2011-10-24
>
>
>
> janwen | China
> website : http://www.qianpin.com/

