On Sat, Aug 20, 2011 at 7:00 PM, Robert Muir <rcm...@gmail.com> wrote:
> On Sat, Aug 20, 2011 at 3:34 AM, Trejkaz <trej...@trypticon.org> wrote:
>
>>
>> As an aside, Google's behaviour seems to follow the "old" way.  For
>> instance, [[ 限定 ]] returns 640,000,000 hits and [[ 限 定 ]] returns
>> 772,000,000.  (Interestingly, [[ "限定" ]] returns 643,000,000 hits.
>> Slightly more than you might expect.)
>>
>
> No, it doesn't. Try a query on 北京医科大学.
>
> You are confusing tokenization with query-generation itself: if you
> want 限定 to be treated as a compound then use a tokenizer that does
> this.

Nope.  I'm not confusing the two; I just haven't seen the source code
for Google, so I can't say at which level it does this.  Judging only
from the hit counts in my example, its behaviour looked pretty opaque.

That's a good example, though.
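
For anyone following along, here is a minimal sketch of the two levels
being discussed. It is plain Java, not Lucene's API: the class and
method names (CjkQuerySketch, unigrams, asBooleanOr, asPhrase) are made
up for illustration, and the per-character split is only an assumed
stand-in for whatever tokenizer is really in use. The point is that one
tokenizer output can be turned into two quite different queries.

```java
import java.util.ArrayList;
import java.util.List;

class CjkQuerySketch {
    // Tokenization level (assumed behaviour): emit one token per
    // non-whitespace code point, i.e. CJK unigrams.
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        text.codePoints()
            .filter(cp -> !Character.isWhitespace(cp))
            .forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    // Query-generation level, option 1: match any of the tokens.
    static String asBooleanOr(List<String> tokens) {
        return String.join(" OR ", tokens);
    }

    // Query-generation level, option 2: require the tokens to be adjacent.
    static String asPhrase(List<String> tokens) {
        return "\"" + String.join(" ", tokens) + "\"";
    }

    public static void main(String[] args) {
        List<String> tokens = unigrams("限定");
        System.out.println(asBooleanOr(tokens)); // 限 OR 定
        System.out.println(asPhrase(tokens));    // "限 定"
    }
}
```

From hit counts alone you can't tell whether a search engine produced a
single compound token at the tokenizer, or split the text and then
built a phrase at query generation. Both give fewer hits than the OR
form.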

TX
