Assuming the analyzer plug-in is loaded correctly, I would check a few places,
as follows...

1. Is the document identified as a Chinese document? If your Nutch is set up
the usual way, the Language Identifier plug-in does the identification and
assigns a language code to the "lang" field. You can check with Luke whether
the "lang" field is populated with the proper value.

2. If the above is correct, check plugin.xml for the analyzer plug-in. The
lang value in the implementation tag should match the "lang" field in the
index. For example, if your "lang" field value is "cn", plugin.xml should
contain <parameter name="lang" value="cn"/>.
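
For reference, below is a rough sketch of what the analyzer plug-in's
plugin.xml could look like, modelled on the analysis plug-ins shipped with
Nutch (e.g. analysis-fr, analysis-de). The ids, jar name and ChineseAnalyzer
class name are placeholders for whatever your plug-in actually uses; the
important line is the "lang" parameter, whose value must match exactly what
Luke shows in the "lang" field:

<plugin
   id="analysis-zh"
   name="Chinese Analysis Plug-in"
   version="1.0.0"
   provider-name="nutch.org">

   <runtime>
      <!-- jar built from the plug-in's source; the name is a placeholder -->
      <library name="analysis-zh.jar">
         <export name="*"/>
      </library>
   </runtime>

   <requires>
      <import plugin="nutch-extensionpoints"/>
   </requires>

   <!-- The extension point must be org.apache.nutch.analysis.NutchAnalyzer;
        the extension/implementation ids and class below are placeholders. -->
   <extension id="org.apache.nutch.analysis.zh"
              name="ChineseAnalyzer"
              point="org.apache.nutch.analysis.NutchAnalyzer">
      <implementation id="ChineseAnalyzer"
                      class="org.apache.nutch.analysis.zh.ChineseAnalyzer">
         <!-- Must match the value stored in the index's "lang" field,
              whether that is "zh", "cn" or something else. -->
         <parameter name="lang" value="zh"/>
      </implementation>
   </extension>
</plugin>

If the two values disagree, the query is likely analysed with the default
analyzer rather than the Chinese one, which would explain the 0-hit behaviour.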

I hope this helps you troubleshoot.

Cheers,

Ye

2010/12/9 Bupo Jung <[email protected]>

> This is the Hadoop log message about the plugin; it's loaded:
> "2010-12-08 21:59:48,888 INFO  plugin.PluginRepository - Chinese Analysis
> Plug-in (analysis-zh)"
>
>
> 2010/12/9 Ye T Thet <[email protected]>
>
> > You should check if the analyzer is loaded properly. You can do so by
> > checking the Hadoop log file.
> >
> > Regards,
> >
> > Ye
> >
> > On Thu, Dec 9, 2010 at 8:21 PM, Bupo Jung <[email protected]> wrote:
> >
> > > Hi,
> > > I am trying to add a ChineseAnalyzer plugin to parse and index Chinese
> > > documents. I found that I was able to index the Chinese documents (I can
> > > see the indexes through Luke, and they are correct). But when I search
> > > for Chinese words using org.apache.nutch.searcher.NutchBean, the searcher
> > > does not parse the input Chinese word string, so it always returns 0
> > > hits. How can I fix it?
> > > Any clue?
> > >
> >
>
>
>
> --
> 庄逸众
> 北京邮电大学
> Yizhong Zhuang
> Beijing University of Posts and Telecommunications
> Tel:+86-13810773197
> Email:[email protected] <email%[email protected]> <
> email%[email protected] <email%[email protected]>>
>
