chee wu wrote:
> Thanks Sami. I tried LanguageIndexingFilter, and it seems the
> LanguageIdentifier can't recognize Chinese now?

No, it doesn't. The list of supported languages can be checked here (the *.ngp files):
http://svn.apache.org/viewvc/lucene/nutch/branches/branch-0.8/src/plugin/languageidentifier/src/java/org/apache/nutch/analysis/lang/

You can build an ngp profile for Chinese, but I think that in the language
identifier's current form it might not work that well.

You could also build a specialized identifier and add it as an indexing
filter. The most basic form could just blindly set lang to Chinese, if
that suits your use case.
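
To make the idea concrete, here is a rough sketch of the logic such a
specialized identifier could use. It is plain Java and deliberately does not
use the Nutch API: the class name, the Map standing in for the index
document, the "zh" value, and the 0.5 threshold are all assumptions made for
illustration. In a real plugin the same logic would sit inside your indexing
filter's filter method.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not a Nutch class. A Map stands in for the index
// document so the example runs without Nutch or Lucene on the classpath.
final class ChineseLangSketch {
    static final String LANG_FIELD = "lang"; // field name mentioned in this thread
    static final String CHINESE = "zh";      // assumed language code
    static final double THRESHOLD = 0.5;     // assumed cut-off, tune for your data

    // Most basic form: set lang to Chinese without looking at the content.
    // An existing lang value is left alone.
    static Map<String, String> tagBlindly(Map<String, String> doc) {
        doc.putIfAbsent(LANG_FIELD, CHINESE);
        return doc;
    }

    // Fraction of letters in the text that belong to the Han script.
    static double hanRatio(String text) {
        int letters = 0;
        int han = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (!Character.isLetter(cp)) {
                continue; // skip digits, punctuation, whitespace
            }
            letters++;
            if (Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN) {
                han++;
            }
        }
        return letters == 0 ? 0.0 : (double) han / letters;
    }

    // Slightly less blind: tag the document only if most letters are Han.
    static Map<String, String> tagIfMostlyHan(Map<String, String> doc, String text) {
        if (!doc.containsKey(LANG_FIELD) && hanRatio(text) >= THRESHOLD) {
            doc.put(LANG_FIELD, CHINESE);
        }
        return doc;
    }
}
```

Note that Han characters also appear in Japanese text, so the ratio check
is only a crude guess. If your crawl is Chinese only, the blind variant is
probably enough.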

--
 Sami Siren

> 
> ----- Original Message ----- 
> From: "Sami Siren" <[EMAIL PROTECTED]>
> To: <[email protected]>
> Sent: Sunday, January 07, 2007 5:47 PM
> Subject: Re: Nutch .81: the process to add a new analyzer ?
> 
> 
>> Chee Wu wrote:
>>> Hi,
>>>     I am trying to add a new analyzer for Chinese, and I found the
>>> code below in the "org.apache.nutch.indexer.Indexer"
>>>
>>> The question of mine is:
>>> For doc.get("lang"). Where and how can I  set the  "lang" property for
>> The lang field is put there by the language identifier plugin if it is active.
>>
>> http://lucene.apache.org/nutch/apidocs-0.8.x/org/apache/nutch/analysis/lang/LanguageIndexingFilter.html
>>
>> --
>> Sami Siren
>>


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general