Thanks for clearing up some doubts. But how exactly do I wrap it? Do I need to change any code to use the new Thai tokenizer? If so, which places need modification? Do I need to download a dev version and recompile?
Please, if you could briefly tell me the steps, I would be highly obliged.

Thanks,
sanjeev

Jérôme Charron wrote:
>
>> I used an existing ThaiAnalyzer which was in the Lucene package.
>> OK, I renamed lucene.analysis.th.* to nutch.analysis.th.*, compiled,
>> and placed all class files in a jar, analysis-th.jar (do I need to
>> bundle the ngp file in the jar as well?)
>
> 1. You don't have to refactor the Lucene analyzer. Just wrap it like I do
>    with the French and German analyzers (they both use analyzers from
>    Lucene).
> 2. The analyzer doesn't need NGP files... I think you misunderstood
>    something:
>    2.1 On one side, there is the language identifier, which uses NGP
>        files to identify the language of a document.
>    2.2 On the other side, if a suitable analyzer is found for the
>        identified language, it is used to analyze the document.
>
> Regards
>
> Jérôme
>
> --
> http://motrech.free.fr/
> http://www.frutch.org/

--
View this message in context: http://www.nabble.com/implement-thai-language-indexing-and-search-tf2641172.html#a7671727
Sent from the Nutch - Dev mailing list archive at Nabble.com.
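
For what it's worth, here is a rough sketch of what "wrap it" (point 1) and the two-step flow (points 2.1 and 2.2) could look like. It is only an illustration under assumptions: the `Analyzer` interface and the `LuceneThaiStandIn` and `DefaultStandIn` classes are hypothetical stand-ins so the example compiles on its own, and the whitespace splitting is not real Thai segmentation. In an actual plugin the wrapper would presumably extend Nutch's `NutchAnalyzer` and delegate to Lucene's `ThaiAnalyzer`, the way the French and German plugins delegate to their Lucene analyzers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AnalyzerWrapSketch {

    // Hypothetical, simplified stand-in for an analyzer type.
    // (Real Lucene analyzers return a TokenStream, not a List.)
    interface Analyzer {
        List<String> tokenize(String text);
    }

    // Stand-in for the existing Lucene Thai analyzer. It only splits on
    // whitespace here; real Thai word segmentation is more involved.
    static class LuceneThaiStandIn implements Analyzer {
        public List<String> tokenize(String text) {
            List<String> out = new ArrayList<>();
            for (String t : text.trim().split("\\s+")) {
                if (!t.isEmpty()) out.add(t);
            }
            return out;
        }
    }

    // Stand-in for a default analyzer: whitespace split plus lowercasing.
    static class DefaultStandIn implements Analyzer {
        public List<String> tokenize(String text) {
            List<String> out = new ArrayList<>();
            for (String t : text.trim().split("\\s+")) {
                if (!t.isEmpty()) out.add(t.toLowerCase());
            }
            return out;
        }
    }

    // The wrapper: no renaming or refactoring of the wrapped analyzer,
    // it simply holds an instance and forwards every call to it.
    static class ThaiAnalyzerWrapper implements Analyzer {
        private static final Analyzer DELEGATE = new LuceneThaiStandIn();

        public List<String> tokenize(String text) {
            return DELEGATE.tokenize(text);
        }
    }

    // Step 2.2: analyzers are looked up by the language code that the
    // language identifier (step 2.1, not shown) produced for the document.
    static final Map<String, Analyzer> BY_LANGUAGE = new HashMap<>();
    static final Analyzer DEFAULT = new DefaultStandIn();

    static {
        BY_LANGUAGE.put("th", new ThaiAnalyzerWrapper());
    }

    // Returns the analyzer registered for the language, or the default
    // analyzer when no suitable one is found.
    static Analyzer select(String lang) {
        Analyzer a = BY_LANGUAGE.get(lang);
        return a != null ? a : DEFAULT;
    }

    public static void main(String[] args) {
        System.out.println(select("th").tokenize("สวัสดี ครับ"));
        System.out.println(select("xx").tokenize("Hello World"));
    }
}
```

The point of the delegation is that the Lucene classes stay in their original package and jar; only the thin wrapper lives in the plugin. The NGP file belongs to the language identifier side, so it plays no part in this class.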