Hi all,

Indexing Hebrew texts for later retrieval is not a trivial task. Of all languages, Hebrew seems to be the toughest to handle. Although several solutions exist, they do not necessarily provide the best results in terms of relevancy. Either way, there is no freely available solution that indexes Hebrew even at a very basic level.
HebMorph was started with this in mind. It is a free, open-source effort to make Hebrew properly searchable by various IR software libraries, while maintaining decent recall, precision and relevancy in retrievals. During the work on this project, we will try to come up with different approaches to indexing Hebrew, and provide the tools to perform reliable comparisons between them. The project's ultimate goal is to provide various IR libraries with the best Hebrew IR capabilities possible.

Complete details are available at http://www.code972.com/blog/hebmorph/. As we progress, updates will be posted to that blog, to our mailing list, and on Twitter (#HebMorph).

If this is of interest to you, we would appreciate your feedback and help. Please use our mailing list, or contact me privately, for any inquiries.

Itamar Syn-Hershko

_______________________________________________
CLucene-developers mailing list
CLucene-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/clucene-developers