On Mar 30, 2006, at 4:10 PM, mike c wrote:
Hi Erik,
Thanks for pointing this out, as I just got Ferret working with
indexes created using Nutch.  Any recommendations on how to address
this issue?

This is a particularly insidious issue. Java Lucene does not write pure UTF-8, whereas ports like Ferret do. Changing Java Lucene is a big deal, though, and apparently introduces a (slight) performance hit. The plan is for Java Lucene to be corrected in this regard at some point in the future, perhaps as soon as Lucene 2.0.

But for now, I don't know of a way to address this issue. I gave up on Ferret for the time being because of this incompatibility and am now prototyping with Solr while still using my custom XML-RPC search server.

        Erik
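
To make the "not pure UTF-8" point concrete, here is a minimal sketch. It assumes the mismatch is the same one found in the JDK's "modified UTF-8" (the encoding written by DataOutputStream.writeUTF), which is used here only as a stand-in; the thread does not show Lucene's own writer code or say which characters trigger the failure. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of Lucene, Nutch, or Ferret.
public class Utf8Mismatch {

    // Encode with the JDK's modified UTF-8 (assumed stand-in for the index format).
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        // writeUTF prepends a 2-byte length; drop it to compare payloads only.
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Encode with standard UTF-8, which is what a strict reader expects.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String ascii = "nutch";
        // U+1D11E, a character outside the Basic Multilingual Plane,
        // written as a surrogate pair in Java source.
        String supplementary = "\uD834\uDD1E";

        // ASCII text encodes identically either way.
        System.out.println("ascii equal: "
                + Arrays.equals(modifiedUtf8(ascii), standardUtf8(ascii)));

        // The supplementary character is 4 bytes in standard UTF-8, but
        // modified UTF-8 encodes each surrogate separately (3 bytes each).
        System.out.println("standard length: " + standardUtf8(supplementary).length);
        System.out.println("modified length: " + modifiedUtf8(supplementary).length);
    }
}
```

If the index format behaves like this stand-in, it would match the symptom described in the thread: plain ASCII content round-trips without trouble, and a strict UTF-8 reader only fails once characters outside that common range are indexed.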




-Mike

On 3/30/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:
There is one incompatibility between Ferret and Java Lucene of note.
It is the "UTF-8" issue that has surfaced with regard to Java
Lucene.  All can be well between Java Lucene and Ferret, until
characters in another range are indexed, and then Ferret will blow up
trying to search the index.  Maybe this has been worked around in a
more recent version of Ferret than I've tried?

        Erik


On Mar 30, 2006, at 2:50 PM, mike c wrote:

Thanks.  I'll try it out.  In the meantime, if I get Ferret working
I'll post an update.

-Mike

On 3/30/06, Steven Yelton <[EMAIL PROTECTED]> wrote:
I use WEBrick instead of Tomcat to query and serve search results. I
used Ruby's 'rjb' to bridge the gap.

http://raa.ruby-lang.org/project/rjb/

There may be more direct ways (ruby<->lucene), but this was quick and
easy and still has decent performance.

Steven

mike c wrote:

Hi all,
I was wondering if anyone is using Nutch (for crawling) with Ferret
(for indexing and searching). I'm asking because my front end is built
with Ruby on Rails.  I have the Nutch crawler up and running fine, but
can't seem to figure out how to integrate the two.
Any help is appreciated.

Regards,
Mike





-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting
language
that extends applications into web and mobile media. Attend the
live webcast
and join the prime developer group breaking into this new coding
territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid0944&bid$1720&dat1642
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general




