from:"Erik Hatcher"

Re: [Nutch-general] Re: Using Nutch with Ferret (ruby)

2006-03-31 Thread Erik Hatcher

gave up on Ferret for the time being because of this incompatibility and am now prototyping with Solr while still using my custom XML-RPC search server for now. Erik -Mike On 3/30/06, Erik Hatcher [EMAIL PROTECTED] wrote: There is one incompatibility between Ferret and Java

Re: [Nutch-general] Re: Using Nutch with Ferret (ruby)

2006-03-30 Thread Erik Hatcher

There is one incompatibility between Ferret and Java Lucene of note. It is the UTF-8 issue that has surfaced with regards to Java Lucene. All can be well between Java Lucene and Ferret, until characters in another range are indexed, and then Ferret will blow up trying to search the

Re: [Nutch-general] Nutch web services

2006-03-24 Thread Erik Hatcher

Nutch has a servlet that supports A9's OpenSearch API. Are you needing more capabilities than this offers? Erik On Mar 24, 2006, at 9:16 AM, Aled Jones wrote: Hi Might have asked this before, but has anyone developed web services for nutch? I know there are web services for

Re: Good man is Different than Man good in Nutch?

2005-11-30 Thread Erik Hatcher

On 29 Nov 2005, at 22:41, Victor Lee wrote: ok, now I remembered something from the book Lucene in Action, it said something about word distance. So that's why they returns different results. But still, when I remembered when I went to Google Adwords and get the new Maximum CPC estimates

Re: lucene jar version

2005-11-12 Thread Erik Hatcher

and 1.4.3 would do the trick, but would be a lot to wade through. Erik regards, [EMAIL PROTECTED] - Original Message - From: Erik Hatcher [EMAIL PROTECTED] To: nutch-user@lucene.apache.org Sent: Thursday, November 10, 2005 1:51 AM Subject: Re: lucene jar version Nutch

Re: lucene jar version

2005-11-09 Thread Erik Hatcher

Nutch is using an un-official version of Lucene, which is some build from the trunk of Subversion. In the trunk of Lucene, those methods are deprecated and thus the 1.9-rc1-dev JAR you have has them flagged as such. Erik On 9 Nov 2005, at 17:37, Kenji wrote: Hi, I'm new here.

Re: Jira - Nutch 48 - did you mean patch

2005-10-31 Thread Erik Hatcher

No, Lucene does not have a built-in query that uses regular expressions. It's trivial to write a custom Query class like WildcardQuery that does regular expression searching. In fact, I've created this and am contributing it to Lucene as soon as I can (slowly but surely). As for how

Re: output format as xml?

2005-10-01 Thread Erik Hatcher

Nutch supports the OpenSearch API, which is a variant of RSS, and in XML. Erik On Sep 30, 2005, at 8:03 PM, gekkokid wrote: Howdy, if nutch doesn't support xml as a result format - it's open source so you can customise it to your needs :) _gk - Original Message - From: XIN

Re: [Nutch-general] VOTE: (Re: RSS Feed Parser)

2005-08-11 Thread Erik Hatcher

+1 - with it disabled there isn't much risk. On Aug 11, 2005, at 6:07 PM, Andrzej Bialecki wrote: Chris Mattmann wrote: Hi Zaheed, Thanks for the nice comments. I've went ahead and wrote an HTML page that summarizes what I sent to Zaheed with respect to installing the parse-rss plugin.

Re: [Nutch-general] number of indexed pages

2005-07-29 Thread Erik Hatcher

Two options: bin/nutch readdb crawl/db -stats or use Luke (Google for luke lucene) to open the Lucene index. Erik On Jul 28, 2005, at 9:44 PM, blackwater dev wrote: After I finish a crawl...what is the best way to go into my crawl directory and get the number of indexed pages?

Re: [Nutch-general] Re: RDF plugin questions

2005-07-21 Thread Erik Hatcher

this integration task much much more difficult as it already is. Greetings, Stefan Am 19.07.2005 um 14:57 schrieb Erik Hatcher: Hi, I'm embarking on an adventure with Nutch to crawl 19th century digital scholarly archives (like the Rossetti Archive, where I work) for the nines.org system

Nutch + RDF for scholarly archives

2005-06-29 Thread Erik Hatcher

Is anyone here using Nutch for crawling digital scholarly archives? If so, are you also harvesting and indexing additional metadata? My group (http://www.patacriticism.org) is considering using Nutch to crawl a specific set of sites and index the HTML as full-text and also retrieve any

12 matches
