Andrzej Bialecki wrote:

Hi EM,

In the future, please use the right Subject: line...

Sorry about that, missed it.

In any case, the installation was rough, but I managed to get it fully installed and perform searches. There's still so much to do...

The intended language isn't English; what mapping should be done on the crawler side? I still want the search queries to come in as plain English, and to search the extended character set with them (map 'xyz' on a foreign keyboard to 'abc' in English ASCII, or something like that?).

You need to modify the Nutch Analyzer to strip accents. The same Analyzer will be used for both indexing and querying.
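
In practice, "modify the Analyzer" means plugging an accent-folding filter into the token stream. A minimal sketch (untested; it assumes an older Lucene API where ISOLatin1AccentFilter is still present, and newer Lucene replaces that filter with ASCIIFoldingFilter; the class name here is made up):

    import java.io.Reader;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.ISOLatin1AccentFilter;
    import org.apache.lucene.analysis.LowerCaseFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.standard.StandardTokenizer;

    // Hypothetical accent-stripping analyzer: text is tokenized,
    // lowercased, and accented Latin-1 characters are folded to
    // their ASCII equivalents, so "café" is indexed as "cafe".
    public class AccentStrippingAnalyzer extends Analyzer {
      public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream stream = new StandardTokenizer(reader);
        stream = new LowerCaseFilter(stream);
        // Folds e.g. 'é' -> 'e', 'ü' -> 'u'.
        stream = new ISOLatin1AccentFilter(stream);
        return stream;
      }
    }

Because the same analyzer runs at both index time and query time, a plain-ASCII English query then matches the accented text in the index.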

I'm completely lost by your answer, but I'll try to read up a bit more on the Analyzer.


It starts as a hobby project, so I don't plan on spending too much money on it.
Can I run the crawler on my home DSL (160 kb/s down, 60 kb/s up) and then upload the database for the search engine to the server? (This might be obsolete depending on the next question.)

Depending on your settings, each downloaded page will take on average 15-20 kB of disk space. I tried in the past to run crawlers at one location and then move the segment data to another location... well, it was painful, because of the sheer size of the segment data.
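
A quick back-of-envelope (illustrative figures derived from the numbers above, not from the thread): at 15-20 kB/page, a 100,000-page crawl yields roughly 1.5-2 GB of segment data. Taking the 60 kb/s uplink at face value as ~60 kB/s, a single upload already takes

    1,500,000 kB / 60 kB/s ≈ 25,000 s ≈ 7-9 hours

and if "kb" means kilobits, multiply by 8: more like 2-3 days per batch.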

Although the fetching is slow, the system works: 4,000 pages in a segment of 50 MB or so, which works out to about 12.5 kB/page.


Thanks for the kick-start help ;)
Emilijan
