Eric Holman wrote:
I have an example of nutch being run as a local search engine at http://www.searchmitchell.com

This looks great! Would you like to add it to the wiki?

http://www.nutch.org/cgi-bin/twiki/view/Main/PublicServers

There are a few issues that I’m initially concerned about, and wondering if you could provide any comments/suggestions:

- Doesn’t seem to index sites using frames (e.g., http://www.dicefinancial.com)

That's pretty standard. Please read:

http://searchenginewatch.com/webmasters/article.php/2167901

Yes, we could do better, but frame-based websites which really wish to be searchable should use NOFRAMES.

- Also, doesn’t seem to index those starting with a redirect (e.g., http://www.cornerstonescareer.com)

Hmm. This should work. Can you provide more details about what happens with this one?


- Also has problems w/ querystrings at times (e.g., caught looping through a calendar on http://www.focusag.us)

Query strings can be tricky. The best approach is to use the regex url filter to avoid such pages.
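
To illustrate the idea, here is a minimal Python sketch of a first-match-wins URL filter using "+pattern" / "-pattern" rules in the style of the regex url filter configuration. It is not Nutch's own code: the rule text, the function names (parse_rules, accepts) and the calendar URL are assumptions made up for the example, so check the filter file shipped with your Nutch version for the actual defaults.

```python
import re

# Hypothetical rule set: one rule per line, "-" rejects, "+" accepts,
# and the first rule whose pattern matches the URL decides.
RULES = """
# skip URLs containing characters that usually indicate a query string
-[?*!@=]
# accept everything else
+.
"""

def parse_rules(text):
    """Turn rule text into a list of (allow, compiled_pattern) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        sign, pattern = line[0], line[1:]
        if sign not in ("+", "-"):
            raise ValueError("rule must start with + or -: " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def accepts(url, rules):
    """Return True if the first matching rule is a '+' rule."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False  # no rule matched: reject

rules = parse_rules(RULES)
# The calendar path below is invented for illustration.
print(accepts("http://www.focusag.us/", rules))
print(accepts("http://www.focusag.us/calendar?month=7&year=2004", rules))
```

With these rules the site's front page passes, while any URL carrying a query string (such as an endlessly paging calendar) is dropped before it is fetched. A narrower "-" rule aimed at just the calendar path would keep other query-string pages crawlable.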


- Grouping by same-hosts (already posted on this issue, and looks like you are working towards a solution. I’m excited to try this out, once implemented)

Yup, we're working on this.

Btw: I do have a business plan for this concept that has a good deal of interest in it, so if I can execute the plan, I would be interested in committing a percentage of the returns back to nutch development.

Nutch.org does not have employees. The best thing you can do is hire someone yourself to work on Nutch. Many folks on this mailing list would probably be willing to contract with you to add the features that you need to Nutch.


We should start a Wiki page listing such folks, like the Lucene Support page (http://wiki.apache.org/jakarta-lucene/Support). Can someone please add such a page to the Nutch wiki?

Doug




_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers