Matt,

It's the index that is used for searching, not the webdb.

What is the status of these pages in the webdb? Most likely they have
not been fetched yet (DB_UNFETCHED), so they cannot be in your index yet.
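
To make the distinction concrete, here is a rough toy model in Python. It is
not Nutch code: PageEntry, build_index and search are made-up names, and the
only thing taken from Nutch is the DB_UNFETCHED / DB_FETCHED status idea. It
just shows why a URL can be listed in the db and still return nothing when
you search.

```python
# Toy model only: these names are illustrative and are not Nutch classes or APIs.
from dataclasses import dataclass

DB_UNFETCHED = "DB_UNFETCHED"
DB_FETCHED = "DB_FETCHED"


@dataclass
class PageEntry:
    url: str
    status: str
    text: str = ""  # page content, only available once the page has been fetched


def build_index(webdb):
    """Map each word to the URLs of fetched pages that contain it."""
    index = {}
    for entry in webdb:
        if entry.status != DB_FETCHED:
            continue  # known to the db, but there is nothing to index yet
        for word in set(entry.text.lower().split()):
            index.setdefault(word, []).append(entry.url)
    return index


def search(index, word):
    """Searching consults the index only, never the webdb."""
    return index.get(word.lower(), [])


# Two URLs are known to the db, but only one has actually been fetched.
webdb = [
    PageEntry("http://example.com/a", DB_FETCHED, "nutch crawler basics"),
    PageEntry("http://example.org/b", DB_UNFETCHED),
]
index = build_index(webdb)
searchable = {url for urls in index.values() for url in urls}

print(search(index, "nutch"))
print(len(webdb), "URLs in db,", len(searchable), "searchable")
```

In this sketch the example.org URL is counted in the db but can never show up
in a search result until it is fetched and the index is rebuilt, which is the
situation I suspect you are in.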

These articles give a very nice basic explanation of the different concepts:

http://today.java.net/pub/a/today/2006/01/10/introduction-to-nutch-1.html
http://today.java.net/pub/a/today/2006/02/16/introduction-to-nutch-2.html

HTH,
Thomas

On 7/22/06, Matt Timion <[EMAIL PROTECTED]> wrote:
> Asking again hoping that someone can help me out.
>
> I have a number of pages from a certain domain in my database.  I can verify
> this when I use the command:
>
> bin/nutch admin crawl/db -textdump text
>
> I then look at the text.pages file and it has nearly 800 pages from that
> domain in my database.
>
> Yet when I search for content from that domain, nothing comes up.  Can anyone
> tell me why this would happen?
>
>

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
