Look at the org.apache.lucene.index.IndexReader.numDocs() method. You can
write a simple utility that calls it and run that from the shell.
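
A minimal sketch of such a utility, assuming a Lucene 2.x-era API where
IndexReader.open() accepts an index directory path (newer Lucene releases
open readers differently, so adjust to your version). The class name
IndexDocCount is hypothetical and is not part of Nutch or Lucene. It takes
one or more index directories as arguments and prints a count for each,
plus a total:

```java
import java.io.IOException;

import org.apache.lucene.index.IndexReader;

// Hypothetical helper class; the name is illustrative only.
public class IndexDocCount {
    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("Usage: IndexDocCount <index-dir> [<index-dir> ...]");
            System.exit(1);
        }
        long total = 0;
        for (String path : args) {
            // Assumption: Lucene 2.x, where IndexReader.open(String) takes
            // the path of an index directory on the local filesystem.
            IndexReader reader = IndexReader.open(path);
            try {
                // numDocs() does not count documents marked as deleted.
                int n = reader.numDocs();
                System.out.println(path + "\t" + n);
                total += n;
            } finally {
                reader.close();
            }
        }
        System.out.println("total\t" + total);
    }
}
```

Compile and run it with the Lucene core jar on the classpath, for example
"java -cp lucene-core.jar:. IndexDocCount <index-dir>". The jar name here
is a placeholder; use whichever Lucene jar ships with your Nutch install.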

On 7/28/07, Enzo Michelangeli <[EMAIL PROTECTED]> wrote:
> Is there a quick way of knowing how many pages are indexed (_not_ how many
> are referenced in crawldb as fetched URL's)? I could use Luke to peek inside
> the indexes and get the "Number of documents", but they are located on a
> remote headless server with only SSH access... (OK, I actually did access
> them using Sftpdrive, but I'd like to have a command line to invoke in a
> shell script...)
>
> Enzo
>
>

_______________________________________________
Nutch-general mailing list
Nutch-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-general
