On Thu, 6 May 2004, J wrote:

Hi,

> I have a few questions hopefully someone can help me out with.
>
> Is there a way to get a page count?

Do you mean the number of pages stored in the WebDB (that is, crawled by
Nutch)? If so, you can run something like the following:

----------Cut-------Here------------------------------
import java.io.File;
import net.nutch.db.WebDBReader;

public class DBStat {
    public static void main(String[] args) {
        try {
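            // Replace this path with the location of your own WebDB directory.
            WebDBReader db =
                new WebDBReader(new File("/Users/jungshik/db/nutch/db"));
            try {
                System.out.println("Number of links= " + db.numLinks());
                System.out.println("Number of pages= " + db.numPages());
            } finally {
                // Close the reader even if reading one of the counts fails.
                db.close();
            }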
        }
        catch (Exception e) {
            System.out.println("Caught Exception : " + e);
        }
    }
}
--------------Cut---------Here-----------------------

 Hope this helps,

 Jungshik


_______________________________________________
Nutch-general mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-general
