Are you trying to use Nutch's indexer? AFAIK that's deprecated, isn't it?
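
If you do want to read the Lucene-level index directly, one thing that may be worth checking first is which directory actually holds an index. The sketch below assumes (this is not confirmed anywhere in this thread) that a Nutch 1.x crawl writes one Lucene index per subdirectory such as part-00000 under crawl/indexes, rather than a single index in crawl/indexes itself. It only uses the JDK and relies on the fact that a Lucene index directory contains files whose names start with "segments". The class and method names are made up for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Nutch or Lucene.
public class IndexDirFinder {

    // True if dir directly contains a file whose name starts with "segments"
    // (for example segments.gen or segments_2), which Lucene writes into
    // every index directory.
    public static boolean looksLikeLuceneIndex(File dir) {
        String[] names = dir.list();
        if (names == null) {
            return false; // not a directory, or not readable
        }
        for (String name : names) {
            if (name.startsWith("segments")) {
                return true;
            }
        }
        return false;
    }

    // Returns root itself if it looks like an index; otherwise returns the
    // immediate subdirectories of root that look like indexes, sorted by path.
    public static List<File> findIndexDirs(File root) {
        List<File> found = new ArrayList<File>();
        if (looksLikeLuceneIndex(root)) {
            found.add(root);
            return found;
        }
        File[] children = root.listFiles();
        if (children == null) {
            return found;
        }
        Arrays.sort(children);
        for (File child : children) {
            if (child.isDirectory() && looksLikeLuceneIndex(child)) {
                found.add(child);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumed default location; pass your own path as the first argument.
        File root = new File(args.length > 0 ? args[0] : "crawl/indexes");
        for (File dir : findIndexDirs(root)) {
            System.out.println(dir.getPath());
        }
    }
}
```

Each path it prints could then be tried in place of the hard-coded directory in the FSDirectory.open(new File(...)) call quoted below. If nothing is printed, the directory probably does not contain a Lucene index at all.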

On Wed, Jun 22, 2011 at 2:19 PM, caomanhdat wrote:
> Hi all
> I have a problem getting the frequency of a word in Nutch :|
> In Lucene it is quite easy with this code:
>
> Directory dir2 = FSDirectory.open(new File(indexDir));
> IndexReader ir = IndexReader.open(dir2);
> TermDocs termDocs = ir.termDocs(new Term("contents", "eBank"));
> int count = 0;
> while (termDocs.next()) {
>     count += termDocs.freq();
> }
>
> But in Nutch the indexer is quite weird, so I can't do the same thing:
>
> Directory dir2 = FSDirectory.open(new File("D:\\nutch\\crawl\\indexes"));
> IndexReader ir = IndexReader.open(dir2);
> TermDocs termDocs = ir.termDocs(new Term("contents", "eBank"));
> int count = 0;
> while (termDocs.next()) {
>     count += termDocs.freq();
> }
>
>
>
>



-- 
Regards,
K. Gabriele

