I am new to these things. Can you let me know in detail where this filter
is found and what Mercator is? I want to use it with Java. Is there a way
to use this functionality from Java and remove duplicate URLs from a
urls.txt file?
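
To make the question concrete, here is a minimal sketch of the Bloom filter
idea from the quoted reply, applied to a urls.txt file. Everything in it is
an assumption for illustration: the class name SimpleBloomFilter, the dedup
helper, the bit-array size and the hash count are hypothetical, and none of
this is taken from Nutch or Mercator.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical minimal Bloom filter; not part of Nutch or Mercator. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Second hash (FNV-1a over the characters), used for double hashing.
    private static int fnv1a(String s) {
        int h = 0x811c9dc5;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 0x01000193;
        }
        return h;
    }

    // i-th bit position for s, derived from two base hashes.
    private int index(String s, int i) {
        return Math.floorMod(s.hashCode() + i * fnv1a(s), numBits);
    }

    void add(String s) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(s, i));
        }
    }

    // false means "definitely not seen"; true means "probably seen".
    boolean mightContain(String s) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(s, i))) {
                return false;
            }
        }
        return true;
    }

    // Keeps the first occurrence of each URL, in input order.
    static List<String> dedup(List<String> urls, int numBits, int numHashes) {
        SimpleBloomFilter seen = new SimpleBloomFilter(numBits, numHashes);
        List<String> unique = new ArrayList<>();
        for (String line : urls) {
            String url = line.trim();
            if (url.isEmpty() || seen.mightContain(url)) {
                continue; // blank line, or probably a duplicate
            }
            seen.add(url);
            unique.add(url);
        }
        return unique;
    }

    public static void main(String[] args) throws IOException {
        // Assumed input: one URL per line; the path defaults to urls.txt.
        String path = args.length > 0 ? args[0] : "urls.txt";
        for (String url : dedup(Files.readAllLines(Paths.get(path)), 1 << 20, 4)) {
            System.out.println(url);
        }
    }
}
```

A Bloom filter can report a URL as already seen when it was not, so a small
fraction of unique URLs may be dropped; it never lets a true duplicate
through. If the list fits in memory and an exact result is needed, a
java.util.LinkedHashSet gives that with less code.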

On 12/16/05, Zhao Loen <[EMAIL PROTECTED]> wrote:
>
> 1. Bloom filter
> A highly efficient algorithm for eliminating duplicate URLs.
>
> 2. Disk-based hash table
> Mercator uses it.
>
> 2005/12/16, Arun Kumar Sharma <[EMAIL PROTECTED]>:
> >
> > Hi
> >      I have a list of URLs which may contain duplicates. I want to check
> > that no duplicate URL is inserted through WebDBInjector. Is there any
> > way to achieve this using Nutch functionality?
> >     answer awaited anxiously...
> >
> >
> > Regards,
> >
> > Arun Kumar Sharma (Tech Lead -Java/J2EE)
> > Mob: +91.981.529.5761
> >
> >
> >
> >
> >
>
>
>
> --
> Search whenever you want
>
