I am new to these things. Could you explain in detail where this filter can be found, and what Mercator is? I want to use it with Java. Is there a way to use this functionality from Java to remove duplicate URLs from a urls.txt file?
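For a urls.txt that fits in memory, the duplicates can be dropped with a plain hash set before anything is injected. The following is only a minimal sketch under that assumption: the class name `UrlDeduplicator`, the output file name `urls-unique.txt`, and the choice to compare trimmed lines as exact strings (no URL normalization) are all illustrative and not part of Nutch.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not a Nutch class.
public class UrlDeduplicator {

    /** Returns the distinct URLs in first-seen order, skipping blank lines. */
    public static List<String> dedupe(List<String> lines) {
        Set<String> seen = new LinkedHashSet<>();
        for (String line : lines) {
            String url = line.trim();
            if (!url.isEmpty()) {
                seen.add(url); // add() leaves the set unchanged if the URL is already present
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) throws IOException {
        // File names are assumptions; pass your own as arguments.
        Path in = Paths.get(args.length > 0 ? args[0] : "urls.txt");
        Path out = Paths.get(args.length > 1 ? args[1] : "urls-unique.txt");
        List<String> unique = dedupe(Files.readAllLines(in, StandardCharsets.UTF_8));
        Files.write(out, unique, StandardCharsets.UTF_8);
        System.out.println("Wrote " + unique.size() + " unique URLs to " + out);
    }
}
```

Because the comparison is on exact strings, `http://example.com` and `http://example.com/` would be kept as two entries; any normalization would have to be added before the `seen.add(url)` call.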
On 12/16/05, Zhao Loen <[EMAIL PROTECTED]> wrote:
>
> 1. Bloom filter
>    A highly efficient algorithm for eliminating duplicate URLs.
>
> 2. Disk-based hash table
>    Mercator uses it.
>
> 2005/12/16, Arun Kumar Sharma <[EMAIL PROTECTED]>:
> >
> > Hi
> > I have a list of URLs which may contain duplicates. I want to check
> > that no duplicate URL is inserted through WebDBInjector. Is there any
> > way to achieve this using Nutch functionality?
> > Answer awaited anxiously...
> >
> > Regards,
> >
> > Arun Kumar Sharma (Tech Lead -Java/J2EE)
> > Mob: +91.981.529.5761
>
> --
> Search whenever you want
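To make the Bloom filter suggestion in the quoted reply concrete, here is a rough Java sketch of the idea: each URL sets a few bits in a fixed-size bit array, and a URL whose bits are all already set is treated as a probable duplicate. Everything here is an assumption for illustration: the class name `UrlBloomFilter`, the seeded FNV-1a-style hash, the double-hashing scheme, and the sizes used in `main`. It is not the filter used by Nutch or Mercator. A Bloom filter can report false positives (a new URL flagged as seen) but never false negatives.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical class for illustration only.
public class UrlBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    public UrlBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** FNV-1a-style 32-bit hash; the seed is mixed into the offset basis. */
    private static int hash(byte[] data, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : data) {
            h ^= (b & 0xFF);
            h *= 0x01000193; // int overflow wraps, which is intended here
        }
        return h;
    }

    /**
     * Adds the URL to the filter. Returns true if every probed bit was
     * already set (URL possibly seen before), false if it is definitely new.
     */
    public boolean addAndCheck(String url) {
        byte[] data = url.getBytes(StandardCharsets.UTF_8);
        int h1 = hash(data, 0);
        int h2 = hash(data, 0x9E3779B9) | 1; // force an odd step for double hashing
        boolean allSet = true;
        for (int i = 0; i < numHashes; i++) {
            // i-th probe position: (h1 + i * h2) mod numBits, kept non-negative
            int idx = Math.floorMod(h1 + i * h2, numBits);
            if (!bits.get(idx)) {
                allSet = false;
                bits.set(idx);
            }
        }
        return allSet;
    }

    public static void main(String[] args) {
        // Sizes are illustrative assumptions, not tuned values.
        UrlBloomFilter filter = new UrlBloomFilter(1 << 20, 4);
        String[] urls = {"http://example.com/a", "http://example.com/b", "http://example.com/a"};
        for (String url : urls) {
            System.out.println(url + " -> possibly seen before: " + filter.addAndCheck(url));
        }
    }
}
```

The trade-off against a plain hash set is memory: the filter stores a fixed number of bits regardless of URL length, at the cost of occasionally discarding a URL that was in fact new. When that is unacceptable, an exact structure such as the disk-based hash table mentioned in the reply is the alternative.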
