1. Bloom filter: a highly efficient algorithm for eliminating duplicate URLs.
2. A disk-based hash table; Mercator uses this approach.
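For option 1, here is a rough sketch of how a Bloom filter could screen a URL list before it is handed to the injector. This is not Nutch code: the class name UrlBloomFilter, the methods addIfAbsent and dedupe, the FNV-1a hash, and the sizes used below are all my own assumptions for illustration. Note that a Bloom filter never misses a real duplicate, but it can occasionally report a new URL as already seen, so size it generously.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Illustrative Bloom filter for URL strings (hypothetical, not part of Nutch). */
final class UrlBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    UrlBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** 64-bit FNV-1a hash of the URL bytes. */
    private static long fnv1a(byte[] data) {
        long h = 0xcbf29ce484222325L;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        return h;
    }

    /** Derives the i-th bit position from the two halves of one 64-bit hash. */
    private int index(long hash, int i) {
        int h1 = (int) hash;
        int h2 = ((int) (hash >>> 32)) | 1; // forced odd so the step is never zero
        return Math.floorMod(h1 + i * h2, numBits);
    }

    /**
     * Marks the URL as seen.
     * Returns true if it was definitely new, false if it was probably seen before.
     */
    boolean addIfAbsent(String url) {
        long hash = fnv1a(url.getBytes(StandardCharsets.UTF_8));
        boolean isNew = false;
        for (int i = 0; i < numHashes; i++) {
            int idx = index(hash, i);
            if (!bits.get(idx)) {
                isNew = true; // at least one bit was unset, so the URL cannot have been added
                bits.set(idx);
            }
        }
        return isNew;
    }

    /** Keeps the first occurrence of each URL, in input order. */
    static List<String> dedupe(List<String> urls, UrlBloomFilter filter) {
        List<String> unique = new ArrayList<>();
        for (String url : urls) {
            if (filter.addIfAbsent(url)) {
                unique.add(url);
            }
        }
        return unique;
    }
}
```

You would call something like UrlBloomFilter.dedupe(urls, new UrlBloomFilter(1 << 20, 7)) and inject only the returned list. If the URLs differ only in trivial ways (trailing slash, letter case of the host), normalize them first, because the filter only catches byte-identical strings.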

2005/12/16, Arun Kumar Sharma <[EMAIL PROTECTED]>:
> Hi
> I have a list of URLs which may contain duplicates. I want to check
> that no duplicate URL is inserted through WebDBInjector. Is there any
> way to achieve this using Nutch functionality?
> Answer awaited anxiously...
>
> Regards,
> Arun Kumar Sharma (Tech Lead - Java/J2EE)
> Mob: +91.981.529.5761

--
Search whenever you like
