@amit:

Here is the reason.

Take a single site, say http://www.geeksforgeeks.org

You would end up hashing URLs like the following:
http://www.geeksforgeeks.org
http://www.geeksforgeeks.org/archives
http://www.geeksforgeeks.org/archives/19248
http://www.geeksforgeeks.org/archives/1111
http://www.geeksforgeeks.org/archives/19221
http://www.geeksforgeeks.org/archives/19290
http://www.geeksforgeeks.org/archives/1876
http://www.geeksforgeeks.org/archives/1763

"http://www.geeksforgeeks.org" is the redundant part of each URL. Storing it
in full with every URL would waste memory.
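
To make the shared-prefix point concrete, here is a minimal sketch of a trie
keyed on URL components, so the host and each directory are stored once and
reused by every URL beneath them. The names UrlTrie, _END and node_count are
my own, made up for illustration. The sketch assumes plain http/https URLs and
ignores fragments; nested Python dicts are used for readability, so it shows
the structure rather than a memory-tuned implementation.

```python
from urllib.parse import urlsplit

_END = None  # hypothetical marker key meaning "a stored URL ends at this node"


class UrlTrie:
    """Trie keyed on URL components, so a shared prefix is stored only once."""

    def __init__(self):
        self.root = {}
        self.node_count = 0  # number of component nodes allocated so far

    @staticmethod
    def _parts(url):
        # Split into: "scheme://host", each non-empty path segment, then the
        # query string (if any). Fragments are ignored in this sketch.
        s = urlsplit(url)
        parts = [s.scheme + "://" + s.netloc]
        parts += [p for p in s.path.split("/") if p]
        if s.query:
            parts.append("?" + s.query)
        return parts

    def add(self, url):
        node = self.root
        for part in self._parts(url):
            if part not in node:
                node[part] = {}
                self.node_count += 1
            node = node[part]
        node[_END] = True  # mark that a complete URL ends here

    def __contains__(self, url):
        node = self.root
        for part in self._parts(url):
            node = node.get(part)
            if node is None:
                return False
        return _END in node


urls = [
    "http://www.geeksforgeeks.org",
    "http://www.geeksforgeeks.org/archives",
    "http://www.geeksforgeeks.org/archives/19248",
    "http://www.geeksforgeeks.org/archives/1111",
    "http://www.geeksforgeeks.org/archives/19221",
    "http://www.geeksforgeeks.org/archives/19290",
    "http://www.geeksforgeeks.org/archives/1876",
    "http://www.geeksforgeeks.org/archives/1763",
]

trie = UrlTrie()
for u in urls:
    trie.add(u)

print(trie.node_count, "nodes hold", len(urls), "URLs")
print("http://www.geeksforgeeks.org/archives/1111" in trie)
```

The host string and "archives" each get one node no matter how many URLs sit
under them; only the differing last segment costs a new node per URL.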

OK, now say the file has 20 million URLs. What would you do then?



On Wed, May 16, 2012 at 10:50 AM, Amit Mittal <[email protected]> wrote:

> Why won't hashing work for millions of URLs?
> If you hash each URL into a distinct 32-bit integer, you can map 2^32 URLs,
> which is around 4 billion. It should work.
>
>
> On Wed, May 16, 2012 at 10:42 AM, atul anand <[email protected]> wrote:
>
>> I was thinking about using a trie or a Patricia tree. Hashing is another
>> option, but it won't work if the URLs number in the millions.
>> Is there any better data structure?
>>
>>
>> On Tue, May 15, 2012 at 11:37 PM, Varun <[email protected]> wrote:
>>
>>> It should be a tree based on the domain and the directories in each URL.
>>>
>>>
>>> On Tuesday, 15 May 2012 21:20:55 UTC+5:30, atul007 wrote:
>>>>
>>>> Given a file which contains millions of URLs, which data structure
>>>> would you use for storing these URLs? The data structure should store
>>>> and fetch data efficiently.
>>>
>>>
>>>  --
>>> You received this message because you are subscribed to the Google
>>> Groups "Algorithm Geeks" group.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msg/algogeeks/-/idbhSUZ6TNIJ.
>>> To post to this group, send email to [email protected].
>>> To unsubscribe from this group, send email to
>>> [email protected].
>>> For more options, visit this group at
>>> http://groups.google.com/group/algogeeks?hl=en.
>>>
>>
>
>
>
> --
> Regards
> Amit Mittal
>
