Well, I understood that you have a number (say x) and you want if the
number x is not in the file. Atul wrote:

"The numbers are all distinct. they may vary from 100,000,000 to
999,999,999 and there are roughly 300 million (ie. 300 * 1000,000
numbers approx.). So there are 600*1000,000 number which are not in
the file and we are asked to return any one these 600*1000,000 numbers
given the memory constraints. Moreover the numbers are in random
order. "

And in the first email:

"Find a 9-digit number that isn't present in the file."

The description is ambiguous. But I think you are right. My
interpretation was wrong.

On Fri, Jun 24, 2011 at 1:06 AM, Dave <[email protected]> wrote:
> @Douglas: I was assuming that the file contains the numbers in ASCII,
> and that normal file buffering is being used. If you think that I need
> to include the buffers in the given storage, then fine: just use half
> for buffers and half for the array of counters. The only change to the
> algorithm would be to replace 1000000 everywhere with 500000.
>
> I find your description below a bit confusing, where you say that you
> will search for the number sequentially. What number are you searching
> for? The problem is to find a number that is not in the file, not to
> check to see if a given number is not in the file. Suppose that the
> file contains account numbers; the problem is to find an unused
> account number. So how do you search sequentially through a file to
> find a number that is not in the file? If I have misinterpreted what
> you wrote, please present your entire algorithm.
>
> Dave
>
> On Jun 23, 10:18 pm, Douglas Diniz <[email protected]> wrote:
>> Dave, I think you method is very slow. If you have a array of 1000000
>> of 16 bits integers this mean that all the Ram is in use, and you will
>> need at least 300x10^6 disk access, because you will need the read the
>> numbers one by one from the file, and this will take a LOT of time.
>> And your method use the module operation % for every number, which is slow.
>> I think the basic algorithm of read blocks of 2MB to Ram and searching
>> for the number sequentially until you find the number or reach the end
>> of file, is some order of magnitude better, because you read the
>> blocks linearly from the disk, which is much, much better than read
>> one by one, and you dont have to calculate the %.
>>
>>
>>
>>
>>
>> On Thu, Jun 23, 2011 at 11:24 PM, Dave <[email protected]> wrote:
>> > @Atul: Here is one way to do it:
>>
>> > Treat the array as a million 16-bit integers: short a[1000000].
>>
>> > Set the array to zero.
>>
>> > Read through the file, and for each number k, increment a[k%1000000].
>> > Since the numbers are distinct, there can be at most 900 numbers in
>> > each bin.
>>
>> > Find j such that a[j] < 900. You now know that there is a missing
>> > number with the six-digit number j as the final six digits.
>>
>> > Set a[100] to a[999] to zero.
>>
>> > Read through the array again. Ignore any number k whose last six
>> > digits are not j. For the remaining numbers, set a[k/1000000] = 1.
>>
>> > Find i, with 100 <= i <= 999, such that a[i]==0. This gives you the
>> > first three digits of a missing number.
>>
>> > Put the first thee digits together with the last six digits: k =
>> > 1000000*i + j to get a number that is not in the file.
>>
>> > Dave
>>
>> > On Jun 23, 7:59 am, atul purohit <[email protected]> wrote:
>> >> Hi,
>>
>> >> Can someone explain me how to do this.
>>
>> >> Given a file containing roughly 300 million 9-digit numbers. Find a 
>> >> 9-digit
>> >> number that isn't present in the file.You have unlimited drive space but
>> >> only 2MB of RAM available.
>>
>> >> I suppose we have to use bit-vectors n all but still couldnt figure out a
>> >> proper way. A few steps or an algo will be great.
>>
>> >> Cheers,
>> >> Atul
>>
>> > --
>> > You received this message because you are subscribed to the Google Groups 
>> > "Algorithm Geeks" group.
>> > To post to this group, send email to [email protected].
>> > To unsubscribe from this group, send email to 
>> > [email protected].
>> > For more options, visit this group 
>> > athttp://groups.google.com/group/algogeeks?hl=en.
>>
>> --
>> -------------------------------------------------------------------
>> Douglas Gameiro Diniz
>> Engenheiro de Computação - 2003 - UNICAMP
>>
>> Mobile: (19) 92158777
>> Gtalk: dgdiniz
>> Msn: [email protected] Hide quoted text -
>>
>> - Show quoted text -
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Algorithm Geeks" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at 
> http://groups.google.com/group/algogeeks?hl=en.
>
>



-- 
-------------------------------------------------------------------
Douglas Gameiro Diniz
Engenheiro de Computação - 2003 - UNICAMP

Mobile: (19) 92158777
Gtalk: dgdiniz
Msn: [email protected]

-- 
You received this message because you are subscribed to the Google Groups 
"Algorithm Geeks" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/algogeeks?hl=en.

Reply via email to