Re: Re[2]: Compute the top 100 million in the total 10 billion data efficiently.

Klausen Schaefersinho Wed, 22 Jan 2014 02:03:08 -0800

Hi,

You can have a look at "Knowledge Discovery from Data Streams" by Joao
Gama. It gives a very good and solid introduction to the topic of stream
mining.


Regards,

Klaus



On Wed, Jan 22, 2014 at 10:35 AM, Ted Dunning <[email protected]> wrote:

>
> On Tue, Jan 21, 2014 at 7:31 AM, <[email protected]> wrote:
>
>> You mentioned an approximate algorithm. That's great! I will check it out
>> later. But is there a way to calculate it precisely?
>
>
> If you want to select the 1% largest numbers, then you have a few choices.
>
> If you have memory for the full set, you can sort.
>
> If you have room to keep 1% of the samples in memory, you need to do 100
> passes.
>
> If you are willing to accept small errors, then you can do it in a single
> pass.
>
> These trade-offs are not optional; they are theorems.
>
>
>
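
As a rough illustration of the single-pass, small-error option in the quoted
message, here is a minimal sketch that estimates the cutoff value for the top
1% with reservoir sampling. This is a stand-in technique chosen for brevity,
not necessarily the approximate algorithm mentioned earlier in the thread. The
function name approx_top_fraction_cutoff and its parameters are hypothetical.

```python
import random


def approx_top_fraction_cutoff(stream, fraction=0.01, sample_size=10_000, seed=0):
    """Estimate the value above which roughly `fraction` of the stream lies.

    Hypothetical helper: keeps a uniform random sample of at most
    `sample_size` items (reservoir sampling, Algorithm R) in one pass,
    then reads the (1 - fraction) quantile off the sorted sample.
    """
    rng = random.Random(seed)
    reservoir = []
    for n, x in enumerate(stream, start=1):
        if len(reservoir) < sample_size:
            # Fill the reservoir with the first `sample_size` items.
            reservoir.append(x)
        else:
            # Item n replaces a random slot with probability sample_size / n.
            j = rng.randrange(n)
            if j < sample_size:
                reservoir[j] = x
    if not reservoir:
        raise ValueError("empty stream")
    reservoir.sort()
    # Position of the (1 - fraction) quantile within the sample.
    idx = min(len(reservoir) - 1, int(len(reservoir) * (1 - fraction)))
    return reservoir[idx]


if __name__ == "__main__":
    cutoff = approx_top_fraction_cutoff(range(200_000))
    print("estimated cutoff for the top 1%:", cutoff)
```

The single pass only yields an estimate of the cutoff; collecting the values
above it means filtering the data against that cutoff afterwards. A larger
sample_size reduces the error at the cost of more memory.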
