Actually, the earliest paper that solves the distinct_n estimation
 problem in 1 pass is the following:

    "Estimating simple functions on the union of data streams"
    by Gibbons and Tirthapura, SPAA 2001.

 The above paper addresses a more difficult problem (1 pass
 _and_ a distributed setting).

 Gibbons' follow-up paper in VLDB 2001 limits the problem to a
 single machine and contains primarily experimental results (for
 a database audience). The algorithmic breakthrough had already been
 accomplished in the SPAA paper.


 Gurmeet Singh Manku                      Google Inc.    (650) 967 1890

