Actually, the earliest paper that solves the distinct_n estimation
 problem in 1 pass is the following:

    "Estimating simple functions on the union of data streams"
    by Gibbons and Tirthapura, SPAA 2001.
    http://home.eng.iastate.edu/~snt/research/streaming.pdf

 The above paper addresses a more difficult problem (1 pass
 _and_ a distributed setting).


 Gibbon's followup paper in VLDB 2001 limits the problem to a 
 single machine and contains primarily experimental results (for
 a database audience). The algorithmic breakthrough had already been
 accomplished in the SPAA paper.

 Gurmeet

-- 
 ----------------------------------------------------
 Gurmeet Singh Manku                      Google Inc.
 http://www.cs.stanford.edu/~manku    (650) 967 1890
 ----------------------------------------------------


---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Reply via email to