Thank you for the replies. I got the point now. :)

Regards,
Panshul

On Thu, Feb 28, 2013 at 11:08 AM, Prasanth J <[email protected]> wrote:

> Sorry, I was confused with RandomSampleLoader which uses reservoir
> sampling.
> SAMPLE is rewritten to filter + less than expression with sampling
> percentage as predicate value.
>
> Thanks
> -- Prasanth
>
> On Feb 28, 2013, at 5:01 AM, Gianmarco De Francisci Morales
> <[email protected]> wrote:
>
> > Hi,
> > LIMIT takes the first X records, so there are no statistical guarantees.
> > SAMPLE takes X% of the records from the whole bag (uniformly), so you
> > have statistical guarantees.
> > No, SAMPLE does not use reservoir sampling.
> >
> > Cheers,
> >
> > --
> > Gianmarco
> >
> >
> > On Wed, Feb 27, 2013 at 12:23 AM, Prasanth J <[email protected]>
> > wrote:
> >
> >> AFAIK, SAMPLE operator internally uses reservoir sampling. So it reads
> >> entire data to randomly generate 10% data.
> >>
> >> Thanks
> >> -- Prasanth
> >>
> >> On Feb 26, 2013, at 6:19 PM, Panshul Whisper <[email protected]>
> >> wrote:
> >>
> >>> Hello,
> >>>
> >>> Can somebody please explain me the difference between Limit and Sample
> >>> statements.
> >>> Does it read the entire input file in case of Sample if the value is
> >>> set to 0.1 or it reads randomly only till 10% of the data has been
> >>> collected.
> >>>
> >>> Thanking You for any help.
> >>>
> >>> --
> >>> Regards,
> >>> Ouch Whisper
> >>> 010101010101
> >>
> >>
> >

--
Regards,
Ouch Whisper
010101010101
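
For anyone reading this thread later, here is a rough Python sketch of the three behaviours discussed above. It is not Pig's implementation: the function names (limit, sample, reservoir_sample, counting) are made up for illustration, and the only thing taken from the thread is the description of SAMPLE as a filter that keeps a record when a random draw is less than the sampling fraction. Under that description the filter has to be evaluated on every record, which would mean the whole input is read.

```python
import random
from itertools import islice


def counting(records, counter):
    """Hypothetical helper: yield records and count how many were read."""
    for r in records:
        counter[0] += 1
        yield r


def limit(records, n):
    """Like LIMIT as described above: take the first n records, no randomness."""
    return list(islice(records, n))


def sample(records, fraction, rng):
    """Like SAMPLE as described above: a per-record filter that keeps a
    record when a uniform random draw is less than the fraction."""
    return [r for r in records if rng.random() < fraction]


def reservoir_sample(records, k, rng):
    """Classic reservoir sampling (the technique attributed to
    RandomSampleLoader in the thread): exactly k records, one pass."""
    reservoir = []
    for i, r in enumerate(records):
        if i < k:
            reservoir.append(r)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = r
    return reservoir


rng = random.Random(0)  # fixed seed so the run is repeatable

read = [0]
first_ten = limit(counting(range(1000), read), 10)
print("limit read", read[0], "records")

read = [0]
about_ten_percent = sample(counting(range(1000), read), 0.1, rng)
print("sample read", read[0], "records and kept", len(about_ten_percent))
```

Note that the filter-style sample returns roughly 10% of the records rather than an exact count, while reservoir sampling returns exactly k records. Whether Pig's LIMIT stops reading early the way this sketch does is not stated in the thread.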
