Re: [E] Re: Consequences of sampling before analyzing data with DataSketches

2020-11-19 Thread leerho
Works for me now :) On Thu, Nov 19, 2020 at 9:10 AM Will Lauer wrote: > Lee, That link looks like it's working for me now. Must have been a temporary server error. > Will

Re: [E] Re: Consequences of sampling before analyzing data with DataSketches

2020-11-19 Thread Will Lauer
Lee, That link looks like it's working for me now. Must have been a temporary server error. Will Lauer, Senior Principal Architect, Audience & Advertising Reporting

Re: Consequences of sampling before analyzing data with DataSketches

2020-11-19 Thread Justin Thaler
Hi Lee, I guess you mean the link to the paper on sketching subsampled data? That's strange, it's working for me. Anyway, here is more information if anyone wants to access it. The paper is entitled "Space-Efficient Estimation of Statistics over Sub-Sampled Streams" by McGregor, Pavan,

Re: Consequences of sampling before analyzing data with DataSketches

2020-11-19 Thread leerho
Hi Justin, the site you referenced returns a 500 error (internal server error). It might be down or out of service. You might also check that it is the correct URL. Thanks! Lee. On Thu, Nov 19, 2020 at 6:05 AM Justin Thaler wrote: > I think the way to think about this is the

Re: Consequences of sampling before analyzing data with DataSketches

2020-11-19 Thread Justin Thaler
I think the way to think about this is the following. If you downsample and then sketch, there are two sources of error: sampling error and sketching error. The former refers to how much the answer to your query over the sample deviates from the answer over the original data, while the second
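To make the two error sources concrete, here is a minimal sketch in plain Python. It does not use the DataSketches library: the tiny Count-Min table, the helper names `count_min_estimate` and `error_breakdown`, the synthetic data, and the 10% sample rate are all assumptions chosen for illustration. The query is "what fraction of items equal a given key", answered three ways: exactly over the full data, exactly over the sample, and approximately over the sample.

```python
import random


def count_min_estimate(items, key, width=64, depth=4):
    """Estimate how often `key` occurs in `items` with a tiny Count-Min table.

    Illustrative stand-in for a real sketch; not the DataSketches API.
    """
    table = [[0] * width for _ in range(depth)]
    for x in items:
        for row in range(depth):
            table[row][hash((row, x)) % width] += 1
    # Count-Min never undercounts: collisions can only inflate a cell.
    return min(table[row][hash((row, key)) % width] for row in range(depth))


def error_breakdown(data, key, sample_rate, seed=0):
    """Split the total error of 'downsample, then sketch' into its two parts."""
    rng = random.Random(seed)
    sample = [x for x in data if rng.random() < sample_rate]

    true_frac = data.count(key) / len(data)        # answer over the original data
    sample_frac = sample.count(key) / len(sample)  # exact answer over the sample
    sketch_frac = count_min_estimate(sample, key) / len(sample)

    return {
        "sampling_error": sample_frac - true_frac,
        "sketching_error": sketch_frac - sample_frac,
        "total_error": sketch_frac - true_frac,
    }


# Synthetic data (an assumption): 20,000 integers drawn uniformly from 0..999.
data_rng = random.Random(42)
data = [data_rng.randint(0, 999) for _ in range(20000)]
result = error_breakdown(data, key=7, sample_rate=0.1)
print(result)
```

The total error is the sum of the two parts, so shrinking the sketching error (a larger table) does nothing about the sampling error, and vice versa.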

Re: Consequences of sampling before analyzing data with DataSketches

2020-11-19 Thread Sergio Castro
Thanks a lot for your answers to my first question, Lee and Justin. Justin, regarding this observation: "*All of that said, the library will not be able to say anything about what errors the user should expect if the data is pre-sampled, because in such a situation there are many factors that are