[ 
https://issues.apache.org/jira/browse/DATASKETCHES-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305031#comment-17305031
 ] 

Lee Rhodes commented on DATASKETCHES-10:
----------------------------------------

??Why not always double???
 * The size of a sketch is very important for many (if not most) of our users. 
For applications that are processing millions to billions of sketches (and this 
is not an exaggeration!), a 2X increase in memory or stored size would be a 
huge deal.
 * Sketches are approximate algorithms, and for many applications the 7 or so 
digits of precision that a float provides already exceed the accuracy of the 
sketch itself.  
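
To put rough numbers on the size argument, here is a minimal sketch in plain Java. The class name, the sketch count and the number of retained items per sketch are all hypothetical, and the estimate ignores headers and any other per-sketch overhead, so it only illustrates the 2X ratio and is not a statement about any real sketch's serialized size.

```java
public class SketchSizeEstimate {

    // Bytes needed to hold the retained items of many sketches.
    // All three inputs are illustrative assumptions, not library values.
    static long totalBytes(long sketchCount, int retainedItemsPerSketch, int bytesPerItem) {
        return sketchCount * retainedItemsPerSketch * bytesPerItem;
    }

    public static void main(String[] args) {
        long sketchCount = 1_000_000_000L; // hypothetical: one billion sketches
        int retainedItems = 500;           // hypothetical items retained per sketch

        long asFloats = totalBytes(sketchCount, retainedItems, Float.BYTES);
        long asDoubles = totalBytes(sketchCount, retainedItems, Double.BYTES);

        System.out.println("float items:  " + asFloats + " bytes");
        System.out.println("double items: " + asDoubles + " bytes");
    }
}
```

Under these assumptions the item storage alone grows from about 2 TB to about 4 TB, which is the kind of increase the first point refers to.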

Of course there are applications where the IEEE 754 32-bit float is not 
sufficient, and that is why we also have the original Quantiles Sketch 
([https://datasketches.apache.org/docs/Quantiles/OrigQuantilesSketch.html]), 
which has a concrete _doubles_ implementation as well as a _generic_ 
implementation.  
Why don't you consider that?

Lee.

> Double precision by default?
> ----------------------------
>
>                 Key: DATASKETCHES-10
>                 URL: https://issues.apache.org/jira/browse/DATASKETCHES-10
>             Project: Apache Datasketches
>          Issue Type: Improvement
>            Reporter: Jan Prach
>            Priority: Major
>
> Would it make sense to use double (instead of float) for all sketches by 
> default?
> It would take (less than 2x) more memory, have the same speed, have twice the 
> storage. Or even the same storage if one is fine with the float precision. 
> Most importantly it would be far more useful.
> I'm trying to build a generic profiler. In the first simple dataset there were a 
> couple of date and timestamp columns. The obvious choice is to convert them 
> to epoch seconds. I spent a full day on weird messages only to realize that 
> KllFloatsSketch, ReqSketch, etc. are all based on floats. That means 24-bit 
> precision. But epoch seconds today are 31-bit numbers.
> Why not always double?
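
The precision gap described in the issue (a 24-bit float significand versus 31-bit epoch seconds) can be checked with plain Java, without any sketch library. The following is a minimal illustration only: the class name, method names and the sample timestamp are hypothetical, and the casts merely mimic what happens to a value when it is stored as a float or as a double.

```java
public class EpochFloatPrecision {

    // Round-trips epoch seconds through a 32-bit float (24-bit significand).
    static long roundTripThroughFloat(long epochSeconds) {
        return (long) (float) epochSeconds;
    }

    // Round-trips epoch seconds through a 64-bit double (53-bit significand).
    static long roundTripThroughDouble(long epochSeconds) {
        return (long) (double) epochSeconds;
    }

    public static void main(String[] args) {
        long epochSeconds = 1_616_000_001L; // hypothetical timestamp from March 2021

        // Spacing between adjacent floats at this magnitude, in seconds.
        System.out.println("float spacing: " + Math.ulp((float) epochSeconds) + " s");

        System.out.println("original:   " + epochSeconds);
        System.out.println("via float:  " + roundTripThroughFloat(epochSeconds));
        System.out.println("via double: " + roundTripThroughDouble(epochSeconds));
    }
}
```

For values between 2^30 and 2^31, adjacent floats are 128 apart, so timestamps stored as floats are rounded to a grid of roughly two minutes, while the double round trip returns the original value.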



