[GitHub] [druid] benkrug commented on a diff in pull request #13671: docs: clarify Java precision

GitBox Wed, 18 Jan 2023 11:26:55 -0800


benkrug commented on code in PR #13671:
URL: https://github.com/apache/druid/pull/13671#discussion_r1073979796



##########
docs/querying/sql-operators.md:
##########
@@ -37,6 +37,10 @@ sidebar_label: "Operators"
 
 Operators in [Druid SQL](./sql.md) typically operate on one or two values and 
return a result based on the values. Types of operators in Druid SQL include 
arithmetic, comparison, logical, and more, as described here. 
 
+When performing math operations, Druid uses the integer datatype unless there 
are double or float values. If double or float values are involved, Druid uses 
double. Note that the highest precision way to store digits in Druid are 64-bit 
integers (long) or 64-bit floats (double). In essence, a double can represent 
52 binary digits, so Druid may return incorrect results for doubles if the 
value exceeds 2^52.
+
+For more information about how Java handles primitive data types and how it 
may impact the results you get, see [Primitive data types in Java are a matter 
of 
precision](https://blogs.oracle.com/javamagazine/post/java-primitive-datatypes-int-float-double).

Review Comment:
   The commit is a bit off still.  According to [current 
docs](https://druid.apache.org/docs/latest/querying/sql-data-types.html#standard-types),
 druid has long, float and double.  "float" is a 32-bit float, and "double" is 
a 64-bit float.
   
   I suggested some references because the math of float precision is beyond 
the scope of druid docs, imo.  The main points are that longs can store up to 
2^63 accurately (the current commit says doubles, that should be longs, iiuc), 
and floats and doubles use 32-bit and 64-bit floating point.  Any floating 
point storage format will have variable precision depending on the size of the 
numbers.  (See the linked URLs in my earlier comment.)  Floating point 
precision is really complicated and mathematical, beyond the scope of druid 
docs imo (again), and it's a general condition in software.  Just saying that  
"float" and "double" both use floating point is the main point. 
   
   We could get in to the bounds for integers being represented exactly in 
floats, but that ignores decimals, which is probably the point of using a 
float.  Maybe we can also add a comment along the lines that if exact decimal 
values are needed, and you need, eg, 3 decimal places, you can store the number 
multiplied by 1000 as long, and divide again when querying.  This will be 
exact, up to the min and max values for longs.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [druid] benkrug commented on a diff in pull request #13671: docs: clarify Java precision

Reply via email to