[jira] [Commented] (MESOS-1187) precision errors with allocation calculations
[ https://issues.apache.org/jira/browse/MESOS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14997089#comment-14997089 ]

Jie Yu commented on MESOS-1187:
-------------------------------

[~klaus1982], as [~idownes] suggested below, the true fix for this problem is to use fixed point (instead of floating point). However, doing that will require changes to the core protobuf definitions, which will make rolling upgrades very difficult. As a short-term fix, I would rather keep the code impact as small as possible.

> precision errors with allocation calculations
> ---------------------------------------------
>
>                 Key: MESOS-1187
>                 URL: https://issues.apache.org/jira/browse/MESOS-1187
>             Project: Mesos
>          Issue Type: Bug
>          Components: allocation, master
>            Reporter: aniruddha sathaye
>            Assignee: Klaus Ma
>
> As allocations are stored/transmitted as doubles, precision errors often creep in.
> We have seen erroneous share calculations happen purely because of floating-point arithmetic.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MESOS-1187) precision errors with allocation calculations
[ https://issues.apache.org/jira/browse/MESOS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995505#comment-14995505 ]

Klaus Ma commented on MESOS-1187:
---------------------------------

[~jieyu], I'd like to find a general approach for all "double equality checking", as this is not the only ticket on the topic, and to provide coding guidance for it in Mesos.
[jira] [Commented] (MESOS-1187) precision errors with allocation calculations
[ https://issues.apache.org/jira/browse/MESOS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994149#comment-14994149 ]

Jie Yu commented on MESOS-1187:
-------------------------------

To be clear, this is just a short-term fix so that we don't end up with CHECK failures in the allocator, right? If that's the case, I vote for option 1 and suggest that we replace CHECK in the allocator with CHECK_DOUBLE_EQ (instead of changing the equality operator implementation).
[jira] [Commented] (MESOS-1187) precision errors with allocation calculations
[ https://issues.apache.org/jira/browse/MESOS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14985702#comment-14985702 ]

Ian Downes commented on MESOS-1187:
-----------------------------------

I'll ask again: why do we use floats and not fixed point? Many resources have an indivisible base unit, e.g., bytes, and for others there's a reasonable, sane limit on divisibility, e.g., for CPU it doesn't make much sense to go finer than milli cpu or even just deci cpu. Floating point is simply not necessary (IIUC) and brings with it a ton of precision issues that we could simply avoid solving!

Rather than litter our code with "equality" checks, let's fix the representation...
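A minimal sketch of the fixed-point idea raised in this comment, assuming CPU is counted in whole milli-CPUs. The names {{MilliCpus}} and {{fromDouble}} are hypothetical and are not Mesos types; the point is only that integer arithmetic makes addition, subtraction, and equality exact once a value has been rounded to the base unit.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical fixed-point CPU quantity (illustrative only, not a Mesos type).
// The value is a whole number of milli-CPUs, so arithmetic on it is exact.
struct MilliCpus {
  std::int64_t value;  // 1 CPU == 1000 milli-CPUs
};

// Convert a double CPU count to milli-CPUs, rounding to the nearest
// milli-CPU. This is the only place where floating point is involved.
inline MilliCpus fromDouble(double cpus) {
  assert(std::isfinite(cpus));
  return MilliCpus{static_cast<std::int64_t>(std::llround(cpus * 1000.0))};
}

inline MilliCpus operator+(MilliCpus a, MilliCpus b) {
  return {a.value + b.value};
}

inline MilliCpus operator-(MilliCpus a, MilliCpus b) {
  return {a.value - b.value};
}

// Plain integer equality: no epsilon is needed.
inline bool operator==(MilliCpus a, MilliCpus b) {
  return a.value == b.value;
}
```

With this representation, adding {{fromDouble(0.1)}} twenty times compares equal to {{fromDouble(2.0)}}, because every intermediate value is an integer number of milli-CPUs. The trade-off is that anything finer than one milli-CPU is rounded away at conversion time.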
[jira] [Commented] (MESOS-1187) precision errors with allocation calculations
[ https://issues.apache.org/jira/browse/MESOS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986450#comment-14986450 ]

Adam B commented on MESOS-1187:
-------------------------------

+1 to milli-cpus and fixed point. That's what Google uses internally, and it should be good enough for Mesos too.
[jira] [Commented] (MESOS-1187) precision errors with allocation calculations
[ https://issues.apache.org/jira/browse/MESOS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984289#comment-14984289 ]

Klaus Ma commented on MESOS-1187:
---------------------------------

I just went through {{AlmostEqual}} in googletest for double equality checks (https://github.com/google/googletest/blob/master/googletest/include/gtest/internal/gtest-internal.h#L358); it follows the idea in Bruce Dawson's paper at http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm .

So the proposals for checking whether two doubles are equal in Mesos are:

1. Bring the logic of {{AlmostEqual}} from googletest into Mesos:
   a.) copy the file from googletest (need to confirm its license)
   b.) re-implement it in Mesos
2. Define a "reasonable" epsilon, and check whether the two doubles are close enough by {{fabs(l - r) < epsilon}}; for the "reasonable" epsilon, maybe 0.01 is enough.

[~vi...@twitter.com]/[~jieyu], any comments on the proposal?
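As a rough illustration of option 1, the ULP-based idea from Bruce Dawson's paper can be sketched as below. This is an independent re-implementation written for this note, not the googletest source; the names {{toBiased}} and {{almostEqualUlps}} and the default tolerance of 4 ULPs are assumptions chosen for the example, and it assumes 64-bit IEEE 754 doubles.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Map the IEEE 754 bit pattern of a double onto an unsigned scale on which
// adjacent representable doubles differ by exactly 1. Negative values are
// mirrored below the midpoint, so +0.0 and -0.0 land on the same point.
inline std::uint64_t toBiased(double d) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "this sketch assumes 64-bit doubles");
  std::uint64_t bits;
  std::memcpy(&bits, &d, sizeof bits);
  const std::uint64_t signBit = std::uint64_t{1} << 63;
  return (bits & signBit) ? ~bits + 1 : bits | signBit;
}

// True when a and b are at most maxUlps representable doubles apart.
// NaN never compares equal to anything, including itself.
inline bool almostEqualUlps(double a, double b, std::uint64_t maxUlps = 4) {
  // A tolerance spanning a whole binade or more would be meaningless.
  assert(maxUlps < (std::uint64_t{1} << 52));
  if (std::isnan(a) || std::isnan(b)) {
    return false;
  }
  const std::uint64_t x = toBiased(a);
  const std::uint64_t y = toBiased(b);
  const std::uint64_t distance = x > y ? x - y : y - x;
  return distance <= maxUlps;
}
```

Unlike the fixed {{fabs(l - r) < epsilon}} check in option 2, the tolerance here scales automatically with the magnitude of the operands, at the cost of being harder to read and of depending on the binary layout of doubles.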
[jira] [Commented] (MESOS-1187) precision errors with allocation calculations
[ https://issues.apache.org/jira/browse/MESOS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738337#comment-14738337 ]

Klaus Ma commented on MESOS-1187:
---------------------------------

[~jieyu], do you have any comments on that?
[jira] [Commented] (MESOS-1187) precision errors with allocation calculations
[ https://issues.apache.org/jira/browse/MESOS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738336#comment-14738336 ]

Klaus Ma commented on MESOS-1187:
---------------------------------

Thanks very much for your input :); and you're right about the precision. The following articles give some advice:

http://stackoverflow.com/questions/17333/most-effective-way-for-float-and-double-comparison
http://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm

The following code, from cppreference, will resolve your case. But checking whether two doubles are equal is a complex topic with lots of discussion. I'd suggest committing the following patch first; if any new cases come up, let's handle them case by case.

{code}
#include <cmath>
#include <limits>

bool almost_equal(double x, double y, int ulp = 2)
{
  // the machine epsilon has to be scaled to the magnitude of the values used
  // and multiplied by the desired precision in ULPs (units in the last place)
  return std::fabs(x - y) < std::numeric_limits<double>::epsilon() * std::fabs(x + y) * ulp
    // unless the result is subnormal
    || std::fabs(x - y) < std::numeric_limits<double>::min();
}
{code}
[jira] [Commented] (MESOS-1187) precision errors with allocation calculations
[ https://issues.apache.org/jira/browse/MESOS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739120#comment-14739120 ]

Felix Abecassis commented on MESOS-1187:
----------------------------------------

You are right that it can get extremely complex. We are not doing scientific computing here so I don't think we need to go to such depths as computing ULPs and handling subnormals :). I think a simple relative epsilon comparison will do the trick just fine. For instance, the function {{AlmostEqualRelative}} in:
https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
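For readers who don't want to follow the link, a relative epsilon comparison of the kind suggested here might look like the sketch below. It is written for doubles, the name {{almostEqualRelative}} is illustrative, and the default tolerance of {{1e-9}} is an assumption picked for the example rather than a value proposed anywhere in this ticket.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Relative comparison: the allowed difference is a fraction (maxRelDiff) of
// the larger of the two magnitudes, so the tolerance scales with the inputs.
inline bool almostEqualRelative(double a, double b, double maxRelDiff = 1e-9) {
  assert(maxRelDiff > 0.0);
  const double diff = std::fabs(a - b);
  const double largest = std::max(std::fabs(a), std::fabs(b));
  return diff <= largest * maxRelDiff;
}
```

One known limitation of a purely relative check is behaviour near zero: a tiny non-zero value never compares equal to exactly {{0.0}}, since the tolerance shrinks along with the operands. If that case matters, an additional absolute tolerance is usually combined with the relative one.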
[jira] [Commented] (MESOS-1187) precision errors with allocation calculations
[ https://issues.apache.org/jira/browse/MESOS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736243#comment-14736243 ]

Klaus Ma commented on MESOS-1187:
---------------------------------

A patch was uploaded to https://reviews.apache.org/r/38201/

[~jieyu], would you help review it?
[jira] [Commented] (MESOS-1187) precision errors with allocation calculations
[ https://issues.apache.org/jira/browse/MESOS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737312#comment-14737312 ]

Felix Abecassis commented on MESOS-1187:
----------------------------------------

Being new to Mesos, I don't understand all the details of this bug, but are you aware of the pitfalls of using {{numeric_limits<double>::epsilon()}}? This value (aka {{FLT_EPSILON}} or {{DBL_EPSILON}}) only makes sense when comparing values between 1. and 2.; even in this range, it's a very strict comparison.

For instance, try the following code:

{code}
#include <cmath>
#include <iostream>
#include <limits>

int main()
{
  double r1 = 0;
  for (int i = 0; i < 20; ++i)
    r1 += 0.1;
  double r2 = 0.1 * 20;

  std::cout.precision(20);
  std::cout << r1 << " " << r2 << std::endl;

  if (r1 == r2)
    std::cout << "it's equal" << std::endl;

  std::cout << std::fabs(r1 - r2) << " "
            << std::numeric_limits<double>::epsilon() << std::endl;

  // This condition is *NOT* true
  if (std::fabs(r1 - r2) < std::numeric_limits<double>::epsilon())
    std::cout << "it's almost equal." << std::endl;

  // This condition will NOT trigger
  if (r1 <= r2 && r1 >= r2)
    std::cout << "it's equal too" << std::endl;
}
{code}
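The reason the machine epsilon is only meaningful between 1 and 2 is that it is defined as the gap between 1.0 and the next representable double, and that gap doubles with every power of two. A small sketch that measures the gap directly is shown below; the helper name {{ulpAt}} is made up for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Gap between x and the next representable double above it (one ULP at x).
// Restricted to finite, non-negative inputs to keep the sketch simple.
inline double ulpAt(double x) {
  assert(std::isfinite(x) && x >= 0.0);
  return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}
```

For any {{x}} in the range [1, 2), {{ulpAt(x)}} equals {{numeric_limits<double>::epsilon()}}; at 2.0 it is twice that, and at 1024.0 it is 1024 times that. A fixed comparison against the raw epsilon is therefore tighter than a single rounding step for any value of 2.0 or above, which is why the snippet above reports the two sums as not "almost equal".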
[jira] [Commented] (MESOS-1187) precision errors with allocation calculations
[ https://issues.apache.org/jira/browse/MESOS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734427#comment-14734427 ]

Klaus Ma commented on MESOS-1187:
---------------------------------

As [~jieyu] said, we can NOT use == for doubles; generally, we check equality with {{abs(a - b) < eps}}. I'll draft a fix for that.
[jira] [Commented] (MESOS-1187) precision errors with allocation calculations
[ https://issues.apache.org/jira/browse/MESOS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734831#comment-14734831 ]

Klaus Ma commented on MESOS-1187:
---------------------------------

Here is a sample showing how to compare two doubles; I'm going to fix this for {{==/<=}} of Scalar. [~jieyu], [~bmahler], [~vinodkone], any comments on this?

{code}
#include <cmath>
#include <iostream>
#include <limits>

int main()
{
  double r1 = 0.1 + 0.1 + 0.1 - 0.1 - 0.1;
  double r2 = 0.1;

  if (r1 == r2)
    std::cout << "it's equal" << std::endl;

  // Only this condition is true
  if (std::fabs(r1 - r2) < std::numeric_limits<double>::epsilon())
    std::cout << "it's almost equal." << std::endl;

  // This condition will NOT trigger
  if (r1 <= r2 && r1 >= r2)
    std::cout << "it's equal too" << std::endl;
}
{code}
[jira] [Commented] (MESOS-1187) precision errors with allocation calculations
[ https://issues.apache.org/jira/browse/MESOS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600257#comment-14600257 ]

Jie Yu commented on MESOS-1187:
-------------------------------

This is a simple test to reproduce:
https://reviews.apache.org/r/35849/