Hi Julian,

Thank you for your answer.
Best regards,
Cancai Cai

Julian Hyde <jhyde.apa...@gmail.com> wrote on Mon, Jun 10, 2024 at 05:38:

> I believe that there are many overlapping problems, many of which stem from
> the fact that each DBMS has its own semantics.
>
> In order to emulate a particular DBMS, we need to understand its semantics,
> document those semantics by writing tests, and then follow them (if Calcite
> is in a mode that should follow those semantics).
>
> In my opinion, some DBMSs overuse binary floating-point types (which are
> often lossy when converting integer or decimal literals). Ideally, Calcite
> would never convert to a binary floating-point type unless specifically
> asked to do so. But if we are emulating another DBMS, maybe we would need
> to do that.
>
> By default, Calcite's numeric literals are arbitrary-precision decimal
> values. But some recent threads/issues have convinced me that we might
> need arbitrary-precision binary numeric literals in some cases (e.g. after
> applying constant reduction).
>
> > On Jun 9, 2024, at 12:21 PM, Cancai Cai <caic68...@gmail.com> wrote:
> >
> > Hello Calcite community,
> >
> > I would like to ask a question, though I am not sure whether it is
> > appropriate to ask it in the Calcite community. It is mainly about how
> > databases handle numeric types.
> >
> > [10:53:45]TiDB root:test> explain select id = 8145308033243873280 from test;
> > +---------------------+----------+-----------+--------------------------+------------------------------------------------------------------------+
> > | id                  | estRows  | task      | access object            | operator info                                                          |
> > +---------------------+----------+-----------+--------------------------+------------------------------------------------------------------------+
> > | Projection_3        | 10000.00 | root      |                          | eq(cast(test.test.id, double BINARY), 8.145308033243873e+18)->Column#3 |
> > | └─IndexReader_7     | 10000.00 | root      |                          | index:IndexFullScan_6                                                  |
> > |   └─IndexFullScan_6 | 10000.00 | cop[tikv] | table:test, index:id(id) | keep order:false, stats:pseudo                                         |
> > +---------------------+----------+-----------+--------------------------+------------------------------------------------------------------------+
> > 3 rows in set
> > Time: 0.004s
> >
> > [10:54:08]TiDB root:test> select cast('8145308033243873281' as double) ;
> > +---------------------------------------+
> > | cast('8145308033243873281' as double) |
> > +---------------------------------------+
> > | 8.145308033243873e+18                 |
> > +---------------------------------------+
> > 1 row in set
> > Time: 0.008s
> >
> > [10:54:15]TiDB root:test> select cast('8145308033243873280' as double) ;
> > +---------------------------------------+
> > | cast('8145308033243873280' as double) |
> > +---------------------------------------+
> > | 8.145308033243873e+18                 |
> > +---------------------------------------+
> > 1 row in set
> > Time: 0.003s
> >
> > As shown above, the underlying database converts some numeric values to
> > the double type by default, but double is not compatible with the
> > bigdecimal and decimal types. Why is this?
> >
> > This behavior exists in both MySQL
> > <https://dev.mysql.com/doc/refman/8.0/en/type-conversion.html> (I
> > reproduced the above example in MySQL) and Spark. For example, Spark's
> > map_contains_key function also converts numeric types to double for
> > comparison.
> >
> > I am not a database developer, and I am curious about where the
> > difficulty lies here. If someone can explain it to me, I will be very
> > grateful and happy to learn something new.
> >
> > Best wishes,
> > Cancai Cai
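
To see why the two ids in the quoted example collapse to the same double,
here is a minimal sketch in plain Java (this is not Calcite or TiDB code,
and the class name is purely illustrative). A double has a 53-bit
significand, so between 2^62 and 2^63 adjacent representable values are
1024 apart; an arbitrary-precision decimal, which is what Calcite uses for
numeric literals by default, keeps the two ids distinct:

    import java.math.BigDecimal;

    public class DoublePrecisionDemo {
      public static void main(String[] args) {
        long a = 8145308033243873280L;
        long b = 8145308033243873281L;

        // Both longs round to the same double: near 8.1e18 the gap between
        // adjacent doubles is 2^(62-52) = 1024.
        System.out.println((double) a);               // 8.145308033243873E18
        System.out.println((double) b);               // 8.145308033243873E18
        System.out.println((double) a == (double) b); // true -> the ids collide

        // An arbitrary-precision decimal keeps every digit, so the same
        // comparison stays exact.
        BigDecimal da = new BigDecimal("8145308033243873280");
        BigDecimal db = new BigDecimal("8145308033243873281");
        System.out.println(da.compareTo(db) == 0);    // false
      }
    }

This matches the TiDB plan above: once the plan casts test.test.id to
double, both literals become 8.145308033243873e+18 and the eq() comparison
can no longer tell them apart.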