Hello Sergey,
On 05/15/2017 04:45 PM, Sergey Petrunia wrote: > Hi Alexander, > > I was looking at > > https://jira.mariadb.org/browse/MDEV-11893 > > and found this piece of code: > > bool Item_str_func::fix_fields(THD *thd, Item **ref) > { > bool res= Item_func::fix_fields(thd, ref); > /* > In Item_str_func::check_well_formed_result() we may set null_value > flag on the same condition as in test() below. > */ > maybe_null= maybe_null || thd->is_strict_mode(); > return res; > } > > This has been introduced by this cset: > > https://github.com/MariaDB/server/commit/af22eb35e577ef17226faf662f2cffc4705bde26 > > Which says: > > Add Item_str_func::fix_fields() implementation, and set maybe_null to > TRUE if we are in the SQL mode that requires some functions to return > null even if they normally do not. > > The patch has only one example of CHAR() function doing that. > > At the same time, not having NOT-NULL attribute disallows query opimizations > (see MDEV-11893 for an example). > I think we should have this > > maybe_null= maybe_null || thd->is_strict_mode(); > > logic on a case-by-case basis only. I wanted to check with you - are there > any > other known string functions that, as the patch puts it "return null even if > they > normally do not"? The idea behind this code is to return: - A well-formed result, or NULL in strict mode. - A well-formed result, or a fixed result in non-strict mode. If val_str() sees that some bad byte sequences were fixed to question marks, then it returns NULL in strict mode, or the fixed string in non-strict mode. There are more functions except CHAR() that can return NULL or not-NULL depending on strict mode. The most important is Item_func_conv_charset. It's val_str() uses String_copier_for_item::copy_with_warn(), whose result depends on strict mode. The important thing about Item_func_conv_charset is that it can be automatically generated during fix_fields() whenever a character set conversion is needed. See Item::safe_charset_converter(). So it can appear even if it did not exists in the query typed by the user. For example: CREATE OR REPLACE TABLE t1 (a VARBINARY(1)); INSERT INTO t1 VALUES (0xFF); SELECT CONCAT(_utf8 'a' COLLATE utf8_general_ci, a) FROM t1; The SELECT query is automatically replaced to: SELECT CONCAT(_utf8 'a' COLLATE utf8_general_ci, CONVERT(a USING utf8)) FROM t1; For every Item_func_xxx it is possible to track down recursively during fix_fields() if the result is going to be well-formed for all possible argument values. In many cases it will also depend on the current @@character_set_connection value. I'm not sure how much efforts a precise solution will need. But in the example in MDEV-11893, it's clear that the result is going to be well-formed, because: - A datetime field converted to string always returns a result with ASCII repertoire, which is well-formed. - A string literal consisting of ASCII characters is also well-formed - A concatenation of two ASCII-repertoire results is always well-formed, even if character set conversion is needed. We could start with checking repertoire. If it's MY_REPERTOIRE_ASCII, then the result should be well-formed. It will at least cover the cases like MDEV-11893. > > BR > Sergei > _______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : [email protected] Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp

