Re: [Maria-discuss] Collations, trailing spaces and unique indexes

2016-03-11 Thread Binarus
> Hi Binarus,  
> 
> I see 3 options here: 
> 
> 1) MariaDB captains can help you implementing this, we always welcome 
> contributions and Alexander already point you where to start.
> 
> 2) You really need this feature and can switch to MariaDB, please contact one 
> of our sale reps to quote you some NRE work . 
>  
> 3) Captains fill this is a generic issue that need to be fixed for the 
> benefit of all, but also prioritise it on other tasks they are working on.
> 
> Tell us the time you are ready to wait, How much you can afford to help etc. 
> ..? 
> 
> An other option is that you switch to MariaDB and use VarBinary, that would 
> help our captains not to forget your request! 
> 

I'll try to take option 1) and have a look into the code over the weekend. I 
would be proud to make a small contribution to your project.

But since I never have read any of the sources of MySQL or MariaDB until now, I 
am unsure if I will be able to do anything meaningful to the code. Furthermore, 
I (currently) don't know how to submit patches. But I'll find out ...

Regards,

Binarus


___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-discuss] Collations, trailing spaces and unique indexes

2016-03-11 Thread Binarus
Hello Kristian, hello Alexander,

On 11.03.2016 12:20, Alexander Barkov wrote:
> So we follow the PAD SPACE requirement for default collations here.

I think it's reasonable that the default collations are implemented according 
to the standard. Nevertheless, the standard enforces an exceptionally stupid 
policy here (IMHO), and therefore I think we are in urgent need of additional 
(non-default) collations which solve that problem (and what you have written 
below makes me hope ...).

>> The only workaround I know of is to use VARBINARY instead of VARCHAR. I
>> think it works much the same in most respects. But obviously some semantics
>> is lost when the server no longer is aware of the character set used.
> 
> Correct.
 
I'll go that way, sticking with MySQL, if the problem won't be solved by new 
collations in MariaDB.

>>
>>> Since the index behaviour obviously depends on the collation, would
>>> building an own collation which does not PADSPACE be an option? I have
>>
>> That would be interesting, actually. I don't know what support there is for
>> non-PADSPACE collations. Maybe bar knows (Cc:'ed)?
> 
> We don't have NO PAD collations yet.

Thanks for making this clear; I haven't found something about it in the 
documents. By the way, the document sections which deal with the char data 
types could eventually mention these pitfalls (like the MySQL documentation) :-)

> I don't remember that anybody ever asked for this before.

I wonder why, given the dozens (or hundreds?) of threads and tutorials around 
the web which try to explain how to circumvent these issues (none of them 
provides a general solution besides switching to ...BINARY).
 
> [...]
> Sounds like a few hours of work.

This is very good news. I will now try to give some arguments why this really 
should be done.

1) Avoid programming mistakes

People might be programming for decades, but when not being database experts 
and thrown into a web development project, even those experienced programmers 
won't come to the idea that a database which returns trailing spaces in the 
result set of SELECT statements throws away these trailing spaces when building 
the index or when comparing (e.g. for evaluation the conditions of the WHERE 
CLAUSE). This is completely illogical and dangerous and makes debugging 
substantially more difficult.

2) Developers (and not DBMSs) should be in control of what happens to trailing 
spaces

A DBMS must not silently change data, nor for comparing nor for other purposes. 
It is solely the decision of the application logic if trailing spaces are 
acceptable or desired in data values (possibly entered by users), but not the 
decision of the DBMS. Nothing else than the application logic has to decide 
what to do with trailing spaces (trim them, keep them, reject user input, 
return error messages etc.), and the DBMS has to store, retrieve, compare and 
index the results of the application logic as-is. Period.

However, there is no argument against additional and optional functions or 
configuration flags of the DBMS which allow easy trimming or other (automatic 
and silent) transformation of the values and all sorts of weird compare modes, 
if these are optional. But in the first place, there must also be a mode / 
column type for every sort of data, notably for strings, in which data is 
stored, compared and indexed as-is.

We already have the binary data types which in this respect behave like 
desired, but all string functions fail with binary data values or behave 
surprisingly / impractically, so nearly all string functions (which are very 
important in many applications) are lost for binary data type columns.

3) There are use cases for string data with trailing spaces

I have seen some examples of trailing spaces, some of them useless, but some 
quite meaningful. In every case, the application logic relied on a string with 
trailing space being not equal to the same string without the trailing space. 
In some cases, the developers had not thought about the issues the trailing 
spaces could cause, i.e. their programs were faulty; in the other cases, they 
have put a lot of effort and used the weirdest tricks and index combinations to 
circumvent the problems. I can't remember a case where they have switched to 
...BINARY data types.

Two of the funniest cases (just because it's Friday - skip the next two 
sections if you don't want to laugh):

Some application actually stored an internal status bit at the end of existing 
user input, probably because the programmers were in a hurry or had no 
administrative access to the MySQL server, i.e. they couldn't create a new 
status column in the respective table. Now how could you save a status bit 
within an existing string without making it visible to the users? The answer is 
of course: Append a space at the end of the string to set the status, remove it 
to clear the status. Of course, the respective column had to be UNIQUE in the 
sense that two identical user 

[Maria-discuss] MariaDB replication-manager 0.6.0 release

2016-03-11 Thread Guillaume Lefranc
Hi,

I just released the latest version of my replication manager utility for
MariaDB. It has some tremendous improvements over the last version, so if
you are using it please upgrade, or check it out at
https://github.com/mariadb-corporation/replication-manager

Main features:
- Agentless utility
- Works over standard mysql protocol
- Easy to deploy: single binary for all 64-bit systems
- Failover
- Switchover
- Pre and post scripts
- Seamless interaction with Maxscale, HAProxy, ProxySQL...
- Interactive and non-interactive modes
- Designed to work with MariaDB GTID

Not supported yet by this version, but might be in the future:
- Clustering
- Conf file
- Semi sync support
- Multi-master support
- Positional replication
- Multiple GTID domains

Regards,
Guillaume
___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-discuss] Collations, trailing spaces and unique indexes

2016-03-11 Thread Alexander Barkov
FYI, I have added a new task for this:

https://jira.mariadb.org/browse/MDEV-9711


On 03/11/2016 03:20 PM, Alexander Barkov wrote:
> Hello Binarus, Kristian,
> 
> On 03/11/2016 02:25 PM, Kristian Nielsen wrote:
>> Binarus  writes:
>>
>>> "All MySQL collations are of type PADSPACE. This means that all CHAR,
>>> VARCHAR, and TEXT values in MySQL are compared without regard to any
>>> trailing spaces. “Comparison” in this context does not include the
>>
>> Yes, I have always found this terminally stupid as well. But I think it
>> comes from the SQL standard.
> 
> Probably this comes from here:
> 
>> 2) A  specifies the name of a character set 
>> that is defined by a national or
>> international standard. The character repertoire of CS is defined by the 
>> standard defining the character set
>> identified by that . The default collation of 
>> the character set is defined by
>> the order of the characters in the standard and has the PAD SPACE 
>> characteristic.
> 
> So we follow the PAD SPACE requirement for default collations here.
> 
> I guess the reasoning here is to treat CHAR and VARCHAR in a similar way
> by default. I agree with the standard here. We don't want different
> behavior for CHAR and VARCHAR.
> 
> 
> Btw, we don't follow the requirement that the default collation must be
> defined according to "the order of the characters in the standard".
> Default collations are traditionally case insensitive in MariaDB/MySQL.
> 
>>
>> The only workaround I know of is to use VARBINARY instead of VARCHAR. I
>> think it works much the same in most respects. But obviously some semantics
>> is lost when the server no longer is aware of the character set used.
> 
> Correct.
> 
>>
>>> Since the index behaviour obviously depends on the collation, would
>>> building an own collation which does not PADSPACE be an option? I have
>>
>> That would be interesting, actually. I don't know what support there is for
>> non-PADSPACE collations. Maybe bar knows (Cc:'ed)?
> 
> We don't have NO PAD collations yet.
> I don't remember that anybody ever asked for this before.
> 
> From a glance, this should not be too hard to implement.
> 
> You previously wrote you need utf8.
> So there are three options to make a new collation,
> based on one of these existing collations:
> 
> - utf8_general_ci
> - utf8_unicode_ci
> - utf8_bin
> 
> but with NO PAD characteristics.
> 
> Suppose you need utf8_general_nopad_ci
> (i.e. based on utf8_general_ci)
> 
> 
> utf8_general_ci is implemented in strings/ctype-utf8.c
> 
> A new collation handler should be added, similar to this one:
> 
> static MY_COLLATION_HANDLER my_collation_utf8_general_ci_handler =
> {
> NULL,   /* init */
> my_strnncoll_utf8_general_ci,
> my_strnncollsp_utf8_general_ci,
> my_strnxfrm_unicode,
> my_strnxfrmlen_unicode,
> my_like_range_mb,
> my_wildcmp_utf8,
> my_strcasecmp_utf8,
> my_instr_mb,
> my_hash_sort_utf8,
> my_propagate_complex
> };
> 
> 
> but these three virtual functions must be redefined to new
> similar functions that do not ignore trailing spaces:
> 
> - my_strnncollsp_utf8_general_ci - this is used for BTREE indexes
> - my_hash_sort_utf8  - this is used for HASH indexes
> - my_strnxfrm_unicode- this is used for filesort
>(non-indexed ORDER BY)
> 
> All other functions can be reused from the existing PAD SPACE collation.
> 
> 
> 
> Then a new "struct charset_info_st" should be defined,
> similar to my_charset_utf8_general_ci.
> 
> Sounds like a few hours of work.
> 
> 
> But then thorough testing will be needed,
> which will be the most time consuming part.
> 
> 
> Also, for consistency, it's worthy to implement
> at least utf8_nopad_bin and utf8_unicode_nopad_ci at once,
> and then eventually NO PAD collations for all other character sets.
> 
>>
>>  - Kristian.
>>

___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


[Maria-discuss] New Question: Will MariaDB add json components like MySQL recently announced

2016-03-11 Thread AskMonty KB
Hello,

A new question has been asked in "What is in the Different MariaDB Releases?" 
by p...@mindkicks.com. Please answer it at 
http://mariadb.com/kb/en/will-mariadb-add-json-components-like-mysql-recently-announced/
 as the person asking the question may not be subscribed to the mailing list.


MySQL has released components specifically for json. As a web developer I need 
to know if MariaDB will also add features for json?


To view or answer this question please visit: 
http://mariadb.com/kb/en/will-mariadb-add-json-components-like-mysql-recently-announced/

___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-discuss] Collations, trailing spaces and unique indexes

2016-03-11 Thread Binarus
On 11.03.2016 13:06, Peter Laursen wrote:
> An idea that may work for you could be to TRIM() - 
> http://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_trim - 
> the original varchar column into a virtual (PERSISTENT) column, and add an 
> index on this column? 
> 
I'm not sure if I understand you correctly, but using TRIM() on that column 
would make the value 'a' the same as the value 'a ' which is exactly how 
MariaDB's indexes already work by default and the thing which I would like to 
avoid. Have I misunderstood something?

Regards,

Binarus



___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-discuss] Collations, trailing spaces and unique indexes

2016-03-11 Thread Binarus
On 11.03.2016 12:59, rhys.campb...@swisscom.com wrote:
> I think I could try this as a last resort, but it would require changes in 
> the existing applications.
> 
> Could do it with a trigger to avoid application changes although that's 
> something I'd probably try to avoid.

Agreed. I want to avoid this, too.
 
> If you can afford it, then you could only index part of your VRAHCAR columns 
> to make space for the extra field, i.e. 
> 
> CREATE INDEX part_of_name ON customer (name(200), str(200), str_length);

The problem with this special application is that a) the column is 
VARCHAR(255), b) there are rows where the column is filled completely, i.e. 
where 255 chars are in the column, c) that the index is a unique one, d) 
254*'a' + ' ' must (index wise) be another value than 254*'a'. Furthermore, the 
changes in the applications which are required for that probably are 
substantial.

I know that this might be a special requirement, but on the other hand please 
see my reply to Alexander Barkov's post where I (hopefully) can explain why 
ignoring these trailing spaces in unique indexes is a faulty design from the 
beginning on, may it be standard or not.

So, even if I could circumvent the problem with this special application by 
some mad index constructions and programming techniques, I still would really 
like to have a clean solution.

Alexander has made me hope that he or one of his fellows might be willing to 
implement this, and I will try to give him arguments for doing so.

Therefore, I now have two options:

1) If it will be implemented in MariaDB, I'll switch our applications from 
MySQL to MariaDB.

2) If it will not be implemented, I will stick with MySQL and switch all 
respective rows in all tables from VARCHAR(255) to VARBINARY(765).

Thank you very much again,

Binarus

___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-discuss] Collations, trailing spaces and unique indexes

2016-03-11 Thread Peter Laursen
An idea that may work for you could be to TRIM() -
http://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_trim
- the original varchar column into a virtual (PERSISTENT) column, and add
an index on this column?


 Peter

On Fri, Mar 11, 2016 at 12:59 PM,  wrote:

> I think I could try this as a last resort, but it would require changes in
> the existing applications.
>
> Could do it with a trigger to avoid application changes although that's
> something I'd probably try to avoid.
>
> If you can afford it, then you could only index part of your VRAHCAR
> columns to make space for the extra field, i.e.
>
> CREATE INDEX part_of_name ON customer (name(200), str(200), str_length);
>
> -Original Message-
> From: Maria-discuss [mailto:maria-discuss-bounces+rhys.campbell=
> swisscom@lists.launchpad.net] On Behalf Of Binarus
> Sent: Friday, March 11, 2016 11:55 AM
> To: maria-discuss@lists.launchpad.net
> Subject: Re: [Maria-discuss] Collations, trailing spaces and unique indexes
>
> On 11.03.2016 10:56, rhys.campb...@swisscom.com wrote:
> > Just a quick idea... Perhaps you can include an additional field in the
> unique index containing the length on the string. So
> >
> > Str,  str_len
> > 'a',  1
> > 'a ', 2
> >
> > Unq index on (str, str_len).
> >
>
> Thanks for the suggestion. I already have read about that idea when
> researching the same problem for MySQL.
>
> I think I could try this as a last resort, but it would require changes in
> the existing applications, and -more difficult- I should find out the
> maximum index length in indexes which contain several columns. As far as I
> know, the maximum index length with InnoDB is 767 bytes, and I already need
> these because my columns are mostly VARCHAR(255), charset UTF8. Now, if I
> add one of those columns to a unique index, thereby using all available
> index bytes, could I add a second column to the same index at all?
>
> If I have to go this way as a last resort, I'll investigate ...
>
> Binarus
>
> ___
> Mailing list: https://launchpad.net/~maria-discuss
> Post to : maria-discuss@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~maria-discuss
> More help   : https://help.launchpad.net/ListHelp
>
> ___
> Mailing list: https://launchpad.net/~maria-discuss
> Post to : maria-discuss@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~maria-discuss
> More help   : https://help.launchpad.net/ListHelp
>
___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-discuss] Collations, trailing spaces and unique indexes

2016-03-11 Thread Alexander Barkov
Hello Binarus, Kristian,

On 03/11/2016 02:25 PM, Kristian Nielsen wrote:
> Binarus  writes:
> 
>> "All MySQL collations are of type PADSPACE. This means that all CHAR,
>> VARCHAR, and TEXT values in MySQL are compared without regard to any
>> trailing spaces. “Comparison” in this context does not include the
> 
> Yes, I have always found this terminally stupid as well. But I think it
> comes from the SQL standard.

Probably this comes from here:

> 2) A  specifies the name of a character set that 
> is defined by a national or
> international standard. The character repertoire of CS is defined by the 
> standard defining the character set
> identified by that . The default collation of 
> the character set is defined by
> the order of the characters in the standard and has the PAD SPACE 
> characteristic.

So we follow the PAD SPACE requirement for default collations here.

I guess the reasoning here is to treat CHAR and VARCHAR in a similar way
by default. I agree with the standard here. We don't want different
behavior for CHAR and VARCHAR.


Btw, we don't follow the requirement that the default collation must be
defined according to "the order of the characters in the standard".
Default collations are traditionally case insensitive in MariaDB/MySQL.

> 
> The only workaround I know of is to use VARBINARY instead of VARCHAR. I
> think it works much the same in most respects. But obviously some semantics
> is lost when the server no longer is aware of the character set used.

Correct.

> 
>> Since the index behaviour obviously depends on the collation, would
>> building an own collation which does not PADSPACE be an option? I have
> 
> That would be interesting, actually. I don't know what support there is for
> non-PADSPACE collations. Maybe bar knows (Cc:'ed)?

We don't have NO PAD collations yet.
I don't remember that anybody ever asked for this before.

From a glance, this should not be too hard to implement.

You previously wrote you need utf8.
So there are three options to make a new collation,
based on one of these existing collations:

- utf8_general_ci
- utf8_unicode_ci
- utf8_bin

but with NO PAD characteristics.

Suppose you need utf8_general_nopad_ci
(i.e. based on utf8_general_ci)


utf8_general_ci is implemented in strings/ctype-utf8.c

A new collation handler should be added, similar to this one:

static MY_COLLATION_HANDLER my_collation_utf8_general_ci_handler =
{
NULL,   /* init */
my_strnncoll_utf8_general_ci,
my_strnncollsp_utf8_general_ci,
my_strnxfrm_unicode,
my_strnxfrmlen_unicode,
my_like_range_mb,
my_wildcmp_utf8,
my_strcasecmp_utf8,
my_instr_mb,
my_hash_sort_utf8,
my_propagate_complex
};


but these three virtual functions must be redefined to new
similar functions that do not ignore trailing spaces:

- my_strnncollsp_utf8_general_ci - this is used for BTREE indexes
- my_hash_sort_utf8  - this is used for HASH indexes
- my_strnxfrm_unicode- this is used for filesort
   (non-indexed ORDER BY)

All other functions can be reused from the existing PAD SPACE collation.



Then a new "struct charset_info_st" should be defined,
similar to my_charset_utf8_general_ci.

Sounds like a few hours of work.


But then thorough testing will be needed,
which will be the most time consuming part.


Also, for consistency, it's worthy to implement
at least utf8_nopad_bin and utf8_unicode_nopad_ci at once,
and then eventually NO PAD collations for all other character sets.

> 
>  - Kristian.
> 

___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-discuss] Collations, trailing spaces and unique indexes

2016-03-11 Thread Binarus
On 11.03.2016 10:56, rhys.campb...@swisscom.com wrote:
> Just a quick idea... Perhaps you can include an additional field in the 
> unique index containing the length on the string. So
> 
> Str,  str_len
> 'a',  1
> 'a ', 2
> 
> Unq index on (str, str_len). 
> 

Thanks for the suggestion. I already have read about that idea when researching 
the same problem for MySQL.

I think I could try this as a last resort, but it would require changes in the 
existing applications, and -more difficult- I should find out the maximum index 
length in indexes which contain several columns. As far as I know, the maximum 
index length with InnoDB is 767 bytes, and I already need these because my 
columns are mostly VARCHAR(255), charset UTF8. Now, if I add one of those 
columns to a unique index, thereby using all available index bytes, could I add 
a second column to the same index at all?

If I have to go this way as a last resort, I'll investigate ...

Binarus

___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-discuss] Collations, trailing spaces and unique indexes

2016-03-11 Thread Kristian Nielsen
Binarus  writes:

> "All MySQL collations are of type PADSPACE. This means that all CHAR,
> VARCHAR, and TEXT values in MySQL are compared without regard to any
> trailing spaces. “Comparison” in this context does not include the

Yes, I have always found this terminally stupid as well. But I think it
comes from the SQL standard.

The only workaround I know of is to use VARBINARY instead of VARCHAR. I
think it works much the same in most respects. But obviously some semantics
is lost when the server no longer is aware of the character set used.

> Since the index behaviour obviously depends on the collation, would
> building an own collation which does not PADSPACE be an option? I have

That would be interesting, actually. I don't know what support there is for
non-PADSPACE collations. Maybe bar knows (Cc:'ed)?

 - Kristian.

___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-discuss] Collations, trailing spaces and unique indexes

2016-03-11 Thread Binarus
On 11.03.2016 10:30, Reindl Harald wrote:
> Am 11.03.2016 um 09:19 schrieb Binarus:
>> To make a long story short: I would like to have CHAR and VARCHAR fields 
>> with unique indexes which consider 'a' and 'a ' to be different values 
>> (which is the only reasonable point of view IMHO). This is not possible with 
>> MySQL. Is it possible with MariaDB?
> 
> since MariaDB is supposed to be a drop-in-replacement a different behavior 
> would be a bug and break existing applications

Not necessarily IMHO. If you could configure it (i.e. if you had to turn on 
such behaviour explicitly) then it wouldn't be a bug.

Regards,

Binarus


___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-discuss] Collations, trailing spaces and unique indexes

2016-03-11 Thread Rhys.Campbell
Just a quick idea... Perhaps you can include an additional field in the unique 
index containing the length on the string. So

Str,str_len
'a',1
'a ',   2

Unq index on (str, str_len). 

Rhys

-Original Message-
From: Maria-discuss 
[mailto:maria-discuss-bounces+rhys.campbell=swisscom@lists.launchpad.net] 
On Behalf Of Binarus
Sent: Friday, March 11, 2016 9:19 AM
To: maria-discuss@lists.launchpad.net
Subject: [Maria-discuss] Collations, trailing spaces and unique indexes

Dear all,

I already have done some MySQL development and now eventually would like to 
switch to MariaDB because I no longer can stand the following:

"All MySQL collations are of type PADSPACE. This means that all CHAR, VARCHAR, 
and TEXT values in MySQL are compared without regard to any trailing spaces. 
“Comparison” in this context does not include the LIKE pattern-matching 
operator, for which trailing spaces are significant."

"For those cases where trailing pad characters are stripped or comparisons 
ignore them, if a column has an index that requires unique values, inserting 
into the column values that differ only in number of trailing pad characters 
will result in a duplicate-key error. For example, if a table contains 'a', an 
attempt to store 'a ' causes a duplicate-key error."

(taken from https://dev.mysql.com/doc/refman/5.5/en/char.html)

I did not find any information whether the collations which MariaDB provides 
are of type PADSPACE as in MySQL (none of the collation tables from MariaDB's 
documentation does contain this information) or not.

To make a long story short: I would like to have CHAR and VARCHAR fields with 
unique indexes which consider 'a' and 'a ' to be different values (which is the 
only reasonable point of view IMHO). This is not possible with MySQL. Is it 
possible with MariaDB? I am generally working with charset UTF8 and would be 
happy with a general solution, but also with a non-general solution which only 
works with UTF8.

Since the index behaviour obviously depends on the collation, would building an 
own collation which does not PADSPACE be an option? I have read vague hints 
about that in MariaDB's documentation, but didn't try yet (before doing so, I 
would like to know if there is an easy, clean solution).

Thank you very much for any ideas,

Binarus

___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp
___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-discuss] Collations, trailing spaces and unique indexes

2016-03-11 Thread Reindl Harald



Am 11.03.2016 um 09:19 schrieb Binarus:

To make a long story short: I would like to have CHAR and VARCHAR fields with 
unique indexes which consider 'a' and 'a ' to be different values (which is the 
only reasonable point of view IMHO). This is not possible with MySQL. Is it 
possible with MariaDB?


since MariaDB is supposed to be a drop-in-replacement a different 
behavior would be a bug and break existing applications




signature.asc
Description: OpenPGP digital signature
___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


[Maria-discuss] Collations, trailing spaces and unique indexes

2016-03-11 Thread Binarus
Dear all,

I already have done some MySQL development and now eventually would like to 
switch to MariaDB because I no longer can stand the following:

"All MySQL collations are of type PADSPACE. This means that all CHAR, VARCHAR, 
and TEXT values in MySQL are compared without regard to any trailing spaces. 
“Comparison” in this context does not include the LIKE pattern-matching 
operator, for which trailing spaces are significant."

"For those cases where trailing pad characters are stripped or comparisons 
ignore them, if a column has an index that requires unique values, inserting 
into the column values that differ only in number of trailing pad characters 
will result in a duplicate-key error. For example, if a table contains 'a', an 
attempt to store 'a ' causes a duplicate-key error."

(taken from https://dev.mysql.com/doc/refman/5.5/en/char.html)

I did not find any information whether the collations which MariaDB provides 
are of type PADSPACE as in MySQL (none of the collation tables from MariaDB's 
documentation does contain this information) or not.

To make a long story short: I would like to have CHAR and VARCHAR fields with 
unique indexes which consider 'a' and 'a ' to be different values (which is the 
only reasonable point of view IMHO). This is not possible with MySQL. Is it 
possible with MariaDB? I am generally working with charset UTF8 and would be 
happy with a general solution, but also with a non-general solution which only 
works with UTF8.

Since the index behaviour obviously depends on the collation, would building an 
own collation which does not PADSPACE be an option? I have read vague hints 
about that in MariaDB's documentation, but didn't try yet (before doing so, I 
would like to know if there is an easy, clean solution).

Thank you very much for any ideas,

Binarus

___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp