Re: DBD::mysql path forward

Night Light Tue, 19 Sep 2017 05:47:20 -0700

Dear Perl gurus,

This is my first post. I'm using Perl with great joy, and I'd like to
express my gratitude for all you are doing to keep Perl stable and fun to
use.


I'd like to object to re-releasing this version and to discuss how to make
4.043 backwards compatible instead.
This change will with 100% certainty corrupt all BLOB data written to the
database if the developer has not read the release notes before applying
the latest version of DBD::mysql (and changed their code accordingly).
Since sysadmins do not always read the release notes of each updated
package, the likelihood that this will happen is high.
I myself wasn't even shown the release notes as it was a dependency of an
updated package that I applied.
The exposure of this change is big as DBD::mysql affects multiple
applications and many user bases.
I believe deliberately introducing industry-wide database corruption is
something that will significantly harm people's confidence in using Perl.
I believe that not providing backwards compatibility is not in line with
the Perl policy that has been carefully put together by the community to
maintain the quality of Perl as it is today.
http://perldoc.perl.org/perlpolicy.html#BACKWARD-COMPATIBILITY-AND-DEPRECATION

I therefore believe the only solution is an upgrade that is backwards
compatible by default, and where it is the user who decides when to start
UTF8-encoding the input values of an SQL request.
If that is too time consuming or too difficult, the UTF8-encoding "fix"
should be parked and a version with only the security fix released first.

I have the following objections against this release:

1. the upgrade will corrupt more records than it fixes (it does more harm
than good)
2. the reason given for not providing backward compatibility ("because it
was hard to implement") is not plausible given the level of unwanted side
effects.
   This is especially so given that there is already a mechanism in place
for the user to signal whether UTF8 encoding is wanted
(mysql_enable_utf8/mysql_enable_utf8mb4).
3. it costs more resources to coordinate/discuss a "way forward" or options
than to implement a solution that addresses backwards compatibility
4. it is unreasonable to ask for changes to existing source knowing that
dependent modules may be unmaintained or proprietary
   It can be argued that such modules should always be maintained, but that
does not change the fact that a well-running Perl program becomes unusable
5. it does not inform the user that after upgrading, existing code will
start writing corrupt BLOB records
6. it does not inform the user about the fact that a code review of all
existing code is necessary, and how it needs to be changed and tested
7. it does not give the user the option to decide how the BLOBs should be
stored/encoded (opt in)
8. it does not provide backwards compatibility
   By doing so it does not respect the Perl policy that has been carefully
put together by the community to maintain the quality of Perl as it is
today.

http://perldoc.perl.org/perlpolicy.html#BACKWARD-COMPATIBILITY-AND-DEPRECATION
9. it blocks users from using DBD::mysql upgrades as long as they have not
rewritten their existing code
10. not all users of DBD::mysql can be warned beforehand about the side
effects, as it is not known which private parties have code that uses
DBD::mysql
11. I believe development will go faster when support for backwards
compatibility is addressed
12. having to write 1 extra line for each SQL query value is monk's work
that will make the module less attractive to use

About forking to DBD::mariadb:
The primary reason to create such a module would be that the communication
protocol of MariaDB has become incompatible with MySQL.
Using this namespace to fix a bug in DBD::mysql does not meet that
criterion and causes confusion for developers and unnecessary pollution of
the DBD namespace.

---

For people that do not know the impact of the change that is pending to be
committed:
(see the GitHub issue, which includes 3 reports from companies that
suffered data loss: https://github.com/perl5-dbi/DBD-mysql/issues/117 )

Issue: some UTF8 characters are not properly displayed after retrieval
Cause: SQL query values are not UTF8 encoded when sent to the database but
they are all decoded once retrieved.
Occurrence: Only records with string data that can only be written with
UTF8. It can be considered rare, as people haven't reported this issue in
10 years of usage.
Regional impact: Only affects countries whose characters need UTF8 encoding
and only affects string values.
Steps to recover from it: Read string data unencoded and write it encoded.
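
A minimal Perl sketch of the symptom and of the recovery step described
above. It assumes the old behaviour amounts to sending one raw byte per
character; the sample value is hypothetical and no database is involved:

```perl
use strict;
use warnings;

# Hypothetical stored value: "caf" plus e-acute, written as one raw byte
# per character (Latin-1). This is what an unencoded write is ASSUMED to
# have produced.
my $stored = "caf\xE9";

# Reading it back as UTF-8 fails: a lone 0xE9 byte is not valid UTF-8.
my $probe = $stored;
my $valid = utf8::decode($probe);
print "valid UTF-8: ", ($valid ? "yes" : "no"), "\n";

# Recovery: read the data unencoded, then write it UTF-8 encoded.
my $fixed = $stored;
utf8::encode($fixed);
print "before: ", unpack('H*', $stored), "\n";
print "after:  ", unpack('H*', $fixed), "\n";
```

After the recovery step the single byte E9 has become the two-byte UTF-8
sequence C3 A9, which decodes correctly on retrieval.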

Changes in the upgrade pending re-release:
SQL query values are UTF8 encoded when sent to the database as well as
decoded when retrieved (including BLOB fields).
BLOB fields will be excluded from encoding only if you specify their data
type.

Side effects from installing upgrade:
- BLOB data will be written after UTF8 encoding and will therefore be
corrupt
- there is no way to detect whether a BLOB field is corrupt, unless it is
known when the INSERT/UPDATE took place and when the upgrade was installed
- existing data will still be displayed incorrectly

Occurrence: every INSERT/UPDATE statement will start writing corrupted BLOB
data
Regional impact: worldwide
Steps to recover from corrupted BLOBs? You cannot. Your binary blobs are
encoded as if they were UTF8 strings. Your binary data is unrecoverable (as
in "gone forever").
If you are a dentist you have to ask your customers to come back for
another x-ray, as the photos that were taken are gone.
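
To illustrate the kind of damage meant here, a small Perl sketch of what
UTF8 encoding does to binary data. It assumes the upgrade effectively
applies the equivalent of utf8::encode to an untyped bound value; the
payload (a PNG file signature) is only an example:

```perl
use strict;
use warnings;

# Example binary payload: the 8-byte PNG file signature.
my $blob = "\x89PNG\x0D\x0A\x1A\x0A";

# ASSUMPTION: the driver treats the value as text and UTF-8 encodes it.
my $encoded = $blob;
utf8::encode($encoded);

printf "original: %d bytes, %s\n", length($blob),    unpack('H*', $blob);
printf "written:  %d bytes, %s\n", length($encoded), unpack('H*', $encoded);
print "identical: ", ($blob eq $encoded ? "yes" : "no"), "\n";
```

Every byte above 0x7F grows into two bytes, so what ends up in the BLOB
column no longer matches the original file.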

What is asked from the developer to prevent this from happening?
- do not miss reading the release notes before upgrading
- review all source code (including code in other included modules) and
specify the data type of each SQL parameter value
  before: $dbh->do('INSERT INTO test (BLOB1,BLOB2,BLOB3,BLOB4)
VALUES(?,?,?,?)',undef,$col1,$col2,$col3,$col4);
  after:  use DBI qw(:sql_types);  # imports SQL_BLOB
          my $sth = $dbh->prepare('INSERT INTO test (BLOB1,BLOB2,BLOB3,BLOB4)
VALUES(?,?,?,?)');
          $sth->bind_param(1, $col1, SQL_BLOB);
          $sth->bind_param(2, $col2, SQL_BLOB);
          $sth->bind_param(3, $col3, SQL_BLOB);
          ...
          $sth->execute;
  One extra line for each SQL parameter value. This will be time-consuming
monk's work, during which the user will ask why this is necessary when it
worked before.
- upgrade scripts need to be written to UTF8 encode existing string data
- retest all source code
