[
https://issues.apache.org/jira/browse/CONNECTORS-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16751709#comment-16751709
]
Markus Schuch edited comment on CONNECTORS-1572 at 1/24/19 11:58 PM:
---------------------------------------------------------------------
Added a 🎮 character in the repository connector description string in
{{RSSSimpleCrawlTester}} to successfully recreate the error described in the
mailing list thread.
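A minimal sketch of why this particular character triggers the error, using Python purely for illustration (it is not part of ManifoldCF, and the helper name {{utf8_len}} is made up here): 🎮 needs four bytes in UTF-8, one more than MySQL's 3-byte {{utf8}} charset can store per character.

```python
def utf8_len(ch: str) -> int:
    """Return the number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


# ASCII letter, a 3-byte BMP character, and the 4-byte character from the test.
for ch in ("a", "€", "🎮"):
    print(repr(ch), utf8_len(ch))
```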
After switching to {{utf8mb4}} the test still fails.
The reason is the varchar columns of length 255, which exceed the index length limit:
{quote}
InnoDB has a maximum index length of 767 bytes for tables that use COMPACT or
REDUNDANT row format, so for utf8mb3 or utf8mb4 columns, you can index a
maximum of 255 or 191 characters, respectively
{quote}
(https://dev.mysql.com/doc/refman/5.7/en/charset-unicode-conversion.html)
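The numbers in the quote follow from integer division of the 767-byte limit by the maximum bytes per character. A small sketch of that arithmetic (the function and constant names are illustrative assumptions, not MySQL or ManifoldCF identifiers):

```python
INNODB_MAX_INDEX_BYTES = 767  # limit for COMPACT/REDUNDANT row formats, per the quote


def max_indexable_chars(bytes_per_char: int, limit: int = INNODB_MAX_INDEX_BYTES) -> int:
    """Largest number of characters whose worst-case size fits in the index limit."""
    return limit // bytes_per_char


def worst_case_index_bytes(column_chars: int, bytes_per_char: int) -> int:
    """Worst-case byte size of an index over a varchar(column_chars) column."""
    return column_chars * bytes_per_char


print(max_indexable_chars(3))          # utf8mb3
print(max_indexable_chars(4))          # utf8mb4
print(worst_case_index_bytes(255, 4))  # varchar(255) under utf8mb4
```

Under these assumptions a varchar(255) column needs up to 1020 bytes with {{utf8mb4}}, which is why the indexed 255-length columns no longer fit.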
A solution is to configure the database with the following settings:
{code}
innodb_file_format = barracuda
innodb_file_per_table = 1
innodb_large_prefix = 1
innodb_default_row_format = DYNAMIC
{code}
An update of the JDBC driver (5.1.47+ or 8.0.13+) is also required for it to
work correctly with {{utf8mb4}} automatically. Otherwise the server setting
{{character_set_server = utf8mb4}} must also be set.
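The driver rule above can be sketched as a simple version check. This assumes only the two thresholds named in this comment (5.1.47 for the 5.1 line, 8.0.13 for the 8.0 line); the function names are hypothetical and the thresholds have not been verified against the driver release notes.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '5.1.47' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def driver_handles_utf8mb4(version: str) -> bool:
    """True if, per the thresholds in this comment, the driver works with utf8mb4 on its own."""
    v = parse_version(version)
    if v[0] == 5:
        return v >= (5, 1, 47)
    return v >= (8, 0, 13)


def needs_character_set_server(version: str) -> bool:
    """True if character_set_server = utf8mb4 must be set on the server as well."""
    return not driver_handles_utf8mb4(version)


for candidate in ("5.1.46", "5.1.47", "8.0.12", "8.0.13"):
    print(candidate, needs_character_set_server(candidate))
```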
> Support 4-byte characters for Strings stored in MySQL/MariaDB by default
> ------------------------------------------------------------------------
>
> Key: CONNECTORS-1572
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1572
> Project: ManifoldCF
> Issue Type: Improvement
> Reporter: Markus Schuch
> Assignee: Markus Schuch
> Priority: Major
> Attachments: CONNECTORS-1572.patch
>
>
> DBInterfaceMySQL creates the database with the {{utf8}} charset, which does not
> support 4-byte characters in varchar columns. This can be a problem if a
> String stored in the database (e.g. a version string) contains such a
> character, e.g. an emoji.
> We should create the database with the {{utf8mb4}} charset and the
> {{utf8mb4_bin}} collation, and document this setting, to better support this
> situation.
> http://mail-archives.apache.org/mod_mbox/manifoldcf-dev/201901.mbox/%[email protected]%3e