SQLite has built-in support for EBCDIC-based systems, but I discovered that it’s been broken since 3.11.0. If you have an EBCDIC-based system, you can see the brokenness by firing up `sqlite` and trying the `.schema` metacommand – you’ll get an obscure error.
In detail: in February 2016, the SQL tokenizer was changed for performance to use a character lookup table instead of a switch statement on character literals in the C source; see http://www.sqlite.org/src/info/9115baa1919584dc and http://www.sqlite.org/src/info/04f7da77c13925c1. However, the character lookup table for EBCDIC appears to contain several typos, which cause some ubiquitous characters in SQL input (such as ‘.’) to be classified as invalid.

As a result, internal low-level queries like `select sql from "main".sqlite_master` fail to parse, with a non-obvious error message (I guess such low-level queries are always expected to succeed!). Pretty much any non-trivial SQL query fails to parse on EBCDIC systems, and so does, for example, the `.schema` metacommand, which makes SQLite totally broken out of the box there.

The problem is in the `aiClass` character properties table when `SQLITE_EBCDIC` is defined.
This table is defined as follows:

    static const unsigned char aiClass[] = {
    #ifdef SQLITE_ASCII
    …
    #endif
    #ifdef SQLITE_EBCDIC
    /*         x0  x1  x2  x3  x4  x5  x6  x7  x8  x9  xa  xb  xc  xd  xe  xf */
    /* 0x */   27, 27, 27, 27, 27,  7, 27, 27, 27, 27, 27, 27,  7,  7, 27, 27,
    /* 1x */   27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
    /* 2x */   27, 27, 27, 27, 27,  7, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
    /* 3x */   27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
    /* 4x */    7, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 12, 17, 20, 10,
    /* 5x */   24, 27, 27, 27, 27, 27, 27, 27, 27, 27, 15,  4, 21, 18, 19, 27,
    /* 6x */   11, 16, 27, 27, 27, 27, 27, 27, 27, 27, 27, 23, 22,  1, 13,  7,
    /* 7x */   27, 27, 27, 27, 27, 27, 27, 27, 27,  8,  5,  5,  5,  8, 14,  8,
    /* 8x */   27,  1,  1,  1,  1,  1,  1,  1,  1,  1, 27, 27, 27, 27, 27, 27,
    /* 9x */   27,  1,  1,  1,  1,  1,  1,  1,  1,  1, 27, 27, 27, 27, 27, 27,
    /* 9x */   25,  1,  1,  1,  1,  1,  1,  0,  1,  1, 27, 27, 27, 27, 27, 27,
    /* Bx */   27, 27, 27, 27, 27, 27, 27, 27, 27, 27,  9, 27, 27, 27, 27, 27,
    /* Cx */   27,  1,  1,  1,  1,  1,  1,  1,  1,  1, 27, 27, 27, 27, 27, 27,
    /* Dx */   27,  1,  1,  1,  1,  1,  1,  1,  1,  1, 27, 27, 27, 27, 27, 27,
    /* Ex */   27, 27,  1,  1,  1,  1,  1,  0,  1,  1, 27, 27, 27, 27, 27, 27,
    /* Fx */    3,  3,  3,  3,  3,  3,  3,  3,  3,  3, 27, 27, 27, 27, 27, 27,
    #endif
    };

While it’s conceivable that this table was written for a different codepage than used by the mainframe I was using, it looks more likely that there are typos in this table. For example:

* There are two “9x” rows in the table above; there is no “Ax” row.
* There are no entries in this table for the CC_DOT or CC_VARNUM #defines (26 and 6 respectively).
* Assuming codepage 1047 (the most commonly used code page?), the entry for the CC_TILDA #define (25) is in the wrong place.
To fix this problem, I patched the SQLite sources to change the `aiClass` character properties table to this:

    static const unsigned char aiClass[] = {
    #ifdef SQLITE_ASCII
    …
    #endif
    #ifdef SQLITE_EBCDIC
    /*         x0  x1  x2  x3  x4  x5  x6  x7  x8  x9  xa  xb  xc  xd  xe  xf */
    /* 0x */   27, 27, 27, 27, 27,  7, 27, 27, 27, 27, 27, 27,  7,  7, 27, 27,
    /* 1x */   27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
    /* 2x */   27, 27, 27, 27, 27,  7, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
    /* 3x */   27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
    /* 4x */    7, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 26, 12, 17, 20, 10,
    /* 5x */   24, 27, 27, 27, 27, 27, 27, 27, 27, 27, 15,  4, 21, 18, 19, 27,
    /* 6x */   11, 16, 27, 27, 27, 27, 27, 27, 27, 27, 27, 23, 22,  1, 13,  6,
    /* 7x */   27, 27, 27, 27, 27, 27, 27, 27, 27,  8,  5,  5,  5,  8, 14,  8,
    /* 8x */   27,  1,  1,  1,  1,  1,  1,  1,  1,  1, 27, 27, 27, 27, 27, 27,
    /* 9x */   27,  1,  1,  1,  1,  1,  1,  1,  1,  1, 27, 27, 27, 27, 27, 27,
    /* Ax */   27, 25,  1,  1,  1,  1,  1,  0,  1,  1, 27, 27, 27,  9, 27, 27,
    /* Bx */   27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
    /* Cx */   27,  1,  1,  1,  1,  1,  1,  1,  1,  1, 27, 27, 27, 27, 27, 27,
    /* Dx */   27,  1,  1,  1,  1,  1,  1,  1,  1,  1, 27, 27, 27, 27, 27, 27,
    /* Ex */   27, 27,  1,  1,  1,  1,  1,  0,  1,  1, 27, 27, 27, 27, 27, 27,
    /* Fx */    3,  3,  3,  3,  3,  3,  3,  3,  3,  3, 27, 27, 27, 27, 27, 27,
    #endif
    };

These changes fixed the SQL tokenizer problems I was seeing on my EBCDIC-based system, and resulted in a functioning SQLite there. Please let me know if more is needed to fix this bug.

Hope this helps,
Brad Larsen

https://en.wikipedia.org/wiki/EBCDIC_1047

P.S. It would be helpful if the SQLite documentation indicated which EBCDIC codepage(s) were supported.