[sqlite] Unicode issue on windows consoles. Was: Version 3.11.0 beta

Dominique Devienne Fri, 12 Feb 2016 13:55:00 +0100

On Fri, Feb 12, 2016 at 11:36 AM, Olivier Mascia <om at integral.be> wrote:


> > So it would be a bad idea to change sqlite3's output depending on the
> > current code page or font.
>
> I share your view on keeping things as unicode as possible, but the
> frontier is thin before trying to impose our views onto Windows usage and
> Windows command-line users' expectations.  That's why I tend to favor the
> path of least resistance: just fit whatever the default
> behavior/expectation is for a (narrow) command-line tool writing some
> text to the console.
>

The solution IMHO is to introduce a clean abstraction in shell.c for
input/output.

i.e. a Virtual Console Layer (VCL), similar to the existing Virtual File
System (VFS) layer of sqlite3.c (amalgamation).
The VCL would assume Unicode (e.g. UTF-8) on the "shell side" of the APIs,
and the implementation of the VCL
would take care of the platform-specific and console-specific "details" to get
or render those UTF-8 encoded chars
as best as the current "console" can or allows.

And this would also fill my need to be able to embed the SQLite3 shell in a
GUI app for example,
simply by writing my own VCL, talking to Qt for example.

The above wouldn't preclude Dr Hipp from evolving shell.c as he sees fit,
even in BC-breaking ways,
since the contract would be clear that the only public API would be to get
the text in and out of the shell
to the "console". The only thing necessary is then an alternative entry
point to main(), basically via conditional
compilation, and a way to inject one's own VCL.
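
To make the idea concrete, here is a minimal sketch of what such a layer
could look like. Everything in it is hypothetical: the names vcl,
mem_console, shell_run and the VCL_SKETCH_STANDALONE macro are invented for
illustration, and nothing like them exists in shell.c today. The "shell"
only echoes its input; the point is that it talks to the console solely
through the injected function pointers, always in UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical console interface: the shell side always speaks UTF-8. */
typedef struct vcl {
    void *ctx;  /* implementation-private state */
    /* Write n bytes of UTF-8 text; return bytes accepted, or -1 on error. */
    int (*write_utf8)(void *ctx, const char *text, size_t n);
    /* Read one line (newline stripped, NUL-terminated) into buf;
       return its length, or -1 at end of input. */
    int (*read_line_utf8)(void *ctx, char *buf, size_t cap);
} vcl;

/* In-memory console: a stand-in for a GUI widget or a test harness. */
typedef struct mem_console {
    const char *input;   /* remaining scripted input */
    char output[256];    /* captured output, always NUL-terminated */
    size_t used;
} mem_console;

static int mem_write(void *ctx, const char *text, size_t n) {
    mem_console *mc = ctx;
    if (n > sizeof mc->output - 1 - mc->used) return -1;  /* buffer full */
    memcpy(mc->output + mc->used, text, n);
    mc->used += n;
    mc->output[mc->used] = '\0';
    return (int)n;
}

static int mem_read_line(void *ctx, char *buf, size_t cap) {
    mem_console *mc = ctx;
    if (cap == 0 || *mc->input == '\0') return -1;
    size_t len = strcspn(mc->input, "\n");
    size_t copy = len < cap - 1 ? len : cap - 1;  /* truncate long lines */
    memcpy(buf, mc->input, copy);
    buf[copy] = '\0';
    mc->input += len;
    if (*mc->input == '\n') mc->input++;
    return (int)copy;
}

/* Build a VCL backed by mc, reading from the given scripted input. */
vcl mem_console_vcl(mem_console *mc, const char *input) {
    vcl con = { mc, mem_write, mem_read_line };
    mc->input = input;
    mc->used = 0;
    mc->output[0] = '\0';
    return con;
}

/* Hypothetical alternative entry point: the embedder injects its own VCL.
   This toy "shell" echoes each line behind a prompt and returns the
   number of lines processed. */
int shell_run(vcl *con) {
    char line[128];
    int lines = 0;
    assert(con && con->write_utf8 && con->read_line_utf8);
    while (con->read_line_utf8(con->ctx, line, sizeof line) >= 0) {
        con->write_utf8(con->ctx, "> ", 2);
        con->write_utf8(con->ctx, line, strlen(line));
        con->write_utf8(con->ctx, "\n", 1);
        lines++;
    }
    return lines;
}

#ifdef VCL_SKETCH_STANDALONE
/* Conditional compilation: main() exists only in the standalone build. */
int main(void) {
    mem_console mc;
    vcl con = mem_console_vcl(&mc, "select 1;\n");
    shell_run(&con);
    fputs(mc.output, stdout);
    return 0;
}
#endif
```

A Windows console implementation would presumably convert between UTF-8
and whatever the console wants inside write_utf8 and read_line_utf8, and a
Qt one would forward to a text widget; the shell code itself would not
change in either case.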


> Most people using sqlite3.exe interactively (on windows) won't expect to
> have correct console output for international texts (stored in the db)
> using scripts which are unusual for their system. But most users will
> expect to be able to type and see characters which are usual for them and
> their system.  A user on a Windows set for typical Western Europe support
> (CP850 for OEM and CP1252 for ANSI), will expect to be able to type and see
> correctly things like '?', '?', '?', '?'. And won't be surprised to have a
> display issue with characters more exotic to them (those are characters
> which they would be at pain trying to type with their keyboard). Yet those
> concerned can use the standard Windows command CHCP to check or change
> their console code page to their liking. Provided that the font configured
> in their cmd session is able to, they will then be able to see those other
> characters (and copy them properly).
>

Seeing those characters properly in the Windows console is not enough!

The DB file itself must really contain properly encoded UTF-8 text for them,
such that the same DB on Linux or another SQLite-based app which really
supports Unicode
also can see them correctly.
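
One way to check this, sketched here under assumptions: read the text back
as raw bytes and test whether they are well-formed UTF-8. The function
name is_valid_utf8 is made up for this example and is not part of SQLite.
For instance, 'é' is the two bytes 0xC3 0xA9 in UTF-8, but the single byte
0xE9 in CP1252 and 0x82 in CP850; if the shell stored console bytes
unconverted, the last two are what would end up in the DB, and neither is
valid UTF-8 on its own.

```c
#include <stddef.h>

/* Return 1 if s[0..n) is well-formed UTF-8, else 0.
   Rejects overlong encodings, surrogates and values above U+10FFFF. */
int is_valid_utf8(const unsigned char *s, size_t n) {
    size_t i = 0;
    while (i < n) {
        unsigned char b = s[i];
        size_t extra;            /* number of continuation bytes expected */
        unsigned long cp, min;   /* decoded value, smallest legal value */
        if (b < 0x80) { i++; continue; }   /* plain ASCII */
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return 0;           /* stray continuation or invalid lead byte */
        if (n - i - 1 < extra) return 0;   /* sequence is truncated */
        for (size_t k = 1; k <= extra; k++) {
            if ((s[i + k] & 0xC0) != 0x80) return 0;
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
        i += extra + 1;
    }
    return 1;
}
```

Passing such a check does not prove the text was converted correctly, since
some legacy-code-page byte sequences happen to be valid UTF-8, but failing
it is a clear sign that unconverted console bytes reached the DB.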

The former is already true in places, but I don't think the latter is
(although I haven't tested it yet). --DD
