Hi

On Tue, Jan 8, 2019 at 12:47 AM richard coleman
<rcoleman.ascen...@gmail.com> wrote:
>
> Dave,
>
> Thanks for taking the time to respond, but I don't see anywhere that 
> SQL_ASCII is recommended against doing. Here's the documentation listing the 
> supported encoding schemas: 
> https://www.postgresql.org/docs/current/multibyte.html .
>
> The only caveats listed for SQL_ASCII are:
>>
>> In most cases, if you are working with any non-ASCII data, it is unwise to 
>> use the SQL_ASCII setting because PostgreSQL will be unable to help you by 
>> converting or validating non-ASCII characters.

You highlighted it below: "If the client character set is defined as
SQL_ASCII, encoding conversion is disabled, regardless of the server's
character set. Just as for the server, use of SQL_ASCII is unwise
unless you are working with all-ASCII data"

You're using UTF-8 data, not ASCII, which it says is unwise because
conversion won't take place (and consequently, neither will
validation). I don't see how one could read that and not take it as a
recommendation against doing so.

You are running into exactly that problem, and it's visible when
working with technologies that are strict about following encoding
rules - in this case, psql when pgAdmin shells out to it.

I did think of one possible quick fix this morning, which I'll look
into, but as I noted before, it's a workaround. The real problem is
storing un-validated UTF-8 data in a SQL_ASCII database.
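
As a rough illustration of the validation that a SQL_ASCII database
skips, here is a minimal Python sketch. The function names and sample
values are hypothetical and are not taken from pgAdmin, psql, or
PostgreSQL; it only shows how a strict UTF-8 decode rejects bytes that
a SQL_ASCII database would have stored without complaint.

```python
from typing import Iterable, Iterator, Optional, Tuple


def first_invalid_utf8_offset(raw: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        raw.decode("utf-8")  # Python decodes strictly by default
    except UnicodeDecodeError as exc:
        return exc.start
    return None


def find_suspect_values(values: Iterable[bytes]) -> Iterator[Tuple[int, int]]:
    """Yield (index, offset) for every value that is not valid UTF-8."""
    for index, raw in enumerate(values):
        offset = first_invalid_utf8_offset(raw)
        if offset is not None:
            yield index, offset


# Hypothetical sample values standing in for raw column contents.
samples = [
    b"plain ascii",
    "caf\u00e9".encode("utf-8"),    # valid UTF-8
    "caf\u00e9".encode("latin-1"),  # a lone 0xE9 byte is not valid UTF-8
]

print(list(find_suspect_values(samples)))
```

Only the Latin-1 encoded value is reported. A strict client fails on a
value like that, while a more lenient one passes the bytes through
untouched.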

> Or, a reminder that postgreSQL can't help with any conversions you might want 
> to do.
>
> Then there's this:
>>
>> PostgreSQL will allow superusers to create databases with SQL_ASCII encoding 
>> even when LC_CTYPE is not C or POSIX. As noted above, SQL_ASCII does not 
>> enforce that the data stored in the database has any particular encoding, 
>> and so this choice poses risks of locale-dependent misbehavior. Using this 
>> combination of settings is deprecated and may someday be forbidden 
>> altogether.
>
>
> A note that you can currently choose incompatible settings, but probably 
> can't in the future.
>
> And finally there's this bit of advice:
>>
>> If the client character set is defined as SQL_ASCII, encoding conversion is 
>> disabled, regardless of the server's character set. Just as for the server, 
>> use of SQL_ASCII is unwise unless you are working with all-ASCII 
>> data[emphasis mine].
>
>
> Which is just a reiteration of the first caveat, that if you are using 
> SQL_ASCII the database won't perform any conversions on your behalf.
>
> That is hardly a recommendation against using that supported encoding scheme. 
>  The fact that the psql command prompt, among others, works with it without 
> issue, is an indication that the problem lies in pgAdmin4 (and I would guess 
> the reliance of python on UTF8) than an issue with the database itself.  
> pgAdmin4 needs to check for and more gracefully handle valid postgreSQL data 
> that might happen to be not UTF8 compliant.
>
> Until then, I will have to periodically scan and clean for bad UTF8 data to 
> keep pgAdmin4 (and other JDBC dependent code) happy.  The legacy enterprise 
> .Net applications that depend on it prohibit converting it to UTF8 (or 
> anything else for that matter).
>
> Just my $0.02,
>
> rik.
>
>
> On Mon, Jan 7, 2019 at 1:27 PM Dave Page <dp...@pgadmin.org> wrote:
>>
>> Hi
>>
>> On Mon, Jan 7, 2019 at 11:30 PM richard coleman
>> <rcoleman.ascen...@gmail.com> wrote:
>> >
>> > Dave,
>> >
>> > I can't speak to Nania's specific issue, but I believe it's a pgAdmin4 
>> > specific problem, at least in so far as SQL_ASCII is concerned.  I say 
>> > this because I can usually work with the data just fine from the psql 
>> > prompt, but not through pgAdmin4 (or other postgreSQL GUI's like dBeaver 
>> > that rely on the JDBC connection).  .Net/Windows ODBC drivers and psql 
>> > command prompt, no problem (as was pgAdmin3 assuming you don't do too much 
>> > with it beyond select/update/insert).  pgAdmin4, SELECT, export, etc. 
>> > BOOM! At least until you cleaned  up the offending bytes.
>> >
>> > Just my $0.02.
>>
>> I'm afraid the fundamental problem is that you're using PostgreSQL in
>> a way that the docs specifically recommend against doing, and you're
>> seeing the reason why.
>>
>> pgAdmin 3 and 4 are completely different. In the import/export utility
>> that Nania reported the issue in, pgAdmin doesn't look at the data *at
>> all*. It simply executes \copy in psql, which does all the work. All
>> pgAdmin does is provide connection info and options to psql, based on
>> the selections made in the import/export dialogue, and executes it.
>>
>> In other areas of pgAdmin, like the query tool, it is possible to see
>> similar issues with the same underlying cause, though we've spent a
>> significant amount of time trying to work around all the possible edge
>> cases.
>>
>> pgAdmin 3 implemented import/export itself, using underlying libraries
>> that were far less strict about encoding rules than Python is. That
>> may have been more convenient for this particular issue, but it's a
>> lot worse in many others.
>>
>> As a general thought (and do bear in mind, we've spent significant
>> time and resources on these issues in the past), I'd far rather spend
>> time on new features and actual bugs, than further time on workarounds
>> for things the PostgreSQL docs specifically advise against doing.
>>
>> --
>> Dave Page
>> Blog: http://pgsnake.blogspot.com
>> Twitter: @pgsnake
>>
>> EnterpriseDB UK: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company



-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
