Dave, I can't speak to Nania's specific issue, but I believe it's a *pgAdmin4* specific problem, at least in so far as SQL_ASCII is concerned. I say this because I can usually work with the data just fine from the psql prompt, but *not* through pgAdmin4 (or other postgreSQL GUI's like dBeaver that rely on the JDBC connection). .Net/Windows ODBC drivers and psql command prompt, no problem (as was pgAdmin3 assuming you don't do too much with it beyond select/update/insert). pgAdmin4, SELECT, export, etc. *BOOM*! At least until you *cleaned * up the offending bytes.
Just my $0.02. rik. On Mon, Jan 7, 2019 at 12:49 PM Dave Page <dp...@pgadmin.org> wrote: > On Mon, Jan 7, 2019 at 9:11 PM richard coleman > <rcoleman.ascen...@gmail.com> wrote: > > > > Nania, > > > > Welcome to the wonderful world of pgAdmin4. I have been bitten often by > this particular shortcoming in pgAdmin4. :( My issue seems to stems from > the fact that I use ASCII tables as a back end for a .Net windows > application and perfectly valid windows (Word/Excel) characters cause > pgAdmin4 no end of issues. > > pgAdmin has nothing to do with this. It is simply calling PostgreSQL's > psql utility, and telling it to import or export the file. The > database server is then throwing the error seen. > > > My solution (with the help of some fine people on the postgres IRC > channel) is to run a couple of functions on my tables/fields to locate and > clean the offending characters out. Of course, if you need those > characters, then this won't actually help. Here they are in the advent > that they might prove helpful/adaptable to your situation. > > The problem with that is that you're trying to fix something that's > basically broken to begin with. From the PostgreSQL docs > (https://www.postgresql.org/docs/current/multibyte.html): > > ---- > The SQL_ASCII setting behaves considerably differently from the other > settings. When the server character set is SQL_ASCII, the server > interprets byte values 0-127 according to the ASCII standard, while > byte values 128-255 are taken as uninterpreted characters. No encoding > conversion will be done when the setting is SQL_ASCII. Thus, this > setting is not so much a declaration that a specific encoding is in > use, as a declaration of ignorance about the encoding. In most cases, > if you are working with any non-ASCII data, it is unwise to use the > SQL_ASCII setting because PostgreSQL will be unable to help you by > converting or validating non-ASCII characters. > ---- > > > > Finds what pgAdmin4 considers bad UTF8: > > CREATE OR REPLACE FUNCTION live.is_utf8( > > text) > > RETURNS boolean > > LANGUAGE 'sql' > > > > COST 100 > > VOLATILE > > AS $BODY$ > > select encode(convert_to($1,'SQL_ASCII'),'hex') > > ~ $r$(?x) > > ^(?:(?:[0-7][0-9a-f]) > > |(?:(?:c[2-9a-f]|d[0-9a-f]) > > |e0[ab][0-9a-f] > > |ed[89][0-9a-f] > > |(?:(?:e[1-9abcef]) > > |f0[9ab][0-9a-f] > > |f[1-3][89ab][0-9a-f] > > |f48[0-9a-f] > > )[89ab][0-9a-f] > > )[89ab][0-9a-f] > > )*$ > > $r$; > > $BODY$; > > > > ALTER FUNCTION live.is_utf8(text) > > OWNER TO postgres; > > > > > > Fixes what pgAdmin4 considers to be bad UTF8: > >> > >> CREATE OR REPLACE FUNCTION live.badutf8( > >> text) > >> RETURNS text > >> LANGUAGE 'sql' > >> COST 100 > >> VOLATILE > >> AS $BODY$ > >> select regexp_replace(encode(convert_to($1,'SQL_ASCII'),'hex'), > >> $r$(?x) > >> (?:(?:[0-7][0-9a-f]) > >> |(?:(?:c[2-9a-f]|d[0-9a-f]) > >> |e0[ab][0-9a-f] > >> |ed[89][0-9a-f] > >> |(?:(?:e[1-9abcef]) > >> |f0[9ab][0-9a-f] > >> |f[1-3][89ab][0-9a-f] > >> |f48[0-9a-f] > >> )[89ab][0-9a-f] > >> )[89ab][0-9a-f] > >> )*(..)? > >> $r$, '-\1-', 'g') > >> $BODY$; > >> ALTER FUNCTION live.badutf8(text) > >> OWNER TO postgres; > > > > > > Fixes bad UTF8 > > > > > > > > On Mon, Jan 7, 2019 at 8:40 AM Nanina Tron <nanina.t...@icloud.com> > wrote: > >> > >> Hi, > >> > >> I am pretty new to PostgreSQL so I might just miss something basic here. > >> > >> My problem is that, I cannot import or export some of the tables in my > db with pgAdmin4, as it raises the “ERROR: unvalid byte-sequenz for coding > »UTF8«: 0xdf 0x67“”. The table was originally created with Excel and > imported via pgAdmin3. The strange thing is that it can still be imported > and exported with pgAdmin3 but not with pgAdmin4. The db was created with > encoding UTF-8, the .csv files where created with encoding UTF-8 and also > the import/export dialog is set to UTF-8. Queries are also no problem on > these tables so it seems to me that this could be a client problem. > >> > >> I am running PostgreSQL 11.1 on a server (I don’t know the OS, > maintained with pgAdmin4). Locally I am working on a Windows 7 Professional > (Service Pack 1) 64 Bit-System and pgAdmin4 3.6 & pgAdmin3. > >> > >> I did not find any hint of the same problem on my Google or archive > search, so I would be very grateful for any idea what I am doing wrong here. > >> > >> Best, > >> > >> Nanina > > > > -- > Dave Page > Blog: http://pgsnake.blogspot.com > Twitter: @pgsnake > > EnterpriseDB UK: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >