Dave,

I can't speak to Nania's specific issue, but I believe it's a
*pgAdmin4* specific
problem, at least in so far as SQL_ASCII is concerned.  I say this because
I can usually work with the data just fine from the psql prompt, but
*not* through
pgAdmin4 (or other postgreSQL GUI's like dBeaver that rely on the JDBC
connection).  .Net/Windows ODBC drivers and psql command prompt, no problem
(as was pgAdmin3 assuming you don't do too much with it beyond
select/update/insert).  pgAdmin4, SELECT, export, etc. *BOOM*! At least
until you *cleaned * up the offending bytes.

Just my $0.02.

rik.





On Mon, Jan 7, 2019 at 12:49 PM Dave Page <dp...@pgadmin.org> wrote:

> On Mon, Jan 7, 2019 at 9:11 PM richard coleman
> <rcoleman.ascen...@gmail.com> wrote:
> >
> > Nania,
> >
> > Welcome to the wonderful world of pgAdmin4.  I have been bitten often by
> this particular shortcoming in pgAdmin4. :(  My issue seems to stems from
> the fact that I use ASCII tables as a back end for a .Net windows
> application and perfectly valid windows (Word/Excel) characters cause
> pgAdmin4 no end of issues.
>
> pgAdmin has nothing to do with this. It is simply calling PostgreSQL's
> psql utility, and telling it to import or export the file. The
> database server is then throwing the error seen.
>
> > My solution (with the help of some fine people on the postgres IRC
> channel) is to run a couple of functions on my tables/fields to locate and
> clean the offending characters out.  Of course, if you need those
> characters, then this won't actually help.  Here they are in the advent
> that they might prove helpful/adaptable to your situation.
>
> The problem with that is that you're trying to fix something that's
> basically broken to begin with. From the PostgreSQL docs
> (https://www.postgresql.org/docs/current/multibyte.html):
>
> ----
> The SQL_ASCII setting behaves considerably differently from the other
> settings. When the server character set is SQL_ASCII, the server
> interprets byte values 0-127 according to the ASCII standard, while
> byte values 128-255 are taken as uninterpreted characters. No encoding
> conversion will be done when the setting is SQL_ASCII. Thus, this
> setting is not so much a declaration that a specific encoding is in
> use, as a declaration of ignorance about the encoding. In most cases,
> if you are working with any non-ASCII data, it is unwise to use the
> SQL_ASCII setting because PostgreSQL will be unable to help you by
> converting or validating non-ASCII characters.
> ----
>
>
> > Finds what pgAdmin4 considers bad UTF8:
> > CREATE OR REPLACE FUNCTION live.is_utf8(
> > text)
> >     RETURNS boolean
> >     LANGUAGE 'sql'
> >
> >     COST 100
> >     VOLATILE
> > AS $BODY$
> >     select encode(convert_to($1,'SQL_ASCII'),'hex')
> >            ~ $r$(?x)
> >                 ^(?:(?:[0-7][0-9a-f])
> >                    |(?:(?:c[2-9a-f]|d[0-9a-f])
> >                       |e0[ab][0-9a-f]
> >                       |ed[89][0-9a-f]
> >                       |(?:(?:e[1-9abcef])
> >                          |f0[9ab][0-9a-f]
> >                          |f[1-3][89ab][0-9a-f]
> >                          |f48[0-9a-f]
> >                        )[89ab][0-9a-f]
> >                     )[89ab][0-9a-f]
> >                  )*$
> >              $r$;
> > $BODY$;
> >
> > ALTER FUNCTION live.is_utf8(text)
> >     OWNER TO postgres;
> >
> >
> > Fixes what pgAdmin4 considers to be bad UTF8:
> >>
> >> CREATE OR REPLACE FUNCTION live.badutf8(
> >> text)
> >>     RETURNS text
> >>     LANGUAGE 'sql'
> >>     COST 100
> >>     VOLATILE
> >> AS $BODY$
> >>     select regexp_replace(encode(convert_to($1,'SQL_ASCII'),'hex'),
> >>              $r$(?x)
> >>                  (?:(?:[0-7][0-9a-f])
> >>                    |(?:(?:c[2-9a-f]|d[0-9a-f])
> >>                       |e0[ab][0-9a-f]
> >>                       |ed[89][0-9a-f]
> >>                       |(?:(?:e[1-9abcef])
> >>                          |f0[9ab][0-9a-f]
> >>                          |f[1-3][89ab][0-9a-f]
> >>                          |f48[0-9a-f]
> >>                        )[89ab][0-9a-f]
> >>                     )[89ab][0-9a-f]
> >>                  )*(..)?
> >>              $r$, '-\1-', 'g')
> >> $BODY$;
> >> ALTER FUNCTION live.badutf8(text)
> >>     OWNER TO postgres;
> >
> >
> > Fixes bad UTF8
> >
> >
> >
> > On Mon, Jan 7, 2019 at 8:40 AM Nanina Tron <nanina.t...@icloud.com>
> wrote:
> >>
> >> Hi,
> >>
> >> I am pretty new to PostgreSQL so I might just miss something basic here.
> >>
> >> My problem is that, I cannot import or export some of the tables in my
> db with pgAdmin4, as it raises the “ERROR: unvalid byte-sequenz for coding
> »UTF8«: 0xdf 0x67“”. The table was originally created with Excel and
> imported via pgAdmin3. The strange thing is that it can still be imported
> and exported with pgAdmin3 but not with pgAdmin4. The db was created with
> encoding UTF-8, the .csv files where created with encoding UTF-8 and also
> the import/export dialog is set to UTF-8. Queries are also no problem on
> these tables so it seems to me that this could be a client problem.
> >>
> >> I am running PostgreSQL 11.1 on a server (I don’t know the OS,
> maintained with pgAdmin4). Locally I am working on a Windows 7 Professional
> (Service Pack 1) 64 Bit-System and pgAdmin4  3.6 & pgAdmin3.
> >>
> >> I did not find any hint of the same problem on my Google or archive
> search, so I would be very grateful for any idea what I am doing wrong here.
> >>
> >> Best,
> >>
> >> Nanina
>
>
>
> --
> Dave Page
> Blog: http://pgsnake.blogspot.com
> Twitter: @pgsnake
>
> EnterpriseDB UK: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Reply via email to