Hi Dave,
--
*Harshal Dhumal*
*Software Engineer*
EnterpriseDB India: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Feb 6, 2017 at 6:48 PM, Dave Page <[email protected]> wrote:
> Hi
>
> On Mon, Feb 6, 2017 at 12:57 PM, Harshal Dhumal
> <[email protected]> wrote:
> > Hi,
> >
> > Please find attached patch for RM 1983.
> >
> > This issue only occurs when database encoding is other than utf-8
> >
> > Also other issue was when we use connection of database with encoding
> other
> > than utf-8 to retrieve data from cluster table/s which has encoding utf-8
> > (e.g. pg_database) then data was not decoded properly.
>
> The code makes an assumption that pg_database is always utf-8 encoded.
> I don't believe that is correct - I believe it's the encoding used in
> the database from which the new database was created. The general
> advice is that users should avoid using non-7bit ASCII characters in
> shared catalogs, e.g. databases and comments etc.
>
Ok.
Let me split this into two issues:
i) RM1983 for which I have attached updated patch. (basically I removed
changes related to decode data retried from pg_database when connection
encoding is other than utf-8)
ii) Support to allow user to use non-&bit ASCII characters in shared
catalogs with the help of pgAdmin4.
Regarding your statement about pg_database *"I believe it's the encoding
used in the database from which the new database was created."**. *I found
it little-bit confusing for me (correct me if i'm wrong); As mentioned here
<https://www.postgresql.org/docs/9.5/static/catalog-pg-database.html> there
is only one copy of pg_database per cluster. So I assume pg_database is
created when we initialize database cluster and not when we create new
database.
>
> See https://www.postgresql.org/message-id/flat/20160216163833.GF31273%
> 40momjian.us#[email protected]
> for more info for example.
>
> Did pgAdmin 3 just assume it was UTF-8? I suspect it did - and that
> just happened to work in most cases.
>
> --
> Dave Page
> Blog: http://pgsnake.blogspot.com
> Twitter: @pgsnake
>
> EnterpriseDB UK: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
diff --git a/web/pgadmin/utils/driver/psycopg2/__init__.py b/web/pgadmin/utils/driver/psycopg2/__init__.py
index 4cafeb2..4e460a6 100644
--- a/web/pgadmin/utils/driver/psycopg2/__init__.py
+++ b/web/pgadmin/utils/driver/psycopg2/__init__.py
@@ -26,7 +26,7 @@ from flask import g, current_app, session
from flask_babel import gettext
from flask_security import current_user
from pgadmin.utils.crypto import decrypt
-from psycopg2.extensions import adapt
+from psycopg2.extensions import adapt, encodings
import config
from pgadmin.model import Server, User
@@ -74,6 +74,7 @@ psycopg2.extensions.register_type(
'NUMERIC_RANGE_TEXT', psycopg2.STRING)
)
+
def register_string_typecasters(connection):
"""
Casts various types to string, resolving issues with out of
@@ -94,6 +95,30 @@ def register_string_typecasters(connection):
new_type = psycopg2.extensions.new_type(oids, 'RETURN_STRING', return_as_string)
psycopg2.extensions.register_type(new_type)
+ # In python3 when database encoding is other than utf-8 and client encoding
+ # is set to UNICODE then we need to map data from database encoding
+ # to utf-8.
+ # This is required because when client encoding is set to UNICODE then
+ # psycopg assumes database encoding utf-8 and not the actual encoding.
+ # Not sure whether it's bug or feature in psycopg for python3.
+
+ if sys.version_info >= (3,) and connection.encoding != 'UTF8':
+ def return_as_unicode(value, cursor):
+ if value is None:
+ return None
+ # Treat value as byte sequence of database encoding and then decode
+ # it as utf-8 to get correct unicode value.
+ return bytes(
+ value, encodings[cursor.connection.encoding]
+ ).decode('utf-8')
+
+ unicode_type = psycopg2.extensions.new_type(
+ (19, 18, 25, 1042, 1043, 0),
+ 'UNICODE', return_as_unicode)
+
+ psycopg2.extensions.register_type(unicode_type)
+
+
class Connection(BaseConnection):
"""
class Connection(object)
@@ -568,11 +593,17 @@ WHERE
query: SQL query to run.
params: Extra parameters
"""
+
+ if sys.version_info < (3,):
+ if type(query) == unicode:
+ query = query.encode('utf-8')
+ else:
+ query = query.encode('utf-8')
+
cur.execute(query, params)
if self.async == 1:
self._wait(cur.connection)
-
def execute_on_server_as_csv(self, query, params=None, formatted_exception_msg=False, records=2000):
status, cur = self.__cursor(server_cursor=True)
self.row_count = 0
@@ -581,11 +612,14 @@ WHERE
return False, str(cur)
query_id = random.randint(1, 9999999)
+ if sys.version_info < (3,) and type(query) == unicode:
+ query = query.encode('utf-8')
+
current_app.logger.log(25,
u"Execute (with server cursor) for server #{server_id} - {conn_id} (Query-id: {query_id}):\n{query}".format(
server_id=self.manager.sid,
conn_id=self.conn_id,
- query=query,
+ query=query.decode('utf-8') if sys.version_info < (3,) else query,
query_id=query_id
)
)
@@ -703,6 +737,13 @@ WHERE
params: extra parameters to the function
formatted_exception_msg: if True then function return the formatted exception message
"""
+
+ if sys.version_info < (3,):
+ if type(query) == unicode:
+ query = query.encode('utf-8')
+ else:
+ query = query.encode('utf-8')
+
self.__async_cursor = None
status, cur = self.__cursor()
@@ -715,7 +756,7 @@ WHERE
u"Execute (async) for server #{server_id} - {conn_id} (Query-id: {query_id}):\n{query}".format(
server_id=self.manager.sid,
conn_id=self.conn_id,
- query=query,
+ query=query.decode('utf-8'),
query_id=query_id
)
)
@@ -733,7 +774,7 @@ Failed to execute query (execute_async) for the server #{server_id} - {conn_id}
""".format(
server_id=self.manager.sid,
conn_id=self.conn_id,
- query=query,
+ query=query.decode('utf-8'),
errmsg=errmsg,
query_id=query_id
)
--
Sent via pgadmin-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgadmin-hackers