I’m no Python guy, but I think running in C.UTF-8 is probably the only sane thing to do – particularly if you’re going to support things like Unicode file names. I always considered C.UTF-8 to be sort of weird, but this case is precisely why you need it.

 

From: Alexander Pyhalov via illumos-discuss
Sent: Tuesday, March 10, 2020 11:25 PM
To: Andy Fiddaman
Cc: illumos-discuss
Subject: [discuss] pkg, python3 and unicode

 

Hi.

 

When we initially imported the OmniOS CE and Oracle fixes to port PKG to Python 3, I considered that the only correct way to run it was in a UTF-8 environment, since PKG generally treats actions as strings but never specifies an encoding for them. I've looked at a recent commit to OmniOS CE pkg - https://github.com/omniosorg/pkg5/commit/93544be96e5c8106bcba71c5436e1464d6d491f0 , and hoped that it could solve the problem of PKG now and then being run in a C environment. This usually happens when it's working with linked images. After looking at it for some time I've come up with https://github.com/OpenIndiana/pkg5/pull/76, but it's still not complete - I still can't install a package containing Unicode actions in a C environment.

To actually fix it we would have to treat every attrs['path'] as Unicode that can't be used as-is in the rest of the code, if we suppose that the pkg environment is not UTF-8.
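As a rough illustration of why such a path can't be used as-is, here is a minimal Python 3 sketch. It assumes that a C locale leaves Python with an ASCII encoding for file names (this depends on the Python version), and the helper encode_path and the sample path are hypothetical, not taken from pkg:

```python
def encode_path(path, encoding):
    """Encode a str path to bytes, as a file system call would need.

    Returns the encoded bytes, or None if the path cannot be
    represented in the given encoding.
    """
    try:
        return path.encode(encoding)
    except UnicodeEncodeError:
        return None


# Hypothetical action attributes with a non-ASCII file name.
attrs = {"path": "usr/share/doc/r\u00e9sum\u00e9.txt"}

# In a UTF-8 environment the path encodes cleanly.
utf8_bytes = encode_path(attrs["path"], "utf-8")

# With an ASCII encoding (the assumed C-locale case) it cannot be encoded.
ascii_bytes = encode_path(attrs["path"], "ascii")
```

Every place that hands such a path to the operating system would need this kind of handling, which is what makes the non-UTF-8 case so invasive.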

I'm starting to wonder if the original idea was saner - just ensure that we always run in a UTF-8 environment (especially now that we have the C.UTF-8 locale)?
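One way this could look, as a sketch only: build an environment that forces C.UTF-8 whenever the current encoding is not UTF-8, and start the real work with that environment. The function names below are hypothetical and this is not how pkg currently does it:

```python
import codecs
import locale
import os


def is_utf8(encoding):
    """Return True if the codec name refers to UTF-8 (e.g. 'UTF-8', 'utf8')."""
    try:
        return codecs.lookup(encoding).name == "utf-8"
    except LookupError:
        # Unknown codec names are treated as "not UTF-8".
        return False


def utf8_environment(environ, encoding):
    """Return a copy of environ that forces C.UTF-8 unless already UTF-8."""
    env = dict(environ)
    if not is_utf8(encoding):
        env["LC_ALL"] = "C.UTF-8"
    return env


# Environment a child process (or a re-exec of the tool itself) could use.
env = utf8_environment(os.environ, locale.getpreferredencoding(False))
```

An entry point could then pass env to os.execve or subprocess so that the rest of the code only ever sees a UTF-8 locale.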

 

Best regards,

Alexander Pyhalov,

system administrator of Southern Federal University IT department
