Hi Oliver,
Instead of a nice warm spring we have yet another frosty winter here in
Moscow.
So people have to spend time sitting indoors and trying to break utf-8
in oxi code.
What we tried:
- Linux (CentOS-6.2) was installed on the host.
- the host was told that it is located in Munich, Germany, and that it
has to use Central European Time, to make it feel as close to you as
possible.
- root console used in some tests had "C" locale (utf-8 locale is not
very handy for the purposes of this email presentation).
- electric plug of German geometry (TUV stamped) was used to feed the
host from the wall mains.
- In the CSRs, only DNs containing either Russian letters or German
umlauts were allowed.
- On the client side, Mozilla Firefox was instructed to use German as
the default language, with automatic detection of the character encoding.
- a mysql-5.5 server was employed. Its syntax is a bit incompatible with
earlier versions, so we had to slightly hack the code of oxi for these
tests. But that is another story.
# cat /usr/local/etc/my.cnf
[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci
This was set _before_ the tables were created.
# mysqladmin variables | grep character_set
character_set_client utf8
character_set_connection utf8
character_set_database utf8
character_set_filesystem binary
character_set_results utf8
character_set_server utf8
character_set_system utf8
# mysqladmin variables | grep collation_
collation_connection utf8_general_ci
collation_database utf8_general_ci
collation_server utf8_general_ci
Believe me, all the efforts were in vain. Utf-8 is still there and works
fine.
Now the steps, in detail, for you to reproduce after starting oxi with
mysql.
1. From FF on the client side, two CSRs are created: one with a Russian
DN, one with German umlauts as I understand them.
Fig 01_ shows the input form in FF. At this moment the umlauts have _not_
reached the database yet.
2. Fig 02_ says that the CSR is registered and the umlauts have
successfully reached the database. But this page still remembers the data
that was typed in from the keyboard (via hidden elements on the page).
3. Fig 03_ shows a list of pending CSRs. This page does _not_ remember
the data that was fed into the form. The data comes exclusively from the
database.
And the umlauts are visible.
Now let us go to the server side.
Remember, the database uses the utf-8 character set and the root console
has the C locale.
The command
select * from workflow_context where workflow_context_value like
'%cert_subject_realname%';
gives us Fig 04_. It shows Russian letters (underlined in red) and
German umlauts (underlined in blue).
Both have the typical look of utf-8 symbols that are naively rendered
in the "C" locale.
The Russian and German symbols look different simply because Russian
utf-8 letters have D0 or D1 as the first byte,
while German umlauts have C3 as the first byte.
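
For anyone who wants to check the lead bytes without a database, here is
a minimal Python sketch. It is not part of oxi; the sample letters and
the helper name lead_byte are my own choice for illustration.

```python
# Sample letters, written as escapes so the script itself stays ASCII:
# Cyrillic zhe and ya, then a-umlaut, o-umlaut, u-umlaut and sharp s.
SAMPLES = ["\u0436", "\u044f", "\u00e4", "\u00f6", "\u00fc", "\u00df"]


def lead_byte(ch: str) -> str:
    """Return the first byte of the UTF-8 encoding of ch as two hex digits."""
    return format(ch.encode("utf-8")[0], "02X")


def hex_bytes(ch: str) -> str:
    """Return all UTF-8 bytes of ch as space-separated hex pairs."""
    return " ".join(format(b, "02X") for b in ch.encode("utf-8"))


if __name__ == "__main__":
    for ch in SAMPLES:
        # Print the code point rather than the letter, so this also works
        # on a console with the C locale.
        print("U+%04X" % ord(ch), "->", hex_bytes(ch), "lead byte", lead_byte(ch))
```

The basic Russian alphabet (U+0410 to U+044F) always lands on D0 or D1,
and the German umlauts and sharp s always land on C3.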
If you are not very comfortable with this look of utf-8, Fig 05_ (this
figure is actually a text file in plain English)
shows the same output in quoted-printable form. This was obtained by
feeding the actual output of the mysql command through the vi editor,
which was run from the C console.
I would appreciate it if you could answer the following question in
plain words:
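
A similar ASCII-only view can be produced without vi. The sketch below
assumes the standard quoted-printable encoding from Python's quopri
module, which may differ in detail from what vi shows; the function name
and the sample name "Müller" are only illustrations, not data from the
tests above.

```python
import quopri


def to_quoted_printable(text: str) -> str:
    """Encode text as UTF-8 and return its quoted-printable form (pure ASCII)."""
    return quopri.encodestring(text.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # "M\u00fcller" is "Mueller" spelled with a u-umlaut.
    print(to_quoted_printable("M\u00fcller"))
```

Each non-ASCII byte appears as "=" followed by two hex digits, so the C3
and D0/D1 lead bytes are easy to spot.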
=========================================================================
Can you reproduce steps 1-3 from above,
starting from clean code of tarballs
http://www7.openxpki.org/lastmidnight/index.html
which you have _not_ edited in advance?
=========================================================================
As my message of 22.02.2012 to this list said, this is a demo patch.
With this patch you can:
- create a CSR via a standard browser.
- look through a (list of) CSR via a standard browser.
(The lack of utf-8 support at exactly these stages was your initial
complaint.)
With this patch you can _not_:
- edit a utf-8 aware DN in a CSR if you are an RA operator. More exactly,
you can edit it, but once it is saved, the utf-8 symbols in your changes
will be corrupted.
- view a utf-8 aware DN in a cert, etc.
As my message of 22.02.2012 said, to extend this patch to other
stages of the CSR lifecycle, we have to work out a new collective policy
and/or devise a bulletproof solution to the utf-8 problem,
rather than trying to introduce a couple of operators in arbitrary places.
General remarks.
The function Encode::_utf8_on() does not encode or decode data. It just
turns on the internal UTF8 flag of the string.
Using the actual decode/encode functions is, as a rule, a big error in
this project.
Nothing here needs to be decoded or encoded.
More precisely, all the necessary utf-8 related decoding/encoding is done
automatically by a standard browser.
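
To illustrate why an extra encode step does harm, here is a small Python
sketch. It is not oxi code and Python has no equivalent of the Perl UTF8
flag; it only assumes the common failure mode where bytes that are
already utf-8 get treated as Latin-1 characters and are encoded a second
time. The function name double_encode is hypothetical.

```python
def double_encode(raw: bytes) -> bytes:
    """Misread UTF-8 bytes as Latin-1 characters, then encode them again."""
    return raw.decode("latin-1").encode("utf-8")


# The browser has already delivered u-umlaut as correct UTF-8 bytes.
raw = "\u00fc".encode("utf-8")      # two bytes: C3 BC
damaged = double_encode(raw)        # four bytes: C3 83 C2 BC

if __name__ == "__main__":
    print("as received:", raw.hex())
    print("re-encoded: ", damaged.hex())
```

The bytes as received still decode to the original letter, while the
re-encoded ones decode to two unrelated characters.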
All the best, Sergei