On 03/23/2017 12:45 AM, Beppe Mazzola wrote:
Thanks Sergey

Your suggestion inspired me to do some more trials.

Declaring the text column

VARCHAR(3000) instead of NVARCHAR(1000)

the problem disappears, but I have to increase the column size because somehow Virtuoso recognizes the text as non-Latin and uses UTF-8 that occupies more space (till 4 times more than the default encoding). I cannot declare my text columns VARCHAR, because I must be able to host non-Latin text till 4000 characters long. In order to do this I should declare the column
VARCHAR(16000), whereas the maximum size is 4082. Unfortunately it seem the use 
of VARCHAR columns is not a solution for me...

Yet as the Virtuoso documentation 
<http://docs.openlinksw.com/virtuoso/datatypes/>says that NVARCHAR is dedicated 
to Unicode:

/"There are 3 additional data types to enable storing of Unicode data: /

 *

    /NCHAR/

 *

    */NVARCHAR/*

 *

    /LONG NVARCHAR/

/All the Unicode types are equivalent to their corresponding "narrow" type - CHAR, VARCHAR and LONG VARCHAR - except that instead of storing data as one byte they allow Unicode characters. Their lengths are defined and returned in characters instead of bytes."/


The data in my project are loaded with an ETL process implemented using the Virtuoso bulk load feature (DB.DBA.csv_register() and DB.DBA.csv_loader_run() procedures).

Can I still I use the ODBC / JDBC drivers to load my data into the RDB as 
somebody suggested?
[Sergey]
It is a bad idea to store UTF8 data in VARCHAR() column, it isn't portable and you will need decode UTF-8 data by hand, if you get data from table via JDBC/ODBC drivers.
1) I think, that "SQL_UTF8_EXECS = 1" options will be usefull only, if you want 
to create tables/columns with Non-Latin names like for example
  create table "таблица1" (...

2) If you need to store Unicode data in table, you must to use datatypes: 
NCHAR, NVARCHAR, LONG NVARCHAR, NCLOB !!!
3) The JDBC/ODBC drivers inserts Unicode data properly to datatypes above.
 For ODBC programs the ODBC procedures like SQLExecDirectW / SQLPrepareW must 
be used for properly send Unicode data to server.
4) If you want insert Unicode data to table or to use Unicode value in WHERE, 
the N' prefix must be used.
Example to insert:
    insert into table test values(1, N'АБВГД');

5) I write already, that there is a bug in the Virtuoso Conductor UI, so it 
send Unicode data to server via wrong way.
NOW Conductor UI works properly only for select unicode data from tables.

So try to use any tools that works via ODBC/JDBC/ADO.NET Virtuoso drivers, it 
will works properly and you will not have problems with use unicode data.
For example for JDBC => http://www.sql-workbench.net/downloads.html


Cheers

Beppe


2017-03-21 11:20 GMT+00:00 Sergey Malinin <sergmali...@gmail.com 
<mailto:sergmali...@gmail.com>>:

    On 03/21/2017 04:42 AM, Beppe Mazzola wrote:

        Related to the problem I have about storing non-Latin character in 
Virtuoso RDB, I see now that I haven't provided all the info related to it.

        So it was not evident that the Hungarian text written in the DB was not 
the same read from it immediately after, despite the Virtuoso UTF-8 set-up has
        been performed

        I bring here again the test case (now simplified and completed) to 
reproduce the problem.

        It has been performed with Conductor Interactive SQL.

    It looks, that Conductor Interactive SQL does something wrong, when SQL 
insert(and etc)  queries are executed, it inserts wrong data to table.
    But Conductor Interactive SQL shows unicode data properly, if it was inserted via 
ODBC/JDBC/ADO.NET <http://ADO.NET> .

    Also note:
    1) you don't need to set "SQL_UTF8_EXECS = 1" in virtuoso.ini 
http://docs.openlinksw.com/virtuoso/wideidentifiers/
    <http://docs.openlinksw.com/virtuoso/wideidentifiers/>
    for simple insert unicode data to columns.

    2) for insert unicode values, the prefix N must be used
       insert into test values(1, N'АБВГ')

    3) try insert unicode data via ODBC/JDBC/ADO.NET <http://ADO.NET> drivers, 
it must be inserted properly.



        1) log in as dba

        2) Execute the following statements

        CREATE TABLE DB.test.HST_FTN
        (
          FOOTNOTES_ID NUMERIC NOT NULL,
          FOOTNOTE_TEXT NVARCHAR(1000) NOT NULL,
          CONSTRAINT PK_V_SEM_MRL_HST_FTN PRIMARY KEY ( FOOTNOTES_ID )
        );

        INSERT INTO DB.test.HST_FTN
        (FOOTNOTES_ID,FOOTNOTE_TEXT)
        VALUES (4,'След тази дата МДГОВ ще бъде 0,05(*) mg/kg, освен ако не 
бъде изменена с директива или регламент.');

        SELECT * FROM DB.test.HST_FTN ;

        3) This is the result of the selectdisplayed, no longer the original 
text

        Query result:
        FOOTNOTES_ID
        DECIMAL         FOOTNOTE_TEXT
        NVARCHAR
         4       След тази дата МДГОВ ще бъде 0,05(*) 
mg/kg, Ð¾Ñ Ð²ÐµÐ½ ако не бъде изменена Ñ Ð´Ð¸Ñ€ÐµÐºÑ‚Ð¸Ð²Ð°
        или регламент.
        No. of rows in result: 1

        Here below again is the Virtuoso set up performed for UTF-8 .

          * all the table columns containing non-Latin text declared as 
NVARCHAR(n)

          * in virtuoso.ini file
        SQL_UTF8_EXECS = 1 ... [HTTPServer] Charset        = UTF-8  ; this is 
the default value
        I do not know if there is still some set up missing or if it's a 
Virtuoso issue. It's not clear to me as well whether the text is storied 
corrupted in
        the DB by the INSERT statement or it is somehow corrupted later, when 
Conductor reads it from the DB and returns it to the browser.
        Any idea, hint are very welcome!
        Thanks Cheers
        Beppe
        2017-02-27 10:21 GMT+00:00 Beppe Mazzola <beppemazz...@gmail.com 
<mailto:beppemazz...@gmail.com> <mailto:beppemazz...@gmail.com
        <mailto:beppemazz...@gmail.com>>>:

            Hi Hugh
            This should reproduce the problem. In my Virtuoso installation the 
Bulgarian text inserted in the test table is different to the one displayed as
        a result
            of the select.
            1) Create a user with name test with the following setting.
            [picture removed]
            2) log in as user test
3) Execute the following statements CREATE TABLE DB.test.HST_FTN ( FOOTNOTES_ID NUMERIC NOT NULL, FOOTNOTE_TEXT NVARCHAR(1000) NOT NULL, CONSTRAINT
            PK_V_SEM_MRL_HST_FTN PRIMARY KEY ( FOOTNOTES_ID ) );
            INSERT INTO DB.test.HST_FTN
            (FOOTNOTES_ID,FOOTNOTE_TEXT) VALUES (4,'След тази дата МДГОВ ще 
бъде 0,05(*) mg/kg, освен ако не бъде изменена с директива или регламент.');
        SELECT * FROM
            DB.test.HST_FTN ;
            Cheers
            Beppe
            2017-02-25 22:42 GMT+00:00 Hugh Williams <hwilli...@openlinksw.com 
<mailto:hwilli...@openlinksw.com> <mailto:hwilli...@openlinksw.com
        <mailto:hwilli...@openlinksw.com>>>:

                Hi Beppe,
                That should be all that is required on the server for Unicode / 
UTF-8 support.
                Do you have a test case for recreating the problem being 
observed, which should show how the data is being inserted and the queried ?
                Best Regards
                Hugh Williams
                Professional Services
                OpenLink Software, Inc.      // http://www.openlinksw.com/
                Weblog   -- http://www.openlinksw.com/blogs/ 
<http://www.openlinksw.com/blogs/> <http://www.openlinksw.com/blogs/
        <http://www.openlinksw.com/blogs/>>
                LinkedIn -- http://www.linkedin.com/company/openlink-software/ 
<http://www.linkedin.com/company/openlink-software/>
        <http://www.linkedin.com/company/openlink-software/ 
<http://www.linkedin.com/company/openlink-software/>>
                Twitter  -- http://twitter.com/OpenLink
                Google+  -- http://plus.google.com/100570109519069333827/ 
<http://plus.google.com/100570109519069333827/>
        <http://plus.google.com/100570109519069333827/ 
<http://plus.google.com/100570109519069333827/>>
                Facebook -- http://www.facebook.com/OpenLinkSoftware 
<http://www.facebook.com/OpenLinkSoftware> 
<http://www.facebook.com/OpenLinkSoftware
        <http://www.facebook.com/OpenLinkSoftware>>
                Universal Data Access, Integration, and Management Technology 
Providers

                    On 22 Feb 2017, at 14:20, Beppe Mazzola <beppemazz...@gmail.com 
<mailto:beppemazz...@gmail.com> <mailto:beppemazz...@gmail.com
            <mailto:beppemazz...@gmail.com>>> wrote:
                    Hi to all
                    In order to have Virtuoso relational DB able to correctly 
store the text with the UTF-8 encoding I have performed the following 
operations.

                      * all the columns with text declared as NVARCHAR(n)
                      * in virtuoso.ini file
                          o SQL_UTF8_EXECS     = 1
                          o Charset                       = UTF-8       ; this 
is the default value
                    This is not enough, since I see that the non-Latin 
characters are incorrectly stored in the DB yet.
                    Do I miss any further set-up?
                    I am using the following Virtuoso Open Source Version 
Virtuoso Open Source Edition (Column Store) (multi threaded) Version
            7.2.4.2.3217-pthreads as
                    of Feb  3 2017 Compiled for Linux (x86_64-unknown-linux-gnu)
                    Thanks!
                    Cheers
                    Beppe
                    
------------------------------------------------------------------------------ 
Check out the vibrant tech community on one of the world's most
                    engaging tech sites, SlashDot.org <http://slashdot.org>! 
http://sdm.link/slashdot_______________________________________________
            
<http://sdm.link/slashdot_______________________________________________>
                    
<http://sdm.link/slashdot_______________________________________________
            
<http://sdm.link/slashdot_______________________________________________>> 
Virtuoso-users mailing list Virtuoso-users@lists.sourceforge.net
            <mailto:Virtuoso-users@lists.sourceforge.net>
                    <mailto:Virtuoso-users@lists.sourceforge.net 
<mailto:Virtuoso-users@lists.sourceforge.net>>
            https://lists.sourceforge.net/lists/listinfo/virtuoso-users 
<https://lists.sourceforge.net/lists/listinfo/virtuoso-users>
                    <https://lists.sourceforge.net/lists/listinfo/virtuoso-users 
<https://lists.sourceforge.net/lists/listinfo/virtuoso-users>>


        
------------------------------------------------------------------------------
        Check out the vibrant tech community on one of the world's most
        engaging tech sites, Slashdot.org! http://sdm.link/slashdot

        _______________________________________________
        Virtuoso-users mailing list
        Virtuoso-users@lists.sourceforge.net 
<mailto:Virtuoso-users@lists.sourceforge.net>
        https://lists.sourceforge.net/lists/listinfo/virtuoso-users 
<https://lists.sourceforge.net/lists/listinfo/virtuoso-users>


-- Best Regards,
    Sergey Malinin




--
Best Regards,
Sergey Malinin


Attachment: smime.p7s
Description: S/MIME Cryptographic Signature

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users

Reply via email to