On 03/23/2017 12:45 AM, Beppe Mazzola wrote:
Thanks SergeyYour suggestion inspired me to do some more trials. Declaring the text column VARCHAR(3000) instead of NVARCHAR(1000)the problem disappears, but I have to increase the column size because somehow Virtuoso recognizes the text as non-Latin and uses UTF-8 that occupies more space (till 4 times more than the default encoding). I cannot declare my text columns VARCHAR, because I must be able to host non-Latin text till 4000 characters long. In order to do this I should declare the columnVARCHAR(16000), whereas the maximum size is 4082. Unfortunately it seem the use of VARCHAR columns is not a solution for me... Yet as the Virtuoso documentation <http://docs.openlinksw.com/virtuoso/datatypes/>says that NVARCHAR is dedicated to Unicode: /"There are 3 additional data types to enable storing of Unicode data: / * /NCHAR/ * */NVARCHAR/* * /LONG NVARCHAR//All the Unicode types are equivalent to their corresponding "narrow" type - CHAR, VARCHAR and LONG VARCHAR - except that instead of storing data as one byte they allow Unicode characters. Their lengths are defined and returned in characters instead of bytes."/The data in my project are loaded with an ETL process implemented using the Virtuoso bulk load feature (DB.DBA.csv_register() and DB.DBA.csv_loader_run() procedures).Can I still I use the ODBC / JDBC drivers to load my data into the RDB as somebody suggested?
[Sergey]It is a bad idea to store UTF8 data in VARCHAR() column, it isn't portable and you will need decode UTF-8 data by hand, if you get data from table via JDBC/ODBC drivers.
1) I think, that "SQL_UTF8_EXECS = 1" options will be usefull only, if you want to create tables/columns with Non-Latin names like for example create table "таблица1" (... 2) If you need to store Unicode data in table, you must to use datatypes: NCHAR, NVARCHAR, LONG NVARCHAR, NCLOB !!! 3) The JDBC/ODBC drivers inserts Unicode data properly to datatypes above. For ODBC programs the ODBC procedures like SQLExecDirectW / SQLPrepareW must be used for properly send Unicode data to server. 4) If you want insert Unicode data to table or to use Unicode value in WHERE, the N' prefix must be used. Example to insert: insert into table test values(1, N'АБВГД'); 5) I write already, that there is a bug in the Virtuoso Conductor UI, so it send Unicode data to server via wrong way. NOW Conductor UI works properly only for select unicode data from tables. So try to use any tools that works via ODBC/JDBC/ADO.NET Virtuoso drivers, it will works properly and you will not have problems with use unicode data. For example for JDBC => http://www.sql-workbench.net/downloads.html
Cheers Beppe 2017-03-21 11:20 GMT+00:00 Sergey Malinin <sergmali...@gmail.com <mailto:sergmali...@gmail.com>>: On 03/21/2017 04:42 AM, Beppe Mazzola wrote: Related to the problem I have about storing non-Latin character in Virtuoso RDB, I see now that I haven't provided all the info related to it. So it was not evident that the Hungarian text written in the DB was not the same read from it immediately after, despite the Virtuoso UTF-8 set-up has been performed I bring here again the test case (now simplified and completed) to reproduce the problem. It has been performed with Conductor Interactive SQL. It looks, that Conductor Interactive SQL does something wrong, when SQL insert(and etc) queries are executed, it inserts wrong data to table. But Conductor Interactive SQL shows unicode data properly, if it was inserted via ODBC/JDBC/ADO.NET <http://ADO.NET> . Also note: 1) you don't need to set "SQL_UTF8_EXECS = 1" in virtuoso.ini http://docs.openlinksw.com/virtuoso/wideidentifiers/ <http://docs.openlinksw.com/virtuoso/wideidentifiers/> for simple insert unicode data to columns. 2) for insert unicode values, the prefix N must be used insert into test values(1, N'АБВГ') 3) try insert unicode data via ODBC/JDBC/ADO.NET <http://ADO.NET> drivers, it must be inserted properly. 1) log in as dba 2) Execute the following statements CREATE TABLE DB.test.HST_FTN ( FOOTNOTES_ID NUMERIC NOT NULL, FOOTNOTE_TEXT NVARCHAR(1000) NOT NULL, CONSTRAINT PK_V_SEM_MRL_HST_FTN PRIMARY KEY ( FOOTNOTES_ID ) ); INSERT INTO DB.test.HST_FTN (FOOTNOTES_ID,FOOTNOTE_TEXT) VALUES (4,'След тази дата МДГОВ ще бъде 0,05(*) mg/kg, освен ако не бъде изменена с директива или регламент.'); SELECT * FROM DB.test.HST_FTN ; 3) This is the result of the selectdisplayed, no longer the original text Query result: FOOTNOTES_ID DECIMAL FOOTNOTE_TEXT NVARCHAR 4 След тази дата МДГОВ ще бъде 0,05(*) mg/kg, Ð¾Ñ Ð²ÐµÐ½ ако не бъде изменена Ñ Ð´Ð¸Ñ€ÐµÐºÑ‚Ð¸Ð²Ð° или регламент. No. of rows in result: 1 Here below again is the Virtuoso set up performed for UTF-8 . * all the table columns containing non-Latin text declared as NVARCHAR(n) * in virtuoso.ini file SQL_UTF8_EXECS = 1 ... [HTTPServer] Charset = UTF-8 ; this is the default value I do not know if there is still some set up missing or if it's a Virtuoso issue. It's not clear to me as well whether the text is storied corrupted in the DB by the INSERT statement or it is somehow corrupted later, when Conductor reads it from the DB and returns it to the browser. Any idea, hint are very welcome! Thanks Cheers Beppe 2017-02-27 10:21 GMT+00:00 Beppe Mazzola <beppemazz...@gmail.com <mailto:beppemazz...@gmail.com> <mailto:beppemazz...@gmail.com <mailto:beppemazz...@gmail.com>>>: Hi Hugh This should reproduce the problem. In my Virtuoso installation the Bulgarian text inserted in the test table is different to the one displayed as a result of the select. 1) Create a user with name test with the following setting. [picture removed] 2) log in as user test3) Execute the following statements CREATE TABLE DB.test.HST_FTN ( FOOTNOTES_ID NUMERIC NOT NULL, FOOTNOTE_TEXT NVARCHAR(1000) NOT NULL, CONSTRAINTPK_V_SEM_MRL_HST_FTN PRIMARY KEY ( FOOTNOTES_ID ) ); INSERT INTO DB.test.HST_FTN (FOOTNOTES_ID,FOOTNOTE_TEXT) VALUES (4,'След тази дата МДГОВ ще бъде 0,05(*) mg/kg, освен ако не бъде изменена с директива или регламент.'); SELECT * FROM DB.test.HST_FTN ; Cheers Beppe 2017-02-25 22:42 GMT+00:00 Hugh Williams <hwilli...@openlinksw.com <mailto:hwilli...@openlinksw.com> <mailto:hwilli...@openlinksw.com <mailto:hwilli...@openlinksw.com>>>: Hi Beppe, That should be all that is required on the server for Unicode / UTF-8 support. Do you have a test case for recreating the problem being observed, which should show how the data is being inserted and the queried ? Best Regards Hugh Williams Professional Services OpenLink Software, Inc. // http://www.openlinksw.com/ Weblog -- http://www.openlinksw.com/blogs/ <http://www.openlinksw.com/blogs/> <http://www.openlinksw.com/blogs/ <http://www.openlinksw.com/blogs/>> LinkedIn -- http://www.linkedin.com/company/openlink-software/ <http://www.linkedin.com/company/openlink-software/> <http://www.linkedin.com/company/openlink-software/ <http://www.linkedin.com/company/openlink-software/>> Twitter -- http://twitter.com/OpenLink Google+ -- http://plus.google.com/100570109519069333827/ <http://plus.google.com/100570109519069333827/> <http://plus.google.com/100570109519069333827/ <http://plus.google.com/100570109519069333827/>> Facebook -- http://www.facebook.com/OpenLinkSoftware <http://www.facebook.com/OpenLinkSoftware> <http://www.facebook.com/OpenLinkSoftware <http://www.facebook.com/OpenLinkSoftware>> Universal Data Access, Integration, and Management Technology Providers On 22 Feb 2017, at 14:20, Beppe Mazzola <beppemazz...@gmail.com <mailto:beppemazz...@gmail.com> <mailto:beppemazz...@gmail.com <mailto:beppemazz...@gmail.com>>> wrote: Hi to all In order to have Virtuoso relational DB able to correctly store the text with the UTF-8 encoding I have performed the following operations. * all the columns with text declared as NVARCHAR(n) * in virtuoso.ini file o SQL_UTF8_EXECS = 1 o Charset = UTF-8 ; this is the default value This is not enough, since I see that the non-Latin characters are incorrectly stored in the DB yet. Do I miss any further set-up? I am using the following Virtuoso Open Source Version Virtuoso Open Source Edition (Column Store) (multi threaded) Version 7.2.4.2.3217-pthreads as of Feb 3 2017 Compiled for Linux (x86_64-unknown-linux-gnu) Thanks! Cheers Beppe ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org <http://slashdot.org>! http://sdm.link/slashdot_______________________________________________ <http://sdm.link/slashdot_______________________________________________> <http://sdm.link/slashdot_______________________________________________ <http://sdm.link/slashdot_______________________________________________>> Virtuoso-users mailing list Virtuoso-users@lists.sourceforge.net <mailto:Virtuoso-users@lists.sourceforge.net> <mailto:Virtuoso-users@lists.sourceforge.net <mailto:Virtuoso-users@lists.sourceforge.net>> https://lists.sourceforge.net/lists/listinfo/virtuoso-users <https://lists.sourceforge.net/lists/listinfo/virtuoso-users> <https://lists.sourceforge.net/lists/listinfo/virtuoso-users <https://lists.sourceforge.net/lists/listinfo/virtuoso-users>> ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _______________________________________________ Virtuoso-users mailing list Virtuoso-users@lists.sourceforge.net <mailto:Virtuoso-users@lists.sourceforge.net> https://lists.sourceforge.net/lists/listinfo/virtuoso-users <https://lists.sourceforge.net/lists/listinfo/virtuoso-users>-- Best Regards,Sergey Malinin
-- Best Regards, Sergey Malinin
smime.p7s
Description: S/MIME Cryptographic Signature
------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________ Virtuoso-users mailing list Virtuoso-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/virtuoso-users