Re: SQLite, Unicode LC
On Fri, Apr 11, 2014 at 12:46 AM, Peter Haworth p...@lcsql.com wrote:

> Do you know what version of the SQLite library your admin tool is using? I'm wondering if there's some incompatibility in how UTF8 is handled in different versions of the SQLite library.

That's my assumption.

Valentina Studio 5.5.4 reports 3.8.1
Navicat Premium 11.0.16 reports 3.8.1
LC 6.5.2 reports 3.7.4
LC 6.6.1 reports 3.8.3.1

___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
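For anyone who wants to run the same version check outside LC or an admin tool, here is a minimal sketch in Python (not LiveCode), using only the standard `sqlite3` module. The helper name `describe_sqlite` and the in-memory database are assumptions made for illustration; point the connection at a real database file to check that one instead.

```python
import sqlite3


def describe_sqlite(conn):
    """Return (library version, database text encoding) for a connection."""
    # sqlite_version() reports the SQLite library this process is linked against.
    version = conn.execute("SELECT sqlite_version()").fetchone()[0]
    # PRAGMA encoding reports the text encoding of the database itself.
    encoding = conn.execute("PRAGMA encoding").fetchone()[0]
    return version, encoding


conn = sqlite3.connect(":memory:")  # throwaway database for the demo
version, encoding = describe_sqlite(conn)
print("SQLite library:", version)
print("Text encoding:", encoding)
```

The version belongs to whichever program opens the file, while the encoding belongs to the database, so two tools reading the same file can report different versions but should agree on the encoding.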
Re: SQLite, Unicode LC
A few comments below.

On Wed, Apr 9, 2014 at 8:28 PM, Kay C Lan lan.kc.macm...@gmail.com wrote:

> Pete said:
> 1) Exports from iTunes and gets a word like eÜjûzëiÇoò [hope it displays with all the accents] with all the accented chars as garbage.

That's correct when I used TextEdit with its default character encoding (Automatic) for opening files. I just tried it with TextEdit's character encoding set to UTF-8 and the accented characters now show correctly. Apparently TextEdit is unable to detect UTF-8 automatically.

> 2) I don't know how those displayed for him in a LC variable or field.

Looking at it in the variable viewer, it displayed with the corrupted characters.

> 3) He imports that data into SQLite and gets those same garbage chars.
> 4) He used unidecode(uniencode()) to convert the garbage and display correctly in SQLite Management software

Slight correction to that: once the data was in the SQLite database, correctly formatted with uniencode/decode, it displayed correctly in an LC application after doing a SELECT on it. It also displayed correctly in my SQLite admin tool, but since that tool is my SQLiteAdmin utility, which is written with LC, that's probably not a good benchmark :-)

> In my case, when I originally wrote my script (6.1.x) I never used any uniencode or unidecode:
> 1) Exported a file and a word like eÜjûzëiÇoò appeared exactly like that in a BBEdit text file that reported it as UTF8 and Unix CRs.

I don't have BBEdit, but it sounds like it does a better job than TextEdit of detecting character encodings.

> 2) Put it in a LC variable and field and it looked exactly the same.

That's where I get a different result than you: I get the corrupted characters, even when I coerce TextEdit into displaying them correctly. Did you save the file with BBEdit before loading it into LC? If so, maybe that removed the need for the LC uniencode/decode.

> 3) Imported into SQLite UTF8 db and the word looked exactly the same.
> 4) When I SELECTED the record and displayed it in a LC field it looked exactly the same.

I'd expect 3) and 4) to be the case if it looked OK in the LC variable.

> NOW, since updating to LC 6.6.1GM (which has updated SQLite)
> 1) In SQLite original records with accented words look correct.
> 2) When I SELECT I have to use the mentioned unidecode(uniencode()) to display correctly.

I don't see that in 6.6.1. The existing records in my database display correctly in LC after a SELECT with no uniencode/decode.

> BUT NOW in 6.6.1GM if I
> 1) Take a BBEdit UTF8 Unix CRs text file with the word eÜjûzëiÇoò
> 2) Put it in an LC variable or field it still looks correct
> 3) Import it into SQLite without any uniencode and/or unidecode it looks like this
> e j z i o --blank where accented chars should be

Do you know what version of the SQLite library your admin tool is using? I'm wondering if there's some incompatibility in how UTF8 is handled in different versions of the SQLite library.

> 4) When I SELECT the record and display it in an LC variable or field without using uniencode and/or decode it displays correctly.
> 5) So the only problem here is it doesn't display correctly in SQLite
> 6) On the other hand if I employ unidecode(uniencode()) I get this in the db:
> ejzio
> 7) When I SELECT the record and display it in LC I get ejzio with or without using unidecode(uniencode()), or worse if I use any combination of uniencode or unidecode.
>
> So Pete reported accents incorrectly displaying in his text file, and he can correct those by employing unidecode(uniencode()) to look fine in SQLite. I, on the other hand, have correctly displayed accents in text files, but can't get those to appear in SQLite correctly using your suggested solution.
>
> In the long term, unless LC 7.x stuffs things up further, the simplest solution for me seems to be to ignore Unicode altogether: just import the data into SQLite and not look at it using SQLite manager software. If I need to look at it, I'll simply extract the data using LC; I also notice that if I export the data to a UTF8 text file, all the accents appear correctly.
>
> The problem to me seems to revolve around what happened when LC 6.6.x upgraded SQLite, which now seems to prevent my SQLite management software (I tried 3) from correctly displaying accents when it obviously still can.
>
> I'm on OS X 10.9.2, LC 6.6.1GM

Pete
lcSQL Software http://www.lcsql.com
Home of lcStackBrowser http://www.lcsql.com/lcstackbrowser.html and SQLiteAdmin http://www.lcsql.com/sqliteadmin.html
Re: SQLite, Unicode LC
Just talking out my derrière here, but can't you simply uniencode all your data that contains odd characters, then store it as a blob in your database, and unidecode it when you read it out of the database?

Bob

On Apr 8, 2014, at 18:58, Kay C Lan lan.kc.macm...@gmail.com wrote:

> On Wed, Apr 9, 2014 at 7:07 AM, Peter Haworth p...@lcsql.com wrote:
>
>> I am however, whether I like it or not, having to get into the weird world of Unicode (I think). Some of the artist names and CD names in my iTunes library have accented characters which end up in the tab delimited file as not what the original character was. The corrupted characters then end up in my database. I don't have any control over how iTunes exports the data so is it possible for me to ensure that what ends up in my sqlite database is correct? The default text encoding for sqlite db's is UTF-8 but it can be changed to UTF-16, UTF-16le, or UTF-16be.
>
> I've just done a quick test and inserted my problem name into iTunes (appended it to the end of an album name) and I can Export the Playlist and it comes back fine. Are you seeing EVERY accented character corrupted, or are some correct?
>
> As a workaround, with the appropriate playlist selected, select a track, then Select All, Copy and Paste into your favourite Text Editor. Does that come out correctly?
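Bob's blob idea can be tried out quickly. The following is only a sketch in Python (not LiveCode) of the general pattern: encode the text to UTF-16 bytes yourself, store the bytes in a BLOB column, and decode them on the way out. The table name `names` and the helpers `store_name` and `load_name` are hypothetical, invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name BLOB)")


def store_name(conn, text):
    """Encode text to UTF-16 (little endian) and store the raw bytes as a blob."""
    cursor = conn.execute(
        "INSERT INTO names (name) VALUES (?)", (text.encode("utf-16-le"),)
    )
    return cursor.lastrowid


def load_name(conn, row_id):
    """Read the blob back and decode it with the same encoding used to store it."""
    (raw,) = conn.execute(
        "SELECT name FROM names WHERE id = ?", (row_id,)
    ).fetchone()
    return bytes(raw).decode("utf-16-le")


row_id = store_name(conn, "eÜjûzëiÇoò")
print(load_name(conn, row_id))
```

The trade-off is that SQLite no longer treats the column as text, so an admin tool is likely to show raw bytes rather than readable names, and every reader has to know which encoding was used when storing.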
Re: SQLite, Unicode LC
Hi Kay,

I think I have got this working. Looking at the dictionary entry for uniencode, there's a user note with the following recommendation:

    put unidecode(uniencode(tData,"UTF8")) into tData

Tried that and it fixed the problem; the database looks fine now. The user note credits Devin Asay and Dave Cragg for the solution, so thanks to those folks.

Pete
lcSQL Software http://www.lcsql.com
Home of lcStackBrowser http://www.lcsql.com/lcstackbrowser.html and SQLiteAdmin http://www.lcsql.com/sqliteadmin.html

On Tue, Apr 8, 2014 at 6:58 PM, Kay C Lan lan.kc.macm...@gmail.com wrote:

> On Wed, Apr 9, 2014 at 7:07 AM, Peter Haworth p...@lcsql.com wrote:
>
>> I am however, whether I like it or not, having to get into the weird world of Unicode (I think). Some of the artist names and CD names in my iTunes library have accented characters which end up in the tab delimited file as not what the original character was. The corrupted characters then end up in my database. I don't have any control over how iTunes exports the data so is it possible for me to ensure that what ends up in my sqlite database is correct? The default text encoding for sqlite db's is UTF-8 but it can be changed to UTF-16, UTF-16le, or UTF-16be.
>
> I've just done a quick test and inserted my problem name into iTunes (appended it to the end of an album name) and I can Export the Playlist and it comes back fine. Are you seeing EVERY accented character corrupted, or are some correct?
>
> As a workaround, with the appropriate playlist selected, select a track, then Select All, Copy and Paste into your favourite Text Editor. Does that come out correctly?
Re: SQLite, Unicode LC
Kay and Pete,

On Apr 9, 2014, at 10:14 AM, Peter Haworth p...@lcsql.com wrote:

> Hi Kay, I think I have got this working. Looking at the dictionary entry for uniencode, there's a user note with the following recommendation:
>
>     put unidecode(uniencode(tData,"UTF8")) into tData
>
> Tried that and it fixed the problem, database looks fine now. The user note credits Devin Asay and Dave Cragg for the solution so thanks to those folks.

Yep, that's the way to do it for now. Once LC 7 hits the streets it will be slightly different (and simpler).

Devin

Devin Asay
Learn to code with LiveCode University
http://university.livecode.com
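To make the effect of that one-liner easier to see, here is a rough Python analogue (not LiveCode). It rests on an assumption about the LC 6 functions that should be checked against the dictionary: that uniencode(tData,"UTF8") reads tData as UTF-8 and produces UTF-16, and that unidecode with no second argument converts UTF-16 back to the platform's single-byte encoding (MacRoman on a Mac). The function name `utf8_to_native` and the default `mac_roman` are choices made for this sketch only.

```python
def utf8_to_native(data: bytes, native: str = "mac_roman") -> bytes:
    """Approximate unidecode(uniencode(data, "UTF8")) under the stated assumption."""
    # Step 1 (the uniencode part): interpret the incoming bytes as UTF-8 text.
    text = data.decode("utf-8")
    # Step 2 (the unidecode part): squeeze the text into the single-byte
    # native encoding; characters with no native equivalent become "?".
    return text.encode(native, errors="replace")


utf8_input = "Caba\u00f1esas".encode("utf-8")  # "Cabañesas" as UTF-8 bytes
native_bytes = utf8_to_native(utf8_input)
print(list(native_bytes))
```

If the assumption holds, the conversion only helps when the input really is UTF-8 and every character exists in the native encoding; anything outside that set would be lost in step 2.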
Re: SQLite, Unicode LC
Thanks Devin, I was wondering about that. Will the above method still work in 7?

Pete
lcSQL Software http://www.lcsql.com
Home of lcStackBrowser http://www.lcsql.com/lcstackbrowser.html and SQLiteAdmin http://www.lcsql.com/sqliteadmin.html

On Wed, Apr 9, 2014 at 10:05 AM, Devin Asay devin_a...@byu.edu wrote:

> Kay and Pete,
>
> On Apr 9, 2014, at 10:14 AM, Peter Haworth p...@lcsql.com wrote:
>
>> Hi Kay, I think I have got this working. Looking at the dictionary entry for uniencode, there's a user note with the following recommendation:
>>
>>     put unidecode(uniencode(tData,"UTF8")) into tData
>>
>> Tried that and it fixed the problem, database looks fine now. The user note credits Devin Asay and Dave Cragg for the solution so thanks to those folks.
>
> Yep, that's the way to do it for now. Once LC 7 hits the streets it will be slightly different (and simpler).
>
> Devin
>
> Devin Asay
> Learn to code with LiveCode University
> http://university.livecode.com
Re: SQLite, Unicode LC
On Apr 9, 2014, at 12:07 PM, Peter Haworth p...@lcsql.com wrote:

> Thanks Devin, I was wondering about that. Will the above method still work in 7?

I believe it will still work, but will be deprecated.

> Pete
> lcSQL Software http://www.lcsql.com
> Home of lcStackBrowser http://www.lcsql.com/lcstackbrowser.html and SQLiteAdmin http://www.lcsql.com/sqliteadmin.html
>
> On Wed, Apr 9, 2014 at 10:05 AM, Devin Asay devin_a...@byu.edu wrote:
>
>> Kay and Pete,
>>
>> On Apr 9, 2014, at 10:14 AM, Peter Haworth p...@lcsql.com wrote:
>>
>>> Hi Kay, I think I have got this working. Looking at the dictionary entry for uniencode, there's a user note with the following recommendation:
>>>
>>>     put unidecode(uniencode(tData,"UTF8")) into tData
>>>
>>> Tried that and it fixed the problem, database looks fine now. The user note credits Devin Asay and Dave Cragg for the solution so thanks to those folks.
>>
>> Yep, that's the way to do it for now. Once LC 7 hits the streets it will be slightly different (and simpler).

Devin Asay
Office of Digital Humanities
Brigham Young University
Re: SQLite, Unicode LC
On Wed, Apr 9, 2014 at 11:21 AM, Devin Asay devin_a...@byu.edu wrote:

>> Thanks Devin, I was wondering about that. Will the above method still work in 7?
>
> I believe it will still work, but will be deprecated.

Hi Devin,

Yes, just saw that in the 7.0 release notes. I also see that there is a new form of the open file/process/socket commands to deal with different encodings, but I didn't see anything equivalent for the get/put URL commands when reading a file. I use those commands a lot to read files, so I'm wondering if they will be able to handle different encodings?

Pete
lcSQL Software http://www.lcsql.com
Home of lcStackBrowser http://www.lcsql.com/lcstackbrowser.html and SQLiteAdmin http://www.lcsql.com/sqliteadmin.html
Re: SQLite, Unicode LC
On Apr 9, 2014, at 1:07 PM, Peter Haworth p...@lcsql.com wrote:

> On Wed, Apr 9, 2014 at 11:21 AM, Devin Asay devin_a...@byu.edu wrote:
>
>>> Thanks Devin, I was wondering about that. Will the above method still work in 7?
>>
>> I believe it will still work, but will be deprecated.
>
> Hi Devin, Yes, just saw that in the 7.0 release notes. I also see that there is a new form of the open file/process/socket commands to deal with different encodings, but I didn't see anything equivalent for the get/put URL commands when reading a file. I use those commands a lot to read files, so I'm wondering if they will be able to handle different encodings?

Saving:

    on mouseUp
       ask file "Save to file.."
       if it is empty then exit to top
       -- encode the field text as UTF-8 before writing it out
       put textEncode(fld "nonlatin","utf8") into url ("file:" & it)
       put the result
    end mouseUp

Reading in:

    on mouseUp
       answer file "Open file.."
       if it is empty then exit to top
       put url ("file:" & it) into tText
       -- decode the UTF-8 data from the file before displaying it
       put textDecode(tText,"utf8") into fld "nonlatin"
       put the result
    end mouseUp

You have to know the format of the file you're reading in, but it works well.

Devin

Devin Asay
Office of Digital Humanities
Brigham Young University
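The same save/read pattern, and the point about having to know the file's format, can be sketched outside LiveCode. This is a hedged Python illustration, not the LC 7 API: `save_text` and `load_text` are hypothetical helper names, and the temporary file stands in for whatever the ask file / answer file dialogs would return.

```python
import os
import tempfile


def save_text(path, text, encoding="utf-8"):
    """Encode text explicitly and write the resulting bytes to disk."""
    with open(path, "wb") as handle:
        handle.write(text.encode(encoding))


def load_text(path, encoding="utf-8"):
    """Read raw bytes from disk and decode them with the stated encoding."""
    with open(path, "rb") as handle:
        return handle.read().decode(encoding)


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "nonlatin.txt")
    save_text(path, "eÜjûzëiÇoò")
    round_trip = load_text(path)                       # decoded as UTF-8
    wrong_guess = load_text(path, encoding="latin-1")  # decoded with a wrong guess

print(round_trip)
print(round_trip == wrong_guess)
```

Reading with the wrong encoding does not raise an error here; it silently yields different characters, which is why the encoding has to be known rather than guessed.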
Re: SQLite, Unicode LC
On Thu, Apr 10, 2014 at 1:05 AM, Devin Asay devin_a...@byu.edu wrote:

> Yep, that's the way to do it for now.

OK, well that solved my current problem but isn't a solution.

Pete said:
1) Exports from iTunes and gets a word like eÜjûzëiÇoò [hope it displays with all the accents] with all the accented chars as garbage.
2) I don't know how those displayed for him in a LC variable or field.
3) He imports that data into SQLite and gets those same garbage chars.
4) He used unidecode(uniencode()) to convert the garbage and display correctly in SQLite Management software

In my case, when I originally wrote my script (6.1.x) I never used any uniencode or unidecode:
1) Exported a file and a word like eÜjûzëiÇoò appeared exactly like that in a BBEdit text file that reported it as UTF8 and Unix CRs.
2) Put it in a LC variable and field and it looked exactly the same.
3) Imported into SQLite UTF8 db and the word looked exactly the same.
4) When I SELECTED the record and displayed it in a LC field it looked exactly the same.

NOW, since updating to LC 6.6.1GM (which has updated SQLite)
1) In SQLite original records with accented words look correct.
2) When I SELECT I have to use the mentioned unidecode(uniencode()) to display correctly.

BUT NOW in 6.6.1GM if I
1) Take a BBEdit UTF8 Unix CRs text file with the word eÜjûzëiÇoò
2) Put it in an LC variable or field it still looks correct
3) Import it into SQLite without any uniencode and/or unidecode it looks like this
e j z i o --blank where accented chars should be
4) When I SELECT the record and display it in an LC variable or field without using uniencode and/or decode it displays correctly.
5) So the only problem here is it doesn't display correctly in SQLite
6) On the other hand if I employ unidecode(uniencode()) I get this in the db:
ejzio
7) When I SELECT the record and display it in LC I get ejzio with or without using unidecode(uniencode()), or worse if I use any combination of uniencode or unidecode.

So Pete reported accents incorrectly displaying in his text file, and he can correct those by employing unidecode(uniencode()) to look fine in SQLite. I, on the other hand, have correctly displayed accents in text files, but can't get those to appear in SQLite correctly using your suggested solution.

In the long term, unless LC 7.x stuffs things up further, the simplest solution for me seems to be to ignore Unicode altogether: just import the data into SQLite and not look at it using SQLite manager software. If I need to look at it, I'll simply extract the data using LC; I also notice that if I export the data to a UTF8 text file, all the accents appear correctly.

The problem to me seems to revolve around what happened when LC 6.6.x upgraded SQLite, which now seems to prevent my SQLite management software (I tried 3) from correctly displaying accents when it obviously still can.

I'm on OS X 10.9.2, LC 6.6.1GM
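When the database "looks" wrong in one tool and right in another, one way to take the viewer out of the equation is to ask SQLite for the stored bytes in hex. The following is a minimal sketch in Python (not LiveCode); the table `words` and the helper `stored_hex` are made up for the illustration, and SQLite's built-in hex() function does the real work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # new databases default to UTF-8
conn.execute("CREATE TABLE words (word TEXT)")


def stored_hex(conn, word):
    """Insert a word, then return the hex of the bytes SQLite stored for it."""
    conn.execute("DELETE FROM words")
    conn.execute("INSERT INTO words (word) VALUES (?)", (word,))
    return conn.execute("SELECT hex(word) FROM words").fetchone()[0]


print(stored_hex(conn, "\u00f1"))  # "\u00f1" is the character ñ
```

In a UTF-8 database a correctly stored ñ comes back as C3B1. A single byte such as 96 in that position would suggest that native MacRoman bytes went in unconverted, and longer sequences would suggest the text was encoded twice.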
SQLite, Unicode LC
Peter,

Glad I could get you over the first hurdle. It seems, though, that we might be hitting the second hurdle at the same time.

I'm reading a file into LC and it contains this word:

Cabañesas --the 5th char being numToChar(150) on Mac

It ends up in a variable exactly the same. I then feed it into a SQLite database, and if I view it with Valentina Studio or Navicat it appears just the same. But if I then use LC to query the db, the variable contains:

Caba√±esas --the 5th and 6th chars are numToChar(195) and numToChar(177) on Mac

The original file is UTF-8, and SQLite is UTF-8.

I'm not so sure that this is a Unicode problem, as the numbers are so low, but above 125, so maybe it's some other text encoding problem, which I'm trying to nut out right now. Any insights would be much appreciated.

On Wed, Apr 9, 2014 at 7:07 AM, Peter Haworth p...@lcsql.com wrote:

> Thanks for that Kay, I went with that approach and it took perhaps an hour to write the import script.
>
> I am however, whether I like it or not, having to get into the weird world of Unicode (I think). Some of the artist names and CD names in my iTunes library have accented characters which end up in the tab delimited file as not what the original character was. The corrupted characters then end up in my database. I don't have any control over how iTunes exports the data so is it possible for me to ensure that what ends up in my sqlite database is correct? The default text encoding for sqlite db's is UTF-8 but it can be changed to UTF-16, UTF-16le, or UTF-16be.
>
> Assuming that can be done, how do I make sure the artists names and album names are correctly displayed in my fields/option menus/datagrids? Let's assume for now that I will not be using LC7 for this
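The byte values quoted above match what you get when the UTF-8 encoding of ñ is displayed as MacRoman text. The sketch below, in Python (not LiveCode), reproduces the numbers 150, 195 and 177 and the "Caba√±esas" string; it says nothing about where in the LC/SQLite chain the misreading happens, only that the symptoms are consistent with UTF-8 bytes being read as MacRoman.

```python
original = "Caba\u00f1esas"  # "Cabañesas"; \u00f1 is ñ

# In MacRoman, ñ is the single byte 150.
native_bytes = original.encode("mac_roman")

# In UTF-8, ñ is the two bytes 195 and 177.
utf8_bytes = original.encode("utf-8")

# Showing those UTF-8 bytes as if they were MacRoman text gives the garbage.
misread = utf8_bytes.decode("mac_roman")

# Reversing the mistake recovers the original word.
repaired = misread.encode("mac_roman").decode("utf-8")

print(native_bytes[4], list(utf8_bytes[4:6]))
print(misread)
print(repaired)
```

So a pair of characters 195 and 177 where a single 150 was expected points to UTF-8 data that has not been converted to the native encoding, rather than to corrupted data.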
Re: SQLite, Unicode LC
On Wed, Apr 9, 2014 at 7:07 AM, Peter Haworth p...@lcsql.com wrote:

> I am however, whether I like it or not, having to get into the weird world of Unicode (I think). Some of the artist names and CD names in my iTunes library have accented characters which end up in the tab delimited file as not what the original character was. The corrupted characters then end up in my database. I don't have any control over how iTunes exports the data so is it possible for me to ensure that what ends up in my sqlite database is correct? The default text encoding for sqlite db's is UTF-8 but it can be changed to UTF-16, UTF-16le, or UTF-16be.

I've just done a quick test and inserted my problem name into iTunes (appended it to the end of an album name) and I can Export the Playlist and it comes back fine. Are you seeing EVERY accented character corrupted, or are some correct?

As a workaround, with the appropriate playlist selected, select a track, then Select All, Copy and Paste into your favourite Text Editor. Does that come out correctly?