Re: SQLite, Unicode LC

2014-04-11 Thread Kay C Lan
On Fri, Apr 11, 2014 at 12:46 AM, Peter Haworth p...@lcsql.com wrote:

 Do you know what version of the SQLite library your admin tool is using?
  I'm wondering if there's some incompatibility in how UTF8 is handled in
 different versions of the SQLite library.

That's my assumption.

Valentina Studio 5.5.4 reports 3.8.1
Navicat Premium 11.0.16 reports 3.8.1
LC 6.5.2 reports 3.7.4
LC 6.6.1 reports 3.8.3.1
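
For anyone wanting to make the same comparison from inside LC, here is a minimal sketch that asks the SQLite library bundled with LiveCode for its version and for the text encoding of a given database. The function name sqliteInfo and the parameter pDbPath are made up for illustration; sqlite_version() and PRAGMA encoding are standard SQLite, and the rev* calls are the usual revDB ones.

```livecode
function sqliteInfo pDbPath
   local tConn, tVersion, tEncoding
   -- open the database with LC's bundled SQLite driver
   put revOpenDatabase("sqlite", pDbPath) into tConn
   -- a successful open returns an integer connection id, otherwise an error string
   if tConn is not an integer then return "open failed:" && tConn
   -- version of the SQLite library this copy of LC is linked against
   put revDataFromQuery(tab, return, tConn, "SELECT sqlite_version()") into tVersion
   -- text encoding the database file was created with
   put revDataFromQuery(tab, return, tConn, "PRAGMA encoding") into tEncoding
   revCloseDatabase tConn
   return tVersion & tab & tEncoding
end sqliteInfo
```

Typing something like put sqliteInfo("/path/to/your.sqlite") in the message box should show both values on one line, which can then be set against what the admin tools report.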
___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode


Re: SQLite, Unicode LC

2014-04-10 Thread Peter Haworth
A few comments below.

On Wed, Apr 9, 2014 at 8:28 PM, Kay C Lan lan.kc.macm...@gmail.com wrote:

 Pete said:

 1) Exports from iTunes and gets a word like eÜjûzëiÇoò [hope it displays
 with all the accents] with all the accented chars as garbage.


That was the case when I opened the file in TextEdit with its default
character encoding (Automatic).  I just tried it with TextEdit's character
encoding set to UTF-8 and the accented characters now show correctly.
Apparently TextEdit is unable to detect UTF-8 automatically.


 2) I don't know how those displayed for him in a LC variable or field.


Looking at it in the variable viewer, it displayed with the corrupted
characters.


 3) He imports that data into SQLite and gets those same garbage chars.
 4) He used unidecode(uniencode()) to convert the garbage and display
 correctly in SQLite Management software


Slight correction to that: once the data was in the SQLite database,
correctly formatted with uniencode/decode, it displayed correctly in an LC
application after doing a SELECT on it.  It also displayed correctly in my
SQLite admin tool, but since that tool is my own SQLiteAdmin utility, which
is written in LC, that's probably not a good benchmark :-)


 In my case, when I originally wrote my script (6.1.x) I never used any
 uniencode or unidecode:

 1) Exported a file and a word like eÜjûzëiÇoò appeared exactly like that in
 a BBEdit text file that reported it as UTF8 and Unix CRs.


I don't have BBEdit but it sounds like it does a better job than TextEdit
on detecting character encodings.


 2) Put it in a LC variable and field and it looked exactly the same.


That's where I get a different result than you: I get the corrupted
characters, even when I coerce TextEdit into displaying them correctly. Did
you save the file with BBEdit before loading it into LC? If so, maybe that
removed the need for the LC uniencode/decode.


 3) Imported into SQLite UTF8 db and the word looked exactly the same.
 4) When I SELECTED the record and displayed it in an LC field it looked
 exactly the same.


I'd expect 3) and 4) to be the case if it looked OK in the LC variable.


 NOW, since updating to LC 6.6.1GM (which has updated SQLite)

 1) In SQLite original records with accented words look correct.
 2) When I SELECT I have to use the mentioned unidecode(uniencode()) to
 display correctly.


I don't see that in 6.6.1.  The existing records in my database display
correctly in LC after a SELECT with no uniencode/decode.



 BUT NOW in 6.6.1GM if I

 1) Take a BBEdit UTF8 Unix CRs text file with the word eÜjûzëiÇoò
 2) Put it in an LC variable or field it still looks correct
 3) Import it into SQLite without any uniencode and/or unidecode it looks

 like this e j z i o  --blank where accented chars should be


Do you know what version of the SQLite library your admin tool is using?
 I'm wondering if there's some incompatibility in how UTF8 is handled in
different versions of the SQLite library.


 4) When I SELECT the record and display it in an LC variable or field
 without using uniencode and/or decode it displays correctly.
 5) So the only problem here is it doesn't display correctly in SQLite

 6) On the other hand if I employ unidecode(uniencode()) I get this in the
 db: ejzio
 7) When I SELECT the record and display it in LC I get ejzio with or
 without using unidecode(uniencode()) or worse if I use any combination of
 uniencode or unidecode.

 So Pete reported accents incorrectly displaying in his text file, and he
 can correct those by employing unidecode(uniencode()) to look fine in
 SQLite.

 I on the other hand have correctly displayed accents in text files, but
 can't get those to appear in SQLite correctly using your suggested
 solution.

 In the long term, unless LC 7.x stuffs things up further, for me the
 simplest solution seems to be to ignore unicode altogether, just import
 it into SQLite, and not look at it using SQLite Manager software. If I
 need to look at it I'll simply extract the data using LC, or, as I notice,
 if I Export the data to a UTF8 Text file all the accents appear correctly.

 The problem to me seems to revolve around what happened when LC 6.6.x
 upgraded SQLite, which now seems to prevent my SQLite Management software
 (tried 3) from correctly displaying accents when it obviously still can.

 I'm on OS X 10.9.2, LC 6.6.1GM




Pete
lcSQL Software http://www.lcsql.com
Home of lcStackBrowser http://www.lcsql.com/lcstackbrowser.html and
SQLiteAdmin http://www.lcsql.com/sqliteadmin.html


Re: SQLite, Unicode LC

2014-04-09 Thread Bob Sneidar
Just talking out my derrière here, but can’t you simply uniencode all your data 
that contains odd characters, then store it as a blob in your database? 
Unidecode it when you read it out of the database?

Bob




Re: SQLite, Unicode LC

2014-04-09 Thread Peter Haworth
Hi Kay,
I think I have got this working.  Looking at the dictionary entry for
uniencode, there's a user note with the following recommendation:

put unidecode(uniencode(tData,UTF8)) into tData

Tried that and it fixed the problem, database looks fine now.

The user note credits Devin Asay and Dave Cragg for the solution so thanks
to those folks.
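
For reference, a sketch of how that one-liner is typically applied when reading a UTF-8 file in LC 6.x. It assumes the file really is UTF-8 and that the characters in it exist in the platform's native character set (MacRoman on OS X). The function name utf8FileToNative and the parameter pPath are hypothetical.

```livecode
function utf8FileToNative pPath
   local tRaw
   -- binfile: reads the bytes untouched (no line-ending translation)
   put URL ("binfile:" & pPath) into tRaw
   -- uniEncode(..., "UTF8") turns the UTF-8 bytes into LC's UTF-16 form,
   -- uniDecode() with no second parameter converts that to the native encoding
   return uniDecode(uniEncode(tRaw, "UTF8"))
end utf8FileToNative
```

Characters that have no equivalent in the native character set will not survive this conversion, so it suits data such as accented Latin names rather than arbitrary Unicode text.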

Pete
lcSQL Software http://www.lcsql.com
Home of lcStackBrowser http://www.lcsql.com/lcstackbrowser.html and
SQLiteAdmin http://www.lcsql.com/sqliteadmin.html




Re: SQLite, Unicode LC

2014-04-09 Thread Devin Asay
Kay and Pete,


On Apr 9, 2014, at 10:14 AM, Peter Haworth p...@lcsql.com wrote:

 Hi Kay,
 I think I have got this working.  Looking at the dictionary entry for
 uniencode, there's a user note with the following recommendation:
 
 put unidecode(uniencode(tData,UTF8)) into tData
 
 Tried that and it fixed the problem, database looks fine now.
 
 The user note credits Devin Asay and Dave Cragg for the solution so thanks
 to those folks.

Yep, that's the way to do it for now. Once LC 7 hits the streets it will be 
slightly different (and simpler.) 

Devin


Devin Asay
Learn to code with LiveCode University
http://university.livecode.com






Re: SQLite, Unicode LC

2014-04-09 Thread Peter Haworth
Thanks Devin, I was wondering about that.  Will the above method still work
in 7?

Pete
lcSQL Software http://www.lcsql.com
Home of lcStackBrowser http://www.lcsql.com/lcstackbrowser.html and
SQLiteAdmin http://www.lcsql.com/sqliteadmin.html




Re: SQLite, Unicode LC

2014-04-09 Thread Devin Asay

On Apr 9, 2014, at 12:07 PM, Peter Haworth p...@lcsql.com wrote:

 Thanks Devin, I was wondering about that.  Will the above method still work
 in 7?

I believe it will still work, but will be deprecated.
 

Devin Asay
Office of Digital Humanities
Brigham Young University




Re: SQLite, Unicode LC

2014-04-09 Thread Peter Haworth
On Wed, Apr 9, 2014 at 11:21 AM, Devin Asay devin_a...@byu.edu wrote:

 Thanks Devin, I was wondering about that.  Will the above method still
 work in 7?

 I believe it will still work, but will be deprecated.


Hi Devin,
Yes, just saw that in the 7.0 release notes.

I also see that there is a new form of the open file/process/socket
commands to deal with different encodings but I didn't see anything
equivalent when using the get/put URL commands to read a file.  I use those
commands a lot to read files so wondering if they will be able to handle
different encodings?

Pete
lcSQL Software http://www.lcsql.com
Home of lcStackBrowser http://www.lcsql.com/lcstackbrowser.html and
SQLiteAdmin http://www.lcsql.com/sqliteadmin.html


Re: SQLite, Unicode LC

2014-04-09 Thread Devin Asay

On Apr 9, 2014, at 1:07 PM, Peter Haworth p...@lcsql.com wrote:

 On Wed, Apr 9, 2014 at 11:21 AM, Devin Asay devin_a...@byu.edu wrote:
 
 Thanks Devin, I was wondering about that.  Will the above method still
 work in 7?
 
 I believe it will still work, but will be deprecated.
 
 
 Hi Devin,
 Yes, just saw that in the 7.0 release notes.
 
 I also see that there is a new form of the open file/process/socket
 commands to deal with different encodings but I didn't see anything
 equivalent when using the get/put URL commands to read a file.  I use those
 commands a lot to read files so wondering if they will be able to handle
 different encodings?

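Saving:

on mouseUp
   -- ask where to save; the chosen path comes back in "it"
   ask file "Save to file.."
   if it is empty then exit to top
   -- encode the field's text as UTF-8 before writing it out
   put textEncode(fld "nonlatin","utf8") into url ("file:" & it)
   put the result
end mouseUp

Reading in:

on mouseUp
   -- ask which file to open; the chosen path comes back in "it"
   answer file "Open file.."
   if it is empty then exit to top
   put url ("file:" & it) into tText
   -- decode the UTF-8 bytes into LC text for display in the field
   put textDecode(tText,"utf8") into fld "nonlatin"
   put the result
end mouseUp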

You have to know the format of the file you're reading in, but it works well.
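
A variant of the two handlers above as reusable routines, sketched under the assumption of LC 7 or later. It uses binfile: rather than file: so that the encoded bytes are written and read without any line-ending translation. That choice, and the names readUTF8File and writeUTF8File, are assumptions for illustration and not part of Devin's post.

```livecode
-- read a UTF-8 file and return it as LC text
function readUTF8File pPath
   local tBytes
   put URL ("binfile:" & pPath) into tBytes
   return textDecode(tBytes, "UTF-8")
end readUTF8File

-- write LC text to a file as UTF-8 bytes
command writeUTF8File pPath, pText
   put textEncode(pText, "UTF-8") into URL ("binfile:" & pPath)
end writeUTF8File
```

With these in a stack script, something like writeUTF8File tPath, field "nonlatin" followed by put readUTF8File(tPath) into field "nonlatin" should round-trip the text, provided the file on disk really is UTF-8.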

Devin


Devin Asay
Office of Digital Humanities
Brigham Young University




Re: SQLite, Unicode LC

2014-04-09 Thread Kay C Lan
On Thu, Apr 10, 2014 at 1:05 AM, Devin Asay devin_a...@byu.edu wrote:


 Yep, that's the way to do it for now.

OK, well that solved my current problem but isn't a solution.

Pete said:

1) Exports from iTunes and gets a word like eÜjûzëiÇoò [hope it displays
with all the accents] with all the accented chars as garbage.
2) I don't know how those displayed for him in a LC variable or field.
3) He imports that data into SQLite and gets those same garbage chars.
4) He used unidecode(uniencode()) to convert the garbage and display
correctly in SQLite Management software

In my case, when I originally wrote my script (6.1.x) I never used any
uniencode or unidecode:

1) Exported a file and a word like eÜjûzëiÇoò appeared exactly like that in
a BBEdit text file that reported it as UTF8 and Unix CRs.
2) Put it in a LC variable and field and it looked exactly the same.
3) Imported into SQLite UTF8 db and the word looked exactly the same.
4) When I SELECTED the record and displayed it in an LC field it looked
exactly the same.

NOW, since updating to LC 6.6.1GM (which has updated SQLite)

1) In SQLite original records with accented words look correct.
2) When I SELECT I have to use the mentioned unidecode(uniencode()) to
display correctly.

BUT NOW in 6.6.1GM if I

1) Take a BBEdit UTF8 Unix CRs text file with the word eÜjûzëiÇoò
2) Put it in an LC variable or field it still looks correct
3) Import it into SQLite without any uniencode and/or unidecode it looks
like this e j z i o  --blank where accented chars should be
4) When I SELECT the record and display it in an LC variable or field
without using uniencode and/or decode it displays correctly.
5) So the only problem here is it doesn't display correctly in SQLite

6) On the other hand if I employ unidecode(uniencode()) I get this in the
db: ejzio
7) When I SELECT the record and display it in LC I get ejzio with or
without using unidecode(uniencode()) or worse if I use any combination of
uniencode or unidecode.

So Pete reported accents incorrectly displaying in his text file, and he
can correct those by employing unidecode(uniencode()) to look fine in
SQLite.

I on the other hand have correctly displayed accents in text files, but
can't get those to appear in SQLite correctly using your suggested solution.

In the long term, unless LC 7.x stuffs things up further, for me the
simplest solution seems to be to ignore unicode altogether, just import
it into SQLite, and not look at it using SQLite Manager software. If I
need to look at it I'll simply extract the data using LC, or, as I notice,
if I Export the data to a UTF8 Text file all the accents appear correctly.

The problem to me seems to revolve around what happened when LC 6.6.x
upgraded SQLite, which now seems to prevent my SQLite Management software
(tried 3) from correctly displaying accents when it obviously still can.

I'm on OS X 10.9.2, LC 6.6.1GM


SQLite, Unicode LC

2014-04-08 Thread Kay C Lan
Peter,

Glad I could get you over the first hurdle.

Seems though we might be hitting the 2nd hurdle at the same time.

I'm reading a file into LC and it contains this word:

Cabañesas  --the 5th char being numToChar(150) on Mac

it ends up in a variable exactly the same. I then feed it into a SQLite
database and if I view it with Valentina Studio or Navicat it appears just
the same. But if I then use LC to query the db the variable contains:

Caba√±esas  --5th & 6th chars are numToChar(195) & (177) on Mac

The original file is UTF-8, SQLite is UTF-8

I'm not so sure that this is a Unicode problem as the numbers are so low,
but they are above 127, so maybe it's some other text encoding problem,
which I'm trying to nut out right now.

Any insights would be much appreciated.
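
The byte values reported above are consistent with UTF-8 bytes being displayed as MacRoman characters: ñ is a single byte (150) in MacRoman but two bytes (195, 177) in UTF-8. A small sketch to check this, assuming LC 6.x on a Mac where the native encoding is MacRoman; the handler name showUTF8Bytes is illustrative only.

```livecode
on showUTF8Bytes
   local tNative, tUTF8, tCodes
   -- "ñ" as a single byte in MacRoman, the native encoding on OS X
   put numToChar(150) into tNative
   -- native text -> UTF-16 -> UTF-8 bytes (the LC 6.x idiom)
   put uniDecode(uniEncode(tNative), "UTF8") into tUTF8
   -- list the numeric value of each resulting byte
   repeat with i = 1 to the number of chars in tUTF8
      put charToNum(char i of tUTF8) & space after tCodes
   end repeat
   -- show the values in the message box for comparison with those above
   put tCodes
end showUTF8Bytes
```

If the values match, the data coming back from the query is still UTF-8 and has simply not been converted to the native encoding before display.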


On Wed, Apr 9, 2014 at 7:07 AM, Peter Haworth p...@lcsql.com wrote:

 Thanks for that Kay, I went with that approach and it took perhaps an hour
 to write the import script.

 I am however, whether I like it or not, having to get into the weird world
 of Unicode (I think).  Some of the artist names and CD names in my iTunes
 library have accented characters which end up in the tab delimited file as
 not what the original character was. The corrupted characters then end up
 in my database.

 I don't have any control over how iTunes exports the data so is it possible
 for me to ensure that what ends up in my sqlite database is correct?  The
 default text encoding for sqlite db's is UTF-8 but it can be changed to
 UTF-16, UTF-16le, or UTF-16be.

 Assuming that can be done, how do I make sure the artists names and album
 names are correctly displayed in my fields/option menus/datagrids?  Let's
 assume for now that I will not be using LC7 for this

Re: SQLite, Unicode LC

2014-04-08 Thread Kay C Lan
On Wed, Apr 9, 2014 at 7:07 AM, Peter Haworth p...@lcsql.com wrote:

I am however, whether I like it or not, having to get into the weird world
 of Unicode (I think).  Some of the artist names and CD names in my iTunes
 library have accented characters which end up in the tab delimited file as
 not what the original character was. The corrupted characters then end up
 in my database.

 I don't have any control over how iTunes exports the data so is it possible
 for me to ensure that what ends up in my sqlite database is correct?  The
 default text encoding for sqlite db's is UTF-8 but it can be changed to
 UTF-16, UTF-16le, or UTF-16be.


I've just done a quick test and inserted my problem name into iTunes
(appended it to the end of an album name) and I can Export the Playlist and
it comes back fine. Are you seeing EVERY accented character corrupted, or
are some correct?

As a workaround, with the appropriate playlist selected, select a track,
then Select All, Copy and Paste into your favourite Text Editor. Does that
come out correctly?