Re: opening txt files
This is the first time I've asked a question on use-livecode and I've been pleasantly amazed that people have taken the time to give so much useful advice - much respect, and thank you to everyone.

I think I now have a solution which works, and I've learnt some interesting things too. In summary:

(1) Bob's idea (below) is the way to differentiate between UTF-8 and UTF-16. The program can react by ignoring alternate characters if it finds FF FE as the first two characters (thanks HTH for some tidy code but note that the first character after the FF FE is the valid one).

(2) The comment on the Apple discussion (also below) would seem to be right. In case (1) TextWrangler reports a Unicode UTF-8 file and in case (2) a Unicode UTF-16.

(3) TextEdit seems to resolve the encoding issue before displaying the file so TextWrangler is better for nit-pickers (thanks for that, Francis :) my wife appreciates the extra ammunition!) who want to see everything.

Onwards and upwards,
Nishok

From: Robert Sneidar slylab...@me.com

I will hazard a guess, that when you open the file for reading, you can open binary first and see if the first two characters amount to FE FF, yes? If so, treat as UTF-16. If not, treat as UTF-8. I have not tested this strategy myself, but your second point seems to give the clue to solve this mystery.

Bob

On Jan 16, 2013, at 9:15 AM, Nishok Love wrote:

Thanks, Bob. Your command works but the same results occur. Further investigations here found this

   When Pages is used to export as Text, the resulting file may be of two kinds:
   (1) if the document contained only characters included in Apple MacRoman charset, the file is a pure text file based on Apple MacRoman encoding.
   (2) if the document contained extraneous characters the created text file take care of this feature and uses the UTF encoding (two bytes per character) and starts with the logical BOM: FE FF.

which I've copied from the discussion on https://discussions.apple.com/message/9518841?messageID=9518841#9518841?messageID=9518841

___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
Re: opening txt files
Hi friends,

On 16.01.2013 at 18:15, Nishok Love nishok.l...@virgin.net wrote:

... So I'm still looking for a way for LiveCode to spot whether it's opening a file in UTF-8 or UTF-16 (or something else - aaarrgh!). Can I access the file header? read from file just gives me the data...

I could read the file, count the number of characters and how many of them are spaces and from that I could infer which format is being used. Probably this would be reliable for my purposes - just not very elegant!

Nishok

I found an old script that Mark Waddingham supplied in the past when I had some problems reading VCards in 3.0 format (unicode). I think it can be used to open ANY txt file. I do not fully understand it, so I leave it uncommented ;-) In any case it will convert any given text file to Livecode readable plain text. Comments are from Mark W.

###
-- vCards are stored as a text file, however, the text encoding used varies
-- depending on the program that exported them.
--
-- We use the following heuristic to detect encoding:
--   1) If there is the byte order mark 0xFEFF then we assume UTF-16BE
--   2) If there is the byte order mark 0xFFFE then we assume UTF-16LE
--   3) If the first byte is 0x00 then we assume UTF-16BE (compatibility
--      with Tiger Address Book)
--   4) Otherwise we assume UTF-8
--
function importVCard pFilename
   -- First load the vCard as binary data - at this stage we don't know
   -- the text encoding of the file and loading as text would cause
   -- inappropriate line ending conversion.
   local tBinaryVCard
   put url ("binfile:" & pFilename) into tBinaryVCard

   -- This variable will hold the vCard encoded in MacRoman (the default
   -- text encoding Revolution uses on Mac OS X)
   local tNativeVCard

   -- We now do our checks to detect text encoding
   local tTextEncoding
   if charToNum(char 1 of tBinaryVCard) is 0 then
      put "UTF16BE" into tTextEncoding
   else if charToNum(char 1 of tBinaryVCard) is 0xFE and charToNum(char 2 of tBinaryVCard) is 0xFF then
      delete char 1 to 2 of tBinaryVCard
      put "UTF16BE" into tTextEncoding
   else if charToNum(char 1 of tBinaryVCard) is 0xFF and charToNum(char 2 of tBinaryVCard) is 0xFE then
      delete char 1 to 2 of tBinaryVCard
      put "UTF16LE" into tTextEncoding
   else
      put "UTF8" into tTextEncoding
   end if

   if tTextEncoding begins with "UTF16" then
      -- Work out the processor's byte order
      local tHostByteOrder
      if the processor is "x86" then
         put "LE" into tHostByteOrder
      else
         put "BE" into tHostByteOrder
      end if

      -- If the byte orders don't match, switch the order of pairs of bytes
      if char -2 to -1 of tTextEncoding is not tHostByteOrder then
         repeat with x = 1 to the length of tBinaryVCard step 2
            get char x of tBinaryVCard
            put char x + 1 of tBinaryVCard into char x of tBinaryVCard
            put it into char x + 1 of tBinaryVCard
         end repeat
      end if

      -- Decode the UTF-16 to native
      put uniDecode(tBinaryVCard) into tNativeVCard
   else
      -- Use the standard uniDecode/uniEncode pair to decode the UTF-8 encoding
      put uniDecode(uniEncode(tBinaryVCard, "UTF8")) into tNativeVCard
   end if

   -- We now need to normalize line endings to make sure all lines terminate
   -- in 'return' (numToChar(10)).
   local tTextVCard
   put tNativeVCard into tTextVCard

   -- First replace Windows CR-LF style endings
   replace numToChar(13) & numToChar(10) with return in tTextVCard

   -- Now replace Mac OS CR style endings
   replace numToChar(13) with return in tTextVCard

   return tTextVCard
end importVCard

-- The Tiger version of Apple Address Book (4.0.4) exports vCard files
-- as UTF-16 big endian without a BOM if the record contains any non-ASCII
-- characters.
-- If there are no non-ASCII characters, the record is just left as
-- ASCII with no conversion to UTF-16.
-- On Leopard, it seems that Apple Address Book exports vCard files
-- in UTF-8 regardless.
function importAppleAddressVCard pFilename
   -- First load the vCard as binary data - at this stage we don't know
   -- the text encoding of the file and loading as text would cause
   -- inappropriate line ending conversion.
   local tBinaryVCard
   put url ("binfile:" & pFilename) into tBinaryVCard

   -- This variable will hold the vCard encoded in MacRoman (the default
   -- text encoding Revolution uses on Mac OS X)
   local tNativeVCard

   -- Okay so now we have the binary data, we need to decide if it is
   -- UTF-16BE or ASCII/UTF-8. This is easy to do since the first character of
   -- a vCard has to be an ASCII character. If the record has been encoded
   -- as UTF-16BE, then this means this will translate as the first byte
   -- being the NUL (0) character.
   if charToNum(char 1 of tBinaryVCard) is 0 then
      -- We are UTF-16BE
      -- We now know that tBinaryVCard is big endian UTF-16 since Revolution
      -- only handles host byte
Re: opening txt files
Hey that's nice, thanks!

Bob
Re: opening txt files
Hi from Beautiful Brittany,

Hi Nishok,

If you are a nit-picker, and REALLY want to know why you have this problem, then my response is simple - I don't know!

If you want a work-around, it's simple - select your text when you are in Word, and paste it into a new text file (TextEdit), and save it. You have pure text.

-Francis
Re: opening txt files
Thanks, Bob. Your command works but the same results occur. Further investigations here found this

   When Pages is used to export as Text, the resulting file may be of two kinds:
   (1) if the document contained only characters included in Apple MacRoman charset, the file is a pure text file based on Apple MacRoman encoding.
   (2) if the document contained extraneous characters the created text file take care of this feature and uses the UTF encoding (two bytes per character) and starts with the logical BOM: FE FF.

which I've copied from the discussion on https://discussions.apple.com/message/9518841?messageID=9518841#9518841?messageID=9518841

Opening both files with TextEdit (which displays both of them correctly, ie without all those extra spaces), duplicating them and then watching the save options shows that one file (the one from Pages) is using UTF-16 whilst Word's Western (Mac OS Roman) export is in UTF-8. Using GetInfo I can now see that the UTF-16 file is twice the size of the other. In short, text files are not as simple as they used to be!

So I'm still looking for a way for LiveCode to spot whether it's opening a file in UTF-8 or UTF-16 (or something else - aaarrgh!). Can I access the file header? read from file just gives me the data...

I could read the file, count the number of characters and how many of them are spaces and from that I could infer which format is being used. Probably this would be reliable for my purposes - just not very elegant!

Nishok
Re: opening txt files
Why not use RTF?

Richmond.
Re: opening txt files
I did not see an RTF export option in Pages. Besides, I think he is dealing with reading text files, the nature of which he does not control.

Bob
Re: opening txt files
I will hazard a guess, that when you open the file for reading, you can open binary first and see if the first two characters amount to FE FF, yes? If so, treat as UTF-16. If not, treat as UTF-8. I have not tested this strategy myself, but your second point seems to give the clue to solve this mystery.

Bob
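Bob's check can be sketched as a short LiveCode function. This is a minimal, untested sketch rather than anything posted in the thread: the name guessTextEncoding is made up for illustration, and it looks for both byte orders of the mark (FE FF and FF FE), since either can open a UTF-16 file. Treating everything without a mark as UTF-8 is an assumption carried over from Bob's suggestion.

```livecode
-- Hypothetical helper: guess a text file's encoding from its first two bytes.
-- With no byte order mark present, the UTF-8 result is only a guess.
function guessTextEncoding pFilePath
   local tData, tFirst, tSecond
   -- binfile: reads the raw bytes, with no line-ending or charset translation
   put url ("binfile:" & pFilePath) into tData
   if length(tData) < 2 then
      return "UTF-8"
   end if
   put byteToNum(byte 1 of tData) into tFirst
   put byteToNum(byte 2 of tData) into tSecond
   if tFirst is 254 and tSecond is 255 then
      return "UTF-16BE" -- byte order mark FE FF
   else if tFirst is 255 and tSecond is 254 then
      return "UTF-16LE" -- byte order mark FF FE
   else
      return "UTF-8"
   end if
end guessTextEncoding
```

Called with the path picked through answer file, the function returns one of three names that the rest of the script can branch on.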
Re: opening txt files
I am not sure why you are seeing this. I exported a pages newsletter file as text, then ran this command on it:

   on mouseUp pMouseBtnNo
      answer file "Pick a text file" with "/Users/bobsneidar/Desktop/SneidarNewsletter.txt"
      put it into theFile
      open file theFile for read
      read from file theFile until cr
      put it
      close file theFile
   end mouseUp

I got this in the message box:

   2005 Summer Edition

Seems to work.

Bob

On Jan 15, 2013, at 10:34 AM, NISHOK LOVE wrote:

Hi All

I have a problem when I open .txt files in OSX, and I don't have much (any!) experience of reading files in LiveCode.

I have a file originally written in Word on Windows. When I export it as a .txt from Word for Mac I just accept the default Mac OS encoding option (Western (Mac OS Roman)) and it all works fine when I open the file in my LiveCode.

But when I open the original file in Pages and export it as Plain Text, I get a different result. When I open that file in LiveCode I find a space has been inserted after every character. So Hello world becomes H e l l o w o r l d.

I guess this is a problem with the encoding, but how can my LiveCode understand what the incoming file's encoding is and respond accordingly? My LiveCode needs to be able to deal with any kind of text file...

Thanks,
Nishok Love
Re: opening txt files
I have seen the behavior Nishok describes. It was some time ago and I don't remember what the issue was, but I think it was fixable.

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig
Re: opening txt files
What happens when you open the Pages converted file in TextEdit, does it have the extra spaces? If so the problem is with the conversion process from Pages, not with LC. In that case you need to look at some of the conversion options Pages offers and see if you can create the file without the extra spaces.

If TextEdit opens the converted file just fine, then you may need to download the free TextWrangler:

http://www.barebones.com/products/textwrangler/download.html

You can then select 'Show invisibles' from the 'T' toolbar icon. If that doesn't reveal anything then use the 'Zap Gremlins...' option in the Text menu. Tick the boxes, use 'Replace with' and put in your own distinct character. This will then highlight if there are any chars that Pages is leaving behind that TextEdit is interpreting as non-visible but LC is interpreting as a space.

If all else fails, assuming you have a script that selects the file and puts it in a variable, I'll call tData, and you have a breakpoint immediately after this, when you look at the variable tData it looks like this:

   H E L L O...

So in the next line of your script after the breakpoint put:

   --to determine the ASCII number of the bad character
   put charToNum(char 2 of tData) into tBadChar

Run your script again. When it stops at the breakpoint, step once so the above line is executed, then check what is in tBadChar. Hopefully it won't be 32 - which is what a regular space is. You should be careful though, as there may be a bad character before the H so maybe the number you get is for the H. Basically any number between 32-126 is probably valid. Anything outside this range is likely to be the cause of your problem. So you may need to test char 1, char 2 and char 3 just to make sure you are looking at the bad character.

Once you've positively identified the ASCII value of the bad character then simply add this further line of script after the last line you added:

   --replace bad ASCII with nothing
   replace numToChar(tBadChar) with empty in tData

Again, if the number is 32 this means Pages is doing the wrong thing with its conversion and it's Pages' problem, and it would be nigh impossible for LC to repair this.

HTH
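The byte inspection described above can also be done in one pass over the start of the file instead of one character at a time in the debugger. The following is a minimal sketch, not code from the thread: firstByteValues is a hypothetical name, and it assumes the file is read through binfile: so that no translation is applied before the values are listed.

```livecode
-- Hypothetical debugging helper: list the numeric values of the first
-- pCount bytes of a file, separated by spaces, to spot a byte order mark
-- or null bytes interleaved with the text.
function firstByteValues pFilePath, pCount
   local tData, tList, tIndex
   put url ("binfile:" & pFilePath) into tData
   repeat with tIndex = 1 to min(pCount, length(tData))
      put byteToNum(byte tIndex of tData) & space after tList
   end repeat
   return tList
end firstByteValues
```

For a little-endian UTF-16 export that begins with a byte order mark followed by Hello, the list would be expected to start 255 254 72 0 101 0, which would point to interleaved null bytes rather than real spaces.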
Re: opening txt files
I did that earlier. There are no encoding options available in the export dialog. It is possible that, depending on the system and language settings at the time the file is exported, the encoding is set to Unicode, but I couldn't find anything by googling for it.

Bob Sneidar
IT Manager
Calvary Chapel CM
Re: opening txt files
On Wed, Jan 16, 2013 at 1:19 PM, Kay C Lan lan.kc.macm...@gmail.com wrote:

it would be nigh impossible for LC to repair this.

Actually I take that back. If Pages places an extra space after EVERY char, so there are two spaces between words instead of one, there is a space after the last word on a line but before the carriage return, and another space after the carriage return and before the first char of a new line, then it should be just a simple matter of:

   --assuming the very first char is valid
   put true into tOdd
   repeat for each char tChar in tData
      if (tOdd = true) then
         put tChar after tOutput
         put false into tOdd
      else
         put true into tOdd
      end if
   end repeat
   put tOutput into fld "WhereEver"

If there is not an extra space after EVERY char, then it becomes much more difficult.

HTH
RE: opening txt files
From: Kay C Lan

If there is not an extra space after EVERY char, then it becomes much more difficult.

Is it possible that the original text file is in Unicode, so that each character in the ASCII set is followed by a null, and something is converting the nulls into blanks? Possibly the display routine itself?

--
Ciao, Paul D. DeRocco
Paul mailto:pdero...@ix.netcom.com
Re: opening txt files
On 1/15/13 11:39 PM, Kay C Lan wrote:

If there is not an extra space after EVERY char, then it becomes much more difficult.

If the original was unicode, which was then converted to plain text, Pages might be retaining the second byte and inserting a null. In other words, it's keeping both of the original bytes but using only the first. That's what LiveCode does too when you uniEncode text. Maybe uniDecode would fix it.

--
Jacqueline Landman Gay | jac...@hyperactivesw.com
HyperActive Software | http://www.hyperactivesw.com
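Decoding the bytes as UTF-16, rather than filtering out every second character, is what this suggestion amounts to. The sketch below is hedged accordingly: it assumes LiveCode 7 or later, where textDecode is available (on the 5.5.x engine used in this thread, uniDecode is the function to reach for), and the name decodeUTF16File and its parameters are hypothetical.

```livecode
-- Assumes LiveCode 7 or later; textDecode is not available in 5.5.x.
-- pEncoding is expected to be "UTF-16LE" or "UTF-16BE".
function decodeUTF16File pFilePath, pEncoding
   local tRaw, tMark
   put url ("binfile:" & pFilePath) into tRaw
   -- Drop a leading two-byte byte order mark (FE FF or FF FE) if present,
   -- so that it does not end up in the decoded text
   if length(tRaw) >= 2 then
      put byteToNum(byte 1 of tRaw) & comma & byteToNum(byte 2 of tRaw) into tMark
      if tMark is "254,255" or tMark is "255,254" then
         delete byte 1 to 2 of tRaw
      end if
   end if
   return textDecode(tRaw, pEncoding)
end decodeUTF16File
```

The encoding name is passed in explicitly rather than guessed here, so the caller stays responsible for deciding which byte order the file uses.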
Re: opening txt files
On 1/15/13 11:49 PM, Paul D. DeRocco wrote:

Is it possible that the original text file is in Unicode, so that each character in the ASCII set is followed by a null, and something is converting the nulls into blanks? Possibly the display routine itself?

GMTA. :)

--
Jacqueline Landman Gay | jac...@hyperactivesw.com
HyperActive Software | http://www.hyperactivesw.com
Re: opening txt files
On Wed, Jan 16, 2013 at 1:54 PM, J. Landman Gay jac...@hyperactivesw.com wrote:

If the original was unicode,

Again, TextWrangler can help: at the bottom left of the document it will show the document's encoding, which is also a button that allows you to change it.

On my system a new TW document is created as UTF-8. When I open a .doc in Pages, convert it to plain text and open it in TW, it shows Western (Mac OS Roman). If I take that same converted document and feed it into LC, it's fine.

If I take that Mac OS Roman document in TW and change the encoding to UTF-16, then when I import that into LC it includes all sorts of extra characters between all the valid ones, not just extra spaces. UTF-8 has no extra spaces but there are the odd incorrectly transposed chars.

If I then take that UTF-16 document, open it in Pages and export it as Plain Text, TW again shows its encoding is Western (Mac OS Roman) and LC has no problems with it.

My suggestion: if you open the converted document in TW and the encoding isn't Western (Mac OS Roman), then change it, save it, and see what you get.

At least that's how it works on my system.
Re: opening txt files
Bob wrote:

No one has talked about what version of pages.

[on my gmail it got connected to a different thread]

I'm on OS X 10.8.2, LC 5.5.3, Pages 4.3, TextWrangler 4.0.1