I need to read and process a tab-delimited text file that is in UTF8
format containing unicode. The final goal is to get it into an array
with the first tabbed item as the keys, preserving all unicode. There
are some HTML format tags in it as well.
If I read the file as binfile, carriage
On Aug 19, 2013, at 1:03 PM, J. Landman Gay wrote:
I need to read and process a tab-delimited text file that is in UTF8 format
containing unicode. The final goal is to get it into an array with the first
tabbed item as the keys, preserving all unicode. There are some HTML format
tags in
On 08/19/2013 10:03 PM, J. Landman Gay wrote:
I need to read and process a tab-delimited text file that is in UTF8
format containing unicode. The final goal is to get it into an array
with the first tabbed item as the keys, preserving all unicode. There
are some HTML format tags in it as well.
LF: Line Feed, U+000A
VT: Vertical Tab, U+000B
FF: Form Feed, U+000C
CR: Carriage Return, U+000D
CR+LF: CR (U+000D) followed by LF (U+000A)
NEL: Next Line, U+0085
LS: Line Separator, U+2028
PS: Paragraph Separator, U+2029
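A minimal sketch of how the terminators listed above could be folded into a single LF before the file is split into lines and keyed on the first tabbed item. It is written in Python rather than LiveCode, purely as an illustration; the function names `normalize_line_breaks` and `load_glossary` are hypothetical, and skipping blank lines and letting a repeated key overwrite the earlier one are assumptions, not something stated in the thread.

```python
import re

# Terminators from the list above. CR+LF is listed first so that the pair
# is treated as one line break instead of two.
_LINE_BREAKS = re.compile("\r\n|[\n\x0b\x0c\r\x85\u2028\u2029]")


def normalize_line_breaks(text: str) -> str:
    """Replace every recognised line terminator with a single LF."""
    return _LINE_BREAKS.sub("\n", text)


def load_glossary(raw: bytes) -> dict:
    """Decode UTF-8 bytes and map the first tabbed item of each line to the rest."""
    text = normalize_line_breaks(raw.decode("utf-8"))
    glossary = {}
    for line in text.split("\n"):
        if not line:
            continue  # assumption: blank lines carry no entry
        key, _, rest = line.partition("\t")
        glossary[key] = rest  # assumption: a repeated key overwrites the earlier one
    return glossary
```

Decoding happens before any splitting, so the keys and any HTML tags in the remaining items stay as ordinary Unicode text.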
I have a feeling that a search and replace routine
On 8/19/13 2:15 PM, Devin Asay wrote:
On Aug 19, 2013, at 1:03 PM, J. Landman Gay wrote:
I need to read and process a tab-delimited text file that is in
UTF8 format containing unicode. The final goal is to get it into an
array with the first tabbed item as the keys, preserving all
unicode.
On 8/19/13 2:21 PM, Richmond wrote:
LF: Line Feed, U+000A
VT: Vertical Tab, U+000B
FF: Form Feed, U+000C
CR: Carriage Return, U+000D
CR+LF: CR (U+000D) followed by LF (U+000A)
NEL: Next Line, U+0085
LS: Line Separator, U+2028
PS: Paragraph Separator, U+2029
I have a
On 08/19/2013 10:31 PM, J. Landman Gay wrote:
On 8/19/13 2:21 PM, Richmond wrote:
LF: Line Feed, U+000A
VT: Vertical Tab, U+000B
FF: Form Feed, U+000C
CR: Carriage Return, U+000D
CR+LF: CR (U+000D) followed by LF (U+000A)
NEL: Next Line, U+0085
LS: Line Separator, U+2028
On Aug 19, 2013, at 1:29 PM, J. Landman Gay wrote:
On 8/19/13 2:15 PM, Devin Asay wrote:
On Aug 19, 2013, at 1:03 PM, J. Landman Gay wrote:
I need to read and process a tab-delimited text file that is in
UTF8 format containing unicode. The final goal is to get it into an
array with the
On 8/19/13 2:43 PM, Devin Asay wrote:
When I run uniEncode(tData,UTF8) on it, the high-ascii characters
are in the variable watcher as + and an unprintable box. Can I
assume the real character is in there? Will it work for text
chunking, etc? When I split it into an array, will the keys be
On Aug 19, 2013, at 1:59 PM, J. Landman Gay wrote:
On 8/19/13 2:43 PM, Devin Asay wrote:
When I run uniEncode(tData,UTF8) on it, the high-ascii characters
are in the variable watcher as + and an unprintable box. Can I
assume the real character is in there? Will it work for text
chunking,
This is unicode array.
go to url http://kenjikojima.com/livecode/download/unicodeArray.livecode;
I hope it helps,
--
Kenji Kojima / 小島健治
http://www.kenjikojima.com/
On 8/19/13 3:08 PM, in...@kenjikojima.com wrote:
This is unicode array.
go to url http://kenjikojima.com/livecode/download/unicodeArray.livecode;
I hope it helps,
Thank you. I will try this if I have to, but the data is very large and
stepping through each character would take a long time.
On 8/19/13 3:07 PM, Devin Asay wrote:
On Aug 19, 2013, at 1:59 PM, J. Landman Gay wrote:
On 8/19/13 2:43 PM, Devin Asay wrote:
When I run uniEncode(tData,UTF8) on it, the high-ascii characters
are in the variable watcher as + and an unprintable box. Can I
assume the real character is in
On Aug 19, 2013, at 2:22 PM, J. Landman Gay wrote:
On 8/19/13 3:07 PM, Devin Asay wrote:
Something like this should work:
User clicks term to look up.
get the text of the click line -- this will be displayed as UTF16
Sorry, that should have been
get the unicodeText of the clickLine
Jacque wrote:
Basically, I'm storing a glossary. The keys are the glossary terms, some
of which are unicode. The definitions are the elements. The user points
to a word in a field and I need to retrieve the definition by matching
the displayed field text (which is unicodetext) with the glossary
On 8/19/13 3:41 PM, Richard Gaskin wrote:
Jacque wrote:
Basically, I'm storing a glossary. The keys are the glossary terms, some
of which are unicode. The definitions are the elements. The user points
to a word in a field and I need to retrieve the definition by matching
the displayed field
On 20/08/2013, at 6:41 AM, Richard Gaskin ambassa...@fourthworld.com wrote:
In my experience that's more strict than it needs to be, but if the format of
encoded arrays is any clue there may still be a restriction on having NULL
bytes in a key name.
Which would count out utf16. Jacque why
On Mon, Aug 19, 2013 at 1:22 PM, J. Landman Gay jac...@hyperactivesw.com wrote:
Thanks, I'll try it. The glossary is only one piece of a much bigger data
set involving a lot of different types of lookups, and this is going to be
a huge pain. I'm going to have to rewrite a large part of the
On 8/19/13 4:24 PM, Monte Goulding wrote:
Jacque why not uniDecode(theKey,UTF8) in order to use it as a key in the
array?
It drops or alters characters. The glossary is used a few different
ways, and sometimes I need to display all the keys in a field, to act as
an index of terms. So it
On 20/08/2013, at 8:50 AM, J. Landman Gay wrote:
It drops or alters characters.
I'm not sure what you mean. UTF8 can represent all the unicode code points.
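A rough illustration of the encoding point, again in Python rather than LiveCode: UTF-8 round-trips any code point, and its bytes contain a NUL only for U+0000 itself, whereas UTF-16 text with ASCII-range characters is full of NUL bytes. The helper names `has_null_byte` and `utf8_key` are hypothetical and exist only for this sketch.

```python
def has_null_byte(text: str, encoding: str) -> bool:
    """Report whether the encoded form of text contains a NUL (0x00) byte."""
    return b"\x00" in text.encode(encoding)


def utf8_key(term: str) -> bytes:
    """Encode a glossary term as UTF-8 bytes so it can serve as a lookup key."""
    return term.encode("utf-8")
```

For example, `has_null_byte("café", "utf-16-le")` is true while `has_null_byte("café", "utf-8")` is false, and decoding `utf8_key("小島")` as UTF-8 gives back the original term.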
The glossary is used a few different ways, and sometimes I need to display
all the keys in a field, to act as an index of terms. So it
On 8/19/13 5:27 PM, Peter Haworth wrote:
If you have to rewrite your code base and there's lots of different lookups
involved, maybe an SQLite database would work?
The lookups search custom properties and script variables mostly. But
I'm not sure that moving the problem to a database would