Re: Converting from unicode to ASCII

2020-09-26 Thread J. Landman Gay via use-livecode
On 9/24/20 12:09 PM, J. Landman Gay via use-livecode wrote: My original goal was to get the canonical version directly from LC somehow. Neville Smythe contacted me privately with this brilliant solution, posted here with his consent: function stripAccents pInput local tDecomposed local
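The posted LiveCode handler is cut off in this archive snippet, so its body is not reproduced here. As a rough illustration of the general idea that the variable name tDecomposed hints at (decompose, then drop the combining marks), here is a minimal Python sketch. It is an assumption about the technique, not Neville Smythe's code, and the function name strip_accents is made up for this example.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with accents removed, via Unicode canonical decomposition."""
    # NFD splits a precomposed letter such as "é" into a base letter "e"
    # followed by a combining mark (U+0301).
    decomposed = unicodedata.normalize("NFD", text)
    # Combining marks have a non-zero combining class; dropping them
    # leaves only the base letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Note that characters with no decomposition, such as the curly apostrophe in l’Académie, pass through unchanged, so a further ASCII-only filter would still be needed for file names.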

Re: Converting from unicode to ASCII

2020-09-24 Thread J. Landman Gay via use-livecode
That's what I was hoping for when I started this thread, and it was suggested (without the ID) a while back, but then I'd need another lookup table. Probably one for each language. My original goal was to get the canonical version directly from LC somehow. -- Jacqueline Landman Gay |

Re: Converting from unicode to ASCII

2020-09-24 Thread Alex Tweedly via use-livecode
You could even decide that, rather than strip out non-ascii characters, you would convert (reduce?) each one to a canonical equivalent (where there is one), and hence instead of l’Académie française---> lAcadmiefranaise_1234.livecode it would become l’Académie française--->

Re: Converting from unicode to ASCII

2020-09-24 Thread Dave Cragg via use-livecode
That's what I was thinking. So the filename for " l’Académie française" might become something like lAcadmiefranaise_1234.livecode. Kind of readable, but guaranteed unique. (And also allows identifying the database record from the filename if that is needed.) (Apologies if this appears
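The strip-and-append-ID idea can be sketched as follows. This is a hypothetical Python illustration, not code from the thread: the function name munged_filename and the choice to keep only ASCII letters and digits are assumptions, picked so that the example title yields the file name quoted above.

```python
import string

# Assumption: only ASCII letters and digits survive, which also drops spaces.
ALLOWED = set(string.ascii_letters + string.digits)


def munged_filename(title: str, record_id: int, extension: str = ".livecode") -> str:
    """Build an ASCII file name from a title plus the record's unique ID."""
    kept = "".join(ch for ch in title if ch in ALLOWED)
    # The ID keeps the name unique even when the stripped titles collide
    # or when nothing of the title survives.
    return f"{kept}_{record_id}{extension}"
```

With a title that contains no ASCII letters at all, the result degrades to just the ID part (for example "_7.livecode"), which is still unique.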

Re: Converting from unicode to ASCII

2020-09-24 Thread J. Landman Gay via use-livecode
I'm pretty sure each record has an ID. This would be for ensuring unique file names, right? -- Jacqueline Landman Gay | jac...@hyperactivesw.com HyperActive Software | http://www.hyperactivesw.com On September 24, 2020 2:00:50 AM Dave Cragg via use-livecode wrote: Jacqueline, You said

Re: Converting from unicode to ASCII

2020-09-24 Thread J. Landman Gay via use-livecode
It's all automated already except for the uploading. The file organization on AWS is complex and the stacks don't all go in the same place. -- Jacqueline Landman Gay | jac...@hyperactivesw.com HyperActive Software | http://www.hyperactivesw.com On September 23, 2020 4:53:36 PM Richard Gaskin

Re: Converting from unicode to ASCII

2020-09-24 Thread Dave Cragg via use-livecode
Jacqueline, You said earlier you don't have a field in the database for the file name. But does the database table have a unique numerical ID field for each record? If so, could you strip out the non-ASCII characters and then append the numerical ID to the file name? > On 23 Sep 2020, at

Re: Converting from unicode to ASCII

2020-09-23 Thread Mark Wieder via use-livecode
On 9/23/20 4:50 PM, J. Landman Gay via use-livecode wrote: Heh. Now you understand why I didn't want another lookup table. :) OTOH, one of the cardinal rules of data design is *not* to use real data as an index into data. YMMV. -- Mark Wieder ahsoftw...@gmail.com

Re: Converting from unicode to ASCII

2020-09-23 Thread J. Landman Gay via use-livecode
Heh. Now you understand why I didn't want another lookup table. :) -- Jacqueline Landman Gay | jac...@hyperactivesw.com HyperActive Software | http://www.hyperactivesw.com On September 23, 2020 5:27:06 PM Mark Wieder via use-livecode wrote: On 9/22/20 11:10 PM, J. Landman Gay via

Re: Converting from unicode to ASCII

2020-09-23 Thread Mark Wieder via use-livecode
On 9/22/20 11:10 PM, J. Landman Gay via use-livecode wrote: There's more to it than that; the server runs a cron job hourly that indexes all its files and creates AWS secure URLs for each. The app downloads that lookup file on demand. When the user selects a name from a list, the selection is

Re: Converting from unicode to ASCII

2020-09-23 Thread Bob Sneidar via use-livecode
FYI I have a rudimentary document storage system developed where I can “check out” a document from my app so that no one else can check it out, which downloads the file from its repository into a temp folder. The user can then edit or work with the file, then check it in. It RETAINS the

Re: Converting from unicode to ASCII

2020-09-23 Thread Richard Gaskin via use-livecode
For an ongoing need like that on a substantial project, I'd automate it: She works on her master copy, then presses a button. Done. The button saves the stack, copies it to the munged name, and uploads it for her, even verifying the integrity of the upload afterward (machines don't mind the

Re: Converting from unicode to ASCII

2020-09-23 Thread Lagi Pittas via use-livecode
Hi Jacq, Since you don't do Chinese then I think what I suggested would work except for Bulgarian and other non-Latin alphabets (for which you could use a translation table). It also is compatible with all the previous names as the extract and tagging on the end will only happen with new

Re: Converting from unicode to ASCII

2020-09-23 Thread J. Landman Gay via use-livecode
On 9/23/20 1:26 PM, Richard Gaskin via use-livecode wrote: My only suggestion was to change how the existing munger works to satisfy the two problem areas identified: that names not be too long, and that any munger not remove so many characters as to make the file name non-unique or empty.

Re: Converting from unicode to ASCII

2020-09-23 Thread Bob Sneidar via use-livecode
Yes, I understand that. I was thinking of using the method for something I need. Bob S On Sep 23, 2020, at 11:47 AM, Richard Gaskin via use-livecode <use-livecode@lists.runrev.com> wrote: No lookup table is needed at all if the relationship between the original string and the resulting

Re: Converting from unicode to ASCII

2020-09-23 Thread J. Landman Gay via use-livecode
On 9/23/20 1:47 PM, Richard Gaskin via use-livecode wrote: But so far I haven't read anything requiring this to work in both directions.  Did I miss something?  Does she also rely on an unmunger function? No, you're correct, I only need the conversion to go one-way. The cron job creates

Re: Converting from unicode to ASCII

2020-09-23 Thread Richard Gaskin via use-livecode
No lookup table is needed at all if the relationship between the original string and the resulting munged file name never needs to also work the other direction. If bidirectional derivation is needed, given the limitations imposed by AWS' naming limitations I would see no way to avoid

Re: Converting from unicode to ASCII

2020-09-23 Thread Bob Sneidar via use-livecode
Understood, but if it were reversible, it would eliminate the necessity of a lookup table as an intermediary. Bob S On Sep 23, 2020, at 11:26 AM, Richard Gaskin via use-livecode wrote: If I understand her problem correctly, file identification need only be in one direction.

Re: Converting from unicode to ASCII

2020-09-23 Thread Richard Gaskin via use-livecode
If I understand her problem correctly, file identification need only be in one direction. As far as I can tell from the description, everything that needs to determine which file to access does so by using a string from which the hashed file name can be derived. That she already has a
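A one-way derivation of this kind can be sketched in Python as below. The thread snippet does not say which hash was meant, so SHA-256 and the function name hashed_filename are assumptions made only for illustration.

```python
import hashlib


def hashed_filename(title: str, extension: str = ".livecode") -> str:
    """Derive a fixed-length, ASCII-only file name from any Unicode title."""
    # Hash the UTF-8 bytes so the same title always gives the same name.
    digest = hashlib.sha256(title.encode("utf-8")).hexdigest()
    return digest + extension
```

Anything that knows the original string can recompute the file name, but the title cannot be recovered from the name, which matches the one-direction requirement described here.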

Re: Converting from unicode to ASCII

2020-09-23 Thread Bob Sneidar via use-livecode
Will binaryEncode get you back to the filename? Bob S On Sep 23, 2020, at 8:03 AM, Richard Gaskin via use-livecode <use-livecode@lists.runrev.com> wrote: J. Landman Gay wrote: > I'm looking for a way to create non-unicode file names > based on the string that comes out of the

Re: Converting from unicode to ASCII

2020-09-23 Thread Bob Sneidar via use-livecode
Duh. That was a stupid question. How do you get back to the filename? Bob S On Sep 23, 2020, at 8:08 AM, Bob Sneidar <bobsnei...@iotecdigital.com> wrote: Will binaryEncode get you back to the filename? Bob S On Sep 23, 2020, at 8:03 AM, Richard Gaskin via use-livecode

Re: Converting from unicode to ASCII

2020-09-23 Thread Richard Gaskin via use-livecode
J. Landman Gay wrote: > I'm looking for a way to create non-unicode file names > based on the string that comes out of the database. Ah, public clouds... Amazon's S3 docs say just encoding in UTF-8 should suffice, but then they also list a lot of characters they consider "special", but common

Re: Converting from unicode to ASCII

2020-09-23 Thread Bob Sneidar via use-livecode
You could extract the filename part of the path returned by tempfile() and use that anywhere. That would require something to track the visible name linked to the stored filename tho’. Bob S On Sep 22, 2020, at 11:10 PM, J. Landman Gay via use-livecode <use-livecode@lists.runrev.com>

Re: Converting from unicode to ASCII

2020-09-23 Thread Paul Dupuis via use-livecode
On 9/22/2020 6:48 PM, J. Landman Gay via use-livecode wrote: I have a stack with an index. When a user clicks a line, a handler uses the clicktext to create a file name which is always the clicktext plus the ".livecode" extension. The stack is then downloaded from an AWS server and displayed.

Re: Converting from unicode to ASCII

2020-09-23 Thread Lagi Pittas via use-livecode
Assuming all the languages are Latin-type alphabets (no Chinese, Japanese, Sanskrit ;-) ) (but see later for a fix?) I would replace the characters like the E with umlaut/cedilla and other diacritics with the "naked" character but for others that can't remove them BUT add to the end of the

Re: Converting from unicode to ASCII

2020-09-23 Thread scott--- via use-livecode
What about just converting to UTF-8? Wouldn’t that coerce it into ASCII? — Scott Morrow Elementary Software (Now with 20% less chalk dust!) web https://elementarysoftware.com/ email sc...@elementarysoftware.com > On Sep 22, 2020, at 3:48 PM, J. Landman Gay via use-livecode > wrote: >
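For reference, a short Python check of why UTF-8 encoding alone does not produce ASCII: characters outside the ASCII range become multi-byte sequences whose bytes are all 0x80 or above. The helper name is_ascii_bytes is made up for this sketch.

```python
def is_ascii_bytes(data: bytes) -> bool:
    """True if every byte is in the 7-bit ASCII range."""
    return all(b < 0x80 for b in data)


# "é" (U+00E9) is encoded as the two bytes 0xC3 0xA9, neither of which is ASCII.
encoded = "Académie".encode("utf-8")
plain = "Academie".encode("utf-8")
```

Only text that was already pure ASCII comes out as ASCII bytes; UTF-8 leaves such text unchanged and expands everything else.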

Re: Converting from unicode to ASCII

2020-09-23 Thread Richmond Mathewson via use-livecode
"This communication may be unlawfully collected and stored by the Agents of a large number of governments in secret. The parties to this email do not consent to the retrieving or storing of this communication and any related metadata, as well as printing, copying, re-transmitting,disseminating, or

Re: Converting from unicode to ASCII

2020-09-23 Thread Richmond Mathewson via use-livecode
Personally I think deleting everything that is not inwith the ASCII range is potentially a bit dangerous [suppose ALL the letters in the title are not inwith the ASCII range], so I would favour using some sort of lookup table/substitution list. Certainly letters such as accented 'e' can just be

Re: Converting from unicode to ASCII

2020-09-23 Thread J. Landman Gay via use-livecode
On 9/22/20 10:42 PM, Mark Wieder via use-livecode wrote: On 9/22/20 7:58 PM, J. Landman Gay via use-livecode wrote: Is this just a temporary filename (not long-term storage)? No, the stacks are uploaded to AWS and remain there, retrieved from the server on request. There are currently

Re: Converting from unicode to ASCII

2020-09-22 Thread Mark Wieder via use-livecode
On 9/22/20 7:58 PM, J. Landman Gay via use-livecode wrote: Is this just a temporary filename (not long-term storage)? No, the stacks are uploaded to AWS and remain there, retrieved from the server on request. There are currently hundreds of them with more added frequently. That's why I'm

Re: Converting from unicode to ASCII

2020-09-22 Thread J. Landman Gay via use-livecode
Combining responses: "NormalizeText" always returns unicode for all four of its variations, so no go. And as Paul pointed out, if the language is Chinese, deleting all non-ascii characters would leave nothing. On the other hand, we are only converting to Roman languages right now, so this

Re: Converting from unicode to ASCII

2020-09-22 Thread John Balgenorth via use-livecode
You could easily convert it to HEX but that would make the file name exactly twice as long. JB > On Sep 22, 2020, at 4:43 PM, Bob Sneidar via use-livecode > wrote: > > There’s a tempname() function??? Ohhh fun!! > > Bob S > > > On Sep 22, 2020, at 4:22 PM, Mark Wieder via use-livecode >
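A hex conversion along these lines might look like the following Python sketch (function names are hypothetical, and encoding the title as UTF-8 first is an assumption). The hex string is exactly twice as long as the encoded byte string, and since accented characters take two or more bytes in UTF-8, it can be more than twice the character count.

```python
def hex_filename(title: str, extension: str = ".livecode") -> str:
    """Encode a title as lowercase hex digits, giving an ASCII-only file name."""
    return title.encode("utf-8").hex() + extension


def title_from_hex_filename(filename: str, extension: str = ".livecode") -> str:
    """Reverse hex_filename: recover the original title from the file name."""
    if not filename.endswith(extension):
        raise ValueError("unexpected extension")
    stem = filename[: len(filename) - len(extension)]
    return bytes.fromhex(stem).decode("utf-8")
```

Unlike stripping characters, this mapping is reversible and never produces an empty or duplicate name, at the cost of length and readability.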

Re: Converting from unicode to ASCII

2020-09-22 Thread Bob Sneidar via use-livecode
There’s a tempname() function??? Ohhh fun!! Bob S On Sep 22, 2020, at 4:22 PM, Mark Wieder via use-livecode <use-livecode@lists.runrev.com> wrote: Can you use tempname() to create and retrieve the stack?

RE: Converting from unicode to ASCII

2020-09-22 Thread Ralph DiMola via use-livecode
How about converting the non-ASCII characters into base64 ASCII? This could produce really long filenames. I guess you could truncate if needed. Also the filename would make no sense at all if it was all non-ASCII. Ralph DiMola IT Director Evergreen Information Services
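A base64 variant can be sketched in Python as below. Two simplifications are assumed here: the whole title is encoded rather than only its non-ASCII characters, and the URL-safe alphabet is used so the result avoids "/" and "+". The function names are hypothetical.

```python
import base64


def b64_filename(title: str, extension: str = ".livecode") -> str:
    """Encode a title as URL-safe base64 (padding stripped) plus an extension."""
    token = base64.urlsafe_b64encode(title.encode("utf-8")).decode("ascii")
    return token.rstrip("=") + extension


def title_from_b64_filename(filename: str, extension: str = ".livecode") -> str:
    """Reverse b64_filename by restoring the padding and decoding."""
    if not filename.endswith(extension):
        raise ValueError("unexpected extension")
    token = filename[: len(filename) - len(extension)]
    token += "=" * (-len(token) % 4)  # base64 input length must be a multiple of 4
    return base64.urlsafe_b64decode(token).decode("utf-8")
```

The encoded name is about a third longer than the UTF-8 bytes of the title, and truncating it, as suggested above, would make the mapping irreversible and possibly non-unique.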

Re: Converting from unicode to ASCII

2020-09-22 Thread Mark Wieder via use-livecode
On 9/22/20 3:48 PM, J. Landman Gay via use-livecode wrote: I have a stack with an index. When a user clicks a line, a handler uses the clicktext to create a file name which is always the clicktext plus the ".livecode" extension. The stack is then downloaded from an AWS server and displayed.

Re: Converting from unicode to ASCII

2020-09-22 Thread Paul Dupuis via use-livecode
On 9/22/2020 6:58 PM, Devin Asay via use-livecode wrote: But if that doesn’t help, and if nobody ever sees the filenames, why not just loop through the string and delete anything that’s not in ASCII range? Well, if the name is in Chinese, you would delete the entire name.
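The loop-and-delete suggestion and the objection to it can both be seen in a few lines of Python (an illustration only; delete_non_ascii is a made-up name, not a LiveCode function):

```python
def delete_non_ascii(text: str) -> str:
    """Drop every character whose code point is outside the ASCII range."""
    return "".join(ch for ch in text if ord(ch) < 128)
```

For a French title this loses only the accented letters, but a title written entirely in Chinese characters comes back as an empty string, which cannot serve as a file name.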

Re: Converting from unicode to ASCII

2020-09-22 Thread Devin Asay via use-livecode
Hi Jacque, Have you looked at the normalizeText function? I’m not sure that would help, but maybe it’s a start. But if that doesn’t help, and if nobody ever sees the filenames, why not just loop through the string and delete anything that’s not in ASCII range? Devin > On Sep 22, 2020, at