[racket-users] Slack link! Re: #quickscript-competition week two

2020-07-10 Thread Stephen De Gabrielle
Hi 

My apologies! 

You must sign up to Slack using the Slack signup link 
https://racket-slack.herokuapp.com/
to access the *#quickscript-competition* channel. (I should have 
realised that, and I should have posted this a week ago.)

It's also fine to discuss on this list - maybe prefix the subject line with 
'#quickscript-competition ' so those who are not interested can skip past. 

Thanks 

Stephen

PS: If you prefer to chat on another platform (Mastodon, Riot, Discord, 
Twitter or something else), that is fine too - my handle is spdegabrielle on 
most things. Just remember the most active chat for Racketeers is the Racket 
Slack.

On Thursday, July 9, 2020 at 3:37:34 PM UTC+1 Stephen De Gabrielle wrote:

> *#quickscript-competition week two*
>
> *Prizes:* In addition to the glory and admiration of your peers…
> If you participate once, you get stickers,
> if you participate twice, you also get a mug,
> if you participate three times, you also get a t-shirt
> (while stocks last).
> *You can participate more than once per week.*
>
> *Thank you to the participants in week 1:*
>
>- Breakout by Jens Axel Søgaard: The classic Breakout game in a single
>  script!
>- Format-selection by Alex Harsányi: Format comments to the Racket Style
>  Guide standard.
>- Robo-Head-Pat by Lambduli: It is like a good work sticker on your
>  homework - but for code.
>- Rot13 Decode/Encode by Karrq: Fraq naq qrpbqr frperg zrffntrf yvxr Wnzrf
>  Obaq!
>
> *Don’t forget to check them out!*
>
> *PSA: You don’t need to install Quickscript - it is bundled in DrRacket!*
>
> *Week 2…* starts with a bang with two awesome entries by Laurent Orseau:
>
>- Letterfall: See your code fall like rocks!
>- Run this quickscript to install all scripts from the competition!
>
>
> *Looking for ideas?*
>
>- Make a Code-Prettify script that uses pretty-print and *reindent* to
>  prettify code.
>
> More at 
> https://github.com/Quickscript-Competiton/July2020entries/blob/master/IDEAS.md
>
> Enjoy!
> Stephen
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/racket-users/adb3c8ba-24ae-41d7-99e7-5c6b49674367n%40googlegroups.com.


Re: [racket-users] Are Regular Expression classes Unicode aware?

2020-07-10 Thread Peter W A Wood
Dear Ryan

Thank you very much for the kind, detailed explanation which I will study 
carefully. It was not my intention to reply to you off-list. I hope I have 
correctly addressed this reply to appear on-list.

Peter

> On 10 Jul 2020, at 15:47, Ryan Culpepper  wrote:
> 
> (I see this went off the mailing list. If you reply, please consider CCing 
> the list.)
> 
> Yes, I understood your goal of trying to capture the notion of Unicode 
> "alphabetic" characters with a regular expression.
> 
> As far as I know, Unicode doesn't have a notion of "alphabetic", but it does 
> assign every code point to a "General category", consisting of a main 
> category and a subcategory. There is a category called "Letter", which seems 
> like one reasonable generalization of "alphabetic".
> 
> In Racket, you can get the code for a character's category using 
> `char-general-category`. For example:
> 
>   > (char-general-category #\A)
>   'lu
>   > (char-general-category #\é)
>   'll
>   > (char-general-category #\ß)
>   'll
>   > (char-general-category #\7)
>   'nd
> 
> The general category for "A" is "Letter, uppercase", which has the code "Lu", 
> which Racket turns into the symbol 'lu. The general category of "é" is 
> "Letter, lowercase", code "Ll", which becomes 'll. The general category of 
> "7" is "Number, decimal digit", code "Nd".
> 
> In Racket regular expressions, the \p{category} syntax lets you recognize 
> characters from a specific category. For example, \p{Lu} recognizes 
> characters with the category "Letter, uppercase", and \p{L} recognizes 
> characters with the category "Letter", which is the union of "Letter, 
> uppercase", "Letter, lowercase", and so on.
> 
> So the regular expression #px"^\\p{L}+$" recognizes sequences of one or more 
> Unicode letters. For example:
> 
>   > (regexp-match? #px"^\\p{L}+$" "héllo")
>   #t
>   > (regexp-match? #px"^\\p{L}+$" "straße")
>   #t
>   > (regexp-match? #px"^\\p{L}+$" "二の句")
>   #t
>   > (regexp-match? #px"^\\p{L}+$" "abc123")
>   #f ;; No, contains numbers
> 
> There are still some problems to watch out for, though. For example, accented 
> characters like "é" can be expressed as a single pre-composed code point or 
> "decomposed" into a base letter and a combining mark. You can get the 
> decomposed form by converting the string to "decomposed normal form" (NFD), 
> and the regexp above won't match that string.
> 
>   > (map char-general-category (string->list (string-normalize-nfd "é")))
>   '(ll mn)
>   > (regexp-match? #px"^\\p{L}+$" (string-normalize-nfd "héllo"))
>   #f
> 
> One fix would be to call `string-normalize-nfc` first, but some 
> letter-modifier pairs don't have pre-composed versions. Another fix would be 
> to expand the regexp to include modifiers. You'd have to decide which is 
> better based on your application.
> 
> Ryan
> 
> 
> 
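
The second fix mentioned above (accepting combining marks after letters) can be sketched outside of regular expressions as well. The following is only an illustration in Python, using the standard `unicodedata` module; the helper name `is_letters_with_marks` is hypothetical and not part of any library. It treats a string as "alphabetic" when it starts with a letter and otherwise contains only characters from the Letter (L*) and Mark (M*) general categories.

```python
import unicodedata


def is_letters_with_marks(text):
    """Hypothetical helper: True if text starts with a letter and contains
    only letters (category L*) and combining marks (category M*)."""
    categories = [unicodedata.category(ch) for ch in text]
    # A mark needs a base letter, so the first character must be a letter.
    if not categories or not categories[0].startswith("L"):
        return False
    return all(cat[0] in "LM" for cat in categories)


# "h\u00e9llo" is "hello" with a precomposed e-acute; NFD splits the accent off.
decomposed = unicodedata.normalize("NFD", "h\u00e9llo")

print([unicodedata.category(ch) for ch in unicodedata.normalize("NFD", "\u00e9")])
# ['Ll', 'Mn']
print(decomposed.isalpha())               # False: the combining accent is not a letter
print(is_letters_with_marks(decomposed))  # True
print(is_letters_with_marks("abc123"))    # False: digits are category Nd
```

Whether to normalize to NFC first or to accept marks explicitly, as this sketch does, still depends on the application.
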
> On Fri, Jul 10, 2020 at 2:10 AM Peter W A Wood  wrote:
> Ryan
> 
> > On 9 Jul 2020, at 22:52, Ryan Culpepper  wrote:
> > 
> > If you want a regular expression that does match the example string, you 
> > can use the \p{property} notation. For example:
> > 
> >   > (regexp-match? #px"^\\p{L}+$" "héllo")
> >   #t
> > 
> > The "Regexp Syntax" docs have a grammar for regular expressions with links 
> > to examples.
> > 
> > Ryan
> 
> Thanks. I used héllo as an example. I was wondering if there was a way of 
> specifying a regular expression group for Unicode “alphabetic” characters. 
> 
> On reflection, it seems a somewhat esoteric requirement that is almost 
> impossible to satisfy. By way of example, would “Straße” be considered 
> alphabetic? Would “二の句” be considered alphabetic?
> 
> Strangely, Python considered the Japanese characters as being alphabetic but 
> will not accept “Straße” as a valid string. (I suspect this is due to some 
> problem relating to locale.)
> 
> >>> "二の句".isalpha()
> True
> >>> “Straße".isalpha()
>   File "<stdin>", line 1
> “Straße".isalpha()
>   ^
> SyntaxError: invalid character in identifier
> 
> Clearly, trying to identify “Unicode” alphabetic characters is far from 
> straightforward, though it may well be useful in processing some language 
> texts.
> 
> Peter
> 
> 
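
A note on the Python session quoted above, offered as a sketch and assuming Python 3: the SyntaxError looks like it comes from the opening curly quote (U+201C) in front of Straße rather than from the locale. Python does not treat that character as a string delimiter, so the line never reaches `isalpha()`. With plain ASCII quotes, both strings are reported as alphabetic. The variable names below are made up for the illustration.

```python
import unicodedata

# Plain ASCII double quotes delimit both strings here.
print("Straße".isalpha())   # True
print("二の句".isalpha())    # True
print("abc123".isalpha())   # False

# The character that opens the failing line is a typographic quote.
curly = "\u201c"
print(unicodedata.name(curly))      # LEFT DOUBLE QUOTATION MARK
print(unicodedata.category(curly))  # Pi (Punctuation, initial quote)

# Rebuild the failing line (curly quote in front, ASCII quote behind)
# and confirm that it is rejected at parse time, before any method call.
failing_source = "\u201cStra\u00dfe\".isalpha()"
try:
    compile(failing_source, "<example>", "eval")
    raised = False
except SyntaxError:
    raised = True
print(raised)  # True
```

The exact wording of the error message differs between Python versions, so the sketch only checks that a SyntaxError is raised.
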

To view this discussion on the web visit 
https://groups.google.com/d/msgid/racket-users/BC855B5D-80BF-458B-A2D2-9570B0436646%40gmail.com.