Low-hanging fruit: the two alphabets can be generated from a. instead of being typed out:
LATIN_LC =: (97+i.26){a.
LATIN_UC =: (65+i.26){a.
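
A quick sanity check, offered only as a sketch: it assumes the usual ASCII layout of a. (so 'A' is at index 65 and 'a' at index 97), and the sample string 'Hello, World' is made up for illustration. The lookup-table lowercasing is the same expression as in the original post.

```j
NB. Where the offsets 65 and 97 come from (assumes ASCII ordering in a.)
a. i. 'AZaz'                                NB. 65 90 97 122

LATIN_LC =: (97+i.26){a.
LATIN_UC =: (65+i.26){a.

NB. Both generated strings match the hand-typed literals
LATIN_LC -: 'abcdefghijklmnopqrstuvwxyz'    NB. 1
LATIN_UC -: 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'    NB. 1

NB. Lowercasing by table lookup, as in the original, on a sample string
(LATIN_LC,a.) {~ (LATIN_UC,a.) i. 'Hello, World'    NB. hello, world
```

The rest of the program works unchanged with these definitions, since only the way the two strings are built differs.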
On Fri, Jun 21, 2013 at 2:06 PM, Alexander Epifanov <[email protected]> wrote:
> Hello,
>
> I just wrote a small program but, as usual, I really do not like how
> it looks:
>
> The description is here:
> http://leonardo-m.livejournal.com/109201.html?thread=190353
> "
>
> Read a file of text, determine the n most frequently used words, and print
> out a sorted list of those words along with their frequencies.
>
> A solution with shell scripting:
>
> Here's the script that does that, with each command given its own line:
>
> bash:
> tr -cs A-Za-z '\n' |
> tr A-Z a-z |
> sort |
> uniq -c |
> sort -rn |
> head -${1}"
>
> My solution is:
> <pre>
> LATIN_UC=:'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
> LATIN_LC=:'abcdefghijklmnopqrstuvwxyz'
> LATIN=:LATIN_UC,LATIN_LC
>
> s =: ' ',1!:1<'1.txt'
>
> s =: (LATIN_LC,a.) {~ (LATIN_UC,a.) i. s NB. convert to lowercase
> ws =: }.&.> (0&= @ e.&LATIN <;.1 ]) s NB. split into words
> ws =: ((0<>@(#&.>@])) # ]) ws NB. delete empty words
> oc =: #&.>ws</.i.#ws NB. occurrence count of each unique word
> t=:|:(~.ws) ,: oc NB. table word<->occurrences
> f3i=:3{.\:>oc NB. indexes of the 3 largest counts
> f3i { t NB. top 3 words with their counts
> </pre>
>
> Thank you,
>
> --
> Regards,
> Alexander.
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>