Here's how I'd do it:
n {. \:~ (#;{.)/.~ ;: fread file
If I wanted to process a string instead of a file, I'd replace (fread
file) with the string.
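For example, a minimal sketch with a made-up string (text and its contents are placeholders of mine, and n is set to 2 just for illustration):

```j
NB. Hypothetical sample text standing in for (fread file).
text =: 'the cat and the dog and the bird'
n =: 2

NB. Same sentence as above, applied to the string instead of the file.
NB. (#;{.)/.~ pairs each distinct word with its count, \:~ sorts the
NB. rows in descending order, and n {. keeps the first n rows.
n {. \:~ (#;{.)/.~ ;: text
```

With this input I would expect two rows: 3 paired with 'the' and 2 paired with 'and'.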
If I wanted a different implementation of "what is a word" I'd replace the ;:
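As a rough sketch of one such replacement, here is a word that mimics the two tr steps of the shell version: lowercase everything, then treat any run of non-letters as a separator. The names letters, words and text are my own placeholders, and I'm assuming the standard library (tolower, cut) is loaded, as it is in a default session:

```j
NB. Hypothetical replacement for ;: (not part of the original sentence).
letters =: 'abcdefghijklmnopqrstuvwxyz'

words =: 3 : 0
s =. tolower y                      NB. fold to lowercase
cut ' ' (I. -. s e. letters) } s    NB. blank out non-letters, split on blanks
)

NB. Made-up sample input and n, for illustration only.
text =: 'The cat. The dog!'
n =: 1
n {. \:~ (#;{.)/.~ words text
```

Only the tokenizer changes here; the counting, sorting and taking stay exactly as in the original sentence.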
And so on... (maybe I care about the arbitrariness of "n" and I want
to treat all words of the same length the same way - then I
would have to define whether I include extra words in the result, or
if I discard some words, and then I would write a word which would
replace the {. in that sentence.)
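Here is a sketch of what such a word might look like if the choice is to include the extra words. I'm reading "the same way" as "words tied with the n-th row on count", which is my assumption; keepties is a made-up name, and it assumes n is at least 1 and the table has at least one row:

```j
NB. keepties: hypothetical replacement for {. in the sentence above.
NB. x is n, y is the table of (count;word) rows already sorted by \:~
keepties =: 4 : 0
c =. > {."1 y             NB. counts column, in descending order
k =. x <. # c             NB. do not ask for more rows than exist
(c >: (k - 1) { c) # y    NB. keep every row whose count reaches the k-th count
)

NB. Made-up sample input, for illustration only.
text =: 'the cat and the dog and the bird'
3 keepties \:~ (#;{.)/.~ ;: text
```

With this sample the third row has count 1, so every word with count 1 is kept as well and the result has more than 3 rows. Discarding instead would mean dropping the whole tied group whenever it does not fit within n.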
--
Raul
On Fri, Jun 21, 2013 at 2:30 PM, I.T. Daniher <[email protected]> wrote:
> Low hanging fruit:
>
> LATIN_LC =: (97+i.26){a.
> LATIN_UC =: (65+i.26){a.
>
> On Fri, Jun 21, 2013 at 2:06 PM, Alexander Epifanov <[email protected]> wrote:
>
>> Hello,
>>
>> I just wrote a small program but, as usual, I really do not like how
>> it looks.
>>
>> The description is here:
>> http://leonardo-m.livejournal.com/109201.html
>> <http://leonardo-m.livejournal.com/109201.html?thread=190353>
>> "
>>
>> Read a file of text, determine the n most frequently used words, and print
>> out a sorted list of those words along with their frequencies.
>>
>> A solution with shell scripting:
>>
>> Here's the script that does that, with each command given its own line:
>>
>> bash:
>> tr -cs A-Za-z '\n' |
>> tr A-Z a-z |
>> sort |
>> uniq -c |
>> sort -rn |
>> head -${1}"
>>
>> My solution is:
>> <pre>
>> LATIN_UC=:'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>> LATIN_LC=:'abcdefghijklmnopqrstuvwxyz'
>> LATIN=:LATIN_UC,LATIN_LC
>>
>> s =: ' ',1!:1<'1.txt'
>>
>> s =: (LATIN_LC,a.) {~ (LATIN_UC,a.) i. s NB. convert to lowercase
>> ws =: }.&.> (0&= @ e.&LATIN <;.1 ]) s NB. split into words
>> ws =: ((0<>@(#&.>@])) # ]) ws NB. delete empty words
>> oc =: #&.>ws</.i.#ws NB. occurrences of each unique word
>> t=:|:(~.ws) ,: oc NB. table word<->occurrences
>> f3i=:3{.\:>oc NB. indices of the 3 most frequent words
>> f3i { t NB. first 3 words with occurrences
>> </pre>
>>
>> Thank you,
>>
>> --
>> Regards,
>> Alexander.
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
>>