Key (/.) gives you letter tabulations and Prefix (\) gets you prefixes:
   </.~ 'Hello Jello'
┌─┬──┬────┬──┬─┬─┐
│H│ee│llll│oo│ │J│
└─┴──┴────┴──┴─┴─┘
   (#;{.)/.~ 'Hello Jello'
┌─┬─┐
│1│H│
├─┼─┤
│2│e│
├─┼─┤
│4│l│
├─┼─┤
│2│o│
├─┼─┤
│1│ │
├─┼─┤
│1│J│
└─┴─┘
   0 _1 { <@((#;[),(#;{.)/.~)\ 'Hello Jello'
┌─────┬────────────────┐
│┌─┬─┐│┌──┬───────────┐│
││1│H│││11│Hello Jello││
│├─┼─┤│├──┼───────────┤│
││1│H│││1 │H          ││
│└─┴─┘│├──┼───────────┤│
│     ││2 │e          ││
│     │├──┼───────────┤│
│     ││4 │l          ││
│     │├──┼───────────┤│
│     ││2 │o          ││
│     │├──┼───────────┤│
│     ││1 │           ││
│     │├──┼───────────┤│
│     ││1 │J          ││
│     │└──┴───────────┘│
└─────┴────────────────┘
   $<@((#;[),(#;{.)/.~)\ 'Hello Jello'
11
   ,./(#~ 9 < 0 0&{::"2) ((#;[),(#;{.)/.~)\ 'Hello Jello'
┌──┬──────────┬──┬───────────┐
│10│Hello Jell│11│Hello Jello│
├──┼──────────┼──┼───────────┤
│1 │H         │1 │H          │
├──┼──────────┼──┼───────────┤
│2 │e         │2 │e          │
├──┼──────────┼──┼───────────┤
│4 │l         │4 │l          │
├──┼──────────┼──┼───────────┤
│1 │o         │2 │o          │
├──┼──────────┼──┼───────────┤
│1 │          │1 │           │
├──┼──────────┼──┼───────────┤
│1 │J         │1 │J          │
└──┴──────────┴──┴───────────┘
So I'd start with those, and maybe work with case folding and a
fixed 26$0 vector of letter counts, or whatever's convenient for the task.
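
To make the case-folding idea concrete, here is a rough sketch, not a finished answer. The names az, idx, cnt and k are my own and appear nowhere in the original post; tolower is the standard library verb. It folds the string to lower case, maps each character to a slot 0..25, and keeps a running sum, so that row i of the result holds the letter counts for the prefix of length i+1. Non-letters land in slot 26 and are simply not counted.

```j
s   =: 'Hello Jello'
az  =: 'abcdefghijklmnopqrstuvwxyz'
idx =: az i. tolower s        NB. 0..25 for letters, 26 for anything else
cnt =: +/\ idx =/ i. 26       NB. row i: counts of a..z in the prefix of length i+1
k   =: 3                      NB. hypothetical minimum prefix length
big =: (<: k) }. cnt          NB. keep only prefixes of length >: k

$ cnt                         NB. 11 26
(az i. 'l') { {: cnt          NB. 4, the number of l's in the whole string
$ big                         NB. 9 26
```

Row j of big belongs to the prefix of length k+j, so the prefix length never needs to be stored separately. The table is plain numeric, which should make later slicing easier than with the boxed results above.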
Performance:
   timex '#/.~ 1e7#''h'''
0.0227
   timex '#[\ 1e4#''h'''
0.034662
   timex '#<\ 1e4#''h'''
0.013861
   timex '##\ 1e7#''h'''
0.017643
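So you should take care with the u of u\. Note that the two middle
runs use only 1e4 items, not 1e7, yet take about as long as the
others: a general u is handed every prefix in turn, so the work
grows much faster than the length of the string.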
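
As a small illustration of that point (the names slow and fast are mine, and this is only a sketch), here are two ways to count the letter l in every prefix. The first hands each prefix to u separately; the second is a single running sum, a form that as far as I know J handles without rebuilding the prefixes.

```j
s    =: 'Hello Jello'
slow =: +/@:('l'&=)\ s        NB. u sees each prefix afresh: 1+2+...+n characters examined
fast =: +/\ 'l' = s           NB. one running sum over the whole string
fast                          NB. 0 0 1 2 2 2 2 2 3 4 4
slow -: fast                  NB. 1, both give the same counts
```

On a string this short the difference is invisible; it only starts to matter at the sizes timed above.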
On 2021-04-09 08:48, Emir U wrote:
s=: 'Hello Jello'
Given a string like the above, I need to tabulate the number of
occurrences of every letter for every prefix of length >=k. I also
need to know the length of the prefix. So:
<prefix length> <prefix> <letter> <count>
As an example, prefix length=3, prefix=ell, letter=o, count=2
In my real use case k may be quite large (say 20) and the string may
be very long. The final form needs to be something I can slice and
dice thereafter (like perhaps a sparse array). I'd be grateful for any
advice as to how to tackle this.
Emir
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm