Maybe your soccer team, Linda; evidently a brilliant one. We’re just back
home from a sunny afternoon’s walk to find you’re World Champs again, so
congrats.
I might stay on topic next time...
Mike
Sent from my iPad
> On 7 Jul 2019, at 12:48, Linda Alvord <[email protected]> wrote:
>
> I just began to untangle this long story.
>
> Want a big surprise?
>
> Then try this!
> z =: 0 : 0
> A001|B002
> C003|A001|B002
> B002|A001
> C003|D004|A001
> E005|F006
> D004|C003
> )
> load 'debug'
> dissect 'z'
>
> Cheer on out soccer team, too.
>
> Linda
>
>
> -----Original Message-----
> From: Programming <[email protected]> On Behalf Of
> Louis de Forcrand
> Sent: Friday, July 5, 2019 10:36 PM
> To: [email protected]
> Subject: Re: [Jprogramming] How to improve speed searching for substrings
> using "E."?
>
> Hi again,
>
> I’m back with a speedier solution.
> comaxs is a conjunction, and if P is a boolean-valued dyadic verb and L any
> monadic verb on the items of the array A, P comaxs L A does the following:
>
> P and L are supposed to represent orderings on the elements of y. P
> represents a partial order, such that x P y says whether x is less than or
> equal to y for x and y any items of A. L is a “labelling” of the items of A,
> which represents a _total_ order compatible with P, in the sense that x P y
> implies that (L x) <: L y (or more generally that L x comes first in /:~(L
> x),L y).
>
> Then the result of the computation is the set of maximal items of A for the
> partial order P, that is, items not less than any other item.
>
> As mentioned earlier, the particular problem of inclusion of part number
> lists is a special case of this; indeed we can take
>
> P=: +./@E.&>
> L=: #@>
>
> since, for part number lists (or any kind of lists) x and y, if x is included
> in y, then certainly #x is no larger than #y.
>
> To measure performance we need a big, rather random dataset. Unfortunately I
> don’t know of any place I can find huge lists of part numbers, some of which
> include others, but if we look instead at lists of letters, we have a great
> data source which is the dictionary. At the following website we can find a
> quite huge file words_alpha.txt containing lots (too many for any of my
> devices probably) of english words, many of which are included in others:
>
> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fdwyl%2Fenglish-words&data=02%7C01%7C%7Cb8f0e8413c844f0eca3008d701baa2cc%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636979773413990492&sdata=xJqkJsb2SA%2BJnXwgnp6K%2BNCxpdQZPbldQZfjOHQ079o%3D&reserved=0
>
> Running freads on it allows us to try out cmaxs on a part of the English
> language. Sadly though I’m on my phone right now, and I can’t freads
> downloaded files, so either I’ll do it at a later time or someone else can
> try.
>
> The code works thanks to the previous observation used in fastmax that we can
> safely remove an element a from A that is included in another element, and no
> longer check other items for inclusion in a. The reason this verb is much
> faster than fastmax is because now we also know that if #x is greater than #y
> then we never have to check if x is included in y. (More generally, (L x) > L
> y implies -. x P y.) We therefore sort A by decreasing list length (L value)
> prior to running a modified version of fastmax which only checks if (i{A) P
> (j{A) when i > j.
> The last bit if speedup if achieved by manipulating bit arrays rather than
> the actual array A itself.
>
> Cheers,
> Louis
>
>
> cmaxs=: 2 : 0
> n=. #B=. 0"0 C=. 1"0 y=. (\: v) y
> while. n>i=. C i.1 do.
> B=. 1 i}B
> C=. 0 i}C
> C=. -.@:u&(i{y)&.(C&#) y
> end.
> B#y
> )
>
> ----------------------------------------------------------------------
> For information about J forums see
> https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.jsoftware.com%2Fforums.htm&data=02%7C01%7C%7Cb8f0e8413c844f0eca3008d701baa2cc%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636979773413990492&sdata=7dOWNkx4lsOGTnRtRwLUqHy3HvQosYprxJi0aW%2Ff0fA%3D&reserved=0
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm