This appears to work! I'm extremely impressed. Truly awesome. Execution time is unknown at this point as it's been running for a while without completing yet. :-) After all, there are literally billions of comparisons being done by i.
(Should have used a subset first to verify, but I was so keen to see whether the expression would run without blowing up with all the data in one go.) Again, a big thank you!

On Mon, Sep 6, 2021 at 5:21 PM Raul Miller <rauldmil...@gmail.com> wrote:

> I should also note that gmail decided that part of the code was not
> code but a part of the previous reply.
>
> Here's a version which hopefully works around that issue:
>
> cnGroupIndices=:{{
>   NB. x: ORPnoid
>   NB. y: ORPioid
>   assert. 1=>./x#/.y
>   valid=. y e. x
>   cns=. (1+x i.valid#y) (1+I.valid)} i.1+#x NB. connections
>   r=. (</. i.@#) }.{:@({&cns^:a:"0)i.#cns
> }}
>
> Or you could use gmail's view original option,
>
> Hopefully this works,
>
> --
> Raul
>
> On Mon, Sep 6, 2021 at 11:18 AM Raul Miller <rauldmil...@gmail.com> wrote:
> >
> > Oh... I didn't think this through. Not all of these values are connected.
> >
> > I should also verify an assumption: that connections occur when a
> > value in ORPnoid matches a value in ORPioid (and vice versa). If this
> > assumption is invalid, then this won't be a useful result.
> >
> > Anyways, here's a fixed version of that code:
> >
> > cnGroupIndices=:{{
> >   NB. x: ORPnoid
> >   NB. y: ORPioid
> >   assert. 1=>./x#/.y
> >   valid=. y e. x
> >   cns=. (1+x i.valid#y) (1+I.valid)} i.1+#x NB. connections
> >   (</. i.@#) }.{:@({&cns^:a:"0)i.#cns
> > }}
> >
> > I hope this helps,
> >
> > --
> > Raul
> >
> > On Mon, Sep 6, 2021 at 10:41 AM Pablo Landherr <pablo.landh...@gmail.com> wrote:
> > >
> > > Unfortunately, it fails. Given time I might realize the problem but I
> > > suspect it's obvious to sharper minds than mine.
> > >
> > >    #x=: ORPnoid
> > > 31636439
> > >    #y=: ORPioid
> > > 31636439
> > >    1=>./x#/.y
> > > 1
> > >    #cns=. (1+x i.y) (1+i.#x)} i.1+#x NB. connections
> > > 31636440
> > >    r=: (</. i.@#) }.{:@({&cns^:a:"0)i.#cns
> > > |index error
> > > |   r=:(</.i.@#)}. {:@({&cns^:a:"0)i.#cns
> > >
> > >    JVERSION
> > > Engine: j807/j64/darwin
> > > Release-d: commercial/2019-03-18T16:07:14
> > > Library: 8.07.26
> > > Qt IDE: 1.7.10/5.9.8
> > > Platform: Darwin 64
> > > Installer: J807 install
> > > InstallPath: /applications/j64-807
> > > Contact: www.jsoftware.com
> > >
> > > On Mon, Sep 6, 2021 at 4:00 PM Raul Miller <rauldmil...@gmail.com> wrote:
> > > >
> > > > Ok..
> > > >
> > > > So we can use indices into ORPnoid to uniquely identify each relevant
> > > > connection, and then use Michal Wallace's algorithm. (This means that
> > > > in our connection list, we'll add 1 to each of those indexes, since
> > > > that's what we had working.)
> > > >
> > > > In other words, I think that this should work:
> > > >
> > > > cnGroupIndices=:{{
> > > >   NB. x: ORPnoid
> > > >   NB. y: ORPioid
> > > >   assert. 1=>./x#/.y
> > > >   cns=. (1+x i.y) (1+i.#x)} i.1+#x NB. connections
> > > >   (</. i.@#) }.{:@({&cns^:a:"0)i.#cns
> > > > }}
> > > >
> > > > Running that should give you a list of boxes where each box contains
> > > > indices into ORPnoid (and ORPioid).
> > > >
> > > > So, for example, you should be able to do:
> > > >
> > > >    ndx=: ORPnoid cnGroupIndices ORPioid
> > > >
> > > > And then, for example, you could do this:
> > > >    ndx {each ORPnoid
> > > >
> > > > I hope that that makes sense. (And, I hope that I did not make a silly
> > > > mistake here.)
> > > >
> > > > Thanks,
> > > >
> > > > --
> > > > Raul
> > > >
> > > > On Mon, Sep 6, 2021 at 4:47 AM Pablo Landherr <pablo.landh...@gmail.com> wrote:
> > > > >
> > > > > Raul,
> > > > > True to form, your approach might offer a way forward. To add some details
> > > > > to the actual case:
> > > > >
> > > > >    #ORPnoid
> > > > > 31636439
> > > > >    #ORPioid
> > > > > 31636439
> > > > >    #ORPioid -. ORPnoid NB. some ioid do not connect to any other
> > > > > 563228
> > > > >    10{.ORPioid NB. the original data is hex in character form, but I convert it to symbols on import to facilitate sorting and searching
> > > > > `69fba00 `6a25a00 `6a2c000 `6a37e00 `6a3f600 `6a44b00 `6a53500 `6a5cf00 `6a5ed00 `6a5f800
> > > > >    >./ ORPnoid #/. ORPioid
> > > > > 1
> > > > >
> > > > > Where do I go from here?
> > > > >
> > > > > Thanks,
> > > > > Pablo
> > > > >
> > > > > On Mon, Sep 6, 2021 at 10:21 AM Raul Miller <rauldmil...@gmail.com> wrote:
> > > > > >
> > > > > > If you get 1 from either
> > > > > >
> > > > > >    >./ NID #/.OID
> > > > > >
> > > > > > or
> > > > > >
> > > > > >    >./ OID #/. NID
> > > > > >
> > > > > > then you can represent the connection matrix with a connection list,
> > > > > > which would be about the same size as your OID or NID list.
> > > > > >
> > > > > > I hope this makes sense,
> > > > > >
> > > > > > --
> > > > > > Raul
> > > > > >
> > > > > > On Mon, Sep 6, 2021 at 3:53 AM Pablo Landherr <pablo.landh...@gmail.com> wrote:
> > > > > > >
> > > > > > > Thank you for your clever solutions. Unfortunately I underestimated the
> > > > > > > problem of the size of the data I'm processing. As nid and oid of my test
> > > > > > > data each have a tally of 3.16e7, the connection matrix of my data contains
> > > > > > > 1e15 bits, which naturally gives me a limit error. I'll have to chew
> > > > > > > through it in chunks using a much less elegant approach.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Pablo
> > > > > > >
> > > > > > > On Fri, Sep 3, 2021 at 6:59 PM Raul Miller <rauldmil...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Your approach is more compact than mine, which would be a significant
> > > > > > > > advantage for large collections.
> > > > > > > >
> > > > > > > > Also, I'd discard the 0 value before grouping (since 0 here is an
> > > > > > > > artifact of the representation and not a part of the original data
> > > > > > > > set).
> > > > > > > >
> > > > > > > > This gives:
> > > > > > > >
> > > > > > > >    OID=: 1 9 6 2 10 7 3 11 4
> > > > > > > >    NID=: 2 10 7 3 11 12 4 8 5
> > > > > > > >    G=: NID OID} i.1+>./NID,OID
> > > > > > > >
> > > > > > > >    (</. #\)}.{:@({&G^:a:"0)i.#G
> > > > > > > > +---------+------+---------+
> > > > > > > > |1 2 3 4 5|6 7 12|8 9 10 11|
> > > > > > > > +---------+------+---------+
> > > > > > > >
> > > > > > > > Or, this could be sorted.
> > > > > > > >
> > > > > > > > That said, note that this only works for tree structured graphs. Which
> > > > > > > > is probably always going to be the case here.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > --
> > > > > > > > Raul
> > > > > > > >
> > > > > > > > On Fri, Sep 3, 2021 at 12:53 PM Michal Wallace <michal.wall...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > Oh, Raul's version using +. (or) on the connection matrix is way nicer than
> > > > > > > > > my version, and gets rid of my bug with directions. Do that instead. :)
> > > > > > > > >
> > > > > > > > > On Fri, Sep 3, 2021 at 12:31 PM Raul Miller <rauldmil...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > I should note that your example connection matrix does not seem to
> > > > > > > > > > match the oid, nid values you displayed.
> > > > > > > > > >
> > > > > > > > > >    OID=: 1 9 6 2 10 7 3 11 4
> > > > > > > > > >    NID=: 2 10 7 3 11 12 4 8 5
> > > > > > > > > >
> > > > > > > > > > Here's the connection matrix I see represented:
> > > > > > > > > >    ]CM=: 1 (<:OID,.NID)} 0$~,~>./OID,NID
> > > > > > > > > > 0 1 0 0 0 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 1 0 0 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 1 0 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 1 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 1 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 1
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 1 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 1 0
> > > > > > > > > > 0 0 0 0 0 0 0 1 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 0
> > > > > > > > > >
> > > > > > > > > > Note that I am using a recent J version here. In older versions of J,
> > > > > > > > > > that would have to be 1 (<"1<:OID,.NID)} 0$~,~>./OID,NID
> > > > > > > > > >
> > > > > > > > > > Anyways, its transitive closure would be:
> > > > > > > > > >    (+. +./ .*~)^:_ CM
> > > > > > > > > > 0 1 1 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 1 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 1 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 1 0 0 0 0 1
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 1
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 1 0 1 1 0
> > > > > > > > > > 0 0 0 0 0 0 0 1 0 0 1 0
> > > > > > > > > > 0 0 0 0 0 0 0 1 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 0
> > > > > > > > > >
> > > > > > > > > > But that does not match your suggested grouping operation (foo), so
> > > > > > > > > > let's assume that connections work both ways:
> > > > > > > > > >
> > > > > > > > > >    ]TC=: (+. +./ .*~)^:_ CM+.|:CM
> > > > > > > > > > 1 1 1 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 1 1 1 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 1 1 1 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 1 1 1 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 1 1 1 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 1 1 0 0 0 0 1
> > > > > > > > > > 0 0 0 0 0 1 1 0 0 0 0 1
> > > > > > > > > > 0 0 0 0 0 0 0 1 1 1 1 0
> > > > > > > > > > 0 0 0 0 0 0 0 1 1 1 1 0
> > > > > > > > > > 0 0 0 0 0 0 0 1 1 1 1 0
> > > > > > > > > > 0 0 0 0 0 0 0 1 1 1 1 0
> > > > > > > > > > 0 0 0 0 0 1 1 0 0 0 0 1
> > > > > > > > > >
> > > > > > > > > > That gives us something close to the grouping you asked for:
> > > > > > > > > >    TC </. 1+i.#TC
> > > > > > > > > > +---------+------+---------+
> > > > > > > > > > |1 2 3 4 5|6 7 12|8 9 10 11|
> > > > > > > > > > +---------+------+---------+
> > > > > > > > > >
> > > > > > > > > > The ordering here is different, but if that's important we could try to
> > > > > > > > > > fix it:
> > > > > > > > > >
> > > > > > > > > >    (/: {:@>) TC </. 1+i.#TC
> > > > > > > > > > +---------+---------+------+
> > > > > > > > > > |1 2 3 4 5|8 9 10 11|6 7 12|
> > > > > > > > > > +---------+---------+------+
> > > > > > > > > >
> > > > > > > > > > I hope this helps,
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Raul
> > > > > > > > > >
> > > > > > > > > > On Fri, Sep 3, 2021 at 11:21 AM Pablo Landherr <pablo.landh...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > I want to group items that are linked to each other. I tried to use some
> > > > > > > > > > > kind of connection matrix
> > > > > > > > > > >
> > > > > > > > > > >    nid =/ oid NB. an example
> > > > > > > > > > > 0 0 0 1 0 0 0 0 0
> > > > > > > > > > > 0 0 0 0 1 0 0 0 0
> > > > > > > > > > > 0 0 0 0 0 1 0 0 0
> > > > > > > > > > > 0 0 0 0 0 0 1 0 0
> > > > > > > > > > > 0 0 0 0 0 0 0 1 0
> > > > > > > > > > > 0 0 0 0 0 0 0 0 0
> > > > > > > > > > > 0 0 0 0 0 0 0 0 1
> > > > > > > > > > > 0 0 0 0 0 0 0 0 0
> > > > > > > > > > > 0 0 0 0 0 0 0 0 0
> > > > > > > > > > >
> > > > > > > > > > > to group items linked to each other but I can't figure out how to proceed
> > > > > > > > > > > from there. I'm hoping someone has a trick in their toolbox to share with
> > > > > > > > > > > me.
> > > > > > > > > > >
> > > > > > > > > > > oid is the first number of all connections and nid is the second number. So
> > > > > > > > > > > in this example 6 is connected to 7, 7 is connected to 12 and 12 has no
> > > > > > > > > > > further connection. What should foo be?
> > > > > > > > > > >
> > > > > > > > > > >    oid,.nid
> > > > > > > > > > >  1  2
> > > > > > > > > > >  9 10
> > > > > > > > > > >  6  7
> > > > > > > > > > >  2  3
> > > > > > > > > > > 10 11
> > > > > > > > > > >  7 12
> > > > > > > > > > >  3  4
> > > > > > > > > > > 11  8
> > > > > > > > > > >  4  5
> > > > > > > > > > >
> > > > > > > > > > >    oid foo nid
> > > > > > > > > > > ┌─────────┬─────────┬──────┐
> > > > > > > > > > > │1 2 3 4 5│9 10 11 8│6 7 12│
> > > > > > > > > > > └─────────┴─────────┴──────┘
> > > > > > > > > > >
> > > > > > > > > > > Thank you,
> > > > > > > > > > >
> > > > > > > > > > > Pablo
> > > > > > > > > > > ----------------------------------------------------------------------
> > > > > > > > > > > For information about J forums see http://www.jsoftware.com/forums.htm
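For anyone who wants to try the final cnGroupIndices on a subset before running it on the full data, here is a minimal sketch that applies it to the small OID/NID example quoted earlier in the thread. It assumes a J engine that accepts {{ }} direct definitions (J 9.02 or later, as far as I know). The comments inside the verb are my reading of the code, and the last line, which maps the index groups back to values with {&NID each, is my own illustrative choice rather than something taken from the thread.

```j
NB. Small example data from earlier in the thread
OID=: 1 9 6 2 10 7 3 11 4
NID=: 2 10 7 3 11 12 4 8 5

cnGroupIndices=:{{
  NB. x: NID-like list, y: OID-like list of the same length
  assert. 1=>./x#/.y                          NB. no value may repeat in x
  valid=. y e. x                              NB. connections that have a predecessor
  cns=. (1+x i.valid#y) (1+I.valid)} i.1+#x   NB. 1-based link to the predecessor, others link to themselves
  (</. i.@#) }.{:@({&cns^:a:"0)i.#cns         NB. follow links to a fixed point, group indices by end point
}}

ndx=: NID cnGroupIndices OID   NB. three boxes of indices: 0 3 6 8 / 1 4 7 / 2 5
{&NID each ndx                 NB. three boxes of values:  2 3 4 5 / 10 11 8 / 7 12
```

Each box holds the 0-based positions of the connections that belong to one chain, so the same boxes can be used to index either list.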