This appears to work! I'm extremely impressed. Truly awesome. Execution
time is unknown at this point as it's been running for a while without
completing yet.  :-)
After all, there are literally billions of comparisons being done by i.

(Should have used a subset first to verify, but I was so keen to see
whether the expression would run without blowing up with all the data in
one go).
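
For what it's worth, a check along the following lines is roughly what I should have done first. It is only a sketch: it restates Raul's verb from the message quoted below so that it runs on its own (the {{ }} form needs J 9.02 or later), it uses the nine-pair example from the start of the thread, and the names rows, groups, tc and matrixGroups are made up for illustration.

```j
NB. Raul's verb, restated from the quoted message so this block is self-contained.
NB. Direct definition {{ }} needs J 9.02 or later.
cnGroupIndices=: {{
  NB. x: ORPnoid
  NB. y: ORPioid
  assert. 1=>./x#/.y
  valid=. y e. x
  cns=. (1+x i.valid#y) (1+I.valid)} i.1+#x NB. connections
  (</. i.@#) }.{:@({&cns^:a:"0)i.#cns
}}

NB. The small example from the start of the thread.
OID=: 1 9 6 2 10 7 3 11 4
NID=: 2 10 7 3 11 12 4 8 5

NB. rows: boxed row indices into OID and NID, one box per connected group.
rows=: NID cnGroupIndices OID

NB. groups: the ids found in each group (illustrative name, not from the thread).
groups=: ~.&.> (rows {&.> <OID) ,&.> rows {&.> <NID

NB. Cross-check against the transitive closure of the symmetric connection matrix.
CM=: 1 (<"1 <:OID,.NID)} 0$~,~>./OID,NID
tc=: (+. +./ .*~)^:_ CM+.|:CM
matrixGroups=: tc </. 1+i.#tc

NB. Compare the two results after normalising order within and between boxes.
echo (/:~ /:~&.> groups) -: /:~ /:~&.> matrixGroups

NB. On the real data, a prefix could be tried first, for example:
NB.   (100000{.ORPnoid) cnGroupIndices 100000{.ORPioid
```

On this small example, groups comes out as the three boxes 1 2 3 4 5, 9 10 11 8 and 6 7 12, which is the result I originally asked foo to produce. The matrix cross-check is only practical for small inputs, since the matrix grows with the square of the number of ids.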

Again, big Thank you!

On Mon, Sep 6, 2021 at 5:21 PM Raul Miller <rauldmil...@gmail.com> wrote:

> I should also note that gmail decided that part of the code was not
> code but a part of the previous reply.
>
> Here's a version which hopefully works around that issue:
>
> cnGroupIndices=:{{
>   NB. x: ORPnoid
>   NB. y: ORPioid
>   assert. 1=>./x#/.y
>   valid=. y e. x
>   cns=. (1+x i.valid#y) (1+I.valid)} i.1+#x NB. connections
>   r=. (</. i.@#) }.{:@({&cns^:a:"0)i.#cns
> }}
>
> Or you could use gmail's view original option,
>
> Hopefully this works,
>
> --
> Raul
>
> On Mon, Sep 6, 2021 at 11:18 AM Raul Miller <rauldmil...@gmail.com> wrote:
> >
> > Oh... I didn't think this through. Not all of these values are connected.
> >
> > I should also verify an assumption: that connections occur when a
> > value in ORPnoid matches a value in ORPioid (and vice versa). If this
> > assumption is invalid, then this won't be a useful result.
> >
> > Anyways, here's a fixed version of that code:
> >
> > cnGroupIndices=:{{
> >   NB. x: ORPnoid
> >   NB. y: ORPioid
> >   assert. 1=>./x#/.y
> >   valid=. y e. x
> >   cns=. (1+x i.valid#y) (1+I.valid)} i.1+#x NB. connections
> >   (</. i.@#) }.{:@({&cns^:a:"0)i.#cns
> > }}
> >
> > I hope this helps,
> >
> > --
> > Raul
> >
> > > On Mon, Sep 6, 2021 at 10:41 AM Pablo Landherr <pablo.landh...@gmail.com> wrote:
> > >
> > > Unfortunately, it fails. Given time I might realize the problem but I
> > > suspect it's obvious to sharper minds than mine.
> > >
> > > >    #x=: ORPnoid
> > > > 31636439
> > > >    #y=: ORPioid
> > > > 31636439
> > > >    1=>./x#/.y
> > > > 1
> > > >    #cns=. (1+x i.y) (1+i.#x)} i.1+#x NB. connections
> > > > 31636440
> > > >    r=: (</. i.@#) }.{:@({&cns^:a:"0)i.#cns
> > > > |index error
> > > > |   r=:(</.i.@#)}. {:@({&cns^:a:"0)i.#cns
> > > >
> > > >    JVERSION
> > > > Engine: j807/j64/darwin
> > > > Release-d: commercial/2019-03-18T16:07:14
> > > > Library: 8.07.26
> > > > Qt IDE: 1.7.10/5.9.8
> > > > Platform: Darwin 64
> > > > Installer: J807 install
> > > > InstallPath: /applications/j64-807
> > > > Contact: www.jsoftware.com
> > >
> > >
> > > > On Mon, Sep 6, 2021 at 4:00 PM Raul Miller <rauldmil...@gmail.com> wrote:
> > >
> > > > Ok..
> > > >
> > > > So we can use indices into ORPnoid to uniquely identify each relevant
> > > > connection, and then use Michal Wallace's algorithm. (This means that
> > > > in our connection list, we'll add 1 to each of those indexes, since
> > > > that's what we had working.)
> > > >
> > > > In other words, I think that this should work:
> > > >
> > > > cnGroupIndices=:{{
> > > >   NB. x: ORPnoid
> > > >   NB. y: ORPioid
> > > >   assert. 1=>./x#/.y
> > > >   cns=. (1+x i.y) (1+i.#x)} i.1+#x NB. connections
> > > >   (</. i.@#) }.{:@({&cns^:a:"0)i.#cns
> > > > }}
> > > >
> > > > Running that should give you a list of boxes where each box contains
> > > > indices into ORPnoid (and ORPioid).
> > > >
> > > > So, for example, you should be able to do:
> > > >
> > > > ndx=: ORPnoid cnGroupIndices ORPioid
> > > >
> > > > And then, for example, you could do this:
> > > > >    ndx {each <ORPnoid
> > > >
> > > > > I hope that that makes sense. (And, I hope that I did not make a silly
> > > > > mistake here.)
> > > >
> > > > Thanks,
> > > >
> > > > --
> > > > Raul
> > > >
> > > > > On Mon, Sep 6, 2021 at 4:47 AM Pablo Landherr <pablo.landh...@gmail.com> wrote:
> > > > >
> > > > > Raul,
> > > > > > True to form, your approach might offer a way forward. To add some details
> > > > > > to the actual case:
> > > > > >
> > > > > >    #ORPnoid
> > > > > > 31636439
> > > > > >    #ORPioid
> > > > > > 31636439
> > > > > >    #ORPioid -. ORPnoid NB. some ioid do not connect to any other
> > > > > > 563228
> > > > > >    10{.ORPioid NB. the original data is hex in character form, but I convert it to symbols on import to facilitate sorting and searching
> > > > > > `69fba00 `6a25a00 `6a2c000 `6a37e00 `6a3f600 `6a44b00 `6a53500 `6a5cf00 `6a5ed00 `6a5f800
> > > > > >    >./ ORPnoid #/. ORPioid
> > > > > > 1
> > > > >
> > > > > Where do I go from here?
> > > > >
> > > > > Thanks,
> > > > > Pablo
> > > > >
> > > > > > On Mon, Sep 6, 2021 at 10:21 AM Raul Miller <rauldmil...@gmail.com> wrote:
> > > > >
> > > > > > If you get 1 from either
> > > > > >
> > > > > >    >./ NID #/.OID
> > > > > >
> > > > > > or
> > > > > >
> > > > > > >    >./ OID #/. NID
> > > > > >
> > > > > > > then you can represent the connection matrix with a connection list,
> > > > > > > which would be about the same size as your OID or NID list.
> > > > > >
> > > > > > I hope this makes sense,
> > > > > >
> > > > > > --
> > > > > > Raul
> > > > > >
> > > > > > > On Mon, Sep 6, 2021 at 3:53 AM Pablo Landherr <pablo.landh...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Thank you for your clever solutions. Unfortunately I underestimated the
> > > > > > > > problem of the size of the data I'm processing. As nid and oid of my test
> > > > > > > > data each have a tally of 3.16e7, the connection matrix of my data contains
> > > > > > > > 1e15 bits, which naturally gives me a limit error. I'll have to chew
> > > > > > > > through it in chunks using a much less elegant approach.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Pablo
> > > > > > >
> > > > > > > > On Fri, Sep 3, 2021 at 6:59 PM Raul Miller <rauldmil...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Your approach is more compact than mine, which would be a significant
> > > > > > > > > advantage for large collections.
> > > > > > > > >
> > > > > > > > > Also, I'd discard the 0 value before grouping (since 0 here is an
> > > > > > > > > artifact of the representation and not a part of the original data
> > > > > > > > > set).
> > > > > > > >
> > > > > > > > This gives:
> > > > > > > >
> > > > > > > > OID=: 1 9 6 2 10 7 3 11 4
> > > > > > > > NID=: 2 10 7 3 11 12 4 8 5
> > > > > > > > G=: NID OID} i.1+>./NID,OID
> > > > > > > >
> > > > > > > >    (</. #\)}.{:@({&G^:a:"0)i.#G
> > > > > > > > +---------+------+---------+
> > > > > > > > |1 2 3 4 5|6 7 12|8 9 10 11|
> > > > > > > > +---------+------+---------+
> > > > > > > >
> > > > > > > > Or, this could be sorted.
> > > > > > > >
> > > > > > > > > That said, note that this only works for tree structured graphs. Which
> > > > > > > > > is probably always going to be the case here.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > --
> > > > > > > > Raul
> > > > > > > >
> > > > > > > > > On Fri, Sep 3, 2021 at 12:53 PM Michal Wallace <michal.wall...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Oh, Raul's version using +. (or) on the connection matrix is way nicer than
> > > > > > > > > > my version, and gets rid of my bug with directions. Do that instead. :)
> > > > > > > > >
> > > > > > > > > > On Fri, Sep 3, 2021 at 12:31 PM Raul Miller <rauldmil...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > I should note that your example connection matrix does not seem to
> > > > > > > > > > > match the oid, nid values you displayed.
> > > > > > > > > >
> > > > > > > > > > OID=: 1 9 6 2 10 7 3 11 4
> > > > > > > > > > NID=: 2 10 7 3 11 12 4 8 5
> > > > > > > > > >
> > > > > > > > > > Here's the connection matrix I see represented:
> > > > > > > > > >    ]CM=: 1 (<:OID,.NID)} 0$~,~>./OID,NID
> > > > > > > > > > 0 1 0 0 0 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 1 0 0 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 1 0 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 1 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 1 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 1
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 1 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 1 0
> > > > > > > > > > 0 0 0 0 0 0 0 1 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 0
> > > > > > > > > >
> > > > > > > > > > > Note that I am using a recent J version here. In older versions of J,
> > > > > > > > > > > that would have to be 1 (<"1<:OID,.NID)} 0$~,~>./OID,NID
> > > > > > > > > > Anyways, its transitive closure would be:
> > > > > > > > > >    (+. +./ .*~)^:_ CM
> > > > > > > > > > 0 1 1 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 1 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 1 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 1 0 0 0 0 1
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 1
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 1 0 1 1 0
> > > > > > > > > > 0 0 0 0 0 0 0 1 0 0 1 0
> > > > > > > > > > 0 0 0 0 0 0 0 1 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 0 0 0 0 0 0 0
> > > > > > > > > >
> > > > > > > > > > > But that does not match your suggested grouping operation (foo), so
> > > > > > > > > > > let's assume that connections work both ways:
> > > > > > > > > >
> > > > > > > > > >     ]TC=: (+. +./ .*~)^:_ CM+.|:CM
> > > > > > > > > > 1 1 1 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 1 1 1 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 1 1 1 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 1 1 1 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 1 1 1 1 1 0 0 0 0 0 0 0
> > > > > > > > > > 0 0 0 0 0 1 1 0 0 0 0 1
> > > > > > > > > > 0 0 0 0 0 1 1 0 0 0 0 1
> > > > > > > > > > 0 0 0 0 0 0 0 1 1 1 1 0
> > > > > > > > > > 0 0 0 0 0 0 0 1 1 1 1 0
> > > > > > > > > > 0 0 0 0 0 0 0 1 1 1 1 0
> > > > > > > > > > 0 0 0 0 0 0 0 1 1 1 1 0
> > > > > > > > > > 0 0 0 0 0 1 1 0 0 0 0 1
> > > > > > > > > >
> > > > > > > > > > > That gives us something close to the grouping you asked for:
> > > > > > > > > >    TC </. 1+i.#TC
> > > > > > > > > > +---------+------+---------+
> > > > > > > > > > |1 2 3 4 5|6 7 12|8 9 10 11|
> > > > > > > > > > +---------+------+---------+
> > > > > > > > > >
> > > > > > > > > > > The ordering here is different, but if that's important we could try to
> > > > > > > > > > > fix it:
> > > > > > > > > >
> > > > > > > > > >    (/: {:@>) TC </. 1+i.#TC
> > > > > > > > > > +---------+---------+------+
> > > > > > > > > > |1 2 3 4 5|8 9 10 11|6 7 12|
> > > > > > > > > > +---------+---------+------+
> > > > > > > > > >
> > > > > > > > > > I hope this helps,
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Raul
> > > > > > > > > >
> > > > > > > > > > > On Fri, Sep 3, 2021 at 11:21 AM Pablo Landherr <pablo.landh...@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > I want to group items that are linked to each other. I tried to use some
> > > > > > > > > > > > kind of connection matrix
> > > > > > > > > > >
> > > > > > > > > > > >    nid =/ oid NB. an example
> > > > > > > > > > > > 0 0 0 1 0 0 0 0 0
> > > > > > > > > > > > 0 0 0 0 1 0 0 0 0
> > > > > > > > > > > > 0 0 0 0 0 1 0 0 0
> > > > > > > > > > > > 0 0 0 0 0 0 1 0 0
> > > > > > > > > > > > 0 0 0 0 0 0 0 1 0
> > > > > > > > > > > > 0 0 0 0 0 0 0 0 0
> > > > > > > > > > > > 0 0 0 0 0 0 0 0 1
> > > > > > > > > > > > 0 0 0 0 0 0 0 0 0
> > > > > > > > > > > > 0 0 0 0 0 0 0 0 0
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > to group items linked to each other but I can't figure out how to proceed
> > > > > > > > > > > > from there. I'm hoping someone has a trick in their toolbox to share with
> > > > > > > > > > > > me.
> > > > > > > > > > > >
> > > > > > > > > > > > oid is the first number of all connections and nid is the second number.
> > > > > > > > > > > > So in this example 6 is connected to 7, 7 is connected to 12 and 12 has no
> > > > > > > > > > > > further connection. What should foo be?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >    oid,.nid
> > > > > > > > > > > >  1  2
> > > > > > > > > > > >  9 10
> > > > > > > > > > > >  6  7
> > > > > > > > > > > >  2  3
> > > > > > > > > > > > 10 11
> > > > > > > > > > > >  7 12
> > > > > > > > > > > >  3  4
> > > > > > > > > > > > 11  8
> > > > > > > > > > > >  4  5
> > > > > > > > > > > >
> > > > > > > > > > > >    oid foo nid
> > > > > > > > > > > > ┌─────────┬─────────┬──────┐
> > > > > > > > > > > > │1 2 3 4 5│9 10 11 8│6 7 12│
> > > > > > > > > > > > └─────────┴─────────┴──────┘
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Thank you,
> > > > > > > > > > >
> > > > > > > > > > > Pablo
> > > > > > > > > > >
> > > > > > > >
> > > >
> ----------------------------------------------------------------------
> > > > > > > > > > > For information about J forums see
> > > > > > > > http://www.jsoftware.com/forums.htm
> > > > > > > > > >
> > > > > >
> ----------------------------------------------------------------------
> > > > > > > > > > For information about J forums see
> > > > > > http://www.jsoftware.com/forums.htm
> > > > > > > > > >
> > > > > > > > >
> > > > > >
> ----------------------------------------------------------------------
> > > > > > > > > For information about J forums see
> > > > > > http://www.jsoftware.com/forums.htm
> > > > > > > >
> > > >
> ----------------------------------------------------------------------
> > > > > > > > For information about J forums see
> > > > http://www.jsoftware.com/forums.htm
> > > > > > > >
> > > > > > >
> > > >
> ----------------------------------------------------------------------
> > > > > > > For information about J forums see
> > > > http://www.jsoftware.com/forums.htm
> > > > > >
> ----------------------------------------------------------------------
> > > > > > For information about J forums see
> http://www.jsoftware.com/forums.htm
> > > > > >
> > > > >
> ----------------------------------------------------------------------
> > > > > For information about J forums see
> http://www.jsoftware.com/forums.htm
> > > >
> ----------------------------------------------------------------------
> > > > For information about J forums see
> http://www.jsoftware.com/forums.htm
> > > >
> > > ----------------------------------------------------------------------
> > > For information about J forums see http://www.jsoftware.com/forums.htm
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to