Re: [Jprogramming] finding matching sets

R.E. Boss Sat, 26 Jul 2014 10:26:24 -0700

What information is available about the difference in performance between
using symbols and indices?
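
For readers less familiar with J, here is a rough JavaScript sketch of the two encodings being compared. It says nothing about how J performs with either one. "Indices" means replacing each value by its position among the distinct values of its column, as (i.~ ~.) does in the quoted messages. The "symbol-like" variant interns the combined non-key columns as one integer id, which is only an approximation of what s: does. All function names here (indexEncode, encodeTable, internNonKey, groupMatching) are hypothetical and do not come from the thread.

```javascript
// Sample table from the original question: key, product, value.
const rows = [
  ["c1", "p1", "0.25"],
  ["c1", "p2", "0.35"],
  ["c2", "p1", "0.25"],
  ["c2", "p2", "0.35"],
  ["c3", "p1", "0.25"],
  ["c3", "p2", "0.35"],
  ["c3", "p3", "0.45"],
];

// Index encoding: map each value to the position of its first occurrence
// among the distinct values seen so far (the role of (i.~ ~.) in J).
function indexEncode(column) {
  const seen = new Map();
  return column.map((v) => {
    if (!seen.has(v)) seen.set(v, seen.size);
    return seen.get(v);
  });
}

// Encode every column separately, then reassemble the rows.
function encodeTable(table) {
  const width = table[0].length;
  const cols = Array.from({ length: width }, (_, j) =>
    indexEncode(table.map((r) => r[j]))
  );
  return table.map((_, i) => cols.map((c) => c[i]));
}

// Symbol-like encoding (an assumption about the idea, not J's implementation):
// join the non-key columns into one string and intern it as a single integer.
function internNonKey(table) {
  return indexEncode(table.map((r) => r.slice(1).join(" ")));
}

// Group keys whose sets of non-key ids are identical.
function groupMatching(keys, ids) {
  const perKey = new Map();
  keys.forEach((k, i) => {
    if (!perKey.has(k)) perKey.set(k, new Set());
    perKey.get(k).add(ids[i]);
  });
  const groups = new Map();
  for (const [k, set] of perKey) {
    // Canonical signature: sorted ids joined into a string.
    const sig = [...set].sort((a, b) => a - b).join(",");
    if (!groups.has(sig)) groups.set(sig, []);
    groups.get(sig).push(k);
  }
  return [...groups.values()];
}

const encoded = encodeTable(rows);
const ids = internNonKey(rows);
const matches = groupMatching(rows.map((r) => r[0]), ids);
console.log(JSON.stringify(matches));
```

Either way, the grouping step compares small integers rather than strings. Whether symbols or plain indices are faster in J for a 1.6M row table would still need to be measured.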



R.E. Boss

(Add your info to http://www.jsoftware.com/jwiki/Community/Demographics )




> -----Original Message-----
> From: [email protected] [mailto:programming-
> [email protected]] On Behalf Of 'Pascal Jasmin' via
> Programming
> Sent: zaterdag 26 juli 2014 17:20
> To: [email protected]
> Subject: Re: [Jprogramming] finding matching sets
> 
> If your non-key columns are very repetitive, you might find symbols
> worthwhile:
> 
>    ({. (, <) [: s: }.)"1 t
> ┌──┬─────────┐
> │c1│`p1 `0.25│
> ├──┼─────────┤
> │c1│`p2 `0.35│
> ├──┼─────────┤
> │c2│`p1 `0.25│
> ├──┼─────────┤
> │c2│`p2 `0.35│
> ├──┼─────────┤
> │c3│`p1 `0.25│
> ├──┼─────────┤
> │c3│`p2 `0.35│
> ├──┼─────────┤
> │c3│`p3 `0.45│
> └──┴─────────┘
> 
> This alternative combines the non-key columns into a single symbol, which
> should make comparisons faster for your requirement.
> 
>    ({. (, <)  [: s: <@:(1&{::  , ' ' , [: ": 2&{::))"1 t
> ┌──┬────────┐
> │c1│`p1 0.25│
> ├──┼────────┤
> │c1│`p2 0.35│
> ├──┼────────┤
> │c2│`p1 0.25│
> ├──┼────────┤
> │c2│`p2 0.35│
> ├──┼────────┤
> │c3│`p1 0.25│
> ├──┼────────┤
> │c3│`p2 0.35│
> ├──┼────────┤
> │c3│`p3 0.45│
> └──┴────────┘
> 
> sticking the bossman's function ahead of it
> 
>    (([: ~. {."1) </.~ [: (i.~ ~.)({."1 </.}."1)) symboled=. ( ({. (, <)  [: s: <@:(1&{::  , ' ' , [: ": 2&{::))) "1 t
> ┌───────┬────┐
> │┌──┬──┐│┌──┐│
> ││c1│c2│││c3││
> │└──┴──┘│└──┘│
> └───────┴────┘
> 
> 
> ----- Original Message -----
> From: R.E. Boss <[email protected]>
> To: [email protected]
> Cc:
> Sent: Saturday, July 26, 2014 6:40:45 AM
> Subject: Re: [Jprogramming] finding matching sets
> 
> 
>    (~.{."1 t)</.~ (i.~ ~.)({."1 </.}."1) t
> +-------+----+
> |+--+--+|+--+|
> ||c1|c2|||c3||
> |+--+--+|+--+|
> +-------+----+
> 
> If that costs too much memory, try
> 
>    [T=:(i.~ ~.)"1 &.|:t
> 0 0 0
> 0 1 1
> 1 0 0
> 1 1 1
> 2 0 0
> 2 1 1
> 2 2 2
> 
>    (~.{."1 t)</.~ (i.~ ~.)({."1 </.}."1) T
> +-------+----+
> |+--+--+|+--+|
> ||c1|c2|||c3||
> |+--+--+|+--+|
> +-------+----+
> 
> 
> R.E. Boss
> 
> (Add your info to http://www.jsoftware.com/jwiki/Community/Demographics )
> 
> 
> 
> > -----Original Message-----
> > From: [email protected] [mailto:programming-
> > [email protected]] On Behalf Of Joe Bogner
> > Sent: vrijdag 25 juli 2014 13:08
> > To: [email protected]
> > Subject: [Jprogramming] finding matching sets
> >
> > Given the following data:
> >
> > t =: ;: ;._2 (0 : 0)
> > c1 p1 0.25
> > c1 p2 0.35
> > c2 p1 0.25
> > c2 p2 0.35
> > c3 p1 0.25
> > c3 p2 0.35
> > c3 p3 0.45
> > )
> >
> >
> > c1 has two rows (p1 0.25) and (p2 0.35)
> > c2 has two rows (p1 0.25) and (p2 0.35)
> > c3 has three rows (p1 0.25), (p2 0.35), (p3 0.45)
> >
> > How can I identify that c1 and c2 have the same set of values and that c3
> > is different?
> >
> > I'd like to run the algorithm on a 1.6M row table
> >
> > I created a prototype in javascript using a rough approach, but I haven't
> > translated it to J in case there is a better way:
> >
> > 1. Sort array by column 2 (product)
> > 2. Loop through the array and create a hash table of the concatenated
> > product/value pair (e.g: p2 0.35)  for each customer
> > 3. Loop through the hash table and create a list of customers for each
> > unique string of product/value pairs
> >
> > var t = function(){/*
> > c1 p1 0.25
> > c1 p2 0.35
> > c2 p1 0.25
> > c2 p2 0.35
> > c3 p1 0.25
> > c3 p2 0.35
> > c3 p3 0.45
> > */}.toString().slice(15,-4).split('\n').map(function(x) { return x.split(' ') })
> > t = t.sort(function(x,y) { return x[1].localeCompare(y[1]) })
> >
> > var cs = t.reduce(function(memo,val) { memo[val[0]] =
> > (memo[val[0]]||'')+val[1]+val[2]; return memo;}, {});
> >
> > //JSON.stringify(cs)
> > //"{"c1":"p10.25p20.35","c2":"p10.25p20.35","c3":"p10.25p20.35p30.45"}"
> >
> > var matches = Object.keys(cs).reduce(function(memo,val) { var key =
> > memo[cs[val]] = (memo[cs[val]] || []); key.push(val);  return memo;}, {})
> >
> > JSON.stringify(matches)
> >
> > "{"p10.25p20.35":["c1","c2"],"p10.25p20.35p30.45":["c3"]}"
> >
> > How should this problem be approached in J?
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
> 
> 
> 
