Re: [Jprogramming] finding matching sets

robert therriault Sat, 26 Jul 2014 14:35:55 -0700

To quote Joey Tuttle from the Jconference...benchmarks. :-)

There are a lot of things that go into the decision to use symbols and in the 
end you test and decide if it provides the performance you want.


Cheers, bob

On Jul 26, 2014, at 1:21 PM, R.E. Boss <[email protected]> wrote:

> What information is available about difference in  performance between using 
> symbols and indices?
> 
> 
> R.E. Boss
> 
> (Add your info to http://www.jsoftware.com/jwiki/Community/Demographics )
> 
> 
> 
> 
>> -----Original Message-----
>> From: [email protected] [mailto:programming-
>> [email protected]] On Behalf Of 'Pascal Jasmin' via
>> Programming
>> Sent: zaterdag 26 juli 2014 17:20
>> To: [email protected]
>> Subject: Re: [Jprogramming] finding matching sets
>> 
>> If your non key columns are very repetitive, you might find symbols
>> worthwhile:
>> 
>> ({. (, <) [: s: }.)"1 t
>> ┌──┬─────────┐
>> │c1│`p1 `0.25│
>> ├──┼─────────┤
>> │c1│`p2 `0.35│
>> ├──┼─────────┤
>> │c2│`p1 `0.25│
>> ├──┼─────────┤
>> │c2│`p2 `0.35│
>> ├──┼─────────┤
>> │c3│`p1 `0.25│
>> ├──┼─────────┤
>> │c3│`p2 `0.35│
>> ├──┼─────────┤
>> │c3│`p3 `0.45│
>> └──┴─────────┘
>> 
>> this alternative makes the non key columns into one symbol which should
>> make comparisons faster for just your requirement.
>> 
>>   ({. (, <)  [: s: <@:(1&{::  , ' ' , [: ": 2&{::))"1 t
>> ┌──┬────────┐
>> │c1│`p1 0.25│
>> ├──┼────────┤
>> │c1│`p2 0.35│
>> ├──┼────────┤
>> │c2│`p1 0.25│
>> ├──┼────────┤
>> │c2│`p2 0.35│
>> ├──┼────────┤
>> │c3│`p1 0.25│
>> ├──┼────────┤
>> │c3│`p2 0.35│
>> ├──┼────────┤
>> │c3│`p3 0.45│
>> └──┴────────┘
>> 
>> sticking the bossman's function ahead of it
>> 
>> (([: ~. {."1) </.~ [: (i.~ ~.)({."1 </.}."1)) symboled=. ( ({. (, <)  [: s: 
>> <@:(1&{::  , '
>> ' , [: ": 2&{::))) "1 t
>> ┌───────┬────┐
>> │┌──┬──┐│┌──┐│
>> ││c1│c2│││c3││
>> │└──┴──┘│└──┘│
>> └───────┴────┘
>> 
>> 
>> ----- Original Message -----
>> From: R.E. Boss <[email protected]>
>> To: [email protected]
>> Cc:
>> Sent: Saturday, July 26, 2014 6:40:45 AM
>> Subject: Re: [Jprogramming] finding matching sets
>> 
>> 
>>   (~.{."1 t)</.~ (i.~ ~.)({."1 </.}."1) t
>> +-------+----+
>> |+--+--+|+--+|
>> ||c1|c2|||c3||
>> |+--+--+|+--+|
>> +-------+----+
>> 
>> If that costs too much memory, try
>> 
>>   [T=:(i.~ ~.)"1 &.|:t
>> 0 0 0
>> 0 1 1
>> 1 0 0
>> 1 1 1
>> 2 0 0
>> 2 1 1
>> 2 2 2
>> 
>>   (~.{."1 t)</.~ (i.~ ~.)({."1 </.}."1) T
>> +-------+----+
>> |+--+--+|+--+|
>> ||c1|c2|||c3||
>> |+--+--+|+--+|
>> +-------+----+
>> 
>> 
>> R.E. Boss
>> 
>> (Add your info to
>> http://www.jsoftware.com/jwiki/Community/Demographics )
>> 
>> 
>> 
>>> -----Original Message-----
>>> From: [email protected]
>> [mailto:programming-
>>> [email protected]] On Behalf Of Joe Bogner
>>> Sent: vrijdag 25 juli 2014 13:08
>>> To: [email protected]
>>> Subject: [Jprogramming] finding matching sets
>>> 
>>> Given the following data:
>>> 
>>> t =: ;: ;._2 (0 : 0)
>>> c1 p1 0.25
>>> c1 p2 0.35
>>> c2 p1 0.25
>>> c2 p2 0.35
>>> c3 p1 0.25
>>> c3 p2 0.35
>>> c3 p3 0.45
>>> )
>>> 
>>> 
>>> c1 has two rows (p1 0.25) and (p2 0.35)
>>> c2 has two rows (p1 0.25) and (p2 0.35)
>>> c3 has three rows (p1 025), (p2 0.35), (p3 0.45)
>>> 
>>> How can I identify that c1 and c2 have the same set of values and that c3
>>> is different?
>>> 
>>> I'd like to run the algorithm on a 1.6M row table
>>> 
>>> I created a prototype in javascript using a rough approach, but I haven't
>>> translated it to J in case there is a better way:
>>> 
>>> 1. Sort array by column 2 (product)
>>> 2. Loop through the array and create a hash table of the concatenated
>>> product/value pair (e.g: p2 0.35)  for each customer
>>> 3. Loop through the hash table and create a list of customers for each
>>> unique string of product/value pairs
>>> 
>>> var t = function(){/*
>>> c1 p1 0.25
>>> c1 p2 0.35
>>> c2 p1 0.25
>>> c2 p2 0.35
>>> c3 p1 0.25
>>> c3 p2 0.35
>>> c3 p3 0.45
>>> */}.toString().slice(15,-4).split('\n').map(function(x) { return x.split('
>>> ') })
>>> t = t.sort(function(x,y) { return x[1]>y[1] })
>>> 
>>> var cs = t.reduce(function(memo,val) { memo[val[0]] =
>>> (memo[val[0]]||'')+val[1]+val[2]; return memo;}, {});
>>> 
>>> //JSON.stringify(cs)
>>> //"{"c1":"p10.25p20.35","c2":"p10.25p20.35","c3":"p10.25p20.35p30.45"}"
>>> 
>>> var matches = Object.keys(cs).reduce(function(memo,val) { var key =
>>> memo[cs[val]] = (memo[cs[val]] || []); key.push(val);  return memo;}, {})
>>> 
>>> JSON.stringify(matches)
>>> 
>>> "{"p10.25p20.35":["c1","c2"],"p10.25p20.35p30.45":["c3"]}"
>>> 
>>> How should this problem be approached in J?
>>> ----------------------------------------------------------------------
>>> For information about J forums see
>> http://www.jsoftware.com/forums.htm
>> 
>> 
>> 
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
>> 
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Re: [Jprogramming] finding matching sets

Reply via email to