Re: [Jprogramming] matching & cancelling transactions

Louis de Forcrand Thu, 13 Apr 2017 00:31:47 -0700

Try this. On my phone, it runs almost 6 times faster than your version on your 
small example data, but uses more space (maybe sparse matrices would be 
faster?), with:
   1e3 timespacex 'applycredits'''''
0.000630182 4608
vs
   1e3 timespacex 'ac t'
0.000131279 6400


Of course this extra space could be a big slowdown on larger data. At least 
it's more J-ish:

ac=: +/"1@s@p
s=: [: u&.(,&0) d&.(0&,)
u=: [: +/\.^:_1 (0<.+)/\.   NB. u -: d&.(-@|.)
d=: [: +/\ ^:_1 (0>.+)/\.&.|.
p=: |:@=@:| * [: * {:"1

t=: (}."1 /: {."1) ".;._2 data
ac t

It works on a minimal form of your data where records are sorted by their 
index, and results are in the form:
 0: ok
 1: cancelled
_1: credit

Louis

> On 12 Apr 2017, at 21:57, 'Bo Jacoby' via Programming 
> <[email protected]> wrote:
> 
> Hi Joe!
> My favorite datastructure is ORDINAL FRACTIONS - the algebra of data 
> 
> |  
> |   
> |   
> |   |    |
> 
>   |
> 
>  |
> |  
> |    |  
> ORDINAL FRACTIONS - the algebra of data
> This paper was submitted to the 10th World Computer Congress, IFIP 1986 
> conference, but rejected by the referee....  |   |
> 
>  |
> 
>  |
> 
> 
> Your data are coded like this
> 10 Joe
> 20 Bob
> 30 Jane
> 01 blue
> 02 red
> 03 purple
> 11 1
> 11 -1
> 11 1
> 22 1
> 22 1
> 22 3
> 22 -1
> 22 -1
> 33 5
> 33 -2
> 33 2
> (Written with double CRs because the mail program has a history of deleting 
> my CRs).
> Summation gives the result
> 10 Joe
> 20 Bob
> 30 Jane
> 01 blue
> 02 red
> 03 purple
> 11 1
> 22 3
> 
> 33 5
> I have not done the summation in J, but I'd like to do it.
> Perhaps this helps you.
> Bo.
> 
> 
> 
>    Den 0:04 torsdag den 13. april 2017 skrev chris burke 
> <[email protected]>:
> 
> 
> Incidentally, for production code, I suggest starting by removing any sales
> not matched in returns and vice versa, so that the matching algorithm is
> applied only to potential matches.
> 
>> On Wed, Apr 12, 2017 at 2:53 PM, chris burke <[email protected]> wrote:
>> 
>> Great.
>> 
>> In case you need more complicated handling of the "gray area"
>> transactions, I believe they would be relatively few in number, so most of
>> the time you could do the matching efficiently, then check for any keys
>> with returns preceding sales. For those, setting aside the first such
>> return and repeating should clear them quickly.
>> 
>> Timing should be well under 1 second for a million records.
>> 
>>> On Wed, Apr 12, 2017 at 1:57 PM, Joe Bogner <[email protected]> wrote:
>>> 
>>> Just for completeness, I added a line that incorporates the sequence check
>>> into the cancel logic. Works great
>>> 
>>> NB. hui progressive index
>>> NB. http://code.jsoftware.com/wiki/Essays/Progressive_Index-Of
>>> oc=: i.~ (] - {) /:@/:
>>> pi=: #@[ ({. i.&(,.oc) }.) [ i. ,
>>> 
>>> NB. argument is 3-col table of seq,key,qty
>>> NB. result is the unmatched transactions
>>> matchtrans=: 3 : 0
>>> msk=. 0<{:"1 y
>>> sales=. msk#y
>>> returns=. (-.msk)#y
>>> ndx=. (}."1 sales) pi | }."1 returns
>>> cancels=. ndx<#sales
>>> NB. ensure cancel is after sale
>>> cancels =. cancels *. (({."1 (<<(cancels)#ndx){sales) < ({."1
>>> (cancels#returns)))
>>> ((<<<cancels#ndx){sales),(-.cancels)#returns
>>> )
>>> 
>>> 
>>>> On Wed, Apr 12, 2017 at 4:14 PM, Joe Bogner <[email protected]> wrote:
>>>> 
>>>> Chris, this looks promising. Thanks for sharing. It's nearly instant on
>>> a
>>>> million rows.
>>>> 
>>>> Which row had a return before a transaction? seq 10 was an example of a
>>>> partial return. The hypothetical customer returned 2 out of the 5
>>> purchased
>>>> prior. I added that example since technically per the original spec it
>>>> wouldn't be cancelled out in this pass.  It's a gray area so I may be
>>> able
>>>> to use this approach, especially since I don't see how to incorporate
>>> the
>>>> time element into the progressive index.
>>>> 
>>>> Thanks again
>>>> 
>>>> 
>>>> On Wed, Apr 12, 2017 at 3:52 PM, chris burke <[email protected]>
>>> wrote:
>>>> 
>>>>> This might be done by comparing matrices of sales and returns. The
>>>>> function
>>>>> below seems to be close to what you want. It doesn't exactly match your
>>>>> example, but your example has cases where returns are made before the
>>>>> transactions. Was this intentional?
>>>>> 
>>>>> The code should run faster than a looping solution.
>>>>> 
>>>>> Code:
>>>>> 
>>>>> NB. hui progressive index
>>>>> NB. http://code.jsoftware.com/wiki/Essays/Progressive_Index-Of
>>>>> oc=: i.~ (] - {) /:@/:
>>>>> pi=: #@[ ({. i.&(,.oc) }.) [ i. ,
>>>>> 
>>>>> NB. argument is 3-col table of seq,key,qty
>>>>> NB. result is the unmatched transactions
>>>>> matchtrans=: 3 : 0
>>>>> msk=. 0<{:"1 y
>>>>> sales=. msk#y
>>>>> returns=. (-.msk)#y
>>>>> ndx=. (}."1 sales) pi | }."1 returns
>>>>> cancels=. ndx<#sales
>>>>> ((<<<cancels#ndx){sales),(-.cancels)#returns
>>>>> )
>>>>> 
>>>>> Example:
>>>>> 
>>>>>     dat=: ".;._2 (0 : 0)
>>>>> 1 1 1
>>>>> 2 1 _1
>>>>> 3 1 1
>>>>> 4 2 1
>>>>> 5 2 1
>>>>> 6 2 3
>>>>> 7 2 _1
>>>>> 8 2 _1
>>>>> 9 3 5
>>>>> 10 3 _2
>>>>> 11 3 2
>>>>> )
>>>>> 
>>>>>     matchtrans dat
>>>>> 3 1 1
>>>>> 6 2 3
>>>>> 9 3 5
>>>>> 
>>>>> 
>>>>> On Wed, Apr 12, 2017 at 9:35 AM, Joe Bogner <[email protected]>
>>> wrote:
>>>>> 
>>>>>> I have a problem I'm trying to solve in different languages. I have a
>>>>>> solution in SQL and also in kdb which largely resembles the SQL
>>>>> solution.
>>>>>> I'm curious what a J solution would look like. More specifically, I'm
>>>>>> interested in picking the brains of others here to see if this type
>>> of
>>>>>> problem can be solved without looping (some form of scan?).
>>>>>> 
>>>>>> EDIT: Initially I wrote this up thinking the J solution would
>>> difficult,
>>>>>> but it was actually fairly straightforward -- about 15 minutes, but
>>>>> still
>>>>>> would like to see if there are alternatives. If nothing else, maybe
>>> an
>>>>>> interesting problem to share.
>>>>>> 
>>>>>> Example data:
>>>>>> 
>>>>>> A store has a transaction log with a sequence for each transaction.
>>> The
>>>>>> transaction log records a key for a unique customer/item combination.
>>>>> The
>>>>>> transaction log records how many units were purchased or returned.
>>>>>> 
>>>>>> Goal:
>>>>>> Attempt to match up related transactions and cancel out instances
>>> when
>>>>> the
>>>>>> customer/item combination is returned at the same quantity as a
>>> previous
>>>>>> transaction
>>>>>> 
>>>>>> Examples:
>>>>>> 
>>>>>> Joe buys 1 blue pen, which is defective, then returns the 1 defective
>>>>> blue
>>>>>> pen, then buys another blue pen. EXPECTED: cancel out first two
>>>>>> transactions and leave the the last one for 1 pen
>>>>>> 
>>>>>> Bob buys 2 red pens in two separate transactions. He then buys 3
>>> more.
>>>>> He
>>>>>> returns the first two purchases as two separate return transactions.
>>>>>> EXPECTED: cancel out all transactions except the one for qty 3
>>>>>> 
>>>>>> Jane buys 5 purple pens and subsequently returns two of them. She
>>> buys
>>>>> two
>>>>>> more. EXPECTED: No transactions match exactly, so nothing is
>>> cancelled
>>>>> out
>>>>>> 
>>>>>> 
>>>>>> Data:
>>>>>> 
>>>>>> data=: 0 : 0
>>>>>> seq key qty
>>>>>> 1 1 1
>>>>>> 2 1 _1
>>>>>> 3 1 1
>>>>>> 4 2 1
>>>>>> 5 2 1
>>>>>> 6 2 3
>>>>>> 7 2 _1
>>>>>> 8 2 _1
>>>>>> 9 3 5
>>>>>> 10 3 _2
>>>>>> 11 3 2
>>>>>> )
>>>>>> tbl =: ,. ' ' cut every cutLF data
>>>>>> 'seqs keys qtys' =: |: ". every }. tbl
>>>>>> 
>>>>>> 
>>>>>> Goal:
>>>>>> 
>>>>>> goals =: 0 : 0
>>>>>> 
>>>>>> goal
>>>>>> 
>>>>>> cancelled
>>>>>> 
>>>>>> credit
>>>>>> 
>>>>>> ok
>>>>>> 
>>>>>> cancelled
>>>>>> 
>>>>>> cancelled
>>>>>> 
>>>>>> ok
>>>>>> 
>>>>>> credit
>>>>>> 
>>>>>> credit
>>>>>> 
>>>>>> ok
>>>>>> 
>>>>>> ok
>>>>>> 
>>>>>> ok
>>>>>> 
>>>>>> )
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> tbl,.(cutLF goals)
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |seq|key|qty|goal |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |1 |1 |1 |cancelled|
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |2 |1 |_1 |credit |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |3 |1 |1 |ok |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |4 |2 |1 |cancelled|
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |5 |2 |1 |cancelled|
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |6 |2 |3 |ok |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |7 |2 |_1 |credit |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |8 |2 |_1 |credit |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |9 |3 |5 |ok |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |10 |3 |_2 |ok |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |11 |3 |2 |ok |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> One approach:
>>>>>> 
>>>>>> applycredits =: 3 : 0
>>>>>> 
>>>>>> goals=.(<'goal')
>>>>>> 
>>>>>> creditids=.0
>>>>>> 
>>>>>> for_i. (i. # seqs) do.
>>>>>> 
>>>>>>   key=.i{keys
>>>>>> 
>>>>>>   seq=.i{seqs
>>>>>> 
>>>>>>   qty=.i{qtys
>>>>>> 
>>>>>>   nextcredit =.| {. qtys #~ ((key=keys)*(seqs>seq)*(qtys<0))
>>>>>> 
>>>>>>   if. nextcredit = qty do.
>>>>>> 
>>>>>>   goals=.goals,<'cancelled'
>>>>>> 
>>>>>>   creditids =. creditids, seqs #~ ((key=keys)*(seqs>seq)*(qtys<0))
>>>>>> 
>>>>>>   elseif. creditids e.~ seq do.
>>>>>> 
>>>>>>     goals=.goals,<'credit'
>>>>>> 
>>>>>>   elseif. do.
>>>>>> 
>>>>>>     goals=.goals,<'ok'
>>>>>> 
>>>>>> end.
>>>>>> 
>>>>>> end.
>>>>>> 
>>>>>> goals
>>>>>> 
>>>>>> )
>>>>>> 
>>>>>> tbl ,. ( applycredits 0 )
>>>>>> 
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |seq|key|qty|goal |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |1 |1 |1 |cancelled|
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |2 |1 |_1 |credit |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |3 |1 |1 |ok |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |4 |2 |1 |cancelled|
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |5 |2 |1 |cancelled|
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |6 |2 |3 |ok |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |7 |2 |_1 |credit |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |8 |2 |_1 |credit |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |9 |3 |5 |ok |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |10 |3 |_2 |ok |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> |11 |3 |2 |ok |
>>>>>> 
>>>>>> +---+---+---+---------+
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> (cutLF goals) -: ( applycredits 0 )
>>>>>> 
>>>>>> 1
>>>>>> 
>>>>>> 
>>>>>> thanks for any input
>>>>>> ------------------------------------------------------------
>>> ----------
>>>>>> For information about J forums see http://www.jsoftware.com/forum
>>> s.htm
>>>>> ----------------------------------------------------------------------
>>>>> For information about J forums see http://www.jsoftware.com/forums.htm
>>>>> 
>>>> 
>>>> 
>>> ----------------------------------------------------------------------
>>> For information about J forums see http://www.jsoftware.com/forums.htm
>>> 
>> 
>> 
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
> 
> 
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Re: [Jprogramming] matching & cancelling transactions

Reply via email to