Just for completeness, I added a line that incorporates the sequence check into the cancel logic. Works great
NB. hui progressive index NB. http://code.jsoftware.com/wiki/Essays/Progressive_Index-Of oc=: i.~ (] - {) /:@/: pi=: #@[ ({. i.&(,.oc) }.) [ i. , NB. argument is 3-col table of seq,key,qty NB. result is the unmatched transactions matchtrans=: 3 : 0 msk=. 0<{:"1 y sales=. msk#y returns=. (-.msk)#y ndx=. (}."1 sales) pi | }."1 returns cancels=. ndx<#sales NB. ensure cancel is after sale cancels =. cancels *. (({."1 (<<(cancels)#ndx){sales) < ({."1 (cancels#returns))) ((<<<cancels#ndx){sales),(-.cancels)#returns ) On Wed, Apr 12, 2017 at 4:14 PM, Joe Bogner <[email protected]> wrote: > Chris, this looks promising. Thanks for sharing. It's nearly instant on a > million rows. > > Which row had a return before a transaction? seq 10 was an example of a > partial return. The hypothetical customer returned 2 out of the 5 purchased > prior. I added that example since technically per the original spec it > wouldn't be cancelled out in this pass. It's a gray area so I may be able > to use this approach, especially since I don't see how to incorporate the > time element into the progressive index. > > Thanks again > > > On Wed, Apr 12, 2017 at 3:52 PM, chris burke <[email protected]> wrote: > >> This might be done by comparing matrices of sales and returns. The >> function >> below seems to be close to what you want. It doesn't exactly match your >> example, but your example has cases where returns are made before the >> transactions. Was this intentional? >> >> The code should run faster than a looping solution. >> >> Code: >> >> NB. hui progressive index >> NB. http://code.jsoftware.com/wiki/Essays/Progressive_Index-Of >> oc=: i.~ (] - {) /:@/: >> pi=: #@[ ({. i.&(,.oc) }.) [ i. , >> >> NB. argument is 3-col table of seq,key,qty >> NB. result is the unmatched transactions >> matchtrans=: 3 : 0 >> msk=. 0<{:"1 y >> sales=. msk#y >> returns=. (-.msk)#y >> ndx=. (}."1 sales) pi | }."1 returns >> cancels=. ndx<#sales >> ((<<<cancels#ndx){sales),(-.cancels)#returns >> ) >> >> Example: >> >> dat=: ".;._2 (0 : 0) >> 1 1 1 >> 2 1 _1 >> 3 1 1 >> 4 2 1 >> 5 2 1 >> 6 2 3 >> 7 2 _1 >> 8 2 _1 >> 9 3 5 >> 10 3 _2 >> 11 3 2 >> ) >> >> matchtrans dat >> 3 1 1 >> 6 2 3 >> 9 3 5 >> >> >> On Wed, Apr 12, 2017 at 9:35 AM, Joe Bogner <[email protected]> wrote: >> >> > I have a problem I'm trying to solve in different languages. I have a >> > solution in SQL and also in kdb which largely resembles the SQL >> solution. >> > I'm curious what a J solution would look like. More specifically, I'm >> > interested in picking the brains of others here to see if this type of >> > problem can be solved without looping (some form of scan?). >> > >> > EDIT: Initially I wrote this up thinking the J solution would difficult, >> > but it was actually fairly straightforward -- about 15 minutes, but >> still >> > would like to see if there are alternatives. If nothing else, maybe an >> > interesting problem to share. >> > >> > Example data: >> > >> > A store has a transaction log with a sequence for each transaction. The >> > transaction log records a key for a unique customer/item combination. >> The >> > transaction log records how many units were purchased or returned. >> > >> > Goal: >> > Attempt to match up related transactions and cancel out instances when >> the >> > customer/item combination is returned at the same quantity as a previous >> > transaction >> > >> > Examples: >> > >> > Joe buys 1 blue pen, which is defective, then returns the 1 defective >> blue >> > pen, then buys another blue pen. EXPECTED: cancel out first two >> > transactions and leave the the last one for 1 pen >> > >> > Bob buys 2 red pens in two separate transactions. He then buys 3 more. >> He >> > returns the first two purchases as two separate return transactions. >> > EXPECTED: cancel out all transactions except the one for qty 3 >> > >> > Jane buys 5 purple pens and subsequently returns two of them. She buys >> two >> > more. EXPECTED: No transactions match exactly, so nothing is cancelled >> out >> > >> > >> > Data: >> > >> > data=: 0 : 0 >> > seq key qty >> > 1 1 1 >> > 2 1 _1 >> > 3 1 1 >> > 4 2 1 >> > 5 2 1 >> > 6 2 3 >> > 7 2 _1 >> > 8 2 _1 >> > 9 3 5 >> > 10 3 _2 >> > 11 3 2 >> > ) >> > tbl =: ,. ' ' cut every cutLF data >> > 'seqs keys qtys' =: |: ". every }. tbl >> > >> > >> > Goal: >> > >> > goals =: 0 : 0 >> > >> > goal >> > >> > cancelled >> > >> > credit >> > >> > ok >> > >> > cancelled >> > >> > cancelled >> > >> > ok >> > >> > credit >> > >> > credit >> > >> > ok >> > >> > ok >> > >> > ok >> > >> > ) >> > >> > >> > >> > >> > tbl,.(cutLF goals) >> > >> > +---+---+---+---------+ >> > >> > |seq|key|qty|goal | >> > >> > +---+---+---+---------+ >> > >> > |1 |1 |1 |cancelled| >> > >> > +---+---+---+---------+ >> > >> > |2 |1 |_1 |credit | >> > >> > +---+---+---+---------+ >> > >> > |3 |1 |1 |ok | >> > >> > +---+---+---+---------+ >> > >> > |4 |2 |1 |cancelled| >> > >> > +---+---+---+---------+ >> > >> > |5 |2 |1 |cancelled| >> > >> > +---+---+---+---------+ >> > >> > |6 |2 |3 |ok | >> > >> > +---+---+---+---------+ >> > >> > |7 |2 |_1 |credit | >> > >> > +---+---+---+---------+ >> > >> > |8 |2 |_1 |credit | >> > >> > +---+---+---+---------+ >> > >> > |9 |3 |5 |ok | >> > >> > +---+---+---+---------+ >> > >> > |10 |3 |_2 |ok | >> > >> > +---+---+---+---------+ >> > >> > |11 |3 |2 |ok | >> > >> > +---+---+---+---------+ >> > >> > >> > >> > One approach: >> > >> > applycredits =: 3 : 0 >> > >> > goals=.(<'goal') >> > >> > creditids=.0 >> > >> > for_i. (i. # seqs) do. >> > >> > key=.i{keys >> > >> > seq=.i{seqs >> > >> > qty=.i{qtys >> > >> > nextcredit =.| {. qtys #~ ((key=keys)*(seqs>seq)*(qtys<0)) >> > >> > if. nextcredit = qty do. >> > >> > goals=.goals,<'cancelled' >> > >> > creditids =. creditids, seqs #~ ((key=keys)*(seqs>seq)*(qtys<0)) >> > >> > elseif. creditids e.~ seq do. >> > >> > goals=.goals,<'credit' >> > >> > elseif. do. >> > >> > goals=.goals,<'ok' >> > >> > end. >> > >> > end. >> > >> > goals >> > >> > ) >> > >> > tbl ,. ( applycredits 0 ) >> > >> > >> > +---+---+---+---------+ >> > >> > |seq|key|qty|goal | >> > >> > +---+---+---+---------+ >> > >> > |1 |1 |1 |cancelled| >> > >> > +---+---+---+---------+ >> > >> > |2 |1 |_1 |credit | >> > >> > +---+---+---+---------+ >> > >> > |3 |1 |1 |ok | >> > >> > +---+---+---+---------+ >> > >> > |4 |2 |1 |cancelled| >> > >> > +---+---+---+---------+ >> > >> > |5 |2 |1 |cancelled| >> > >> > +---+---+---+---------+ >> > >> > |6 |2 |3 |ok | >> > >> > +---+---+---+---------+ >> > >> > |7 |2 |_1 |credit | >> > >> > +---+---+---+---------+ >> > >> > |8 |2 |_1 |credit | >> > >> > +---+---+---+---------+ >> > >> > |9 |3 |5 |ok | >> > >> > +---+---+---+---------+ >> > >> > |10 |3 |_2 |ok | >> > >> > +---+---+---+---------+ >> > >> > |11 |3 |2 |ok | >> > >> > +---+---+---+---------+ >> > >> > >> > >> > (cutLF goals) -: ( applycredits 0 ) >> > >> > 1 >> > >> > >> > thanks for any input >> > ---------------------------------------------------------------------- >> > For information about J forums see http://www.jsoftware.com/forums.htm >> ---------------------------------------------------------------------- >> For information about J forums see http://www.jsoftware.com/forums.htm >> > > ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
