Incidentally, for production code, I suggest starting by removing any sales not matched in returns and vice versa, so that the matching algorithm is applied only to potential matches.
On Wed, Apr 12, 2017 at 2:53 PM, chris burke <[email protected]> wrote: > Great. > > In case you need more complicated handling of the "gray area" > transactions, I believe they would be relatively few in number, so most of > the time you could do the matching efficiently, then check for any keys > with returns preceding sales. For those, setting aside the first such > return and repeating should clear them quickly. > > Timing should be well under 1 second for a million records. > > On Wed, Apr 12, 2017 at 1:57 PM, Joe Bogner <[email protected]> wrote: > >> Just for completeness, I added a line that incorporates the sequence check >> into the cancel logic. Works great >> >> NB. hui progressive index >> NB. http://code.jsoftware.com/wiki/Essays/Progressive_Index-Of >> oc=: i.~ (] - {) /:@/: >> pi=: #@[ ({. i.&(,.oc) }.) [ i. , >> >> NB. argument is 3-col table of seq,key,qty >> NB. result is the unmatched transactions >> matchtrans=: 3 : 0 >> msk=. 0<{:"1 y >> sales=. msk#y >> returns=. (-.msk)#y >> ndx=. (}."1 sales) pi | }."1 returns >> cancels=. ndx<#sales >> NB. ensure cancel is after sale >> cancels =. cancels *. (({."1 (<<(cancels)#ndx){sales) < ({."1 >> (cancels#returns))) >> ((<<<cancels#ndx){sales),(-.cancels)#returns >> ) >> >> >> On Wed, Apr 12, 2017 at 4:14 PM, Joe Bogner <[email protected]> wrote: >> >> > Chris, this looks promising. Thanks for sharing. It's nearly instant on >> a >> > million rows. >> > >> > Which row had a return before a transaction? seq 10 was an example of a >> > partial return. The hypothetical customer returned 2 out of the 5 >> purchased >> > prior. I added that example since technically per the original spec it >> > wouldn't be cancelled out in this pass. It's a gray area so I may be >> able >> > to use this approach, especially since I don't see how to incorporate >> the >> > time element into the progressive index. >> > >> > Thanks again >> > >> > >> > On Wed, Apr 12, 2017 at 3:52 PM, chris burke <[email protected]> >> wrote: >> > >> >> This might be done by comparing matrices of sales and returns. The >> >> function >> >> below seems to be close to what you want. It doesn't exactly match your >> >> example, but your example has cases where returns are made before the >> >> transactions. Was this intentional? >> >> >> >> The code should run faster than a looping solution. >> >> >> >> Code: >> >> >> >> NB. hui progressive index >> >> NB. http://code.jsoftware.com/wiki/Essays/Progressive_Index-Of >> >> oc=: i.~ (] - {) /:@/: >> >> pi=: #@[ ({. i.&(,.oc) }.) [ i. , >> >> >> >> NB. argument is 3-col table of seq,key,qty >> >> NB. result is the unmatched transactions >> >> matchtrans=: 3 : 0 >> >> msk=. 0<{:"1 y >> >> sales=. msk#y >> >> returns=. (-.msk)#y >> >> ndx=. (}."1 sales) pi | }."1 returns >> >> cancels=. ndx<#sales >> >> ((<<<cancels#ndx){sales),(-.cancels)#returns >> >> ) >> >> >> >> Example: >> >> >> >> dat=: ".;._2 (0 : 0) >> >> 1 1 1 >> >> 2 1 _1 >> >> 3 1 1 >> >> 4 2 1 >> >> 5 2 1 >> >> 6 2 3 >> >> 7 2 _1 >> >> 8 2 _1 >> >> 9 3 5 >> >> 10 3 _2 >> >> 11 3 2 >> >> ) >> >> >> >> matchtrans dat >> >> 3 1 1 >> >> 6 2 3 >> >> 9 3 5 >> >> >> >> >> >> On Wed, Apr 12, 2017 at 9:35 AM, Joe Bogner <[email protected]> >> wrote: >> >> >> >> > I have a problem I'm trying to solve in different languages. I have a >> >> > solution in SQL and also in kdb which largely resembles the SQL >> >> solution. >> >> > I'm curious what a J solution would look like. More specifically, I'm >> >> > interested in picking the brains of others here to see if this type >> of >> >> > problem can be solved without looping (some form of scan?). >> >> > >> >> > EDIT: Initially I wrote this up thinking the J solution would >> difficult, >> >> > but it was actually fairly straightforward -- about 15 minutes, but >> >> still >> >> > would like to see if there are alternatives. If nothing else, maybe >> an >> >> > interesting problem to share. >> >> > >> >> > Example data: >> >> > >> >> > A store has a transaction log with a sequence for each transaction. >> The >> >> > transaction log records a key for a unique customer/item combination. >> >> The >> >> > transaction log records how many units were purchased or returned. >> >> > >> >> > Goal: >> >> > Attempt to match up related transactions and cancel out instances >> when >> >> the >> >> > customer/item combination is returned at the same quantity as a >> previous >> >> > transaction >> >> > >> >> > Examples: >> >> > >> >> > Joe buys 1 blue pen, which is defective, then returns the 1 defective >> >> blue >> >> > pen, then buys another blue pen. EXPECTED: cancel out first two >> >> > transactions and leave the the last one for 1 pen >> >> > >> >> > Bob buys 2 red pens in two separate transactions. He then buys 3 >> more. >> >> He >> >> > returns the first two purchases as two separate return transactions. >> >> > EXPECTED: cancel out all transactions except the one for qty 3 >> >> > >> >> > Jane buys 5 purple pens and subsequently returns two of them. She >> buys >> >> two >> >> > more. EXPECTED: No transactions match exactly, so nothing is >> cancelled >> >> out >> >> > >> >> > >> >> > Data: >> >> > >> >> > data=: 0 : 0 >> >> > seq key qty >> >> > 1 1 1 >> >> > 2 1 _1 >> >> > 3 1 1 >> >> > 4 2 1 >> >> > 5 2 1 >> >> > 6 2 3 >> >> > 7 2 _1 >> >> > 8 2 _1 >> >> > 9 3 5 >> >> > 10 3 _2 >> >> > 11 3 2 >> >> > ) >> >> > tbl =: ,. ' ' cut every cutLF data >> >> > 'seqs keys qtys' =: |: ". every }. tbl >> >> > >> >> > >> >> > Goal: >> >> > >> >> > goals =: 0 : 0 >> >> > >> >> > goal >> >> > >> >> > cancelled >> >> > >> >> > credit >> >> > >> >> > ok >> >> > >> >> > cancelled >> >> > >> >> > cancelled >> >> > >> >> > ok >> >> > >> >> > credit >> >> > >> >> > credit >> >> > >> >> > ok >> >> > >> >> > ok >> >> > >> >> > ok >> >> > >> >> > ) >> >> > >> >> > >> >> > >> >> > >> >> > tbl,.(cutLF goals) >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |seq|key|qty|goal | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |1 |1 |1 |cancelled| >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |2 |1 |_1 |credit | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |3 |1 |1 |ok | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |4 |2 |1 |cancelled| >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |5 |2 |1 |cancelled| >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |6 |2 |3 |ok | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |7 |2 |_1 |credit | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |8 |2 |_1 |credit | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |9 |3 |5 |ok | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |10 |3 |_2 |ok | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |11 |3 |2 |ok | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > >> >> > >> >> > One approach: >> >> > >> >> > applycredits =: 3 : 0 >> >> > >> >> > goals=.(<'goal') >> >> > >> >> > creditids=.0 >> >> > >> >> > for_i. (i. # seqs) do. >> >> > >> >> > key=.i{keys >> >> > >> >> > seq=.i{seqs >> >> > >> >> > qty=.i{qtys >> >> > >> >> > nextcredit =.| {. qtys #~ ((key=keys)*(seqs>seq)*(qtys<0)) >> >> > >> >> > if. nextcredit = qty do. >> >> > >> >> > goals=.goals,<'cancelled' >> >> > >> >> > creditids =. creditids, seqs #~ ((key=keys)*(seqs>seq)*(qtys<0)) >> >> > >> >> > elseif. creditids e.~ seq do. >> >> > >> >> > goals=.goals,<'credit' >> >> > >> >> > elseif. do. >> >> > >> >> > goals=.goals,<'ok' >> >> > >> >> > end. >> >> > >> >> > end. >> >> > >> >> > goals >> >> > >> >> > ) >> >> > >> >> > tbl ,. ( applycredits 0 ) >> >> > >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |seq|key|qty|goal | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |1 |1 |1 |cancelled| >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |2 |1 |_1 |credit | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |3 |1 |1 |ok | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |4 |2 |1 |cancelled| >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |5 |2 |1 |cancelled| >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |6 |2 |3 |ok | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |7 |2 |_1 |credit | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |8 |2 |_1 |credit | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |9 |3 |5 |ok | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |10 |3 |_2 |ok | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > |11 |3 |2 |ok | >> >> > >> >> > +---+---+---+---------+ >> >> > >> >> > >> >> > >> >> > (cutLF goals) -: ( applycredits 0 ) >> >> > >> >> > 1 >> >> > >> >> > >> >> > thanks for any input >> >> > ------------------------------------------------------------ >> ---------- >> >> > For information about J forums see http://www.jsoftware.com/forum >> s.htm >> >> ---------------------------------------------------------------------- >> >> For information about J forums see http://www.jsoftware.com/forums.htm >> >> >> > >> > >> ---------------------------------------------------------------------- >> For information about J forums see http://www.jsoftware.com/forums.htm >> > > ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
