Hi Bo, This is cool. As for the way you suggest using it here, isn't it equivalent to (without the first six rows of your data):
(~.@[ ,. +//.)/@|: ? Louis > On 12 Apr 2017, at 21:57, 'Bo Jacoby' via Programming > <[email protected]> wrote: > > Hi Joe! > My favorite datastructure is ORDINAL FRACTIONS - the algebra of data > > | > | > | > | | | > > | > > | > | > | | > ORDINAL FRACTIONS - the algebra of data > This paper was submitted to the 10th World Computer Congress, IFIP 1986 > conference, but rejected by the referee.... | | > > | > > | > > > Your data are coded like this > 10 Joe > 20 Bob > 30 Jane > 01 blue > 02 red > 03 purple > 11 1 > 11 -1 > 11 1 > 22 1 > 22 1 > 22 3 > 22 -1 > 22 -1 > 33 5 > 33 -2 > 33 2 > (Written with double CRs because the mail program has a history of deleting > my CRs). > Summation gives the result > 10 Joe > 20 Bob > 30 Jane > 01 blue > 02 red > 03 purple > 11 1 > 22 3 > > 33 5 > I have not done the summation in J, but I'd like to do it. > Perhaps this helps you. > Bo. > > > > Den 0:04 torsdag den 13. april 2017 skrev chris burke > <[email protected]>: > > > Incidentally, for production code, I suggest starting by removing any sales > not matched in returns and vice versa, so that the matching algorithm is > applied only to potential matches. > >> On Wed, Apr 12, 2017 at 2:53 PM, chris burke <[email protected]> wrote: >> >> Great. >> >> In case you need more complicated handling of the "gray area" >> transactions, I believe they would be relatively few in number, so most of >> the time you could do the matching efficiently, then check for any keys >> with returns preceding sales. For those, setting aside the first such >> return and repeating should clear them quickly. >> >> Timing should be well under 1 second for a million records. >> >>> On Wed, Apr 12, 2017 at 1:57 PM, Joe Bogner <[email protected]> wrote: >>> >>> Just for completeness, I added a line that incorporates the sequence check >>> into the cancel logic. Works great >>> >>> NB. hui progressive index >>> NB. http://code.jsoftware.com/wiki/Essays/Progressive_Index-Of >>> oc=: i.~ (] - {) /:@/: >>> pi=: #@[ ({. i.&(,.oc) }.) [ i. , >>> >>> NB. argument is 3-col table of seq,key,qty >>> NB. result is the unmatched transactions >>> matchtrans=: 3 : 0 >>> msk=. 0<{:"1 y >>> sales=. msk#y >>> returns=. (-.msk)#y >>> ndx=. (}."1 sales) pi | }."1 returns >>> cancels=. ndx<#sales >>> NB. ensure cancel is after sale >>> cancels =. cancels *. (({."1 (<<(cancels)#ndx){sales) < ({."1 >>> (cancels#returns))) >>> ((<<<cancels#ndx){sales),(-.cancels)#returns >>> ) >>> >>> >>>> On Wed, Apr 12, 2017 at 4:14 PM, Joe Bogner <[email protected]> wrote: >>>> >>>> Chris, this looks promising. Thanks for sharing. It's nearly instant on >>> a >>>> million rows. >>>> >>>> Which row had a return before a transaction? seq 10 was an example of a >>>> partial return. The hypothetical customer returned 2 out of the 5 >>> purchased >>>> prior. I added that example since technically per the original spec it >>>> wouldn't be cancelled out in this pass. It's a gray area so I may be >>> able >>>> to use this approach, especially since I don't see how to incorporate >>> the >>>> time element into the progressive index. >>>> >>>> Thanks again >>>> >>>> >>>> On Wed, Apr 12, 2017 at 3:52 PM, chris burke <[email protected]> >>> wrote: >>>> >>>>> This might be done by comparing matrices of sales and returns. The >>>>> function >>>>> below seems to be close to what you want. It doesn't exactly match your >>>>> example, but your example has cases where returns are made before the >>>>> transactions. Was this intentional? >>>>> >>>>> The code should run faster than a looping solution. >>>>> >>>>> Code: >>>>> >>>>> NB. hui progressive index >>>>> NB. http://code.jsoftware.com/wiki/Essays/Progressive_Index-Of >>>>> oc=: i.~ (] - {) /:@/: >>>>> pi=: #@[ ({. i.&(,.oc) }.) [ i. , >>>>> >>>>> NB. argument is 3-col table of seq,key,qty >>>>> NB. result is the unmatched transactions >>>>> matchtrans=: 3 : 0 >>>>> msk=. 0<{:"1 y >>>>> sales=. msk#y >>>>> returns=. (-.msk)#y >>>>> ndx=. (}."1 sales) pi | }."1 returns >>>>> cancels=. ndx<#sales >>>>> ((<<<cancels#ndx){sales),(-.cancels)#returns >>>>> ) >>>>> >>>>> Example: >>>>> >>>>> dat=: ".;._2 (0 : 0) >>>>> 1 1 1 >>>>> 2 1 _1 >>>>> 3 1 1 >>>>> 4 2 1 >>>>> 5 2 1 >>>>> 6 2 3 >>>>> 7 2 _1 >>>>> 8 2 _1 >>>>> 9 3 5 >>>>> 10 3 _2 >>>>> 11 3 2 >>>>> ) >>>>> >>>>> matchtrans dat >>>>> 3 1 1 >>>>> 6 2 3 >>>>> 9 3 5 >>>>> >>>>> >>>>> On Wed, Apr 12, 2017 at 9:35 AM, Joe Bogner <[email protected]> >>> wrote: >>>>> >>>>>> I have a problem I'm trying to solve in different languages. I have a >>>>>> solution in SQL and also in kdb which largely resembles the SQL >>>>> solution. >>>>>> I'm curious what a J solution would look like. More specifically, I'm >>>>>> interested in picking the brains of others here to see if this type >>> of >>>>>> problem can be solved without looping (some form of scan?). >>>>>> >>>>>> EDIT: Initially I wrote this up thinking the J solution would >>> difficult, >>>>>> but it was actually fairly straightforward -- about 15 minutes, but >>>>> still >>>>>> would like to see if there are alternatives. If nothing else, maybe >>> an >>>>>> interesting problem to share. >>>>>> >>>>>> Example data: >>>>>> >>>>>> A store has a transaction log with a sequence for each transaction. >>> The >>>>>> transaction log records a key for a unique customer/item combination. >>>>> The >>>>>> transaction log records how many units were purchased or returned. >>>>>> >>>>>> Goal: >>>>>> Attempt to match up related transactions and cancel out instances >>> when >>>>> the >>>>>> customer/item combination is returned at the same quantity as a >>> previous >>>>>> transaction >>>>>> >>>>>> Examples: >>>>>> >>>>>> Joe buys 1 blue pen, which is defective, then returns the 1 defective >>>>> blue >>>>>> pen, then buys another blue pen. EXPECTED: cancel out first two >>>>>> transactions and leave the the last one for 1 pen >>>>>> >>>>>> Bob buys 2 red pens in two separate transactions. He then buys 3 >>> more. >>>>> He >>>>>> returns the first two purchases as two separate return transactions. >>>>>> EXPECTED: cancel out all transactions except the one for qty 3 >>>>>> >>>>>> Jane buys 5 purple pens and subsequently returns two of them. She >>> buys >>>>> two >>>>>> more. EXPECTED: No transactions match exactly, so nothing is >>> cancelled >>>>> out >>>>>> >>>>>> >>>>>> Data: >>>>>> >>>>>> data=: 0 : 0 >>>>>> seq key qty >>>>>> 1 1 1 >>>>>> 2 1 _1 >>>>>> 3 1 1 >>>>>> 4 2 1 >>>>>> 5 2 1 >>>>>> 6 2 3 >>>>>> 7 2 _1 >>>>>> 8 2 _1 >>>>>> 9 3 5 >>>>>> 10 3 _2 >>>>>> 11 3 2 >>>>>> ) >>>>>> tbl =: ,. ' ' cut every cutLF data >>>>>> 'seqs keys qtys' =: |: ". every }. tbl >>>>>> >>>>>> >>>>>> Goal: >>>>>> >>>>>> goals =: 0 : 0 >>>>>> >>>>>> goal >>>>>> >>>>>> cancelled >>>>>> >>>>>> credit >>>>>> >>>>>> ok >>>>>> >>>>>> cancelled >>>>>> >>>>>> cancelled >>>>>> >>>>>> ok >>>>>> >>>>>> credit >>>>>> >>>>>> credit >>>>>> >>>>>> ok >>>>>> >>>>>> ok >>>>>> >>>>>> ok >>>>>> >>>>>> ) >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> tbl,.(cutLF goals) >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |seq|key|qty|goal | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |1 |1 |1 |cancelled| >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |2 |1 |_1 |credit | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |3 |1 |1 |ok | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |4 |2 |1 |cancelled| >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |5 |2 |1 |cancelled| >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |6 |2 |3 |ok | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |7 |2 |_1 |credit | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |8 |2 |_1 |credit | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |9 |3 |5 |ok | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |10 |3 |_2 |ok | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |11 |3 |2 |ok | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> >>>>>> >>>>>> One approach: >>>>>> >>>>>> applycredits =: 3 : 0 >>>>>> >>>>>> goals=.(<'goal') >>>>>> >>>>>> creditids=.0 >>>>>> >>>>>> for_i. (i. # seqs) do. >>>>>> >>>>>> key=.i{keys >>>>>> >>>>>> seq=.i{seqs >>>>>> >>>>>> qty=.i{qtys >>>>>> >>>>>> nextcredit =.| {. qtys #~ ((key=keys)*(seqs>seq)*(qtys<0)) >>>>>> >>>>>> if. nextcredit = qty do. >>>>>> >>>>>> goals=.goals,<'cancelled' >>>>>> >>>>>> creditids =. creditids, seqs #~ ((key=keys)*(seqs>seq)*(qtys<0)) >>>>>> >>>>>> elseif. creditids e.~ seq do. >>>>>> >>>>>> goals=.goals,<'credit' >>>>>> >>>>>> elseif. do. >>>>>> >>>>>> goals=.goals,<'ok' >>>>>> >>>>>> end. >>>>>> >>>>>> end. >>>>>> >>>>>> goals >>>>>> >>>>>> ) >>>>>> >>>>>> tbl ,. ( applycredits 0 ) >>>>>> >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |seq|key|qty|goal | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |1 |1 |1 |cancelled| >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |2 |1 |_1 |credit | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |3 |1 |1 |ok | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |4 |2 |1 |cancelled| >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |5 |2 |1 |cancelled| >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |6 |2 |3 |ok | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |7 |2 |_1 |credit | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |8 |2 |_1 |credit | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |9 |3 |5 |ok | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |10 |3 |_2 |ok | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> |11 |3 |2 |ok | >>>>>> >>>>>> +---+---+---+---------+ >>>>>> >>>>>> >>>>>> >>>>>> (cutLF goals) -: ( applycredits 0 ) >>>>>> >>>>>> 1 >>>>>> >>>>>> >>>>>> thanks for any input >>>>>> ------------------------------------------------------------ >>> ---------- >>>>>> For information about J forums see http://www.jsoftware.com/forum >>> s.htm >>>>> ---------------------------------------------------------------------- >>>>> For information about J forums see http://www.jsoftware.com/forums.htm >>>>> >>>> >>>> >>> ---------------------------------------------------------------------- >>> For information about J forums see http://www.jsoftware.com/forums.htm >>> >> >> > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm > > > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
