That's a really good idea -- you should check out and meld this with the prior work in Stern-Brocot trees. They're basically ordinal fractions which allow one to iterate in an orderly manner through all the ranges of rationals within a given range.
http://arxiv.org/abs/cs/0401014 (uses the term "Farey fractions", which is one of the ways of traversing the tree; another term is "continued fractions"). A related term is "Calkin-Wilf tree". -Wm On Wed, Apr 12, 2017 at 6:57 PM 'Bo Jacoby' via Programming < [email protected]> wrote: > Hi Joe! > My favorite datastructure is ORDINAL FRACTIONS - the algebra of data > > | > | > | > | | | > > | > > | > | > | | > ORDINAL FRACTIONS - the algebra of data > This paper was submitted to the 10th World Computer Congress, IFIP 1986 > conference, but rejected by the referee.... | | > > | > > | > > > Your data are coded like this > 10 Joe > 20 Bob > 30 Jane > 01 blue > 02 red > 03 purple > 11 1 > 11 -1 > 11 1 > 22 1 > 22 1 > 22 3 > 22 -1 > 22 -1 > 33 5 > 33 -2 > 33 2 > (Written with double CRs because the mail program has a history of > deleting my CRs). > Summation gives the result > 10 Joe > 20 Bob > 30 Jane > 01 blue > 02 red > 03 purple > 11 1 > 22 3 > > 33 5 > I have not done the summation in J, but I'd like to do it. > Perhaps this helps you. > Bo. > > > > Den 0:04 torsdag den 13. april 2017 skrev chris burke < > [email protected]>: > > > Incidentally, for production code, I suggest starting by removing any > sales > not matched in returns and vice versa, so that the matching algorithm is > applied only to potential matches. > > On Wed, Apr 12, 2017 at 2:53 PM, chris burke <[email protected]> wrote: > > > Great. > > > > In case you need more complicated handling of the "gray area" > > transactions, I believe they would be relatively few in number, so most > of > > the time you could do the matching efficiently, then check for any keys > > with returns preceding sales. For those, setting aside the first such > > return and repeating should clear them quickly. > > > > Timing should be well under 1 second for a million records. > > > > On Wed, Apr 12, 2017 at 1:57 PM, Joe Bogner <[email protected]> wrote: > > > >> Just for completeness, I added a line that incorporates the sequence > check > >> into the cancel logic. Works great > >> > >> NB. hui progressive index > >> NB. http://code.jsoftware.com/wiki/Essays/Progressive_Index-Of > >> oc=: i.~ (] - {) /:@/: > >> pi=: #@[ ({. i.&(,.oc) }.) [ i. , > >> > >> NB. argument is 3-col table of seq,key,qty > >> NB. result is the unmatched transactions > >> matchtrans=: 3 : 0 > >> msk=. 0<{:"1 y > >> sales=. msk#y > >> returns=. (-.msk)#y > >> ndx=. (}."1 sales) pi | }."1 returns > >> cancels=. ndx<#sales > >> NB. ensure cancel is after sale > >> cancels =. cancels *. (({."1 (<<(cancels)#ndx){sales) < ({."1 > >> (cancels#returns))) > >> ((<<<cancels#ndx){sales),(-.cancels)#returns > >> ) > >> > >> > >> On Wed, Apr 12, 2017 at 4:14 PM, Joe Bogner <[email protected]> > wrote: > >> > >> > Chris, this looks promising. Thanks for sharing. It's nearly instant > on > >> a > >> > million rows. > >> > > >> > Which row had a return before a transaction? seq 10 was an example of > a > >> > partial return. The hypothetical customer returned 2 out of the 5 > >> purchased > >> > prior. I added that example since technically per the original spec it > >> > wouldn't be cancelled out in this pass. It's a gray area so I may be > >> able > >> > to use this approach, especially since I don't see how to incorporate > >> the > >> > time element into the progressive index. > >> > > >> > Thanks again > >> > > >> > > >> > On Wed, Apr 12, 2017 at 3:52 PM, chris burke <[email protected]> > >> wrote: > >> > > >> >> This might be done by comparing matrices of sales and returns. The > >> >> function > >> >> below seems to be close to what you want. It doesn't exactly match > your > >> >> example, but your example has cases where returns are made before the > >> >> transactions. Was this intentional? > >> >> > >> >> The code should run faster than a looping solution. > >> >> > >> >> Code: > >> >> > >> >> NB. hui progressive index > >> >> NB. http://code.jsoftware.com/wiki/Essays/Progressive_Index-Of > >> >> oc=: i.~ (] - {) /:@/: > >> >> pi=: #@[ ({. i.&(,.oc) }.) [ i. , > >> >> > >> >> NB. argument is 3-col table of seq,key,qty > >> >> NB. result is the unmatched transactions > >> >> matchtrans=: 3 : 0 > >> >> msk=. 0<{:"1 y > >> >> sales=. msk#y > >> >> returns=. (-.msk)#y > >> >> ndx=. (}."1 sales) pi | }."1 returns > >> >> cancels=. ndx<#sales > >> >> ((<<<cancels#ndx){sales),(-.cancels)#returns > >> >> ) > >> >> > >> >> Example: > >> >> > >> >> dat=: ".;._2 (0 : 0) > >> >> 1 1 1 > >> >> 2 1 _1 > >> >> 3 1 1 > >> >> 4 2 1 > >> >> 5 2 1 > >> >> 6 2 3 > >> >> 7 2 _1 > >> >> 8 2 _1 > >> >> 9 3 5 > >> >> 10 3 _2 > >> >> 11 3 2 > >> >> ) > >> >> > >> >> matchtrans dat > >> >> 3 1 1 > >> >> 6 2 3 > >> >> 9 3 5 > >> >> > >> >> > >> >> On Wed, Apr 12, 2017 at 9:35 AM, Joe Bogner <[email protected]> > >> wrote: > >> >> > >> >> > I have a problem I'm trying to solve in different languages. I > have a > >> >> > solution in SQL and also in kdb which largely resembles the SQL > >> >> solution. > >> >> > I'm curious what a J solution would look like. More specifically, > I'm > >> >> > interested in picking the brains of others here to see if this type > >> of > >> >> > problem can be solved without looping (some form of scan?). > >> >> > > >> >> > EDIT: Initially I wrote this up thinking the J solution would > >> difficult, > >> >> > but it was actually fairly straightforward -- about 15 minutes, but > >> >> still > >> >> > would like to see if there are alternatives. If nothing else, maybe > >> an > >> >> > interesting problem to share. > >> >> > > >> >> > Example data: > >> >> > > >> >> > A store has a transaction log with a sequence for each transaction. > >> The > >> >> > transaction log records a key for a unique customer/item > combination. > >> >> The > >> >> > transaction log records how many units were purchased or returned. > >> >> > > >> >> > Goal: > >> >> > Attempt to match up related transactions and cancel out instances > >> when > >> >> the > >> >> > customer/item combination is returned at the same quantity as a > >> previous > >> >> > transaction > >> >> > > >> >> > Examples: > >> >> > > >> >> > Joe buys 1 blue pen, which is defective, then returns the 1 > defective > >> >> blue > >> >> > pen, then buys another blue pen. EXPECTED: cancel out first two > >> >> > transactions and leave the the last one for 1 pen > >> >> > > >> >> > Bob buys 2 red pens in two separate transactions. He then buys 3 > >> more. > >> >> He > >> >> > returns the first two purchases as two separate return > transactions. > >> >> > EXPECTED: cancel out all transactions except the one for qty 3 > >> >> > > >> >> > Jane buys 5 purple pens and subsequently returns two of them. She > >> buys > >> >> two > >> >> > more. EXPECTED: No transactions match exactly, so nothing is > >> cancelled > >> >> out > >> >> > > >> >> > > >> >> > Data: > >> >> > > >> >> > data=: 0 : 0 > >> >> > seq key qty > >> >> > 1 1 1 > >> >> > 2 1 _1 > >> >> > 3 1 1 > >> >> > 4 2 1 > >> >> > 5 2 1 > >> >> > 6 2 3 > >> >> > 7 2 _1 > >> >> > 8 2 _1 > >> >> > 9 3 5 > >> >> > 10 3 _2 > >> >> > 11 3 2 > >> >> > ) > >> >> > tbl =: ,. ' ' cut every cutLF data > >> >> > 'seqs keys qtys' =: |: ". every }. tbl > >> >> > > >> >> > > >> >> > Goal: > >> >> > > >> >> > goals =: 0 : 0 > >> >> > > >> >> > goal > >> >> > > >> >> > cancelled > >> >> > > >> >> > credit > >> >> > > >> >> > ok > >> >> > > >> >> > cancelled > >> >> > > >> >> > cancelled > >> >> > > >> >> > ok > >> >> > > >> >> > credit > >> >> > > >> >> > credit > >> >> > > >> >> > ok > >> >> > > >> >> > ok > >> >> > > >> >> > ok > >> >> > > >> >> > ) > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > tbl,.(cutLF goals) > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |seq|key|qty|goal | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |1 |1 |1 |cancelled| > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |2 |1 |_1 |credit | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |3 |1 |1 |ok | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |4 |2 |1 |cancelled| > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |5 |2 |1 |cancelled| > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |6 |2 |3 |ok | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |7 |2 |_1 |credit | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |8 |2 |_1 |credit | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |9 |3 |5 |ok | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |10 |3 |_2 |ok | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |11 |3 |2 |ok | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > > >> >> > > >> >> > One approach: > >> >> > > >> >> > applycredits =: 3 : 0 > >> >> > > >> >> > goals=.(<'goal') > >> >> > > >> >> > creditids=.0 > >> >> > > >> >> > for_i. (i. # seqs) do. > >> >> > > >> >> > key=.i{keys > >> >> > > >> >> > seq=.i{seqs > >> >> > > >> >> > qty=.i{qtys > >> >> > > >> >> > nextcredit =.| {. qtys #~ ((key=keys)*(seqs>seq)*(qtys<0)) > >> >> > > >> >> > if. nextcredit = qty do. > >> >> > > >> >> > goals=.goals,<'cancelled' > >> >> > > >> >> > creditids =. creditids, seqs #~ ((key=keys)*(seqs>seq)*(qtys<0)) > >> >> > > >> >> > elseif. creditids e.~ seq do. > >> >> > > >> >> > goals=.goals,<'credit' > >> >> > > >> >> > elseif. do. > >> >> > > >> >> > goals=.goals,<'ok' > >> >> > > >> >> > end. > >> >> > > >> >> > end. > >> >> > > >> >> > goals > >> >> > > >> >> > ) > >> >> > > >> >> > tbl ,. ( applycredits 0 ) > >> >> > > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |seq|key|qty|goal | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |1 |1 |1 |cancelled| > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |2 |1 |_1 |credit | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |3 |1 |1 |ok | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |4 |2 |1 |cancelled| > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |5 |2 |1 |cancelled| > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |6 |2 |3 |ok | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |7 |2 |_1 |credit | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |8 |2 |_1 |credit | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |9 |3 |5 |ok | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |10 |3 |_2 |ok | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > |11 |3 |2 |ok | > >> >> > > >> >> > +---+---+---+---------+ > >> >> > > >> >> > > >> >> > > >> >> > (cutLF goals) -: ( applycredits 0 ) > >> >> > > >> >> > 1 > >> >> > > >> >> > > >> >> > thanks for any input > >> >> > ------------------------------------------------------------ > >> ---------- > >> >> > For information about J forums see http://www.jsoftware.com/forum > >> s.htm > >> >> > ---------------------------------------------------------------------- > >> >> For information about J forums see > http://www.jsoftware.com/forums.htm > >> >> > >> > > >> > > >> ---------------------------------------------------------------------- > >> For information about J forums see http://www.jsoftware.com/forums.htm > >> > > > > > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm > > > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
