Hi Joe!
My favorite datastructure is ORDINAL FRACTIONS - the algebra of data 
  
|  
|   
|   
|   |    |

   |

  |
|  
|    |  
ORDINAL FRACTIONS - the algebra of data
 This paper was submitted to the 10th World Computer Congress, IFIP 1986 
conference, but rejected by the referee....  |   |

  |

  |

 
Your data are coded like this
10 Joe
20 Bob
30 Jane
01 blue
02 red
03 purple
11 1
11 -1
11 1
22 1
22 1
22 3
22 -1
22 -1
33 5
33 -2
33 2
(Written with double CRs because the mail program has a history of deleting my 
CRs).
Summation gives the result
10 Joe
20 Bob
30 Jane
01 blue
02 red
03 purple
11 1
22 3

33 5
I have not done the summation in J, but I'd like to do it.
Perhaps this helps you.
Bo.

 

    Den 0:04 torsdag den 13. april 2017 skrev chris burke 
<[email protected]>:
 

 Incidentally, for production code, I suggest starting by removing any sales
not matched in returns and vice versa, so that the matching algorithm is
applied only to potential matches.

On Wed, Apr 12, 2017 at 2:53 PM, chris burke <[email protected]> wrote:

> Great.
>
> In case you need more complicated handling of the "gray area"
> transactions, I believe they would be relatively few in number, so most of
> the time you could do the matching efficiently, then check for any keys
> with returns preceding sales. For those, setting aside the first such
> return and repeating should clear them quickly.
>
> Timing should be well under 1 second for a million records.
>
> On Wed, Apr 12, 2017 at 1:57 PM, Joe Bogner <[email protected]> wrote:
>
>> Just for completeness, I added a line that incorporates the sequence check
>> into the cancel logic. Works great
>>
>> NB. hui progressive index
>> NB. http://code.jsoftware.com/wiki/Essays/Progressive_Index-Of
>> oc=: i.~ (] - {) /:@/:
>> pi=: #@[ ({. i.&(,.oc) }.) [ i. ,
>>
>> NB. argument is 3-col table of seq,key,qty
>> NB. result is the unmatched transactions
>> matchtrans=: 3 : 0
>> msk=. 0<{:"1 y
>> sales=. msk#y
>> returns=. (-.msk)#y
>> ndx=. (}."1 sales) pi | }."1 returns
>> cancels=. ndx<#sales
>> NB. ensure cancel is after sale
>> cancels =. cancels *. (({."1 (<<(cancels)#ndx){sales) < ({."1
>> (cancels#returns)))
>> ((<<<cancels#ndx){sales),(-.cancels)#returns
>> )
>>
>>
>> On Wed, Apr 12, 2017 at 4:14 PM, Joe Bogner <[email protected]> wrote:
>>
>> > Chris, this looks promising. Thanks for sharing. It's nearly instant on
>> a
>> > million rows.
>> >
>> > Which row had a return before a transaction? seq 10 was an example of a
>> > partial return. The hypothetical customer returned 2 out of the 5
>> purchased
>> > prior. I added that example since technically per the original spec it
>> > wouldn't be cancelled out in this pass.  It's a gray area so I may be
>> able
>> > to use this approach, especially since I don't see how to incorporate
>> the
>> > time element into the progressive index.
>> >
>> > Thanks again
>> >
>> >
>> > On Wed, Apr 12, 2017 at 3:52 PM, chris burke <[email protected]>
>> wrote:
>> >
>> >> This might be done by comparing matrices of sales and returns. The
>> >> function
>> >> below seems to be close to what you want. It doesn't exactly match your
>> >> example, but your example has cases where returns are made before the
>> >> transactions. Was this intentional?
>> >>
>> >> The code should run faster than a looping solution.
>> >>
>> >> Code:
>> >>
>> >> NB. hui progressive index
>> >> NB. http://code.jsoftware.com/wiki/Essays/Progressive_Index-Of
>> >> oc=: i.~ (] - {) /:@/:
>> >> pi=: #@[ ({. i.&(,.oc) }.) [ i. ,
>> >>
>> >> NB. argument is 3-col table of seq,key,qty
>> >> NB. result is the unmatched transactions
>> >> matchtrans=: 3 : 0
>> >> msk=. 0<{:"1 y
>> >> sales=. msk#y
>> >> returns=. (-.msk)#y
>> >> ndx=. (}."1 sales) pi | }."1 returns
>> >> cancels=. ndx<#sales
>> >> ((<<<cancels#ndx){sales),(-.cancels)#returns
>> >> )
>> >>
>> >> Example:
>> >>
>> >>    dat=: ".;._2 (0 : 0)
>> >> 1 1 1
>> >> 2 1 _1
>> >> 3 1 1
>> >> 4 2 1
>> >> 5 2 1
>> >> 6 2 3
>> >> 7 2 _1
>> >> 8 2 _1
>> >> 9 3 5
>> >> 10 3 _2
>> >> 11 3 2
>> >> )
>> >>
>> >>    matchtrans dat
>> >> 3 1 1
>> >> 6 2 3
>> >> 9 3 5
>> >>
>> >>
>> >> On Wed, Apr 12, 2017 at 9:35 AM, Joe Bogner <[email protected]>
>> wrote:
>> >>
>> >> > I have a problem I'm trying to solve in different languages. I have a
>> >> > solution in SQL and also in kdb which largely resembles the SQL
>> >> solution.
>> >> > I'm curious what a J solution would look like. More specifically, I'm
>> >> > interested in picking the brains of others here to see if this type
>> of
>> >> > problem can be solved without looping (some form of scan?).
>> >> >
>> >> > EDIT: Initially I wrote this up thinking the J solution would
>> difficult,
>> >> > but it was actually fairly straightforward -- about 15 minutes, but
>> >> still
>> >> > would like to see if there are alternatives. If nothing else, maybe
>> an
>> >> > interesting problem to share.
>> >> >
>> >> > Example data:
>> >> >
>> >> > A store has a transaction log with a sequence for each transaction.
>> The
>> >> > transaction log records a key for a unique customer/item combination.
>> >> The
>> >> > transaction log records how many units were purchased or returned.
>> >> >
>> >> > Goal:
>> >> > Attempt to match up related transactions and cancel out instances
>> when
>> >> the
>> >> > customer/item combination is returned at the same quantity as a
>> previous
>> >> > transaction
>> >> >
>> >> > Examples:
>> >> >
>> >> > Joe buys 1 blue pen, which is defective, then returns the 1 defective
>> >> blue
>> >> > pen, then buys another blue pen. EXPECTED: cancel out first two
>> >> > transactions and leave the the last one for 1 pen
>> >> >
>> >> > Bob buys 2 red pens in two separate transactions. He then buys 3
>> more.
>> >> He
>> >> > returns the first two purchases as two separate return transactions.
>> >> > EXPECTED: cancel out all transactions except the one for qty 3
>> >> >
>> >> > Jane buys 5 purple pens and subsequently returns two of them. She
>> buys
>> >> two
>> >> > more. EXPECTED: No transactions match exactly, so nothing is
>> cancelled
>> >> out
>> >> >
>> >> >
>> >> > Data:
>> >> >
>> >> > data=: 0 : 0
>> >> > seq key qty
>> >> > 1 1 1
>> >> > 2 1 _1
>> >> > 3 1 1
>> >> > 4 2 1
>> >> > 5 2 1
>> >> > 6 2 3
>> >> > 7 2 _1
>> >> > 8 2 _1
>> >> > 9 3 5
>> >> > 10 3 _2
>> >> > 11 3 2
>> >> > )
>> >> > tbl =: ,. ' ' cut every cutLF data
>> >> > 'seqs keys qtys' =: |: ". every }. tbl
>> >> >
>> >> >
>> >> > Goal:
>> >> >
>> >> > goals =: 0 : 0
>> >> >
>> >> > goal
>> >> >
>> >> > cancelled
>> >> >
>> >> > credit
>> >> >
>> >> > ok
>> >> >
>> >> > cancelled
>> >> >
>> >> > cancelled
>> >> >
>> >> > ok
>> >> >
>> >> > credit
>> >> >
>> >> > credit
>> >> >
>> >> > ok
>> >> >
>> >> > ok
>> >> >
>> >> > ok
>> >> >
>> >> > )
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > tbl,.(cutLF goals)
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |seq|key|qty|goal |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |1 |1 |1 |cancelled|
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |2 |1 |_1 |credit |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |3 |1 |1 |ok |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |4 |2 |1 |cancelled|
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |5 |2 |1 |cancelled|
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |6 |2 |3 |ok |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |7 |2 |_1 |credit |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |8 |2 |_1 |credit |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |9 |3 |5 |ok |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |10 |3 |_2 |ok |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |11 |3 |2 |ok |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> >
>> >> >
>> >> > One approach:
>> >> >
>> >> > applycredits =: 3 : 0
>> >> >
>> >> > goals=.(<'goal')
>> >> >
>> >> > creditids=.0
>> >> >
>> >> > for_i. (i. # seqs) do.
>> >> >
>> >> >  key=.i{keys
>> >> >
>> >> >  seq=.i{seqs
>> >> >
>> >> >  qty=.i{qtys
>> >> >
>> >> >  nextcredit =.| {. qtys #~ ((key=keys)*(seqs>seq)*(qtys<0))
>> >> >
>> >> >  if. nextcredit = qty do.
>> >> >
>> >> >  goals=.goals,<'cancelled'
>> >> >
>> >> >  creditids =. creditids, seqs #~ ((key=keys)*(seqs>seq)*(qtys<0))
>> >> >
>> >> >  elseif. creditids e.~ seq do.
>> >> >
>> >> >    goals=.goals,<'credit'
>> >> >
>> >> >  elseif. do.
>> >> >
>> >> >    goals=.goals,<'ok'
>> >> >
>> >> > end.
>> >> >
>> >> > end.
>> >> >
>> >> > goals
>> >> >
>> >> > )
>> >> >
>> >> > tbl ,. ( applycredits 0 )
>> >> >
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |seq|key|qty|goal |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |1 |1 |1 |cancelled|
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |2 |1 |_1 |credit |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |3 |1 |1 |ok |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |4 |2 |1 |cancelled|
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |5 |2 |1 |cancelled|
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |6 |2 |3 |ok |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |7 |2 |_1 |credit |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |8 |2 |_1 |credit |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |9 |3 |5 |ok |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |10 |3 |_2 |ok |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> > |11 |3 |2 |ok |
>> >> >
>> >> > +---+---+---+---------+
>> >> >
>> >> >
>> >> >
>> >> > (cutLF goals) -: ( applycredits 0 )
>> >> >
>> >> > 1
>> >> >
>> >> >
>> >> > thanks for any input
>> >> > ------------------------------------------------------------
>> ----------
>> >> > For information about J forums see http://www.jsoftware.com/forum
>> s.htm
>> >> ----------------------------------------------------------------------
>> >> For information about J forums see http://www.jsoftware.com/forums.htm
>> >>
>> >
>> >
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
>>
>
>
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

   
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to