The first thing that should be done here is coming up with good names
for these operations.

But, I don't know enough to pick a really good name for the desired
result here...

Anyways, here's two equivalent expressions that I think do what you want:

allHands=: 5 comb 52
HCs=: 4 comb 52
HCmask=: (2 comb 4) e."1~ i.4

someNdx=: I.+./HCmask*/ .=+./"1({.HCs)=/allHands
someIndex=: 10{.I.+./4=HCmask+/ .=+./"1({.HCs)=/allHands

(The shorter expression is about twice as fast as the longer
expression when I test them. However, I have not tried rephrasing to
replace the +./ ... =/ ... bit with an e. expression.)

   someNdx-:someIndex
1
   10{.someNdx
2304 2305 2306 2307 2308 2309 2310 2311 2312 2313
   10{.someNdx{allHands
0 1 4 5  6
0 1 4 5  7
0 1 4 5  8
0 1 4 5  9
0 1 4 5 10
0 1 4 5 11
0 1 4 5 12
0 1 4 5 13
0 1 4 5 14
0 1 4 5 15

I hope this makes sense,

-- 
Raul

On Sun, Aug 8, 2021 at 4:42 PM Devon McCormick <[email protected]> wrote:
>
> Hi,
>
> I'm stuck on getting decent performance for the following problem.
> Say we have a list of all 5-card hands from a deck of 52 cards:
>    load '~addons/stats/base/combinatorial.ijs'
>    $allHands=. 5 comb 52
> 2598960 5
>
> We also have a list of all 4-card combinations:
>    $HCs=. 4 comb 52
> 270725 4
>
> I want to find all hands possible using any two cards from a given row of *HCs
> *but excluding those with either of the other two cards from the same row.
>
> For example, for this randomly-chosen row of *HCs*, which hands include the
> first two but exclude the second two?
>    (]{~[:?#) HCs
> 15 22 33 44
>    (#,+/) include=. 2=+/"1 allHands e."1 ] *15 22*
> 2598960 19600
>    (#,+/) exclude=. 0=+/"1 allHands e."1 ] *33 44*
> 2598960 480200
>
> So there are 17,296 hands we want:
>    +/include*.exclude
> 17296
>
> Check five at random to verify that these look correct:
>    ]rcix=. (]{~[:5&?#) I. include*.exclude
> 1002358 1165504 2176134 1960355 56685
>    rcix{allHands
>  4 15 22 37 39
>  5 15 22 34 45
> 15 18 20 22 39
> 12 15 22 27 39
>  0  4  6 15 22
>
> These look good because all include (15,22) but exclude any of  (33,44).
>
> However, this is only one possibility for this single row of *HCs* but we
> have six variations for this row based on choosing all possible pairs of
> indexes into the row of *HCs* to include,
>    |:inclIx=. 2 comb 4
> 0 0 0 1 1 2
> 1 2 3 2 3 3
>
> And six corresponding pairs to exclude (indexes into the row of *HCs*):
>    |:exclIx=. (i.4)-."1 ] 2 comb 4
> 2 1 1 0 0 0
> 3 3 2 3 2 1
>
> In the above example, we did the one of the six corresponding to *inclIx *of
> (0,1) and *exclIx *of (2,3).  Is there a way to do all six sets for each
> row of *HCs* compared to each row of *allHands*?
>
> We might start by reducing the sizes of our arrays by making them character
> instead of integer:
>    allHands=. allHands{a. [ HCs=. HCs{a.
>
> However, we quickly see that we run out of memory for the expression we
> would like to write:
>    include=. 2=+/"1 allHands e."1 / inclIx{"1 / HCs
> |out of memory
> |   include=.2=+/"1 allHands     e."1/inclIx{"1/HCs
>
> A little arithmetic shows us the extent of the problem:
>    (],*/) #&>(,/inclIx{"1/HCs);<allHands
> 1624350 2598960 4221620676000
>    10^.1624350*2598960
> 12.6255
>
> So we have a 4-trillion-element intermediate result which is a bit much.
>
> Our immediate thought might be to do this in pieces by selecting successive
> portions of the right-hand part (*HCs*) of this expression.  We would like
> to work on integral pieces for neatness, so we look at the prime factors of
> the size of *HCs* and select the product of a couple of them to be the
> "chunk size":
>    q:#HCs
> 5 5 7 7 13 17
>    ixs=. (ctr*chsz)+i.chsz [ ctr=. 0 [ chsz=. */5 5
>
> [The expression for the next chunk would be
>    ixs=. (ctr*chsz)+i.chsz [ ctr=. >:ctr
> and so on.]
>
> Timing how long one piece takes and scoring the results by simply adding up
> their indexes into *allHands*, we get this:
>    6!:2 'sel=. (2=+/"1 allHands e."1 / inclIx{"1 / ixs{HCs) *. 0=+/"1
> allHands e."1 / exclIx{"1 / ixs{HCs'
> 11.5105
>    scores=. 0$~#HCs    NB. Initialize scores vector
>    6!:2 'theseScores=. +/&>I.&.><"1 |:+./"1 ] 0 2 1|:sel'
> 0.687056
>    6!:2 'scores=. (theseScores) ixs}scores'
> 6.4e_6
>    +/11.5105 0.687056 6.4e_6      NB. Total time per chunk
> 12.1976
>    12.1976*(#ixs)%~#HCs           NB. Total estimated time
> 132088
>    0 60 60#:12.1976*(#ixs)%~#HCs  NB. Estimated time: h m s
> 36 41 27.8104
>
> So we should be able to do it this way in about 36 hours.  Can someone
> think of a faster method to accomplish this?  The memory requirements seem
> modest enough that I should be able to run five to ten processes
> simultaneously on this to reduce the overall time but is there a way to
> speed up the basic selection?
>
> Thanks,
>
> Devon
>
>
> --
>
> Devon McCormick, CFA
>
> Quantitative Consultant
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to