Hi all!

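I created a percentage split trick: round every share down, then hand the leftover items, one each, to the parts with the largest fractional remainders.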

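a=:50                  NB. number of items to distribute
b=: 0.35 0.36 0.29     NB. shares as fractions; they sum to 1
[r=:b*a                NB. exact share of each part
NB. 17.5 18 14.5
[rr=: <.!.0 r          NB. round each share down, with zero comparison tolerance
NB. 17 18 14
[d=:a-+/rr             NB. items left over after rounding down
NB. 1
[s=:\: r-rr            NB. indices ordered by largest fractional part first
NB. 0 2 1
[f=:(/:s){($s){.d$1    NB. one extra item for each of the d largest fractional parts
NB. 1 0 0
+/rr+f                 NB. check: the parts add up to a again
NB. 50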
NB. Distribute items according to percentages.
NB. Left argument: Array of fractions (percentages divided by 100). Sum must be 1.
NB. Right argument: Number of items to distribute according to percentages.
NB. Result: Array of integers, distributed according to percentages.
Distribute=: 4 : 'rr+(/:s){($s=.\:r-rr){.(d=.y-+/rr=.<.!.0 r=.x*y)$1'
b Distribute 0
NB. 0 0 0
b Distribute 1
NB. 0 1 0
b Distribute 2
NB. 1 1 0
b Distribute 3
NB. 1 1 1
b Distribute 4
NB. 1 2 1
b Distribute 5
NB. 2 2 1
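
For anyone who finds the one-liner dense, the same idea can be sketched as a multi-line explicit verb. This is only an illustration: the name DistributeLR is hypothetical, and building the mask with e. is a substitute for the grade-and-reorder step in Distribute. The approach is what is usually called the largest-remainder method.

```j
NB. Largest-remainder split, written out step by step.
NB. x: fractions that sum to 1.  y: number of items to distribute.
NB. DistributeLR is a hypothetical name, not part of the original post.
DistributeLR=: 4 : 0
  r=. x*y                        NB. exact (fractional) shares
  rr=. <.!.0 r                   NB. every share rounded down
  d=. y - +/rr                   NB. items still unassigned
  rr + (i.#r) e. d {. \: r-rr    NB. one extra for the d largest remainders
)

0.35 0.36 0.29 DistributeLR 50
NB. 18 18 14
```

The expression d {. \: r-rr picks the indices of the d largest fractional parts, and (i.#r) e. turns those indices into a 0/1 mask of the same length as r.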

Cheers,

Erling Hellenäs

On 2017-10-20 15:37, Erling Hellenäs wrote:
Hi all!

You want a trick to make the percentage split as good as possible?

Cheers,

Erling Hellenäs


On 2017-10-20 at 15:14, 'Jon Hough' via Programming wrote:
Hi,

What I am really after is a verb that splits by percentage. To give a concrete use case: I have a dataset which I wish to split into a training set, a validation set, and a test set.

I want 35% of the datapoints to go in the training set,
35% go in the validation set,
the rest go in the test set. (Just example numbers).


No need to worry about shuffling, randomizing, etc.; I am assuming the data is sufficiently random. As Raul said, I can simplify slightly by just using the size of the dataset as the right argument.
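
A minimal sketch of that use case, assuming the group sizes come from the Distribute verb posted in this thread. The helper splitBySizes is a hypothetical name introduced only for this illustration (it plays the same role as CutN quoted further down), and X=: i. 50 stands in for the real dataset.

```j
NB. Distribute is copied unchanged from the message at the top of this thread.
Distribute=: 4 : 'rr+(/:s){($s=.\:r-rr){.(d=.y-+/rr=.<.!.0 r=.x*y)$1'

NB. splitBySizes is a hypothetical helper.
NB. x: list of group sizes.  y: the data.  Result: one box per group.
splitBySizes=: 4 : 0
  starts=. }: 0 , +/\ x         NB. index where each group begins
  idx=. starts <@(+ i.)"0 x     NB. boxed list of indices for each group
  idx { each <y                 NB. select the items of y for each group
)

X=: i. 50                                NB. stand-in for the dataset
sizes=: 0.35 0.35 0.3 Distribute #X      NB. 18 17 15
parts=: sizes splitBySizes X             NB. training, validation, test
; # each parts
NB. 18 17 15
```

Because the sizes are computed first and always add up to #X, every data point lands in exactly one of the three boxes, and a group of size zero simply yields an empty box.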

--------------------------------------------
On Fri, 10/20/17, Erling Hellenäs <erl...@erlinghellenas.se> wrote:

  Subject: Re: [Jprogramming] Splitting an Array into several arrays
  To: programm...@jsoftware.com
  Date: Friday, October 20, 2017, 10:06 PM
  Hi all!
  A splitSubs with CutN could possibly look like this:
  splitSubsE=: ([ (([: # [) {. ]) ([: <. 0.5 + [: }: [ * [: # ]) ( [ , ([: # ]) - [: +/ [) ]) CutN ]
     (i.0) splitSubsE i.0
     (,55) splitSubsE ,5
  ┌─┐
  │5│
  └─┘
     split splitSubsE i.0
  ┌┬┬┐
  ││││
  └┴┴┘
     split splitSubsE i.1
  ┌┬┬─┐
  │││0│
  └┴┴─┘
     split splitSubsE i.2
  ┌─┬─┬┐
  │0│1││
  └─┴─┴┘
     split splitSubsE i.3
  ┌─┬─┬─┐
  │0│1│2│
  └─┴─┴─┘
     split splitSubsE i.4
  ┌─┬─┬───┐
  │0│1│2 3│
  └─┴─┴───┘
  Cheers
  Erling Hellenäs
  On 2017-10-20 at 14:11, Erling Hellenäs wrote:
  > Hi all!
  >
  > I looked for a version of Cut which takes the number of items in each
  > group as left argument. I didn't find one. I think it is what you most
  > often need, because it allows groups with zero length content.
  >
  > I made CutN as an illustration:
  >
  > CutN=:((# {. 0 , [: }: [: +/\ ])([: < [ + [: i. ])"0 ])@:[ {&.>/ [: < ]
  >
  >    (i.0) CutN i.0
  >
  >    (,0) CutN i.0
  > ┌┐
  > ││
  > └┘
  >    (,1) CutN 10+i.1
  > ┌──┐
  > │10│
  > └──┘
  >    0 2 CutN 10+i.2
  > ┌┬─────┐
  > ││10 11│
  > └┴─────┘
  >    2 5 0 CutN 10+i.7
  > ┌─────┬──────────────┬┐
  > │10 11│12 13 14 15 16││
  > └─────┴──────────────┴┘
  >    0 7 0 CutN 10+i.7
  > ┌┬────────────────────┬┐
  > ││10 11 12 13 14 15 16││
  > └┴────────────────────┴┘
  >
  > Cheers,
  >
  > Erling Hellenäs
  >
  >
  > On 2017-10-20 at 10:42, 'Jon Hough' via Programming wrote:
  >> The problem:
  >> Let X be an array.
  >> X=: i. 50 NB.  example
  >>
  >> Let 'split' be the percentages that each subarray takes from X,
  >> sequentially
  >> e.g
  >> split =: 0.35 0.35 0.3 NB. first array takes 35%, second subarray takes 35%, third takes 30%
  >> So in the end
  >>
  >> My solution
  >>
  >> splitSubs =: -.~&.>/\@:(i.&.>"0@:<"0)@:}.@:>.@:((+/\ - ])@:[ (* , ]) #@:])
  >>
  >> split splitSubs X
  >>
  >> This gives 3 boxed arrays. Each array holds the indices to take from X.
  >>
  >> There is a slight problem in that the first and second subarrays
  >> have different length, due to rounding error. I am not too bothered
  >> about that since, depending on the size of X and the percentages,
  >> this is unavoidable.
  >>
  >> Any more succinct, nicer solutions?
  >>
----------------------------------------------------------------------
  >> For information about J forums see http://www.jsoftware.com/forums.htm