Ric, 

Here's a response to an old thread, which I took from my "things to think about"
collection:

group2=:(+/\@(* #) I. i.@#@]) </. ]
   0.35 0.3 0.3 group2 i.14
┌─────────┬─────────┬───────────┐
│0 1 2 3 4│5 6 7 8 9│10 11 12 13│
└─────────┴─────────┴───────────┘
   f=: 13 : '(x (] I.~ [: +/\ [ * [: # ]) y)</. y'
   0.35 0.3 0.3 f i.14
┌─────────┬─────────┬───────────┐
│0 1 2 3 4│5 6 7 8 9│10 11 12 13│
└─────────┴─────────┴───────────┘
   
   group2
(+/\@(* #) I. i.@#@]) </. ]
   f
] </.~ ] I.~ [: +/\ [ * [: # ]

I was surprised at how neat f turned out when I wrote an explicit definition of 
my version of your way to group the numbers.
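
A caveat worth checking, offered as an observation rather than something stated
in the thread: f passes the values of y to I., whereas group2 passes the
positions i.@#@]. The two agree for i.14 because values and positions coincide
there. The sketch below compares them on shifted data; f2 is a hypothetical
explicit variant that looks up positions, and the names same, n1, n2 and n3 are
only for illustration.

```j
group2=: (+/\@(* #) I. i.@#@]) </. ]
f=: 13 : '(x (] I.~ [: +/\ [ * [: # ]) y)</. y'
NB. f2 is hypothetical: it feeds the positions i. # y to I. instead of the values y
f2=: 4 : '((+/\ x * # y) I. i. # y) </. y'

same=: (0.35 0.3 0.3 group2 i.14) -: 0.35 0.3 0.3 f i.14   NB. 1: identical on i.14
n1=: # 0.35 0.3 0.3 group2 100+i.14   NB. 3 boxes: grouped by position
n2=: # 0.35 0.3 0.3 f 100+i.14        NB. 1 box: every value exceeds the last bound
n3=: # 0.35 0.3 0.3 f2 100+i.14       NB. 3 boxes again
```

If the data is always i.n, as in the examples here, the difference does not show.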

Linda


-----Original Message-----
From: Programming [mailto:[email protected]] On Behalf 
Of Ric Sherlock
Sent: Saturday, October 21, 2017 3:21 PM
To: Programming JForum <[email protected]>
Subject: Re: [Jprogramming] Splitting an Array into several arrays

Jon,
The following assigns each of the data points to a group, then boxes those 
groups using key.

   assignGrp=: +/\@(* #) I. i.@#@]
   group=: assignGrp </. ]
   0.35 0.35 0.3 group i.14
┌─────────┬─────────┬───────────┐
│0 1 2 3 4│5 6 7 8 9│10 11 12 13│
└─────────┴─────────┴───────────┘
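
The two steps can also be run separately to see what each contributes. This is
a minimal sketch; the names bounds, b, g and r are hypothetical and do not
appear in the thread.

```j
NB. bounds is the left tine of assignGrp: cumulative upper limit of each group
bounds=: +/\@(* #)
b=: 0.35 0.35 0.3 bounds i.14   NB. displays as 4.9 9.8 14
NB. Interval Index gives, for each position, the number of limits below it
g=: b I. i.14                   NB. 0 0 0 0 0 1 1 1 1 1 2 2 2 2
NB. Key boxes the items of the right argument that share a group number
r=: g </. 'abcdefghijklmn'      NB. boxes 'abcde', 'fghij' and 'klmn'
```

Because the group numbers come from positions rather than values, the same keys
can box any data of the same length.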

On 21/10/2017 02:14, "'Jon Hough' via Programming" < [email protected]> 
wrote:

> Hi,
>
> What I am really after is a verb that splits by percentage. To give a 
> concrete use case:
> I have a dataset, which I wish to split into training set, validation 
> set, and testing set.
>
> I want 35% of the datapoints to go in the training set, 35% go in the 
> validation set, the rest go in the test set. (Just example numbers).
>
>
> No need to worry about shuffling, randomizing etc, I am assuming the 
> data is sufficiently random.
> As Raul said, I can simplify slightly by just using the size of the 
> dataset as the right argument.
>
> --------------------------------------------
> On Fri, 10/20/17, Erling Hellenäs <[email protected]> wrote:
>
>  Subject: Re: [Jprogramming] Splitting an Array into several arrays
>  To: [email protected]
>  Date: Friday, October 20, 2017, 10:06 PM
>
>  Hi all !
>
>  A splitSubs with CutN could possibly look like this:
>
>  splitSubsE=: ([ (([: # [) {. ]) ([: <. 0.5 + [: }: [ * [: # ]) ( [ , ([: # ]) - [: +/ [) ]) CutN ]
>
>      (i.0) splitSubsE i.0
>
>      (,55) splitSubsE ,5
>  ┌─┐
>  │5│
>  └─┘
>      split splitSubsE i.0
>  ┌┬┬┐
>  ││││
>  └┴┴┘
>      split splitSubsE i.1
>  ┌┬┬─┐
>  │││0│
>  └┴┴─┘
>      split splitSubsE i.2
>  ┌─┬─┬┐
>  │0│1││
>  └─┴─┴┘
>      split splitSubsE i.3
>  ┌─┬─┬─┐
>  │0│1│2│
>  └─┴─┴─┘
>
>      split splitSubsE i.4
>  ┌─┬─┬───┐
>  │0│1│2 3│
>  └─┴─┴───┘
>
>  Cheers
>
>  Erling Hellenäs
>
>
>  On 2017-10-20 at 14:11, Erling Hellenäs wrote:
>  > Hi all!
>  >
>  > I looked for a version of Cut which takes the number of items in each
>  > group as left argument. I didn't find one. I think it is what you most
>  > often need, because it allows groups with zero length content.
>  >
>  > I made CutN as an illustration:
>  >
>  > CutN=:((# {. 0 , [: }: [: +/\ ])([: < [ + [: i. ])"0 ])@:[ {&.>/ [: < ]
>  >
>  >    (i.0) CutN i.0
>  >
>  >    (,0) CutN i.0
>  > ┌┐
>  > ││
>  > └┘
>  >    (,1) CutN 10+i.1
>  > ┌──┐
>  > │10│
>  > └──┘
>  >    0 2 CutN 10+i.2
>  > ┌┬─────┐
>  > ││10 11│
>  > └┴─────┘
>  >    2 5 0 CutN 10+i.7
>  > ┌─────┬──────────────┬┐
>  > │10 11│12 13 14 15 16││
>  > └─────┴──────────────┴┘
>  >    0 7 0 CutN 10+i.7
>  > ┌┬────────────────────┬┐
>  > ││10 11 12 13 14 15 16││
>  > └┴────────────────────┴┘
>  >
>  > Cheers,
>  >
>  > Erling Hellenäs
>  >
>  >
>  >
>  On 2017-10-20 at 10:42, 'Jon Hough' via Programming wrote:
>  >> The problem:
>  >> Let X be an array.
>  >> X=: i. 50 NB.  example
>  >>
>  >> Let 'split' be the percentages that each subarray takes from X,
>  >> sequentially
>  >> e.g
>  >> split =: 0.35 0.35 0.3 NB. first array takes 35%, second sub array
>  >> takes 35%, third takes 30%
>  >> So in the end
>  >>
>  >> My solution
>  >>
>  >> splitSubs =: -.~&.>/\@:(i.&.>"0@:<"0)@:}.@:>.@:((+/\ - ])@:[ (* , ]) #@:])
>  >>
>  >> split splitSubs X
>  >>
>  >> This gives 3 boxed arrays. Each array holds the indices to take from X.
>  >>
>  >> There is a slight problem in that the first and second subarrays
>  >> have different length, due to rounding error. I am not too
>  >> bothered about that since, depending on the size of X and the
>  >> percentages, this is unavoidable.
>  >>
>  >> Any more succinct, nicer solutions?
>  >>
>  
>  >> ----------------------------------------------------------------------
>  >> For information about J forums see http://www.jsoftware.com/forums.htm