[R] using cut

2007-05-26 Thread J. Scott Olsson
Suppose I have some data

x <- rnorm(1000);
y <- x*x;

then try to cut it into 2 chunks,

c <- cut(y, breaks=2);

summary(y)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
6.879e-06 9.911e-02 3.823e-01 9.499e-01 1.297e+00 8.342e+00

summary(c)
(-0.00833,4.17]     (4.17,8.35]
            958              42

Is that the correct behavior? Why is the left hand side of the interval
negative?

How would I split a data vector into groups/intervals such that each
interval contained the same number of points?


Thanks much!
Scott

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] using cut

2007-05-26 Thread David Barron
You can split a vector into groups with equal numbers using the
quantcut function in the gtools package.  For example, to split into
two groups, use:

c <- quantcut(y, c(0,.5,1))
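
If gtools is not installed, a rough base-R sketch of the same idea is to take the breaks from quantile() and pass them to cut(). The sketch also shows, as far as ?cut describes it, where the negative left endpoint in the original output comes from: when breaks is a single number, cut() moves the outer limits out by 0.1% of the data range so that the extreme values fall inside the intervals. The seed and the names brk, halves and lower are illustrative only and do not come from the thread.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
x <- rnorm(1000)
y <- x * x

# Equal-count split: use the median as the single interior break.
# include.lowest = TRUE keeps min(y) inside the first interval.
brk <- quantile(y, probs = c(0, 0.5, 1))
halves <- cut(y, breaks = brk, include.lowest = TRUE)
table(halves)

# Equal-width split, as in the original question.
# With breaks = 2 the lower limit is min(y) minus 0.1% of the range,
# which is negative here because min(y) is very close to zero.
lower <- min(y) - diff(range(y)) / 1000
lower
levels(cut(y, breaks = 2))
```

For more than two groups, the same pattern should work with a longer probs vector, for example seq(0, 1, by = 0.25) for quartiles, provided the resulting breaks are distinct.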

-- 
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP



[R] using cut on matrices

2003-07-12 Thread Tamas Papp
Dear list,

I'd like to use the function cut() on matrices, i.e., when applied
to a matrix, it would return a matrix of the same dimensions
instead of a vector.

I wonder if there is a better (more elegant) solution than

matrix(cut(a, ...), ncol=ncol(a), nrow=nrow(a))

because I would like to use cut on both vectors and matrices and avoid
testing whether a is a matrix.

Thanks,

Tamas

-- 
Tamás K. Papp
E-mail: [EMAIL PROTECTED] (preferred, especially for large messages)
[EMAIL PROTECTED]
Please try to send only (latin-2) plain text, not HTML or other garbage.

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] using cut on matrices

2003-07-12 Thread Peter Dalgaard BSA
Tamas Papp [EMAIL PROTECTED] writes:

> Dear list,
>
> I'd like to use the function cut() on matrices, i.e., when applied
> to a matrix, it would return a matrix of the same dimensions
> instead of a vector.
>
> I wonder if there is a better (more elegant) solution than
>
> matrix(cut(a, ...), ncol=ncol(a), nrow=nrow(a))
>
> because I would like to use cut on both vectors and matrices and avoid
> testing whether a is a matrix.

Will this not work?:

ac <- cut(a)
dim(ac) <- dim(a)
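
To make the suggestion concrete, here is a small sketch. It assumes cut() is given a breaks argument (2 here, chosen arbitrarily), since breaks has no default. The point is that dim() of a plain vector is NULL, so the same two lines can be applied to vectors and matrices without testing which one the input is. The names m, v, mc and vc are made up for this example.

```r
m <- matrix(1:9, nrow = 3)       # a matrix input
v <- 1:9                         # a plain vector input

mc <- cut(m, breaks = 2)         # cut() drops the dimensions
dim(mc) <- dim(m)                # restore them: mc is a 3 x 3 factor

vc <- cut(v, breaks = 2)
dim(vc) <- dim(v)                # dim(v) is NULL, so vc is left without dimensions

dim(mc)
dim(vc)
```

Note that the result is still a factor with a dim attribute, so how it prints depends on the R version.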

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907



Re: [R] using cut on matrices

2003-07-12 Thread Spencer Graves
Peter's solution led me to an apparent bug in R.

> a <- 1:9
> cut(a)
Error in cut.default(a) : Argument breaks is missing, with no default

That didn't work, so I read the documentation and found that a second
argument was required.  Result:

> cut(a, 2)
[1] (0.992,5] (0.992,5] (0.992,5] (0.992,5] (5,9.01]  (5,9.01]  (5,9.01]
[8] (5,9.01]  (5,9.01]
Levels: (0.992,5] (5,9.01]

# LOOK VERY CAREFULLY:
# R 1.6.2 coded 5 as (5, 9.01].
# (I know I need to upgrade to R 1.7.1.)
# S-Plus 6.1 for Windows 2000 produced the following:

> cut(a, 2)
[1] 1 1 1 1 1 2 2 2 2
attr(, "levels"):
[1] "0.92+ thru 5.00" "5.00+ thru 9.08"

# NOTE:  S-Plus 6.1 did it correctly.
# I'm sorry to report an error without a fix,
# but I'm out of time for this now.
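
One way to sidestep doubt about which interval a boundary value lands in might be to pass the break points explicitly instead of a count, so that no end points have to be computed from the data. This is only a sketch: the break values 0, 5 and 9 are chosen for this example and are not from the thread. With the default right = TRUE the intervals are closed on the right, so 5 belongs to (0,5]; right = FALSE flips that.

```r
a <- 1:9

# Explicit breaks, intervals closed on the right: (0,5] and (5,9].
f <- cut(a, breaks = c(0, 5, 9))
as.integer(f)                    # integer codes, 5 is coded as 1

# Intervals closed on the left instead: [0,5) and [5,9].
# include.lowest = TRUE closes the last interval so that 9 is kept.
g <- cut(a, breaks = c(0, 5, 9), right = FALSE, include.lowest = TRUE)
as.integer(g)                    # 5 is now coded as 2
```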
# Before I found the bug, I wrote a function
# to retain dimnames:

Cut <-
 function(a, ...){
  ac <- cut(a, ...)
  if(is.array(a)){
   dim(ac) <- dim(a)
   dimnames(ac) <- dimnames(a)
  }
  else names(ac) <- names(a)
  ac
}

> Cut(a, 2)
[1] (0.992,5] (0.992,5] (0.992,5] (0.992,5] (5,9.01]  (5,9.01]  (5,9.01]
[8] (5,9.01]  (5,9.01]
Levels: (0.992,5] (5,9.01]

# That worked fine.  What about a vector with names?

> a1 <- a
> names(a1) <- letters[1:9]
> Cut(a1, 2)
[1] (0.992,5] (0.992,5] (0.992,5] (0.992,5] (5,9.01]  (5,9.01]  (5,9.01]
[8] (5,9.01]  (5,9.01]
Levels: (0.992,5] (5,9.01]

# What happened to the names?

> names(Cut(a1,2))
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i"

# The names were there, but R chose not to display them.
# S-Plus 6.1 under Win2000 produced the following:

> Cut(a1, 2)
 a b c d e f g h i
 1 1 1 1 1 2 2 2 2
attr(, "levels"):
[1] "0.92+ thru 5.00" "5.00+ thru 9.08"

# The names appear with codes and a translate table
# Something similar happens with an array:
# In R 1.6.2:

> a2 <- a
> dim(a2) <- c(3,3)
> dimnames(a2) <- list(LETTERS[1:3], c("ab","bc","cd"))
> Cut(a2, 2)
[1] (0.992,5] (0.992,5] (0.992,5] (0.992,5] (5,9.01]  (5,9.01]  (5,9.01]
[8] (5,9.01]  (5,9.01]
Levels: (0.992,5] (5,9.01]
> dimnames(Cut(a2,2))
[[1]]
[1] "A" "B" "C"

[[2]]
[1] "ab" "bc" "cd"

# S-Plus produced the following:

> Cut(a2, 2)
  ab bc cd
A  1  1  2
B  1  1  2
C  1  2  2
attr(, "levels"):
[1] "0.92+ thru 5.00" "5.00+ thru 9.08"
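
To get something close to the S-Plus display in R, one option might be to keep the integer codes in an array with the original dim and dimnames, and to hold the level labels alongside it as a lookup table. A sketch under those assumptions follows; the explicit breaks (0, 5, 9) and the names f, codes and labs are chosen for this example.

```r
a2 <- matrix(1:9, nrow = 3,
             dimnames = list(LETTERS[1:3], c("ab", "bc", "cd")))

f <- cut(a2, breaks = c(0, 5, 9))        # factor; dimensions are dropped

# Integer codes arranged like the input, with its row and column names.
codes <- array(as.integer(f), dim = dim(a2), dimnames = dimnames(a2))
labs  <- levels(f)                       # code -> interval label

codes
labs
labs[codes["C", "bc"]]                   # look up the interval for one cell
```

Because codes is an ordinary integer matrix, it prints with its row and column names, at the cost of no longer being a factor.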
#
hope this helps.
spencer graves