[R] data manipulation help

2007-08-28 Thread Zheng Lu
Dear All: I have a dataset like A=c(0,12,34,5,6,0,4,5,6,0,12,3,4,8,7,0,4,3,5,0,...),I want to add a column to this dataset, it must be in B=c(1,1,1,1,1,2,2,2,2,3,3,3,3,3,3,4,4,4,4,5,..), How can I create B based on the sequence of A. Appreciate. Zheng

Re: [R] data manipulation help

2007-08-28 Thread Leeds, Mark (IED)
Lu Sent: Tuesday, August 28, 2007 5:00 PM To: r-help@stat.math.ethz.ch Subject: [R] data manipulation help Dear All: I have a dataset like A=c(0,12,34,5,6,0,4,5,6,0,12,3,4,8,7,0,4,3,5,0,...),I want to add a column to this dataset, it must be in B=c(1,1,1,1,1,2,2,2,2,3,3,3,3,3,3,4,4,4,4,5

Re: [R] data manipulation help

2007-08-28 Thread Charles C. Berry
On Tue, 28 Aug 2007, Zheng Lu wrote: Dear All: I have a dataset like A=c(0,12,34,5,6,0,4,5,6,0,12,3,4,8,7,0,4,3,5,0,...),I want to add a column to this dataset, it must be in B=c(1,1,1,1,1,2,2,2,2,3,3,3,3,3,3,4,4,4,4,5,..), How can I create B based on the sequence of A. Appreciate.
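
A minimal sketch of one way to build B, assuming every 0 in A marks the start of a new group. The replies above are cut off, so this is not necessarily what they proposed; the data frame name dat is made up for illustration.

```r
# Example vector from the question (the trailing "..." is left out)
A <- c(0, 12, 34, 5, 6, 0, 4, 5, 6, 0, 12, 3, 4, 8, 7, 0, 4, 3, 5, 0)

# Each 0 opens a new group, so a running count of the zeros
# seen so far is the group index
B <- cumsum(A == 0)

# Attach B as a new column (dat is a hypothetical name)
dat <- data.frame(A = A, B = B)
```

If the first element of A were not 0, the first group would be numbered 0 rather than 1, so this relies on the series starting with a 0.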

Re: [R] Data Manipulation using R

2007-04-18 Thread Stephen Tucker
...is this what you're looking for? donedat <- subset(data,ID < 6000 | ID >= 7000) findat <- donedat[-unique(rapply(donedat,function(x) which( x < 0 ))),,drop=FALSE] the second line looks through each column, and finds the indices of negative values - rapply() returns
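
A small sketch of the same two steps on toy data, assuming the goal is to drop IDs from 6000 through 6999 and then drop every row that holds a negative value. The data frame and its columns x and y are hypothetical. It uses a logical row filter instead of negative indices, because indexing with -integer(0) would return no rows when nothing is negative.

```r
# Hypothetical toy data standing in for the 220,000 x 17 data set
dat <- data.frame(ID = c(1000, 6000, 6500, 6999, 7000, 8999),
                  x  = c(1, 2, -3, 4, -5, 6),
                  y  = c(0, 1, 2, 3, 4, 5))

# Step 1: remove observations with IDs between 6000 and 6999
kept <- subset(dat, ID < 6000 | ID >= 7000)

# Step 2: flag rows with at least one negative value, then drop them
has_neg <- rowSums(kept < 0) > 0
clean <- kept[!has_neg, , drop = FALSE]
```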

[R] Data Manipulation using R

2007-04-17 Thread Anup Nandialath
Dear Friends, I have data set with around 220,000 rows and 17 columns. One of the columns is an id variable which is grouped from 1000 through 9000. I need to perform the following operations. 1) Remove all the observations with id's between 6000 and 6999 I tried using this method. remdat1

Re: [R] Data Manipulation using R

2007-04-17 Thread Charilaos Skiadas
On Apr 17, 2007, at 8:03 PM, Anup Nandialath wrote: Dear Friends, I have data set with around 220,000 rows and 17 columns. One of the columns is an id variable which is grouped from 1000 through 9000. I need to perform the following operations. 1) Remove all the observations with id's

[R] Data manipulation in columns (with apply?)

2006-10-10 Thread Bret Collier
R Users, I have written a small simulation model in R which outputs a datafile consisting of ending population sizes for each simulation run (year). The data (see short data example below) is labeled by NUM (simulation run), sim (year) and N (yearly count). After searching the help files and

Re: [R] Data manipulation in columns (with apply?)

2006-10-10 Thread jim holtman
Does this start to do what you want? x - NUM sim N + 1 1 466 + 1 2 450 + 1 3 473 + 1 4 531 + 1 5 515 + 1 6 502 + 1 7 471 + 1 8 460 + 1 9 458 + 1 10 434 + 2 1 289 + 2 2 356 + 2 3 387 + 2 4 440 + 2 5 457 + 2 6 466 + 2 7 467 + 2 8 449 + 2 9 387 + 2 10 394 + 3 1 367 + 3 2 400 + 3 3 476 +

Re: [R] data manipulation docs

2006-05-05 Thread Frank E Harrell Jr
Federico Calboli wrote: Hi All, Is there some document/manual about data manipulation within R that I could use as a reference (obviously, aside the R manuals)? The reason I am asking is that I have a number of data frames/matrices containing genetic data. The data is in a character form,

[R] data manipulation docs

2006-05-04 Thread Federico Calboli
Hi All, Is there some document/manual about data manipulation within R that I could use as a reference (obviously, aside the R manuals)? The reason I am asking is that I have a number of data frames/matrices containing genetic data. The data is in a character form, as in: V1 V2 V3 V4 V5 1 AA

Re: [R] data manipulation docs

2006-05-04 Thread Larry Howe
On Thursday May 4 2006 10:20, Federico Calboli wrote: The reason I am asking is that I have a number of data frames/matrices containing genetic data. The data is in a character form, as in: Take a look at the Bioconductor project: Bioconductor is an open source and open development software

Re: [R] data manipulation

2005-09-10 Thread Dieter Menne
Marc Bernard bernarduse1 at yahoo.fr writes: I would be grateful if you can help me. My problem is the following: I have a data set like: ID time X1 X2 11 x111 x211 12 x112 x212 where X1 and X2 are 2 covariates and time is the time

[R] data manipulation

2005-09-08 Thread Marc Bernard
Dear All, I would be grateful if you can help me. My problem is the following: I have a data set like: ID time X1 X2 11 x111 x211 12 x112 x212 21 x121 x221 22 x122 x222 23 x123 x223 where

Re: [R] data manipulation

2005-09-08 Thread Martin Lam
Hi, This may not be the best solution, but at least it's easy to see what i'm doing, assume that your data set is called data: # remove the 4th column data1 = data[,-4] # remove the 3rd column data2 = data[,-3] # use cbind to add an extra column with only X1 #elements data1 = cbind(data1,

Re: [R] data manipulation

2005-09-08 Thread Sebastian Luque
Marc Bernard [EMAIL PROTECTED] wrote: Dear All, I would be grateful if you can help me. My problem is the following: I have a data set like: ID time X1 X2 11 x111 x211 12 x112 x212 21 x121 x221 22 x122

Re: [R] data manipulation

2005-09-08 Thread Thomas Lumley
This is what reshape() does. -thomas On Thu, 8 Sep 2005, Marc Bernard wrote: Dear All, I would be grateful if you can help me. My problem is the following: I have a data set like: ID time X1 X2 11 x111 x211 12 x112 x212 21

Re: [R] data manipulation

2005-09-08 Thread Jim Porzak
Also see Hadley Wickham's reshape package for more bells & whistles. -- HTH! Jim Porzak Loyalty Matrix Inc. On 9/8/05, Thomas Lumley [EMAIL PROTECTED] wrote: This is what reshape() does. -thomas On Thu, 8 Sep 2005, Marc Bernard wrote: Dear All, I would be grateful if

Re: [R] data manipulation

2005-09-08 Thread Jean Eid
I am sure all of these work, but if you want the output to be exactly the way you mentioned, do this: temp <- read.table(yourfile, as.is=T, header=T) temp1 <- temp[, 1:3] temp2 <- temp[, c(1,2,4)] colnames(temp1)[3] <- "X" colnames(temp2)[3] <- "X" temp3 <- merge(temp1, temp2, all=T) temp3$type <- toupper(substr(temp3$X, 1,2))
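
A hedged sketch of the stacking idea in this thread. The desired output is cut off in the snippets, so this assumes the aim is one row per ID, time and covariate, with a single value column X and a type column naming the covariate. The toy values follow the pattern in the question, read as ID 1 at times 1-2 and ID 2 at times 1-3; that reading is an assumption.

```r
# Toy data in the layout of the question (one row per ID and time)
wide <- data.frame(ID   = c(1, 1, 2, 2, 2),
                   time = c(1, 2, 1, 2, 3),
                   X1   = c("x111", "x112", "x121", "x122", "x123"),
                   X2   = c("x211", "x212", "x221", "x222", "x223"),
                   stringsAsFactors = FALSE)

# Stack the two covariate columns on top of each other,
# recording which covariate each value came from
long <- rbind(
  data.frame(wide[c("ID", "time")], type = "X1", X = wide$X1,
             stringsAsFactors = FALSE),
  data.frame(wide[c("ID", "time")], type = "X2", X = wide$X2,
             stringsAsFactors = FALSE)
)

# Sort so that both covariates of one ID/time pair sit together
long <- long[order(long$ID, long$time, long$type), ]
```

reshape() with direction = "long" can produce a similar layout; the explicit rbind() is used here only because each step is easy to follow.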

Re: [R] data manipulation help

2005-08-16 Thread Dieter Menne
roberto munguia munguiar at posgrado.ecologia.edu.mx writes: I have a dataframe with 468 individuals (rows) that I captured at least once during 28 visits (columns), it looks like: mortality[1:10,] 11 0 0 0 1 1 1

[R] data manipulation help

2005-08-15 Thread roberto munguia
Hello everybody, I have a dataframe with 468 individuals (rows) that I captured at least once during 28 visits (columns), it looks like: mortality[1:10,] X18.10.2004 X20.10.2004 X22.10.2004 X24.10.2004 X26.10.2004 X28.10.2004 X30.10.2004 X01.11.2004 X03.11.2004 X07.11.2004 1

Re: [R] data manipulation

2005-04-23 Thread Yoko Nakajima
Hello, may I ask a further question? I have realized that data <- matrix(scan("file-name"), ncol=29) will read the data differently than I thought, i.e., (4,1) is the first column, (17,1) is the second column, and (1,1) is the third and so on by this code - please see the data below. Therefore,

RE: [R] data manipulation

2005-04-23 Thread Liaw, Andy
You just need to try harder in reading the documentation. Try: data <- matrix(scan("file-name"), ncol=29, byrow=TRUE) Andy From: Yoko Nakajima Hello, may I ask a further question? I have realized that data <- matrix(scan("file-name"), ncol=29) will read the data differently than I
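
A tiny illustration of why byrow=TRUE matters here. matrix() fills column by column unless told otherwise, so values read in file order end up running down the columns. The vector vals is a stand-in for what scan() would return.

```r
# Stand-in for scan() output: values in the order they appear in the file
vals <- 1:6

# Default: fill column by column
by_col <- matrix(vals, ncol = 3)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# byrow = TRUE: fill row by row, matching the layout of the file
by_row <- matrix(vals, ncol = 3, byrow = TRUE)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
```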

[R] data manipulation

2005-04-13 Thread Yoko Nakajima
Hello, my question is about the data handling. I have a data set that is lined as: 4 1 17 1 1 -5.1536 -0.1668 -2.3412 -0.5062 0.9621 0.3640 0.3678 -0.5081 -0.2227 0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232 0.8673 -0.1033 -0.0796 -0.0341 -0.1716 -0.1801 -0.7014 0.6578 0.5611 4 1 17

RE: [R] data manipulation

2005-04-13 Thread John Fox
: Wednesday, April 13, 2005 7:56 PM To: r-help@stat.math.ethz.ch Subject: [R] data manipulation Hello, my question is about the data handling. I have a data set that is lined as: 4 1 17 1 1 -5.1536 -0.1668 -2.3412 -0.5062 0.9621 0.3640 0.3678 -0.5081 -0.2227 0.8142 -0.0389

Re: [R] data manipulation

2005-04-13 Thread Marc Schwartz
On Wed, 2005-04-13 at 20:56 -0400, Yoko Nakajima wrote: Hello, my question is about the data handling. I have a data set that is lined as: 4 1 17 1 1 -5.1536 -0.1668 -2.3412 -0.5062 0.9621 0.3640 0.3678 -0.5081 -0.2227 0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232 0.8673 -0.1033

Re: [R] Data manipulation

2005-02-08 Thread Helmut Kudrnovsky
thanks a lot for the information, reshape did the job datars <- reshape(data, timevar="TERRCODE", idvar="BID", direction="wide") greetings helli BID TERRCODE ANMCODE 200310413120660 22 0 200310413120660 273 0 200310413120660 280 0 200310413120660 467 0 200310413120660 468
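
A short sketch of the same reshape() call on toy data, assuming one ANMCODE value per BID and TERRCODE pair. The values below are invented for illustration and are not from the poster's data.

```r
# Toy long-format data: one row per BID and TERRCODE
long <- data.frame(BID      = c(1, 1, 2, 2),
                   TERRCODE = c(22, 273, 22, 273),
                   ANMCODE  = c(0, 1, 1, 0))

# One row per BID, one ANMCODE column per TERRCODE value
wide <- reshape(long, timevar = "TERRCODE", idvar = "BID",
                direction = "wide")
```

The new columns are named by joining the value column and the TERRCODE value with a dot, giving ANMCODE.22 and ANMCODE.273 here.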

[R] Data manipulation

2005-02-07 Thread Helmut Kudrnovsky

Re: [R] Data manipulation

2005-02-07 Thread Uwe Ligges
Helmut Kudrnovsky wrote:

[R] Data manipulation query

2004-08-03 Thread Manoj - Hachibushu Capital
Hi, Not sure if I am making a simple problem complex but still here we go: I have a data frame with four columns say, X1 X2 X3 and X4. I want to break X4 into deciles and for each deciles obtained, I want to see corresponding elements of X1. Ideally, the output should be

[R] Data manipulation query

2004-08-03 Thread Vito Ricci
Hi, see ?quantile to obtain deciles of variable X1; see ?cut to divide the range of 'x' into intervals and code the values in 'x' according to which interval they fall into; see ?table to use the cross-classifying factors to build a contingency table of the counts at each combination of factor
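
A sketch of how quantile(), cut() and table() could fit together for the question as asked, that is, binning X4 into deciles and listing the X1 values in each. The data frame is a toy with made-up values, and split() is added here to collect X1 per decile; it is not mentioned in the reply.

```r
# Toy data: X4 is the variable to bin, X1 the one to inspect
df <- data.frame(X1 = 1:100,
                 X4 = rev(seq(0.5, 50, by = 0.5)))

# Decile boundaries of X4 (0%, 10%, ..., 100%)
breaks <- quantile(df$X4, probs = seq(0, 1, by = 0.1))

# Code each X4 value by the decile interval it falls into (1 to 10);
# include.lowest = TRUE keeps the minimum in the first interval
df$decile <- cut(df$X4, breaks = breaks, include.lowest = TRUE,
                 labels = FALSE)

# Counts per decile, and the X1 values belonging to each decile
counts <- table(df$decile)
x1_by_decile <- split(df$X1, df$decile)
```

If X4 has many tied values the decile boundaries may coincide, and cut() will then stop with an error about non-unique breaks; that case is not handled here.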

Re: [R] data manipulation

2003-09-08 Thread Gabor Grothendieck
Try this. The function takes a vector of dates of the form yyyy-mm and produces a new character vector of dates of the same form except the output date is the beginning of the 6 month period in which the input date lies. The 6 month intervals are measured from the minimum date.

Re: [R] data manipulation

2003-09-08 Thread Gabor Grothendieck
Sorry but there was an error in the seq statement. Here it is again. date.grouping <- function(d) { # for ea date in d calculate date beginning 6 month period which contains it mat <- matrix(as.numeric(unlist(strsplit(as.character(d),"-"))),nr=2) f <- function(x) do.call( ISOdate, as.list(x) )

Re: [R] data manipulation

2003-09-08 Thread Gabor Grothendieck
And here is a simplification I just noticed: date.grouping <- function(d) { # for ea date in d calculate date beginning 6 month period which contains it POSIXct.dates <- as.POSIXct(paste(as.character(d),"01",sep="-")) breaks <- c(seq(from=min(POSIXct.dates), to=max(POSIXct.dates), by="6 mo"), Inf)
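
The function above is cut off, so the following is not a reconstruction of it. It is a minimal alternative sketch of the same idea using cut() on Date objects, assuming year-month strings as input and 6-month periods counted from the earliest month. The sample dates are made up.

```r
# Hypothetical input: year-month strings
d <- c("2001-01", "2001-03", "2001-04", "2002-05", "2001-07")

# Turn each string into the first day of its month
dates <- as.Date(paste(d, "01", sep = "-"))

# Bin into consecutive 6-month periods starting at the earliest month;
# the factor labels are the start dates of the periods
grp <- cut(dates, breaks = "6 months")

# Keep only the year-month part of each period start
period_start <- substr(as.character(grp), 1, 7)
```

With the grouping in hand, something like tapply(cost, period_start, median) would give a per-period summary, where cost is whatever variable is being summarised.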

[R] data manipulation

2003-09-07 Thread Ricardo Pietrobon
Hi, I am new to R, coming from a few years using Stata. I've been twisting my brain and checking several R and S references over the last few days to try to solve this data management problem: I have a data set with a unique patient identifier that is repeated along multiple rows, a variable

Re: [R] data manipulation

2003-09-07 Thread Peter Dalgaard BSA
Ricardo Pietrobon [EMAIL PROTECTED] writes: ID date cost 1 2001-01 200.00 1 2001-01 123.94 1 2001-03 100.23 1 2001-04 150.34 2 2001-03 296.34 2 2002-05 156.36 I would like to obtain the median costs and boxplots

[R] data manipulation: getting mean value every 5 rows

2003-07-28 Thread Federico Calboli
Dear All, I would like to ask you how to accomplish a little tricky data manipulation. I have a large dataset, looking something like: temp line cage number 18 18 1 6678.63 18 18 1 7774.458 18 18 1 7845.902 18 18 1 9483.578

Re: [R] data manipulation: getting mean value every 5 rows

2003-07-28 Thread Spencer Graves
Have you considered aggregate [documented in help(aggregate) or www.r-project.org -> search -> R site search or Venables and Ripley, Modern Applied Statistics with S]? hope this helps. spencer graves Federico Calboli wrote: Dear All, I would like to ask you how to accomplish a little tricky

Re: [R] data manipulation: getting mean value every 5 rows

2003-07-28 Thread Tony Plate
x <- read.table(file("clipboard"), header=T) # add an extra field to define groups of 5 sequential rows x[,"code"] <- rep(seq(len=nrow(x)/5), each=5) x temp line cage number code 1 18 18 1 6678.630 1 2 18 18 1 7774.458 1 3 18 18 1 7845.902 1 4 18 18 1
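
A compact sketch combining the grouping column from this reply with aggregate() from the earlier one, assuming the number of rows is an exact multiple of 5. The values are invented for illustration.

```r
# Toy data: 10 rows, so two blocks of 5
x <- data.frame(number = c(10, 20, 30, 40, 50, 1, 2, 3, 4, 5))

# Label each block of 5 consecutive rows: 1,1,1,1,1,2,2,2,2,2
x$code <- rep(seq_len(nrow(x) / 5), each = 5)

# Mean of 'number' within each block
means <- aggregate(number ~ code, data = x, FUN = mean)
```

If the row count is not a multiple of 5, the rep() call produces a vector of the wrong length and the assignment fails, so any leftover rows would need separate handling.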

Re: [R] data manipulation function descriptions

2003-02-14 Thread Luke Tierney
On Fri, 14 Feb 2003 [EMAIL PROTECTED] wrote: On Thu, 13 Feb 2003, kjetil brinchmann halvorsen wrote: On 13 Feb 2003 at 17:09, Jason Bond wrote: case switch [R-core : switch should be better

[R] data manipulation function descriptions

2003-02-13 Thread Jason Bond
Hello, I'm a recovering xlispstat user, and am trying to become a good R user. I've looked around on the CRAN doc website and have found quite a few sets of documentation with various level of data manipulation function descriptions (of what I've seen, most relatively low levels), and many

Re: [R] data manipulation function descriptions

2003-02-13 Thread kjetil brinchmann halvorsen
On 13 Feb 2003 at 17:09, Jason Bond wrote: As lisp-stat user, I tried to compile a short dictionary within your answer below: Hello, I'm a recovering xlispstat user, and am trying to become a good R user. I've looked around on the CRAN doc website and have found quite a few sets of

Re: [R] data manipulation function descriptions

2003-02-13 Thread ripley
On Thu, 13 Feb 2003, kjetil brinchmann halvorsen wrote: On 13 Feb 2003 at 17:09, Jason Bond wrote: case switch [R-core : switch should be better announced. It is for

[R] Data manipulation

2003-02-07 Thread Lew
I am interested in building a model with a subset of data from a column. The first 6 lines of my data look like this: QUAD YEAR SITE TREAT HERB TILL PLANT SEED Kweed 1 A4 2002s 1NN NN 55.00 2A10 2002s 1NN NN 60.00 3 B2 2002

Re: [R] Data manipulation

2003-02-07 Thread Roger Peng
You might want to try subsetting the data frame first, and then fit the model. Something like knap.sub <- knap[c(41:60,81:100,101:120,121:140), ] knap.fit1 <- lm(Kweed ~ TREAT, data = knap.sub) might work for you. -roger ___ UCLA Department of Statistics [EMAIL
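
A self-contained sketch of the subset-then-fit pattern on toy data. The column names follow the question, but the values and the chosen row ranges are made up, and TREAT is assumed to be a factor.

```r
# Toy data: four treatments with five quadrats each
knap <- data.frame(
  TREAT = factor(rep(c("1", "2", "3", "4"), each = 5)),
  Kweed = c(55, 60, 58, 62, 57,
            40, 42, 38, 45, 41,
            30, 33, 29, 35, 31,
            20, 22, 19, 25, 21)
)

# Keep only the rows of interest (here treatments 1 and 3)
knap.sub <- knap[c(1:5, 11:15), ]

# Fit the model on the subset; lm() drops the unused factor levels
knap.fit1 <- lm(Kweed ~ TREAT, data = knap.sub)
coef(knap.fit1)
```

lm() also accepts a subset argument, so lm(Kweed ~ TREAT, data = knap, subset = c(1:5, 11:15)) would fit the same model without creating knap.sub.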