from:"jim holtman"

[R] POSIXct dates on x-axis using xyplot

2007-09-10 Thread jim holtman

I am using 'xyplot' in lattice to plot some data where the x-axis is a
POSIXct date.  I have data which spans a 6 month period, but when I
plot it, only the last month is printed on the right hand side of the
axis.  I would have expected that at least I would have a beginning
and an ending point so that I have a point of reference as to the time
that the data spans.  Here is some test data.


> # create test data
> dates <- seq(as.POSIXct('2006-01-03'), as.POSIXct('2006-06-26'), by='1 week')
> my.data <- seq(1, length=length(dates))
> require(lattice)
[1] TRUE
> # plot only shows a single month ("Jul" on the right).  Would have
> # expected at least the beginning and the ending month since this spans
> # a 6 month period
> pdf('/test.pdf')
> xyplot(my.data ~ dates)
> dev.off()
windows
  2
> sessionInfo()
R version 2.5.1 (2007-06-27)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"
[7] "base"

other attached packages:
 lattice
"0.16-5"
> Sys.info()
  sysname   release
"Windows"  "NT 5.1"
  version      nodename
"(build 2600) Service Pack 2"          "JIM-LAPTOP"
  machine login
"x86" "jim holtman"
 user
"jim holtman"
>
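
One possible workaround, offered only as a sketch: ask lattice for explicit
tick positions through the 'scales' argument.  The choice of one tick per
month and the '%b' label format are assumptions, not something confirmed
in this thread.

```r
library(lattice)

# same test data as above
dates <- seq(as.POSIXct("2006-01-03"), as.POSIXct("2006-06-26"), by = "1 week")
my.data <- seq_along(dates)

# assumption: one tick at the start of each month, labelled with the month name
ticks <- seq(as.POSIXct("2006-01-01"), as.POSIXct("2006-07-01"), by = "1 month")
p <- xyplot(my.data ~ dates,
            scales = list(x = list(at = ticks, labels = format(ticks, "%b"))))
# print(p) draws the plot on the current device
```

Ticks that fall outside the plotted x range are simply not drawn.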


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] fitdistr()

2007-09-09 Thread jim holtman

I assume that you want to do the fitdistr on one of the columns of the
dataframe that you have read in. What does 'str(ONES3)' show?  If the
data is in the first column, try:

fitdistr(ONES3[[1]],"chi-squared")
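
A minimal sketch of the whole call on made-up data.  The simulated column
is a hypothetical stand-in for the real file, and in current versions of
MASS the chi-squared fit also needs a 'start' value for 'df'; the 'lower'
bound is an extra assumption to keep the search positive.

```r
library(MASS)

set.seed(1)
# hypothetical stand-in for the poster's data: one numeric column
ONES3 <- data.frame(x = rchisq(200, df = 4))
str(ONES3)   # check what was actually read in

# pass the column (a numeric vector), not the whole data frame;
# 'start' gives the initial value for df, 'lower' keeps it positive
fit <- fitdistr(ONES3[[1]], "chi-squared", start = list(df = 1), lower = 0.01)
fit
```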


On 9/9/07, Terence Broderick <[EMAIL PROTECTED]> wrote:
> I am trying to fit the chi-squared distribution to a set of data using the 
> fitdistr function found in the MASS4 library, the data set is called ONES3, I 
> have loaded it using the command
>
>  ONES3<-read.table("ONES3.pdf",header=TRUE,na="NA")
>
>  I print out the dataset ONES3 to the screen to make sure it has loaded
>
>  Then I try to fit this data using the command fitdistr
>
>   fitdistr(ONES3,"chi-squared")
>
>  and it returns the comment
>
>  Error in fitdistr(ONES3, "chi-squared") : 'x' must be a non-empty numeric 
> vector
>
>  Can anybody help with this, I imagine it is a common mistake for beginners 
> like myself
>
>
>
> audaces fortuna iuvat
>

Re: [R] Problem with the aggregate command

2007-09-07 Thread jim holtman

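Your 'lst' is not the same length as either set1 or set2; the 'by'
argument must be a list of grouping vectors, each with one entry per row
of the data.  If one of your columns in the dataframe is the year, then
you should have:

aggregate(set1, list(year = set1$year), median)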
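
A small self-contained sketch of the same idea.  The data frame and the
column names 'v1' and 'v2' are invented for illustration; only 'set1' and
the year grouping come from the post.

```r
# hypothetical panel: 3 years, 3 rows per year, 2 measurement columns
set1 <- data.frame(year = rep(1991:1993, each = 3),
                   v1 = c(120, 123, 130, 140, 145, 150, 131, 132, 135),
                   v2 = c(1, 2, 3, 4, 5, 6, 7, 8, 9))

# 'by' is a list with one grouping value per row of the data
y1 <- aggregate(set1[c("v1", "v2")], by = list(Year = set1$year), FUN = median)
y1
```

Leaving the grouping column out of the data being aggregated avoids
getting the year back twice.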

On 9/7/07, Anup Nandialath <[EMAIL PROTECTED]> wrote:
> Dear friends,
>
> I have a data set with 23 columns and 38000 rows. It is a panel running from 
> the years 1991 through 2005. I want to aggregate the data and get the medians 
> of each of the 23 columns for each of the years. In other words my output 
> should be like this
>
> Year Median
>
> 1991   123
> 1992   145
> 1993   132
>
> etc.
>
> The sample lines of code to do this operation is
>
> set1 <- subset(as.data.frame(dataset),rep1==1)
> set2 <- subset(as.data.frame(dataset),rep1==0)
> lst <- list(unique(yeara))
>
> y1 <- aggregate(set1,lst,median)
> y2 <- aggregate(set2,lst,median)
>
> However I'm getting an error as follows
> Error in FUN(X[[1]], ...) : arguments must have same length
>
> Can somebody please help me with what I'm doing wrong here?
>
> Thanks in advance
> Regards
>
> Anup
>
>
>
>

Re: [R] remove particular elements in a vector

2007-09-07 Thread jim holtman

x <- answer(100)
x <- x[!is.na(x)]  # remove NAs
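
The same logical-index idea works for any group of values, not just NA.
A short sketch on a made-up vector (the values to drop are arbitrary):

```r
x <- c(1, 2, NA, 4, NA, NA, 7)

no.na <- x[!is.na(x)]             # drop the NAs
no.27 <- no.na[!no.na %in% c(2, 7)]   # drop a chosen set of values
no.na
no.27
```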

On 9/7/07, kevinchang <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> Is there any build-in function allowing us to remove a particular group of
> elements in a vector?
>
> For example, if I want to remove all the "NA" in the output of "answer"
> function . Please help. Thanks
>
>  answer(100)
>  [1]  1  2 NA  4 NA NA  7  8 NA NA 11 NA 13 14 NA 16 17 NA 19 NA NA 22 23
> NA NA
>  [26] 26 NA 28 29 NA 31 32 NA 34 NA NA 37 38 NA NA 41 NA 43 44 NA 46 47 NA
> 49 NA
>  [51] NA 52 53 NA NA 56 NA 58 59 NA 61 62 NA 64 NA NA 67 68 NA NA 71 NA 73
> 74 NA
>  [76] 76 77 NA 79 NA NA 82 83 NA NA 86 NA 88 89 NA 91 92 NA 94 NA NA 97 98
> NA NA
>
> --
> View this message in context: 
> http://www.nabble.com/remove-particular-elements-in-a-vector-tf4404489.html#a12565480
> Sent from the R help mailing list archive at Nabble.com.
>

Re: [R] Plotting lines to sets of points

2007-09-07 Thread jim holtman

?segments
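
A rough sketch of how 'segments' could be used here.  The hit data, the
colour table and the position of home plate (taken to be the first corner
of the diamond, 125, -210) are all assumptions made for illustration.

```r
# hypothetical hit data; column names follow the post
hits <- data.frame(hit_x = c(60, 125, 190),
                   hit_y = c(100, 60, 110),
                   hit_traj = c("line_drive", "fly_ball", "ground_ball"))
# assumed colour per trajectory
traj.col <- c(line_drive = "red", fly_ball = "blue", ground_ball = "darkgreen")
hit.col <- unname(traj.col[as.character(hits$hit_traj)])

pdf(NULL)   # throw-away device; drop this line to draw on screen
plot(0:250, -250:0, type = "n")
lines(c(125, 150, 125, 100, 125), c(-210, -180, -150, -180, -210))
# one segment per hit, from home plate to the landing spot, in the dot's colour
segments(125, -210, hits$hit_x, -hits$hit_y, col = hit.col)
points(hits$hit_x, -hits$hit_y, pch = 20, col = hit.col)
dev.off()
```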

On 9/7/07, lawnboy34 <[EMAIL PROTECTED]> wrote:
>
> I am using R to plot baseball spray charts from play-by-play data. I have
> used the following command to plot the diamond:
>
> plot (0:250, -250:0, type="n", bg="white")
>lines(c(125,150,125,100,125),c(-210,-180,-150,-180,-210), 
> col=c("black"))
>
> I have also plotted different hit locations using commands such as the
> following:
>
> points(subset(framename$hit_x, framename$hit_traj=="line_drive"),
> subset(-framename$hit_y, framename$hit_traj=="line_drive"), pch=20,
> col=c("red"))
>
> My question: Is there any easy way to plot a line from the origin (home
> plate) to each point on the graph? Preferably the line would share the same
> color as the dot that denotes where the ball landed. I have tried searching
> Google and these forums, and most graphing questions have to do with
> scatterplots or other varieties of graphs I am not using. Thanks very much
> in advance.
>
> -Jason
> --
> View this message in context: 
> http://www.nabble.com/Plotting-lines-to-sets-of-points-tf4404235.html#a12564704
> Sent from the R help mailing list archive at Nabble.com.
>

Re: [R] R first.id last.id function error

2007-09-07 Thread jim holtman

This function should do it for you:


> file1 <- read.table(textConnection("   id rx week dv1
+ 1   1  1    1   1
+ 2   1  1    2   1
+ 3   1  1    3   2
+ 4   2  1    1   3
+ 5   2  1    2   4
+ 6   2  1    3   1
+ 7   3  1    1   2
+ 8   3  1    2   3
+ 9   3  1    3   4
+ 10  4  1    1   2
+ 11  4  1    2   6
+ 12  4  1    3   5
+ 13  5  2    1   7
+ 14  5  2    2   8
+ 15  5  2    3   5
+ 16  6  2    1   2
+ 17  6  2    2   4
+ 18  6  2    3   6
+ 19  7  2    1   7
+ 20  7  2    2   8
+ 21  8  2    1   9
+ 22  9  2    1   4
+ 23  9  2    2   5"), header=TRUE)
>
> mark.function <-
+ function(df){
+ df <- df[order(df$id, df$week),]
+ # create 'diff' of 'id' to determine where the breaks are
+ breaks <- diff(df$id)
+ # the first entry will be TRUE, and then every occurrence of non-zero in breaks
+ df$first.id <- c(TRUE, breaks != 0)
+ # the last entry is TRUE and every non-zero breaks
+ df$last.id <- c(breaks != 0, TRUE)
+ df
+ }
>
> mark.function(file1)
   id rx week dv1 first.id last.id
1   1  1    1   1     TRUE   FALSE
2   1  1    2   1    FALSE   FALSE
3   1  1    3   2    FALSE    TRUE
4   2  1    1   3     TRUE   FALSE
5   2  1    2   4    FALSE   FALSE
6   2  1    3   1    FALSE    TRUE
7   3  1    1   2     TRUE   FALSE
8   3  1    2   3    FALSE   FALSE
9   3  1    3   4    FALSE    TRUE
10  4  1    1   2     TRUE   FALSE
11  4  1    2   6    FALSE   FALSE
12  4  1    3   5    FALSE    TRUE
13  5  2    1   7     TRUE   FALSE
14  5  2    2   8    FALSE   FALSE
15  5  2    3   5    FALSE    TRUE
16  6  2    1   2     TRUE   FALSE
17  6  2    2   4    FALSE   FALSE
18  6  2    3   6    FALSE    TRUE
19  7  2    1   7     TRUE   FALSE
20  7  2    2   8    FALSE    TRUE
21  8  2    1   9     TRUE    TRUE
22  9  2    1   4     TRUE   FALSE
23  9  2    2   5    FALSE    TRUE
>
>
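
An alternative sketch that gives the same kind of flags with 'duplicated',
which avoids the diff step.  This is a swapped-in technique, not the
function shown above, and the small data frame is made up.

```r
# hypothetical cut-down version of file1
file1 <- data.frame(id = c(1, 1, 1, 2, 2, 3), week = c(1, 2, 3, 1, 2, 1))
file1 <- file1[order(file1$id, file1$week), ]

# first row of each id: not seen before when scanning from the top
file1$first.id <- !duplicated(file1$id)
# last row of each id: not seen before when scanning from the bottom
file1$last.id <- !duplicated(file1$id, fromLast = TRUE)
file1
```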


On 9/7/07, Gerard Smits <[EMAIL PROTECTED]> wrote:
> Hi R users,
>
> I have a test dataframe ("file1," shown below) for which I am trying
> to create a flag for the first and last ID record (equivalent to SAS
> first.id and last.id variables.
>
> Dump of file1:
>
>  > file1
>    id rx week dv1
> 1   1  1    1   1
> 2   1  1    2   1
> 3   1  1    3   2
> 4   2  1    1   3
> 5   2  1    2   4
> 6   2  1    3   1
> 7   3  1    1   2
> 8   3  1    2   3
> 9   3  1    3   4
> 10  4  1    1   2
> 11  4  1    2   6
> 12  4  1    3   5
> 13  5  2    1   7
> 14  5  2    2   8
> 15  5  2    3   5
> 16  6  2    1   2
> 17  6  2    2   4
> 18  6  2    3   6
> 19  7  2    1   7
> 20  7  2    2   8
> 21  8  2    1   9
> 22  9  2    1   4
> 23  9  2    2   5
>
> I have written code that correctly assigns the first.id and last.id variabes:
>
> require(Hmisc)  #for Lags
> #ascending order to define first dot
> file1<- file1[order(file1$id, file1$week),]
> file1$first.id <- (Lag(file1$id) != file1$id)
> file1$first.id[1]<-TRUE  #force NA to TRUE
>
> #descending order to define last dot
> file1<- file1[order(-file1$id,-file1$week),]
> file1$last.id  <- (Lag(file1$id) != file1$id)
> file1$last.id[1]<-TRUE   #force NA to TRUE
>
> #resort to original order
> file1<- file1[order(file1$id,file1$week),]
>
>
>
> I am now trying to get the above code to work as a function, and am
> clearly doing something wrong:
>
>  > first.last <- function (df, idvar, sortvars1, sortvars2)
> +   {
> +   #sort in ascending order to define first dot
> +   df<- df[order(sortvars1),]
> +   df$first.idvar <- (Lag(df$idvar) != df$idvar)
> +   #force first record NA to TRUE
> +   df$first.idvar[1]<-TRUE
> +
> +   #sort in descending order to define last dot
> +   df<- df[order(-sortvars2),]
> +   df$last.idvar  <- (Lag(df$idvar) != df$idvar)
> +   #force last record NA to TRUE
> +   df$last.idvar[1]<-TRUE
> +
> +   #resort to original order
> +   df<- df[order(sortvars1),]
> +   }
>  >
>
> Function call:
>
>  > first.last(df=file1, idvar=file1$id,
> sortvars1=c(file1$id,file1$week), sortvars2=c(-file1$id,-file1$week))
>
> R Error:
>
> Error in as.vector(x, mode) : invalid argument 'mode'
>  >
>
> I am not sure about the passing of the sort strings.  Perhaps this is
> were things are off.  Any help greatly appreciated.
>
> Thanks,
>
> Gerard

Re: [R] change all . to 0 in a data.frame

2007-09-05 Thread jim holtman

Here is one way.  You might want to read in the data with 'as.is=TRUE'
to prevent conversion to factors.

> x <- data.frame(a=c(1,2,3,'.',5,'.'))
> str(x)
'data.frame':   6 obs. of  1 variable:
 $ a: Factor w/ 5 levels ".","1","2","3",..: 2 3 4 1 5 1
> # replace '.' with zero; either read in with 'as.is=TRUE' or convert to character
> x$a <- as.character(x$a)
> x$a[x$a == '.'] <- '0'
> x$a <- as.numeric(x$a)
> str(x)
'data.frame':   6 obs. of  1 variable:
 $ a: num  1 2 3 0 5 0
>
>
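
To do this for every column at once, something along these lines might
work.  It is only a sketch: the inline tab-delimited text stands in for
the real file, and it assumes every column is meant to be an integer.

```r
# hypothetical stand-in for read.delim(myfile); as.is=TRUE avoids factors
mydata <- read.delim(text = "a\tb\n1\t.\n.\t4\n3\t5\n", as.is = TRUE)

# replace '.' by '0' in each column, then convert the column to integer
mydata[] <- lapply(mydata, function(col) {
    col[col %in% "."] <- "0"
    as.integer(col)
})
str(mydata)
```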


On 9/5/07, Dieter Best <[EMAIL PROTECTED]> wrote:
> Hello,
>  I read in a tab delimited text file via mydata = read.delim(myfile). The 
> text file was originally an excel file where . was used in place of 0. Now 
> all the columns which should be integers are factors. Any ideas how to change 
> all the . to 0 and factors back to integer?
>  Thanks a lot in advance for any suggestions,
>  -- D
>
>

Re: [R] writing elements in list as a data frame

2007-09-05 Thread jim holtman

Try this:

> sls <- list(a=matrix(sample(10), ncol=2, dimnames=list(NULL, c('x', 'y'))),
+ b=matrix(sample(16), ncol=2, dimnames=list(NULL, c('x', 'y'))))
> sls
$a
     x  y
[1,] 8  2
[2,] 9 10
[3,] 4  1
[4,] 5  7
[5,] 3  6

$b
      x  y
[1,]  4 14
[2,]  3 15
[3,] 16  5
[4,]  1  9
[5,]  8  7
[6,] 10  2
[7,] 12 13
[8,] 11  6

> # create output matrix
> do.call('rbind', lapply(names(sls), function(.name){
+ data.frame(sls[[.name]], Name=.name)
+ }))
    x  y Name
1   8  2    a
2   9 10    a
3   4  1    a
4   5  7    a
5   3  6    a
6   4 14    b
7   3 15    b
8  16  5    b
9   1  9    b
10  8  7    b
11 10  2    b
12 12 13    b
13 11  6    b
>
>
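
A variant of the same idea, sketched here as an alternative: bind all the
matrices first and build the Name column with 'rep'.  The fixed matrix
contents are made up so the result is reproducible.

```r
sls <- list(a = matrix(1:10, ncol = 2, dimnames = list(NULL, c("x", "y"))),
            b = matrix(11:26, ncol = 2, dimnames = list(NULL, c("x", "y"))))

# stack the matrices, then repeat each list name once per row of its matrix
out <- data.frame(do.call(rbind, sls),
                  Name = rep(names(sls), sapply(sls, nrow)))
out
```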


On 9/5/07, Srinivas Iyyer <[EMAIL PROTECTED]> wrote:
> Dear R-helpers,
> Lists in R are stumbling block for me.
>
> I kindly ask you to help me able to write a
> data-frame.
>
> I have a list of lists.
>
> > sls[1:2]
> $Andromeda_maya1
>   x   y
> [1,] 369 103
> [2,] 382 265
> [3,] 317 471
> [4,] 169 465
> [5,] 577 333
>
> $Andromeda_maya2
>x   y
>  [1,] 173 507
>  [2,] 540 395
>  [3,] 268 143
>  [4,] 346 175
>  [5,] 489  91
>
> I want to be able to write a data.frame like the
> following:
> X   Y Name
> 369 103  Andromeda_maya1
> 382 265  Andromeda_maya1
> 317 471  Andromeda_maya1
> 169 465  Andromeda_maya1
> 577 333  Andromeda_maya1
> 173 507  Andromeda_maya2
> 540 395  Andromeda_maya2
> 268 143  Andromeda_maya2
> 346 175  Andromeda_maya2
> 489  91  Andromeda_maya2
>
> Is there a way to convert this list-of-list into a
> data.frame.
>
> Thanks
> srini
>

Re: [R] list element to matrix

2007-09-05 Thread jim holtman

If they are already a matrix in the list, then you don't have to use
'as.matrix'; you can just say:

M1 <- D[[1]]

Now the question is, what do you mean by how do you index M1?  Do you
want to go through the list applying a function to each matrix?  If
so, then just 'lapply'.  For example, to get the column means, you
would do:

mean.list <- lapply(D, colMeans)

Can you explain in a little more detail the problem you are trying to solve?
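
As a sketch of both options, with a made-up list 'D' holding a 2x2 and a
3x3 matrix:

```r
# hypothetical list of matrices of different sizes
D <- list(matrix(1:4, 2), matrix(1:9, 3))

# apply a function to every matrix without indexing by hand
mean.list <- lapply(D, colMeans)

# or loop over however many entries there are
for (i in seq_along(D)) {
    M <- D[[i]]   # already a matrix, no as.matrix() needed
    cat("matrix", i, "is", nrow(M), "x", ncol(M), "\n")
}
```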

On 9/5/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> I have created a list of "matrices" using sapply or lapply and wish to 
> extract each of the "matrices" as a matrix.  Some of them are 2x2, 3x3, etc.
>
> I can do this one at a time as:
>
> M1<-as.matrix(D[[1]])
>
> How can repeat this process for an unknown number of entries in the list?  In 
> other words, how shall I index M1?
>
> Diana
>

Re: [R] data.frame loses name when constructed with one column

2007-09-04 Thread jim holtman

Try drop=FALSE:

> x
  out pred1 predd2
1   1   2.0    3.0
2   2   3.5    5.5
3   3   5.5   11.0
> x[,1]
[1] 1 2 3
> data.frame(x[,1])
  x...1.
1      1
2      2
3      3
> data.frame(x[,1, drop=FALSE])
  out
1   1
2   2
3   3
>
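
The reason is that x[,1] drops to a plain vector, which carries no column
name.  A short sketch of a few ways to keep the name (the data frame is
rebuilt here from the values shown above):

```r
x <- data.frame(out = 1:3, pred1 = c(2, 3.5, 5.5), predd2 = c(3, 5.5, 11))

keep1 <- x[, 1, drop = FALSE]        # stays a one-column data frame
keep2 <- x["out"]                    # list-style indexing never drops
keep3 <- data.frame(out = x[, 1])    # or name the column explicitly
names(keep1)
```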


On 9/4/07, Stan Hopkins <[EMAIL PROTECTED]> wrote:
> Not sure why the data.frame function does not capture the name of the column 
> field when its being built with only one column.
>
> Can anyone help?
>
>
>
> > data
>   out pred1 predd2
> 1   1   2.0    3.0
> 2   2   3.5    5.5
> 3   3   5.5   11.0
> > data1=data.frame(data[,1])
> > data1
>   data...1.
> 1         1
> 2         2
> 3         3
> > data1=data.frame(data[,1:2])
> > data1
>   out pred1
> 1   1   2.0
> 2   2   3.5
> 3   3   5.5
> > sessionInfo()
> R version 2.5.1 (2007-06-27)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United 
> States.1252;LC_MONETARY=English_United 
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"
> [7] "base"
> >
>

Re: [R] Howto sort dataframe columns by colMeans

2007-09-04 Thread jim holtman

Here is one way of doing it by 'skipping' the first column which is a
factor and your 'time':

> x <- read.table(textConnection(" time   met-a   met-b   met-c
+ 00:00   42      18      99
+ 00:05   88      16      67
+ 00:10   80      27      84"), header=TRUE)
> x.mean <- colMeans(x[-1])
> x.new <- x[,c('time', names(sort(x.mean, decreasing=TRUE)))]
>
> x.new
   time met.c met.a met.b
1 00:00    99    42    18
2 00:05    67    88    16
3 00:10    84    80    27
>
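
The same reordering can also be written with 'order' on the column
positions instead of the names.  A sketch, using the sample data from the
question:

```r
x <- read.table(text = "time met-a met-b met-c
00:00 42 18 99
00:05 88 16 67
00:10 80 27 84", header = TRUE)

# positions of the metric columns, largest mean first
ord <- order(colMeans(x[-1]), decreasing = TRUE)
# keep 'time' (column 1) on the left; shift the others by one
x.new <- x[, c(1, ord + 1)]
x.new
```

Note that read.table turns 'met-a' into 'met.a' because of check.names.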


On 9/4/07, Lynn Osburn <[EMAIL PROTECTED]> wrote:
>
> I read from external data source containing several columns.  Each column
> represents value of a metric.  The columns are time series data.
>
> I want to sort the resulting dataframe such that the column with the largest
> mean is the leftmost column, descending in colMean values to the right.
>
> I see many solutions for sorting rows based on some column characteristic,
> but haven't found any discussion of sorting columns based on column
> characteristics.
>
> viz.  input data looks like this
>  time   met-a   met-b   met-c
> 00:00   42      18      99
> 00:05   88      16      67
> 00:10   80      27      84
>
> desired output:
>  time   met-c   met-a   met-b
> 00:00   99      42      18
> 00:05   67      88      16
> 00:10   84      80      27
>
> Thanks,
> -Lynn
>
> --
> View this message in context: 
> http://www.nabble.com/Howto-sort-dataframe-columns-by-colMeans-tf4380044.html#a12485729
> Sent from the R help mailing list archive at Nabble.com.
>

Re: [R] Confusion using "functions to access the function call stack" example section

2007-09-04 Thread jim holtman

It is because you have a recursive function call, and the value of 'y'
at the point where it is printed is 0.  I have added another statement
that might help clarify what you are seeing.  At the point at which the
function 'ggg' is evaluated (in the last call to 'gg'), the value of 'y'
is zero and you are 5 levels down from the 'main frame':

> gg <- function(y) {
+    cat ("gg y=", y, "current frame =", sys.nframe(), "\n")
+    ggg <- function() {
+        cat("y = ", y, "\n")
+        cat("current frame is ", sys.nframe(), "\n")
+        cat("parents are ", sys.parents(), "\n")
+        print(sys.function(0)) # ggg
+        print(sys.function(2)) # gg
+    }
+
+    if (y > 0) gg(y-1) else ggg()
+ }
>
> gg(3)
gg y= 3 current frame = 1
gg y= 2 current frame = 2
gg y= 1 current frame = 3
gg y= 0 current frame = 4
y =  0
current frame is  5
parents are  0 1 2 3 4
function() {
   cat("y = ", y, "\n")
   cat("current frame is ", sys.nframe(), "\n")
   cat("parents are ", sys.parents(), "\n")
   print(sys.function(0)) # ggg
   print(sys.function(2)) # gg
   }

function(y) {
   cat ("gg y=", y, "current frame =", sys.nframe(), "\n")
   ggg <- function() {
   cat("y = ", y, "\n")
   cat("current frame is ", sys.nframe(), "\n")
   cat("parents are ", sys.parents(), "\n")
   print(sys.function(0)) # ggg
   print(sys.function(2)) # gg
   }

   if (y > 0) gg(y-1) else ggg()
}
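
A stripped-down sketch of the same recursion that returns the values
instead of printing them.  Measuring the depth relative to the calling
level is an addition made here, so the result does not depend on whether
the code is typed at the prompt or run through source() or example().

```r
gg <- function(y) {
    # ggg sees the 'y' of the gg call that created it
    ggg <- function() list(y = y, frame = sys.nframe())
    if (y > 0) gg(y - 1) else ggg()
}

top <- sys.nframe()   # frame number of the level we are calling from
res <- gg(3)
res$y                 # y in the innermost gg call
res$frame - top       # four gg calls plus ggg itself
```

Running the original example through example() adds wrapper frames in
front, which would explain a current frame of 8 rather than 5.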


On 9/4/07, Leeds, Mark (IED) <[EMAIL PROTECTED]> wrote:
> I was going through the example below which is taken from the example
> section in the R documentation for accessing the function call stack.
> I am confused and I have 3 questions that I was hoping someone could
> answer.
>
> 1) why is y equal to zero even though the call was done with gg(3)
>
> 2) what does parents are 0,1,2,0,4,5,6,7 mean ? I understand what a
> parent frame is but how do the #'s relate to this
> particular example ? Why is the current frame # 8 ?
>
> 3) it says that sys.function(2) should be gg but I would think that
> sys.function(1) would be gg since it's one up from where
> the call is being made.
>
> Thanks a lot. If the answers are too complicated and someone knows of a
> good reference that goes into more details about
> the sys functions, that's appreciated also.
>
>
>
>
> gg <- function(y) {
>    ggg <- function() {
>        cat("y = ", y, "\n")
>        cat("current frame is ", sys.nframe(), "\n")
>        cat("parents are ", sys.parents(), "\n")
>        print(sys.function(0)) # ggg
>        print(sys.function(2)) # gg
>    }
>
>    if (y > 0) gg(y-1) else ggg()
> }
>
> gg(3)
>
>
>
> # OUTPUT
>
>
> y =  0
> current frame is  8
> parents are  0 1 2 0 4 5 6 7
> function() {
>cat("y = ", y, "\n")
>cat("current frame is ", sys.nframe(), "\n")
>cat("parents are ", sys.parents(), "\n")
>print(sys.function(0)) # ggg
>print(sys.function(2)) # gg
>}
> 
> function (expr, envir = parent.frame(), enclos = if (is.list(envir) ||
>is.pairlist(envir)) parent.frame() else baseenv())
> .Internal(eval.with.vis(expr, envir, enclos))
> 
> 
>

Re: [R] how to calculate mean into a list

2007-08-28 Thread jim holtman

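try the following, which gives the per-column means across all of the
data frames ('colMeans' is used on each data frame because 'mean' no
longer works column-wise on a data.frame in current R):

colMeans(do.call('rbind', lapply(a0, colMeans)))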
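
If the element-by-element average written in the question is what is
wanted, a sketch with 'Reduce' may be closer.  This is a swapped-in
technique, and the two small data frames are made up; it assumes all the
data frames have the same shape and only numeric columns.

```r
# hypothetical list of equally shaped numeric data frames
a0 <- list(data.frame(a = 1:2, b = 3:4),
           data.frame(a = 3:4, b = 5:6))

# (a0[[1]] + a0[[2]] + ...) / number of data frames
avg <- Reduce(`+`, a0) / length(a0)
avg
```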


On 8/28/07, Weiwei Shi <[EMAIL PROTECTED]> wrote:
> Dear Listers:
>
> I have this task and suppose a0 is a list of 10 data.frames, I want to
> calculate like this
> > (a0[[1]]+a0[[2]]+..+a[[10]])/10
>
> Thanks.
>
> --
> Weiwei Shi, Ph.D
> Research Scientist
> GeneGO, Inc.
>
> "Did you always know?"
> "No, I did not. But I believed..."
> ---Matrix III
>

Re: [R] alternate methods to perform a calculation

2007-08-28 Thread jim holtman

I think you can use 'outer' (with the data frame names from your post,
'x' and 'xk'):

outer(xk$xk1, x$x1, function(y,z)abs(z-y))
outer(xk$xk2, x$x2, function(y,z)abs(z-y))
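
A quick sketch that checks 'outer' against the FOR loop from the
question.  The data frames are rebuilt here from the values in the post;
using expand.grid to generate 'xk' is a convenience assumed for the
example.

```r
x <- data.frame(x1 = 1:3, x2 = 1:3)
xk <- expand.grid(xk1 = c(0.5, 1, 1.5, 2), xk2 = c(0.5, 1, 1.5, 2))

# loop version from the question
w1.loop <- array(NA, dim = c(nrow(xk), length(x$x1)))
for (j in 1:nrow(xk)) w1.loop[j, ] <- abs(x$x1 - xk$xk1[j])

# loop-free version: rows follow xk, columns follow x
w1 <- outer(xk$xk1, x$x1, function(y, z) abs(z - y))
all.equal(w1, w1.loop)
```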

On 8/28/07, dxc13 <[EMAIL PROTECTED]> wrote:
>
> Consider a data frame (x) with 2 variables, x1 and x2, having equal values.
> It looks like:
>
> x1   x2
> 1    1
> 2    2
> 3    3
>
> Now, consider a second data frame (xk):
> xk1   xk2
> 0.5   0.5
> 1.0   0.5
> 1.5   0.5
> 2.0   0.5
> 0.5   1
> 1.0   1
> 1.5   1
> 2.0   1
> 0.5   1.5
> 1.0   1.5
> 1.5   1.5
> 2.0   1.5
> 0.5   2
> 1.0   2
> 1.5   2
> 2.0   2
> I have written code to calculate some differences between these two data
> sets; the main idea is to subtract off each element of xk1 from each value
> of x1, and similarly for xk2 and x2.  This is what I have:
>
> w1 <- array(NA,dim=c(nrow(xk),length(x$x1)))
> w2 <- array(NA,dim=c(nrow(xk),length(x$x2)))
> for (j in 1:nrow(xk)) {
>w1[j,] <- abs(x$x1-xk$xk1[j])
>w2[j,] <- abs(x$x2-xk$xk2[j])
> }
>
> Is there  a way to do the above calculation without use of a FOR loop?
> Thank you
>
> Derek
>
>
> --
> View this message in context: 
> http://www.nabble.com/alternate-methods-to-perform-a-calculation-tf4344469.html#a12376906
> Sent from the R help mailing list archive at Nabble.com.
>

Re: [R] data formatting: from rows to columns

2007-08-28 Thread jim holtman

Here is a way using sprintf:

x <- read.table(textConnection(" V2  V3
27  2032567  19
28  2035482  19
126 2472826  19
132 2473320  19
136 2035480 135
145 2062458 135
148 2074927 135
151 2102395 142
156 2027252 142
158 2473082 142"))

# output the data
cat(sprintf("%d\n%d\n\n", x$V2, x$V3), sep='', file='tempxx.txt')
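
Another way to get the same layout, sketched as an alternative: interleave
the two columns and an empty string with 'rbind', then write the result
line by line.  The two-row data frame and the temporary file are made up
for the example.

```r
# hypothetical two-row version of the data
x <- data.frame(V2 = c(2032567L, 2035482L), V3 = c(19L, 19L))

# rbind stacks V2, V3 and "" as rows; c() then reads them off column by column
out <- c(rbind(x$V2, x$V3, ""))
out.file <- tempfile()
writeLines(out, out.file)
readLines(out.file)
```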


On 8/28/07, Federico Calboli <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I have some data I need to write as a file from R to use in a different 
> program.
> My data comes as a numeric matrix of n rows and 2 columns. I need to transform
> each row into a two-row, one-column output, and separate the output of each row
> with a blank line.
>
> For instance I need to go from this:
>
>  V2  V3
> 27  2032567  19
> 28  2035482  19
> 126 2472826  19
> 132 2473320  19
> 136 2035480 135
> 145 2062458 135
> 148 2074927 135
> 151 2102395 142
> 156 2027252 142
> 158 2473082 142
>
> to
>
> 2032567
> 19
>
> 2035482
> 19
>
> 2472826
> 19
>
> 2473320
> 19
>
> 2035480
> 135
>
> ...
>
> Any hint? I seem a bit stuck. cat(unlist(data), file ='data.txt', sep = '\n')
> (obviously) does not work...
>
> Cheers,
>
> Fede
>
>
>
>
>
>
> --
> Federico C. F. Calboli
> Department of Epidemiology and Public Health
> Imperial College, St Mary's Campus
> Norfolk Place, London W2 1PG
>
> Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193
>
> f.calboli [.a.t] imperial.ac.uk
> f.calboli [.a.t] gmail.com
>

Re: [R] subset question

2007-08-27 Thread jim holtman

Here is one way of checking to see if a row contains a particular
value and setting the contents of a new column:

n <- 20
# create test data
x <- 
data.frame(sample(letters,n),sample(letters,n),sample(letters,n),sample(letters,n))
# add a column indicating if the row contains 'a', 'b' or 'c'
x$a <- apply(x[, 1:4], 1, function(.row) any(.row %in% c('a','b','c'))) + 0
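
For a single code, a vectorised sketch with 'rowSums' avoids the apply
call.  The three-column data frame and the column range are made up; in
the real data the range would be 9:67.

```r
# hypothetical data; the code of interest is "12345"
dat <- data.frame(c1 = c("12345", "x", "y"),
                  c2 = c("a", "b", "12345"),
                  c3 = c("q", "r", "s"))

# compare every cell to the code, then flag rows with at least one match
dat$flag <- as.integer(rowSums(dat[, 1:3] == "12345") > 0)
dat
```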


On 8/27/07, Kirsten Beyer <[EMAIL PROTECTED]> wrote:
> I would like to code records in a dataset with a 1 if any of the
> columns 9-67 contain a particular code, and zero if they don't.  I've
> been working with "subset" and it seems that something like
> subset(data, data[9:67]--"12345") would work, but I have been
> unsuccessful so far.  It seems like a simple problem - any help is
> appreciated!
>

Re: [R] fill circles

2007-08-25 Thread jim holtman

Here is a function that will generate a color sequence for an input
vector.  You can specify the colors to use, the range and the number
of color steps:

# specify the colors and the number of increments you want for a specified
# range.  It will return the colors for the input vector
f.color <-
function(input, # input vector
 colors=c('green','yellow','red'),  # desired colors
 input.range=c(0,0.01),  # range of input to create colors
 input.steps=10)  # number of increments
{
myColors <- colorRampPalette(colors)(input.steps)  # generate colors
myColors[cut(input, seq(input.range[1], input.range[2],
length=input.steps+1),
labels=FALSE, include.lowest=TRUE)]
}
# generate a legend to show colors
plot.new()  # create blank plot
x <- round(runif(15), 3)
legend('topleft', legend=x, fill=f.color(x, input.range=c(0,1)))
legend('topright', legend=x, fill=f.color(x, input.range=c(0,1),
colors=c('purple','red','blue','orange')))
legend('top', legend=x, fill=f.color(x, input.range=c(0,1),
colors=c('red','yellow','green')))

So you should be able to use something like this.
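
Applied to the 'trees' data from the question, the idea might look like the sketch below.  'f.color' is repeated from above so the example runs on its own; the output file name is an assumption, and writing to a PDF in the temporary directory is just one way to get a device.

```r
# f.color as defined above, repeated so this sketch is self-contained
f.color <- function(input, colors = c('green', 'yellow', 'red'),
                    input.range = c(0, 0.01), input.steps = 10) {
    myColors <- colorRampPalette(colors)(input.steps)
    myColors[cut(input, seq(input.range[1], input.range[2],
                            length.out = input.steps + 1),
                 labels = FALSE, include.lowest = TRUE)]
}

# one fill color per tree, scaled over the observed range of Height
cols <- f.color(trees$Height, input.range = range(trees$Height))

out <- file.path(tempdir(), "trees-circles.pdf")   # hypothetical output file
pdf(out)
with(trees, symbols(Height, Volume, circles = Girth / 12, inches = FALSE,
                    bg = cols, fg = "grey", xlab = "Height", ylab = "Volume"))
dev.off()
```

The 'bg' argument of 'symbols' sets the fill, so small heights come out green and large ones red.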

On 8/25/07, Cristian cristian <[EMAIL PROTECTED]> wrote:
> Hi all,
> I'm an R newbie,
> I did this script to create a scatterplot using the "tree" matrix from
> "datasets" package:
>
> library('datasets')
> with(trees,
> {
> plot(Height, Volume, pch=3, xlab="Height", ylab="Volume")
> symbols(Height, Volume, circles=Girth/12, fg="grey", inches=FALSE,
> add=FALSE)
> }
> )
>
> I'd like to use the column Named "Height" to fill the circles with colors
> (ex.: the small numbers in green then yellow and the high numbers in red).
> I'd like to have a legend for the size and the colors too.
> I did it manually using a script like that:
> color[(x>=0.001)&(x<0.002)]<-"#41FF41"
> color[(x>=0.002)&(x<0.003)]<-"#2BFF2B"
> color[(x>=0.003)&(x<0.004)]<-"#09FF09"
> color[(x>=0.004)&(x<0.005)]<-"#00FE00"
> color[(x>=0.005)&(x<0.006)]<-"#00F700"
> color[(x>=0.006)&(x<0.007)]<-"#00E400"
> color[(x>=0.007)&(x<0.008)]<-"#00D600"
> color[(x>=0.008)&(x<0.009)]<-"#00C300" and so on but I don't like to do it
> manually... do know a solution...
> Thank you very much
> chris

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] as.numeric : what goes wrong?

2007-08-24 Thread jim holtman

Do an 'str' on the vector.  Are you sure it is not a 'factor'?

Try:

as.numeric(as.character(j1[1]))
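
To see why this matters, here is a small sketch with a made-up factor ('j1' below is hypothetical, not the poster's data).  Calling 'as.numeric' directly on a factor returns the internal level codes rather than the numbers spelled out by the labels.

```r
# hypothetical factor whose labels look like numbers
j1 <- factor(c("896", "1990", "12"))
as.numeric(j1)                  # internal level codes, not the labels
as.numeric(as.character(j1))    # the numbers the labels spell out
as.numeric(levels(j1))[j1]      # equivalent conversion via the levels
```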

On 8/24/07, Wolfgang Polasek <[EMAIL PROTECTED]> wrote:
> I have a character vector j1 created from dimnames and want to convert it
> to numeric.
> Like the first element:
>
> > j1[1]
>  f896
> 1  896
>
> > as.numeric(j1[1])
> [1] 1990
>
> why is it not 896 as it should be?
> This is true for the whole vector.
>
> Thanks
> W.P.

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Need a variant of rbind for datasets with different numbers of columns

2007-08-22 Thread jim holtman

Where is the data coming from since it has a variable number of
columns in each row?  Is it coming from a text file?  If so, you can
use the "fill=TRUE" option when reading to fill out empty columns.
You need to provide at least a subset of the data so we can see what
you are working with.
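
If the data does come from a text file, the 'fill=TRUE' approach might look like this sketch.  The patient labels and codes are invented, and the input is given inline through the 'text' argument of 'read.table' instead of a file name.

```r
# hypothetical ragged input: each line has a different number of codes
txt <- "A 12345 6789 1543
B 3469 9090
C 1234"
# fill=TRUE pads short rows with NA; col.names fixes the number of columns
visits <- read.table(text = txt, fill = TRUE,
                     col.names = c("patient", "ICD91", "ICD92", "ICD93"))
visits
```

Supplying 'col.names' for the widest row is a deliberate choice here, since 'read.table' otherwise guesses the column count from the first few lines only.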

On 8/22/07, Kirsten Beyer <[EMAIL PROTECTED]> wrote:
> Hello.  I am looking for a function that will allow me to paste rows
> together without regard for the numbers of columns in the datasets to
> be joined.  The only columns where it matters if they are aligned
> correctly are at the beginning - the rest of the columns represent
> differing numbers of ICD9 (disease) codes reported by each
> person(record) at a health visit.  They are in no particular order.
>
> For example, a result would look like this:
>
> patient  ICD91  ICD92  ICD93
> patient A   12345  67891543
> patient B3469   9090
> patient C   1234
>
> I am trying to accomplish this inside a loop which first identifies
> the codes associated with the person and then joins them to the
> person.  I have the code working so that it can create a row for each
> person, but I can't figure out how to join these rows together!  FYI,
> my dataset has 200,000+ people.
>
> Thanks

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] rectify a program of seasonal dummies matrix

2007-08-21 Thread jim holtman

Your syntax is wrong; e.g.,
if i==j

should be

if (i == j)

same with your use of 'if else'.  You need to use the correct syntax.
Your example is hard to follow without the correct indentation since
you are using the incorrect syntax.
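
For reference, one possible corrected version of the loop is sketched below, together with a vectorized alternative.  The variable 'n' replaces 'T' from the question (an assumption made here because 'T' is also shorthand for TRUE), and the two branches of the original are merged since (i - j) %% 4 == 0 already covers i == j.

```r
n <- 100                      # number of observations
br <- matrix(0, n, 4)
for (i in 1:n) {
    for (j in 1:4) {
        # row i belongs to season j when i and j differ by a multiple of 4
        if ((i - j) %% 4 == 0) br[i, j] <- 1
    }
}

# the same matrix without loops: recycle the rows of a 4x4 identity matrix
br2 <- diag(4)[rep(1:4, length.out = n), ]
head(br, 8)
```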

On 8/21/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Hi friends,
> I would like to construct a matrix of seasonal dummies with number of rows 
> (observations)=100. such matrix is written as follows:[1 0 0 0;0 1 0 0;0 0 1 
> 0;0 0 0 1;1 0 0 0;0 1 0 0;0 0 1 0;0 0 0 1;etc...] . I wrote the following 
> program:
> T=100
> br=matrix(0,T,4)
> {
> for (i in 1:T)
> for (j in 1:4)
> if i==j
> br[i,j]=1
> if else (abs(i-j)%%4==0
> br[i,j]=1
> else
> br[i,j]=0
> }
> z<-br
> z
>
> but unfortunately I obtained from the console the following message:
> > {
> + for (i in 1:T)
> +  for (j in 1:4)
> + (if i==j)
> Erreur : syntax error, unexpected SYMBOL, expecting '(' dans :
> "
> "
> > br[i,j]=1
> Erreur dans br[i, j] = 1 : objet "i" non trouvé
> >
> > (if else (abs(i-j)%%4==0)
> Erreur : syntax error, unexpected ELSE, expecting '(' dans "(if else"
> > br[i,j]=1
> Erreur dans br[i, j] = 1 : objet "i" non trouvé
> > else
> Erreur : syntax error, unexpected ELSE dans "else"
> > br[i,j]=0
> Erreur dans br[i, j] = 0 : objet "i" non trouvé
> >   }
> Erreur : syntax error, unexpected '}' dans "  }"
> >
> Can you please rectify my small program? I tried to rectify it but I can't. 
> Many thanks in advance.


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] How to parse a string into the symbol for a data frame object

2007-08-19 Thread jim holtman

One way to do it is to pass in the character name of the dataframe you
want to reference and then use 'get' to access the value: e.g.,

df1 <- data.frame(x=seq(0,10), y=seq(10,20))
df2 <- data.frame(a=seq(0,10), b=seq(10,20))
# use the character names for referencing
for (df in c('df1', 'df2')){
# get the data to operate on (read-only)
.val <- get(df)
# now you can reference the object
print(names(.val))
# or construct new objects to store the value in
# or you can use 'assign' to store back in the original object
assign(paste('temp.', df, sep=''), .val)
}
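
Two variations on the loop above, as a sketch only: keeping the data frames in a named list so the name is available inside the loop, and building the name with 'paste' before calling 'get'.  The objects 'dfs', 'col.names' and 'sizes' are names chosen for this example.

```r
df1 <- data.frame(x = seq(0, 10), y = seq(10, 20))
df2 <- data.frame(a = seq(0, 10), b = seq(10, 20))

# a named list keeps the name and the data together
dfs <- list(df1 = df1, df2 = df2)
col.names <- list()
for (nm in names(dfs)) {
    cat("processing", nm, "\n")
    col.names[[nm]] <- names(dfs[[nm]])
}

# build the object name as a string, then fetch the object with get()
sizes <- sapply(1:2, function(n) nrow(get(paste('df', n, sep = ''))))
sizes
```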


On 8/19/07, Darren Weber <[EMAIL PROTECTED]> wrote:
>  I have several data frames, eg:
>
> > df1 <- data.frame(x=seq(0,10), y=seq(10,20))
> > df2 <- data.frame(a=seq(0,10), b=seq(10,20))
>
> It is common to create loops in R like this:
>
> > for(df in list(df1, df2)){ #etc. }
>
> This works fine when you know the name of the objects to put into the
> list.  I assume that the order of the objects in the list is respected
> through the loop.  Inside the loop, the objects of the list are
> 'dereferenced' using 'df' but, to my knowledge, there is no way to
> tell whether 'df' is a current representation of 'df1' or 'df2'
> without some additional book keeping.
>
> In addition, I really want to use 'paste' within the loop to create a
> new string value that will have the symbol name of a data frame to be
> "dereferenced," e.g.:
>
> > for(n in c(1, 2)){ dfString <- paste('df', n, sep=""); 
> > print(eval(dfString)) }
>
> [1] "df1"
> [1] "df2"
>
> This is not what I want.  I have read through the documentation on
> eval and similar commands like substitute and quote.  I program
> regularly, but I do not understand these constructs in R.  I do not
> understand the R framework for parsing and evaluation and I don't have
> a lot of time right now to get lost in this detail.  I could really
> use some help to get the string values in my loop to be parsed into
> symbols that refer to the data frame objects df1 and df2.  How is this
> done?
>
> Best, Darren


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] matching elements from two vectors

2007-08-17 Thread jim holtman

Also if you want all the matches

> x[x %in% y]
[1] 2 3 3 3
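
The logical vector asked for in the question comes from '%in%' on its own, and the same expression works on a data frame column.  A short sketch using the vectors from the question (the data frame 'd' is made up for the example):

```r
x <- c(1, 2, 1, 1, 3, 5, 3, 3, 1)
y <- c(2, 3)
hit <- x %in% y          # TRUE where an element of x appears in y
which(hit)               # positions of the matches

# same idea on a data frame column
d <- data.frame(x = x)
d$hit <- d$x %in% y
d[d$hit, ]
```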


On 8/17/07, Gonçalo Ferraz <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Imagine a vector x with elements (1,2,1,1,3,5,3,3,1) and a vector y with
> elements (2,3). I need to find out what elements of x match any of the
> elements of y.
>
> Is there a simple command that will return a vector with elements
> (F,T,F,F,T,F,T,T,F)? Ideally, I would like a solution that works with
> dataframe columns as well.
>
> I have tried x==y and it doesn't work.
> x==any(y) doesn't work either. I realize I could write a for loop and go
> through each element of y asking if it matches any element of x, but isn't
> there a shorter way?
>
> Thanks,
> Gonçalo


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] matching elements from two vectors

2007-08-17 Thread jim holtman

> x <- c(1,2,1,1,3,5,3,3,1)
> y <- c(2,3)
> intersect(x,y)
[1] 2 3

On 8/17/07, Gonçalo Ferraz <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Imagine a vector x with elements (1,2,1,1,3,5,3,3,1) and a vector y with
> elements (2,3). I need to find out what elements of x match any of the
> elements of y.
>
> Is there a simple command that will return a vector with elements
> (F,T,F,F,T,F,T,T,F)? Ideally, I would like a solution that works with
> dataframe columns as well.
>
> I have tried x==y and it doesn't work.
> x==any(y) doesn't work either. I realize I could write a for loop and go
> through each element of y asking if it matches any element of x, but isn't
> there a shorter way?
>
> Thanks,
> Gonçalo


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] for plots

2007-08-16 Thread jim holtman

Turn 'Recording' on for the plots.

windows(record=TRUE)

or select from the GUI.

On 8/17/07, Brad Zhang <[EMAIL PROTECTED]> wrote:
> Hi, All,
>
> I am a beginner for R. Now I have installed R 2.5.1 in Window
> environment. After I run a program such as "gam" I would like to display
> a plot for the object. The following is an example. When I did this,
> only the last plot was presented on my screen. How can I get a plot
> before the last plot? I mean if the object has several plots how can I
> get those?
>
>
>
> "gam.object <- gam(y ~ s(x,6) + z,data=gam.data)
>
> plot(gam.object,se=TRUE)"
>
>
>
>
>
> Thank you.
>
>
>
> Brad.
>
>
> Dr. Guicheng (Brad) Zhang
> Senior Research Officer
>
> School of Paediatrics and Child Health
> Telethon Institute for Child Health Research
> 100 Roberts Road, Subiaco
> Western Australia, 6008 AUSTRALIA
>
> Email: [EMAIL PROTECTED]
> Phone: 93407896
> Fax: 93882097


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] comparison of arrays of strings

2007-08-16 Thread jim holtman

Read them into 2 different vectors and then use 'intersect'.
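
A sketch of what that could look like is below.  The gene names are invented and the two files are simulated inline with the 'text' argument of 'read.csv'.  Reading with stringsAsFactors=FALSE keeps the names as character, which avoids the "level sets of factors are different" error from comparing factors.

```r
# hypothetical stand-ins for the two .csv files
small <- read.csv(text = "gene\nTP53\nBRCA1\nEGFR", stringsAsFactors = FALSE)
large <- read.csv(text = "gene\nMYC\nEGFR\nKRAS\nTP53", stringsAsFactors = FALSE)

# names present in both files
common <- intersect(small$gene, large$gene)
common
```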

On 8/17/07, ramakanth reddy <[EMAIL PROTECTED]> wrote:
> Hi
>
> i have two arrays of genes names,one with18 gene names and the other with 
> 24000 gene names,I have to compare both of them for finding common names.
>
> I have both the arrays in .csv format.i loaded the files and tried to compare 
> them using for and if loops
>
> but I got the error Error in Ops.factor(cgh[i, 1], cgh[j, 2]) :
>level sets of factors are different
>
> Please suggest me how to solve this problem or any other alternative procedure
>
> Thanks
> ramakanth


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] invert 160000x160000 matrix

2007-08-13 Thread jim holtman

You would need about 200GB to store a single copy of the matrix, so if
you have about 1TB of physical memory on your computer, it might be possible.
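
The storage figure follows from the matrix size given in the subject line, assuming double-precision values at 8 bytes each.  As a quick check:

```r
n <- 160000                # rows and columns, from the subject line
bytes <- n^2 * 8           # one double takes 8 bytes
bytes / 1e9                # size in GB (decimal)
bytes / 2^30               # size in GiB (binary)
```

An inversion typically needs working copies in addition to the input, which is why considerably more than one copy's worth of memory would be required.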

On 8/13/07, Jiao Yang <[EMAIL PROTECTED]> wrote:
> Can R invert a 160000x160000 matrix with all positive numbers?  Thanks a lot!

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Extract part of vector

2007-08-13 Thread jim holtman

This should do it:

> txt
[1] "\nhttp://www.mysite.com/system/empty.asp?P=2&VID=default&SID=421384237289476&S=1&C=18631"
[2] "\nhttp://www.mysite.com/system/empty.asp?P=123&VID=default&SID=421384237289476&S=1&C=18643"
[3] "\nhttp://www.mysite.com/system/empty.asp?P=342&VID=default&SID=421384237289476&S=1&C=18634\n"
[4] "\nhttp://www.mysite.com/system/empty.asp?P=232&VID=default&SID=421384237289476&S=1&C=18645"
[5] "\nhttp://www.mysite.com/system/empty.asp?P=2345&VID=default&SID=421384237289476&S=1&C=18254"
[6] "\nhttp://www.mysite.com/system/empty.asp?P=257654&VID=default&SID=421384237289476&S=1&C=18732"
[7] "\nhttp://www.mysite.com/system/empty.asp?P=22&VID=default&SID=421384237289476&S=1&C=18637"
[8] "\nhttp://www.mysite.com/system/empty.asp?P=2463&VID=default&SID=421384237289476&S=1&C=18575\n"

> gsub("^.*asp.P=([[:digit:]]+).*$", '\\1', txt)
[1] "2"      "123"    "342"    "232"    "2345"   "257654" "22"     "2463"
>
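
An alternative sketch that extracts the digits directly instead of replacing the rest of the string uses 'regexpr' with 'regmatches'.  It assumes Perl-style lookarounds (perl=TRUE) and uses only the first three strings from the question.

```r
txt <- c("\nhttp://www.mysite.com/system/empty.asp?P=2&VID=default&SID=421384237289476&S=1&C=18631",
         "\nhttp://www.mysite.com/system/empty.asp?P=123&VID=default&SID=421384237289476&S=1&C=18643",
         "\nhttp://www.mysite.com/system/empty.asp?P=342&VID=default&SID=421384237289476&S=1&C=18634\n")
# digits preceded by "asp?P=" and followed by "&VID"
m <- regexpr("(?<=asp[?]P=)[0-9]+(?=&VID)", txt, perl = TRUE)
ids <- as.numeric(regmatches(txt, m))
ids
```

Because only the matched digits are pulled out, the leading and trailing newlines in the strings do not matter.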


On 8/13/07, Lauri Nikkinen <[EMAIL PROTECTED]> wrote:
> Dear R-users,
>
> How do I extract numbers between asp?P= and &VID from my txt vector? I have
> tried grep function with no luck.
>
> txt <- c("
> http://www.mysite.com/system/empty.asp?P=2&VID=default&SID=421384237289476&S=1&C=18631",
> "
> http://www.mysite.com/system/empty.asp?P=123&VID=default&SID=421384237289476&S=1&C=18643",
> "
> http://www.mysite.com/system/empty.asp?P=342&VID=default&SID=421384237289476&S=1&C=18634
> ","
> http://www.mysite.com/system/empty.asp?P=232&VID=default&SID=421384237289476&S=1&C=18645",
> "
> http://www.mysite.com/system/empty.asp?P=2345&VID=default&SID=421384237289476&S=1&C=18254",
> "
> http://www.mysite.com/system/empty.asp?P=257654&VID=default&SID=421384237289476&S=1&C=18732",
> "
> http://www.mysite.com/system/empty.asp?P=22&VID=default&SID=421384237289476&S=1&C=18637",
> "
> http://www.mysite.com/system/empty.asp?P=2463&VID=default&SID=421384237289476&S=1&C=18575
> ")
>
> The result should be like
> 2
> 123
> 342
> 232
> 2345
> 257654
> 22
> 2463
>
> Thanks,
> Lauri


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] How to write to a table column by column?

2007-08-13 Thread jim holtman

Assuming that the daily.incomes are the same lengths, then your loop could be:

Lst <- list()
for (i in 1:count) Lst[[i]] <- list(..)
Lst.col <- do.call('cbind', Lst)
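
A fuller sketch of the same idea, ending with the CSV file.  It assumes that only the numeric 'daily.incomes' part of each result is kept, that every run yields 850 values, and that the values, column names and temporary file are placeholders.

```r
count <- 3                               # hypothetical number of loop iterations
Lst <- vector("list", count)
for (i in seq_len(count)) {
    Lst[[i]] <- (1:850) * i              # stand-in for one run's daily.incomes
}
names(Lst) <- paste("run", seq_len(count), sep = "")

# bind the equal-length vectors side by side, one column per run
Lst.col <- do.call('cbind', Lst)

out <- tempfile(fileext = ".csv")        # hypothetical output file
write.csv(Lst.col, file = out, row.names = FALSE)
dim(Lst.col)
```

Collecting everything first and writing once avoids the 'append' problem, since appending to a text file can only add rows.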

On 8/12/07, Yuchen Luo <[EMAIL PROTECTED]> wrote:
> Dear friends.
> Every loop of my program will result in a list that is very long, with a
> structure similar to the one below:
>
> Lst <- list(name="Fred", wife="Mary", daily.incomes=c(1:850))
>
> Please notice the large size of "daily.incomes".
>
> I need to store all such lists in a csv file so that I can easily view them
> in Excel. Excel cannot display a row of more than 300 elements, therefore, I
> have to store the lists as columns. It is not hard to store one list as a
> column in the csv file. The problem is how to store the second list as a
> second column, so that the two columns will lie side by side to each other
> and I can easily compare their elements. ( If I use 'append=TRUE', the
> second time series will be stored in the same column. )
>
> Thank you for your time and your help will be highly appreciated!!
>
> Best
>
> Yuchen Luo


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] How to control the number format on plot axes ?

2007-08-12 Thread jim holtman

Here is a way that you can put the formatting that you want; you were
not clear on exactly what you were after.  You can setup the 'labels'
argument for whatever you want.

a<-1:10
myTicks<-c(0.1,1,2,5,10)
# set ylim to range of myTicks that you want
plot(x=a,y=a,log="y",type="p",yaxt="n", ylim=range(myTicks))
# change the sprintf to whatever formatting you want
axis(side=2,at=myTicks,
labels=ifelse(myTicks >= 1, sprintf("%.0f", myTicks),
sprintf("%0.1f", myTicks)))
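
If the goal is simply to drop the unnecessary trailing zeros, a shorter variant is to convert each tick value to text with 'as.character'.  This is a sketch under that assumption; the PDF file in the temporary directory is only there to give the example a device.

```r
a <- 1:10
myTicks <- c(0.1, 1, 2, 5, 10)
# each value is converted on its own, so 1 stays "1" and 0.1 stays "0.1"
tick.labels <- as.character(myTicks)

out <- file.path(tempdir(), "log-axis.pdf")    # hypothetical output file
pdf(out)
plot(x = a, y = a, log = "y", type = "p", yaxt = "n", ylim = range(myTicks))
axis(side = 2, at = myTicks, labels = tick.labels)
dev.off()
```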





On 8/12/07, Sébastien <[EMAIL PROTECTED]> wrote:
> Dear R-users,
>
> Basically, everything is in the title of my e-mail. I know that some
> threads from the archives have already addressed this question but they
> did not really give a clear solution.
> Here is a series of short codes that will illustrate the problem:
>
> # First
> a<-1:10
> plot(x=a,y=a,log="y",type="p")
>
> # Second
> a<-1:10
> myTicks<-c(1,2,5,10)
> plot(x=a,y=a,log="y",type="p",yaxt="n")
> axis(side=2,at=myTicks)
>
> # Third
> a<-1:10
> myTicks<-c(0.1,1,2,5,10)
> plot(x=a,y=a,log="y",type="p",yaxt="n")
> axis(side=2,at=myTicks)
>
> # Fourth
> a<-0.1:10
> plot(x=a,y=a,log="y",type="p")
>
> In the first and second examples, the plots are identical and the tick
> labels are 1, 2, 5 and 10. In the third, the labels are numbers in the
> x.0 format (1.0, 2.0, 5.0 and 10.0), even if there is no point below 1.
> The only reason I see is because the first element of myTicks is 0.1.
> And, the fourth example is self-explanatory.
> Interestingly, the 'scales' argument of xyplot in the lattice package do
> not add these (unnecessary) decimals on labels greater than 1.
>
> Do you know how I could transpose the behavior of the lattice 'scales'
> argument to the 'axis' function ?
>
> Thank you
>
> PS: No offense, but please don't suggest I use lattice. I have to go for
> base R graphics in my full-scale project (it is a speed issue).


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Legend on graph

2007-08-12 Thread jim holtman

If you are asking to have the values plotted on top of the legend,
then you can do the following:

plot(x, y, type='n', ...) # create plot, but don't plot
legend('topright', ...)
lines(x,y)  # now plot the data

If you want it outside the plot, check the archives for several examples.
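
The empty-plot, legend, then lines sequence described above can be sketched with made-up data as follows.  The extra headroom in 'ylim' is an addition for this example, meant to leave space for the legend, and the PDF in the temporary directory is only a convenient device.

```r
x <- 1:20
y <- cumsum(rep(c(1, -0.5), 10))               # hypothetical series

out <- file.path(tempdir(), "legend-order.pdf")   # hypothetical output file
pdf(out)
# set up the axes without drawing the data; widen ylim for the legend
plot(x, y, type = 'n', ylim = range(y) * c(1, 1.3))
legend('topright', legend = 'series', lty = 1, col = 'blue', bty = 'n')
lines(x, y, col = 'blue')                      # data drawn last, on top
dev.off()
```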

On 8/12/07, akki <[EMAIL PROTECTED]> wrote:
> Hi,
> I have a problem when I want to put a legend on the graph.
> I do:
>
> legend("topright", names(o), cex=0.9, col=plot_colors,lty=1:5, bty="n")
>
> but the legend is writen into the graph (graphs' top but into the graph),
> because I have values on this position. How can I write the legend on top
> the graph without the legend writes on graph's values.
>
> Thanks.

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Write values on y axe

2007-08-12 Thread jim holtman

Does this do what you want:

x <- runif(10)
plot(x)
# put min/max in red
axis(2, at=round(range(x), 4), col.axis='red', las=2)



On 8/12/07, akki <[EMAIL PROTECTED]> wrote:
> Hi,
> I have values on y axe from 0.0001 to 3.086. When I do plot I have writen
> values: 0.001, 0.050,1.000 ..., but how I can write on graph  the minimum
> value and maximum value, with all decimals (I don't want to use the format
> 1e-0x)? I am using log scale.
>
> For example, if I have the values:
> 0.0001
> 0.0015
> 0.0256
> 0.0236
> 
> 0.0201
> 2.9668
> 3.0086
>
> I need have each 'x' value put on y axe, and add the value minimum and
> maximum on   my graph.
> How can I do it?
>
> I do:
> plot(o$a, log="y", type="l", col=colors[1], xlab="a_x", ylab="a_y",
> cex.lab=0.8)
> lines(o$b, type="l", pch=1, lty=1, col=colors[2])
> lines(o$c, type="l", pch=2, lty=2, col=colors[3])
>
> to I draw my graph.
>
> Thanks in advance.


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] shell and shell.exec on Windows

2007-08-11 Thread jim holtman

If you are using Windows, then try:

system('cmd /c yourfile.xls')

This will invoke the windows command processor and it should pick the
correct association.

On 8/11/07, Erich Neuwirth <[EMAIL PROTECTED]> wrote:
> Thanks Gabor,
> system() indeed would be the answer, but it does not solve my problem
> because of some inconsistencies in WindowsXP.
> I will explain the story, because perhaps it can help somebody else to
> avoid wasting time.
> On my machine, when I doubleclick an .xlsm file, it is opened in Excel
> 2007. .xls files are opened in Excel 2003.
> shell.exec("file.xls") and shell.exec("file.xlsm")
> also open the files in Excel 2003 and Excel 2007 respectively.
>
> system() does not invoke a shell, so I need to find the application
> associated with Excel to create a string with the name
> of the application and the name of the file to open.
> Then, something like
> system("\"c:\\mypath\\CorrectVersionOfExcel.exe\"
> \"c:\\mydir\\myexcelfile.xlsm\"")
> should work (and run the program invisibly)
>
> There are two helpful shell commands in WinXP
> ASSOC and FTYPE
>
> ASSOC .xls
>   .xls=Excel.Sheet.8
>
> ASSOC .xlsm
>   .xlsm=Excel.SheetMacroEnabled.12
>
> ftype Excel.Sheet.8
>  Excel.Sheet.8="C:\Program Files\Microsoft Office\OFFICE11\EXCEL.EXE" /e
>
> ftype Excel.SheetMacroEnabled.12
>  Excel.SheetMacroEnabled.12="C:\PROGRA~2\MICROS~2\OFFICE11\EXCEL.EXE" /e
>
> So despite the fact that doubleclicking .xlsm files or using
> "shell.exec" opens Excel 2007
> the application reported by assoc and ftype for .xlsm files is Excel 2003.
>
>
> Gabor Grothendieck wrote:
> > The system() function has an invisible= argument.  The ryacas package
> > uses system() to run yacas.  See the runYacas() and
> > yacasInvokeString() functions in yacas.R for examples:
> >http://ryacas.googlecode.com/svn/trunk/R/yacas.R
> >
> > On 8/11/07, Erich Neuwirth <[EMAIL PROTECTED]> wrote:
> >> I have an Excel workbook "MyWorkbook.xls" containing an Auto_Open macro
> >> which I want to be run from R.
> >>
> >> shell.exec("MyWorkbook.xls")
> >> does that.
> >>
> >> shell("start MyWorkbook.xls")
> >> also runs it.
> >>
> >> In both cases, the Excel window is visible on screen when Excel is started.
> >> Is there a way of opening the sheet with a hidden Excel window?
> >> start has some parameters (e.g. /MIN), which should allow this, but
> >> shell("start /MIN MyWorkbook.xls")
> >> also starts Excel visibly.
> >>
> >>
> >>
> >> --
> >> Erich Neuwirth, University of Vienna
> >> Faculty of Computer Science
> >> Computer Supported Didactics Working Group
> >> Visit our SunSITE at http://sunsite.univie.ac.at
> >> Phone: +43-1-4277-39464 Fax: +43-1-4277-39459
> >
> >
>
>
> --
> Erich Neuwirth, University of Vienna
> Faculty of Computer Science
> Computer Supported Didactics Working Group
> Visit our SunSITE at http://sunsite.univie.ac.at
> Phone: +43-1-4277-39464 Fax: +43-1-4277-39459


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Replace NAs in dataframe: what am I doing wrong

2007-08-11 Thread jim holtman

The problem is that the first column is probably a factor and you are
trying to assign a value that is not already a 'level' in the factor.
One way is to read the data with as.is=TRUE to keep it as character,
replace the NAs and then convert back to factors if you want to:

> x <- read.csv(textConnection("A,B
+ a,3
+ b,4
+ .,.
+ c,5"), na.strings='.', as.is=TRUE)  # keep as character
> # replace NAs
> x[is.na(x[,1]), 1] <- "Missing Value"
> # convert back to factors if you want to
> x[[1]] <- factor(x[[1]])
> str(x)
'data.frame':   4 obs. of  2 variables:
 $ A: Factor w/ 4 levels "a","b","c","Missing Value": 1 2 4 3
 $ B: int  3 4 NA 5
>
>
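
If the column has to stay a factor throughout, another option is to add the new level before assigning to it.  A minimal sketch with a made-up data frame that mirrors the example above:

```r
x <- data.frame(A = factor(c("a", "b", NA, "c")), B = c(3L, 4L, NA, 5L))
# register the new level first, otherwise the assignment yields NA
levels(x$A) <- c(levels(x$A), "Missing Value")
x$A[is.na(x$A)] <- "Missing Value"
str(x)
```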


On 8/11/07, Sébastien <[EMAIL PROTECTED]> wrote:
> Dear R-users,
>
> My script imports a dataset from a csv file, in which missing values are
> represented by ".". This importation is done into a dataframe using the
> read.table function with na.strings = "."  Then I want to replace the
> NAs in the first column of the dataframe by "Missing data". I am using
> the following code to do so :
>
> mydata<-data.frame(read.table(myFile,sep=",",header=TRUE,na.strings="."))
>   # myFile is the full path of the source file
>
> mydata[,1][is.na(mydata[,1])]<-"Missing value"
>
> This code works perfectly fine if this first column contains only
> missing values, i.e. ".". As soon as it contains multiple levels and
> missing values, things start to go wrong. I get the following warning
> message and the replacement is not done.
>
> Warning message:
> invalid factor level, NAs generated in: `[<-.factor`(`*tmp*`,
> is.na(mydata[, 1]), value = "Missing value")
>
> Is there an error in my code or is that a bug (I doubt it)?
>
> Thanks in advance.
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Help wit matrices

2007-08-10 Thread jim holtman

Is this what you want:

> x <- matrix(runif(100), 10)
> round(x, 3)
       [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
 [1,] 0.268 0.961 0.262 0.347 0.306 0.762 0.524 0.062 0.028 0.226
 [2,] 0.219 0.100 0.165 0.131 0.578 0.933 0.317 0.109 0.527 0.131
 [3,] 0.517 0.763 0.322 0.374 0.910 0.471 0.278 0.382 0.880 0.982
 [4,] 0.269 0.948 0.510 0.631 0.143 0.604 0.788 0.169 0.373 0.327
 [5,] 0.181 0.819 0.924 0.390 0.415 0.485 0.702 0.299 0.048 0.507
 [6,] 0.519 0.308 0.511 0.690 0.211 0.109 0.165 0.192 0.139 0.681
 [7,] 0.563 0.650 0.258 0.689 0.429 0.248 0.064 0.257 0.321 0.099
 [8,] 0.129 0.953 0.046 0.555 0.133 0.499 0.755 0.181 0.155 0.119
 [9,] 0.256 0.954 0.418 0.430 0.460 0.373 0.620 0.477 0.132 0.050
[10,] 0.718 0.340 0.854 0.453 0.943 0.935 0.170 0.771 0.221 0.929
> ifelse(x > .5, 1, 0)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    1    0    0    0    1    1    0    0     0
 [2,]    0    0    0    0    1    1    0    0    1     0
 [3,]    1    1    0    0    1    0    0    0    1     1
 [4,]    0    1    1    1    0    1    1    0    0     0
 [5,]    0    1    1    0    0    0    1    0    0     1
 [6,]    1    0    1    1    0    0    0    0    0     1
 [7,]    1    1    0    1    0    0    0    0    0     0
 [8,]    0    1    0    1    0    0    1    0    0     0
 [9,]    0    1    0    0    0    0    1    0    0     0
[10,]    1    0    1    0    1    1    0    1    0     1
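
A shorter variant, as a sketch on a small made-up matrix: the comparison
already returns a logical matrix of the right shape, so adding 0 is enough
to get a 1/0 indicator.  It also shows why the as.numeric() attempt in the
question lost the shape.

```r
set.seed(1)
mat1 <- matrix(runif(12), nrow = 3)

# the comparison keeps the dimensions; adding 0 turns TRUE/FALSE into 1/0
mat2 <- (mat1 > 0.25) + 0
dim(mat2)

# as.numeric() drops the dim attribute, so as.matrix() can only build one column
dim(as.matrix(as.numeric(mat1 > 0.25)))
```

The threshold 0.25 is the one from the question; ifelse() gives the same
result but does more work on a large matrix.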


On 8/10/07, Lanre Okusanya <[EMAIL PROTECTED]> wrote:
> Hello all,
>
> I am working with a 1000x1000 matrix, and I would like to return a
> 1000x1000 matrix that tells me which values in the matrix are greater
> than a threshold value (1 or 0 indicator).
> I have tried
>  mat2<-as.matrix(as.numeric(mat1>0.25))
> but that returns a 1:10 matrix.
> I have also tried for loops, but they are grossly inefficient.
>
> Thanks for all your help in advance.
>
> Lanre
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Subsetting by number of observations in a factor

2007-08-09 Thread jim holtman

Here is an even faster way:

> # faster way
> x.mg.size <- table(x$mg)  # count occurrences of each 'mg'
> x.mg.5 <- names(x.mg.size)[x.mg.size > 5]  # select greater than 5
> x.new1 <- subset(x, x$mg %in% x.mg.5)  # use in the subset
> x.new1
   mg data
1   A    1
4   A    4
5   D    5
6   D    6
7   A    7
8   D    8
12  A   12
13  D   13
14  A   14
16  D   16
17  D   17
18  A   18
20  A   20
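
Another way to get there, closer to the "column with the group size" idea
from the original question, is ave().  This is a sketch on made-up data;
the column name 'nmg' and the cut-off of 5 are only for illustration.

```r
x <- data.frame(mg = c("A", "B", "A", "C", "A", "B", "A", "A"), data = 1:8)

# ave() returns, for every row, the number of rows in that row's 'mg' group
x$nmg <- ave(seq_along(x$mg), x$mg, FUN = length)

# keep only the groups with at least 5 observations
x.new2 <- x[x$nmg >= 5, ]
x.new2
```

This keeps the original row order and avoids both the split and the merge.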


On 8/9/07, Ron Crump <[EMAIL PROTECTED]> wrote:
> Jim,
>
> > Does this do what you want?  It creates a new dataframe with those
> > 'mg' that have at least a certain number of observations.
>
> Looks good. I also have an alternative solution which appears to work,
> so I'll see which is quicker on the big data set in question.
>
> My solution:
>
> mgsize <- as.data.frame(table(dat$mg))  # 'in' is a reserved word in R, so the data frame is called 'dat' here
> in2 <- merge(dat,mgsize,by.x="mg",by.y="Var1")
> out <- subset(in2, Freq > 1, select= -Freq)
>
> Thanks for your help.
>
> Ron.
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Subsetting by number of observations in a factor

2007-08-09 Thread jim holtman

Does this do what you want?  It creates a new dataframe with those
'mg' that have at least a certain number of observations.

> set.seed(2)
> # create some test data
> x <- data.frame(mg=sample(LETTERS[1:4], 20, TRUE), data=1:20)
> # split the data into subsets based on 'mg'
> x.split <- split(x, x$mg)
> str(x.split)
List of 4
 $ A:'data.frame':  7 obs. of  2 variables:
  ..$ mg  : Factor w/ 4 levels "A","B","C","D": 1 1 1 1 1 1 1
  ..$ data: int [1:7] 1 4 7 12 14 18 20
 $ B:'data.frame':  3 obs. of  2 variables:
  ..$ mg  : Factor w/ 4 levels "A","B","C","D": 2 2 2
  ..$ data: int [1:3] 9 15 19
 $ C:'data.frame':  4 obs. of  2 variables:
  ..$ mg  : Factor w/ 4 levels "A","B","C","D": 3 3 3 3
  ..$ data: int [1:4] 2 3 10 11
 $ D:'data.frame':  6 obs. of  2 variables:
  ..$ mg  : Factor w/ 4 levels "A","B","C","D": 4 4 4 4 4 4
  ..$ data: int [1:6] 5 6 8 13 16 17
> # only choose subsets with at least 5 observations
> x.5 <- lapply(x.split, function(a) {
+ if (nrow(a) >= 5) return(a)
+ else return(NULL)
+ })
> # create new dataframe with these observations
> x.new <- do.call('rbind', x.5)
> x.new
     mg data
A.1   A    1
A.4   A    4
A.7   A    7
A.12  A   12
A.14  A   14
A.18  A   18
A.20  A   20
D.5   D    5
D.6   D    6
D.8   D    8
D.13  D   13
D.16  D   16
D.17  D   17
>
>


On 8/9/07, Ron Crump <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I generally do my data preparation externally to R, so
> this is a bit unfamiliar to me, but a colleague has asked
> me how to do certain data manipulations within R.
>
> Anyway, basically I can get his large file into a dataframe.
> One of the columns is a management group code (mg). There may be
> varying numbers of observations per management group, and
> he would like to subset the dataframe such that there are
> always at least n per management group.
>
> I presume I can get to this using table or tapply, then
> (and I'm not sure how on this bit) creating a column nmg
> containing the number of observations that corresponds to
> mg for that row, then simply subsetting.
>
> So, am I on the right track? If so how do I actually do it, and
> is there an easier method than I am considering.
>
> Thanks for your help,
> Ron
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] plot table with sapply - labeling problems

2007-08-09 Thread jim holtman

Here is a modified script that should work.  In many cases where you
want the names of the elements of the list you are processing, you
should iterate over the names:

test<-as.data.frame(cbind(round(runif(50,0,5)),round(runif(50,0,3)),round(runif(50,0,4))))
sapply(test, table)->vardist
sapply(test, function(x) round(table(x)/sum(table(x))*100,1) )->vardist1
  par(mfrow=c(1,3))
# you need to use the 'names' and then index into the variable;
# your original 'x' did not have a name associated with it
sapply(names(vardist1), function(x) barplot(vardist1[[x]],
ylim=c(0,100),main="Varset1",xlab=x))
  par(mfrow=c(1,1))
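
For the second part of the question (one title instead of three), a
possible sketch is to leave 'main' off the individual plots, reserve an
outer margin with par(oma=...), and write the title once with mtext().
The data and the column names V1..V3 below are made up.

```r
set.seed(1)
test <- data.frame(V1 = round(runif(50, 0, 5)),
                   V2 = round(runif(50, 0, 3)),
                   V3 = round(runif(50, 0, 4)))

# percentage distribution per column; lapply() always returns a named list
vardist1 <- lapply(test, function(x) round(prop.table(table(x)) * 100, 1))

# oma reserves two lines at the top of the page for a shared title
op <- par(mfrow = c(1, 3), oma = c(0, 0, 2, 0))
for (nm in names(vardist1)) {
    barplot(vardist1[[nm]], ylim = c(0, 100), xlab = nm)
}
mtext("Varset1", outer = TRUE, cex = 1.2)  # written once, in the outer margin
par(op)
```

Using lapply() rather than sapply() avoids the case where sapply() simplifies
the result to a matrix because all columns happen to have the same number of
distinct values.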



On 8/9/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Hi List,
>
> I am trying to label a barplot group with variable names when using
> sapply, unsuccessfully.
> I can't seem to extract the names for the individual plots:
>
> test<-as.data.frame(cbind(round(runif(50,0,5)),round(runif(50,0,3)),round(runif(50,0,4))))
> sapply(test, table)->vardist
> sapply(test, function(x) round(table(x)/sum(table(x))*100,1) )->vardist1
>   par(mfrow=c(1,3))
> sapply(vardist1, function(x) barplot(x,
> ylim=c(0,100),main="Varset1",xlab=names(x)))
>   par(mfrow=c(1,1))
>
> Names don't show up although names(vardist) works.
>
> Also I would like to put a single Title on this plot instead of
> repeating "Varset" three times.
>
> Any hints appreciated.
>
> Thanx
> Herry
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Countvariable for id by date

2007-08-09 Thread jim holtman

This should do what you want:

> x <- read.table(textConnection("id;dg1;dg2;date;
+  1;F28;;1997-11-04;
+  1;F20;F702;1998-11-09;
+  1;F20;;1997-12-03;
+  1;F208;;2001-03-18;
+  2;F32;;1999-03-07;
+  2;F29;F32;2000-01-06;
+  2;F32;;2003-07-05;
+  2;F323;F2800;2000-02-05;"), header=TRUE, sep=";", as.is=TRUE)
> # convert dates
> x$dateP <- unclass(as.POSIXct(x$date))
> # matches for F20
> F20 <- grep("F20", paste(x$dg1, x$dg2))
> # matches for F21 - F29
> F21 <- grep("F2[1-9]", paste(x$dg1, x$dg2))
> # grouping
> x$F20 <- x$F21 <- NA
> x$F20[F20] <- rank(x$dateP[F20])
> x$F21[F21] <- rank(x$dateP[F21])
> x
  id  dg1   dg2       date  X      dateP F21 F20
1  1  F28       1997-11-04 NA  878601600   1  NA
2  1  F20  F702 1998-11-09 NA  910569600  NA   2
3  1  F20       1997-12-03 NA  881107200  NA   1
4  1 F208       2001-03-18 NA  984873600  NA   3
5  2  F32       1999-03-07 NA  920764800  NA  NA
6  2  F29   F32 2000-01-06 NA  947116800   2  NA
7  2  F32       2003-07-05 NA 1057363200  NA  NA
8  2 F323 F2800 2000-02-05 NA  949708800   3  NA
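
Note that rank() above runs over all matching rows at once, so the counts
do not restart for each id.  If they should restart per id, as in the
desired output in the question, one possibility is to rank within id using
ave().  This is a sketch; the helper name 'count.within.id' is made up.

```r
# The example data from the question, without the trailing ';' on each line
x <- read.table(text = "id;dg1;dg2;date
1;F28;;1997-11-04
1;F20;F702;1998-11-09
1;F20;;1997-12-03
1;F208;;2001-03-18
2;F32;;1999-03-07
2;F29;F32;2000-01-06
2;F32;;2003-07-05
2;F323;F2800;2000-02-05", header = TRUE, sep = ";", as.is = TRUE)

x$date <- as.Date(x$date)
codes <- paste(x$dg1, x$dg2)

# rank the dates of the matching rows separately within each id
count.within.id <- function(hit, id, date) {
    out <- rep(NA_real_, length(hit))
    out[hit] <- ave(as.numeric(date[hit]), id[hit],
                    FUN = function(d) rank(d, ties.method = "first"))
    out
}

x$countF20   <- count.within.id(grepl("F20", codes), x$id, x$date)
x$countF2129 <- count.within.id(grepl("F2[1-9]", codes), x$id, x$date)
x
```

The patterns are the same ones used above; rows that match neither pattern
keep NA in both count columns.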


On 8/9/07, David Gyllenberg <[EMAIL PROTECTED]> wrote:
> Best R-users,
>
>  Here's a newbie question. I have tried to find an answer to this via
> help and the "ave(x,factor(),FUN=function(y)  rank (z,tie='first')"-function,
> but without success.
>
>  I have a dataframe (~8000 observations, register data) with four
> columns of interest: id, dg1, dg2 and date (YYYY-MM-DD):
>
>  id;dg1;dg2;date;
>  1;F28;;1997-11-04;
>  1;F20;F702;1998-11-09;
>  1;F20;;1997-12-03;
>  1;F208;;2001-03-18;
>  2;F32;;1999-03-07;
>  2;F29;F32;2000-01-06;
>  2;F32;;2003-07-05;
>  2;F323;F2800;2000-02-05;
>  ...
>
>  I would like to have two additional columns:
>  1. "countF20":  a "countvariable" that shows which in order (by date) 
> the id has if it fulfils  the following logical expression: dg1 = F20* OR dg2 
> = F20*,
>  where *  means F201,F202... F2001,F2002...F20001,F20002...
>  2. "countF2129":  another "countvariable" that shows which in order (by 
> date) the id has if it fulfils  the following logical expression: dg1 = 
> F21*-F29* OR dg2 = F21*-F29*,
>  where F21*-F29*  means F21*, F22*...F29* and
>  where *  means F211,F212... F2101,F2102...F21001,F21002...
>
>  ... so the  dataframe would look like this, where 1 is the first 
> observation for the id with  the right condition, 2 is the second etc.:
>
>  id;dg1;dg2;date;countF20;countF2129;
>  1;F28;;1997-11-04;;1;
>  1;F20;F702;1998-11-09;2;;
>  1;F20;;1997-12-03;1;;
>  1;F208;;2001-03-18;3;;
>  2;F32;;1999-03-07;;;
>  2;F29;F32;2000-01-06;;1;
>  2;F32;;2003-07-05;;;
>  2;F323;F2800;2000-02-05;;2;
>  ...
>
>  Do you know  a convenient way to create these kind of "countvariables"? 
> Thank you in  advance!
>
>  / David (david.gyllenberg  at  yahoo.com
>
>
>
>
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] input data file

2007-08-07 Thread jim holtman

You don't have to name them after numbers.  What I sent was just an
example of a character vector with file names.  If you have all the
files in a directory, then you can set the loop to read in all the
files (or selected ones based on a pattern match).  If you are
copy/pasting the 'scan' command, then you must somehow be changing the
file name that is being read and the R object that you are storing the
values in.

You can use list.files(pattern="..") to select a list of file names.
This is much easier than copy/paste.
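
As a sketch of what that could look like: the files are picked up with
list.files(), read in one go, and the list elements are named after the
files, so descriptive file names carry over.  The directory, the file
names and the file contents below are made up for the example.

```r
# Hypothetical setup: three small tab-delimited files in a temporary directory
data.dir <- tempfile("arrays")
dir.create(data.dir)
for (nm in c("control_0h_rep1", "heat_2h_rep1", "heat_4h_rep1")) {
    writeLines("1.5\t2.5\t3.5", file.path(data.dir, paste(nm, ".txt", sep = "")))
}

# pick up every .txt file and read each one with scan()
file.names <- list.files(data.dir, pattern = "\\.txt$", full.names = TRUE)
input.list <- lapply(file.names, scan, what = 0, sep = "\t", quiet = TRUE)

# name the list elements after the files, without directory or extension
names(input.list) <- sub("\\.txt$", "", basename(file.names))

input.list[["heat_2h_rep1"]]
```

The data for any experiment can then be looked up by its condition name
rather than by a number.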

On 8/8/07, Tiandao Li <[EMAIL PROTECTED]> wrote:
> I thought of a loop at first. My data were generated from 32 microarray
> experiments, each had 3 replicates, 96 files in total. I named the files
> based on different conditions or time series, and I really don't want to
> name them after numbers. It would confuse me later when I need to
> refer to/compare them.
>
>
> On Tue, 7 Aug 2007, jim holtman wrote:
>
> I would hope that you don't have 100 'scan' statements; you should
> just have a loop that is using a set of file names in a vector to read
> the data.  Are you reading the data into separate objects?  If so,
> have you considered reading the 100 files into a 'list' so that you
> have a single object with all of your data?  This is then easy to save
> with the 'save' function and then you can quickly retrieve it with the
> 'load' statement.
>
> file.names <- c('file1', ..., 'file100')
> input.list <- list()
> for (i in file.names){
>     input.list[[i]] <- scan(i, what=)
> }
>
> You can then 'save(input.list, file='save.Rdata')'.  You can access
> the data from the individual files with:
>
> input.list[['file33']]
>
>
>
>
>
> On 8/7/07, Tiandao Li <[EMAIL PROTECTED]> wrote:
> > In the first part of myfile.R, I used scan() 100 times to read data from
> > 100 different tab-delimited files. I want to save this part to another
> > data file, so I won't accidentally make mistakes, and I want to re-use/input
> > it like infile statement in SAS or \input{file.tex} in latex. Don't want
> > to copy/paste 100 scan() every time I need to read the same data.
> >
> > Thanks!
> >
> > On Tue, 7 Aug 2007, jim holtman wrote:
> >
> > If you are going to read it back into R, then use 'save'; if it is
> > input to another application, consider 'write.csv'.  I assume that
> > when you say "save all my data files" you really mean "save all my R
> > objects".
> >
> > On 8/7/07, Tiandao Li <[EMAIL PROTECTED]> wrote:
> > > Hello,
> > >
> > > I am new to R. I used scan() to read data from tab-delimited files. I want
> > > to save all my data files (multiple scan()) in another file, and use it
> > > like infile statement in SAS or \input{tex.file} in latex.
> > >
> > > Thanks!
> > >
> > >
> >
> >
> > --
> > Jim Holtman
> > Cincinnati, OH
> > +1 513 646 9390
> >
> > What is the problem you are trying to solve?
> >
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] input data file

2007-08-07 Thread jim holtman

I would hope that you don't have 100 'scan' statements; you should
just have a loop that is using a set of file names in a vector to read
the data.  Are you reading the data into separate objects?  If so,
have you considered reading the 100 files into a 'list' so that you
have a single object with all of your data?  This is then easy to save
with the 'save' function and then you can quickly retrieve it with the
'load' statement.

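file.names <- c('file1', ..., 'file100')  # placeholder: list your actual file names here
input.list <- list()
for (i in file.names){
    input.list[[i]] <- scan(i, what=)  # fill in 'what=' to match the fields in your files
}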

You can then 'save(input.list, file='save.Rdata')'.  You can access
the data from the individual files with:

input.list[['file33']]
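
A minimal sketch of the save/load round trip, using a made-up list in
place of the real data and a temporary file in place of 'save.Rdata':

```r
# stand-in for the list built by the loop above
input.list <- list(file1 = c(1, 2, 3), file33 = c(4, 5))

rdata <- tempfile(fileext = ".RData")
save(input.list, file = rdata)   # writes the object together with its name

rm(input.list)                   # simulate a fresh session
load(rdata)                      # restores 'input.list' under the same name
input.list[["file33"]]
```

load() puts the object back under the name it was saved with, so no
assignment is needed.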

On 8/7/07, Tiandao Li <[EMAIL PROTECTED]> wrote:
> In the first part of myfile.R, I used scan() 100 times to read data from
> 100 different tab-delimited files. I want to save this part to another
> data file, so I won't accidentally make mistakes, and I want to re-use/input
> it like infile statement in SAS or \input{file.tex} in latex. Don't want
> to copy/paste 100 scan() every time I need to read the same data.
>
> Thanks!
>
> On Tue, 7 Aug 2007, jim holtman wrote:
>
> If you are going to read it back into R, then use 'save'; if it is
> input to another application, consider 'write.csv'.  I assume that
> when you say "save all my data files" you really mean "save all my R
> objects".
>
> On 8/7/07, Tiandao Li <[EMAIL PROTECTED]> wrote:
> > Hello,
> >
> > I am new to R. I used scan() to read data from tab-delimited files. I want
> > to save all my data files (multiple scan()) in another file, and use it
> > like infile statement in SAS or \input{tex.file} in latex.
> >
> > Thanks!
> >
> >
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] how to convert decimal date to its equivalent date format(YYYY.mm.dd.hr.min.sec)

2007-08-07 Thread jim holtman

Is this what you want?

> x <- scan(textConnection("1979.00
+
+ 1979.020833
+
+ 1979.041667
+
+ 1979.062500"), what=0)
Read 4 items
> # get the year and then determine the number of seconds in the year so you can
> # use the decimal part of the year
> x.year <- floor(x)
> # fraction of the year
> x.frac <- x - x.year
> # number of seconds in each year
> x.sec.yr <- unclass(ISOdate(x.year+1,1,1,0,0,0)) -
+     unclass(ISOdate(x.year,1,1,0,0,0))
> # now get the actual time
> x.actual <- ISOdate(x.year,1,1,0,0,0) + x.frac * x.sec.yr
>
> x.actual
[1] "1979-01-01 00:00:00 GMT" "1979-01-08 14:29:49 GMT" "1979-01-16 05:00:10 GMT"
[4] "1979-01-23 19:30:00 GMT"
>
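
To get the six separate columns asked for in the question, the result can
be taken apart with as.POSIXlt().  This is a sketch that repeats the same
conversion with ISOdatetime(); the column names in 'parts' are made up, and
the seconds are truncated rather than rounded so they agree with the
printed times above.

```r
x <- c(1979.00, 1979.020833, 1979.041667, 1979.062500)

x.year <- floor(x)
start <- ISOdatetime(x.year, 1, 1, 0, 0, 0, tz = "GMT")
end   <- ISOdatetime(x.year + 1, 1, 1, 0, 0, 0, tz = "GMT")

# seconds in each year (handles leap years), then offset by the fraction
sec.in.year <- as.numeric(difftime(end, start, units = "secs"))
x.actual <- start + (x - x.year) * sec.in.year

# POSIXlt stores the components; year counts from 1900 and month from 0
lt <- as.POSIXlt(x.actual, tz = "GMT")
parts <- data.frame(year = lt$year + 1900, month = lt$mon + 1, day = lt$mday,
                    hour = lt$hour, min = lt$min, sec = floor(lt$sec))
parts
```

The data frame can then be written out next to the decimal dates with
write.table() or cbind().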


On 8/7/07, Yogesh Tiwari <[EMAIL PROTECTED]> wrote:
> Hello R Users,
>
> How to convert decimal date to date as YYYY.mm.dd.hr.min.sec
>
> For example, I have decimal date in one column, and want to convert and
> write it in equivalent date (YYYY.mm.dd.hr.min.sec) in the next six
> columns.
>
> 1979.00
>
> 1979.020833
>
> 1979.041667
>
> 1979.062500
>
>
>
> Is it possible in R ?
>
> Kindly help,
>
> Regards,
>
> Yogesh
>
>
>
>
>
> --
> Dr. Yogesh K. Tiwari,
> Scientist,
> Indian Institute of Tropical Meteorology,
> Homi Bhabha Road,
> Pashan,
> Pune-411008
> INDIA
>
> Phone: 0091-99 2273 9513 (Cell)
> : 0091-20-258 93 600 (O) (Ext.250)
> Fax: 0091-20-258 93 825
>
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] input data file

2007-08-07 Thread jim holtman

If you are going to read it back into R, then use 'save'; if it is
input to another application, consider 'write.csv'.  I assume that
when you say "save all my data files" you really mean "save all my R
objects".

On 8/7/07, Tiandao Li <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I am new to R. I used scan() to read data from tab-delimited files. I want
> to save all my data files (multiple scan()) in another file, and use it
> like infile statement in SAS or \input{tex.file} in latex.
>
> Thanks!
>
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] sink behavior

2007-08-06 Thread jim holtman

'sink' will capture 'printed' output from your program.  Try:

> # Using a matrix as a simple example.
> dumpMatrix = function(mat) {
>     sink(file = "mat.txt")
>     print(mat)
>     sink(NULL)
> }

In this case, there is an explicit 'print' statement.  At the command
line, there is an implicit 'print' when you give an object name; inside
a function there is not, so a bare 'mat' produces no output to capture.
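
A slightly more defensive version, as a sketch: the output file is passed
in as an argument (an addition for this example), and on.exit() makes sure
the sink is closed again even if print() fails.

```r
dumpMatrix <- function(mat, file) {
    sink(file)
    on.exit(sink())   # always restore console output when the function exits
    print(mat)        # explicit print: auto-printing only happens at top level
    invisible(file)
}

out <- tempfile(fileext = ".txt")
dumpMatrix(matrix(100, 2, 2), out)
readLines(out)
```

capture.output(print(mat), file = out) is another way to get the same file
without calling sink() directly.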


On 8/6/07, Daniel Gatti <[EMAIL PROTECTED]> wrote:
> There is a package called 'safe' that produces an object which I can
> only write to a file using the sink() function.  It works fine if the
> sink() command is not inside of a function, but it does not write
> anything to the file if the command is within a function.
>
> Sample code:
> # Using a matrix as a simple example.
> dumpMatrix = function(mat) {
>     sink(file = "mat.txt")
>     mat
>     sink(NULL)
> }
>
> # This will write the file correctly.
> x = matrix(100, 10, 10)
> sink(file = "x.txt")
> x
> sink(NULL)
>
> # This will create an empty file.
> dumpMatrix(x)
>
> R 2.5.1
> Windows XP, SP2
>
> The sink() docs are full of warnings, but I'm not clear which one I've
> violated with this example.
>
> Dan
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Secondary axis

2007-08-06 Thread jim holtman

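plot(...)         # first series, with its own y-axis on the left
par(new=TRUE)     # keep the current plot so the next one is drawn over it
plot(...)         # second series; use the same xlim, and axes=FALSE so the left axis is not overdrawn
axis(4,...)       # draw the second series' scale on the right-hand side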
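
Filled in with made-up data, the outline above could look like the sketch
below.  Plain Date vectors stand in for the zoo objects, and the second
series is daily rather than hourly to keep the example small; the y ranges
0-25 and 0-40 are the ones mentioned in the question.

```r
# made-up data: a sparse series and a dense one over the same period
d1 <- seq(as.Date("2007-01-01"), by = "15 days", length.out = 12)
y1 <- seq(0, 25, length.out = 12)
d2 <- seq(as.Date("2007-01-01"), as.Date("2007-06-15"), by = "day")
y2 <- 20 + 20 * sin(seq_along(d2) / 10)

xlim <- range(c(d1, d2))               # both plots must share the same x range
op <- par(mar = c(5, 4, 4, 4) + 0.1)   # extra room on the right for the second axis

plot(d1, y1, type = "b", xlim = xlim, ylim = c(0, 25),
     xlab = "Date", ylab = "Series 1")
par(new = TRUE)
plot(d2, y2, type = "l", col = "red", xlim = xlim, ylim = c(0, 40),
     axes = FALSE, xlab = "", ylab = "")
axis(4)                                # scale of the second series on the right
mtext("Series 2", side = 4, line = 3)
par(op)
```

The different sampling frequencies do not matter as long as both calls to
plot() are given the same xlim.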

On 8/5/07, Patrick Martin <[EMAIL PROTECTED]> wrote:
> Dear R help list members,
>
> I am trying to plot two sets of data (both of which are zoo objects)
> in the same graph using two separate y-axes with different scales,
> with the x-axis consisting of dates. I have simply used a plot()
> command to plot first one set of data, and then added the second set
> with lines(). I have also tried to add a further y-axis (at side=4),
> but this simply comes up with the same scale as the first y-axis. I
> somehow need to 'associate' one of the data sets with the second y-
> axis, such that it will scale sensibly (my first data set ranges from
> 0-25, the second one from 0 to 40). The problem is compounded by the
> fact that the two data sets have very different frequencies: one
> consists of twice-monthly measurements, the other of hourly
> measurements. I would be very grateful for advice on how to do this.
>
> Thanks in advance,
> Patrick Martin
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Access an entry after reading a table

2007-08-05 Thread jim holtman

read.table will convert your character columns to factors.  You are
seeing a single value returned ("A"), but it is also reporting the
levels of the factor. One way is to read the data in without
conversion to factors:

 Model=read.table("ModelMat.txt", header=TRUE, as.is=TRUE)

or you can convert the factor to character for output:

as.character(Model[1,1])
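
Both options side by side, as a sketch on inline data standing in for
ModelMat.txt.  stringsAsFactors=TRUE is set explicitly because from
R 4.0.0 on read.table no longer converts character columns to factors by
default.

```r
txt <- "Group Value
A 1
B 2
C 3"

# character column converted to a factor, as in the question
Model <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)
Model[1, 1]                  # prints the value followed by the 'Levels:' line
as.character(Model[1, 1])    # just the entry, as a character string

# kept as character from the start
Model2 <- read.table(text = txt, header = TRUE, as.is = TRUE)
Model2[1, 1]
```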

On 8/2/07, Gang Chen <[EMAIL PROTECTED]> wrote:
> Sorry about this basic question. After reading a table,
>
> Model=read.table("ModelMat.txt", header=T)
>
> I want to get access to each entry in the table Model. However, if I do
>
>  > Model[1,1]
>
> I get the following,
>
> [1] A
> Levels: A B C
>
> My question is, how can I just get the entry "A" without the 2nd line
> ("Levels: A B C")?
>
> Thanks,
> Gang
>
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Problem to remove loops in a routine

2007-08-01 Thread jim holtman

  paste("Subject
> ",levels(factor(subdata$ID))[which.panel],sep=""),cex=trellis.par.get("axis.text")[2])},
>
>key=list(space="bottom",
>   lines = list(pch = as.integer(c(3,rep("",nModel))), type
> = c("p", gl(1,nModel,label="l")),
> col =
> 1:(nModel+1),"cex"=trellis.par.get("axis.text")[2]),
> text=list(mylegend,
> "cex"=trellis.par.get("axis.text")[2])),
>
>xlab="Time (hr)",
>ylab="Concentration (ng/mL)",
>layout=c(nTrellisCol,nTrellisRow),
>
>main=paste(paste(paste("Plot ",i,sep=""),
> paste(paste(", DVID ",j,sep=""),
>   paste(paste(", Occasion ",k,sep=""),
> paste(", Group
> ",l,sep="",sep=""))
>
>
>  trellis.par.set(par.xlab.text=list(cex=trellis.par.get("axis.text")[2]))
>trellis.par.set(par.ylab.text=list(cex=trellis.par.get("axis.text")[2]))
>
>
> print(myplot,panel.width=list(x=(0.75/nTrellisCol),units="npc"),panel.height=list(x=(0.50/nTrellisRow),units="npc"))
>
>
> dev.off()
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Simple table with frequency variable

2007-08-01 Thread jim holtman

I am not exactly sure what you are asking for.  I am assuming that you
want a vector that represents the combinations that are present:

> N
 [1] 11 22 31 42 51 12 21 32 41 52
> table(i,j)
   j
i   1 2
  1 1 1
  2 1 1
  3 1 1
  4 1 1
  5 1 1
> z <- table(i,j)
> which(z==1)
 [1]  1  2  3  4  5  6  7  8  9 10
> which(z==1,arr.ind=T)
  row col
1   1   1
2   2   1
3   3   1
4   4   1
5   5   1
1   1   2
2   2   2
3   3   2
4   4   2
5   5   2
> x <- which(z==1,arr.ind=T)
> paste(rownames(z)[x[,'row']], colnames(z)[x[,'col']], sep='')
 [1] "11" "21" "31" "41" "51" "12" "22" "32" "42" "52"
>
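
If the aim is instead a table whose cells hold the N values (N used as a
frequency or weight), xtabs() with N on the left-hand side of the formula
is one way to get it.  A sketch with the data from the question:

```r
i <- rep(1:5, 2)
j <- rep(1:2, 5)
N <- 10 * i + j

# each cell is the sum of N over the rows with that (i, j) combination
tab <- xtabs(N ~ i + j)
tab

# the same sums as a plain matrix
tapply(N, list(i = i, j = j), sum)
```

Because every combination occurs once here, each cell is simply the N of
that combination; with repeated combinations the N values are added up.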


On 8/1/07, G. Draisma <[EMAIL PROTECTED]> wrote:
> Hallo,
>
> I'm trying to find out how to tabulate frequencies
> of factors when the data have a frequency variable.
>
> e.g.:
> i<-rep(1:5,2)
> j<-rep(1:2,5)
> N<-10*i+j
>
> table(i,j) gives a table of ones
> as each combination occurs only once.
> How does one get a table with the corresponding N's?
>
> Thanks!
> Gerrit.
>
>
> --
> Gerrit Draisma
> Department of Public Health
> Erasmus MC, University Medical Center Rotterdam
> Room AE-103
> P.O. Box 2040 3000 CA  Rotterdam The Netherlands
> Phone: +31 10 4087124 Fax: +31 10 4638474
> http://mgzlx4.erasmusmc.nl/pwp/?gdraisma
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Reading Matrices

2007-08-01 Thread jim holtman

  
>   -0.016839   -0.069724   -0.035206   -0.140684   -0.123128   
> -0.058937   -0.165202   -0.023992   -0.086425   -0.075676 
>   1
>
> 0 1.701634 1.732966 1.638478 1.804548 1.679982 1.808682 1.907909 1.782082 1.820282 1.393257 1.568896 2.014273 1.40849 1.730805 1.645146
> 1.701634 0 1.12 1.072174 1.238244 1.113678 1.58344 1.682667 1.55684 1.59504 1.423253 1.961898 2.407275 1.801492 2.123807 2.038148
> 1.732966 1.12 0 0.548174 0.977588 0.853022 1.614772 1.713999 1.588172 1.626372 1.454585 1.99323 2.438607 1.832824 2.155139 2.06948
> 1.638478 1.072174 0.548174 0 0.8831 0.758534 1.520284 1.619511 1.493684 1.531884 1.360097 1.898742 2.344119 1.738336 2.060651 1.974992
> 1.804548 1.238244 0.977588 0.8831 0 0.728972 1.686354 1.785581 1.659754 1.697954 1.526167 2.064812 2.510189 1.904406 2.226721 2.141062
> 1.679982 1.113678 0.853022 0.758534 0.728972 0 1.561788 1.661015 1.535188 1.573388 1.401601 1.940246 2.385623 1.77984 2.102155 2.016496
> 1.808682 1.58344 1.614772 1.520284 1.686354 1.561788 0 0.477421 1.199876 1.238076 1.530301 2.068946 2.514323 1.90854 2.230855 2.145196
> 1.907909 1.682667 1.713999 1.619511 1.785581 1.661015 0.477421 0 1.299103 1.337303 1.629528 2.168173 2.61355 2.007767 2.330082 2.244423
> 1.782082 1.55684 1.588172 1.493684 1.659754 1.535188 1.199876 1.299103 0 0.823034 1.503701 2.042346 2.487723 1.88194 2.204255 2.118596
> 1.820282 1.59504 1.626372 1.531884 1.697954 1.573388 1.238076 1.337303 0.823034 0 1.541901 2.080546 2.525923 1.92014 2.242455 2.156796
> 1.393257 1.423253 1.454585 1.360097 1.526167 1.401601 1.530301 1.629528 1.503701 1.541901 0 1.653521 2.098898 1.493115 1.81543 1.729771
> 1.568896 1.961898 1.99323 1.898742 2.064812 1.940246 2.068946 2.168173 2.042346 2.080546 1.653521 0 0.988747 1.51847 1.840785 1.755126
> 2.014273 2.407275 2.438607 2.344119 2.510189 2.385623 2.514323 2.61355 2.487723 2.525923 2.098898 0.988747 0 1.963847 2.286162 2.200503
> 1.40849 1.801492 1.832824 1.738336 1.904406 1.77984 1.90854 2.007767 1.88194 1.92014 1.493115 1.51847 1.963847 0 1.054405 0.968746
> 1.730805 2.123807 2.155139 2.060651 2.226721 2.102155 2.230855 2.330082 2.204255 2.242455 1.81543 1.840785 2.286162 1.054405 0 0.722953
> 1.645146 2.038148 2.06948 1.974992 2.141062 2.016496 2.145196 2.244423 2.118596 2.156796 1.729771 1.755126 2.200503 0.968746 0.722953 0
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] reading and storing files in the workspace

2007-07-31 Thread jim holtman

try:

for (i in test){
    # escape the dot so that only a literal ".txt" suffix is removed
    assign(gsub("\\.txt$", "", i), read.table(i, header=TRUE))
}
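
A minimal sketch of the same idea that collects the tables in a named
list instead of creating one object per file with assign().  The two
small files written to a temporary directory are made up for
illustration; in the original question the names would come from 'test'.

```r
# Hypothetical stand-ins for the files named in 'test'
dir <- tempfile("oki")
dir.create(dir)
test <- file.path(dir, c("aOki.txt", "dyp100.txt"))
for (f in test) {
    write.table(data.frame(a = 1:3, b = 4:6), f, row.names = FALSE)
}

# Read every file, then name each element after the file minus ".txt"
tables <- lapply(test, read.table, header = TRUE)
names(tables) <- sub("\\.txt$", "", basename(test))

names(tables)
tables[["aOki"]]
```

A list keeps the workspace tidy and lets you loop over the tables with
lapply() later on.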

On 7/31/07, Luis Ridao Cruz <[EMAIL PROTECTED]> wrote:
> R-help,
>
> I have a vector containing (test) some file names.
> The files contents are matrixes.
>
> > test
>
>  [1] "aaOki.txt"    "aOki.txt"     "bOki.txt"     "c1Oki.txt"    "c2Oki.txt"
>  [6] "c3Oki.txt"    "cOki.txt"     "dOki.txt"     "dyp100.txt"   "dyp200.txt"
> [11] "dyp300.txt"   "dyp400.txt"   "dyp500.txt"   "dyp600.txt"   "dyp700.txt"
> [16] "dyp800.txt"   "eOki.txt"     "FBdyp100.txt" "FBdyp150.txt" "FBdyp200.txt".
>
> What I want to do is to import to R using the same file name
> and remove the ".txt" extension out of the object name.
> Something like this:
>
> for(i in test)
> gsub("\\.", "", paste(i, sep = "")) <- read.table(file = paste(i, sep =
> ""), header = TRUE)
>
> But I get the following message:
>
> Error in gsub("\\.", "", paste(i, sep = "")) <- read.table(file =
> paste(i,  :
>target of assignment expands to non-language object
>
>
> Thanks in advance.
>
>
> > version
>                _
> platform       i386-pc-mingw32
> arch           i386
> os             mingw32
> system         i386, mingw32
> status
> major          2
> minor          5.1
> year           2007
> month          06
> day            27
> svn rev        42083
> language       R
> version.string R version 2.5.1 (2007-06-27)
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] how to combine data of several csv-files

2007-07-31 Thread jim holtman

Here is the modified script for computing the 'sd':

v1 <- NA
v2 <- rnorm(6)
v3 <- rnorm(6)
v4 <- rnorm(6)
v5 <- rnorm(6)
v6 <- rnorm(6)
v7 <- rnorm(6)
v8 <- rnorm(6)
v8 <- NA

list <- list(v1,v2,v3,v4,v5,v6,v7,v8)
categ <- c(NA,"cat1","cat1","cat1","cat2","cat2","cat2",NA)

# create partitioned list
list.cat <- split(list, categ)
# combine each partition into a matrix
list.mat <- lapply(list.cat, function(x) do.call('rbind', x))
# now take the means of each column
lapply(list.mat, colMeans)
# compute the 'sd' by using 'apply' on the columns
lapply(list.mat, apply, 2, sd)
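
For anyone who wants to check the steps by hand, here is a small
sketch of the same split / rbind / apply sequence on made-up numbers
(not the values from the thread), chosen so the column-wise sd is easy
to verify.  The names 'vals', 'by.cat', 'mats' and 'sds' are only
illustrative.

```r
# Short vectors per category (made-up numbers)
vals  <- list(NA, c(1, 2), c(3, 6), c(5, 10), c(0, 0), c(2, 4), c(4, 8), NA)
categ <- c(NA, "cat1", "cat1", "cat1", "cat2", "cat2", "cat2", NA)

# split() drops the entries whose category is NA
by.cat <- split(vals, categ)
# one matrix per category: each vector becomes a row
mats <- lapply(by.cat, function(x) do.call(rbind, x))

# column-wise sd per category, collected as a matrix (one column per category)
sds <- sapply(mats, function(m) apply(m, 2, sd))
sds
```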



On 7/31/07, 8rino-Luca Pantani <[EMAIL PROTECTED]> wrote:
> > Hi Jim,
> > that's exactly what I'm looking for. Thank you so much. I think, I
> > should look for some further documentation on list handling.
> I think I will do the same...
> Thanks to Jim I learned "textConnection" and "rowMeans".
>
> Jim, could you please go a step further and tell me how to use lapply to
> calculate
> the sd instead of the mean of the same items?
> I mean
> sd(-0.6442149 0.02354036 -1.40362589)
> sd(-1.1829260 1.17099178 -0.046778203)
> sd(-0.2047012 -1.36186952 0.13045724)
> etc
>
> x <- read.table(textConnection("  v1 v2 v3  v4 v5  v6 v7 v8
> NA -0.6442149  0.02354036 -1.40362589 -1.1829260  1.17099178 -0.046778203 NA
> NA -0.2047012 -1.36186952  0.13045724  2.1411553  0.49248118 -0.233788840 NA
> NA -1.1986041 -0.42197792 -0.84651458 -0.1327081 -0.18690065  0.443908897 NA
> NA -0.2097442  1.50445971  1.57005071 -0.1053442  1.50050976 -1.649740180 NA
> NA -0.7343465 -1.76763996  0.06961015 -0.8179396 -0.65552410  0.003991354 NA
> NA -1.3888750  0.53722404  0.25269771 -1.2342698 -0.01243247 -0.228020092 
> NA"), header=TRUE)
>
>
>
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Matrix Multiplication, Floating-Point, etc.

2007-07-30 Thread jim holtman

One thing to realize is that although the two operations look the
same, the code being executed is different in the two cases.  Because
the instructions run in a different sequence, the round-off errors
introduced along the way can differ.
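
A small sketch of the general point, using a standard textbook case
rather than the vectors from the thread (the %*% result there can
depend on the platform, so it is not reproduced here).  On IEEE 754
double-precision hardware the exact comparison below is FALSE, while
the tolerance-based one is TRUE.

```r
# Two results that are equal on paper but differ in the last bits
a <- 0.1 + 0.2
b <- 0.3

a == b                      # exact comparison
isTRUE(all.equal(a, b))     # comparison within a small tolerance
```

When testing computed floating-point values, all.equal() (wrapped in
isTRUE()) is usually the safer test.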

On 7/30/07, Talbot Katz <[EMAIL PROTECTED]> wrote:
> Thank you for responding!
>
> I realize that floating point operations are often inexact, and indeed, the
> difference between the two answers is within the all.equal tolerance, as
> mentioned in FAQ 7.31 (cited by Charles):
>
> >(as.numeric(ev1%*%ev2))==(sum(ev1*ev2))
> [1] FALSE
> >all.equal((as.numeric(ev1%*%ev2)),(sum(ev1*ev2)))
> [1] TRUE
> >
>
> I suppose that's good enough for numerical computation.  But I was still
> surprised to see that matrix multiplication (ev1%*%ev2) doesn't give the
> exact right answer, whereas sum(ev1*ev2) does give the exact answer.  I
> would've expected them to perform the same two multiplications and one
> addition.  But I guess that's not the case.
>
> However, I did find that if I multiplied the two vectors by 10, making the
> entries integers (although the class was still "numeric" rather than
> "integer"), both computations gave equal answers of 0:
>
> >xf1<-10*ev1
> >xf2<-10*ev2
> >(as.numeric(xf1%*%xf2))==(sum(xf1*xf2))
> [1] TRUE
> >
>
> Perhaps the moral of the story is that one should exercise caution and keep
> track of significant digits.
>
> --  TMK  --
> 212-460-5430home
> 917-656-5351cell
>
>
>
> >From: "Charles C. Berry" <[EMAIL PROTECTED]>
> >To: Talbot Katz <[EMAIL PROTECTED]>
> >CC: r-help@stat.math.ethz.ch
> >Subject: Re: [R] Matrix Multiplication, Floating-Point, etc.
> >Date: Mon, 30 Jul 2007 09:27:42 -0700
> >
> >
> >
> >7.31 Why doesn't R think these numbers are equal?
> >
> >On Fri, 27 Jul 2007, Talbot Katz wrote:
> >
> >>Hi.
> >>
> >>I recently tried the following in R 2.5.1 on Windows XP:
> >>
> >>>ev2<-c(0.8,-0.6)
> >>>ev1<-c(0.6,0.8)
> >>>ev1%*%ev2
> >>  [,1]
> >>[1,] -2.664427e-17
> >>>sum(ev1*ev2)
> >>[1] 0
> >>>
> >>
> >>(I got the same result with R 2.4.1 on a different Windows XP machine.)
> >>
> >>I expect this issue is very familiar and probably has been discussed in
> >>this
> >>forum before.  Can someone please point me to some documentation or
> >>discussion about this?  Is there some standard way to get the "correct"
> >>answer from %*%?
> >>
> >>Thanks!
> >>
> >>--  TMK  --
> >>212-460-5430  home
> >>917-656-5351  cell
> >>
> >>__
> >>R-help@stat.math.ethz.ch mailing list
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >>PLEASE do read the posting guide
> >>http://www.R-project.org/posting-guide.html
> >>and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >Charles C. Berry(858) 534-2098
> > Dept of Family/Preventive
> >Medicine
> >E mailto:[EMAIL PROTECTED]  UC San Diego
> >http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
> >
> >
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] the large dataset problem

2007-07-30 Thread jim holtman

FYI.  I used your script on a 1.5 GHz Windows machine, using the
Cygwin software that provides the UNIX utilities.  The file has 1000
lines with 10,000 fields on each line.  Here is what it reported:

gawk 'BEGIN{FS=","}{print $(1) "," $(1000) "," $(1275) ","  $(5678)}'
< tempxx.txt > newdata.csv


real    0m0.806s
user    0m0.640s
sys     0m0.124s

So it took less than a second to process the file, so it still should
be pretty fast on windows.  BTW, the first run took 30 seconds of real
time due to the slow disk that I have.  The run above had the data
already cached in memory.
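
If awk is not at hand, a similar column selection can be sketched in R
alone.  This is an alternative to the awk approach, not what was timed
above: read.csv() skips any column whose entry in 'colClasses' is
"NULL".  The file, its 20 fields and the kept positions are all made
up for illustration.

```r
# Small comma-separated file with 20 fields per line (stand-in data)
tmp <- tempfile(fileext = ".csv")
m <- matrix(1:100, nrow = 5, ncol = 20)
write.table(m, tmp, sep = ",", row.names = FALSE, col.names = FALSE)

# Keep only fields 1, 5, 12 and 20; "NULL" tells read.csv to skip a column
keep <- c(1, 5, 12, 20)
classes <- rep("NULL", 20)
classes[keep] <- "integer"
subset.data <- read.csv(tmp, header = FALSE, colClasses = classes)

dim(subset.data)
unlink(tmp)
```

R still has to scan every line, so for very wide files a preprocessing
step with awk may well remain faster.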


On 7/30/07, Ted Harding <[EMAIL PROTECTED]> wrote:
> On 30-Jul-07 11:40:47, Eric Doviak wrote:
> > [...]
>
> Sympathies for the constraints you are operating in!
>
> > The "Introduction to R" manual suggests modifying input files with
> > Perl. Any tips on how to get started? Would Perl Data Language (PDL) be
> > a good choice?  http://pdl.perl.org/index_en.html
>
> I've not used SIPP files, but it seems that they are available in
> "delimited" format, including CSV.
>
> For extracting a subset of fields (especially when large datasets may
> stretch RAM resources) I would use awk rather than perl, since it
> is a much lighter program, transparent to code for, efficient, and
> it will do that job.
>
> On a Linux/Unix system (see below), say I wanted to extract fields
> 1, 1000, 1275,  , 5678 from a CSV file. Then the 'awk' line
> that would do it would look like
>
> awk '
>  BEGIN{FS=","}{print $(1) "," $(1000) "," $(1275) "," ... $(5678)}
> ' < sippfile.csv > newdata.csv
>
> Awk reads one line at a time, and does with it what you tell it to do.
> It will not be overcome by a file with an enormous number of lines.
> Perl would be similar. So long as one line fits comfortably into RAM,
> you would not be limited by file size (unless you're running out
> of disk space), and operation will be quick, even for very long
> lines (as an experiment, I just set up a file with 10,000 fields
> and 35 lines; awk output 6 selected fields from all 35 lines in
> about 1 second, on the 366MHz 128MB RAM machine I'm on at the
> moment. After transferring it to a 733MHz 512MB RAM machine, it was
> too quick to estimate; so I duplicated the lines to get a 363-line
> file, and now got those same fields out in a bit less than 1 second.
> So that's over 300 lines/second, 200,000 lines a minute, a million
> lines in 5 minutes; and all on rather puny hardware.).
>
> In practice, you might want to write a separate script which would
> automatically create the necessary awk script (say if you supply
> the field names, having already coded the field positions corresponding
> to field names). You could exploit R's system() command to run the
> scripts from within R, and then load in the filtered data.
>
> > I wrote a script which loads large datasets a few lines at a time,
> > writes the dozen or so variables of interest to a CSV file, removes
> > the loaded data and then (via a "for" loop) loads the next few lines
> >  I managed to get it to work with one of the SIPP core files,
> > but it's SLW.
>
> See above ...
>
> > Worse, if I discover later that I omitted a relevant variable,
> > then I'll have to run the whole script all over again.
>
> If the script worked quickly (as with awk), presumably you
> wouldn't mind so much?
>
> Regarding Linux/Unix versus Windows. It is general experience
> that Linux/Unix works faster, more cleanly and efficiently, and
> often more reliably, for similar tasks; and can do so on low grade
> hardware. Also, these systems come with dozens of file-processing
> utilities (including perl and awk; also many others), each of which
> has been written to be efficient at precisely the repertoire of
> tasks it was designed for. A lot of Windows software carries a huge
> overhead of either cosmetic dross, or a pantechnicon of functionality
> of which you are only going to need 0.01% at any one time.
>
> The Unix utilities have been ported to Windows, long since, but
> I have no experience of using them in that environment. Others,
> who have, can advise! But I'd seriously suggest getting hold of them.
>
> Hoping this helps,
> Ted.
>
> 
> E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
> Fax-to-email: +44 (0)870 094 0861
> Date: 30-Jul-07   Time: 18:24:41
> -- XFMail --
>
> __

Re: [R] deriv; loop

2007-07-30 Thread jim holtman

For question 1, is this what you want?  (BTW, allocate 'result' to the
size you need up front.  The example keeps extending it, which is OK
for small sizes, but for larger ones you should preallocate.)

> result <- numeric(0)
> for (i in 1:6) result[i] <- i
> result
[1] 1 2 3 4 5 6
> prod(result)
[1] 720
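
A minimal sketch of the preallocated variant mentioned above; 'n' is
just an illustrative name for the number of iterations.

```r
n <- 6
result <- numeric(n)                   # allocate once, up front
for (i in seq_len(n)) result[i] <- i   # fill in place, no file needed
prod(result)                           # 720
sum(result)                            # 21
```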


On 7/29/07, francogrex <[EMAIL PROTECTED]> wrote:
>
> Hi, 2 questions:
>
> Question 1: example of what I currently do:
>
> for(i in 1:6){sink("temp.txt",append=TRUE)
> dput(i+0)
> sink()}
> x=scan(file="temp.txt")
> print(prod(x))
> file.remove("C:/R-2.5.0/temp.txt")
>
> But how to convert the output of the loop to a vector that I can manipulate
> (by prod or sum etc), without having to write and append to a file?
>
> Question 2:
>
> > deriv(~gamma(x),"x")
>
> expression({
>     .expr1 <- gamma(x)
>     .value <- .expr1
>     .grad <- array(0, c(length(.value), 1), list(NULL, c("x")))
>     .grad[, "x"] <- .expr1 * psigamma(x)
>     attr(.value, "gradient") <- .grad
>     .value
> })
>
> BUT
>
> > deriv3(~gamma(x),"x")
> Error in deriv3.formula(~gamma(x), "x") : Function 'psigamma' is not in the
> derivatives table
>
> What I want is the expression for the second derivative (which I believe is
> trigamma(x), or psigamma(x,1)), how can I obtain that?
>
> Thanks
> --
> View this message in context: 
> http://www.nabble.com/deriv--loop-tf4166283.html#a11853456
> Sent from the R help mailing list archive at Nabble.com.
>
> __________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] problems saving and loading (PLMset) objects

2007-07-30 Thread jim holtman

you just need to say:

load("expr.RData")

load() restores 'expr' under its saved name and returns only the names
of the objects it loaded, so you should not assign its result to 'expr'.
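
A small sketch of what load() returns, using a plain numeric vector as
a stand-in for the fitted PLMset object (the temporary file name is
made up as well):

```r
f <- tempfile(fileext = ".RData")
expr <- c(1.5, 2.5)          # stand-in for the fitted object
save(expr, file = f)
rm(expr)

loaded <- load(f)            # restores 'expr'; the return value is a name
loaded                       # the character string "expr"
expr                         # the restored object itself
unlink(f)
```

This is why assigning the result of load() to 'expr' leaves only a
character string in it.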

On 7/30/07, Quin Wills <[EMAIL PROTECTED]> wrote:
> Hi
>
>
>
> I'm running the latest R on a presumably up to date Linux server.
>
>
>
> 'Doing something silly I'm sure, but can't see why my saved PLMset objects
> come out all wrong. To use an example:
>
>
>
> Setting up an example PLMset (I have the same problem no matter what example
> I use)
>
> > library(affyPLM)
>
> > data(Dilution) # affybatch object
>
> > Dilution = updateObject(Dilution)
>
> > options(width=36)
>
> > expr <- fitPLM(Dilution)
>
>
>
>
>
> This works, and I'm able to get the probeset coefficients with coefs(expr).
> until I save and try reloading:
>
> > save(expr, file="expr.RData")
>
> > rm(expr) # just to be sure
>
> > expr <- load(expr.RData)
>
>
>
>
>
> Now, running coefs(expr) says:
>
> > Error in function (classes, fdef, mtable) : unable to find an inherited
> method for function "coefs", for signature "character"
>
>
>
>
>
> Trying str(exp) just gives the following:
>
> > chr "exp"
>
>
>
> expr.Rdata appears to save properly (in that there is an actual file with
> notable size in my working directory).
>
>
>
> Thanks in advance,
>
> Quin
>
>
>
>
>
>
>[[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] how to combine data of several csv-files

2007-07-30 Thread jim holtman

> >> [[4]]
> >> [1] -1.40362589  0.13045724 -0.84651458  1.57005071  0.06961015
> >> 0.25269771
> >>
> >> [[5]]
> >> [1] -1.1829260  2.1411553 -0.1327081 -0.1053442 -0.8179396 -1.2342698
> >>
> >> [[6]]
> >> [1]  1.17099178  0.49248118 -0.18690065  1.50050976 -0.65552410
> >> -0.01243247
> >>
> >> [[7]]
> >> [1] -0.046778203 -0.233788840  0.443908897 -1.649740180  0.003991354
> >> -0.228020092
> >>
> >> [[8]]
> >> [1] NA
> >>
> >> now, I need the means (and sd) of element 1 of list[2],list[3],list[4]
> >> (because they belong to "cat1") and
> >>
> >> = mean(-0.6442149, 0.02354036, -1.40362589)
> >>
> >> the same for element 2 up to element 6 (--> I would the get a vector
> >> containing the means for "cat1")
> >> the same for the vectors belonging to "cat2".
> >>
> >> does anybody now understand what I mean?
> >>
> >> Antje
> >>
> >>
> >>
> >
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] write.csv

2007-07-30 Thread jim holtman

Then you can just write a 'for' loop to write out each submatrix:

for (i in 1:dim(x)[3]){
write.csv(x[,,i], paste("x", i, ".csv", sep=""))
}
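
Here is a self-contained sketch of that loop on a small made-up array,
writing into a temporary directory and reading one slice back to check
it.  The array size and the directory are assumptions for illustration
only.

```r
x <- array(1:24, dim = c(2, 3, 4))      # small made-up array
out.dir <- tempfile("slices")
dir.create(out.dir)

# one csv file per slice along the third dimension
for (i in seq_len(dim(x)[3])) {
    write.csv(x[, , i], file.path(out.dir, paste("x", i, ".csv", sep = "")))
}

list.files(out.dir)
# the first column of each file holds the row names written by write.csv
back <- as.matrix(read.csv(file.path(out.dir, "x2.csv"), row.names = 1))
```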


On 7/30/07, Dong GUO 郭东 <[EMAIL PROTECTED]> wrote:
> the dim of my results is (26,31,8) -(years, regions and variables). so, if i
> save each (years, regions) in 8 csv files, later, I could connect the
> (26,31) to dbf file in ArcGIS to show in a map. This is what I intend to do.
>
> I dont know a better way to do it directly in R...
>
>
> On 7/31/07, jim holtman <[EMAIL PROTECTED]> wrote:
> > It really depends on how you want it output.  You can use 'write.csv'
> > to write an array out and it will be a 2-dimentional image that you
> > could then reconstruct it from if you know what the dimensions were.
> > What do you want to do with the data?  If you are just going to read
> > it back into R, then use save/load.
> >
> > On 7/29/07, Dong GUO 郭东 < [EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > I want to save an array (say, array[6,7,8]) to a csv file. How can I do
> > > that? Can I write it in one file?
> > >
> > > If I cannot write it in one file, I want to use a loop to save it in
> > > different files (for array[6,7,8] there should be 8 csv files), with a
> > > filename structure like: file = "filename" + str(i) + "." + "csv"
> > >
> > > Many thanks.
> > >
> > >[[alternative HTML version deleted]]
> > >
> > > __________
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> >
> > --
> > Jim Holtman
> > Cincinnati, OH
> > +1 513 646 9390
> >
> > What is the problem you are trying to solve?
> >
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Order by the columns

2007-07-30 Thread jim holtman

?order

You could do something like:

mat[order(mat[,1], mat[,2], mat[,3]),]
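
With n columns it gets tedious to type every column, so here is a
sketch that builds the argument list for order() from all columns.  The
small matrix is made up; 'cols' and 'sorted' are illustrative names.

```r
# Made-up two-level design matrix with entries -1 and +1
mat <- matrix(c( 1, -1,  1,
                -1,  1, -1,
                -1, -1,  1,
                 1, -1, -1,
                -1, -1, -1), ncol = 3, byrow = TRUE)

# One sort key per column, passed to order() in column order
cols <- lapply(seq_len(ncol(mat)), function(j) mat[, j])
sorted <- mat[do.call(order, cols), ]
sorted
```

To make the last column the primary key instead, reverse the list with
rev(cols) before calling order().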

On 7/29/07, Am Stat <[EMAIL PROTECTED]> wrote:
> Dear useR,
>
> I have a data matrix, it has n columns, each column is a two-level variable
> with entires -1 and +1. They are randomly generated, now I want to order
> them like (for example, 5 columns case)
> ---   -   -
> --   -   --
> .
> (first several rows are the samples with all variables in low level)
>
> +   -   --   -
> +   -   ---
> .
>
>
> -   +   --   -
>
>
> +  +   --   -
>
>
>
> + + + + +
>
> Is there any function in R that could let me do this order by Var1 then
> order by Var2 then...order by Var n
>
>
> Thanks very much in advance!
>
>
> Best,
>
> Leon
>
>[[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Looping through all possible combinations of cases

2007-07-30 Thread jim holtman

Here is how to do it for 2; you can extend it:

> # test data
> n <- 100
> x <- data.frame(id=sample(letters[1:4], n, TRUE), values=runif(n))
> # get combinations of 2 at a time
> comb.2 <- combn(unique(as.character(x$id)), 2)
> for (i in 1:ncol(comb.2)){
+ cat(sprintf("%s:%s %f\n",comb.2[1,i], comb.2[2,i],
+ sum(x$values[x$id %in% comb.2[,i]])))
+ }
c:d 25.259988
c:b 21.268737
c:a 21.250933
d:b 26.013253
d:a 25.995450
b:a 22.004198
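
One way to extend this to groups of any size is to let combn() apply a
function to each combination.  The sketch below uses made-up data with
one value per person so the sums are easy to check; 'sums' is an
illustrative name.

```r
# Made-up data: one value per person
x <- data.frame(id = c("a", "b", "c", "d"), values = c(1, 2, 4, 8),
                stringsAsFactors = FALSE)

# Sum 'values' over every group of k people, for k = 2 and k = 3
sums <- list()
for (k in 2:3) {
    s <- combn(x$id, k, FUN = function(g) sum(x$values[x$id %in% g]))
    names(s) <- combn(x$id, k, FUN = paste, collapse = ":")
    sums[[as.character(k)]] <- s
}
sums
```

Each element of 'sums' is a named vector, so both the members of a
combination and its total are kept together.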


On 7/27/07, Dimitri Liakhovitski <[EMAIL PROTECTED]> wrote:
> Hello!
>
> I have a regular data frame (DATA) with 10 people and 1 column
> ('variable'). Its cases are people with names ('a', 'b', 'c', 'd',
> 'e', 'f', etc.). I would like to write a function that would sum up
> the values on 'variable' of all possible combinations of people, i.e.
>
> 1. I would like to write a loop - in such a way that it loops through
> each possible pair of cases (i.e., ab, ac, ad, etc.) and sums up their
> respective values on 'variable'
>
> 2. I would like to write a loop - in such a way that it loops through
> each possible trio of cases (i.e., abc, abd, abe, etc.) and sums up
> their respective values on 'variable'.
>
> 3.  I would like to write a loop - in such a way that it loops through
> each possible quartet of cases (i.e., abcd, abce, abcf, etc.) and sums
> up their respective values on 'variable'.
>
> etc.
>
> Then, at the end I want to capture all possible combinations that were
> considered (i.e., what elements were combined in it) and get the value
> of the sum for each combination.
>
> How should I do it?
> Thanks a lot!
> Dimitri
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] how to combine data of several csv-files

2007-07-30 Thread jim holtman

This should do it:

> v1 <- NA
> v2 <- rnorm(6)
> v3 <- rnorm(6)
> v4 <- rnorm(6)
> v5 <- rnorm(6)
> v6 <- rnorm(6)
> v7 <- rnorm(6)
> v8 <- rnorm(6)
> v8 <- NA
>
> list <- list(v1,v2,v3,v4,v5,v6,v7,v8)
> categ <- c(NA,"cat1","cat1","cat1","cat2","cat2","cat2",NA)
>
> # create partitioned list
> list.cat <- split(list, categ)
> # combine each partition into a matrix
> list.mat <- lapply(list.cat, function(x) do.call('rbind', x))
> # now take the means of each column
> lapply(list.mat, colMeans)
$cat1
[1] -0.5699080  0.3855693  1.1051809  0.2379324  0.6684713  0.3240003

$cat2
[1]  0.38160462 -0.10559496 -0.40963090 -0.09507354  0.95021406 -0.31491450

>
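
A side note on the 'mean(-0.6442149, 0.02354036, -1.40362589)' form
used in the question below: mean() averages only its first argument,
and the extra values are matched to 'trim' and 'na.rm' rather than
averaged.  A sketch of the intended calculation for the first elements
of the three "cat1" vectors, wrapping the values in c():

```r
# element 1 of list[[2]], list[[3]] and list[[4]] from the question
first <- c(-0.6442149, 0.02354036, -1.40362589)

mean(first)    # average of the three values
sd(first)      # their standard deviation
```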


On 7/30/07, Antje <[EMAIL PROTECTED]> wrote:
> okay, I played a bit around and now I have some kind of testcase for you:
>
> v1 <- NA
> v2 <- rnorm(6)
> v3 <- rnorm(6)
> v4 <- rnorm(6)
> v5 <- rnorm(6)
> v6 <- rnorm(6)
> v7 <- rnorm(6)
> v8 <- rnorm(6)
> v8 <- NA
>
> list <- list(v1,v2,v3,v4,v5,v6,v7,v8)
> categ <- c(NA,"cat1","cat1","cat1","cat2","cat2","cat2",NA)
>
>  > list
> [[1]]
> [1] NA
>
> [[2]]
> [1] -0.6442149 -0.2047012 -1.1986041 -0.2097442 -0.7343465 -1.3888750
>
> [[3]]
> [1]  0.02354036 -1.36186952 -0.42197792  1.50445971 -1.76763996  0.53722404
>
> [[4]]
> [1] -1.40362589  0.13045724 -0.84651458  1.57005071  0.06961015  0.25269771
>
> [[5]]
> [1] -1.1829260  2.1411553 -0.1327081 -0.1053442 -0.8179396 -1.2342698
>
> [[6]]
> [1]  1.17099178  0.49248118 -0.18690065  1.50050976 -0.65552410 -0.01243247
>
> [[7]]
> [1] -0.046778203 -0.233788840  0.443908897 -1.649740180  0.003991354 
> -0.228020092
>
> [[8]]
> [1] NA
>
> now, I need the means (and sd) of element 1 of list[2],list[3],list[4] 
> (because they belong to "cat1") and
>
> = mean(-0.6442149, 0.02354036, -1.40362589)
>
> the same for element 2 up to element 6 (--> I would the get a vector 
> containing the means for "cat1")
> the same for the vectors belonging to "cat2".
>
> does anybody now understand what I mean?
>
> Antje
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] regular expressions : extracting numbers

2007-07-30 Thread jim holtman

Is this what you want:

> x
 [1] "lema, rb 2%"   "rb 2%"         "rb 3%"         "rb 4%"         "rb 3%"         "rb 2%,mineuse"
 [7] "rb"            "rb"            "rb 12"         "rb"            "rj 30%"        "rb"
[13] "rb"            "rb 25%"        "rb"            "rb"            "rb"            "rj, rb"
> gsub("[^0-9]*([0-9]*)[^0-9]*", "\\1", x)
 [1] "2"  "2"  "3"  "4"  "3"  "2"  ""   ""   "12" ""   "30" ""   ""   "25" ""   ""   ""   ""
>
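
Since NA was mentioned as the preferred result for entries without a
number, here is a sketch of that variant on a few of the strings.  It
simply deletes every non-digit character, so a string containing two
separate numbers would have them pasted together; that is an
assumption about the data, not something checked here.

```r
x <- c("lema, rb 2%", "rb", "rb 12", "rj 30%")

nums <- gsub("[^0-9]", "", x)     # drop everything that is not a digit
nums[!nzchar(nums)] <- NA         # empty strings become NA
nums
as.numeric(nums)                  # numeric version, NA where no number
```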


On 7/30/07, GOUACHE David <[EMAIL PROTECTED]> wrote:
> Hello all,
>
> I have a vector of character strings, in which I have letters, numbers, and 
> symbols. What I wish to do is obtain a vector of the same length with just 
> the numbers.
> A quick example -
>
> extract of the original vector :
> "lema, rb 2%" "rb 2%" "rb 3%" "rb 4%" "rb 3%" "rb 2%,mineuse" "rb" "rb" "rb 
> 12" "rb" "rj 30%" "rb" "rb" "rb 25%" "rb" "rb" "rb" "rj, rb"
>
> and the type of thing I wish to end up with :
> "2" "2" "3" "4" "3" "2" "" "" "12" "" "30" "" "" "25" "" "" "" ""
>
> or, instead of "", NA would be acceptable (actually it would almost be better 
> for me)
>
> Anyways, I've been battling with gsub() and things of the sort, but I'm 
> drowning in the regular expressions, despite a few hours of looking at Perl 
> tutorials...
> So if anyone can help me out, it would be greatly appreciated!!
>
> In advance, thanks very much.
>
> David Gouache
> Arvalis - Institut du Végétal
> Station de La Minière
> 78280 Guyancourt
> Tel: 01.30.12.96.22 / Port: 06.86.08.94.32
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] get() with complex objects?

2007-07-27 Thread jim holtman

'get' tries to retrieve the object given by the character string.  The
error message says that object can not be found.  You actually have to
'evaluate' the character string.  See the example below:

> x <- data.frame(a=1:10, b=11:20)
> x$a
 [1]  1  2  3  4  5  6  7  8  9 10
> z <- 'x$a'
> get(z)
Error in get(x, envir, mode, inherits) : variable "x$a" was not found
> # parse and evaluate the character string 'x$a'
> eval(parse(text=z))
 [1]  1  2  3  4  5  6  7  8  9 10

Does this make sense?
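
An alternative sketch that avoids parsing code from strings: keep the
object name and the component name as separate strings, let get() find
the object, and extract the component with [[ ]].  The names
'obj.name' and 'elem.name' are only illustrative.

```r
x <- data.frame(a = 1:10, b = 11:20)

obj.name  <- "x"      # name of the object, as a string
elem.name <- "a"      # name of the component inside it

# get() finds the object; [[ ]] then extracts the component by name
get(obj.name)[[elem.name]]
```

For nested components the same pattern chains, e.g.
get(obj.name)[["silinfo"]][["avg.width"]] for an object that has them.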


On 7/27/07, Mark Orr <[EMAIL PROTECTED]> wrote:
> Hello R-listers,
> I'm having trouble accessing "sub" objects ("attributes"?), e.g.,
> "x$silinfo$avg.width" using the get() command; I'm using get() in a
> loop as illustrated in the following code:
>
> #FIRST MAKE CLUSTERS of VARYING  k
> for (i in 1:300){
>  assign(paste("x.",i,sep=""),pam(x,i))  #WORKS FINE
> }
>
> #NEXT, TAKE LOOK AT AVE. SILHOUETTE VALUE FOR EACH k
>
> #PART 1, MAKE LIST OF OBJECTS NEEDED
> gen.list <- rep("t",300)
> for (i in 1:300){
>  assign(gen.list[i],paste("x.",i,"$silinfo$avg.width",sep=""))
> }
> #WORKS FINE
>
> #PART 2, USE LIST IN LOOP TO ACCESS OBJECT.
> sil.collector <- rep(99,300)
> for(i in 1:300){
>  sil.collector <- get(gen.list[i])
> }
> #HERE IS THE ERROR
> Error in get(x, envir, mode, inherits) : variable
> "x.1$silinfo$avg.width" was not found
>
> So, I get the gist of this error; x.1 is an object findable from get(),
> but the "attribute"  levels are not accessible.  Any suggestions on how
> to get get() to access these levels?  From reading the get()'s help
> page, I don't think it will access the attributes. (my apologies for
> loosely using the term attributes, but I hope it is clear).
>
> Thanks,
>
> Mark Orr
>
> --
> ***
> Mark G. Orr, PhD
> Heilbrunn Dept. of Population and Family Health
> Columbia University
> 60 Haven Ave., B-2
> New York, NY 10032
>
> Tele: 212-304-7823
> Fax:  212-305-7024
>
> www.columbia.edu/~mo2259
>
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.

2007-07-27 Thread jim holtman

results <- list()
myVariableNames <- names(x.val)

for (i in myVariableNames){
    # x.val$i looks for an element literally named "i" and returns NULL;
    # x.val[[i]] uses the name stored in the variable i
    results[[i]] <- names(x.val[[i]])
}
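The same lookup can be written without an explicit loop. This is only a sketch on a made-up two-column data frame (the real x.val comes from the poster's file), showing names() applied to every table and one count pulled out by name.

```r
# Made-up data standing in for the poster's file
x <- data.frame(PR10 = c("V", "V", "V"),
                PR11 = c("T", "S", "S"),
                stringsAsFactors = FALSE)

x.val <- lapply(x, table)             # one frequency table per variable
value.names <- lapply(x.val, names)   # distinct values of each variable

value.names$PR11                          # distinct values seen in PR11
counts <- as.vector(x.val[["PR11"]]["S"]) # frequency of "S" in PR11
```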



On 7/27/07, Allan Kamau <[EMAIL PROTECTED]> wrote:
> Hi All,
> I am having difficulties finding a way to find a substitute to the command 
> "names(v.val$PR14)" so that I could generate the command on the fly for all 
> PR14 to PR200 (please see the previous discussion below to understand what 
> the object x.val contains) . I have tried the following
>
> >results=()#character()
> >myVariableNames=names(x.val)
> >results[length(myVariableNames)]<-NA
>
> >for as.vector(unlist(strsplit(str,",")),mode="list")
> +results[i]<-names(x.val$i)# this does not work it returns a NULL 
> (how can i convert this to x.val$"somevalue" ? )
> >}
>
> Allan.
>
>
> - Original Message 
> From: Allan Kamau <[EMAIL PROTECTED]>
> To: r-help@stat.math.ethz.ch
> Sent: Thursday, July 26, 2007 10:03:17 AM
> Subject: Re: [R] Obtaining summary of frequencies of value occurrences for a 
> variable in a multivariate dataset.
>
> Thanks so much Jim, Andaikalavan, Gabor and others for the help and 
> suggestions.
> The solution will result in a matrix containing nested matrices to enable 
> each variable name, each variables distinct value and the count of the 
> distinct value to be accessible individually.
> The main matrix will contain the variable names, the first level nested 
> matrices will consist of the variables unique values, and each such variable 
> entry will contain a one element vector to contain the count or occurrence 
> frequency.
> This matrix can now be used in comparing other similar datasets for variable 
> values and their frequencies.
>
> Building on the input received so far, a probable solution in building the 
> matrix will include the following.
>
>
> 1)I reading the csv file (containing column headers)
> >my_data=read.table("",header=TRUE,sep=",",dec=".",fill=TRUE)
>
> 2)I group the values in each variable producing an occurrence count(frequency)
> >x.val<-apply(my_data,2,table)
>
> 3)I obtain a vector of the names of the variables in the table
> >names(x.val)
>
> 4)Now I make use of the names (obtained in step 3) to obtain a vector of 
> distinct values in a given variable (in the example below the variable name 
> is $PR14)
> >names(v.val$PR14)
>
> 5)I obtain a vector (with one element) of the frequency of a value obtained 
> from the step above (in our example the value is "V")
> >as.vector(x.val$PR14["V"])
>
> Todo:
> Now I will need to place the steps above in a script (consisting of loops) to 
> build the matrix, step 4 and 5 seem tricky to do programatically.
>
> Allan.
>
>
> - Original Message 
> From: jim holtman <[EMAIL PROTECTED]>
> To: Allan Kamau <[EMAIL PROTECTED]>
> Cc: Adaikalavan Ramasamy <[EMAIL PROTECTED]>; r-help@stat.math.ethz.ch
> Sent: Wednesday, July 25, 2007 1:50:55 PM
> Subject: Re: [R] Obtaining summary of frequencies of value occurrences for a 
> variable in a multivariate dataset.
>
> Also if you want to access the individual values, you can just leave
> it as a list:
>
> > x.val <- apply(x, 2, table)
> > # access each value
> > x.val$PR14["V"]
> V
> 8
>
>
>
> On 7/25/07, Allan Kamau <[EMAIL PROTECTED]> wrote:
> > A subset of the data looks as follows
> >
> > > df[1:10,14:20]
> >    PR10 PR11 PR12 PR13 PR14 PR15 PR16
> > 1     V    T    I    K    V    G    D
> > 2     V    S    I    K    V    G    G
> > 3     V    T    I    R    V    G    G
> > 4     V    S    I    K    I    G    G
> > 5     V    S    I    K    V    G    G
> > 6     V    S    I    R    V    G    G
> > 7     V    T    I    K    I    G    G
> > 8     V    S    I    K    V    E    G
> > 9     V    S    I    K    V    G    G
> > 10    V    S    I    K    V    G    G
> >
> > The result I would like is as follows
> >
> > PR10    PR11       PR12    ...
> > [V:10]  [S:7,T:3]  [I:10]
> >
> > The result can be in a matrix or a vector and each variablename, value and 
> > frequency should be accessible so as to be used for comparisons with 
> > another dataset later.
> > The frequency can be a count or a percentage.
> >
> >
> > Allan.
> >
> >
> > - Original Message 
> > From: Adaikalavan Ramasamy <[E

Re: [R] Convert string to list?

2007-07-26 Thread jim holtman

Is this what you want:

> str <- "P = 0.0, T = 0.0, Q = 0.0"
> x <- eval(parse(text=paste('list(', str, ')')))
> str(x)
List of 3
 $ P: num 0
 $ T: num 0
 $ Q: num 0
>
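If the string comes from somewhere you do not fully trust, eval(parse()) will run whatever it contains. A minimal alternative sketch, assuming every entry has the form "name = number" (the variable names s, pairs, keys and vals are mine):

```r
s <- "P = 0.0, T = 0.0, Q = 0.0"

# Split into "name = value" pieces, then split each piece on "="
pairs <- strsplit(strsplit(s, ",")[[1]], "=")
keys  <- trimws(sapply(pairs, `[`, 1))
vals  <- as.numeric(sapply(pairs, `[`, 2))

# Build the named list without evaluating any code
x <- setNames(as.list(vals), keys)
```

This only handles numeric values; anything more general would need its own parsing rules.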


On 7/26/07, Manuel Morales <[EMAIL PROTECTED]> wrote:
> Let's say I have the following string:
>
> str <- "P = 0.0, T = 0.0, Q = 0.0"
>
> I'd like to find a function that generates the following object from
> 'str'.
>
> list(P = 0.0, T = 0.0, Q = 0.0)
>
> Thanks!
>
> --
> http://mutualism.williams.edu
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Create Strings of Column Id's

2007-07-26 Thread jim holtman

Is this what you want:

> paste("-", paste(colnames(MyMatrix)[COL], collapse='-'), sep='')
[1] "-E-T"
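For reference, a sketch of the loop form of the same thing, using the example matrix from the question. A for() loop does not return a useful value in R, so the assignment has to happen inside the loop body; the vectorised paste() call gives the same string in one step.

```r
MyMatrix <- matrix(NA, ncol = 4, nrow = 1,
                   dimnames = list(NULL, c("E", "R", "T", "Y")))
COL <- c(1, 3)

# Loop version: accumulate inside the body, do not assign the loop itself
Filename <- ""
for (i in colnames(MyMatrix)[COL]) {
    Filename <- paste(Filename, "-", i, sep = "")
}

# Vectorised version: prefix each name with "-" and glue the pieces
Filename.vec <- paste("-", colnames(MyMatrix)[COL], sep = "", collapse = "")
```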


On 7/26/07, Tom.O <[EMAIL PROTECTED]> wrote:
>
> Does anyone know how this is done?
>
> I have a large matrix where I extract specific columns into txt files for
> further use. To be able to keep track of which txt files contain which
> columns I want to name the filenames with the column Id's.
>
> The most basic example would be to use an for() loop together with paste(),
> but the result is blank. Not even NULL.
>
> this is the concept of the code I use:
>
> for example
>
> MyMatrix <- matrix(NA,ncol=4,nrow=1,dimnames=list(NULL,c("E","R","T","Y")))
> COL <- c(1,3) # a vector of columns I want to extract,
>
> Filename <- NULL # the starting variable, so I can use paste
> Filename <- for(i in colnames(MyMatrix)[COL]) {paste(Filename,"-",i,sep="")}
>
> The result is "-T", but I want it to be "-E-T"
>
> Anyone have a clue?
>
> Thanks Tom
>
>
> --
> View this message in context: 
> http://www.nabble.com/Create-Strings-of-Column-Id%27s-tf4153354.html#a11816439
> Sent from the R help mailing list archive at Nabble.com.
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Finding matches in 2 files

2007-07-26 Thread jim holtman

Is this what you want?

> g1<-c("gene1", "gene2", "gene3", "gene4", "gene5", "gene9", "gene10",
+ "geneA")
> g2<-c("gene6", "gene9", "gene1", "gene2", "gene7", "gene8", "gene9",
+ "gene1", "gene10")
> intersect(g1,g2)
[1] "gene1"  "gene2"  "gene9"  "gene10"
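A small sketch of the difference between intersect() and %in%, on shortened made-up vectors. With real files the two vectors would first have to be read in (for example with readLines() or scan(); how depends on the file layout, which the question does not show).

```r
g1 <- c("gene1", "gene2", "gene3")
g2 <- c("gene2", "gene1", "gene1", "gene7")

# intersect() returns each common gene once
common <- intersect(g1, g2)

# %in% keeps the entries of g2 (duplicates included) that occur in g1
in.both <- g2[g2 %in% g1]
```

Use %in% when the matching rows of the second analysis are needed, not just the gene names.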


On 7/25/07, jenny tan <[EMAIL PROTECTED]> wrote:
>
>
> I have 2 files containing data analysed by 2 different methods. I would like 
> to find out which genes appear in both analyses. Can someone show me how to 
> do this?
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] if - else

2007-07-25 Thread jim holtman

try:

Start <- ifelse (DateFirstEvent < DateSecondEvent,
(DateFirstEvent+DateSecondEvent)/2, DateFound)
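If the columns are stored as class Date, a few details matter: ifelse() returns NA wherever the test is NA, it drops the Date class from its result, and in current versions of R two Date objects cannot be added with "+". The following is a sketch under those assumptions, with the four rows of the question typed in as ISO dates; use.mid and midpoint are names I made up.

```r
DateFound       <- as.Date(c("2000-01-10", "2000-01-05", "2000-01-07", "2000-01-09"))
DateFirstEvent  <- as.Date(c(NA, "2000-01-07", "2000-01-07", NA))
DateSecondEvent <- as.Date(c(NA, "2000-01-10", "2000-01-08", NA))

# TRUE only where both events are known and in the expected order
use.mid <- !is.na(DateFirstEvent) & !is.na(DateSecondEvent) &
           DateFirstEvent < DateSecondEvent

# Midpoint as "first event plus half the gap", which stays a Date
midpoint <- DateFirstEvent + (DateSecondEvent - DateFirstEvent) / 2

# Start from DateFound and overwrite only the rows that qualify
Start <- DateFound
Start[use.mid] <- midpoint[use.mid]
```

Rows with missing event dates keep DateFound instead of becoming NA.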



On 7/25/07, James J. Roper <[EMAIL PROTECTED]> wrote:
> Greetings,
>
> I have some confusion with the use of if - else.  Let's say I have
> four variables as follows:
>
> Condition   DateFound   DateFirstEvent   DateSecondEvent
> NA          10Jan2000   NA               NA
> 0           05Jan2000   07Jan2000        10Jan2000
> 1           07Jan2000   07Jan2000        08Jan2000
> 2           09Jan2000   NA               NA
>
> Now, what I need to do is make a new variable that is either the
> midpoint of the first and second event dates, or the date found (I
> will call Start).
>
> I tried an if - else condition as follows:
>
> Start <- if (DateFirstEven < DateSecondEvent)
> (DateFirstEvent+DateSecondEvent)/2 else DateFound
>
> I also tried
>
> Start <- if (any(DateFirstEven < DateSecondEvent))
> (DateFirstEvent+DateSecondEvent)/2 else DateFound
>
> Only the first half of the expression was ever evaluated.
>
> I hope I have not been too brief, and will certainly appreciate any help.
>
> Thanks,
>
> Jim
>
> --
> James J. Roper
> Population Dynamics and Conservation of
> Terrestrial Vertebrates
> Caixa Postal 19034
> 81531-990 Curitiba, Paraná, Brasil
> ===
> E-mail:   [EMAIL PROTECTED]
> Phone/Fone/Teléfono: 55 41 33611764
> celular: 55 41 99870543
> Casa:   55 41 33857249
> ===
> Ecologia e Conservação na UFPR
> http://www.bio.ufpr.br/ecologia/
> ---
> http://jjroper.googlepages.com/
> http://arsartium.googlepages.com/
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.

2007-07-25 Thread jim holtman

Also if you want to access the individual values, you can just leave
it as a list:

> x.val <- apply(x, 2, table)
> # access each value
> x.val$PR14["V"]
V
8
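One caveat, sketched on made-up data: apply() only returns a list when the tables have different lengths. If every column happens to have the same number of distinct values, apply() simplifies the result to a matrix and the "$" access no longer works. lapply() on the data frame always returns a list.

```r
# Made-up data in which both columns have exactly two distinct values
x <- data.frame(PR14 = c("V", "V", "I"),
                PR15 = c("G", "E", "G"),
                stringsAsFactors = FALSE)

as.matrix.result <- apply(as.matrix(x), 2, table)  # simplified to a matrix here
x.val <- lapply(x, table)                          # always a list of tables

n.V <- as.vector(x.val$PR14["V"])      # count of "V" in PR14
n.G <- as.vector(x.val[["PR15"]]["G"]) # count of "G" in PR15
```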



On 7/25/07, Allan Kamau <[EMAIL PROTECTED]> wrote:
> A subset of the data looks as follows
>
> > df[1:10,14:20]
>    PR10 PR11 PR12 PR13 PR14 PR15 PR16
> 1     V    T    I    K    V    G    D
> 2     V    S    I    K    V    G    G
> 3     V    T    I    R    V    G    G
> 4     V    S    I    K    I    G    G
> 5     V    S    I    K    V    G    G
> 6     V    S    I    R    V    G    G
> 7     V    T    I    K    I    G    G
> 8     V    S    I    K    V    E    G
> 9     V    S    I    K    V    G    G
> 10    V    S    I    K    V    G    G
>
> The result I would like is as follows
>
> PR10    PR11       PR12    ...
> [V:10]  [S:7,T:3]  [I:10]
>
> The result can be in a matrix or a vector and each variablename, value and 
> frequency should be accessible so as to be used for comparisons with another 
> dataset later.
> The frequency can be a count or a percentage.
>
>
> Allan.
>
>
> - Original Message 
> From: Adaikalavan Ramasamy <[EMAIL PROTECTED]>
> To: Allan Kamau <[EMAIL PROTECTED]>
> Cc: r-help@stat.math.ethz.ch
> Sent: Tuesday, July 24, 2007 10:21:51 PM
> Subject: Re: [R] Obtaining summary of frequencies of value occurrences for a 
> variable in a multivariate dataset.
>
> The name of the table should give you the "value". And if you have a
> matrix, you just need to convert it into a vector first.
>
>  > m <- matrix( LETTERS[ c(1:3, 3:5, 2:4) ], nc=3 )
>  > m
>  [,1] [,2] [,3]
> [1,] "A"  "C"  "B"
> [2,] "B"  "D"  "C"
> [3,] "C"  "E"  "D"
>  > tb <- table( as.vector(m) )
>  > tb
>
> A B C D E
> 1 2 3 2 1
>  > paste( names(tb), ":", tb, sep="" )
> [1] "A:1" "B:2" "C:3" "D:2" "E:1"
>
> If this is not what you want, then please give a simple example.
>
> Regards, Adai
>
>
>
> Allan Kamau wrote:
> > Hi all,
> > If the question below as been answered before I
> > apologize for the posting.
> > I would like to get the frequencies of occurrence of
> > all values in a given variable in a multivariate
> > dataset. In short for each variable (or field) a
> > summary of values contained with in a value:frequency
> > pair, there can be many such pairs for a given
> > variable. I would like to do the same for several such
> > variables.
> > I have used table() but am unable to extract the
> > individual value and frequency values.
> > Please advise.
> >
> > Allan.
> >
> >
> >
> >
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.

2007-07-25 Thread jim holtman

Is this what you want:

> x <- read.table(textConnection("   PR10 PR11 PR12 PR13 PR14 PR15 PR16
+ 1     V    T    I    K    V    G    D
+ 2     V    S    I    K    V    G    G
+ 3     V    T    I    R    V    G    G
+ 4     V    S    I    K    I    G    G
+ 5     V    S    I    K    V    G    G
+ 6     V    S    I    R    V    G    G
+ 7     V    T    I    K    I    G    G
+ 8     V    S    I    K    V    E    G
+ 9     V    S    I    K    V    G    G
+ 10    V    S    I    K    V    G    G"), header=TRUE)
> x.t <- apply(x, 2, function(.col){
+     .tab <- table(.col)
+     paste('[', paste(names(.tab), .tab, sep=":", collapse=','), ']', sep='')
+ })
>
>
> x.t
       PR10        PR11        PR12        PR13        PR14        PR15
   "[V:10]" "[S:7,T:3]"    "[I:10]" "[K:8,R:2]" "[I:2,V:8]" "[E:1,G:9]"
       PR16
"[D:1,G:9]"
>
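Since the question says the frequency may also be a percentage, here is a sketch of the same summary with prop.table(). The data frame is made up and the names pct and summary.str are mine.

```r
# Made-up data: PR11 is 75% "S" and 25% "T", PR12 is all "I"
x <- data.frame(PR11 = c("T", "S", "S", "S"),
                PR12 = c("I", "I", "I", "I"),
                stringsAsFactors = FALSE)

# Percentage of each distinct value, per variable
pct <- lapply(x, function(col) 100 * prop.table(table(col)))

# Same "[value:frequency]" layout as above, with percentages
summary.str <- sapply(pct, function(p)
    paste("[", paste(names(p), ":", p, "%", sep = "", collapse = ","), "]",
          sep = ""))
```

The list pct keeps the numbers themselves, which is more convenient than the strings for comparing two datasets.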


On 7/25/07, Allan Kamau <[EMAIL PROTECTED]> wrote:
> A subset of the data looks as follows
>
> > df[1:10,14:20]
>    PR10 PR11 PR12 PR13 PR14 PR15 PR16
> 1     V    T    I    K    V    G    D
> 2     V    S    I    K    V    G    G
> 3     V    T    I    R    V    G    G
> 4     V    S    I    K    I    G    G
> 5     V    S    I    K    V    G    G
> 6     V    S    I    R    V    G    G
> 7     V    T    I    K    I    G    G
> 8     V    S    I    K    V    E    G
> 9     V    S    I    K    V    G    G
> 10    V    S    I    K    V    G    G
>
> The result I would like is as follows
>
> PR10    PR11       PR12    ...
> [V:10]  [S:7,T:3]  [I:10]
>
> The result can be in a matrix or a vector and each variablename, value and 
> frequency should be accessible so as to be used for comparisons with another 
> dataset later.
> The frequency can be a count or a percentage.
>
>
> Allan.
>
>
> - Original Message 
> From: Adaikalavan Ramasamy <[EMAIL PROTECTED]>
> To: Allan Kamau <[EMAIL PROTECTED]>
> Cc: r-help@stat.math.ethz.ch
> Sent: Tuesday, July 24, 2007 10:21:51 PM
> Subject: Re: [R] Obtaining summary of frequencies of value occurrences for a 
> variable in a multivariate dataset.
>
> The name of the table should give you the "value". And if you have a
> matrix, you just need to convert it into a vector first.
>
>  > m <- matrix( LETTERS[ c(1:3, 3:5, 2:4) ], nc=3 )
>  > m
>  [,1] [,2] [,3]
> [1,] "A"  "C"  "B"
> [2,] "B"  "D"  "C"
> [3,] "C"  "E"  "D"
>  > tb <- table( as.vector(m) )
>  > tb
>
> A B C D E
> 1 2 3 2 1
>  > paste( names(tb), ":", tb, sep="" )
> [1] "A:1" "B:2" "C:3" "D:2" "E:1"
>
> If this is not what you want, then please give a simple example.
>
> Regards, Adai
>
>
>
> Allan Kamau wrote:
> > Hi all,
> > If the question below as been answered before I
> > apologize for the posting.
> > I would like to get the frequencies of occurrence of
> > all values in a given variable in a multivariate
> > dataset. In short for each variable (or field) a
> > summary of values contained with in a value:frequency
> > pair, there can be many such pairs for a given
> > variable. I would like to do the same for several such
> > variables.
> > I have used table() but am unable to extract the
> > individual value and frequency values.
> > Please advise.
> >
> > Allan.
> >
> >
> >
> >
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Passing equations as arguments

2007-07-24 Thread jim holtman

Here is one possible solution:

ifun <- function(a, b, FUN){
    evala <- FUN(a)
    evalb <- FUN(b)
    if (evala > evalb) return(evala) else return(evalb)
}
ifun(1, 2, function(x) (x*x) - 2)
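If the equation really has to arrive as an expression in terms of x, not as a function, one possible sketch is to pass it with quote() and evaluate it with x bound to each argument. The name myfunc follows the question; the inner helper f is my own.

```r
myfunc <- function(a, b, eqn) {
    # Evaluate the quoted expression with x set to the supplied value
    f <- function(x) eval(eqn, list(x = x))
    max(f(a), f(b))
}

res <- myfunc(1, 2, quote(x * x - 2))   # f(1) = -1, f(2) = 2
```

Passing a function, as in ifun above, is usually the simpler design; the quoted form is only worth it if the equation is built up as text or an expression elsewhere.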



On 7/24/07, Anup Nandialath <[EMAIL PROTECTED]> wrote:
> Friends,
>
> I'm trying to pass an equation as an argument to a function. The idea is as 
> follows. Let us say i write an independent function
>
> Ideal Situation:
>
> ifunc <- function(x)
> {
> return((x*x)-2)
> }
>
> mainfunc <- function(a,b)
> {
> evala <- ifunc(a)
> evalb <- ifunc(b)
> if (evala>evalb){return(evala)}
> else
> return(evalb)
> }
>
> Now I want to try and write this entire program in a single function with the 
> user specifying the equation as an argument to the function.
>
> myfunc <- function(a, b, eqn)
> {
>func1 <- function (x) ??
>{
>return(eqn in terms of x)  ??
>   }
>
> Further arguments to check
>
> The ?? imply that this does not seem to be correct. The idea is how to 
> assign the equation expression from the main equation into the inner 
> function. Is there anyway to do that within this set up?
>
>
> Thanks in advance
> Regards
>
> Anup
>
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] code optimization tips

2007-07-23 Thread jim holtman

The quote "what is the problem you are trying to solve" is just part
of my signature.  I used to review projects for performance and
architecture and that was the first question I always asked them.

To pass the argument, if you notice the definition of apply:

apply(X, MARGIN, FUN, ...)

the ... are optional arguments that apply() passes on to FUN, so for
your function:

sij <- function(rj, ri, k){
    rij <- mod(rj-ri)
    cos_ij <- rj[1]/rij
    sin_ij <- rj[2]/rij
    A <- (1-1i*k*rij)*(3*cos_ij^2-1)*exp(1i*k*rij)/(rij^3)
    B <- k^2*sin_ij^2*exp(1i*k*rij)/rij
    sij <- A+B
}

you would call apply with the following:

s_ij<-apply(rj,2,sij, ri=ri, k=k)
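A tiny self-contained illustration of the same mechanism, with a made-up function dist_from and made-up points: each column of the matrix becomes the first argument of FUN, and the named arguments after FUN are handed through unchanged.

```r
mod <- function(x) sqrt(x[1]^2 + x[2]^2)   # modulus, as in the original code

# Hypothetical helper: scaled distance of point rj from point ri
dist_from <- function(rj, ri, scale) mod(rj - ri) * scale

pts <- matrix(c(3, 4,
                6, 8), nrow = 2)           # two points, one per column

# ri and scale are passed on to dist_from through '...'
d <- apply(pts, 2, dist_from, ri = c(0, 0), scale = 2)
```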


On 7/23/07, baptiste Auguié <[EMAIL PROTECTED]> wrote:
> Thanks for your reply,
>
> On 23 Jul 2007, at 15:19, jim holtman wrote:
>
> > First question is why are you defining the functions within the main
> > function each time?  Why don't you define them once outside?
> >
>
>
> Fair enough!
>
> As said, I'm new to R and don't know whether it is best to define
> functions outside and pass to them all necessary arguments, or nest
> them and get variables in the scope from parents. In any case, I'd
> agree my positions(), mod() and sij() functions would be better
> outside. Here is a corrected version (untested as something else is
> running),
>
> >
> > positions <- function(N) {
> >reps <- 2*N+1
> >matrix(c(rep(-N:N, each = reps), rep(-N:N, times = reps)),
> >   nrow = 2, byrow = TRUE)
> > }
>
> > mod<-function(x){sqrt(x[1]^2+x[2]^2)} # modulus
>
> > sij <-function(rj,ri,k){
> > rij=mod(rj-ri)
> > cos_ij=rj[1]/rij
> > sin_ij=rj[2]/rij
> >
> > A<-(1-1i*k*rij)*(3*cos_ij^2-1)*exp(1i*k*rij)/(rij^3)
> > B<-k^2*sin_ij^2*exp(1i*k*rij)/rij
> >
> > sij<-A+B
> > }
>
> > alpha_c <- function(lambda=600e-9,alpha_s=1e-14,N=400,spacing=1e-7){
> >
> > k<-2*pi/lambda
> > ri<-c(0,0) # particle at the origin
> >
> > rj<-positions(N)*spacing # all positions in the 2N x 2N array
> > rj<-rj[1:2,-((dim(rj)[2]-1)/2+1)] # remove rj=(0,0)
> >
> > s_ij<-apply(rj,2,sij)
>
> *** Now, how do I pass k and ri to this function ? ***
>
> > S<-sum(s_ij)
> > alpha_s/(1-alpha_s*S)
> > }
> > alpha_c()
> >
>
>
> >
> > --
> > Jim Holtman
> > Cincinnati, OH
> > +1 513 646 9390
> >
> > What is the problem you are trying to solve?
>
>
> Wondering whether that's part of the signature?
>
> the problem is related to scattering by arrays of particles, more
> specifically to evaluate the array influence on the effective
> polarizability (alpha) of a particle via dipolar radiative coupling.
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] code optimization tips

2007-07-23 Thread jim holtman

First question is why are you defining the functions within the main
function each time?  Why don't you define them once outside?
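Beyond moving the definitions outside, the apply() call itself can probably be replaced by arithmetic on whole rows of the position matrix. This is an untimed sketch, assuming rj is the 2-row matrix built by positions() in the quoted code; sij_vec is a name I made up, and the small N is only there to keep the example quick.

```r
positions <- function(N) {
    reps <- 2*N + 1
    matrix(c(rep(-N:N, each = reps), rep(-N:N, times = reps)),
           nrow = 2, byrow = TRUE)
}

# Vectorised form of sij: works on all columns of rj at once
sij_vec <- function(rj, ri, k) {
    rij    <- sqrt((rj[1, ] - ri[1])^2 + (rj[2, ] - ri[2])^2)
    cos_ij <- rj[1, ] / rij
    sin_ij <- rj[2, ] / rij
    A <- (1 - 1i*k*rij) * (3*cos_ij^2 - 1) * exp(1i*k*rij) / rij^3
    B <- k^2 * sin_ij^2 * exp(1i*k*rij) / rij
    A + B
}

N <- 2; spacing <- 1e-7; k <- 2*pi/600e-9
rj <- positions(N) * spacing
rj <- rj[, -((ncol(rj) - 1)/2 + 1)]        # remove rj = (0,0)
S  <- sum(sij_vec(rj, ri = c(0, 0), k = k))
```

Running both versions under system.time() with the real N would show how much this helps in practice.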

On 7/23/07, baptiste Auguié <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Being new to R I'm asking for some advice on how to optimize the
> performance of the following piece of code:
>
>
> > alpha_c <- function(lambda=600e-9,alpha_s=1e-14,N=400,spacing=1e-7){
> >
> > k<-2*pi/lambda
> > ri<-c(0,0) # particle at the origin
> > x<-c(-N:N)
> > positions <- function(N) {
> >reps <- 2*N+1
> >matrix(c(rep(-N:N, each = reps), rep(-N:N, times = reps)),
> >   nrow = 2, byrow = TRUE)
> > }
> > rj<-positions(N)*spacing # all positions in the 2N x 2N array
> > rj<-rj[1:2,-((dim(rj)[2]-1)/2+1)] # remove rj=(0,0)
> >
> > mod<-function(x){sqrt(x[1]^2+x[2]^2)} # modulus
> >
> > sij <-function(rj){
> > rij=mod(rj-ri)
> > cos_ij=rj[1]/rij
> > sin_ij=rj[2]/rij
> >
> > A<-(1-1i*k*rij)*(3*cos_ij^2-1)*exp(1i*k*rij)/(rij^3)
> > B<-k^2*sin_ij^2*exp(1i*k*rij)/rij
> >
> > sij<-A+B
> > }
> >
> > s_ij<-apply(rj,2,sij)
> > S<-sum(s_ij)
> > alpha_s/(1-alpha_s*S)
> > }
> > alpha_c()
>
>
> This function is to be called for a few tens of values of lambda in a
> 'for' loop, and possibly a couple of different N and spacing (their
> magnitude is typically around the default one).
>
> This can be a bit slow --- not that I would expect otherwise --- and
> I wonder if there is something I could do to optimize it (vectorize
> with respect to the lambda parameter?, change the units of the
> problem to deal with numbers closer to unity?,...)
>
> Best regards,
>
> baptiste
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Dataframe of factors transform speed?

2007-07-21 Thread jim holtman

The problem is in the way that 'as.data.frame' works.  Use Rprof on a
small list and you will see where it is spending its time.

Now if you are really sure that all your data is consistent with being
a data frame, you can create the data frame structure yourself.  Not
that I would advocate it, but if you look at the output of 'dput' on a
data frame, you can see how to construct your own.

Here it took 20 seconds to create the test data with a list of 50,000
and only 2 seconds to create the data frame from that.

> set.seed(123)
> n <- 50000
> system.time({
+ genoT <- lapply(1:n, function(i) factor(sample(c("AA",
+ "AB", "BB"), 1000, prob=c(1000, 1, 1), rep=T)))
+ })
   user  system elapsed
  20.85    0.12   22.83
> names(genoT) = paste("snp", 1:n, sep="")
>
> # create your own data frame structure -- if you are real sure of your data
>
> system.time(genoTz <- structure(genoT, .Names=names(genoT),
+ row.names=c(NA, -length(genoT[[1]])), class='data.frame'))
   user  system elapsed
   2.00    0.08    2.11
> str(genoTz)
'data.frame':   1000 obs. of  50000 variables:
 $ snp1: Factor w/ 2 levels "AA","AB": 1 1 1 1 1 1 1 1 1 1 ...
 $ snp2: Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 1 1 1 1 1 ...
 $ snp3: Factor w/ 2 levels "AA","AB": 1 1 1 1 1 1 1 1 1 1 ...
 $ snp4: Factor w/ 2 levels "AA","AB": 1 1 1 1 1 1 1 1 1 1 ...
 $ snp5: Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 1 1 1 1 1 ...
 $ snp6: Factor w/ 2 levels "AA","AB": 1 1 1 1 1 1 1 1 1 1 ...
 $ snp7: Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
 $ snp8: Factor w/ 2 levels "AA","BB": 1 1 1 1 1 1 1 1 1 1 ...
 $ snp9: Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 1 1 1 1 1 ...
 $ snp10   : Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 1 1 1 1 1 ...
 $ snp11   : Factor w/ 1 level "AA": 1 1 1 1 1 1 1 1 1 1 ...
>
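For the original goal in this thread (giving every column the same three levels), a sketch that avoids the nested loop over levels. It is shown on a made-up three-column data frame and has not been timed on data of the size discussed here; all.levels is my own name.

```r
# Made-up genotypes; each column has dropped the levels it never saw
geno <- data.frame(snp1 = factor(c("AA", "AB", "AA")),
                   snp2 = factor(c("BB", "BB", "AB")),
                   snp3 = factor(c("AA", "AA", "AA")))

all.levels <- c("AA", "AB", "BB")

# Rebuild every column with the full level set; geno[] keeps the data frame
geno[] <- lapply(geno, factor, levels = all.levels)

n.lev <- sapply(geno, nlevels)
```

For very wide data it may be cheaper to run the lapply() on the plain list and attach the data frame attributes afterwards, as in the structure() call above.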


On 7/21/07, Latchezar Dimitrov <[EMAIL PROTECTED]> wrote:
> Jim,
>
> No, this is _not the problem. If you go to my 1st mail I have a monster
> (at least was when I purchased it) with 32GB (sic :-) of RAM and 4 dual
> core AMD64 285 (the fastest at that time and still pretty fast now :-)
>
> The machine starts paging when I run 2 copies of R working on two things
> like that :-). If you look at my last e-mail I found a solution but
> still have no clue why the heck x<-as.data.frame(y), where y is a list
> of the same columns, takes really forever, and this is the thing that
> killed me before.
>
> Thanks,
> Latchezar
>
> > -Original Message-
> > From: jim holtman [mailto:[EMAIL PROTECTED]
> > Sent: Saturday, July 21, 2007 5:33 PM
> > To: Latchezar Dimitrov
> > Cc: Benilton Carvalho; r-help@stat.math.ethz.ch
> > Subject: Re: [R] Dataframe of factors transform speed?
> >
> > One of the problems is that you are probably paging on your
> > system with an object that size (240000 x 1000).  This is
> > about 1GB for a single object:
> >
> > > set.seed(123)
> > > n <- 240000
> > > system.time({
> > + genoT <- lapply(1:n, function(i) factor(sample(c("AA", "AB", "BB"),
> > + 1000, prob=c(1000, 1, 1), rep=T)))
> > + })
> >    user  system elapsed
> >   95.00    0.61  104.71
> > > names(genoT) = paste("snp", 1:n, sep="")
> > >
> > > object.size(genoT)
> > [1] 1045258752
> > >
> >
> > I can create it on my 2GB machine as a list, but have
> > problems converting it to a dataframe because I don't have
> > enough memory.
> >
> > So unless you have at least 4GB on your system, it might take
> > a long time.  Look at your performance measurements on your
> > system and see if you have run out of physical memory and are paging.
> >
> > On 7/21/07, Latchezar Dimitrov <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > Thanks for the help. My 1st question still unanswered though :-)
> > > Please see below
> > >
> > > > -Original Message-
> > > > From: Benilton Carvalho [mailto:[EMAIL PROTECTED]
> > > > Sent: Friday, July 20, 2007 3:30 AM
> > > > To: Latchezar Dimitrov
> > > > Cc: r-help@stat.math.ethz.ch
> > > > Subject: Re: [R] Dataframe of factors transform speed?
> > > >
> > > > set.seed(123)
> > > > genoT = lapply(1:24, function(i) factor(sample(c("AA", &quo

Re: [R] Dataframe of factors transform speed?

2007-07-21 Thread jim holtman

> > >>> [1]   1002 238304
> > >>>
> > >>>> str(genoT)
> > >>> 'data.frame':   1002 obs. of  238304 variables:
> > >>>  $ SNP_A.4261647: Factor w/ 3 levels "0","1","2": 3 3 3 3 3
> > >> 3 3 3 3 3
> > >>> ...
> > >>>  $ SNP_A.4261610: Factor w/ 3 levels "0","1","2": 1 1 3 3 1
> > >> 1 1 2 2 2
> > >>> ...
> > >>>  $ SNP_A.4261601: Factor w/ 3 levels "0","1","2": 1 1 1 1 1
> > >> 1 1 1 1 1
> > >>> ...
> > >>>  $ SNP_A.4261704: Factor w/ 3 levels "0","1","2": 3 3 3 3 3
> > >> 3 3 3 3 3
> > >>> ...
> > >>>  $ SNP_A.4261563: Factor w/ 3 levels "0","1","2": 3 1 2 1 2
> > >> 3 2 3 3 1
> > >>> ...
> > >>>  $ SNP_A.4261554: Factor w/ 3 levels "0","1","2": 1 1 NA
> > 1 NA 2 1 1
> > >>> 2 1
> > >>> ...
> > >>>  $ SNP_A.4261666: Factor w/ 3 levels "0","1","2": 1 1 2 1 1
> > >> 1 1 1 1 2
> > >>> ...
> > >>>  $ SNP_A.4261634: Factor w/ 3 levels "0","1","2": 3 3 2 3 3
> > >> 3 3 3 3 2
> > >>> ...
> > >>>  $ SNP_A.4261656: Factor w/ 3 levels "0","1","2": 1 1 2 1 1
> > >> 1 1 1 1 2
> > >>> ...
> > >>>  $ SNP_A.4261637: Factor w/ 3 levels "0","1","2": 1 3 2 3 2
> > >> 1 2 1 1 3
> > >>> ...
> > >>>  $ SNP_A.4261597: Factor w/ 3 levels "AA","AB","BB": 2 2 3 3 3 2 1
> > >>> 2 2 3
> > >>> ...
> > >>>  $ SNP_A.4261659: Factor w/ 3 levels "AA","AB","BB": 3 3 3 3 3 3 3
> > >>> 3 3 3
> > >>> ...
> > >>>  $ SNP_A.4261594: Factor w/ 3 levels "AA","AB","BB": 2 2 2 1 1 1 2
> > >>> 2 2 2
> > >>> ...
> > >>>  $ SNP_A.4261698: Factor w/ 2 levels "AA","AB": 1 1 1 1 1 1 1 1 1
> > >>> 1 ...
> > >>>  $ SNP_A.4261538: Factor w/ 3 levels "AA","AB","BB": 2 3 2 2 3 2 2
> > >>> 1 1 2
> > >>> ...
> > >>>  $ SNP_A.4261621: Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 1 1
> > >>> 1 1 1
> > >>> ...
> > >>>  $ SNP_A.4261553: Factor w/ 3 levels "AA","AB","BB": 1 1 2 1 1 1 1
> > >>> 1 1 1
> > >>> ...
> > >>>  $ SNP_A.4261528: Factor w/ 2 levels "AA","AB": 1 1 1 1 1 1 1 1 1
> > >>> 1 ...
> > >>>  $ SNP_A.4261579: Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 2 1
> > >>> 1 1 2
> > >>> ...
> > >>>  $ SNP_A.4261513: Factor w/ 3 levels "AA","AB","BB": 2 1 2
> > >> 2 2 NA 1 NA
> > >>> 2
> > >>> 1 ...
> > >>>  $ SNP_A.4261532: Factor w/ 3 levels "AA","AB","BB": 1 2 2 1 1 1 3
> > >>> 1 1 1
> > >>> ...
> > >>>  $ SNP_A.4261600: Factor w/ 2 levels "AB","BB": 2 2 2 2 2 2 2 2 2
> > >>> 2 ...
> > >>>  $ SNP_A.4261706: Factor w/ 2 levels "AA","BB": 1 1 1 1 1 1 1 1 1
> > >>> 1 ...
> > >>>  $ SNP_A.4261575: Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 1 1
> > >>> 2 2 1
> > >>> ...
> > >>>
> > >>> Its columns are factors with different number of levels
> > >> (from 1 to 3 -
> > >>> that's what I got from read.table, i.e., it dropped missing
> > >> levels). I
> > >>> want to convert it to uniform factors with 3 levels. The
> > >> 1st 10 rows
> > >>> above show already converted columns and the rest are not yet
> > >>> converted.
> > >>> Here's my attempt which is a complete failure as speed:
> > >>>
> > >>>> system.time(
> > >>> + for(j in 1:(10 )){ #-- this is to try 1st
> > 10 cols and
> > >>> measure the time, it otherwise is ncol(genoT) instead of 10
> > >>>
> > >>> +gt<-genoT[[j]]  #-- this is to avoid 2D indices
> > >>> +for(l in 1:length([EMAIL PROTECTED])){
> > >>> +  levels(gt)[l] <-
> > >> switch([EMAIL PROTECTED],AA="0",AB="1",BB="2")
> > >>> #-- convert levels to "0","1", or "2"
> > >>> +  genoT[[j]]<-factor(gt,levels=0:2)   #-- make a 3-level
> > >>> factor
> > >>> and put it back
> > >>> +}
> > >>> + }
> > >>> + )
> > >>> [1] 785.085   4.358 789.454   0.000   0.000
> > >>>
> > >>> 789s for 10 columns only!
> > >>>
> > >>> To me it seems like replacing 10 x 3 levels and then making
> > >> a factor
> > >>> of
> > >>> 1002 element vector x 10 is a "negligible" amount of operations
> > >>> needed.
> > >>>
> > >>> So, what's wrong with me? Any idea how to accelerate
> > >> significantly the
> > >>> transformation or (to go to the very beginning) to make
> > >> read.table use
> > >>> a fixed set of levels ("AA","AB", and "BB") and not to drop any
> > >>> (missing)
> > >>> level?
> > >>>
> > >>> R-devel_2006-08-26, Sun Solaris 10 OS - x86 64-bit
> > >>>
> > >>> The machine is with 32G RAM and AMD Opteron 285 (2.? GHz)
> > >> so it's not
> > >>> it.
> > >>>
> > >>> Thank you very much for the help,
> > >>>
> > >>> Latchezar Dimitrov,
> > >>> Analyst/Programmer IV,
> > >>> Wake Forest University School of Medicine, Winston-Salem, North
> > >>> Carolina, USA
> > >>>
> > >>> __
> > >>> R-help@stat.math.ethz.ch mailing list
> > >>> https://stat.ethz.ch/mailman/listinfo/r-help
> > >>> PLEASE do read the posting guide
> > http://www.R-project.org/posting-
> > >>> guide.html and provide commented, minimal, self-contained,
> > >>> reproducible code.
> > >>
> >
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SOS

2007-07-20 Thread jim holtman

You can use sprintf:

> x <- runif(5)
> x
[1] 0.89838968 0.94467527 0.66079779 0.62911404 0.06178627
> cat(sprintf("%.2f%% ", x * 100))
89.84%  94.47%  66.08%  62.91%  6.18% >
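
If the percentages are needed as a character vector rather than printed output, the same format string can be wrapped in a small helper. This is only a sketch: as_percent is a made-up name, not a function from base R or any package, and the sample returns are invented.

```r
# as_percent() is a hypothetical helper, not part of base R
as_percent <- function(r, digits = 2) {
  # build a format such as "%.2f%%"; "%%" prints a literal percent sign
  fmt <- paste0("%.", digits, "f%%")
  sprintf(fmt, r * 100)
}

returns <- c(0.0152699, -0.0041, 0.1)   # invented sample returns
as_percent(returns)
```

The result is character, so keep the numeric returns around for any further calculation.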


On 7/20/07, Fabrice McShort <[EMAIL PROTECTED]> wrote:
> Hi Julian,
>
> Thank you very much. Please let me know how to get 2 numbers after the decim.
>
> Best regards,
>
> Fabrice
>
>
>
> > Date: Fri, 20 Jul 2007 08:15:42 -0700
> > From: [EMAIL PROTECTED]
> > To: [EMAIL PROTECTED]
> > CC: r-help@stat.math.ethz.ch
> > Subject: Re: [R] SOS
> >
> > Multiply by 100? Add
> >
> > R=R*100
> >
> > Fabrice McShort wrote:
> > > Dear all, I am a new user of R. I would like to know how to get fund's
> > > returns in percentage (%). For example, I use:
> > > R <- ts(read.xls("FundData"), frequency = 12, start = c(1996, 1))
> > > With this program, the returns are like 0.0152699. But, I would like
> > > to have 1.52%. Please advise me about the function. Thanks! Fabrice
> >
> > --
> > Julian M. Burgos
> > Fisheries Acoustics Research Lab
> > School of Aquatic and Fishery Science
> > University of Washington
> > 1122 NE Boat Street
> > Seattle, WA 98105
> > Phone: 206-221-6864
>
>
>
>    [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] binned column in a data.frame

2007-07-20 Thread jim holtman

You can also use 'cut' to break the bins:

> x <- c(1,2,6,8,13,0,5,10, runif(10) * 100)
> x.bins <- seq(0, max(x)+5, 5)
> x.cut <- cut(x, breaks=x.bins, include.lowest=TRUE)
> x.names <- paste(head(x.bins, -1), tail(x.bins, -1), sep='-')
> data.frame(x, bins=x.names[x.cut])
  x  bins
1   1.0   0-5
2   2.0   0-5
3   6.0  5-10
4   8.0  5-10
5  13.0 10-15
6   0.0   0-5
7   5.0   0-5
8  10.0  5-10
9  75.85256 75-80
10 38.20424 35-40
11 77.30647 75-80
12 62.02278 60-65
13 73.42095 70-75
14 78.69244 75-80
15 66.52972 65-70
16 61.64897 60-65
17 23.99252 20-25
18 42.08632 40-45
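
A slightly shorter variant, as a sketch: cut() accepts a 'labels' argument, so the bin names can be attached directly instead of indexing x.names afterwards. The column names Start and Binned_Start are taken from the question; the data are the first eight values used above.

```r
x <- c(1, 2, 6, 8, 13, 0, 5, 10)
x.bins <- seq(0, 15, 5)
# labels of the form "lower-upper", one per interval
x.names <- paste(head(x.bins, -1), tail(x.bins, -1), sep = "-")
# include.lowest=TRUE puts the value 0 into the first bin
binned <- cut(x, breaks = x.bins, labels = x.names, include.lowest = TRUE)
result <- data.frame(Start = x, Binned_Start = binned)
result
```

With the default right=TRUE the intervals are closed on the right, so 5 lands in "0-5" and 10 in "5-10".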


On 7/20/07, João Fadista <[EMAIL PROTECTED]> wrote:
> Dear all,
>
> I would like to know how can I create a binned column in a data.frame. The 
> output that I would like is something like this:
>
> Start  Binned_Start
> 1      0-5
> 2      0-5
> 6      5-10
> 8      5-10
> 13     10-15
> ...
>
>
>
>
> Best regards
>
> João Fadista
> Ph.d. student
>
>
>
> UNIVERSITY OF AARHUS
> Faculty of Agricultural Sciences
> Dept. of Genetics and Biotechnology
> Blichers Allé 20, P.O. BOX 50
> DK-8830 Tjele
>
> Phone:   +45 8999 1900
> Direct:  +45 8999 1900
> E-mail:  [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>
> Web: www.agrsci.org <http://www.agrsci.org/>
> 
>
> News and news media <http://www.agrsci.org/navigation/nyheder_og_presse> .
>
> This email may contain information that is confidential. Any use or 
> publication of this email without written permission from Faculty of 
> Agricultural Sciences is not allowed. If you are not the intended recipient, 
> please notify Faculty of Agricultural Sciences immediately and delete this 
> email.
>
>
>[[alternative HTML version deleted]]
>
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Dataframe of factors transform speed?

2007-07-19 Thread jim holtman

: 3 1 2 1 2 3 2 3 3 1
> ...
>  $ SNP_A.4261554: Factor w/ 3 levels "0","1","2": 1 1 NA 1 NA 2 1 1 2 1
> ...
>  $ SNP_A.4261666: Factor w/ 3 levels "0","1","2": 1 1 2 1 1 1 1 1 1 2
> ...
>  $ SNP_A.4261634: Factor w/ 3 levels "0","1","2": 3 3 2 3 3 3 3 3 3 2
> ...
>  $ SNP_A.4261656: Factor w/ 3 levels "0","1","2": 1 1 2 1 1 1 1 1 1 2
> ...
>  $ SNP_A.4261637: Factor w/ 3 levels "0","1","2": 1 3 2 3 2 1 2 1 1 3
> ...
>  $ SNP_A.4261597: Factor w/ 3 levels "AA","AB","BB": 2 2 3 3 3 2 1 2 2 3
> ...
>  $ SNP_A.4261659: Factor w/ 3 levels "AA","AB","BB": 3 3 3 3 3 3 3 3 3 3
> ...
>  $ SNP_A.4261594: Factor w/ 3 levels "AA","AB","BB": 2 2 2 1 1 1 2 2 2 2
> ...
>  $ SNP_A.4261698: Factor w/ 2 levels "AA","AB": 1 1 1 1 1 1 1 1 1 1 ...
>  $ SNP_A.4261538: Factor w/ 3 levels "AA","AB","BB": 2 3 2 2 3 2 2 1 1 2
> ...
>  $ SNP_A.4261621: Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 1 1 1 1 1
> ...
>  $ SNP_A.4261553: Factor w/ 3 levels "AA","AB","BB": 1 1 2 1 1 1 1 1 1 1
> ...
>  $ SNP_A.4261528: Factor w/ 2 levels "AA","AB": 1 1 1 1 1 1 1 1 1 1 ...
>  $ SNP_A.4261579: Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 2 1 1 1 2
> ...
>  $ SNP_A.4261513: Factor w/ 3 levels "AA","AB","BB": 2 1 2 2 2 NA 1 NA 2
> 1 ...
>  $ SNP_A.4261532: Factor w/ 3 levels "AA","AB","BB": 1 2 2 1 1 1 3 1 1 1
> ...
>  $ SNP_A.4261600: Factor w/ 2 levels "AB","BB": 2 2 2 2 2 2 2 2 2 2 ...
>  $ SNP_A.4261706: Factor w/ 2 levels "AA","BB": 1 1 1 1 1 1 1 1 1 1 ...
>  $ SNP_A.4261575: Factor w/ 3 levels "AA","AB","BB": 1 1 1 1 1 1 1 2 2 1
> ...
>
> Its columns are factors with different number of levels (from 1 to 3 -
> that's what I got from read.table, i.e., it dropped missing levels). I
> want to convert it to uniform factors with 3 levels. The 1st 10 rows
> above show already converted columns and the rest are not yet converted.
> Here's my attempt wich is a complete failure as speed:
>
> > system.time(
> + for(j in 1:(10 )){ #-- this is to try 1st 10 cols and
> measure the time, it otherwise is ncol(genoT) instead of 10
>
> +gt<-genoT[[j]]  #-- this is to avoid 2D indices
> +for(l in 1:length([EMAIL PROTECTED])){
> +  levels(gt)[l] <- switch([EMAIL PROTECTED],AA="0",AB="1",BB="2")
> #-- convert levels to "0","1", or "2"
> +  genoT[[j]]<-factor(gt,levels=0:2)   #-- make a 3-level factor
> and put it back
> +}
> + }
> + )
> [1] 785.085   4.358 789.454   0.000   0.000
>
> 789s for 10 columns only!
>
> To me it seems like replacing 10 x 3 levels and then making a factor of
> 1002 element vector x 10 is a "negligible" amount of operations needed.
>
> So, what's wrong with me? Any idea how to accelerate significantly the
> transformation or (to go to the very beginning) to make read.table use a
> fixed set of levels ("AA","AB", and "BB") and not to drop any (missing)
> level?
>
> R-devel_2006-08-26, Sun Solaris 10 OS - x86 64-bit
>
> The machine is with 32G RAM and AMD Opteron 285 (2.? GHz) so it's not
> it.
>
> Thank you very much for the help,
>
> Latchezar Dimitrov,
> Analyst/Programmer IV,
> Wake Forest University School of Medicine,
> Winston-Salem, North Carolina, USA
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
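
One way to get uniform 3-level factors without looping over levels is to rebuild each column with factor(), giving it the full set of levels and the new labels in one call. This is a minimal sketch under assumptions: the small genoT below is invented stand-in data, and every column is assumed to still hold "AA"/"AB"/"BB" (columns already converted to "0"/"1"/"2" would have to be skipped).

```r
# invented stand-in for a few columns of genoT
genoT <- data.frame(
  snp1 = factor(c("AA", "AB", "BB", NA)),
  snp2 = factor(c("AA", "AA", "AB", "AA")),   # only 2 levels present
  snp3 = factor(c("BB", "BB", "BB", "BB"))    # only 1 level present
)

# rebuild every column with the same 3 levels, relabelled "0", "1", "2"
genoT[] <- lapply(genoT, function(col) {
  factor(as.character(col), levels = c("AA", "AB", "BB"),
         labels = c("0", "1", "2"))
})
str(genoT)
```

Each column is handled by a single vectorised call, so the cost should grow with the number of columns rather than with repeated level replacement. Reading the file with colClasses="character" in read.table and converting afterwards is another option along the same lines.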


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] can I paste 'newline'?

2007-07-19 Thread jim holtman

Notice the difference:

> cat ('I need to move on to a new line', '\n', 'at here') # change line!
I need to move on to a new line
 at here> paste ('I need to move on to a new line', '\n', 'at here') # '\n' is just a character
[1] "I need to move on to a new line \n at here"
> cat(paste ('I need to move on to a new line', '\n', 'at here'))
I need to move on to a new line
 at here>

> paste("a long string
+ with carriage
+ returns")
[1] "a long string\nwith carriage\nreturns"
>
>
> cat(paste("a long string
+ with carriage
+ returns"))
a long string
with carriage
returns>


paste is showing you the characters in the string; cat is actually
outputting to a print device where '\n' is a line feed.
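
A short sketch of the same point: the newline is a single character inside the string built by paste; only the way the string is displayed differs. The strings here are invented for illustration.

```r
s <- paste("first line", "second line", sep = "\n")
nchar(s)        # the "\n" counts as one character
print(s)        # print() shows the escape sequence
cat(s, "\n")    # cat() writes the actual line break
writeLines(s)   # writeLines() does too, and adds a final newline
```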

On 7/19/07, runner <[EMAIL PROTECTED]> wrote:
>
> It is ok to bury a reg expression '\n' when using 'cat', but not 'paste'.
> e.g.
>
> cat ('I need to move on to a new line', '\n', 'at here') # change line!
> paste ('I need to move on to a new line', '\n', 'at here') # '\n' is just a
> character as it is.
>
> Is there a way around pasting '\n' ? Thanks a lot.
> --
> View this message in context: 
> http://www.nabble.com/can-I-paste-%27newline%27--tf4114350.html#a11699845
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Help with Dates

2007-07-19 Thread jim holtman

Try some of the following:

head(subset(df, Yr %in% c("00","01","02","03")))

subset(df, (Yr >= '00') & (Yr <= '03'))  # same as above

subset(df, (Yr == '00') | (Yr == '01') | (Yr == '02') |(Yr == '03'))  # same
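
Another route, sketched under assumptions: keep the column as a real Date and compare against the ends of the range, which avoids two-digit years altogether. The rows below are invented in the same layout as the posted data, the name dat is arbitrary, and "%b" assumes an English locale for the month abbreviations.

```r
# invented rows in the same layout as the posted data
dat <- data.frame(
  Date  = c("15-Jan-86", "31-Jan-00", "14-Feb-03", "28-Feb-06"),
  Price = c(673.25, 677.00, 680.25, 691.75),
  stringsAsFactors = FALSE
)
# %b matches abbreviated month names in the current locale (English assumed)
dat$Date <- as.Date(dat$Date, format = "%d-%b-%y")

# keep rows dated 2000 through 2005
in.range <- subset(dat, Date >= as.Date("2000-01-01") &
                        Date <= as.Date("2005-12-31"))
in.range
```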


On 7/19/07, Alex Park <[EMAIL PROTECTED]> wrote:
> R
>
> I am taking an excel dataset and reading it into R using read.table.
> (actually I am dumping the data into a .txt file first and then reading data
> in to R).
>
> Here is snippet:
>
> > head(data);
>        Date  Price Open.Int. Comm.Long Comm.Short net.comm
> 1 15-Jan-86 673.25    175645     65910      28425    37485
> 2 31-Jan-86 677.00    167350     54060      27120    26940
> 3 14-Feb-86 680.25    157985     37955      25425    12530
> 4 28-Feb-86 691.75    162775     49760      16030    33730
> 5 14-Mar-86 706.50    163495     54120      27995    26125
> 6 31-Mar-86 709.75    164120     54715      30390    24325
>
> The dataset runs from 1986 to 2007.
>
> I want to be able to take subsets of my data based on date e.g. data between
> 2000 - 2005.
>
> As it stands, I can't work with the dates as they are not in correct format.
>
> I tried successfully converting the dates to just the year using:
>
> transform(data, Yr = format(as.Date(as.character(Date),format = '%d-%b-%y'),
> "%y")))
>
> This gives the following format:
>
>        Date  Price Open.Int. Comm.Long Comm.Short net.comm Yr
> 1 15-Jan-86 673.25    175645     65910      28425    37485 86
> 2 31-Jan-86 677.00    167350     54060      27120    26940 86
> 3 14-Feb-86 680.25    157985     37955      25425    12530 86
> 4 28-Feb-86 691.75    162775     49760      16030    33730 86
> 5 14-Mar-86 706.50    163495     54120      27995    26125 86
> 6 31-Mar-86 709.75    164120     54715      30390    24325 86
>
> I can subset for a single year e.g:
>
> head(subset(df, Yr =="00")
>
> But how can I subset for multiple periods e.g 00- 05? The following won't
> work:
>
> head(subset(df, Yr =="00" & Yr=="01")
>
> or
>
> head(subset(df, Yr = c("00","01","02","03")
>
> I can't help but feeling that I am missing something and there is a simpler
> route.
>
> I leafed through R newletter 4.1 which deals with dates and times but it
> seemed that strptime and POSIXct / POSIXlt are not what I need either.
>
> Can anybody help me?
>
> Regards
>
>
> Alex
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] linear interpolation of multiple random time series

2007-07-19 Thread jim holtman

This should do it for you:

> x <- read.table(textConnection("trial   time    x
+ 1   1   1
+ 1   5   4
+ 1   7   9
+ 1   12  20
+ 2   1   0
+ 2   3   5
+ 2   9   10
+ 2   13  14
+ 2   19  22
+ 2   24  32"), header=TRUE)
> # compute for each trial
> trial.list <- lapply(split(x, x$trial), function(set){
+ .xval <- seq(min(set$time), max(set$time))
+ .yval <- approx(set$time, set$x, xout=.xval)$y
+ cbind(trial=set$trial[1], time=.xval, x=.yval)
+ })
> do.call('rbind', trial.list)
      trial time     x
 [1,]     1    1  1.00
 [2,]     1    2  1.75
 [3,]     1    3  2.50
 [4,]     1    4  3.25
 [5,]     1    5  4.00
 [6,]     1    6  6.50
 [7,]     1    7  9.00
 [8,]     1    8 11.20
 [9,]     1    9 13.40
[10,]     1   10 15.60
[11,]     1   11 17.80
[12,]     1   12 20.00
[13,]     2    1  0.00
[14,]     2    2  2.50
[15,]     2    3  5.00
[16,]     2    4  5.83
[17,]     2    5  6.67
[18,]     2    6  7.50
[19,]     2    7  8.33
[20,]     2    8  9.17
[21,]     2    9 10.00
[22,]     2   10 11.00
[23,]     2   11 12.00
[24,]     2   12 13.00
[25,]     2   13 14.00
[26,]     2   14 15.33
[27,]     2   15 16.67
[28,]     2   16 18.00
[29,]     2   17 19.33
[30,]     2   18 20.67
[31,]     2   19 22.00
[32,]     2   20 24.00
[33,]     2   21 26.00
[34,]     2   22 28.00
[35,]     2   23 30.00
[36,]     2   24 32.00
>
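
If a data frame is wanted rather than a matrix, the same split/approx idea can return data frames that are bound together at the end. This is a sketch using the example data from the question; the name interp is arbitrary.

```r
x <- data.frame(
  trial = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  time  = c(1, 5, 7, 12, 1, 3, 9, 13, 19, 24),
  x     = c(1, 4, 9, 20, 0, 5, 10, 14, 22, 32)
)

# interpolate each trial at every integer time between its first and last sample
interp <- do.call(rbind, lapply(split(x, x$trial), function(set) {
  t.new <- seq(min(set$time), max(set$time))
  data.frame(trial = set$trial[1],
             time  = t.new,
             x     = approx(set$time, set$x, xout = t.new)$y)
}))
rownames(interp) <- NULL   # drop the "1.1", "1.2", ... row names added by rbind
head(interp)
```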


On 7/19/07, Mike Lawrence <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> Looking for tips on how I might more optimally solve this. I have
> time series data (samples from a force sensor) that are not
> guaranteed to be sampled at the same time values across trials. ex.
>
> trial   time    x
> 1   1   1
> 1   5   4
> 1   7   9
> 1   12  20
> 2   1   0
> 2   3   5
> 2   9   10
> 2   13  14
> 2   19  22
> 2   24  32
>
> Within each trial I'd like to use linear interpolation between each
> successive time sample to fill in intermediary timepoints and x-
> values, ex.
>
> trial   time    x
> 1   1   1
> 1   2   1.75
> 1   3   2.5
> 1   4   3.25
> 1   5   4
> 1   6   6.5
> 1   7   9
> 1   8   11.2
> 1   9   13.4
> 1   10  15.6
> 1   11  17.8
> 1   12  20
> 2   1   0
> 2   2   2.5
> 2   3   5
> 2   4   5.83
> 2   5   6.67
> 2   6   7.5
> 2   7   8.33
> 2   8   9.17
> 2   9   10
> 2   10  11
> 2   11  12
> 2   12  13
> 2   13  14
> 2   14  15.3
> 2   15  16.7
> 2   16  18
> 2   17  19.3
> 2   18  20.7
> 2   19  22
> 2   20  24
> 2   21  26
> 2   22  28
> 2   23  30
> 2   24  32
>
>
> The solution I've coded (below) involves going through the original
> data frame line by line and is thus very slow (indeed, I had to
> resort to writing to file as with a large data set I started running
> into memory issues if I tried to create the new data frame in
> memory). Any suggestions on a faster way to achieve what I'm trying
> to do?
>
> #assumes the first data frame above is stored as 'a'
> arows = (length(a$x)-1)
> write('', 'temp.txt')
> for(i in 1:arows){
>if(a$time[i+1] > a$time[i]){
>write.table(a[i,], 'temp.txt', row.names = F, col.names = F, 
> append
> = T)
>x1 = a$time[i]
>x2 = a$time[i+1]
>dx = x2-x1
>if(dx != 1){
>y1 = a$x[i]
>y2 = a$x[i+1]
>dy = y2-y1
>slope = dy/dx
>int = -slope*x1+y1
>temp=a[i,]
>for(j in (x1+1):(x2-1)){
>temp$time = j
>temp$x = slope*j+int
>write.table(temp, 'temp.txt', row.names = F, 
> col.names = F,
> append = T)
>}
>}
>}else{
>write.table(a[i,], 'temp.txt', row.names = F, col.names = F, 
> append
> = T)
>}
> }
> i=i+1
> write.table(a[i,],

Re: [R] plot3d labels

2007-07-19 Thread jim holtman

The documentation has:

text3d(x, y = NULL, z = NULL, texts, adj = 0.5, justify, ...)

Does this do it for you?

On 7/19/07, Birgit Lemcke <[EMAIL PROTECTED]> wrote:
> Hello R users,
>
> I am a newby using R  2.5.0 on a Apple Power Book G4 with Mac OS X
> 10.4.10.
>
> Sorry that I ask again such stupid questions, but I haven´t found how
> to label the points created with plot3d (rgl).
> Hope somebody can help me.
>
> Thanks in advance.
>
> Birgit
>
>
> Birgit Lemcke
> Institut für Systematische Botanik
> Zollikerstrasse 107
> CH-8008 Zürich
> Switzerland
> Ph: +41 (0)44 634 8351
> [EMAIL PROTECTED]
>
>
>
>
>
>
>[[alternative HTML version deleted]]
>
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] memory error with 64-bit R in linux

2007-07-18 Thread jim holtman


The output from gc() indicates that you had a maximum usage of
476MB+119MB=~600MB.  If you look at the output of ps you will notice
that the process size is 523MB (or about 500MB if you want to be
exact).  So you are using about 25% of the 2GB that you have
available.

mem.limits() just shows the current values of the parameters, and as the
help file says:

"Value
mem.limits() returns an integer vector giving the current settings of
the maxima, possibly NA."
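
The arithmetic behind those figures, as a sketch that uses only the numbers posted in the gc() and ps output quoted below (ps is assumed to report VSZ and RSS in KB):

```r
# "max used" column of the posted gc() output, in Mb
max.used.mb <- c(Ncells = 476.4, Vcells = 119.6)
sum(max.used.mb)          # peak usage by R objects, in Mb

# VSZ and RSS from the posted 'ps u' line for the R process, in KB
vsz.mb <- 523088 / 1024
rss.mb <- 504004 / 1024

# %MEM was 24.5, so the physical RAM implied by RSS is roughly
ram.mb <- rss.mb / 0.245
c(vsz = vsz.mb, rss = rss.mb, ram = ram.mb)
```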



On 7/18/07, zhihua li <[EMAIL PROTECTED]> wrote:

Thanks for replying!
i don't think i'm paging. i tried to use a smaller version of my matrix and
do all the checkings as suggested by jim. The smaller matrix caused another
problem, for which I've opened another thread. But i've found something
about memory that I don't understand.
> gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  269577 14.4    5570995 297.6  8919855 476.4
Vcells 3353395 25.6    9493567  72.5 15666095 119.6

Does this mean the maximum memory I can use for variables is only 120 M?
However, when I tried to check the memory limits:
> mem.limits()
nsize vsize
   NA    NA

Here it seems the maximum memory is not limited?

When there is no R function is being executed, I checked the system process
by:
ps u

 PID %CPU %MEM    VSZ    RSS TTY      STAT START   TIME COMMAND
7821  0.0  0.1  10048   2336 pts/0    Ss   Jul18   0:00 -bash
8076  2.9 24.5 523088 504004 pts/0    S+   Jul18   2:46 /usr/lib64/R/bi
8918  1.5  0.1   9912   2328 pts/1    Ss   00:44   0:00 -bash
8962  0.0  0.0   3808    868 pts/1    R+   00:45   0:00 ps u

Does this mean R is using 25% of my memory? But my RAM is 2 GB and the
objects in R only occupy 40 MB from gc().

Did I interpret it wrong?

Thanks a lot!



>From: "jim holtman" <[EMAIL PROTECTED]>
>To: "zhihua li" <[EMAIL PROTECTED]>
>CC: r-help@stat.math.ethz.ch
>Subject: Re: [R] memory error with 64-bit R in linux
>Date: Wed, 18 Jul 2007 17:50:31 -0500
>
>Are you paging?  That might explain the long run times. How much
>space
>are your other objects taking up?  The matrix by itself should only
>require about 13MB if it is numeric.  I would guess it is some of
>the
>other objects that you have in your working space.  Put some gc() in
>your loop to see how much space is being used.  Run it with a subset
>of the data and see how long it takes.  This might give you an
>estimate of the time, and space, that might be needed for the entire
>dataset.
>
>Do a 'ps' to see how much memory your process is using.  Do one
>every
>couple of minutes to see if it is growing.  You can alway use
>Rprof()
>to get an idea of where time is being spent (use it on a small
>subset).
>
>On 7/18/07, zhihua li <[EMAIL PROTECTED]> wrote:
>>Hi netters,
>>
>>I'm using the 64-bit R-2.5.0 on a x86-64 cpu, with an RAM of 2 GB.
>>The
>>operating system is SUSE 10.
>>The system information is:
>>-uname -a
>>Linux someone 2.6.13-15.15-smp #1 SMP Mon Feb 26 14:11:33 UTC 2007
>>x86_64
>>x86_64 x86_64 GNU/Linux
>>
>>I used heatmap to process a matrix of the dim [16000,100].  After 3
>>hours
>>of desperating waiting, R told me:
>>cannot allocate vector of size 896 MB.
>>
>>I know the matrix is very big, but since I have 2 GB of RAM and in
>>a 64-bit
>>system, there should be no problem to deal with a vector smaller
>>than 1 GB?
>>(I was not running any other applications in my system)
>>
>>Does anyone know what's going on?  Is there a hardware limit where
>>I have
>>to add more RAM, or is there some way to resolve it softwarely?
>>Also is it
>>possible to speed up the computing (I don't wanna wait another 3
>>hours to
>>know I get another error message)
>>
>>Thank you in advance!
>>
>>
>>
>>__
>>R-help@stat.math.ethz.ch mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide
>>http://www.R-project.org/posting-guide.html
>>and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>
>--
>Jim Holtman
>Cincinnati, OH
>+1 513 646 9390
>
>What is the problem you are trying to solve?






--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Saving a dataset permanently in R

2007-07-18 Thread jim holtman

Where are you trying to copy data from?  I would assume that with that
script you are typing all the data in by hand.  Why don't you put it
in a text file and use read.table?  By default, R will save your
workspace on exit and then reload it on startup.  Is this enough to
save your data?  You can also use the 'save' function to store
explicit objects.
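
A minimal sketch of that workflow. The file names are temporary files and the two-row table is invented; only the object name My_Dataset comes from the question.

```r
# write a small text file to stand in for the real data
txt.file <- tempfile(fileext = ".txt")
writeLines(c("site count", "A 10", "B 12"), txt.file)

# read it into a data frame
My_Dataset <- read.table(txt.file, header = TRUE)

# store the object explicitly, then restore it as a later session would
rda.file <- tempfile(fileext = ".RData")
save(My_Dataset, file = rda.file)
rm(My_Dataset)
load(rda.file)
My_Dataset
```

In real use the .RData file would live in a fixed location so that load() can find it in the next session.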

On 7/18/07, Felipe Carrillo <[EMAIL PROTECTED]> wrote:
> HI:
> I'm still struggling with datasets, the more I read
> about it the more confussed I get. This is the
> scenario... In R console|Edit|Data Editor, I can find
> all the datasets available with the different
> packages, So to create a new dataset in the R console
> I use the following commands to create an empty data
> frame.
> My_Dataset <- data.frame()
> My_Dataset <- edit(My_dataset)
>
> The problem is that I can't copy my data into the
> dataframe. Is there any suggestions as of how I can
> transfer the data and how it can be saved so everytime
> I open R the dataset would be available.?
> Thanks
>
>  Felipe D. Carrillo
>  Fishery Biologist
>  US Fish & Wildlife Service
>  Red Bluff, California 96080
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] memory error with 64-bit R in linux

2007-07-18 Thread jim holtman


Are you paging?  That might explain the long run times. How much space
are your other objects taking up?  The matrix by itself should only
require about 13MB if it is numeric.  I would guess it is some of the
other objects that you have in your working space.  Put some gc() in
your loop to see how much space is being used.  Run it with a subset
of the data and see how long it takes.  This might give you an
estimate of the time, and space, that might be needed for the entire
dataset.

Do a 'ps' to see how much memory your process is using.  Do one every
couple of minutes to see if it is growing.  You can always use Rprof()
to get an idea of where time is being spent (use it on a small
subset).
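
A back-of-the-envelope sketch of the sizes involved, assuming 8 bytes per numeric value. By default heatmap() clusters the rows with dist() and hclust(), and a distance object over n rows holds n*(n-1)/2 values, which may be where an allocation of this order comes from.

```r
n.rows <- 16000
n.cols <- 100
bytes.per.double <- 8

# the data matrix itself
matrix.mb <- n.rows * n.cols * bytes.per.double / 2^20

# pairwise distances between all rows, as stored by dist()
dist.mb <- n.rows * (n.rows - 1) / 2 * bytes.per.double / 2^20

c(matrix = matrix.mb, dist = dist.mb)
```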

On 7/18/07, zhihua li <[EMAIL PROTECTED]> wrote:

Hi netters,

I'm using the 64-bit R-2.5.0 on a x86-64 cpu, with an RAM of 2 GB.  The
operating system is SUSE 10.
The system information is:
-uname -a
Linux someone 2.6.13-15.15-smp #1 SMP Mon Feb 26 14:11:33 UTC 2007 x86_64
x86_64 x86_64 GNU/Linux

I used heatmap to process a matrix of the dim [16000,100].  After 3 hours
of desperating waiting, R told me:
cannot allocate vector of size 896 MB.

I know the matrix is very big, but since I have 2 GB of RAM and in a 64-bit
system, there should be no problem to deal with a vector smaller than 1 GB?
(I was not running any other applications in my system)

Does anyone know what's going on?  Is there a hardware limit where I have
to add more RAM, or is there some way to resolve it softwarely? Also is it
possible to speed up the computing (I don't wanna wait another 3 hours to
know I get another error message)

Thank you in advance!



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] set up automatic running of R

2007-07-18 Thread jim holtman

Create a .bat file with the command that runs your script (R CMD BATCH)
and then create a scheduled task that calls the batch file at the
desired time.
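
If the scheduler is simply set to run every Sunday, the script itself can decide whether it is the first Sunday of the month. A sketch, with is_first_sunday as a made-up helper name and the source() line as a placeholder for the real script:

```r
# is_first_sunday() is a hypothetical helper, not part of base R
is_first_sunday <- function(d = Sys.Date()) {
  lt <- as.POSIXlt(d)
  # wday 0 is Sunday; the first Sunday always falls on day 1 to 7
  lt$wday == 0 && lt$mday <= 7
}

if (is_first_sunday()) {
  # source("extract_data.R")   # placeholder for the real monthly script
  message("first Sunday of the month: run the extraction here")
}
```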

On 7/18/07, Am Stat <[EMAIL PROTECTED]> wrote:
> Hi useR,
>
> I am trying to find how to schedule an automatic run of R periodically, I
> have written some scripts to extract data which are updated monthly on
> another server, my os is xp. The goal is that my script will run at a
> scheduled time every month and record the results to some directories.
>
> Now the scripts are done, only thing I need is to know how to let R run my
> scripts at a certain time, say the first Sunday of each months.
>
> Could anyone give me some clues?
>
> Thanks a million in advance!
>
>
> Best,
>
> Leon
>
>[[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] nested for loop

2007-07-18 Thread jim holtman

This should create your files for you:

x <- 1:1080  # test data
# create a grouping vector that assigns each run of 30 consecutive values to one group
breaks <- rep(1:ceiling(length(x) / 30), each=30)[1:length(x)]
# now partition the data into groups of 30 values and write each group to a file
fileNo <- 1  # initialize the file number
invisible(lapply(split(x, breaks), function(.values){
    write(.values, file=sprintf("NWRxx.%03d.txt", fileNo))
    fileNo <<- fileNo + 1   # update the file number
}))
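
The same split can also be written with an explicit index, which avoids the global counter. This is a sketch: it writes to a temporary directory and uses ncolumns=1 to get one value per line, as the attempt in the question did.

```r
x <- 1:1080                                     # test data
chunks <- split(x, ceiling(seq_along(x) / 30))  # 36 groups of 30 values

out.dir <- tempdir()                            # assumed output location
files <- file.path(out.dir, sprintf("NWRxx.%03d.txt", seq_along(chunks)))

for (i in seq_along(chunks)) {
  write(chunks[[i]], file = files[i], ncolumns = 1)
}
```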


On 7/18/07, Sherri Heck <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am new to programming and R.  I am reading the manual and R books by 
> Dalgaard and Veranzo to help answer my questions but I am unable to figure 
> out the following:
>
> I have a data file that contains 1080 data points. Here's a snippet of the 
> file:
>
> [241]  0.3603704000  0.1640741000  0.2912963000   NA  0.0159259300  
> 0.0474074100
>
> I would like to break the file up into 30 consecutive data point segments and 
> then write each segment into a separate data file.  This is one version of 
> code that I've tried.
>
> mons = c(1:12)
>
> data = scan(paste("C:/R/NWR.txt"))
> for (mon in mons)  {
>  for (i in c(1:30)) {
>  for (j in data){
>
> write((data),paste(mon,'NWR dc_dt_zi ppm meters per sec.txt',sep=''),ncol=1)
>
> }
>  }
>  }
>
> I think I'm really close, but no cigar.  Thanks in advance for any help-
>
> S.Heck
> Graduate Research Assistant
> University of Colorado, Boulder
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Classification

2007-07-18 Thread jim holtman

You can use 'cut':

> x
 MD
1  0.20
2  0.10
3  0.80
4  0.30
5  0.70
6  0.60
7  0.01
8  0.20
9  0.50
10 1.00
11 1.00
> cut(x$MD, breaks=seq(0,1,.2), include.lowest=TRUE, labels=LETTERS[1:5])
 [1] A A D B D C A A C E E
Levels: A B C D E
>
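
To keep the classes next to the original values, the result of cut() can be stored as a new column. A sketch with the breaks written out explicitly; the column name 'class' is an arbitrary choice.

```r
x <- data.frame(MD = c(0.2, 0.1, 0.8, 0.3, 0.7, 0.6, 0.01, 0.2, 0.5, 1, 1))

# intervals are closed on the right: 0.2 falls in A, 0.4 would fall in B
x$class <- cut(x$MD, breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1),
               include.lowest = TRUE, labels = LETTERS[1:5])

table(x$class)   # how many values landed in each class
```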


On 7/18/07, Ing. Michal Kneifl, Ph.D. <[EMAIL PROTECTED]> wrote:
> Hi,
> I am also a quite new user of R and would like to ask you for help:
> I have a data frame where all columns are numeric variables. My aim is
> to convert one columnt in factors.
> Example:
> MD
> 0.2
> 0.1
> 0.8
> 0.3
> 0.7
> 0.6
> 0.01
> 0.2
> 0.5
> 1
> 1
>
>
> I want to make classes:
> 0-0.2 A
> 0.21-0.4 B
> 0.41-0.6 C
> . and so on
>
> So after classification I wil get:
> MD
> A
> A
> D
> B
> .
> .
> .
> and so on
>
> Please could you give an advice to a newbie?
> Thanks a lot in advance..
>
> Michael
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] remove columns having a partial match name

2007-07-18 Thread jim holtman

DATA_OK <- DATA[, -grep("^Start", names(DATA))]
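
One edge case with a negative index from 'grep': if no column name matches, 'grep' returns integer(0) and the negative index then selects no columns at all.  A logical index does not have that problem.  Here is a small sketch; the data frame is made up, and it assumes an R version that has 'grepl' (newer than the 2.5.1 used elsewhere on this list):

```r
# Hypothetical data frame: two "Start*" columns and two others
DATA <- data.frame(StartA = 1:3, StartB = 4:6,
                   id = 7:9, value = c(0.1, 0.2, 0.3))

# Logical index: TRUE for every column whose name does NOT begin with "Start"
keep <- !grepl("^Start", names(DATA))

# drop = FALSE keeps the result a data frame even if only one column survives
DATA_OK <- DATA[, keep, drop = FALSE]
names(DATA_OK)
```

If nothing matches "^Start", 'keep' is all TRUE and the data frame comes back unchanged.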

On 7/18/07, João Fadista <[EMAIL PROTECTED]> wrote:
> Dear all,
>
> I would like to know how can I retrieve a data.frame without the columns that 
> have a partial match name. Let´s say that I have a data.frame with 200 
> columns and 100 of them have the name "StartX", with X being the unique part 
> for each column name. I want to delete all columns that have the name 
> starting with "Start". I´ve tried to do this but it doesn´t work:
>
> > DATA_OK <- DATA[,-match(("Start*"),names(DATA))]
> > dim(DATA_OK)
> NULL
>
>
> Thanks in advance.
> Best regards
>
> João Fadista
> Ph.d. student
>
>
>
> UNIVERSITY OF AARHUS
> Faculty of Agricultural Sciences
> Dept. of Genetics and Biotechnology
> Blichers Allé 20, P.O. BOX 50
> DK-8830 Tjele
>
> Phone:   +45 8999 1900
> Direct:  +45 8999 1900
> E-mail:  [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>
> Web: www.agrsci.org <http://www.agrsci.org/>
> 
>
> News and news media <http://www.agrsci.org/navigation/nyheder_og_presse> .
>
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] poor rbind performance

2007-07-17 Thread jim holtman

Read the data into a list and then:

do.call('rbind', myList)

at the end so you do it only once.  You are having to reallocate
memory each iteration, so no wonder it is slow.
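
A minimal sketch of that approach, in case it helps.  The files are written to a temporary directory first so the example runs on its own; the file names and contents are made up, not your real data:

```r
# Write three small hypothetical files so the example is self-contained
tmp.dir <- tempfile("myData_")
dir.create(tmp.dir)
files <- file.path(tmp.dir, paste("myData_", 1:3, ".txt", sep = ""))
for (k in 1:3) {
  write.table(data.frame(k = k, x = 1:4), files[k],
              sep = ",", row.names = FALSE)
}

# Read every file into one element of a list ...
myList <- lapply(files, read.table, header = TRUE, sep = ",")

# ... and bind all of them in a single call at the end
myData <- do.call("rbind", myList)
dim(myData)
```

The loop that grows 'myData' copies everything read so far on each pass; here the combined data frame is allocated only once.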

On 7/17/07, Aydemir, Zava (FID) <[EMAIL PROTECTED]> wrote:
> Hi
>
> I rbind data frames in a loop in a cumulative way and the performance
> deteriorates very quickly.
>
> My code looks like this:
>
> for( k in 1:N)
> {
>filename <- paste("/tmp/myData_",as.character(k),".txt",sep="")
>myDataTmp <- read.table(filename,header=TRUE,sep=",")
>if( k == 1) {
>myData <- myDataTmp
>}
>else{
>myData <- rbind(myData,myDataTmp)
>}
> }
>
> Some more details:
> - the size of the stored text files is about 100,000 rows and 50 columns
> each
> - for k=1: rbind takes 0.0004 seconds
> - for k=2: rbind takes 13 seconds
> - for k=3: rbind takes 30 seconds
> - for k=4: rbind takes 36 seconds
> etc
>
> Any suggestions to improve speed?
>
> Thanks
>
> Zava
> 
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] problem with length()

2007-07-16 Thread jim holtman

POSIXlt is a list structure of 9 elements (see ?POSIXlt).  You can see
that in the data below:

> x <- as.POSIXlt(c('2007-01-01','2007-02-01','2007-03-31'))
> length(x)
[1] 9
> unclass(x)
$sec
[1] 0 0 0

$min
[1] 0 0 0

$hour
[1] 0 0 0

$mday
[1]  1  1 31

$mon
[1] 0 1 2

$year
[1] 107 107 107

$wday
[1] 1 4 6

$yday
[1]  0 31 89

$isdst
[1] 0 0 0

attr(,"tzone")
[1] "GMT"
> length(as.POSIXct(x))
[1] 3


What you probably want to do is to use the POSIXct class.
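
Something like the following sketch, with a few made-up dates in the same yyyymmdd layout as yours.  (Newer R releases may report 'length' on a POSIXlt object differently from the 9 shown above, so the conversion is the part to rely on.)

```r
# Hypothetical integer dates in the same yyyymmdd layout as the question
fff <- c(20010102L, 20010102L, 20010103L, 20010105L)

# strptime() returns POSIXlt; convert to POSIXct before storing it
eee <- as.POSIXct(strptime(as.character(fff), "%Y%m%d", tz = "GMT"))
length(eee)

# A POSIXct vector has one value per date, so it fits into a data frame
df <- data.frame(id = seq_along(fff))
df$when <- eee
str(df)
```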

On 7/16/07, Jacob Etches <[EMAIL PROTECTED]> wrote:
> In the following, can anyone tell me why length(eee) returns 9?  I
> was expecting 15398, and when I try to add this vector to a data
> frame with that many rows, it fails complaining that the vector is of
> length 9.  In what I thought was an identical situation with a
> related dataset, the same code worked as expected.
>
>  > length(fff)
> [1] 15398
>  > str(fff)
> int [1:15398] 20010102 20010102 20010102 20010103 20010103 20010102
> 20010102 20010104 20010103 20010102 ...
>  > fff[1:12]
> [1] 20010102 20010102 20010102 20010103 20010103 20010102 20010102
> 20010104 20010103 20010102 20010105 20010103
>  > eee <- as.POSIXlt(strptime(fff,"%Y%m%d"))
>  > length(eee)
> [1] 9
>  > eee[1:12]
> [1] "2001-01-02" "2001-01-02" "2001-01-02" "2001-01-03" "2001-01-03"
> "2001-01-02" "2001-01-02" "2001-01-04" "2001-01-03" "2001-01-02"
> "2001-01-05" "2001-01-03"
>  > str(eee)
> 'POSIXlt', format: chr [1:15398] "2001-01-02" "2001-01-02"
> "2001-01-02" "2001-01-03" "2001-01-03" "2001-01-02" "2001-01-02"
> "2001-01-04" "2001-01-03" ...
>
>
>
> Many thanks in advance,
> Jacob Etches
>
>
> Doctoral candidate, Epidemiology Program
> Department of Public Health Sciences, University of Toronto Faculty
> of Medicine
>
> Research Associate
> Institute for Work & Health
> 800-481 University Avenue, Toronto, Ontario, Canada   M5G 2E9
> T: 416.927.2027 ext. 2290
> F: 416.927.4167
> [EMAIL PROTECTED]
> www.iwh.on.ca
>
>
>
>
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Algorythmic Question on Array Filtration

2007-07-14 Thread jim holtman

>> mass/1E5*5). I need to filter the array such that in
> >> case these mass
> >> windows overlap I retain the mass/intensity pair
> >> with the highest
> >> intensity.
> >> I apologize for this question, but I have no formal
> >> IT education and would
> >> value any nudges toward favorable algorithmic
> >> solutions highly.
> >>
> >> Thanks for any help,
> >>
> >> Joh
> >>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] filling a list faster

2007-07-13 Thread jim holtman

Actually if you are really interested in the list, then just do the
lapply and compute your data; it seems to be even faster than the
matrix:

> system.time(l.1 <- lapply(1:10^5, function(i) c(i, i+1, i)))
   user  system elapsed
   0.50    0.00    0.61
> l.1[1:4]
[[1]]
[1] 1 2 1

[[2]]
[1] 2 3 2

[[3]]
[1] 3 4 3

[[4]]
[1] 4 5 4



On 7/13/07, Philippe Grosjean <[EMAIL PROTECTED]> wrote:
> If all the data coming from your iterations are numeric (as in your toy
> example), why not to use a matrix with one row per iteration? Also, do
> preallocate the matrix and do not add row or column names before the end
> of the calculation. Something like:
>
>  > m <- matrix(rep(NA, 3*10^5), ncol = 3)
>  > system.time(for(i in (1:10^5)) m[i, ] <- c(i,i+1,i))
>user  system elapsed
>   1.362   0.033   1.424
>
> That is, about 1.5sec on my Intel Duo Core 2.33GHz MacBook Pro, compared to:
>
>  > l <- list("1"<-c(1,2,3))
>  > system.time(for(i in (1:10^5)) l[[length(l)+1]] <- c(i,i+1,i))
>user  system elapsed
> 191.629  49.110 248.454
>
> ... more than 4 minutes for your code.
>
> By the way, what is your "very fast machine", that is actually four
> times faster than mine (gr!)?
>
> Best,
>
> Philippe Grosjean
>
> ..<∞}))><
>  ) ) ) ) )
> ( ( ( ( (Prof. Philippe Grosjean
>  ) ) ) ) )
> ( ( ( ( (Numerical Ecology of Aquatic Systems
>  ) ) ) ) )   Mons-Hainaut University, Belgium
> ( ( ( ( (
> ..
>
> Balazs Torma wrote:
> > hello,
> >
> > first I create a list:
> >
> > l <- list("1"<-c(1,2,3))
> >
> > then I run the following cycle, it takes over a minute(!) to
> > complete on a very fast machine:
> >
> > for(i in (1:10^5)) l[[length(l)+1]] <- c(i,i+1,i)
> >
> > How can I fill a list faster? (This is just a demo test, the elements
> > of the list are calculated iteratively in an algorithm)
> >
> > Are there any packages and documents on how to use more advanced and
> > fast data structures like linked-lists, hash-tables or trees for
> > example?
> >
> > Thank you,
> > Balazs Torma
> >
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] filling a list faster

2007-07-13 Thread jim holtman

It all depends on what you want to do.  In your example, it is faster
to first fill in a matrix and then convert the matrix to a list.  The
problem with filling in the list is that you are dynamically
allocating space for each iteration which is probably taking at least
an order of magnitude more time than the calculations you are doing.
So I just translated your problem into two steps and it takes about 2
seconds on my system.

> # fill in a matrix
> l <- matrix(ncol=3, nrow=10^5)
> system.time(for(i in (1:10^5)) l[i,] <- c(i,i+1,i))
   user  system elapsed
   1.06    0.00    1.10
> # convert to a list
> system.time(l.list <- lapply(1:10^5, function(i) l[i,]))
   user  system elapsed
   0.45    0.00    0.46
> l.list[1:10]
[[1]]
[1] 1 2 1

[[2]]
[1] 2 3 2

[[3]]
[1] 3 4 3

[[4]]
[1] 4 5 4

[[5]]
[1] 5 6 5
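
For this particular toy example the matrix can also be built without a loop at all.  This only works because each row depends on nothing but 'i', which may not hold for your real algorithm, so take it as a sketch:

```r
i <- seq_len(10^5)

# Build all rows at once: column 1 = i, column 2 = i + 1, column 3 = i
m <- cbind(i, i + 1, i)
dimnames(m) <- NULL   # drop the column names that cbind() adds

# Convert to a list, one element per row, as in the transcript above
l.list <- lapply(i, function(k) m[k, ])
l.list[1:2]
```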

On 7/13/07, Balazs Torma <[EMAIL PROTECTED]> wrote:
> hello,
>
>first I create a list:
>
> l <- list("1"<-c(1,2,3))
>
>then I run the following cycle, it takes over a minute(!) to
> complete on a very fast machine:
>
> for(i in (1:10^5)) l[[length(l)+1]] <- c(i,i+1,i)
>
> How can I fill a list faster? (This is just a demo test, the elements
> of the list are calculated iteratively in an algorithm)
>
> Are there any packages and documents on how to use more advanced and
> fast data structures like linked-lists, hash-tables or trees for
> example?
>
> Thank you,
> Balazs Torma
>
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] (no subject)

2007-07-12 Thread jim holtman

Is this what you want to do:

> auto.length <- c(12,15,6)
> for(i in 1:3) {
+ nam <- paste("auto.data",i, sep=".")
+ assign(nam, as.data.frame(matrix(1:auto.length[i], ncol=3)))
+ }
> auto.data.1
  V1 V2 V3
1  1  5  9
2  2  6 10
3  3  7 11
4  4  8 12
> auto.data.2
  V1 V2 V3
1  1  6 11
2  2  7 12
3  3  8 13
4  4  9 14
5  5 10 15
> # output the data
> for(i in 1:3){
+ cat(x <- paste('auto.data.', i, sep=''), '\n')
+ print(get(x))
+ }
auto.data.1
  V1 V2 V3
1  1  5  9
2  2  6 10
3  3  7 11
4  4  8 12
auto.data.2
  V1 V2 V3
1  1  6 11
2  2  7 12
3  3  8 13
4  4  9 14
5  5 10 15
auto.data.3
  V1 V2 V3
1  1  3  5
2  2  4  6
>
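
An alternative to 'assign' and 'get' that is usually easier to loop over is to keep the data frames in a list.  A sketch under the same assumptions (three columns each, 4, 5 and 2 rows); the values written into the cells are placeholders:

```r
auto.length <- c(12, 15, 6)

# One list element per data frame instead of one global variable each
auto.data <- lapply(auto.length, function(n) {
  as.data.frame(matrix(0, nrow = n / 3, ncol = 3))
})
names(auto.data) <- paste("auto.data", seq_along(auto.data), sep = ".")

# Fill every cell later by looping over the list; the value is a placeholder
for (i in seq_along(auto.data)) {
  for (j in seq_len(nrow(auto.data[[i]]))) {
    for (k in seq_len(ncol(auto.data[[i]]))) {
      auto.data[[i]][j, k] <- i * 100 + j * 10 + k
    }
  }
}
auto.data[["auto.data.3"]]
```

With the list, auto.data[[i]] plays the role that "auto.data.i" could not.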


On 7/12/07, Drescher, Michael (MNR) <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I want to automatically generate a number of data frames, each with an
> automatically generated name and an automatically generated number of
> rows. The number of rows has been calculated before and is different for
> all data frames (e.g. c(4,5,2)). The number of columns is known a priori
> and the same for all data frames (e.g. c(3,3,3)). The resulting data
> frames could look something like this:
>
> > auto.data.1
>  X1 X2 X3
> 1  0  0  0
> 2  0  0  0
> 3  0  0  0
> 4  0  0  0
>
> > auto.data.2
>  X1 X2 X3
> 1  0  0  0
> 2  0  0  0
> 3  0  0  0
> 4  0  0  0
> 5  0  0  0
>
> > auto.data.3
>  X1 X2 X3
> 1  0  0  0
> 2  0  0  0
>
> Later, I want to fill the elements of the data frames with values read
> from somewhere else, automatically looping through the previously
> generated data frames.
>
> I know that I can automatically generate variables with the right number
> of elements with something like this:
>
> > auto.length <- c(12,15,6)
> > for(i in 1:3) {
> + nam <- paste("auto.data",i, sep=".")
> + assign(nam, 1:auto.length[i])
> + }
> > auto.data.1
>  [1]  1  2  3  4  5  6  7  8  9 10 11 12
> > auto.data.2
>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
> > auto.data.3
> [1]  1 2 3 4 5 6
>
> But how do I turn these variables into data frames or give them any
> dimensions? Any commands such as 'as.matrix', 'data.frame', or 'dim' do
> not seem to work. I also seem not to be able to access the variables
> with something like "auto.data.i" since:
>
> > auto.data.i
> Error: object "auto.data.i" not found
>
> Thus, how would I be able to automatically write to the elements of the
> data frames later in a loop such as ...
>
> > for(i in 1:3) {
> + for(j in 1:nrow(auto.data.i)) {   ### this obviously does not work
> since 'Error in nrow(auto.data.i) : object "auto.data.i" not found'
> + for(k in 1:ncol(auto.data.i)) {
> + auto.data.i[j,k] <- 'some value'
> + }}}
>
> Thanks a bunch for all your help.
>
> Best, Michael
>
>
> Michael Drescher
> Ontario Forest Research Institute
> Ontario Ministry of Natural Resources
> 1235 Queen St East
> Sault Ste Marie, ON, P6A 2E3
> Tel: (705) 946-7406
> Fax: (705) 946-2030
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] is.null doesn't work

2007-07-12 Thread jim holtman

'v' appears to be a list:

>  v=c(`-`,`+`,1,`^`,`^`,NA,NA,"X",9,"X",2)
>  i2=16
>  v[i2]
[[1]]
NULL

> str(v)
List of 11
 $ :function (e1, e2)
 $ :function (e1, e2)
 $ : num 1
 $ :function (e1, e2)
 $ :function (e1, e2)
 $ : logi NA
 $ : logi NA
 $ : chr "X"
 $ : num 9
 $ : chr "X"
 $ : num 2

because you used backquotes(`) on the '-'; notice the difference:

> str(c(`-`,1))
List of 2
 $ :function (e1, e2)
 $ : num 1
> str(c('-',1))
 chr [1:2] "-" "1"
>
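
As for why 'is.null' is FALSE: with a list, single brackets return a sub-list, and indexing past the end gives a list of length 1 whose only element is NULL.  That list is not itself NULL.  A short sketch with a shortened version of your vector:

```r
# A list, because functions cannot be stored in an atomic vector
v <- c(`-`, `+`, 1, NA, "X")

# Single brackets return a sub-list, even past the end of the list
out.of.range <- v[16]
is.list(out.of.range)        # a list of length 1 ...
is.null(out.of.range)        # ... so it is not NULL itself
is.null(out.of.range[[1]])   # the element inside it is NULL

# A length check is a more direct test for "does index 16 exist?"
16 <= length(v)
```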




On 7/12/07, Atte Tenkanen <[EMAIL PROTECTED]> wrote:
> Hi,
>
> What's wrong here?:
>
> > v=c(`-`,`+`,1,`^`,`^`,NA,NA,"X",9,"X",2)
> > i2=16
> > v[i2]
> [[1]]
> NULL
>
> > is.null(v[i2])
> [1] FALSE
>
> Is it a bug or have I misunderstood something?
>
> Atte Tenkanen
> University of Turku, Finland
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Compute rank within factor groups

2007-07-12 Thread jim holtman

Is this what you are looking for:

> x
        report score
9         ADEA  0.96
8         ADEA  0.90
11 Asylum_FED9  0.86
3         ADEA  0.75
14 Asylum_FED9  0.60
5         ADEA  0.56
13 Asylum_FED9  0.51
16 Asylum_FED9  0.51
2         ADEA  0.42
7         ADEA  0.31
17 Asylum_FED9  0.27
1         ADEA  0.17
4         ADEA  0.17
6         ADEA  0.12
10        ADEA  0.11
12 Asylum_FED9  0.10
15 Asylum_FED9  0.09
18 Asylum_FED9  0.07
> x$rank <- ave(x$score, x$report, FUN=rank)
> x
        report score rank
9         ADEA  0.96 10.0
8         ADEA  0.90  9.0
11 Asylum_FED9  0.86  8.0
3         ADEA  0.75  8.0
14 Asylum_FED9  0.60  7.0
5         ADEA  0.56  7.0
13 Asylum_FED9  0.51  5.5
16 Asylum_FED9  0.51  5.5
2         ADEA  0.42  6.0
7         ADEA  0.31  5.0
17 Asylum_FED9  0.27  4.0
1         ADEA  0.17  3.5
4         ADEA  0.17  3.5
6         ADEA  0.12  2.0
10        ADEA  0.11  1.0
12 Asylum_FED9  0.10  3.0
15 Asylum_FED9  0.09  2.0
18 Asylum_FED9  0.07  1.0
>
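
Note that 'rank' counts from the smallest score up and averages ties.  If you want 1 for the highest score in each report and consecutive integers for ties, as your loop produces, something like this sketch may be closer (the data here is a shortened, made-up version of yours):

```r
# Hypothetical data in the same shape as the thread's example
x <- data.frame(report = c("ADEA", "ADEA", "Asylum_FED9", "ADEA", "Asylum_FED9"),
                score  = c(0.96, 0.90, 0.86, 0.17, 0.51))

# Negate the score so the largest value gets rank 1 within each report;
# ties.method = "first" gives tied scores consecutive integer ranks
x$rank <- ave(-x$score, x$report,
              FUN = function(s) rank(s, ties.method = "first"))
x
```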


On 7/12/07, Ken Williams <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have a data.frame which is ordered by score, and has a factor column:
>
>  Browse[1]> wc[c("report","score")]
>          report score
>  9         ADEA  0.96
>  8         ADEA  0.90
>  11 Asylum_FED9  0.86
>  3         ADEA  0.75
>  14 Asylum_FED9  0.60
>  5         ADEA  0.56
>  13 Asylum_FED9  0.51
>  16 Asylum_FED9  0.51
>  2         ADEA  0.42
>  7         ADEA  0.31
>  17 Asylum_FED9  0.27
>  1         ADEA  0.17
>  4         ADEA  0.17
>  6         ADEA  0.12
>  10        ADEA  0.11
>  12 Asylum_FED9  0.10
>  15 Asylum_FED9  0.09
>  18 Asylum_FED9  0.07
>  Browse[1]>
>
> I need to add a column indicating rank within each factor group, which I
> currently accomplish like so:
>
>  wc$rank <- 0
>  for(report in as.character(unique(wc$report))) {
>wc[wc$report==report,]$rank <- 1:sum(wc$report==report)
>  }
>
> I have to wonder whether there's a better way, something that gets rid of
> the for() loop using tapply() or by() or similar.  But I haven't come up
> with anything.
>
> I've tried these:
>
>  by(wc, wc$report, FUN=function(pr){pr$rank <- 1:nrow(pr)})
>
>  by(wc, wc$report, FUN=function(pr){wc[wc$report %in% pr$report,]$rank <-
> 1:nrow(pr)})
>
> But in both cases the effect of the assignment is lost, there's no $rank
> column generated for wc.
>
> Any suggestions?
>
>  -Ken
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] exces return by mktcap decile for each year

2007-07-11 Thread jim holtman

factor.
>
> Here is a solution, which works, but it is clunky. I
> thought there might be a better/more R-like less
> for-loop way to do this.
>
> dat <- read.table("test.data", header=TRUE)
>
> if( "new.data" %in% ls()) {
>  rm( new.data)
> }
> yrs <- as.character(unique( dat$yr))
> for (y in yrs) {
>  bool <- as.character(dat$yr) == y
>  tmp.dat <-  dat[ bool,]
>  breaks <- quantile(tmp.dat$mc,
> probs=seq(0,1,0.1),na.rm=TRUE)
>  breaks[1] <- breaks[1]*.9
> # breaks >0, else 1st value not in (a,b] interval
>  cuts <- cut(tmp.dat$mc, breaks)
>  means.by.dec <- by( tmp.dat$ret, cuts, mean)
>  for ( i in seq(1, dim( tmp.dat)[1])) {
>tmp.dat[i,"dec.mean"] <- means.by.dec[ cuts[i]]
>  }
>  if(! "new.data" %in% ls()) {
>new.data <- tmp.dat
>  }  else {
>new.data <- rbind( new.data, tmp.dat)
>  }
> }
>
> Here is some test input data in the file test.data
> - test.data -
>         mc         yr    ret
>  32902.233 01/01/1995  0.426
>  15793.691 01/01/1995  0.024
>   2375.868 01/01/1995  0.660
>  54586.558 01/01/1996  0.497
>  10674.900 01/01/1996  0.405
>859.656 01/01/1996 -0.033
>770.963 01/01/1995 -1.248
>423.480 01/01/1995  0.654
>   2135.504 01/01/1995  0.394
>696.599 01/01/1995 -0.482
>   5115.476 01/01/1995  0.352
>821.347 01/01/1995  0.869
>  43329.695 01/01/1995  0.495
>   7975.151 01/01/1995  0.112
>396.450 01/01/1995  0.956
>843.870 01/01/1995  0.172
>   2727.037 01/01/1995 -0.358
>114.584 01/01/1995 -1.015
>   1347.327 01/01/1995 -0.083
>   4592.049 01/01/1995 -0.251
>674.305 01/01/1995 -0.327
>  39424.887 01/01/1996  0.198
>   4447.383 01/01/1996 -0.045
>   1608.540 01/01/1996 -0.109
>217.151 01/01/1996  0.539
>   1813.320 01/01/1996  0.754
>145.170 01/01/1996  0.249
>   3176.298 01/01/1996 -0.202
>  14379.686 01/01/1996  0.013
>   3009.059 01/01/1996 -0.328
>   1781.406 01/01/1996 -0.158
>   2576.215 01/01/1996  0.514
>   1236.317 01/01/1996  0.346
>   3003.735 01/01/1996  0.151
>   1544.003 01/01/1996  0.482
>   7588.657 01/01/1996  0.306
>   1516.625 01/01/1996  0.183
>   1596.098 01/01/1996  0.674
>   2792.192 01/01/1996  0.528
>   1276.702 01/01/1996  0.010
>875.716 01/01/1996  0.189
>   4858.450 01/01/1995  0.250
>   2033.623 01/01/1995 -0.582
>   2164.125 01/01/1995  0.631
>
> Here is the output which looks ok
>
> > new.data
>           mc         yr    ret   dec.mean
> 1  32902.233 01/01/1995  0.426  0.4605000
> 2   4858.450 01/01/1995  0.250  0.301
> 3   2033.623 01/01/1995 -0.582 -0.094
> 4   2164.125 01/01/1995  0.631  0.6455000
> 5  15793.691 01/01/1995  0.024  0.068
> 6   2375.868 01/01/1995  0.660  0.6455000
> 7770.963 01/01/1995 -1.248 -0.1895000
> 8423.480 01/01/1995  0.654  0.198
> 9   2135.504 01/01/1995  0.394 -0.094
> 10   696.599 01/01/1995 -0.482 -0.4045000
> 11  5115.476 01/01/1995  0.352  0.301
> 12   821.347 01/01/1995  0.869 -0.1895000
> 13 43329.695 01/01/1995  0.495  0.4605000
> 14  7975.151 01/01/1995  0.112  0.068
> 15   396.450 01/01/1995  0.956  0.198
> 16   843.870 01/01/1995  0.172  0.0445000
> 17  2727.037 01/01/1995 -0.358 -0.3045000
> 18   114.584 01/01/1995 -1.015  0.198
> 19  1347.327 01/01/1995 -0.083  0.0445000
> 20  4592.049 01/01/1995 -0.251 -0.3045000
> 21   674.305 01/01/1995 -0.327 -0.4045000
> 22 39424.887 01/01/1996  0.198  0.236
> 23  4447.383 01/01/1996 -0.045 -0.1235000
> 24  1608.540 01/01/1996 -0.109  0.162
> 25   217.151 01/01/1996  0.539  0.2516667
> 26  1813.320 01/01/1996  0.754  0.162
> 27   145.170 01/01/1996  0.249  0.2516667
> 28  3176.298 01/01/1996 -0.202 -0.1235000
> 29 14379.686 01/01/1996  0.013  0.236
> 30  3009.059 01/01/1996 -0.328 -0.0885000
> 31  1781.406 01/01/1996 -0.158  0.162
> 32  2576.215 01/01/1996  0.514  0.521
> 33  1236.317 01/01/1996  0.346  0.2675000
> 34  3003.735 01/01/1996  0.151 -0.0885000
> 35  1544.003 01/01/1996  0.482  0.578
> 36  7588.657 01/01/1996  0.306  0.3555000
> 37  1516.625 01/01/1996  0.183  0.0965000
> 38 54586.558 01/01/1996  0.497  0.236
> 39 10674.900 01/01/1996  0.405  0.3555000
> 40   859.656 01/01/1996 -0.033  0.2516667
> 41  1596.098 01/01/1996  0.674  0.578
> 42  2792.192 01/01/1996  0.528  0.5210000
> 43  1276.702 01/01/1996  0.010  0.0965000
> 44   875.716 01/01/1996  0.189  0.2675000
> >
>
> notice that records 1 and 13 fall into the same mc
> decile for the year 1995, and their ret mean is .4605
> and so forth for the other mc deciles in both years.
>
> I'd be interest

Re: [R] making groups

2007-07-09 Thread jim holtman

It would be nice if you could supply an example of what your input
looks like and then what you would like your output to look like.  You
would probably use 'tapply', but I would have to see what your data
looks like.
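
Without the real data this is only a guess, but if the key is a set of interval boundaries, 'cut' plus 'table' may already give the absolute frequencies.  The points, boundaries and marks below are all made up:

```r
# Hypothetical exam points and a hypothetical key
pts   <- c(12, 45, 67, 88, 51, 73, 95, 30, 60, 81)
key   <- c(0, 50, 62, 75, 87, 100)      # interval boundaries
marks <- c("5", "4", "3", "2", "1")     # one mark per interval

# cut() assigns each score to an interval; table() counts scores per mark
grade <- cut(pts, breaks = key, labels = marks, include.lowest = TRUE)
table(grade)
```

Each interval is closed on the right, so a score equal to a boundary falls into the lower interval; the real key may need different handling at the boundaries.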

On 7/9/07, Mag. Ferri Leberl <[EMAIL PROTECTED]> wrote:
> Dear everybody!
> If I have an array of numbers e.g. the points my students got at an
> examination, and a  key to group the numbers, e.g. the key which
> interval corresponds with which mark (two arrays of the same length or
> one 2x(number of marks)), how can I get the array of absolute
> frequencies of marks?
> I hope I have expressed my problem clearly.
> Thank you in advance.
> Mag. Ferri Leberl
>
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] parsing strings

2007-07-09 Thread jim holtman

Is this what you want:

> x <- "A10B10A10  B5AB 10 CD 12A10CD2EF3"
> x <- gsub(" ", "", x)  # remove blanks
> y <- gregexpr("[A-Z]+\\s*[0-9]+", x )[[1]]
>
> substring(x, y, y + attr(y, 'match.length') - 1)
[1] "A10"  "B10"  "A10"  "B5"   "AB10" "CD12" "A10"  "CD2"  "EF3"
>
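
In newer versions of R the same extraction can be written with 'regmatches', which was not available in the R used for the transcript above.  A sketch; the {1,2} limits come from your description of one or two letters followed by one or two digits:

```r
x <- "A10B10A10  B5AB 10 CD 12A10CD2EF3"

# Match one or two letters, optional spaces, then one or two digits
m <- gregexpr("[A-Z]{1,2} *[0-9]{1,2}", x)

# regmatches() extracts the matched pieces; gsub() then removes inner blanks
parts <- gsub(" ", "", regmatches(x, m)[[1]])
parts
```

Here the blanks are removed after matching rather than before, so the original string is left untouched.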


On 7/9/07, Drescher, Michael (MNR) <[EMAIL PROTECTED]> wrote:
> Hi All,
>
>
>
> I have strings made up of an unknown number of letters, digits, and
> spaces. Strings always start with one or two letters, and always end
> with one or two digits. A set of letters (one or two letters) is always
> followed by a set of digits (one or two digits), possibly with one or
> more spaces between the sets of letters and digits. A set of letters
> always belongs to the following set of digits and I want to parse the
> strings into these groups. As an example, the strings and the desired
> parsing results could look like this:
>
>
>
> A10B10, desired parsing result: A10 and B10
>
> A10  B5, desired parsing result: A10 and B5
>
> AB 10 CD 12, desired parsing result: AB10 and CD12
>
> A10CD2EF3, desired parsing result: A10, CD2, and EF3
>
>
>
> I assume that it is possible to search a string for letters and digits
> and then break the string where letters are followed by digits, however
> I am a bit clueless about how I could use, e.g., the 'charmatch' or
> 'parse' commands to achieve this.
>
>
>
> Thanks a lot in advance for your help.
>
>
>
> Best, Michael
>
>
>
>
>
>
>
> Michael Drescher
>
> Ontario Forest Research Institute
>
> Ontario Ministry of Natural Resources
>
> 1235 Queen St East
>
> Sault Ste Marie, ON, P6A 2E3
>
> Tel: (705) 946-7406
>
> Fax: (705) 946-2030
>
>
>
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Split graphs

2007-07-09 Thread jim holtman

How many columns do you have?  Is it 2 or 1000?  I cannot tell from your
email.  A histogram of 2 values does not seem meaningful.

Do you want 1000 separate histograms, one per page, or several per
page?  Yes, you can do it; the question is what you want and how you
want to do it.
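
To give an idea of the "several per page" option, here is a sketch.  It assumes a made-up matrix with 12 rows of 50 values each (not your 1000 x 2 data) and writes to a temporary PDF file:

```r
# Hypothetical data: 12 rows of 50 values each
set.seed(1)
mat <- matrix(rnorm(12 * 50), nrow = 12)

out <- tempfile(fileext = ".pdf")
pdf(out)

# 3 x 2 panels per page; a new page starts when the current one is full
par(mfrow = c(3, 2))
for (r in seq_len(nrow(mat))) {
  hist(mat[r, ], main = paste("row", r), xlab = "value")
}
dev.off()
```

With 1000 rows you would get many pages; changing 'mfrow' trades the number of pages against the size of each panel.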

On 7/9/07, tian shen <[EMAIL PROTECTED]> wrote:
> Hello All,
>  I have a question, which somehow I think it is easy, however, I just 
> couldn't get it.
>  I want to histogram each row of a 1000*2 matrix( means it has 1000 rows), 
> and I want to see those 1000 pictures together. How can I do this? Am I able 
> to split a graph into 1000 parts and in each parts it contains a histogram 
> for one row?
>
>  Thank you very much
>
>  Jessie
>
>
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] a little problem on selecting a subset from dataset A accordingto dataset B?

2007-07-09 Thread jim holtman

You might want to be careful since what you are comparing is floating
point numbers.  You might want to scale them and then convert to
integers to make sure that you are getting the numbers you think you
should be getting. (FAQ 7.31)
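
A small sketch of what I mean.  The values are made up (one of them borrowed from your coordinates), and the tolerance and the scaling factor of 100 are assumptions you would adjust to the precision of your data:

```r
# Hypothetical values: the first pair is reached by two different computations
a <- c(0.1 + 0.2, 542250.89)
b <- c(0.3,       542250.89)

a == b               # exact comparison can fail for 0.1 + 0.2
abs(a - b) < 1e-8    # comparison with a tolerance

# Scale to integers (here: two decimal places) before matching
a.int <- as.integer(round(a * 100))
b.int <- as.integer(round(b * 100))
a.int %in% b.int
```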

On 7/9/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > AB <- with(B, subset(A, coords.x1 %in% X1))
> > AB
>   coords.x1 coords.x2
> 0   542250.9   3392404
> 7   541512.5   3394722
> 8   541479.3   3394878
> 9   538903.4   3395943
> 18  543274.0   3389919
> 19  543840.8   3392012
>
>
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of zhijie zhang
> Sent: Monday, 9 July 2007 2:43 AM
> To: R-help@stat.math.ethz.ch
> Subject: [R] a little problem on selecting a subset from dataset A
> accordingto dataset B?
>
> Dear Friends,
>   I want to extract the records from A according to B, but the results
> are
> not correct because R says :
>  The length of long object is not integer times on the length of short
> object.
>  Anybody have met the same problem? How to do it correctly?
>
> length(A)=47
> length(B)=6
>
> A[A$coords.x1==B$X1,]   #the program for the above task. I should get 6
> records, but i only get former 4 records for the above reason.
>
> Thanks.
>  The following shows dataset A and B.
>
>
> > A
>   coords.x1 coords.x2
> 0  542250.89 3392404.1
> 1  538813.87 3388339.0
> 2  536049.19 3385821.6
> 3  533659.62 3383194.2
> 4  530642.30 3376834.9
> 5  529573.15 3378177.8
> 6  530853.82 3394838.8
> 7  541512.51 3394721.6
> 8  541479.33 3394877.8
> 9  538903.39 3395942.5
> 10 536019.95 3396286.1
> 11 538675.23 3384213.2
> 12 535127.95 3381255.4
> 13 533852.24 3378660.4
> 14 531360.91 3379273.8
> 15 539289.14 3375759.8
> 16 543410.51 3384353.1
> 17 543089.27 3388170.1
> 18 543274.03 3389919.2
> 19 543840.77 3392012.4
> 20 553383.55 3402401.8
> 21 554621.51 3397938.9
> 22 564096.42 3397524.4
> 23 567529.64 3398702.9
> 24 561798.76 3404864.0
> 25 562868.34 3405502.2
> 26 563145.22 3403192.1
> 27 562419.87 3404090.4
> 28 558321.85 3403879.9
> 29 567050.74 3404973.1
> 30 570609.70 3408742.4
> 31 556777.57 3397858.0
> 32 531353.38 3368596.6
> 33 533513.50 3372749.3
> 34 537543.19 3364284.8
> 35 538779.41 3368224.8
> 36 525930.09 3374067.7
> 37 522990.85 3369213.1
> 38 528826.37 3359019.0
> 39 533865.85 3362595.4
> 40 531200.25 3365053.0
> 41 551054.10 3377181.3
> 42 546974.19 3369284.8
> 43 572315.59 3359541.1
> 44 562703.63 3355173.4
> 45 558959.31 3357804.4
> 46 558531.39 3361741.1
>
>
> > B
>          X1        X2
> 1 542250.89 3392404.1
> 2 541512.51 3394721.6
> 3 541479.33 3394877.8
> 4 538903.39 3395942.5
> 5 543274.03 3389919.2
> 6 543840.77 3392012.4
>
> --
> With Kind Regards,
>
> oooO:
> (..):
> :\.(:::Oooo::
> ::\_)::(..)::
> :::)./:::
> ::(_/
> :
> [***
> ]
> Zhi Jie,Zhang ,PHD
> Tel:86-21-54237149
> Dept. of Epidemiology,School of Public Health,Fudan University
> Address:No. 138 Yi Xue Yuan Road,Shanghai,China
> Postcode:200032
> Email:[EMAIL PROTECTED]
> Website: www.statABC.com
> [***
> ]
> oooO:
> (..):
> :\.(:::Oooo::
> ::\_)::(..)::
> :::)./:::
> ::(_/
> :
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

