Re: [R] How to convert c:\a\b to c:/a/b?

2005-06-27 Thread james . holtman




If you have 'copied' the path from DOS, then you can use 'scan' to read it
into a variable with the proper characters.

Here is the string that I 'copied'

D:\spencerg\statmtds\R\Rnews


Here are the results after 'scan':

> x.1 <- scan('clipboard', what='', allowEscapes=FALSE)
Read 1 item
> x.1
[1] "D:\\spencerg\\statmtds\\R\\Rnews"


Jim
__
James Holtman        "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Convergys Labs
[EMAIL PROTECTED]
+1 (513) 723-2929


   
Henrik Bengtsson [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
06/27/2005 14:53

To: Spencer Graves [EMAIL PROTECTED]
cc: r-help@stat.math.ethz.ch, Dirk Eddelbuettel [EMAIL PROTECTED]
Subject: Re: [R] How to convert c:\a\b to c:/a/b?




Spencer Graves wrote:
 Hi, Henrik:

   Several functions, e.g., grep, sub, gsub, and regexpr,
 have an argument perl, FALSE by default.  Moreover, ?regexp has a
 section on Perl Regular Expressions.  If you can do it in perl, might
 that transfer to gsub(..., perl=TRUE)?

I do not know the details behind the different dialects of regular
expressions, but you can _not_ get the R parser to interpret the two
ASCII characters "\n" as the two characters "\" and "n". The R parser
is used when code is read by source() or when expressions are typed at
the R prompt.  The parser will always read "\n" as the newline character
(ASCII 10). The result from the parser is then passed to the R engine.
Thus, you cannot write your program such that it fools the parser,
because your program is evaluated only after the parser has run.  In
other words, there is no way you can get nchar("\n") to equal 2.
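
A minimal sketch of the escaping point, assuming the path is already held in an R string. The names `win_path`, `fwd_1` and `fwd_2` are illustrative and do not come from the thread. It checks the character counts discussed here and converts backslashes with two base R calls, `gsub(..., fixed = TRUE)` and `chartr()`:

```r
# In R source, "\n" is one character (newline) and "\\" is one character (a backslash).
n_newline   <- nchar("\n")    # one character
n_backslash <- nchar("\\")    # one character
n_two_chars <- nchar("\\n")   # a backslash followed by the letter n

# The path from the thread, typed with escaped backslashes.
win_path <- "D:\\spencerg\\statmtds\\R\\Rnews"

# fixed = TRUE treats the pattern as a literal backslash, not a regular expression.
fwd_1 <- gsub("\\", "/", win_path, fixed = TRUE)

# chartr() translates single characters, so no regular expression is involved.
fwd_2 <- chartr("\\", "/", win_path)

print(fwd_1)
```

Neither call helps if the raw path is pasted unescaped into source code, because the parser has already interpreted the backslashes by the time the function runs.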

Cheers

Henrik


   Thanks,
   spencer graves
 p.s.  I skimmed the discussion of Perl Regular Expressions, and
 experimented with gsub(..., perl=TRUE) without success.  However,
 there may be a way to do it, and I just don't know perl and regexp well
 enough to have figured it out in the time available.

 Henrik Bengtsson wrote:

 Spencer Graves wrote:

   Thanks, Dirk, Gabor, Eric:

   You all provided appropriate solutions for the stated problem.
 Sadly, I oversimplified the problem I was trying to solve:  I copy a
 character string giving a DOS path from MS Windows Explorer into an R
 script file, and I get something like the following:

   D:\spencerg\statmtds\R\Rnews

   I want to be able to use this in R with its non-R meaning,
 e.g., in readLine, count.fields, read.table, etc., after appending a
 file name. Your three solutions all work for my oversimplified toy
 example but are inadequate for the problem I really want to solve.



 Hmmm. It should work as long as you do not source() the file (see
 below).  There are two things to watch out for here.

 First, you have to be careful with backslashes, that is, a backslash
 is a single character ('\') in memory, but to be typed at the R
 prompt, you have to escape it (with a backslash), which is why we type
 "\\", cf. nchar("\\") == 1.  Consider the file foo.txt containing the
 28 characters (==28 bytes in plain ASCII format)

 D:\spencerg\statmtds\R\Rnews

 You can create such a file in R by

 > cat(file="foo.txt", "D:\\spencerg\\statmtds\\R\\Rnews")
 > str(file.info("foo.txt"))
 `data.frame':   1 obs. of  6 variables:
  $ size : num 28
  $ isdir: logi FALSE
  $ mode :Class 'octmode'  int 438
  $ mtime:'POSIXct', format: chr "2005-06-27 19:14:20"
  $ ctime:'POSIXct', format: chr "2005-06-27 19:14:20"
  $ atime:'POSIXct', format: chr "2005-06-27 19:14:20"

 Re-read it into R:
 > bfr <- readLines("foo.txt")
 Warning message:
 incomplete final line found by readLines on 'foo.txt'
 > bfr
 [1] "D:\\spencerg\\statmtds\\R\\Rnews"
 > cat("bfr='", bfr, "'\n", sep="")
 bfr='D:\spencerg\statmtds\R\Rnews'

 Now, convert backslashes to forward slashes:
 > bfr2 <- gsub("\\\\", "/", bfr)
   bfr2
 [1] D:/spencerg/statmtds/R

Re: [R] grep negation

2005-06-23 Thread james . holtman




?setdiff

e.g.,

> txt <- c("arm","foot","lefroo", "bafoobar")
> i <- grep("foo",txt); i
[1] 2 4
> setdiff(seq(length(txt)),grep("foo",txt))
[1] 1 3
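
A short sketch comparing the `setdiff` idiom with two other base R ways of negating a match, `grepl()` and the `invert` argument of `grep()`. The variable names `not_foo_*` are illustrative:

```r
txt <- c("arm", "foot", "lefroo", "bafoobar")

# The approach from the reply: all indices minus the matching ones.
not_foo_1 <- setdiff(seq_along(txt), grep("foo", txt))

# grepl() returns a logical vector, so it can be negated directly.
not_foo_2 <- which(!grepl("foo", txt))

# grep() can also return the non-matching indices itself.
not_foo_3 <- grep("foo", txt, invert = TRUE)

print(not_foo_1)
```

All three keep working when nothing matches, unlike negative indexing with `-grep(...)`, which would then select no elements at all.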



Jim



   
Marcus Leinweber [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
06/23/2005 08:59

To: r-help@stat.math.ethz.ch
Subject: [R] grep negation




hi,

using the example in the grep help:
txt <- c("arm","foot","lefroo", "bafoobar")
i <- grep("foo",txt); i
[1] 2 4

but how can i get the negation (1,3) when looking for 'foo'?

thanks,
m.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html



Re: [R] (no subject)

2005-06-20 Thread james . holtman




'rle' might be your friend.  This will find the 'run of a sequence'

Here is some code working off the 'visit' data that you created.

# $Log$
x.1 <- matrix(visit, ncol=4)  # your data
x.rle <- apply(x.1, 1, rle)  # compute 'rle' for each row
Passed <- lapply(x.rle, function(x){  # now process each row to see if it meets the criteria
    .len <- length(x$lengths)
    if (x$lengths[.len] > 1 && x$values[.len] == 1) return(TRUE)  # last two passed
    else if (.len == 2){ # two sequences
        if (x$lengths[.len] == 1 && x$values[.len] == 1) return(TRUE) # only last passed
    }
    return(FALSE)
})
cbind(unlist(Passed), x.1)  # put results in first column with the data
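
A self-contained sketch of the same `rle` idea on a small, made-up matrix (the four rows are hypothetical data, one row per patient and one column per visit). Like the code above, it looks only at the 0/1 values in each row and does not model missed visits separately:

```r
# Hypothetical 0/1 results: rows are patients, columns are visits 1 to 4.
x.1 <- matrix(c(0, 0, 1, 1,
                0, 0, 0, 1,
                1, 0, 1, 0,
                0, 1, 0, 1), ncol = 4, byrow = TRUE)

passed <- apply(x.1, 1, function(row) {
  r <- rle(row)                      # runs of identical values in this row
  n <- length(r$lengths)             # number of runs
  last.is.pos <- r$values[n] == 1    # does the row end in a run of 1s?
  (last.is.pos && r$lengths[n] > 1) ||             # positive at the last two visits
    (last.is.pos && n == 2 && r$lengths[n] == 1)   # positive only at the last visit
})

print(cbind(passed, x.1))
```

Rows 1 and 2 pass (last two positive, and only the last positive); rows 3 and 4 alternate and are rejected.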

Jim



   
[EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
06/20/2005 11:58

To: r-help@stat.math.ethz.ch
Subject: [R] (no subject)





R friends,
I am using R 2.1.0 on Win XP. I have a problem working with lists;
probably I do not understand how to use them.

Let's suppose that a set of patients visit a clinic once a year for 4
years. On each visit a test, say 'eib', is performed, with result 0 or 1.
The patients do not all make all 4 visits; they miss a lot of them.
The test is considered positive if it is positive at the last 2 visits of
that patient or, under a more lenient definition, if it is positive at the
last visit and never before.
Otherwise it is Negative (always negative) or a YoYo (unstable: it changes
from positive to negative).
So, if I code the visits as 1, 2, 4, 8 when present at year 1, 2, 3, 4, and
code the positive tests similarly, I get the last2 list, which gives the
test codes corresponding to each possible visit pattern, and similarly the
last1 list. 20 here means NULL.

nobs <- 400
#  visits   0    1   2   3      4   5      6      7        8   9
last1 <- list((20),(1),(2),c(3,2),(4),c(5,4),c(6,4),c(7,6,4),(8),c(9,8),
#  visits  10      11         12      13         14         15
  c(10,8),c(11,10,8),c(12,8),c(13,12,8),c(14,12,8),c(15,14,12,8))
#  visits   0    1    2    3   4    5   6   7      8    9
last2 <- list((20),(20),(20),(3),(20),(5),(6),c(7,6),(20),(9),
#  visits  10   11       12   13       14       15
  (10),c(11,10),(12),c(13,12),c(14,12),c(15,14,12))
#
# simulate the visits
#
visit <- rbinom(nobs,1,0.7)
eib <- visit
#
# simulate a positive test at a given visit
#
eib <- ifelse(runif(nobs) < 0.7,visit,0)
#
# create the codes
#
viskode <- matrix(visit,ncol=4) %*% c(1,2,4,8)
eibkode <- matrix(eib,ncol=4) %*% c(1,2,4,8)
#
# this is the brute force method, slow, of computing the Results according to
# the 2 definitions above. Add 16 to the test kode to signify YoYos; exactly
# 16 will be the negatives
#
eibnoyoyo <- eibkode+16
eiblast2 <- eibkode+16
for(i in 1:nobs){
   if(eibkode[i] %in% last1[[viskode[i]+1]])
      eibnoyoyo[i] <- eibkode[i]
   if(eibkode[i] %in% last2[[viskode[i]+1]])
      eiblast2[i] <- eibkode[i]
}
#
# why is it that these statements do not work?
#
eeibnoyoyo <- eeiblast2 <- rep(0,nobs)
eeibnoyoyo <- ifelse(eibkode %in% last1[viskode+1],eibkode,eibkode+16)
eeiblast2  <- ifelse(eibkode %in% last2[viskode+1],eibkode,eibkode+16)
#
table(viskode,eibkode)
table(viskode,eibnoyoyo)
table(viskode,eiblast2)
#
#  these two tables must be diagonal!!
#
table(eibnoyoyo,eeibnoyoyo)
table(eiblast2,eeiblast2)
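
One possible reading of why the vectorised statements differ from the loop: `%in%` checks each code against the whole right-hand side at once, while the loop checks code `i` only against list element `i`. A minimal sketch of the pairwise test with `mapply()`, using made-up values (`codes` and `allowed` are hypothetical stand-ins for `eibkode` and `last1[viskode+1]`):

```r
# Hypothetical stand-ins: one test code per patient and one set of
# acceptable codes per patient.
codes   <- c(3, 2, 5)
allowed <- list(c(3, 2), c(1), c(5, 4))

# Pairwise membership: is codes[i] contained in allowed[[i]]?
pairwise <- mapply(function(code, set) code %in% set, codes, allowed)

# Same recoding rule as in the question: keep the code, else add 16.
recoded <- ifelse(pairwise, codes, codes + 16)

print(pairwise)
print(recoded)
```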
#
Thanks for any help
Heberto Ghezzo
McGill University
Canada


Re: [R] vectorisation suggestion

2005-06-20 Thread james . holtman




v3 <- numeric()
v3[v1] <- table(v2)[v1]
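
A small self-contained sketch of the `table` approach with made-up data. The values in `vector1` and `vector2` are illustrative only. Converting `vector2` to a factor whose levels are `vector1` is an extra step not in the reply; it keeps the counts in the order of `vector1` and reports zero for terms that never occur:

```r
vector1 <- c("a", "b", "c", "d")                 # terms to count (assumed unique)
vector2 <- c("a", "c", "a", "b", "a", "c")       # vector to search

# One pass over vector2; levels fix both the order and the set of terms.
counts  <- table(factor(vector2, levels = vector1))
vector3 <- as.vector(counts)

print(counts)
```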


Jim



   
Federico Calboli [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
06/20/2005 16:15

To: r-help r-help@stat.math.ethz.ch
Subject: [R] vectorisation suggestion




Hi All,

I am counting the number of occurrences of the terms listed in one
vector in another vector.

My code runs:

for( i in 1:length(vector3)){
   vector3[i]  = sum(1*is.element(vector2,  vector1[i]))
}

where

vector1 = vector containing the terms whose occurrences I want to count
vector2 = made up of a number of repetitions of all the elements of
vector1
vector3 = a vector of NAs that is meant to get the result of the
counting

My problem is that vector1 is about 6 terms, and vector2 is
62... can anyone suggest a faster code than the one I wrote?

Cheers,

Federico Calboli


--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com



Re: [R] vectorization

2005-06-17 Thread james . holtman




try this:

> x.1 <- data.frame(income=runif(100)*10000,
+     educ=sample(c('hs','col','none'),100,T))
> x.1
       income educ
1  5930.30882  col
2  5528.83222   hs
3  5967.04041   hs
4  3926.30682   hs
5  2603.75924 none
...
> x.2 <- tapply(x.1$income, x.1$educ, mean)
> x.2
     col       hs     none
5575.310 4994.921 5481.962
> x.1$median <- x.2[x.1$educ]
> x.1
       income educ   median
1  5930.30882  col 5575.310
2  5528.83222   hs 4994.921
3  5967.04041   hs 4994.921
4  3926.30682   hs 4994.921
5  2603.75924 none 5481.962
6  7398.83325  col 5575.310
7   265.06895   hs 4994.921
...
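
The transcript above fills the new column with group means, while the question asked for medians. A hedged sketch of the median version using base R's `ave()`, on made-up data with the column names from the question (`mydata`, `income`, `education`, `md`):

```r
# Small made-up data set; the real one has about 30,000 rows.
mydata <- data.frame(income    = c(10, 20, 60, 30, 50, NA),
                     education = c("hs", "col", "hs", "col", "hs", "col"))

# ave() applies FUN within each group and returns a vector aligned with the rows.
mydata$md <- ave(mydata$income, mydata$education,
                 FUN = function(x) median(x, na.rm = TRUE))

print(mydata)
```

`ave()` does the split, the per-group computation and the expansion back to row order in one call, so no explicit loop or lookup table is needed.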


Jim



   
Dimitri Joe [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
06/17/2005 14:00

To: R-Help r-help@stat.math.ethz.ch
Subject: [R] vectorization




Hi there,

I have a data frame (mydata) with 1 numeric variable (income) and 1 factor
(education). I want a new column in this data with the median income for
each education level. An obviously inefficient way to do this is

for ( k in 1: nrow(mydata) ){
l <- mydata$education[k]
mydata$md[k] <- median(mydata$income[mydata$education==l],na.rm=T)
}

Since mydata has nearly 30,000 rows, this will not be done until the end
of this month. I thus need some help vectorizing this, please.

Thanks,

Dimitri



Re: [R] Manipulating dates

2005-06-14 Thread james . holtman




Use POSIX.  To convert:

my.dates <- strptime(your.characters, format='%d/%m/%Y')


once you have that, you can use 'min' to find the minimum.

'difftime' will give you the differences.
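
A short sketch of the three steps (convert, take the minimum, compute differences) on made-up dates. The sample values and the `tz = "UTC"` argument are assumptions; fixing the time zone avoids daylight-saving shifts in the day counts, and `as.character()` covers the case where the input is a factor:

```r
# Made-up input in the day/month/year form from the question.
your.characters <- factor(c("28/03/2000", "01/04/2000", "15/03/2000"))

# Convert to date-times; as.character() is needed if the input is a factor.
my.dates <- strptime(as.character(your.characters),
                     format = "%d/%m/%Y", tz = "UTC")

# Earliest date, then whole days after it.
earliest   <- min(my.dates)
days.after <- as.numeric(difftime(my.dates, earliest, units = "days"))

print(days.after)
```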

Jim



   
Richard Hillary [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
06/14/2005 10:07
Please respond to r.hillary

To: r-help@stat.math.ethz.ch
Subject: [R] Manipulating dates




Hello,
  Given a vector of characters, or factors, denoting the date in
the following way: 28/03/2000, is there a method of
1) Computing the earliest of these dates;
2) Using this as a base, then converting all the other dates into merely
the number of days after this minimum date
Many thanks
Richard Hillary



Re: [R] Dateticks

2005-06-14 Thread james . holtman




try this example:

> x.1 <- strptime("6/17/03",'%m/%d/%y')
> x.1
[1] "2003-06-17"
> plot(0:250, xaxt='n')
> dates <- x.1 + c(0,50,100,150,200,250) * 86400  # 'dates' is in seconds, so add the appropriate number of days
> dates
[1] "2003-06-17 00:00:00 EDT" "2003-08-06 00:00:00 EDT" "2003-09-25 00:00:00 EDT"
[4] "2003-11-13 23:00:00 EST" "2004-01-02 23:00:00 EST" "2004-02-21 23:00:00 EST"
> axis(1, at=c(0,50,100,150,200,250), labels=format(dates,"%m/%d/%y"))  # format the output
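
In the output above the last three labels fall at 23:00 on the previous day because the local time zone switches from EDT to EST. A hedged variant that swaps `strptime` plus seconds for the `Date` class, where adding an integer means adding whole days (the names `start`, `ticks` and `tick.labels` are illustrative):

```r
start <- as.Date("6/17/03", format = "%m/%d/%y")
ticks <- c(0, 50, 100, 150, 200, 250)

# Date arithmetic is in days, so there is no daylight-saving shift.
tick.labels <- format(start + ticks, "%m/%d/%y")
print(tick.labels)

# Drawing is skipped in non-interactive runs.
if (interactive()) {
  plot(0:250, xaxt = "n")
  axis(1, at = ticks, labels = tick.labels)
}
```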


Jim



   
Bernard L. Dillard [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
06/14/2005 12:27

To: r-help@stat.math.ethz.ch
Subject: [R] Dateticks




Hello.  I am having the worst time converting x-axis date ticks to real
dates.  I have tried several suggestions in online help tips and books to
no avail.

For example, the x-axis has 0, 50, 100, etc, and I want it to have
6/17/03, 8/6/03 etc.  See attached (sample).

Can anybody help me with this.

Here's my code:

ts.plot(date.attackmode.table[,1], type="l", col="blue", lty=2,
        ylab="IED Attacks", lwd=2, xlab="Attack Dates",
        main="Daily Summary of Attack Mode")
grid()

Thanks for your help if possible.
(See attached file: sample.pdf)

Re: [R] transform large matrix into list

2005-06-07 Thread james . holtman




> x.1
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]   NA    6
> cbind(x.1[!is.na(x.1)], which(!is.na(x.1), arr.ind=TRUE))
       row col
[1,] 1   1   1
[2,] 2   2   1
[3,] 4   1   2
[4,] 5   2   2
[5,] 6   3   2
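
A sketch of the same idea that returns the `v`, `x`, `y` layout asked for in the question, with `x` as the column index and `y` as the row index. The matrix values are made up, and the names `keep` and `L` are illustrative:

```r
# Made-up matrix with one missing cell.
M <- matrix(c(1, 4, NA,
              2, 5, 8), nrow = 3)

# Row/column positions of all non-missing cells, in column-major order.
keep <- which(!is.na(M), arr.ind = TRUE)

# Indexing a matrix by a two-column index matrix picks out those cells.
L <- data.frame(v = M[keep],
                x = keep[, "col"],
                y = keep[, "row"])

print(L)
```

Everything is vectorised, so the cost grows with the number of cells rather than with an R-level loop over them.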


Jim



   
Stefan Mischke [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
06/07/2005 08:55

To: r-help@stat.math.ethz.ch
Subject: [R] transform large matrix into list




Dear List

I need to transform a large matrix M with many NAs into a list L with
one row for each non missing cell. Every row should contain the cell
value in the first column, and its coordinates of the matrix in column
2 and 3.

M:
     x1  x2
y1   1   2
y2   4   5
y3   7   8

L:
v   x   y
1   1   1
4   1   2
7   1   3
2   2   1
5   2   2
8   2   3

I'm trying to do this with a loop, but since my matrix is quite large
(around 10k^2) this just takes a very long time.
There must be a more efficient and elegant way to do this. Any hints?

Thanks,
Stefan



Re: [R] weighted.mean and tapply (again)

2005-05-25 Thread james . holtman




> x.1 <- read.table('clipboard',header=T)
> x.1
   GROUP VALUE FREQUENCY
1      2     2        78
2      2     3        40
3      2     4        16
4      2     5         3
5      2     6         1
6      2     8         1
7      3     3        19
8      3     4        10
9      3     5        19
10     3     6         4
> by(x.1, x.1$GROUP, function(x) weighted.mean(x$VALUE, x$FREQUENCY))
x.1$GROUP: 2
[1] 2.654676
---
x.1$GROUP: 3
[1] 4.153846
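
`tapply` passes only one vector to its function, which is why a weighted mean needs the whole sub-table. A sketch of an equivalent that returns a plain named vector instead of a `by` object, using `split()` and `sapply()` on the ten rows shown (the name `wm` is illustrative):

```r
dat <- data.frame(
  GROUP     = c(2, 2, 2, 2, 2, 2, 3, 3, 3, 3),
  VALUE     = c(2, 3, 4, 5, 6, 8, 3, 4, 5, 6),
  FREQUENCY = c(78, 40, 16, 3, 1, 1, 19, 10, 19, 4)
)

# split() yields one data frame per GROUP; sapply() reduces each to one number.
wm <- sapply(split(dat, dat$GROUP),
             function(d) weighted.mean(d$VALUE, d$FREQUENCY))

print(wm)
```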


Jim



   
Dan Bolser [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
05/25/2005 11:33

To: R mailing list r-help@stat.math.ethz.ch
Subject: [R] weighted.mean and tapply (again)





I read answers to questions including the words tapply and
weighted.mean, but I didn't understand either the problem (data) or the
solution provided.

Here is my question ...

> dat[1:10,]
   GROUP VALUE FREQUENCY
1      2     2        78
2      2     3        40
3      2     4        16
4      2     5         3
5      2     6         1
6      2     8         1
7      3     3        19
8      3     4        10
9      3     5        19
10     3     6         4


For each GROUP, I would like to calculate the weighted.mean of VALUE using
the FREQUENCY as the weight, so for the snippet of data shown that would
be...

group.2 <- weighted.mean(c(2,3,4,5,6,8),c(78,40,16,3,1,1))
group.3 <- weighted.mean(c(3,4,5,6),c(19,10,19,4))

> cbind(rbind(2,3),rbind(group.2,group.3))
        [,1]     [,2]
group.2    2 2.654676
group.3    3 4.153846

I would like to use tapply to automatically do this across the whole
dataset (dat) - which includes lots of other distinct grouping factors,
however, like I said, I couldn't understand (and therefore apply to my
data) any of the other solutions I found, so any help here would be
greatly appreciated!

All the best,
Dan.



Re: [R] plot question

2005-03-03 Thread james . holtman




tt <- data.frame(c(0.5, 1, 0.5))
names(tt) <- "a"
plot(tt$a, type = 'o',xlim=c(0,4))
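
A small sketch that derives the same limits from the data rather than hard-coding `c(0, 4)`. The padding of one unit (`pad`) is an arbitrary illustrative choice:

```r
tt <- data.frame(a = c(0.5, 1, 0.5))

x    <- seq_along(tt$a)            # the positions plot() would use by default
pad  <- 1                          # extra room on each side, in x units
xlim <- range(x) + c(-pad, pad)    # widen the range at both ends
print(xlim)

# Drawing is skipped in non-interactive runs.
if (interactive()) {
  plot(x, tt$a, type = "o", xlim = xlim)
}
```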




   
Christoph Lehmann [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
03/03/2005 11:29

To: r-help@stat.math.ethz.ch
Subject: [R] plot question




I have the following simple situation:

tt <- data.frame(c(0.5, 1, 0.5))
names(tt) <- "a"
plot(tt$a, type = 'o')

gives the following plot ('I' and '.' represent the axis):

I
I
I       X
I
I
I   X       X
I...............
    1   2   3

what do I have to change to get the following:


I
I
I          X
I
I
I      X       X
I.....................
       1   2   3

i.e. the plot-region should be widened at the left and right side

thanks for a hint

christoph



Re: [R] (no subject)

2005-02-16 Thread james . holtman




use 'gsub'

> x <- c('1,200.44', '23,345.66')
> gsub(',','',x)
[1] "1200.44"  "23345.66"
> as.numeric(gsub(',','',x))
[1]  1200.44 23345.66
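
A sketch of the same substitution applied to a column of a table, assuming the numbers were read in as text. The data frame `tab` and its column names are hypothetical:

```r
# Hypothetical table whose 'amount' column was read in as text.
tab <- data.frame(id     = 1:2,
                  amount = c("1,200.44", "23,345.66"),
                  stringsAsFactors = FALSE)

# Drop the thousands separators, then convert to numeric.
tab$amount <- as.numeric(gsub(",", "", tab$amount, fixed = TRUE))

print(tab)
```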




   
Jim Gustafsson [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
02/16/2005 09:08

To: r-help@stat.math.ethz.ch
Subject: [R] (no subject)





R-people

I wonder if one could change a list or table with numbers of the form
1,200.44 to 1200.44

Regards
JG




Re: [R] Programming/scripting with expressions - variables

2005-02-07 Thread james . holtman




Here is one way.  It is the custom to return a value that will be assigned
to the variable, so I changed your 'macro' to a function that returns the
value and then assigns it to your variable:

> test <- function(name, value){
+     .result <- NULL # initialize to NULL
+     .result[name] <- value
+     .result[paste('other_', name, sep='')] <- paste("other_", value, sep='')
+     .result
+ }
> Gregor <- test('Gorjanc', '25')
> Gregor    # print out the vector
      Gorjanc other_Gorjanc
         "25"    "other_25"
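
The function above returns a named character vector. Since the SAS macro produces a table, here is a hedged variant that returns a one-row data frame instead. This is a sketch of one possible design; the argument names `colname` and `colvalue` are borrowed from the macro:

```r
test <- function(colname, colvalue) {
  # Build the two columns first, then name them from the arguments.
  out <- data.frame(colvalue,
                    paste0("other_", colvalue),
                    stringsAsFactors = FALSE)
  names(out) <- c(colname, paste0("other_", colname))
  out
}

Gregor <- test("Gorjanc", 25)
print(Gregor)
```

As in the reply, the result is returned and assigned by the caller, rather than the function creating an object named `Gregor` as a side effect.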




   
Gorjanc Gregor [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
02/07/2005 09:52

To: r-help@stat.math.ethz.ch
Subject: [R] Programming/scripting with expressions - variables

   




Hello to Rusers!

I am puzzled with R and I really do not know where to look
in for my problem. I am moving from SAS and I have difficulties
in translating SAS to R world. I hope I will get some hints
or pointers so I can study from there on.

I would like to do something like this. In SAS I can write
a macro like the example below, which is of course a silly one but
shows what I don't know how to do in R.

%macro test(data, colname, colvalue);
data data;
...
colname=colvalue;
other_colname=other_colvalue;
run;
%mend;

And if I run it with this call:
%test(Gregor, Gorjanc, 25);

I get a table with name 'Gregor' and columns 'Gorjanc',
and 'other_Gorjanc' with values:

Gorjanc other_Gorjanc
25other_25

So can one show me the way to do the same thing in R?

Thanks!

--
Lep pozdrav / With regards,
Gregor GORJANC

---
University of Ljubljana
Biotechnical Faculty   URI: http://www.bfro.uni-lj.si
Zootechnical Departmentemail: gregor.gorjanc at bfro.uni-lj.si
Groblje 3  tel: +386 (0)1 72 17 861
SI-1230 Domzalefax: +386 (0)1 72 17 888
Slovenia



Re: [R] Frequency of Data

2005-02-02 Thread james . holtman




Try this using strsplit and table:

> dates <- c('29.02.1997','15.02.2001','15.02.2001','23.12.2002')
>
> x.1 <- do.call('rbind',strsplit(dates,'\\.'))
> x.1
     [,1] [,2] [,3]
[1,] "29" "02" "1997"
[2,] "15" "02" "2001"
[3,] "15" "02" "2001"
[4,] "23" "12" "2002"
> class(x.1) <- 'integer'
> x.1
     [,1] [,2] [,3]
[1,]   29    2 1997
[2,]   15    2 2001
[3,]   15    2 2001
[4,]   23   12 2002
>
> table(list(x.1[,2], x.1[,3]))

    .2
.1   1997 2001 2002
  2     1    2    0
  12    0    0    1
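
A hedged alternative that parses the strings as dates and tabulates years against all twelve months, which gives the years-by-months matrix described in the question. It swaps `strsplit` for `as.Date()` and `format()`; the sample dates are made up. Unlike the string split above, `as.Date()` would turn an impossible date such as 29.02.1997 into `NA`:

```r
# Made-up dates in the day.month.year form of the CSV file.
dates <- c("25.02.2003", "29.07.1997", "15.02.2001", "15.02.2001", "23.12.2002")

d <- as.Date(dates, format = "%d.%m.%Y")

year  <- factor(format(d, "%Y"))
# Fixing the levels at 1:12 keeps empty months as zero columns.
month <- factor(as.integer(format(d, "%m")), levels = 1:12)

freq <- table(year, month)
print(freq)
```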




 
Carsten Steinhoff [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
02/02/2005 14:44

To: r-help@stat.math.ethz.ch
Subject: [R] Frequency of Data

 




Hello,

just another problem in R, maybe it's simple to solve for you. I didn't
find a solution up to now, but I'm convinced that I'm not the only one who
has/had a similar problem. Maybe there's a ready-made function in R?

The problem:

I've imported a CSV file into R with 1000 dates of an observed event
(there is only information on the date: when no event happened the date
is not recorded; when there have been two events it is recorded twice). Now
I want to COUNT the frequency of events in every month or year.

The CSV data is structured as:

date
25.02.2003
29.07.1997
...

My desired output would be a matrix with n rows for the years and m columns
for the months.

What could a solution look like? If the format is not a matrix it doesn't
matter. What is important is the extraction of the frequencies from my data.

Thanks for all reply,

Carsten



Re: [R] Finding runs of TRUE in binary vector

2005-01-27 Thread james . holtman




use 'rle';

> a <- rnorm(20)
> b <- a < .5
> b
 [1] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE
[13] FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE
> rle(b)
Run Length Encoding
  lengths: int [1:9] 1 7 2 2 2 3 1 1 1
  values : logi [1:9] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE
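
A sketch of how the `rle` result can be turned into start/end positions of the TRUE runs, using the vector printed in the question. The names `starts`, `ends` and `regions` are illustrative:

```r
# The logical vector shown in the question.
b <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE)

r      <- rle(b)
ends   <- cumsum(r$lengths)          # last index of every run
starts <- ends - r$lengths + 1       # first index of every run

# Keep only the runs whose value is TRUE.
regions <- cbind(start = starts, end = ends)[r$values, , drop = FALSE]
print(regions)
```

Replacing `r$values` with `!r$values` in the last step gives the runs of FALSE instead.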




   
Sean Davis [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
01/27/2005 17:13

To: r-help r-help@stat.math.ethz.ch
Subject: [R] Finding runs of TRUE in binary vector

   




I have a binary vector and I want to find all regions of that vector
that are runs of TRUE (or FALSE).

  > a <- rnorm(10)
  > b <- a<0.5
  > b
  [1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE

My function would return something like a list:
region[[1]] 1,3
region[[2]] 5,5
region[[3]] 7,10

Any ideas besides looping and setting start and ends directly?

Thanks,
Sean



Re: [R] Avoiding a Loop?

2005-01-21 Thread james . holtman




Does this do what you want?

nr.of.columns <- 4

myconstant <- 27.5

mymatrix <- matrix(myconstant, nrow=5, ncol=nr.of.columns)

mymatrix[,1] <- 1:5

t(apply(mymatrix, 1, function(x) cumprod(x)))
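
Because column `i` is the first column times `myconstant^(i-1)`, the matrix can also be built in one step with `outer()`. A sketch that compares this with the loop from the question (the names `by.outer` and `by.loop` are illustrative):

```r
nr.of.columns <- 4
myconstant    <- 27.5
first.column  <- 1:5

# Outer product of the first column with the powers of the constant.
by.outer <- outer(first.column, myconstant^(0:(nr.of.columns - 1)))

# Loop version from the question, for comparison.
by.loop <- matrix(NA_real_, nrow = 5, ncol = nr.of.columns)
by.loop[, 1] <- first.column
for (i in 2:nr.of.columns) {
  by.loop[, i] <- myconstant * by.loop[, i - 1]
}

print(all.equal(by.outer, by.loop))
```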





   
Rau, Roland [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
01/21/2005 07:31

To: r-help@stat.math.ethz.ch
Subject: [R] Avoiding a Loop?

   




Dear R-Helpers,

I have a matrix where the first column is known. The second column is
the result of multiplying this first column with a constant "const". The
third column is the result of multiplying the second column with
"const".
So far, I did it like this (as a simplified example):

nr.of.columns <- 4

myconstant <- 27.5

mymatrix <- matrix(numeric(0), nrow=5, ncol=nr.of.columns)

mymatrix[,1] <- 1:5

for (i in 2:nr.of.columns) {
    mymatrix[,i] <- myconstant * mymatrix[,i-1]
}


Can anyone give me some advice whether it is possible to avoid this loop
(and if yes: how)?

Any suggestions are welcome!

Thanks,
Roland






Re: [R] recoding large number of categories (select in SAS)

2005-01-19 Thread james . holtman




Here is a way of doing it by setting up a matrix of values to test against.
Easier than writing all the 'select' statements.

> x.trans <- matrix(c(  # translation matrix; first column is min, second is max,
+ 149, 150, 150,  # and third is the value to be returned
+ 186, 187, 187,
+ 438, 438, 438,
+ 430, 430, 430,
+ 808, 826, 808,
+ 830, 832, 808,
+ 997, 998, 792,
+ 792, 796, 792), ncol=3, byrow=T)
> colnames(x.trans) <- c('min', 'max', 'value')
>
> x.default <-    # default/nomatch value
>
> x.test <- c(150, 149, 148, 438, 997, 791, 795, 810, 820, 834)   # test data
> #
> # this function will test each value and, if it is between the min/max,
> # return the third column
> #
> newValues <- sapply(x.test, function(x){
+     .value <- x.trans[(x >= x.trans[,'min']) & (x <= x.trans[,'max']),'value']
+     if (length(.value) == 0) .value <- x.default    # on no match, take default
+     .value[1]   # return first value if multiple matches
+ })
> newValues
 [1]  150  150   438  792   792  808  808 
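
A self-contained sketch of the same range lookup, written as a function that tests all values against all ranges at once with `outer()`. The function name `recode_range` is hypothetical, and the default of `NA` is an assumption made here because the default value is not legible in the transcript above:

```r
x.trans <- matrix(c(149, 150, 150,
                    186, 187, 187,
                    438, 438, 438,
                    430, 430, 430,
                    808, 826, 808,
                    830, 832, 808,
                    997, 998, 792,
                    792, 796, 792),
                  ncol = 3, byrow = TRUE,
                  dimnames = list(NULL, c("min", "max", "value")))

# 'default' is an assumed placeholder for codes that fall in no range.
recode_range <- function(x, trans, default = NA_real_) {
  # hit[i, j] is TRUE when x[i] lies inside range j
  hit <- outer(x, trans[, "min"], ">=") & outer(x, trans[, "max"], "<=")
  # index of the first matching range per value, NA when none matches
  first <- apply(hit, 1, function(h) if (any(h)) which(h)[1] else NA_integer_)
  out <- trans[first, "value"]
  out[is.na(first)] <- default
  unname(out)
}

x.test    <- c(150, 149, 148, 438, 997, 791, 795, 810, 820, 834)
newValues <- recode_range(x.test, x.trans)
print(newValues)
```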




   
Denis Chabot [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
01/19/2005 08:56 AM

To: r-help@stat.math.ethz.ch
Subject: [R] recoding large number of categories (select in SAS)

   




Hi,

I have data on stomach contents. Possible prey species are in the
hundreds, so a list of prey codes has been in use in many labs doing
this kind of work.

When comes time to do analyses on these data one often wants to regroup
prey in broader categories, especially for rare prey.

In SAS you can nest a large number of "if-else", or do this more
cleanly with "select" like this:
select;
   when (149 <= prey <= 150)   preyGr = 150;
   when (186 <= prey <= 187)   preyGr = 187;
   when (prey = 438)           preyGr = 438;
   when (prey = 430)           preyGr = 430;
   when (prey = 436)           preyGr = 436;
   when (prey = 431)           preyGr = 431;
   when (prey = 451)           preyGr = 451;
   when (prey = 461)           preyGr = 461;
   when (prey = 478)           preyGr = 478;
   when (prey = 572)           preyGr = 572;
   when (692 <= prey <= 695)   preyGr = 692;
   when (808 <= prey <= 826, 830 <= prey <= 832)   preyGr = 808;
   when (997 <= prey <= 998, 792 <= prey <= 796)   preyGr = 792;
   when (882 <= prey <= 909)   preyGr = 882;
   when (prey in (999, 125, 994))   preyGr = 9994;
   otherwise preyGr = 1;
end; *select;

The number of transformations is usually much larger than this short
example.

What is the best way of doing this in R?

Sincerely,

Denis Chabot

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html



Re: [R] plotting percent of incidents within different 'bins'

2005-01-05 Thread james . holtman




You can use 'cut' to create the breaks.  Actually there are 8 in the 3-4
range:

   Outcome predictor
1        0         1
2        1         2
3        1         2
4        0         3
5        0         3
6        0         2
7        1         3
8        1         4
9        1         4
10       0         4
11       0         4
12       0         4
> cut(x.1$p, breaks=c(0,2,4))
 [1] (0,2] (0,2] (0,2] (2,4] (2,4] (0,2] (2,4] (2,4] (2,4] (2,4] (2,4] (2,4]
Levels: (0,2] (2,4]
> x.c <- cut(x.1$p, breaks=c(0,2,4))
> tapply(x.1$O, x.c, function(x)sum(x==1)/length(x))
(0,2] (2,4]
0.500 0.375
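
For completeness, a small runnable sketch that rebuilds the twelve rows from the question and returns the bands and percentages as a two-column data frame. The object names 'band', 'pct' and 'result' are illustrative choices, and the commented barplot() call is just one possible way to draw the result.

```r
# The twelve observations from the question.
x.1 <- data.frame(
  Outcome   = c(0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0),
  predictor = c(1, 2, 2, 3, 3, 2, 3, 4, 4, 4, 4, 4)
)

# Bin the predictor into (0,2] and (2,4]; the mean of a 0/1 outcome
# within a bin is the proportion of positives.
band <- cut(x.1$predictor, breaks = c(0, 2, 4))
pct  <- 100 * tapply(x.1$Outcome, band, mean)

result <- data.frame(band = names(pct), percent = as.vector(pct),
                     stringsAsFactors = FALSE)
result

# One possible plot of the percentages per band:
# barplot(pct, xlab = "predictor band", ylab = "% positive")
```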

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


Stephen Choularton [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
01/05/2005 14:34
To: R Help r-help@stat.math.ethz.ch
cc:
Subject: [R] plotting percent of incidents within different 'bins'

Hi

Say I have some data, two columns in a table being a binary outcome plus
a predictor and I want to plot a graph that shows the percentage
positives of the binary outcome within bands of the predictor, e.g.


Outcome   predictor

0  1
1  2
1  2
0  3
0  3
0  2
1  3
1  4
1  4
0  4
0  4
0  4
etc

In this case there are 4 cases in the band 1 - 2 of the predictor, 2 of
them are true so the percent is 50% and there are 7 cases in the band 3
- 4, 3 of which are true making the percentage 43% .

Is there some function in R that will sum these outcomes by bands of
predictor and produce a one by two  data set with the percentages in one
column and the ordered bands in the other, or alternatively is there some
sort of special plot that does it all for you?

Thanks

Stephen





Re: [R] lists within a list / data-structure problem

2004-12-13 Thread james . holtman




Construct your list in the loop:

x.all <- list()  # initialize
for (i in 1:limit){
  ...
  x.all[[i]] <- result.list
}

Now you want to name them, e.g., run1

names(x.all) <- paste('run', seq(length(x.all)), sep='')


To access, you can do    x.all$run1$Dom


To extract all the 'Dom's:

lapply(x.all, function(x) x$Dom)
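
A runnable sketch of the whole pattern, including pulling out every '$mean' as one vector for plotting. The three runs and their contents are invented stand-ins for the real measurements; only the component names 'Dom' and 'mean' come from the question.

```r
n.runs <- 3
x.all <- vector("list", n.runs)          # pre-allocate the outer list

for (i in seq_len(n.runs)) {
  v <- seq_len(i + 1)                    # stand-in for a real measurement vector
  x.all[[i]] <- list(Dom = v, mean = mean(v))
}
names(x.all) <- paste0("run", seq_along(x.all))

# sapply() simplifies the per-run scalars into a named numeric vector,
# the list analogue of taking one column of a matrix.
all.means <- sapply(x.all, function(r) r$mean)
all.means

# plot(all.means, type = "b")            # e.g. all the means in one plot
```

Because the 'Dom' vectors differ in length, lapply() (which always returns a list) is the safer choice for them, while sapply() suits single numbers such as 'mean'.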

HTH
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


Jan Wantia [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
12/13/2004 10:59
To: [EMAIL PROTECTED]
cc:
Subject: [R] lists within a list / data-structure problem

Dear all,

this is a rather basic question; I am not sure how to structure my data
well:
I want to extract various measures from my raw data. These measures are
of different sizes, so I decided to store them in a list, like:

run1 <- list(Dom = (my_vector), mean = (my_single_number))

I can do that in a for loop for 40 runs, ending up with 40 lists: run1,
run2, ..., run40.
To have all the measurements neatly together I thought of making another
list, containing 40 sub-lists:

> ALL <- list(run1, run2,..., run40)
> ALL
[[1]]
[[1]]$Dom
[1] my_vector

[[1]]$mean
[1] my_single_number


[[2]]
[[2]]$Dom
[1] my_vector

[[2]]$mean
[1] my_single_number

...

1) This may be a bit clumsy as I have to type all the sub-list's names
in by hand in order to produce my ALL-list: Is there a better way?

2) I have problems addressing the data now. I can easily access any
single value; for example, for the second component of the second
sub-list:

> ALL[[2]][[2]]
[1] my_single_number,

but: how could I get the second component of all sub-lists, to plot, for
example, all the $mean in one plot? For a matrix, mat[,2] would give me
the whole second column, but
ALL[[]][[2]]
does not return all the second components.

I feel that 'lapply' might help me here, but I could not figure out
exactly how to use it, and it always comes down to the problem of how to
correctly address the components in the sublists.

Or is there maybe a smarter way to do that instead of using a list of
lists?

Any hint would be warmly appreciated!

Jan
(R 2.0.1 on windows XP)

--

__

Jan Wantia
Deptartment of Informatics, University of Zurich
Andreasstr. 15
CH 8050 Zurich
Switzerland

Tel.:+41 (0) 1 635 4315
Fax: +41 (0) 1 635 45 07
email: [EMAIL PROTECTED]



Re: [R] How to duplicate rows in dataframe?

2004-12-13 Thread james . holtman




> x.1 <- data.frame(a=1:5, b=1:5)
> x.1
  a b
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
> x.1[c(1,2,2,2,3,3,4,4,5,4,3,2,1),]
    a b
1   1 1
2   2 2
2.1 2 2
2.2 2 2
3   3 3
3.1 3 3
4   4 4
4.1 4 4
5   5 5
4.2 4 4
3.2 3 3
2.3 2 2
1.1 1 1
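
In the question the rows are identified by the values in an 'index' column rather than by row position, so one extra step is needed. A minimal sketch, assuming a data frame 'df' with the four index values from the question and a made-up 'label' column: match() converts the wanted index values into row positions, which can then be used exactly as above.

```r
# Data frame keyed by an 'index' column ('label' is invented for illustration).
df <- data.frame(index = c(24, 13, 46, 32),
                 label = c("a", "b", "c", "d"),
                 stringsAsFactors = FALSE)

# Desired row order, with repeats.
x <- c(13, 32, 13, 24, 46, 24, 24)

rows <- match(x, df$index)   # position of each wanted index value in df
out  <- df[rows, ]           # one vectorised subset, no rbind() loop
out
```

If some value in 'x' is absent from 'df$index', match() returns NA for it and the corresponding output row is all NA, so it may be worth checking anyNA(rows) first.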

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


cstrato [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
12/13/2004 14:02
To: [EMAIL PROTECTED]
cc:
Subject: [R] How to duplicate rows in dataframe?

Dear all:

I have the following (simple?) problem:
Consider a dataframe where the first column contains
integers used as index, e.g.
index
 24
 13
 46
 32

Now I have the following vector used to sort the dataframe:
x <- c(13,24,32,46)
Sorting the dataframe can be done by using order.

However consider the following vector:
x <- c(13,32,13,24,46,24,24)
Now I want to get the dataframe in the order of the rows
defined in x, i.e. the dataframe contains duplicate rows.
One way to achieve this would be to use rbind in a for-loop.

My question is:
Is there an easier and - more important - faster way to
obtain the dataframe as defined in x?

Thank you in advance.
Best regards
Christian
_._._._._._._._._._._._._._._._
C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a   A.u.s.t.r.i.a
_._._._._._._._._._._._._._._._



[R] 'object.size' takes a long time to return a value

2004-12-12 Thread james . holtman




I was using 'object.size' to see how much memory a list was taking up.
After executing the command, I had thought that my computer had locked up.
After further testing, I determined that it was taking 241 seconds for
object.size to return a value.

I did notice in the release notes that 'object.size' did take longer when
the list contained character vectors.  Is the time that it is taking
'object.size' to return a value to be expected for such a list?

Much better results were obtained when the character vectors were converted
to factors.


##  Results from the testing  ###
> str(x.1)
List of 10
 $ : chr [1:227299] "sadc" "sar" "date" "ksh" ...
 $ : chr [1:227299] "aprperf" "aprperf" "aprperf" "aprperf" ...
 $ : num [1:227299] 23 23 0 23 23 0 0 0 0 23 ...
 $ : num [1:227299] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:227299] 3600 3600 0.01 3600 3600 0.01 0.01 0.01 0.01 3600 ...
 $ : num [1:227299] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ...
 $ : num [1:227299] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:227299] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ...
 $ : num [1:227299] 62608 6796829 10208 13128 ...
 $ : num [1:227299] 0 1 0 0 1 0 0 0 0 0 ...

# takes a long time (241 seconds) to report the size
> gc();system.time(print(object.size(x.1)))
          used (Mb) gc trigger  (Mb)
Ncells  711007 19.0    2235810  59.8
Vcells 5191294 39.7   14409257 110.0
[1] 34154972
[1] 241.07   0.00 241.08     NA     NA

# trying list of 1000
> x.2 <- list.subset(x.1, 1:1000);gc();system.time(print(object.size(x.2)))
          used (Mb) gc trigger  (Mb)
Ncells  711006 19.0    2235810  59.8
Vcells 4300288 32.9   14409257 110.0
[1] 145860
[1] 0.01 0.00 0.01   NA   NA

# trying list of 10,000
> x.2 <- list.subset(x.1, 1:10000);gc();system.time(print(object.size(x.2)))
          used (Mb) gc trigger  (Mb)
Ncells  711006 19.0    2235810  59.8
Vcells 4381288 33.5   14409257 110.0
[1] 1491948
[1] 0.28 0.00 0.28   NA   NA

# list of 40,000
> x.2 <- list.subset(x.1, 1:40000);gc();system.time(print(object.size(x.2)))
          used (Mb) gc trigger  (Mb)
Ncells  711006 19.0    2235810  59.8
Vcells 4651288 35.5   14409257 110.0
[1] 5988460
[1] 7.15 0.00 7.15   NA   NA

# list of 60,000
> x.2 <- list.subset(x.1, 1:60000);gc();system.time(print(object.size(x.2)))
          used (Mb) gc trigger  (Mb)
Ncells  711006 19.0    2235810  59.8
Vcells 4831288 36.9   14409257 110.0
[1] 9001556
[1] 17.33  0.00 17.32    NA    NA

# list of 100,000
> x.2 <- list.subset(x.1, 1:100000);gc();system.time(print(object.size(x.2)))
          used (Mb) gc trigger  (Mb)
Ncells  711006 19.0    2235810  59.8
Vcells 5191288 39.7   14409257 110.0
[1] 15044780
[1] 51.85  0.00 51.86    NA    NA

# list structure of the last object
> str(x.2)
List of 10
 $ : chr [1:100000] "sadc" "sar" "date" "ksh" ...
 $ : chr [1:100000] "aprperf" "aprperf" "aprperf" "aprperf" ...
 $ : num [1:100000] 23 23 0 23 23 0 0 0 0 23 ...
 $ : num [1:100000] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:100000] 3600 3600 0.01 3600 3600 0.01 0.01 0.01 0.01 3600 ...
 $ : num [1:100000] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ...
 $ : num [1:100000] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:100000] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ...
 $ : num [1:100000] 62608 6796829 10208 13128 ...
 $ : num [1:100000] 0 1 0 0 1 0 0 0 0 0 ...

# with the first two items on the list converted to factors,
# 'object.size' performs a lot better
> str(x.1)
List of 10
 $ : Factor w/ 175 levels "#bpbkar","#bpcd",..: 132 133 60 93 13 160 60 84 60 132 ...
 $ : Factor w/ 8 levels "apra3g","aprperf",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ : num [1:227299] 23 23 0 23 23 0 0 0 0 23 ...
 $ : num [1:227299] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:227299] 3600 3600 0.01 3600 3600 0.01 0.01 0.01 0.01 3600 ...
 $ : num [1:227299] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ...
 $ : num [1:227299] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:227299] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ...
 $ : num [1:227299] 62608 6796829 10208 13128 ...
 $ : num [1:227299] 0 1 0 0 1 0 0 0 0 0 ...
> system.time(print(object.size(x.1)))  # now it is fast
[1] 16374176
[1]  0  0  0 NA NA
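
A small sketch for repeating this comparison on other data or other R versions. The vector below is synthetic (four command names repeated at random), and the timings reported above were measured on R 2.0.1; there is no reason to expect the same numbers, or even the same ordering of sizes, on other builds.

```r
set.seed(1)

# Synthetic stand-in for one of the long character columns.
cmds <- sample(c("sadc", "sar", "date", "ksh"), 10000, replace = TRUE)

size.chr <- object.size(cmds)           # size as a character vector
size.fac <- object.size(factor(cmds))   # size after conversion to a factor

# How long does object.size() itself take on the character version?
timing <- system.time(object.size(cmds))

c(character = as.numeric(size.chr), factor = as.numeric(size.fac))
timing[["elapsed"]]
```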

> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.1
year     2004
month    11
day      15
language R

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


Re: [R] how can I get the coefficients of x^0, x^1, x^2, . , x^6 from expansion of (1+x+x^2)^3

2004-12-03 Thread james . holtman




Use the 'polynom' package:

> library(polynom)
> p <- as.polynomial(c(1,1,1))
> p
1 + x + x^2
> p^3
1 + 3*x + 6*x^2 + 7*x^3 + 6*x^4 + 3*x^5 + x^6
> unclass(p^3)
[1] 1 3 6 7 6 3 1
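
If installing a package is not an option, the same coefficients can be obtained in base R by multiplying coefficient vectors directly. This is a different technique from the 'polynom' answer above (a plain schoolbook multiplication), and 'poly_mul' is a hypothetical helper name; coefficients are stored lowest power first.

```r
# Multiply two polynomials given as coefficient vectors (x^0 first).
poly_mul <- function(a, b) {
  out <- numeric(length(a) + length(b) - 1)
  for (i in seq_along(a)) {
    idx <- i:(i + length(b) - 1)     # positions touched by a[i] * b
    out[idx] <- out[idx] + a[i] * b
  }
  out
}

p  <- c(1, 1, 1)                     # 1 + x + x^2
p3 <- poly_mul(poly_mul(p, p), p)    # (1 + x + x^2)^3
p3
```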

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


Peter Yang [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
12/03/2004 14:56
To: [EMAIL PROTECTED]
cc:
Subject: [R] how can I get the coefficients of x^0, x^1, x^2, . , x^6 from
expansion of (1+x+x^2)^3

Hi,

I would like to get the coefficients of x^0, x^1, x^2, ..., x^6 from the
expansion of (1+x+x^2)^3.

The result should be 1, 3, 6, 7, 6, 3, 1.

How can I calculate this in R?

Your help will be greatly appreciated.

Peter








Re: [R] scatterplot of 100000 points and pdf file format

2004-11-24 Thread james . holtman




Have you tried

plot(...,pch='.')

This uses the period as the plotting character instead of the default
circle.  This should reduce the size of the PDF file.

I have done scatter plots with 2M points and they are typically meaningless
with that many points overlaid.  Check out 'hexbin' on Bioconductor (you
can download the package from the RGUI window).  This is a much better way
of showing some information since it will plot the number of points that
are within a hexagon.  I have found this to be a better way of looking at
some data.
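
A quick way to check the effect on file size for your own data is to write the same plot twice and compare. This sketch uses synthetic data and temporary files; how much the size changes depends on the R version and on the PDF device settings, so no particular saving is promised here.

```r
set.seed(42)
n <- 20000
x <- rnorm(n)
y <- x + rnorm(n)

f.circle <- tempfile(fileext = ".pdf")
f.dot    <- tempfile(fileext = ".pdf")

# Same scatter plot, once with the default symbol, once with pch = ".".
pdf(f.circle); plot(x, y);            invisible(dev.off())
pdf(f.dot);    plot(x, y, pch = "."); invisible(dev.off())

sizes <- file.info(c(f.circle, f.dot))$size
names(sizes) <- c("circle", "dot")
sizes                                  # file sizes in bytes
```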
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


Witold Eryk Wolski [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
11/24/2004 10:34
To: R Help Mailing List [EMAIL PROTECTED]
cc:
Subject: [R] scatterplot of 100000 points and pdf file format

Hi,

I want to draw a scatter plot with 1M and more points and save it as pdf.
This makes the pdf file large.
So I tried to save the file first as png and then convert it to pdf.
This looks OK if printed, but if viewed e.g. with Acrobat as a document
figure the quality is bad.

Does anyone know a way to reduce the size but keep the quality?


/E

--
Dipl. bio-chem. Witold Eryk Wolski
MPI-Moleculare Genetic
Ihnestrasse 63-73 14195 Berlin
tel: 0049-30-83875219 __(_
http://www.molgen.mpg.de/~wolski  \__/'v'
http://r4proteomics.sourceforge.net||/   \
mail: [EMAIL PROTECTED]^^ m m
  [EMAIL PROTECTED]



Re: [R] timeDate

2004-11-23 Thread james . holtman




You might want to check out 'chron'.  This stores the time as days and
fractions of a day.

If you take the current date,

> as.numeric(chron(dates.="11/23/2004"))
[1] 12745


you get the value above.  If you change this to milliseconds, you get

> as.numeric(chron(dates.="11/23/2004")) * 86400 * 1000
[1] 1.101168e+12


This value requires 41 bits, and since a double-precision floating point
number has 53 bits of precision, it should be enough to give you
millisecond resolution and still maintain the 'date'.
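
The bit-count argument can be checked directly. The sketch below uses base R's POSIXct (seconds since 1970-01-01 stored as a double) rather than 'chron', so it is a variation on the suggestion above, not the same code; the extra 250 ms and the variable names are made up for illustration, and "%OS3" is the base-R format code for seconds with three decimals.

```r
# Milliseconds since 1970-01-01 for 2004-11-23 00:00:00.250 UTC.
ms <- 12745 * 86400 * 1000 + 250

# Bits needed to hold this integer exactly (a double offers 53).
bits.needed <- floor(log2(ms)) + 1
bits.needed

# Convert to a date-time; the fractional second is kept in the double.
stamp <- as.POSIXct(ms / 1000, origin = "1970-01-01", tz = "UTC")
format(stamp, "%Y-%m-%d %H:%M:%OS3")

# Round trip back to whole milliseconds.
ms.back <- round(as.numeric(stamp) * 1000)
```

Note that printing fractional seconds can show truncation artefacts for fractions that are not exactly representable in binary, so it is safer to do arithmetic on the numeric value than on the formatted string.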
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


Yasser El-Zein [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
11/23/2004 09:55
Please respond to Yasser El-Zein
To: [EMAIL PROTECTED]
cc:
Subject: Re: [R] timeDate

I am looking for up to the millisecond resolution. Is there a package
that has that?


On Mon, 22 Nov 2004 21:48:20 +0000 (UTC), Gabor Grothendieck
[EMAIL PROTECTED] wrote:
 Yasser El-Zein abu3ammar at gmail.com writes:

 
  From the document it is apparent to me that I need as.POSIXct  (I have
  a double representing the number of millis since 1/1/1970 and I need
  to construct a datetime object). I see it showing how to construct the
  time object from a string representing the time but not from a double
  of millis. Does anyone know how to do that?
 

 If by millis you mean milliseconds (i.e. one thousandth of a second)
 then POSIXct does not support that resolution, but if rounding to
 seconds is ok then

   structure(round(x/1000), class = c("POSIXt", "POSIXct"))

 should give it to you assuming x is the number of milliseconds.



Re: [R] How to extract data?

2004-11-23 Thread james . holtman




By 'ignore', can we delete those from the list of data?  I would then
assume that if you have a sequence of +0+0+ that you would want the last
+ for the increase of three.

If that is the case, then do a 'diff' and delete the entries that are 0.
Then create a new 'diff' and then use 'rle' to see what the length of the
sequences are:

> x <- c(1,2,2,3,3,4,3,3,2,2,2,1)
> x
 [1] 1 2 2 3 3 4 3 3 2 2 2 1
> x.d <- diff(x)
> x.d
 [1]  1  0  1  0  1 -1  0 -1  0  0 -1
> x.new <- x[c(x.d,1) != 0]
> x.new
[1] 1 2 3 4 3 2 1
> x.d1 <- diff(x.new)
> x.d1
[1]  1  1  1 -1 -1 -1
> rle(x.d1)
Run Length Encoding
  lengths: int [1:2] 3 3
  values : num [1:2] 1 -1


you can check the results of 'rle' to determine where the changes are.
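
One way the run lengths could be turned back into dates, sketched on the ten rows shown in the question. This is an interpretation, not the only reading of "already dropped twice": it reports every date on which a drop is at least the second in a row, and every date on which a rise is at least the third in a row. As a small variation on the code above, it keeps the first row of each flat stretch so that each move is dated when it happened; all helper names are illustrative.

```r
rate <- data.frame(
  DATE  = as.Date(c("1997-01-10", "1997-01-17", "1997-01-24", "1997-01-31",
                    "1997-02-07", "1997-02-14", "1997-02-21", "1997-02-28",
                    "1997-03-07", "1997-03-14")),
  VALUE = c(5.30, 5.30, 5.28, 5.30, 5.29, 5.26, 5.24, 5.26, 5.30, 5.30)
)

# Ignore "no change" steps: drop rows whose value equals the previous one.
keep <- c(TRUE, diff(rate$VALUE) != 0)
r <- rate[keep, ]

move   <- sign(diff(r$VALUE))                 # +1 = rise, -1 = drop
run.id <- cumsum(c(1, diff(move) != 0))       # label each run of equal moves
streak <- ave(move, run.id, FUN = seq_along)  # position within its run

end.date <- r$DATE[-1]                        # date on which each move happened
dropped.twice <- end.date[move == -1 & streak >= 2]
rose.thrice   <- end.date[move ==  1 & streak >= 3]

dropped.twice
rose.thrice
```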
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


ebashi [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
11/23/2004 15:54
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
cc:
Subject: [R] How to extract data?

I appreciate if anyone can help me,
I have a table as follow,
 rate
  DATE VALUE
1   1997-01-10  5.30
2   1997-01-17  5.30
3   1997-01-24  5.28
4   1997-01-31  5.30
5   1997-02-07  5.29
6   1997-02-14  5.26
7   1997-02-21  5.24
8   1997-02-28  5.26
9   1997-03-07  5.30
10  1997-03-14  5.30
... ...
... ...
... ...
I want to extract the DATE(s) on which the VALUE has
already dropped twice and the DATE(s) on which the VALUE has
already increased three times (ignoring cases where
VALUE(i+1)-VALUE(i)=0). I tried to use the diff() function;
however, that works only for one increase or decrease.

Sincerely,

Sean



Re: [R] RE : Create sequence for dataset

2004-11-21 Thread james . holtman




I think this might do it.

> x.1 <- data.frame(x=sample(1:3,20,T), y=sample(10:12,20,T))  # create test data
> x.1  # print it out
   x  y
1  2 11
2  3 11
3  2 10
4  1 12
5  3 11
6  1 10
7  3 10
8  1 11
9  1 12
10 1 11
11 1 12
12 1 12
13 2 11
14 3 11
15 3 10
16 3 10
17 2 12
18 2 10
19 3 11
20 2 11
# split the data by the numbers in 'x' (would be your 'anml_key')
# and add a column containing the sequence number
> x.s <- by(x.1, x.1$x, function(x){x$seq <- seq(along=x$x); x})
# the result in 'x.s' is a list and the rows have to be recombined (rbind)
# to form the result
> x.s  # print out the data
x.1$x: 1
   x  y seq
4  1 12   1
6  1 10   2
8  1 11   3
9  1 12   4
10 1 11   5
11 1 12   6
12 1 12   7

x.1$x: 2
   x  y seq
1  2 11   1
3  2 10   2
13 2 11   3
17 2 12   4
18 2 10   5
20 2 11   6

x.1$x: 3
   x  y seq
2  3 11   1
5  3 11   2
7  3 10   3
14 3 11   4
15 3 10   5
16 3 10   6
19 3 11   7
> do.call('rbind', x.s)  # bind the rows and print out the result
     x  y seq
1.4  1 12   1
1.6  1 10   2
1.8  1 11   3
1.9  1 12   4
1.10 1 11   5
1.11 1 12   6
1.12 1 12   7
2.1  2 11   1
2.3  2 10   2
2.13 2 11   3
2.17 2 12   4
2.18 2 10   5
2.20 2 11   6
3.2  3 11   1
3.5  3 11   2
3.7  3 10   3
3.14 3 11   4
3.15 3 10   5
3.16 3 10   6
3.19 3 11   7
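
An alternative sketch that adds the within-group counter in place, without splitting and re-binding, using base R's ave(). This is a different technique from the by()/rbind approach above; the small data frame is invented, with 'anml_key' taken from the SAS code in the question and 'ht' a made-up measurement column.

```r
x.1 <- data.frame(anml_key = c(2, 3, 2, 1, 3, 1),
                  ht       = c(11, 11, 10, 12, 11, 10))

# For each animal, number its records 1, 2, 3, ... in their current order.
# ave() applies the function within each group and puts the results back
# in the original row positions.
x.1$seq <- ave(seq_along(x.1$anml_key), x.1$anml_key, FUN = seq_along)
x.1
```

Unlike the rbind result, the rows stay in their original order; sort the data frame first if the counter should follow some other ordering, as the SAS 'by' statement assumes.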

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


[EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
11/21/2004 16:28
To: [EMAIL PROTECTED]
cc:
Subject: [R] RE : Create sequence for dataset

Dear members,

I want to create a sequence of numbers for the multiple records of
individual animal in my dataset. The SAS code below will do the trick, but
I want to learn to do it in R. Can anyone help ?

data htssn;
set htssn;
by anml_key;
if first.anml_key then do;
seq_ht_rslt=0;
end;
seq_ht_rslt+1;

Thanks in advance.

Stella


Re: [R] scan or source a text file into a list

2004-11-11 Thread james . holtman




Use an 'environment' to read in the values; e.g.,

> with(e1 <- new.env(), source('/tempxx.txt', local=T)) # read in the file to a new environment
> myList <- list()  # define empty list
> for (i in ls(e1)){  # process each element
+     myList[[i]] <- get(i, e1)
+ }
>
> ls(e1)  # show objects in the environment
[1] "fnr"    "nYears" "qe"     "year0"
> myList  # output my list
$fnr
[1] 0.3

$nYears
[1] 50

$qe
[1] 0.04

$year0
[1] 1970
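
A self-contained sketch of the same idea that can be run as is. It writes the four parameters from the question to a temporary file (the file itself is a stand-in for Params.R), sources the file into a fresh environment, and collects everything defined there with mget(); the names 'param.file' and 'params' are illustrative.

```r
# Stand-in for the parameter file described in the question.
param.file <- tempfile(fileext = ".R")
writeLines(c("# Number of years to simulate",
             "nYears = 50;",
             "year0 = 1970;",
             "fnr = 0.30;",
             "qe  = 0.040;"),
           param.file)

e1 <- new.env()
source(param.file, local = e1)       # evaluate the file inside e1 only

# Collect every object defined by the file into a named list.
params <- mget(ls(e1), envir = e1)
str(params)

# The list could then be passed on, e.g. CarlucR(inputParamList = params, ...)
```

Because nothing is created in the workspace, the number of parameters in the file can change from run to run without touching the calling code.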


__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


Andy Bunn [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
11/11/2004 10:57
To: R-Help [EMAIL PROTECTED]
cc:
Subject: [R] scan or source a text file into a list

I've ported somebody else's rather cumbersome Matlab model to R for
colleagues that want a free build of the model with the same type of I/O.

The Matlab model reads a text file with the initial parameters specified
as:

C:\Data\Carluc\Rport>more Params.R
# Number of years to simulate
nYears = 50;
# Initial year for graphing purposes
year0 = 1970;
# NPP/GPP ratio (cpp0 unitless)
fnr = 0.30;
# Quantum efficiency
qe  = 0.040;

That is, there are four input variables (for this run - there can be many
more) written in a way that R can understand them. In R, I can have the
model source the parameter text file easily enough and have the objects in
the workspace. The model function in R takes a list at runtime. How can I
have R read that file and put the contents into the list I need?

E.g.,
> rm(list = ls())
> source("Params.R")
> ls()
[1] "fnr"    "nYears" "qe"     "year0"
> fnr
[1] 0.3
> nYears
[1] 50
> foo.list <- list(fnr = fnr, nYears = nYears)
>
> foo.list
$fnr
[1] 0.3

$nYears
[1] 50


The model is then run with
 CarlucR(inputParamList = foo.list, ...)

I can't build inputParamList by hand as above because the number of
initial parameters changes with the model run and this runs in a wrapper.

Any thoughts? Some combination of paste with scan or parse?
-Andy


> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.0
year     2004
month    10
day      04
language R




[R] Error in PDF output in R 2.0.0

2004-10-29 Thread james . holtman




The following script works fine in R 1.9.1.  It was creating a PDF file
with the graphs in it.  In R 2.0.0, I got the error message below.  I tried
the same script just outputting to Windows and postscript and the output
was OK.  The error message only showed up when trying to create a PDF file.


> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.0
year     2004
month    10
day      04
language R

## output to windows -- OK
> print(xyplot(I(usr/sys) ~ time|factor(cpu), memIn,
+ panel=function(x,y)panel.xyplot(x,y,type='l')))
> print(xyplot(csw ~ time|factor(cpu), memIn,
+ panel=function(x,y)panel.xyplot(x,y,type='l')))

## output to postscript file -- OK
> postscript('out.ps')
> print(xyplot(I(usr/sys) ~ time|factor(cpu), memIn,
+ panel=function(x,y)panel.xyplot(x,y,type='l')))
> print(xyplot(csw ~ time|factor(cpu), memIn,
+ panel=function(x,y)panel.xyplot(x,y,type='l')))
> dev.off()
windows
      2

## output to PDF file  --  ERRORS
> pdf('out.pdf')
> print(xyplot(I(usr/sys) ~ time|factor(cpu), memIn,
+ panel=function(x,y)panel.xyplot(x,y,type='l')))
Error in "[<-"(`*tmp*`, pos.heights[[nm]], value = numeric(0)) :
        nothing to replace with
> print(xyplot(csw ~ time|factor(cpu), memIn,
+ panel=function(x,y)panel.xyplot(x,y,type='l')))
Error in "[<-"(`*tmp*`, pos.heights[[nm]], value = numeric(0)) :
        nothing to replace with
> dev.off()
windows
      2

## traceback on error
> traceback()
3: calculateGridLayout(x, rows.per.page, cols.per.page, number.of.cond,
       panel.height, panel.width, main, sub, xlab, ylab, x.alternating,
       y.alternating, x.relation.same, y.relation.same, xaxis.rot,
       yaxis.rot, xaxis.cex, yaxis.cex, par.strip.text, legend)
2: print.trellis(xyplot(csw ~ time | factor(cpu), memIn, panel = function(x,
       y) panel.xyplot(x, y, type = "l")))
1: print(xyplot(csw ~ time | factor(cpu), memIn, panel = function(x,
       y) panel.xyplot(x, y, type = "l")))

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


Re: [R] Error with repeat lines() in function

2004-09-24 Thread james . holtman





The problem was that you were not returning a value from the function
passed to 'apply'.  'apply' was trying to store the result of each call
into an array and there was no value.

See the line I added in your function.
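
When the function is called only for its side effects (here, drawing), an explicit loop over the rows avoids the question of return values altogether. A minimal sketch under assumptions: the helper 'draw_spans' and the 'start'/'end' columns are made up for illustration, and pdf(NULL) is used only so the example runs without opening a window or writing a file.

```r
# Draw one horizontal segment per row of 'spans' at height 'y'.
# Called for its side effect; returns the number of segments invisibly.
draw_spans <- function(spans, y = 0, ...) {
  for (i in seq_len(nrow(spans))) {
    lines(x = c(spans[i, "start"], spans[i, "end"]), y = c(y, y), ...)
  }
  invisible(nrow(spans))
}

pdf(NULL)                                  # graphics device with no output file
plot(c(0, 100), c(-1, 1), type = "n", xlab = "", ylab = "")
spans <- cbind(start = c(10, 40), end = c(30, 90))
n.drawn <- draw_spans(spans, y = 0, lwd = 2, col = "red")
invisible(dev.off())
n.drawn
```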
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


Sean Davis [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
09/24/2004 13:09
To: Uwe Ligges [EMAIL PROTECTED]
cc: r-help [EMAIL PROTECTED]
Subject: Re: [R] Error with repeat lines() in function

Here is an example that seems to reproduce the error:

rf1 <- matrix(sort(abs(round(runif(4)*100))),nrow=1)
annot1 <- sort(abs(round(runif(193)*100)))
annot2 <- annot1 + 70
annot3 <- cbind(annot1,annot2)
rat2 <- rnorm(193)
rat1 <- rnorm(193)
plotter <-
function(annot,rat1,rat2,rf1,...) {
    par(las=2)
    xmax <- max(annot[,2])
    xmin <- min(annot[,1])
    par(mfrow=c(2,1))
    plot(annot[,1],rat1,type="l",xlab="",ylab="log2 Ratio",...)
    points(annot[,1],rat1)
    apply(rf1,1,function(z) {
      if (z[4]=="+") {
        color <- 'green'
        yoffset=1
      } else {
        color <- 'red'
        yoffset=-1
      }

      lines(list(x=c(z[1],z[4]),y=c(-2-yoffset/10,-2-yoffset/10)),lwd=2,col=color)

      lines(list(x=c(z[2],z[3]),y=c(-2-yoffset/10,-2-yoffset/10)),lwd=4,col=color)

      1 #  fake a return value

    })
    abline(h=0,lty=2)
  }
plotter(annot3,rat1,rat2,rf1)
Error in ans[[1]] : subscript out of bounds

Enter a frame number, or 0 to exit
1:plotter(annot3, rat1, rat2, rf1)
2:apply(rf1, 1, function(z) {
Selection: 0

On Sep 24, 2004, at 12:05 PM, Uwe Ligges wrote:

 Sean Davis wrote:

 I have a function that does some plotting.  I then add lines to the
 plot.  If executed one line at a time, there is not a problem.  If I
 execute the function, though, I get:
 Error in ans[[1]] : subscript out of bounds
 This always occurs after the second lines command, and doesn't happen
  with all of my data points (some do not have errors).  Any ideas?

 Please give an example how to produce the error,
 i.e. specify a very small toy example (including generated data and
 the call to your function).
 Many people on this list are quite busy these days and don't want to
 think about how to call your function and invent an example ...

 Uwe Ligges



 Thanks,
 Sean
  function(x,annot,rat1,rat2,rf,...) {
    par(las=2)
    wh <- which(annot[,5]==x)
    xmax <- max(annot[wh,4])
    xmin <- min(annot[wh,3])
    chr <- annot[wh,2][1]
    wh.rf <- rf$chrom==as.character(chr) & rf$txStart>xmin &
      rf$txEnd<xmax
    par(mfrow=c(2,1))
    plot(annot[wh,3],rat1[wh],type="l",xlab="",ylab="log2 Ratio",main=x,...)
    points(annot[wh,3],rat1[wh])
    apply(rf[wh.rf,],1,function(z) {
      browser()
      if (z[4]=="+") {
        color <- 'green'
        yoffset=1
      } else {
        color <- 'red'
        yoffset=-1
      }
      lines(list(x=c(z[5],z[6]),y=c(-2-yoffset/10,-2-yoffset/10)),lwd=2,col=color)
      lines(list(x=c(z[5],z[6]),y=c(-2-yoffset/10,-2-yoffset/10)),lwd=2,col=color)
    })
    abline(h=0,lty=2)
  }

Re: [R] Unique lists from a list

2004-09-01 Thread james . holtman




Try this:

l.1 <- list(list(name='a', addr='123'), list(name='b', addr='234'),
            list(name='b', addr='234'), list(name='a', addr='123'))  # create a list


l.names <- unlist(lapply(l.1, '[[', 'name'))  # get the 'name'
l.u <- unique(l.names)  # make unique

new.list <- l.1[match(l.u, l.names)]  # create new list with just one entry per 'name'
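
The question describes a slightly different layout: one list holding two parallel character vectors. A sketch for that layout, assuming the components are called 'Name' and 'Address' as in the question and using invented values: duplicated() flags the repeats once, and the same logical index is applied to every component so the vectors stay aligned.

```r
contacts <- list(Name    = c("a", "b", "b", "a", "c"),
                 Address = c("123", "234", "234", "123", "345"))

keep <- !duplicated(contacts$Name)      # TRUE for the first time a name appears

# Apply the same filter to every component of the list.
unique.contacts <- lapply(contacts, function(v) v[keep])
unique.contacts
```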

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


michael watson (IAH-C) [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
09/01/2004 10:31
To: [EMAIL PROTECTED]
cc:
Subject: [R] Unique lists from a list

Hi

I have a list.  Two of the elements of this list are Name and
Address, both of which are character vectors.  Name and Address are
linked, so that the same Name always associates with the same
Address.

What I want to do is pull out the unique values, as a new list of the
same format (ie two elements of character vectors).  Now I've worked out
that unique(list$Name) will give me a list of the unique names, but how
do I then go and link those to the correct (unique) addresses so I end
up with a new list which is the same format as the rest, but now unique?

Cheers
Mick



Re: [R] naive question

2004-06-30 Thread james . holtman




It is amazing the amount of time that has been spent on this issue.  In
most cases, if you do some timing studies using 'scan', you will find that
you can read some quite large data structures in a reasonable time.  If your
initial concern was having to wait 10 minutes to have your data read in,
you could have read in quite a few data sets by now.

When comparing speeds/feeds of processors, you also have to consider what
is being done on them.  Back in the dark ages we had a 1 MIPS computer
with 4M of memory handling input from 200 users on a transaction system.
Today I need a 1GHz computer with 512M just to handle me.  Now true, I am
doing a lot of different processing on it.

With respect to I/O, you have to consider what is being read in and how it
is converted.  Each system/program has different requirements.  I have some
applications (running on a laptop) that can read in approximately 100K rows
of data per second (of course they are already binary).  On the other hand,
I can easily slow that down to 1K rows per second if I do not specify the
correct parameters to 'read.table'.

So go back and take a look at what you are doing, and instrument your code
to see where time is being spent.  The nice thing about R is that there are
a number of ways of approaching a solution, and if you don't like the timing
of one way, try another.  That is half the fun of using R.
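
As a rough illustration of instrumenting the read step, the sketch below times
three ways of reading the same file with system.time().  The file is a made-up
10,000 x 5 numeric table written to a temporary location; actual timings
depend on the machine and the data, so none are quoted here.

```r
# Write a small hypothetical test file: 10,000 rows of 5 numeric columns.
tmp <- tempfile(fileext = ".txt")
write.table(matrix(runif(50000), ncol = 5), tmp,
            row.names = FALSE, col.names = FALSE)

# 1. read.table, left to guess the column types
t.guess <- system.time(d1 <- read.table(tmp))

# 2. read.table, told the column types and the row count up front
t.typed <- system.time(
  d2 <- read.table(tmp, colClasses = "numeric", nrows = 10000,
                   comment.char = ""))

# 3. scan, reading straight into a numeric vector, then reshaped by row
t.scan <- system.time(
  d3 <- matrix(scan(tmp, what = 0, quiet = TRUE), ncol = 5, byrow = TRUE))

# user, system and elapsed seconds for each approach
print(rbind(guess = t.guess, typed = t.typed, scan = t.scan)[, 1:3])
unlink(tmp)
```

On a file this small the differences may be negligible; the same harness can
be pointed at a real data file to see which parameters matter.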
__
James HoltmanWhat is the problem you are trying to solve?
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


   

From:     [EMAIL PROTECTED]
Sent by:  [EMAIL PROTECTED]
To:       [EMAIL PROTECTED]
cc:       [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Date:     06/30/2004 16:25
Subject:  Re: [R] naive question

 [EMAIL PROTECTED] writes:

 I did not use R ten years ago, but reasonable RAM amounts have
 multiplied by roughly a factor of 10 (from 128Mb to 1Gb), CPU speeds
 have gone up by a factor of 30 (from 90Mhz to 3Ghz), and disk space
 availability has gone up probably by a factor of 10. So, unless the I/O
 performance scales nonlinearly with size (a bit strange but not
 inconsistent with my R experiments), I would think that things should
 have gotten faster (by the wall clock, not slower). Of course, it is
 possible that the other components of the R system have been worked on
 more -- I am not equipped to comment...

 I think your RAM calculation is a bit off. in late 1993, 4MB systems
 were the standard PC, with 16 or 32 MB on high-end workstations.

I beg to differ. In 1989, Mac II came standard with 8MB, NeXT came
standard with 16MB. By 1994, 16MB was pretty much standard on good quality
(= Pentium, of which the 90Mhz was the first example) PCs, with 32Mb
pretty common (though I suspect that most R/S-Plus users were on SUNs,
which were somewhat more plushly equipped).

 Comparable figures today are probably  256MB for the entry-level PC and
 a couple GB in the high end. So that's more like a factor of 64. On the
 other hand, CPU's have changed by more than the clock speed; in
 particular, the number of clock cycles per FP calculation has
 decreased considerably and is currently less than one in some
 circumstances.

I think that FP performance has increased more than integer performance,
which has pretty much kept pace with the clock speed. The compilers have
also improved a bit...

  Igor



Re: [R] binding rows from different matrices

2004-06-29 Thread james . holtman




Try:

> veca=matrix(1:25,5,5)
> vecb=matrix(letters[1:25],5,5)
> vecc=matrix(LETTERS[1:25],5,5)
> x.1 <- lapply(1:5,function(x)rbind(veca[x,],vecb[x,],vecc[x,]))
> do.call('rbind',x.1)
      [,1] [,2] [,3] [,4] [,5]
 [1,] "1"  "6"  "11" "16" "21"
 [2,] "a"  "f"  "k"  "p"  "u"
 [3,] "A"  "F"  "K"  "P"  "U"
 [4,] "2"  "7"  "12" "17" "22"
 [5,] "b"  "g"  "l"  "q"  "v"
 [6,] "B"  "G"  "L"  "Q"  "V"
 [7,] "3"  "8"  "13" "18" "23"
 [8,] "c"  "h"  "m"  "r"  "w"
 [9,] "C"  "H"  "M"  "R"  "W"
[10,] "4"  "9"  "14" "19" "24"
[11,] "d"  "i"  "n"  "s"  "x"
[12,] "D"  "I"  "N"  "S"  "X"
[13,] "5"  "10" "15" "20" "25"
[14,] "e"  "j"  "o"  "t"  "y"
[15,] "E"  "J"  "O"  "T"  "Y"
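
Another possible route, sketched here as an alternative and not as the method
above, is to stack the three matrices once and then reorder the rows.  The
index rep(1:5, 3) labels each stacked row with its original row number, and
order() (which keeps ties in their existing order) brings rows with the same
label together:

```r
veca <- matrix(1:25, 5, 5)
vecb <- matrix(letters[1:25], 5, 5)
vecc <- matrix(LETTERS[1:25], 5, 5)

stacked <- rbind(veca, vecb, vecc)          # rows 1-5 veca, 6-10 vecb, 11-15 vecc
interleaved <- stacked[order(rep(1:5, 3)), ]  # row 1 of each, then row 2 of each, ...

# the lapply/do.call version, for comparison
x.1 <- lapply(1:5, function(x) rbind(veca[x, ], vecb[x, ], vecc[x, ]))
identical(interleaved, do.call("rbind", x.1))
```

This avoids the loop entirely, at the cost of being a little less obvious to
read.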

__
James HoltmanWhat is the problem you are trying to solve?
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


   

From:     Stephane DRAY [EMAIL PROTECTED]
Sent by:  [EMAIL PROTECTED]
To:       [EMAIL PROTECTED]
Date:     06/29/2004 11:00
Subject:  [R] binding rows from different matrices

Hello list,
I have 3 matrices with same dimension :
  veca=matrix(1:25,5,5)
  vecb=matrix(letters[1:25],5,5)
  vecc=matrix(LETTERS[1:25],5,5)

I would like to obtain a new matrix composed of alternating rows of these
different matrices (row 1 of mat 1, row 1 of mat 2, row 1 of mat 3, row 2
of mat 1, ...)

I have found a solution but it is not very pretty, and I wonder if
I can do it another way (perhaps with apply)?

> res=matrix(0,1,5)
> for(i in 1:5)
+ res=rbind(res,veca[i,],vecb[i,],vecc[i,])
> res=res[-1,]
> res
      [,1] [,2] [,3] [,4] [,5]
 [1,] "1"  "6"  "11" "16" "21"
 [2,] "a"  "f"  "k"  "p"  "u"
 [3,] "A"  "F"  "K"  "P"  "U"
 [4,] "2"  "7"  "12" "17" "22"
 [5,] "b"  "g"  "l"  "q"  "v"
 [6,] "B"  "G"  "L"  "Q"  "V"
 [7,] "3"  "8"  "13" "18" "23"
 [8,] "c"  "h"  "m"  "r"  "w"
 [9,] "C"  "H"  "M"  "R"  "W"
[10,] "4"  "9"  "14" "19" "24"
[11,] "d"  "i"  "n"  "s"  "x"
[12,] "D"  "I"  "N"  "S"  "X"
[13,] "5"  "10" "15" "20" "25"
[14,] "e"  "j"  "o"  "t"  "y"
[15,] "E"  "J"  "O"  "T"  "Y"
 

Thanks in advance !

Stéphane DRAY
--


Département des Sciences Biologiques
Université de Montréal, C.P. 6128, succursale centre-ville
Montréal, Québec H3C 3J7, Canada

Tel : 514 343 6111 poste 1233
E-mail : [EMAIL PROTECTED]
--


Web
http://www.steph280.freesurf.fr/



Re: [R] Specifying suitable PC to run R

2003-10-09 Thread james . holtman

If you are running Windows, do you have the Performance Monitor running?
This will help identify the reasons that programs are running slow.  Most
likely, you are low on memory and are paging a lot.  I always have it
running and when I am running a large R script, if I am not using 100% of
the CPU, then I must be paging (assuming that I am not reading in my data).
You can also sprinkle the following function throughout your code to see
how much CPU and memory you are using.  I bracket all my major
computational sections with it:

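my.stats <- function(text = "stats")
{
    # print the caller, the user/system/elapsed times and the memory in use
    # (memory.size() is available on Windows only)
    cat(text, "-", sys.call(sys.parent())[[1]], ":", proc.time()[1:3], " : ",
        round(memory.size()/2.^20., 1.), "MB\n")
    invisible(flush.console())
}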

This prints out a message like:

> my.stats('Begin Reading')
Begin Reading - my.stats : 5.61 3.77 22309.67  :  18.7 MB

This says that I have used 5.61 CPU seconds of 'user' time, 3.77 CPU
seconds of 'system' time and the R session has been running for 22309
seconds (I always have one waiting for simple calculation) and I have
18.7MB of memory allocated to objects.
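
memory.size() exists only on Windows builds of R.  A possible portable
variant, offered as an assumption and not as the function above, takes the
memory figure from gc() instead.  The name my.stats2 is hypothetical, and
note that every call to gc() triggers a garbage collection, so this should
bracket major sections only, not sit inside tight loops:

```r
my.stats2 <- function(text = "stats") {
  times <- unclass(proc.time())[1:3]   # user, system and elapsed seconds
  mb <- sum(gc()[, 2])                 # Mb currently used by cells and vectors
  cat(text, ":", round(times, 2), ":", round(mb, 1), "MB\n")
  invisible(c(times, MB = mb))         # also return the figures for logging
}

res <- my.stats2("Begin Reading")
```

Returning the numbers invisibly makes it easy to collect them in a list and
compare sections after the run.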

My first choice is to get as much memory on your machine as you can: 1GB,
since this is the most that R can use.  I noticed a big difference in
upgrading from 256M to 512M.  I also watch the Performance Monitor and when
memory gets low and I want to run a large job, I restart R.  Most of my
scripts are set up to run R without saving any data in the .Rdata file.  If I
need to save a large object, I do it explicitly, since memory is the key
performance-limiting factor and Windows is not that good at freeing up memory
after you have used a lot of it.

A faster CPU will also help, but it would be the second choice, since if
you are paging, most of your time is spent on data transfer and not
computation.
__
James HoltmanWhat is the problem you are trying to solve?
Executive Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
(513) 723-2929


   

From:     Michael Dewey [EMAIL PROTECTED]
Sent by:  [EMAIL PROTECTED]
To:       [EMAIL PROTECTED]
Date:     10/09/2003 14:04
Subject:  [R] Specifying suitable PC to run R

If I am buying a PC where the most compute intensive task will be running R
and I do not have unlimited resources what trade-offs should I make?
Specifically should I go for
1 - more memory, or
2 - faster processor, or
3 - something else?
If it makes a difference I shall be running Windows on it and I am thinking
about getting a portable which I understand makes upgrading more difficult.

Extra background: the tasks I notice going slowly at the moment are fitting
models with lme which have complex random effects and bootstrapping. By the
standards of r-help posters I have small datasets (few thousand cases, few
hundred variables). In order to facilitate working with colleagues I need
to stick with windows even if linux would be more efficient


Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html



Re: [R] timezones

2003-08-03 Thread james . holtman

Part of the problem is that 'now' is POSIXct and 'now.gmt' is POSIXlt.  If
you use as.POSIXct, you get the right answer.

> (now <- Sys.time())
[1] "2003-08-03 18:29:38 EDT"
> str(now)
`POSIXct', format: chr "2003-08-03 18:29:38"
> (now.gmt <- as.POSIXlt(now,tz="GMT"))
[1] "2003-08-03 22:29:38 GMT"
> str(now.gmt)
`POSIXlt', format: chr "2003-08-03 22:29:38"
> (now.gmt <- as.POSIXct(now,tz="GMT"))
[1] "2003-08-03 18:29:38 EDT"
> str(now.gmt)
`POSIXct', format: chr "2003-08-03 18:29:38"
> now-now.gmt
Time difference of 0 secs
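
The underlying point can be sketched as follows: a POSIXct value is a count of
seconds since the epoch, so the timezone only changes how it is displayed.
The fixed timestamp and the zone name "America/New_York" are assumptions
chosen to mirror the session above (and require that the system's timezone
database knows that name):

```r
now <- as.POSIXct("2003-08-03 18:29:38", tz = "America/New_York")

now.lt <- as.POSIXlt(now, tz = "GMT")   # broken-down fields, expressed in GMT
now.ct <- as.POSIXct(now.lt)            # back to seconds since the epoch

# same instant, so the difference is zero whatever zone each one prints in
delta <- difftime(now, now.ct, units = "secs")
print(delta)

# to *display* a POSIXct in another zone, format it rather than convert it
print(format(now, tz = "GMT", usetz = TRUE))
```

Keeping both operands as POSIXct, and using explicit units in difftime(),
avoids mixing the two classes in the subtraction.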

__
James HoltmanWhat is the problem you are trying to solve?
Executive Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
(513) 723-2929


   

From:     Jerome Asselin [EMAIL PROTECTED]
Sent by:  [EMAIL PROTECTED]
To:       [EMAIL PROTECTED], [EMAIL PROTECTED]
Date:     07/31/2003 12:30
Subject:  Re: [R] timezones

I share your concerns regarding Problems 1 and 2. However, I am unable to
provide help on those at this moment.

As for Problem 3, an alternative for the time being would be to use
another package such as chron or date, although it would be preferable to
use the classes of the base package if possible.
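
One option within base R, assuming a version that provides the Date class, is
to keep such values as Date objects, which carry no time of day and no
timezone.  A small sketch using the two dates from Problem 2 quoted below:

```r
d1 <- as.Date("2003-06-29")
d0 <- as.Date("1899-12-30")

gap <- d1 - d0                 # a difftime measured in whole days
n.days <- as.numeric(gap)      # plain number of days since the origin
print(gap)

# date arithmetic stays in days, with no seconds-per-day constants
next.week <- d1 + 7
print(format(next.week, "%Y-%m-%d"))
```

Because no timezone is involved, the day count is a whole number regardless of
the local zone or daylight saving rules.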

Sorry I can't be more helpful.

Jerome

On July 30, 2003 09:19 pm, Gabor Grothendieck wrote:
 I have some questions and comments on timezones.

 Problem 1.

 # get current time in current time zone
 > (now <- Sys.time())
 [1] "2003-07-29 18:23:58 Eastern Daylight Time"

 # convert this to GMT
 > (now.gmt <- as.POSIXlt(now,tz="GMT"))
 [1] "2003-07-29 22:23:58 GMT"

 # take difference
 > now-now.gmt
 Time difference of -5 hours

 Note that the difference between the times displayed by the first two
 R expressions is -4 hours.  Why does the last expression return
 -5 hours?


 Problem 2.  Why do the two expressions below give different answers?
 I take the difference between two dates in GMT and then repeat it in the
 current time zone (EDT).

 # days since origin in GMT
 > julian(as.POSIXct("2003-06-29",tz="GMT"),origin=as.POSIXct("1899-12-30",tz="GMT"))
 Time difference of 37801 days

 # days since origin in current timezone
 > julian(as.POSIXct("2003-06-29"),origin=as.POSIXct("1899-12-30"))
 Time difference of 37800.96 days

 I thought this might be daylight savings time related but even with
 standard time I get:

 > julian(as.POSIXct("2003-06-29",tz="EST"),origin=as.POSIXct("1899-12-30",tz="EST"))
 Time difference of 37800.96 days


 Problem 3. What is the general strategy of dealing with dates, as
 opposed to datetimes, in R?

 I have had so many problems and a great deal of frustration, mostly
 related to timezones.

 The basic problem is that various aspects of the date such as the year,
 the month, the day of the month, the day of the week can be different
 depending on the timezone you use.  This is highly undesirable since
 I am not dealing with anything more granular than a day yet timezones,
 which are completely extraneous to dates and by all rights should not
 have to enter into my problems, keep fouling me up.

 A lesser problem is that I find myself using irrelevant constants such
 as the number of seconds in a day which you would think would be
 something I would not have to deal with since I am concerned with daily
 data.

 With R having nifty object oriented features I think a good project would
 be to implement a class in the base that handled dates without times
 (or timezones!)

 P.S. I have sent an earlier version of this but did not see it posted
 so if both get posted please ignore the prior one since this one has
 more info in it.


[R] Problem reading a PDF output

2003-06-19 Thread james . holtman
I generated a PDF output file of 10 plots.  When I try to view it with
Adobe Reader (R4 & R5), it locks up the reader (which then consumes 100% of
the CPU) after presenting the 4th plot.  I can generate the plots just fine
in Windows and as a postscript file read with GSview.

Is there any way to tell what might be wrong with the PDF output?  The file
is 890KB in size if anyone would like to look at it.  The postscript file
is 783KB in size.

I was able to isolate it to a single plot in a PDF file and it had 38,000
lines of the following that composed 99% of the file:  (I was plotting out
individual events, which were about that many)

86.66 88.67 m
86.66 92.81 l
86.66 96.95 l
86.66 101.09 l
86.66 105.23 l
86.66 109.37 l
86.66 113.51 l
:  38,000 more of the same
388.87 96.95 l
388.87 92.81 l
388.97 88.67 l
389.07 84.53 l
S
Q q
0.000 0.000 0.000 RG
0.75 w
[] 0 d

Is this breaking some limit in PDF?

I am running:

platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    1
minor    7.1
year     2003
month    06
day      16
language R
__
James Holtman   What is the problem you are trying to solve?
Executive Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
(513) 723-2929



Re: [R] Convert char vector to numeric table

2003-03-31 Thread james . holtman

use 'textConnection':

> x.1 <- c('1 2 3','4 5 6','7 8 9','8 7 6','6 5 4')   # create character vector
> x.in <- textConnection(x.1) # setup connection
> x.data <- read.table(x.in)  # read in the character vector
> x.data
  V1 V2 V3
1  1  2  3
2  4  5  6
3  7  8  9
4  8  7  6
5  6  5  4
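
In versions of R where read.table() accepts a text argument, the connection
can be left implicit.  A minimal sketch, assuming such a version and reusing
the column names A, B, C from the question below:

```r
x.1 <- c('1 2 3', '4 5 6', '7 8 9', '8 7 6', '6 5 4')

# read.table opens and closes its own text connection on the vector
x.data <- read.table(text = x.1, col.names = c("A", "B", "C"))
print(x.data)
```

With an explicit textConnection() it is worth calling close() on it once the
data are read, so that the connection is not left open.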




   

From:     Nurnberg-LaZerte [EMAIL PROTECTED]
Sent by:  [EMAIL PROTECTED]
To:       R's help mailing list [EMAIL PROTECTED]
Date:     03/31/03 17:09
Subject:  [R] Convert char vector to numeric table
Please respond to Nurnberg-LaZerte

I'm a great fan of read.table(), but this time the data had a lot of cruft.
So I used readLines() and edited the char vector to eventually get
something like this:
 23.4   1.5   4.2
 19.1   2.2   4.1
and so on. To get that into a 3 col numeric table, I first just used:

writeLines(data,tempfile)
read.table(tempfile,col.names=c("A","B","C"))

Works fine, but writing to a temporary file seems ... inelegant?  And
read.table() doesn't take a char vector as a file or connection argument.
The following works but it seems like a lot of code:

data <- sub(" +","",data)     # remove leading blanks for strsplit
data <- strsplit(data," +")   # strsplit returns a list of char vectors
ndata <- character(0)         # vectorize the list of char vectors
for (ii in 1:length(data)) ndata <- c(ndata,data[[ii]])
ndata <- as.numeric(ndata)
dim(ndata) <- c(3,length(data))
data <- t(ndata)
data.frame(A=data[,1],B=data[,2],C=data[,3])

Am I missing something?

Thanks,
Bruce L.





--
NOTICE:  The information contained in this electronic mail transmission is
intended by Convergys Corporation for the use of the named individual or
entity to which it is directed and may contain information that is
privileged or otherwise confidential.  If you have received this electronic
mail transmission in error, please delete it from your system without
copying or forwarding it, and notify the sender of the error by reply email
or by telephone (collect), so that the sender's address records can be
corrected.



Re: [R] overlapping pattern match (errata 2.0)

2003-03-29 Thread james . holtman

Another way to find all the multiple occurrences of a character in a string
is to use 'rle':

> x.s <- 'aaabbcdeeeffffggiijjysbbddeffghjjjsdkkkkk'
> x <- unlist(strsplit(x.s, NULL))
> x
 [1] "a" "a" "a" "b" "b" "c" "d" "e" "e" "e" "f" "f" "f" "f" "g" "g" "i" "i" "j"
[20] "j" "y" "s" "b" "b" "d" "d" "e" "f" "f" "g" "h" "j" "j" "j" "s" "d" "k" "k"
[39] "k" "k" "k"
> rle(x)
Run Length Encoding
  lengths: int [1:21] 3 2 1 1 3 4 2 2 2 1 ...
  values : chr [1:21] "a" "b" "c" "d" "e" "f" "g" "i" "j" "y" "s" "b" "d" "e" "f" "g" ...


When the lengths are greater than 1, the corresponding 'values' are the
repeated characters.
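
A short sketch of how the rle() result might be used, applied to the smaller
string "aaacdf" from the message quoted below.  The variable names are
illustrative only; the count relies on the fact that a run of n identical
characters contains n - 1 overlapping pairs:

```r
x.s <- "aaacdf"
r <- rle(strsplit(x.s, NULL)[[1]])

# characters that occur in a run of two or more
repeated <- r$values[r$lengths > 1]

# overlapping occurrences of "aa": each run of n a's contributes n - 1
n.aa <- sum(pmax(r$lengths[r$values == "a"] - 1, 0))

print(repeated)
print(n.aa)
```

This gives the same count as pasting adjacent characters together and
grepping for "aa", without building the intermediate vector of pairs.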




   

From:     FMGCFMGC [EMAIL PROTECTED]
Sent by:  [EMAIL PROTECTED]
To:       [EMAIL PROTECTED]
cc:       [EMAIL PROTECTED]
Date:     03/28/03 17:36
Subject:  Re: [R] overlapping pattern match (errata 2.0)

well! excuse me again but...

your.string <- "aaacdf"
nc1 <- nchar(your.string)-1
x <- unlist(strsplit(your.string, NULL))  # CORRECT
x2 <- c()
for (i in 1:nc1)
x2 <- c(x2, paste(x[i], x[i+1], sep=""))  # ERRATA 2
cat("occurrences of aa in your.string: ", length(grep("aa", x2)),
sep="", fill=TRUE)

Fran

PS: sorry again
