date:20100308

Hi

r-help-boun...@r-project.org wrote on 08.03.2010 05:23:25:

 Hello. I am new to R, and am typing some homework for my undergrad Analysis
 class. I am trying to graph the following function in R: f(x) = x^2 for
 x >= 0, and f(x) = 0 for x < 0. How do I do this in R?

I would probably use the curve() function.
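
A minimal sketch of that suggestion, assuming base R only. The function name f and the plotting range are illustrative choices, not part of the original post:

```r
# Piecewise function: f(x) = x^2 for x >= 0 and f(x) = 0 for x < 0.
# pmax(x, 0) replaces negative inputs by 0, so squaring it covers both branches.
f <- function(x) pmax(x, 0)^2

# curve() evaluates f on a grid over [from, to] and draws the result.
curve(f, from = -2, to = 4, ylab = "f(x)")
```

Because pmax() is vectorized, f can be passed to curve() directly, which calls it with a whole vector of x values at once.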

Regards
Petr


 
 Thanks for the help.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] barplot with factors problem

2010-03-08 Thread Jim Lemon


On 03/08/2010 04:48 AM, casperyc wrote:


http://n4.nabble.com/file/n1583733/100307070476876317b486a941.jpg

I want to get a histogram by factors.


Hi casperyc,
Have a look at the third example for the barp function in the plotrix 
package.


Jim


Re: [R] asdate parsing


Thanks Gabor - sprintf did the trick

[R] Lattice: barchart, error bars and grouped data

2010-03-08 Thread Johannes Graumann

Hi,

How can I, given the code snippet below, draw the error bars in the center 
of each grouped bar rather than in the center of the group?

Thanks for any hints,

Joh

library(lattice)

barley[["SD"]] <- 5
barchart(
  yield ~ variety | site,
  data = barley,
  groups = year,
  origin = 0,
  lowDev = barley[["SD"]],
  highDev = barley[["SD"]],
  panel = function(x, y, ..., lowDev, highDev) {
    panel.barchart(x, y, ...)
    panel.segments(
      as.numeric(x),
      as.numeric(y) - lowDev,
      as.numeric(x),
      as.numeric(y) + highDev,
      col = 'red', lwd = 2,
      ...)
  }
)


Re: [R] Redhat Linux Install

2010-03-08 Thread Paul Hiemstra

Marc Schwartz wrote:

On Mar 5, 2010, at 3:45 PM, Ryan Garner wrote:

I just installed R on Redhat Linux at work for the first time and have two
questions.

1. I tried to install R to have png and cairo capabilities and was
unsuccessful. Before running make, I ran ./configure --with-libpng=yes
--with-x=no --with-cairo=yes --with-readline-yes . R installed fine, but
when I run R and type capabilities()

capabilities()

    jpeg      png     tiff    tcltk      X11     aqua http/ftp  sockets
    TRUE     TRUE     TRUE     TRUE    FALSE    FALSE     TRUE     TRUE
  libxml     fifo   cledit    iconv      NLS  profmem    cairo
    TRUE    FALSE     TRUE     TRUE     TRUE     TRUE    FALSE

Why are png and cairo still FALSE?

2. I would also like to have X11 enabled. From reading the message board,
the consensus seems to be to install xorg-dev. I'm unable to do this because
I don't have root or super user privileges. But if I'm able to log into my
work servers with PuTTY and Xming and run xemacs or xvim, does this mean
that X11 is already installed somewhere? If so, how do I specify this when
doing ./configure?

There is conflicting information here.

You specified --with-x=no, yet you want X. You indicate that you installed R,
yet you do not have root access.

In order to compile R from source and have the functionality that you seem to
want, you will need either to have root access to install the required
libraries or to have the SysAdmin do so. To install R using the defaults, you
likewise need root access or need your SysAdmin to do it.

The required libraries are the 'dev' or development versions of the RPMs for
each of the components such as libpng, cairo, readline and X. These contain the
header files (.h) that are required to compile R from source and support these
features. These issues are described in the R Installation and Administration
manual:

http://cran.r-project.org/doc/manuals/R-admin.html

The easier option would be to have the SysAdmin simply use the available RPMs
for R rather than compiling from source. I presume that by Red Hat Linux, you
mean RHEL. You can point your SysAdmin to the EPEL
(http://fedoraproject.org/wiki/EPEL) which provides pre-built RPMs for R,
installable by using 'yum'.

In addition to what Marc said, CRAN also provides .rpm versions of R [1].
These may be newer than the ones on EPEL, but I'm not sure.

cheers,
Paul

[1] http://cran.r-project.org/bin/linux/redhat/

If you can use ssh to login to the server using PuTTY and that supports X as
you indicate, then it means that the server has been configured to support 'X
forwarding' and that your ssh login is using the '-X' option to request it on
your end of the connection. This means that the server supports X, but may or
may not have the X related development RPMs installed, which as I note, are
required to compile R from source and support X. Xming, on the other hand, I
believe provides its own X server implementation, which potentially brings
other issues into play. I have not used it, so would defer to others on the
details.
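
Whichever installation route is taken, the result can be checked from within R. A small sketch, assuming only base R; the variable names are illustrative:

```r
# Query only the capabilities discussed in this thread.
# Asking about "png" or "X11" may make R probe for an X server, which can
# produce a warning on a headless machine, hence suppressWarnings().
wanted <- c("png", "cairo", "X11")
caps <- suppressWarnings(capabilities(wanted))
print(caps)

# Report anything this build of R was compiled without.
missing_caps <- names(caps)[!caps]
if (length(missing_caps) > 0) {
  message("Not available in this build: ", paste(missing_caps, collapse = ", "))
}
```

If an entry is still FALSE after rebuilding, the corresponding development headers were most likely not found by ./configure; its output and config.log are the places to look.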

HTH,

Marc Schwartz


--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +3130 274 3113 Mon-Tue
Phone: +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul

[R] plot a distance matrix

2010-03-08 Thread Rosa Manrique



Dear friends,
I am having trouble plotting an existing distance matrix. I just want
to plot it. The distances were already calculated, and I don't have the
original data to recalculate them in R, which would be easy. If someone
can try it, here is the data.
Thank you
Rosa.
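
For readers with the same problem, a minimal sketch of plotting a precomputed distance matrix. The 3 x 3 matrix d below is made-up illustrative data, not the poster's. as.dist() wraps an existing matrix without recomputing anything, and the result can go to hclust() for a dendrogram or to cmdscale() for a 2-D map:

```r
# Hypothetical symmetric distance matrix with zero diagonal.
d <- matrix(c(0, 3, 4,
              3, 0, 5,
              4, 5, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Convert to a "dist" object; only the lower triangle is used.
dd <- as.dist(d)

# Option 1: hierarchical clustering dendrogram.
hc <- hclust(dd)
plot(hc)

# Option 2: classical multidimensional scaling into two dimensions.
xy <- cmdscale(dd, k = 2)
plot(xy, type = "n", xlab = "Dim 1", ylab = "Dim 2")
text(xy, labels = rownames(xy))
```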

Re: [R] How to match vector with a list ?

2010-03-08 Thread Carlos Petti

Thank you for the answers.

My code is very slow compared with yours ;-)

# my code
> system.time(r0 <- f0(iBig, jBig))
   user  system elapsed
 82.489  15.060  97.544

# Holtman's code
> system.time(r1 <- f1(iBig, jBig))
   user  system elapsed
  0.100   0.012   0.113

# Dunlap's code
> system.time(r2 <- f2(iBig, jBig))
   user  system elapsed
  0.084   0.004   0.088


2010/3/5 William Dunlap wdun...@tibco.com

  -Original Message-
  From: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org] On Behalf Of Carlos Petti
  Sent: Friday, March 05, 2010 9:43 AM
  To: r-help@r-project.org
  Subject: [R] How to match vector with a list ?
 
  Dear list,
 
  I have a vector of characters and a list of two named elements :
 
  i <- c("a","a","b","b","b","c","c","d")
 
  j <- list(j1 = c("a","c"), j2 = c("b","d"))
 
  I'm looking for a fast way to obtain a vector with names, as follows :
 
  [1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"

 A request with such a nice copy-and-pastable
 example in it deserves an answer.

 It looks to me like you want to map the item names
 in i to the group names that are the names of the list j,
 which maps group names to the items in each group.
 When there are lots of groups it can be faster to
 first invert the list j into a mapping vector pair,
 as in:

 f2 <- function (i, j) {
     groupNames <- rep(names(j), sapply(j, length)) # map to groupName
     itemNames <- unlist(j, use.names = FALSE) # map from itemName
     groupNames[match(i, itemNames, nomatch = NA)]
 }

 I put your original code into a function, as this makes
 testing and development easier:

 f0 <- function (i, j) {
     match <- lapply(j, function(x) {
         which(i %in% x)
     })
     k <- vector()
     for (y in 1:length(match)) {
         k[match[[y]]] <- names(match[y])
     }
     k
 }

 With your original data these give identical results:

 > identical(f0(i,j), f2(i,j))
 [1] TRUE

 I made a list describing 1000 groups, each containing
 an average of 10 members:

 jBig <- split(paste("N",1:10000,sep=""),
     sample(paste("G",1:1000,sep=""),size=10000,replace=TRUE))

 and a vector of a million items sampled from those
 member names:

 iBig <- sample(paste("N",1:10000,sep=""), replace=TRUE, size=1e6)

 Then I compared the times it took f0 and f2 to compute
 the result and verified that their outputs were identical:

 > system.time(r0 <- f0(iBig,jBig))
    user  system elapsed
  100.89   10.20  111.27
 > system.time(r2 <- f2(iBig,jBig))
    user  system elapsed
    0.14    0.00    0.14
 > identical(r0,r2)
 [1] TRUE
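
The inversion idea above can also be written as a named lookup vector. This is a sketch of the same technique on the toy data from the question, not Bill Dunlap's code; the names lookup and k are illustrative:

```r
i <- c("a", "a", "b", "b", "b", "c", "c", "d")
j <- list(j1 = c("a", "c"), j2 = c("b", "d"))

# Invert the list: one entry per item, named by the item, valued by its group.
lookup <- setNames(rep(names(j), lengths(j)), unlist(j, use.names = FALSE))

# Indexing by name maps every element of i to its group in one vectorized step.
k <- unname(lookup[i])
print(k)
```

If an item appeared in more than one group, name indexing would return the first match only, so this assumes the groups are disjoint.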

 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com

 
  I used :
 
  match <- lapply(j, function (x) {which(i %in% x)})
  k <- vector()
  for (y in 1:length(match)) {
      k[match[[y]]] <- names(match[y])}
  k
  [1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"
 
  But, I think a better way exists ...
 
  Thanks in advance,
  Carlos
 

Re: [R] Some hints for the R beginner

2010-03-08 Thread Ivan Calandra


Hi Patrick,

I've read it quickly and it seems to be a good resource for beginners
who have just downloaded R and have no idea what to do.
My guess is that it should be a good introduction to other documents
such as "An Introduction to R".


I'll test it with the next students in my team that will have to learn R!

One good thing would be to have it as a PDF; that makes it easier to keep
it close at hand.


Regards,
Ivan

On 3/7/2010 20:30, Patrick Burns wrote:

There is now a document called "Some hints
for the R beginner" whose purpose is to get
people up and running with R as quickly as
possible.

Direct access to it is:
http://www.burns-stat.com/pages/Tutor/hints_R_begin.html

JRR Tolkien wrote a story (sans hobbits) called
'Leaf by Niggle' that has always resonated
with me.  I offer you an imperfect, incomplete
tree (but my roof is intact).

Suggestions for improvements are encouraged.




--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php


Re: [R] Unsigned Posts; Was Setting graphical parameters

2010-03-08 Thread stephen's mailinglist account

On 4 March 2010 23:47, Jim Lemon j...@bitwrit.com.au wrote:
 On 03/05/2010 04:11 AM, Bert Gunter wrote:

 Folks:

 Rolf's (appropriate, in my view) response below seems symptomatic of an
 increasing tendency of posters to hide their identities with pseudonyms and
 fake headers. While some of this may be due to identity paranoia (which I
 think is overblown for this list), I suspect that a good chunk of it is lazy
 students trying to beat the system by having us do their homework. The
 etiquette of this list has traditionally been, like the software, open: we
 sign our names. I would urge helpeRs to adhere to this etiquette and ignore
 unsigned posts.


 I had a working hypothesis about this, but decided to check the data before
 replying. Looking at the ten most recent obvious pseudonyms, all were from
 free email accounts like gmail or yahoo. A few of these included all or part
 of the name of the user anyway. As such accounts tend to be used for lots of
 things, the users may well be concerned about identification, even if
 everyone on the R help list is of the highest moral standing.

 Jim



Personally I am just paranoid.  I used my normal email address for the
Debian mailing lists a few years ago, then found myself deluged with
spam, and bounced emails where my email had been inserted into the
from field (these generally originated from Russian or Brazilian
domains - if you checked the full headers). Maybe the debian lists
weren't the source and it was harvested by other means, but my full
email address was viewable on some of the news list archives.

I have also suffered some credit card fraud which was fortunately
swiftly curtailed, but it means I am inclined to keep aspects of my
life out of the online domain, and use this completely separate email
address for mailing lists.  It also means I can dip in and out of the
lists, and not clog up a work or personal email account.

R is a tool that I use for part of my job - albeit an excellent tool.
I am particularly happy with the graphics that I have managed to
produce.



-- 
Stephen


[R] error_hier.part

2010-03-08 Thread Marco Jorge

Hi everyone,

BEGINNER question:
I get the error below when running hier.part. Probably I'm doing
something wrong.

Error in glm.fit(x = X, y = Y, weights = weights, start = start, etastart =
etastart,  :
  object 'fit' not found
In addition: Warning messages:
1: In glm.fit(x = X, y = Y, weights = weights, start = start, etastart =
etastart,  :
  no observations informative at iteration 1
2: In glm.fit(x = X, y = Y, weights = weights, start = start, etastart =
etastart,  :
  algorithm did not converge

The steps I followed:
- read.table to import 9 ASCII files with no header
- data.frame to join those nine objects (factors)
- unlist to turn another imported ASCII file into a vector (dependent variable)

Note: missing values in the ASCII files appear as "-". Importing them like that
gives factors; if I change that to NaN, data frames result.
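
One possible cause, offered as an assumption since the data are not shown: columns containing "-" are read as factors rather than numbers, which glm.fit cannot use. A sketch of declaring the missing-value marker at import time with the na.strings argument of read.table; the inline text stands in for one of the ASCII files:

```r
# Stand-in for a headerless ASCII file in which "-" marks a missing value.
txt <- "1.5 -
2.0 3.1
- 4.2"

# na.strings tells read.table to treat "-" as NA, so columns stay numeric.
d <- read.table(text = txt, header = FALSE, na.strings = "-")
str(d)

# Rows with missing values can be dropped before modelling if appropriate.
complete <- d[complete.cases(d), , drop = FALSE]
print(complete)
```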

I know I must be doing something wrong. Can someone please give me some
clues as to what that is?
Thanks in advance
Marco


Re: [R] How to match vector with a list ?

2010-03-08 Thread Linlin Yan

Maybe you can create a helper vector first:
> helper <- structure(names = unlist(j), rep(names(j), sapply(j, length)))
> helper
   a    c    b    d
"j1" "j1" "j2" "j2"
> helper[i]
   a    a    b    b    b    c    c    d
"j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"

On Sat, Mar 6, 2010 at 1:42 AM, Carlos Petti carlos.pe...@gmail.com wrote:
 Dear list,

 I have a vector of characters and a list of two named elements :

 i <- c("a","a","b","b","b","c","c","d")

 j <- list(j1 = c("a","c"), j2 = c("b","d"))

 I'm looking for a fast way to obtain a vector with names, as follows :

 [1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"

 I used :

 match <- lapply(j, function (x) {which(i %in% x)})
 k <- vector()
 for (y in 1:length(match)) {
     k[match[[y]]] <- names(match[y])}
 k
 [1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"

 But, I think a better way exists ...

 Thanks in advance,
 Carlos


Re: [R] Lattice: barchart, error bars and grouped data

2010-03-08 Thread Dieter Menne



Johannes wrote:
 
 
 How can I, given the code snippet below, draw the error bars in the center 
 of each grouped bar rather than in the center of the group?
 

http://markmail.org/message/oljgimkav2qcdyre

Dieter


Re: [R] Graphing a piece-wise defined function

2010-03-08 Thread Karl Ove Hufthammer

On Sun, 7 Mar 2010 20:23:25 -0800 (PST) thedoctor81877 thedoctor81877
@gmail.com wrote:
 Hello. I am new to R, and am typing some homework for my undergrad Analysis
 class. I am trying to graph the following function in R: f(x) = x^2 for x
 >= 0, and f(x) = 0 for x < 0. How do I do this in R?

f = function(x) ifelse(x >= 0, x^2, 0)
curve(f, -2, 4)

-- 
Karl Ove Hufthammer


Re: [R] Interpretation of 'swtich'

2010-03-08 Thread Duncan Murdoch


Duncan Murdoch wrote:

On 07/03/2010 5:26 PM, rkevinbur...@charter.net wrote:
  

Thank you.

The documentation indicates, as you said, that if there is not an exact
match then the next element is chosen. But it does not cover the case where
there is an exact match but no value to be returned (the '=,' case). From
what you say, this is treated as if it were not a match.



> I think the writing is not very clear (and Brian Ripley has improved it
> in R-devel), but it does intend to say:
>
>   - If there is an exact match with a value, then that value is returned.

Re-reading it, I think my clarification is unclear.  The matching occurs
to the argument name.  I should have written:

   - If there is an exact match to an argument name with an associated
     value, then that value is returned.

Duncan Murdoch

>   - If there is an exact match with no value, then the next value is
>     returned.
>   - If there is no match, then the 1st unnamed arg (or 2nd if EXPR isn't
>     named) is returned.
>
> I think this is different from your interpretation.
>
> Duncan Murdoch

  

Kevin

 Uwe Ligges lig...@statistik.tu-dortmund.de wrote: 


On 06.03.2010 21:49, rkevinbur...@charter.net wrote:
  

In browsing the source I see the following construct:

 res <- switch(type, working = , response = r, deviance = ,
               pearson = if (is.null(object$weights))
                   r
               else r * sqrt(object$weights), partial = r)

I understand that 'switch' will execute the code that is matched by its corresponding 
string value (in this case 'type'). What I don't understand is the empty code. Is this 
code saying that if the type is deviance then fill the 'res' variable with an 
empty value? From my naive point of view it seems that 'res' will only get a value(s) if 
'type' is 'response', 'pearson', or 'partial'. Please help with my understanding.


Please do read the help pages!

 From ?switch:

If there is an exact match then that element is evaluated and returned 
if there is one, otherwise the next element is chosen, [...]


Example:

switch("A", A=1, B=, C=2) # 1
switch("B", A=1, B=, C=2) # 2
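
The fall-through rule can be seen in a small self-contained sketch. The function residual_kind and its return strings are made up for illustration; only the behaviour of switch() is the point:

```r
# An empty argument such as `working = ,` falls through to the next
# argument that has a value. A final unnamed argument acts as the default.
residual_kind <- function(type) {
  switch(type,
         working = ,
         response = "raw",
         deviance = ,
         pearson = "weighted",
         "unknown")
}

print(residual_kind("working"))   # falls through to the value of `response`
print(residual_kind("deviance"))  # falls through to the value of `pearson`
print(residual_kind("partial"))   # no match, so the unnamed default is used
```

Without the unnamed default, an unmatched type would make switch() return NULL invisibly.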

Uwe Ligges



  

Kevin Burton
rkevinbur...@charter.net


Re: [R] POSIXct type lost


It appears that I am creating a matrix where all columns are numeric,
so my Date column has been converted to a number.

Is there a way to show or display this number column as a Date again?

[R] POSIXct type lost


I am generating a column of dates using POSIXct, but when I try to assign it
to an existing dataframe - it gets stored as numbers instead of as
POSIXct.

Is there a way to force a column to be a specific type (POSIXct)?

Thanks,

Moon

Re: [R] POSIXct type lost

Add the number to the POSIXct origin.  See table at end of R News 4/1.
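
A minimal sketch of that advice, assuming the numbers are seconds since 1970-01-01 UTC, which is how POSIXct stores times. The data and variable names are illustrative:

```r
when <- as.POSIXct(c("2010-03-08 07:11:00", "2010-03-08 12:00:00"), tz = "UTC")

# A matrix holds a single type, so cbind() drops the POSIXct class and
# keeps only the underlying number of seconds.
m <- cbind(when = when, value = c(1.5, 2.5))

# Adding the number to the origin restores the date-time.
restored <- as.POSIXct(m[, "when"], origin = "1970-01-01", tz = "UTC")
print(restored)

# A data frame keeps each column's class, so no conversion is needed there.
df <- data.frame(when = when, value = c(1.5, 2.5))
str(df)
```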

On Mon, Mar 8, 2010 at 7:11 AM, ManInMoon xmoon2...@googlemail.com wrote:

 It appears that I am creating a matrix where all columns are numeric,
 so my Date column has been converted to a number.

 Is there a way to show or display this number column as a Date again?

Re: [R] Average regions of non-zeros

2010-03-08 Thread jim holtman

What I was looking for was the runs of non-zero values and where they
'broke'.  I could have used 'rle', but I sometimes find this approach just
as easy.  Every place there is a zero will be TRUE, which has the value 1.
'cumsum' will generate a running sum of these values.  Where there is a
run of non-zero values, consecutive values of cumsum stay the same.
This is what you saw when pasting the expression into the window.  Notice
that the run of '2's begins at a zero and then includes all the non-zero
values following.  By using 'split', I can create a list of each group.  If
the group is of length 1, then it only contains a zero and I ignore it.  If
the length is greater than 1, then we have some non-zero values and we have
to throw away the leading zero in the group (tail(a, -1)) and then take the
mean.
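
The grouping step described above can be sketched compactly as follows. This is an illustrative variant using tapply(), not the code from the thread; the names grp, keep and means are assumptions:

```r
x <- c(0, 0, 1, 2, 3, 0, 0, 4, 5, 6)

# Each zero increments the counter, so every run of non-zeros shares
# the counter value of the zero that precedes it.
grp <- cumsum(x == 0)
print(grp)

# Drop the zeros themselves, then average within each remaining group.
keep <- x != 0
means <- tapply(x[keep], grp[keep], mean)
print(means)
```

A vector that starts with non-zero values would put that first run in group 0, which this sketch handles like any other group.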

HTH

On Mon, Mar 8, 2010 at 3:26 AM, bogaso.christofer 
bogaso.christo...@gmail.com wrote:

 Hi Jim, I was following this thread and found that your answer is perfect
 there. However, I could not comprehend the meaning of the expression
 cumsum(x == 0). If I paste it in the R window, I get the following:

  > cumsum(x == 0)
   [1] 1 2 2 2 2 3 4 4 4 4

 I went through the help page of the cumsum() function and I understand
 that this function calculates the cumulative sum. But I could not
 really understand the meaning of cumsum(x == 0).

 Would you please explain that?

 Thanks,

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On
 Behalf Of jim holtman
 Sent: 08 March 2010 08:32
 To: Daren Tan
 Cc: r-h...@stat.math.ethz.ch
 Subject: Re: [R] Average regions of non-zeros

 Try this:

  > x <- c(0,0,1,2,3,0,0,4,5,6)
  > # partition the data
  > x.p <- split(x, cumsum(x == 0))
  > # now only process groups > 1
  > x.mean <- lapply(x.p, function(a){
  +     if (length(a) == 1) return(NULL)
  +     return(list(grp=tail(a, -1), mean=mean(tail(a, -1))))
  + })
  > # now only return the real values
  > x.mean[unlist(lapply(x.mean, length) != 0)]
 $`2`
 $`2`$grp
 [1] 1 2 3
 $`2`$mean
 [1] 2

 $`4`
 $`4`$grp
 [1] 4 5 6
 $`4`$mean
 [1] 5



 On Sun, Mar 7, 2010 at 9:48 PM, Daren Tan dare...@hotmail.com wrote:

 
  x <- c(0,0,1,2,3,0,0,4,5,6)
 
 
 
  How to identify the regions of non-zeros and average c(1,2,3) and
 c(4,5,6)
  to get 2 and 5.
 
 
 
  Thanks
 
 
 
 
 


 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?

[[alternative HTML version deleted]]





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


[R] Speed up sparse matrices

2010-03-08 Thread Feng Li

Dear R,

I have three matrices like this

K:  pp-by-pp   commutation matrix,
I:  p-by-p diagonal matrix,
X:  p-by-q dense matrix,

and I wish to calculate

K(IoX)

where `o' denotes Kronecker product.

Can you give me any suggestion to speed it up when `p' and `q' are large?
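
One observation that may help, offered as a sketch rather than a tuned solution: a commutation matrix is a permutation matrix, so multiplying by K only reorders rows and K never has to be formed. The helper commutation_perm below is hypothetical, and the small dense example is only there to check the idea; for large p the block-diagonal Kronecker product could additionally be stored with a sparse matrix class, which is not shown here.

```r
# Row permutation equivalent to the commutation matrix K_{m,n}, which
# satisfies K %*% vec(A) == vec(t(A)) for an m x n matrix A.
commutation_perm <- function(m, n) {
  as.vector(t(matrix(seq_len(m * n), m, n)))
}

set.seed(42)
p <- 3
q <- 2
D <- diag(c(2, 3, 5))             # p x p diagonal matrix
X <- matrix(rnorm(p * q), p, q)   # p x q dense matrix

DX <- kronecker(D, X)             # p^2 x (p*q), block diagonal
perm <- commutation_perm(p, p)

# K %*% DX without building K: just reorder the rows.
fast <- DX[perm, , drop = FALSE]

# Check against the explicit permutation matrix on this small case.
K <- diag(p * p)[perm, ]
A <- matrix(seq_len(p * p), p, p)
stopifnot(all(K %*% as.vector(A) == as.vector(t(A))))
stopifnot(isTRUE(all.equal(K %*% DX, fast)))
```

Row indexing costs time proportional to the size of the result, whereas the explicit product with a p^2-by-p^2 matrix grows much faster with p.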


Thanks in advance.


Feng

-- 
Feng Li
Department of Statistics
Stockholm University
106 91 Stockholm, Sweden
http://feng.li/


Re: [R] Average regions of non-zeros

2010-03-08 Thread Linlin Yan

Nice use of cumsum(). It can be improved a little:

> x <- c(0,0,1,2,3,0,0,4,5,6)
> x.groups <- split(x, (x != 0) * cumsum(x == 0))[-1]
> x.groups
$`2`
[1] 1 2 3

$`4`
[1] 4 5 6

> lapply(x.groups, mean)
$`2`
[1] 2

$`4`
[1] 5

On Mon, Mar 8, 2010 at 11:02 AM, jim holtman jholt...@gmail.com wrote:
 Try this:

 > x <- c(0,0,1,2,3,0,0,4,5,6)
 > # partition the data
 > x.p <- split(x, cumsum(x == 0))
 > # now only process groups > 1
 > x.mean <- lapply(x.p, function(a){
 +     if (length(a) == 1) return(NULL)
 +     return(list(grp=tail(a, -1), mean=mean(tail(a, -1))))
 + })
 > # now only return the real values
 > x.mean[unlist(lapply(x.mean, length) != 0)]
 $`2`
 $`2`$grp
 [1] 1 2 3
 $`2`$mean
 [1] 2

 $`4`
 $`4`$grp
 [1] 4 5 6
 $`4`$mean
 [1] 5



 On Sun, Mar 7, 2010 at 9:48 PM, Daren Tan dare...@hotmail.com wrote:


 x <- c(0,0,1,2,3,0,0,4,5,6)



 How to identify the regions of non-zeros and average c(1,2,3) and c(4,5,6)
 to get 2 and 5.



 Thanks







 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?


[R] black cluster in salt and pepper image

2010-03-08 Thread Sylvain Sardy


Hi,

On a lattice, I have binary 0/1 data. 1s are rare and may form clusters.
I would like to know the size/length of the largest cluster. Any help warmly welcome,

Sylvain.
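
One way to approach this, sketched in base R under the assumption that the lattice is stored as a 0/1 matrix and that cells belong to the same cluster when they touch horizontally or vertically (4-connectivity). The function name largest_cluster is made up for this example:

```r
# Size of the largest 4-connected cluster of 1s, found by flood fill.
largest_cluster <- function(m) {
  nr <- nrow(m)
  nc <- ncol(m)
  seen <- matrix(FALSE, nr, nc)
  best <- 0L
  for (start in which(m == 1)) {
    if (seen[start]) next
    stack <- start            # cells waiting to be visited (linear indices)
    seen[start] <- TRUE
    size <- 0L
    while (length(stack) > 0) {
      cur <- stack[length(stack)]
      stack <- stack[-length(stack)]
      size <- size + 1L
      # Convert the linear index back to row and column.
      r0 <- (cur - 1L) %% nr + 1L
      c0 <- (cur - 1L) %/% nr + 1L
      nbr_r <- c(r0 - 1L, r0 + 1L, r0, r0)
      nbr_c <- c(c0, c0, c0 - 1L, c0 + 1L)
      ok <- nbr_r >= 1 & nbr_r <= nr & nbr_c >= 1 & nbr_c <= nc
      idx <- (nbr_c[ok] - 1L) * nr + nbr_r[ok]
      # Keep unvisited neighbours that are 1 and mark them immediately.
      idx <- idx[m[idx] == 1 & !seen[idx]]
      seen[idx] <- TRUE
      stack <- c(stack, idx)
    }
    best <- max(best, size)
  }
  best
}

m <- matrix(c(1, 1, 0, 0,
              0, 1, 0, 1,
              0, 0, 1, 1,
              1, 0, 0, 1), nrow = 4, byrow = TRUE)
print(largest_cluster(m))
```

Counting diagonal neighbours as connected would only require adding the four diagonal offsets to nbr_r and nbr_c.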


[R] arbitrary scaling

Hi all

I know I have probably reinvented the wheel, but it seemed simpler than
searching the docs or asking for help before I had done my part.

I made a simple function which scales a vector between chosen values.
Does anybody know a simpler/better approach?

myscale <- function(x, miny=0.5, maxy=1) {
    rx <- diff(range(x, na.rm=TRUE))
    minx <- min(x, na.rm=TRUE)
    tga <- (maxy-miny)/rx
    b <- miny - tga*minx
    res <- x*tga+b
    res
}

x <- c(5,30,50)

> myscale(x)
[1] 0.5000000 0.7777778 1.0000000
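
For comparison, the same linear map can be written in one expression. This is only a sketch; the name rescale_to and its argument names are illustrative:

```r
# Map x linearly so that min(x) goes to `lo` and max(x) goes to `hi`.
rescale_to <- function(x, lo = 0.5, hi = 1) {
  rng <- range(x, na.rm = TRUE)
  lo + (x - rng[1]) * (hi - lo) / (rng[2] - rng[1])
}

x <- c(5, 30, 50)
print(rescale_to(x))
```

Like the original, this divides by zero when all values of x are equal, so a constant vector would need separate handling.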

Thank you

Regards
Petr


Re: [R] scientific (statistical) foundation for Y-RANDOMIZATION in regression analysis

2010-03-08 Thread Liaw, Andy

That sounds like a particular form of permutation test.  If the
scrambling is replaced by sampling with replacement (i.e., some data
points can be sampled more than once while others can be left out),
that's the simple (or nonparametric) bootstrap.  The goal is to generate
the distribution of the statistic of interest (R^2 or q^2) under the
null hypothesis that there's no relationship between the activity (or
property) and the structure.

To make the test valid, one needs to ensure that the entire model
building process is carried through for all of the sampled data,
including feature selections, etc.
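
A minimal sketch of such a permutation test, on simulated data and with a plain linear model standing in for the QSAR modelling procedure. All names, the sample size and the number of permutations are assumptions made for illustration; only R^2 is computed, not q^2:

```r
set.seed(1)
n <- 50
descriptors <- matrix(rnorm(n * 3), n, 3)
activity <- 2 * descriptors[, 1] + rnorm(n)
dat <- data.frame(y = activity, descriptors)

# The whole model-building step goes inside this function. With feature
# selection or tuning, those steps would have to be repeated here too.
fit_r2 <- function(d) summary(lm(y ~ ., data = d))$r.squared

obs <- fit_r2(dat)

# Null distribution: shuffle the response, refit, record R^2.
perm <- replicate(200, {
  d <- dat
  d$y <- sample(d$y)
  fit_r2(d)
})

# Permutation p-value, counting the observed fit as one of the permutations.
pval <- (sum(perm >= obs) + 1) / (length(perm) + 1)
print(c(observed = obs, p.value = pval))
```

Replacing sample(d$y) by resampling whole rows with replacement would turn this into the nonparametric bootstrap mentioned above, which addresses the variability of the statistic rather than the no-relationship null.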

Andy

From: Damjan Krstajic
 
 Dear all,
 
 I am a statistician doing research in QSAR, building 
 regression models where the dependent variable is a numerical 
 expression of some chemical activity and input variables are 
 chemical descriptors, e.g. molecular weight, number of carbon 
 atoms, etc.
 
 I am building regression models and I am confronted with a 
 widely used technique called Y-RANDOMIZATION for which I have 
 difficulties in finding references in general statistical 
 literature regarding regression analysis. I would be grateful 
 if someone could point me to papers/literature in statistical 
 regression analysis which give scientific (statistical) 
 foundation for using Y-RANDOMIZATION.
 
 Y-RANDOMIZATION is a widely used technique in the QSAR community 
 to ensure the robustness of a QSPR (regression) model. It is 
 used after the best regression model is selected, to 
 make sure that there are no chance correlations. Here is a 
 short description. The dependent variable vector (Y-vector) 
 is randomly shuffled and a new QSPR (regression) model is 
 fitted using the original independent variable matrix. By 
 repeating this a number of times, say 100 times, one will get 
 hundred R2 and q2 (leave one out cross-validation R2) based 
 on hundred shuffled Y. It is expected that the resulting 
 regression models should generally have low R2 and low q2 
 values. However, if the majority of hundred regression models 
 obtained in the Y-randomization have relatively high R2 and 
 high q2 then it implies that an acceptable regression model 
 cannot be obtained for the given data set by the current 
 modelling method.
 
 I cannot find any references to Y-randomization or 
 Y-scrambling anywhere in the literature outside 
 chemometrics/QSAR. Any links or references would be much appreciated.
 
 Thanks in advance.
 
 DK
 --
 Damjan Krstajic
 Director
 Research Centre for Cheminformatics
 Belgrade, Serbia
 

Re: [R] How can I understand this sentence, and express it by means of Mathematical approach?

2010-03-08 Thread Liaw, Andy

If your ultimate interest is in real scientific progress, I'd suggest that you 
ignore that sentence (and any conclusion drawn subsequent to it).

Cheers,
Andy 

From: bbslover
 
 This topic concerns reducing the number of independent variables. As we
 know, many methods can do this. However, for pre-processing independent
 variables, a method like the sentence below can remove many variables. How
 can I understand it?
 
 What is "significant correlation at the 5% level", and what is the criterion?
 A P value, or something else?
 
 
 "Independent variables whose correlation with the response variable was not
 significant at 5% level were removed"
 
 How can I calculate the correlation between them?
 
 thank you!
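
For reference, one plausible reading of the quoted sentence is a per-variable correlation test with the P value compared against 0.05. The sketch below assumes simulated data and hypothetical names (x1, x2, y); it only shows the mechanics with cor() and cor.test(), and the caution in the reply above about this kind of screening still applies.

```r
set.seed(42)

# Simulated response and two candidate predictors (illustrative only)
n  <- 40
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 0.8 * x1 + rnorm(n)
X  <- data.frame(x1, x2)

# Pearson correlation of each predictor with the response
cors <- sapply(X, function(col) cor(col, y))

# P value of the test H0: correlation = 0, one test per predictor
pvals <- sapply(X, function(col) cor.test(col, y)$p.value)

# Predictors that would survive the "significant at 5% level" filter
keep <- names(pvals)[pvals < 0.05]
keep
```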

Re: [R] arbitrary scaling

Perhaps approx:

   approx(range(x), c(0.5, 1), xout = x)$y

A one-liner, but longer, is also possible based on lm:

  predict(lm(c(0.5, 1) ~ x, data.frame(x = range(x))), data.frame(x))
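
As a quick check, the sketch below compares the approx() one-liner with a hand-written linear rescaling. The helper name rescale and its arguments lo and hi are illustrative assumptions, not part of the thread.

```r
x <- c(5, 30, 50)

# Hypothetical helper: map range(x) linearly onto [lo, hi]
rescale <- function(x, lo = 0.5, hi = 1) {
  r <- range(x, na.rm = TRUE)
  lo + (x - r[1]) / (r[2] - r[1]) * (hi - lo)
}

a <- rescale(x)
b <- approx(range(x), c(0.5, 1), xout = x)$y

# Both approaches should agree up to floating point error
all.equal(a, b)
```

Note that either version divides by zero when all values of x are equal, so a constant vector would need separate handling.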


On Mon, Mar 8, 2010 at 9:37 AM, Petr PIKAL petr.pi...@precheza.cz wrote:

Re: [R] r code to generate interaction columns

2010-03-08 Thread Sharma, Dhruv


Thanks Keith. I wanted generic code that checks each column's data
type, loops through, and creates the interaction columns automatically,
as I want to test this out as a new algorithm for data mining.

Traditional regression may give misleading results under
multicollinearity, so I wanted to build interaction terms and run
them through random forests and rpart, which would otherwise need the
interaction terms to be created manually.

Hope that clarifies.

Dhruv

-Original Message-
From: kMan [mailto:kchambe...@gmail.com] 
Sent: Sunday, March 07, 2010 8:08 PM
To: Sharma, Dhruv; r-help@r-project.org
Subject: RE: [R] r code to generate interaction columns

Dear Dhruv,

You could create interaction variables manually (assuming A is your
dependent variable). Just multiply the variables together.
cd.int <- C*D
ce.int <- C*E
cde.int <- C*D*E # what about D*E, or interactions with B?
Include those in your model, such as
A ~ B + C + D + E + cd.int + ce.int + cde.int
Then you can compare those models to the results you get when you
specify the interaction in the model formula directly using the
documented syntax.
In your R console, type ?formula or help(formula) for details. 
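
The manual approach can be generalised. The following is a minimal sketch, assuming a small made-up data frame with the A to E layout from the thread; the function name add_interactions and the ".x." naming scheme are hypothetical. It adds product columns for every combination of two or more numeric columns; the ratio terms from the original request could be added in the same loop.

```r
# Made-up data: A and B are non-numeric, C, D and E are numeric
dat <- data.frame(
  A = c("y", "n", "y", "n"),
  B = factor(c("u", "v", "u", "v")),
  C = c(1, 2, 3, 4),
  D = c(0.5, 1.5, 2.5, 3.5),
  E = c(10, 20, 30, 40)
)

add_interactions <- function(df) {
  # Names of the numeric columns, fixed before any columns are added
  num <- names(df)[sapply(df, is.numeric)]
  if (length(num) < 2) return(df)
  for (k in 2:length(num)) {
    for (vars in combn(num, k, simplify = FALSE)) {
      new_name <- paste(vars, collapse = ".x.")
      # Elementwise product of the selected columns
      df[[new_name]] <- Reduce(`*`, df[vars])
    }
  }
  df
}

out <- add_interactions(dat)
names(out)
```

The number of product columns grows as 2^p - p - 1 for p numeric columns, so for wide data it may be sensible to cap k at 2 or 3.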

Sincerely,
KeithC.


-Original Message-
From: Sharma, Dhruv [mailto:dhruv.sha...@penfed.org]
Sent: Saturday, March 06, 2010 10:30 AM
To: r-help@r-project.org
Subject: [R] r code to generate interaction columns

Hi,
   is there a way to take a dataset and extract numeric columns and
create interaction columns from it automatically?

   For e.g.  there are 5 columns of data: A,B,C,D,E.

   CDE are numeric.

   Can someone provide code to automatically create more columns such
as:

   1) C*D, C*E, C*D*E, (C+E)/(D+.01) (to avoid divide by zero),
(D+E)/(C+.01) (to avoid divide by zero), (C+D)/(E+.01) (to avoid divide by
zero)

?

I know in glm multiplying can create terms but i want the columns to be
part of the data set so that i can feed this into Random forest to pick
out predictive interaction terms as regression cannot reliably handle
correlated interaction terms.

if anyone has some simple code that can do this that would be helpful.

thanks
Dhruv



Re: [R] Is there an equivalence of lm's anova for an rpart object ?

2010-03-08 Thread Liaw, Andy

One way to do it (no p-values) is explained in the original CART book.
You basically add up all the improvement (in fit$splits[, "improve"])
due to each splitting variable.
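
A rough sketch of that summation, using the kyphosis example from the question. It rests on one assumption about the layout of fit$splits: that each internal node contributes a block of rows (primary split first, then competing splits, then surrogates), so that the primary rows can be located from the ncompete and nsurrogate columns of fit$frame. Only primary splits are summed here, because for surrogate rows the "improve" column does not hold an improvement value.

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start,
             method = "class", data = kyphosis)

ff <- fit$frame
is_leaf <- ff$var == "<leaf>"

# First row of each node's block in fit$splits (assumed layout, see lead-in)
start <- cumsum(c(1, ff$ncompete + ff$nsurrogate + !is_leaf))[seq_len(nrow(ff))]
primary <- start[!is_leaf]

# Total improvement attributed to each variable over its primary splits
imp <- tapply(fit$splits[primary, "improve"],
              rownames(fit$splits)[primary], sum)
sort(imp, decreasing = TRUE)
```

A variable that is never used in a primary split does not appear in the result at all, which is one limitation of this simple measure.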

Andy 

From: Tal Galili
 
 Simple example:
 
 # Classification Tree with rpart
 library(rpart)
 
 # grow tree
 fit <- rpart(Kyphosis ~ Age + Number + Start,
              method = "class", data = kyphosis)
 
 Now I would like to know how can I measure the importance 
 of each of my
 three explanatory variables (Age, Number, Start) in the model?
 
 If this were a regression model, I could have looked at p-values from the
 anova F test (between lm models with and without the variable). But what
 is the equivalent of using anova on lm for an rpart object?
 
 Any pointers, insights and references to this question will 
 be helpful.
 
 Thanks,
 
 Tal
 
 
 
 Contact
 Details:---
 Contact me: tal.gal...@gmail.com |  972-52-7275845
 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il 
 (Hebrew) |
 www.r-statistics.com (English)

Re: [R] arbitrary scaling

2010-03-08 Thread GULATI, BRIJESH (Global Markets FFO NY)

Thanks

I would never have deduced that from the help page of approx.

Regards
Petr


r-help-boun...@r-project.org wrote on 08.03.2010 15:47:37:


[R] setClass or setValidity?

2010-03-08 Thread Albert-Jan Roskam

Hi, 
 
I'm reading up on S4 classes *). There seem to be at least two ways of doing 
input validation:
setClass() (using the 'validity' argument) and setValidity(). Is it a matter 
of taste which function is used? Or is more complex validation code better 
put in a setValidity() call?

*) A (Not So) Short Introduction to S4 Object Oriented Programming in R 
V0.5.1 Christophe Genolini August 20, 2008
 
And, inside those validity checks, is most of the checking done with 'if' 
'else' computations, or is it also common to use except()?
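
To make the two options concrete, here is a small sketch with two made-up classes (Interval and Percent are illustrative names only). As far as I know both routes register the same kind of validity method, which returns TRUE or a character vector describing the problem; plain if/else checks are enough inside it, and tryCatch() is more useful on the calling side.

```r
library(methods)

# Option 1: validity function supplied directly to setClass()
setClass("Interval",
  representation(lo = "numeric", hi = "numeric"),
  validity = function(object) {
    if (length(object@lo) != 1 || length(object@hi) != 1)
      return("lo and hi must each have length 1")
    if (object@lo > object@hi)
      return("lo must not exceed hi")
    TRUE
  })

# Option 2: define the class first, attach the validity method afterwards
setClass("Percent", representation(value = "numeric"))
setValidity("Percent", function(object) {
  if (any(object@value < 0 | object@value > 100))
    "value must lie in [0, 100]"
  else
    TRUE
})

ok  <- new("Interval", lo = 1, hi = 2)

# An invalid object makes new() signal an error; catch it on the caller's side
bad <- tryCatch(new("Percent", value = 150),
                error = function(e) conditionMessage(e))
bad
```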

Cheers!!
Albert-Jan

~~
In the face of ambiguity, refuse the temptation to guess.
~~


  

Re: [R] setClass or setValidity?

2010-03-08 Thread Albert-Jan Roskam

Sorry: there was an error in the last sentence:
And, inside those validity checks, is most of the checking done with 'if' 
'else' computations, or is it also common to use try()?

Cheers!!
Albert-Jan

~~
In the face of ambiguity, refuse the temptation to guess.
~~

--- On Mon, 3/8/10, Albert-Jan Roskam fo...@yahoo.com wrote:


From: Albert-Jan Roskam fo...@yahoo.com
Subject: [R] setClass or setValidity?
To: r-help@r-project.org
Date: Monday, March 8, 2010, 4:14 PM



[R] Data.frame issue (pls help)

Hi: 
I want to obtain a particular value from a data.frame. Following is my
data frame:

> Quotes
                  BID      ASK Name
CT2 GOVT     99.92969  99.9375  CT2
TUM0 COMDTY 108.53125 108.5469 TUM0
CT5 GOVT    100.10156 100.1094  GT5
FVM0 COMDTY 115.56250 115.5703 FVM0
TYM0 COMDTY 116.93750 116.9531 TYM0

If I run: QuoteTUM0BID = Quotes[Quotes$Name %in% "TUM0", "BID"]
and print QuoteTUM0BID, I get 108.5312 instead of 108.53125 as the
answer. Please let me know why it is ignoring the last digit. 


Additional information: 
If I run QuoteBID = Quotes[, "BID"], I get the whole vector, in which the TUM0
BID is 108.53125 (the correct number).


Thanks in advance...
Rgds,
Brijesh




[R] using sprintf to pass a variable to a RMySQL query

2010-03-08 Thread alison waller

Hello,

I am using RMySQL and would like to iterate through a few queries.

I would like to use sprintf but I think I'm having problems mixing and
matching the sprintf syntax and the SQL regex.

I have checked my sqlcmd and it works when I want to match %MG1%, but how
do I iterate for i in 1-72? With escape characters?

Thanks in advance

i <- 1
sqlcmd_ScaffLen <- sprintf('SELECT scaffold.length
FROM scaffold,scaffold2contig,contig2read
WHERE scaffold.scaffold_id=scaffold2contig.scaffold_id AND
scaffold2contig.contig_id=contig2read.contig_id AND contig2read.read_id LIKE
'%MG%s%' ,i)

= Here is my vague error message

Error: unexpected input in:

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] POSIXct type lost


What is R News 4/1?

[R] fit a gamma pdf using Residual Sum-of-Squares

2010-03-08 Thread vincent laperriere

Hi all,

I would like to fit a gamma pdf to my data using the method of RSS (Residual 
Sum-of-Squares). Here are the data:

x <- c(86,  90,  94,  98, 102, 106, 110, 114, 118, 122, 126, 130, 134, 138, 
       142, 146, 150, 154, 158, 162, 166, 170, 174)
y <- c(2, 5, 10, 17, 26, 60, 94, 128, 137, 128, 77, 68, 65, 60, 51, 26, 17, 9, 
       5, 2, 3, 7, 3)

I have typed the following code, using the nls method:

fit <- nls(y ~ (1/((s^a)*gamma(a))*x^(a-1)*exp(-x/s)), start = c(s=3, a=75, 
x=86))

But I get the following error message (translated from German):


Error in qr(.swts * attr(rhs, "gradient")) : 
  dims [product 3] do not match the length of object [23]
In addition: Warning message:
In .swts * attr(rhs, "gradient") :
  longer object length is not a multiple of shorter object length

Could anyone help me with the code?
I would greatly appreciate it.
Sincerely yours,
Vincent Laperrière.
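
A likely cause of the error is that x appears in start, so nls treats it as a third parameter rather than as data. Also, y looks like counts rather than a density, so a scale factor is probably needed. The sketch below is one possible way around both points, not a verified fix of the nls call: it minimises the residual sum of squares directly with optim(), using a scale factor k and moment-based starting values (k, the bin width of 4 and the starting values are assumptions added here).

```r
x <- c(86, 90, 94, 98, 102, 106, 110, 114, 118, 122, 126, 130, 134, 138,
       142, 146, 150, 154, 158, 162, 166, 170, 174)
y <- c(2, 5, 10, 17, 26, 60, 94, 128, 137, 128, 77, 68, 65, 60, 51, 26, 17, 9,
       5, 2, 3, 7, 3)

# RSS for a scaled gamma density; parameters live on the log scale so that
# k (scale factor), a (shape) and s (scale) stay positive during the search
rss <- function(p) {
  k <- exp(p[1]); a <- exp(p[2]); s <- exp(p[3])
  sum((y - k * dgamma(x, shape = a, scale = s))^2)
}

# Starting values from the weighted mean and variance of x
m <- sum(x * y) / sum(y)
v <- sum(y * (x - m)^2) / sum(y)
start <- log(c(k = sum(y) * 4, a = m^2 / v, s = v / m))

opt <- optim(start, rss)   # Nelder-Mead by default
est <- exp(opt$par)        # back-transformed estimates of k, a and s
est
opt$value                  # residual sum of squares at the optimum
```

The same model can be tried in nls with start = list(k = ..., a = ..., s = ...) and no entry for x; whether it converges on these data has not been checked here.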


  

[R] (box-) plot annotation: italic within paste?

2010-03-08 Thread Bernd Panassiti

Dear R users,

in the example below the name of the genus will be displayed in the main 
titles using the variable predictor[i] and paste.
I would like to have the genus name in italic. However, all my attempts 
using expression and substitute failed.
Does anybody know a solution? 
Thanks a lot in advance. bernd


Acrobeles      <- c(65.1,0.0,0.0,0.0,0.0,0.0)
Acrobeloides   <- c(0.0,9.8,76.7,51.1,93.9,43.9)
Alaimus        <- c(0.0,4.9,0.0,0.0,0.0,6.3)
Aphelenchoides <- c(126.5,29.3,76.7,134.1,176.7,87.9)

x <- data.frame(Acrobeles,Acrobeloides,Alaimus,Aphelenchoides)

predictor   <- colnames(x)
ylabel      <- "Numerical abundance"
mainlabel1  <- "Boxplot for"
mainlabel2  <- "sp."
cexalabel   <- 1.8 # axis label
cexmlabel   <- 1.6 # main label

par(oma=c(6,6,3,3),mar = c(6, 4, 4, 2) + 0.1,mfrow=c(2,2))

for (i in 1:ncol(x)){

boxplot(x[,i],
main=paste(mainlabel1,predictor[i],mainlabel2),ylab=paste(ylabel),cex.lab=cexalabel,cex.main=cexmlabel,cex.axis=1.5)
}



---

Bernd Panassiti

National Institute of Public Health & the Environment (RIVM)
Laboratory for Ecological Risk Assessment (LER)
P.O. Box 1
3720 BA Bilthoven
The Netherlands
e-mail: bernd.panass...@rivm.nl
tel. +31 30 274 3647

Radboud University Nijmegen
Department of Environmental Science
b.panass...@science.ru.nl




[R] tcltk

2010-03-08 Thread Vasco Cadavez

Hi,

I'm trying to install tcltk in R-2.10.1, however I get an error.
Can someone help?

thanks

Vasco


Re: [R] POSIXct type lost

Try google.

On Mon, Mar 8, 2010 at 8:09 AM, ManInMoon xmoon2...@googlemail.com wrote:


[R] combinations and table selection problem

2010-03-08 Thread Carlos Guerra

Dear all,

I have a table like this:

a <- read.csv("test.csv", header = TRUE, sep = ";")
a

      UTM      pUrb pUrb_class     pAgri pAgri_class pNatFor pNatFor_class
1  NF1885 20.160307         NA 79.921386          NA    0.00            NA
2  NF1886 51.965649         NA 46.657713          NA    0.00            NA
3  NF1893 26.009581         NA 40.269204          NA    0.00            NA
4  NF1894  3.141484         NA  0.00              NA    0.00            NA
5  NF1895 64.296826         NA  0.440691          NA    0.00            NA
6  NF1896 14.174068         NA 25.613839          NA    0.00            NA
7  NF1897 40.985589         NA 37.680521          NA    0.00            NA
8  NF1898 34.054325         NA 66.027334          NA    0.00            NA
9  NF1899 20.657632         NA 79.424024          NA    0.00            NA
10 NF1982 94.857605         NA 45.368606          NA    0.00            NA

...

And I executed the following code:

#data classification#

a$pUrb_class <- cut(a$pUrb, c(-Inf,80,Inf), labels = c(0,1))
a$pAgri_class <- cut(a$pAgri, c(-Inf,80,Inf), labels = c(0,1))
a$pNatFor_class <- cut(a$pNatFor, c(-Inf,80,Inf), labels = c(0,1))

a

      UTM      pUrb pUrb_class     pAgri pAgri_class pNatFor pNatFor_class
1  NF1885 20.160307          0 79.921386           0    0.00             0
2  NF1886 51.965649          0 46.657713           0    0.00             0
3  NF1893 26.009581          0 40.269204           0    0.00             0
4  NF1894  3.141484          0  0.00               0    0.00             0
5  NF1895 64.296826          0  0.440691           0    0.00             0
6  NF1896 14.174068          0 25.613839           0    0.00             0
7  NF1897 40.985589          0 37.680521           0    0.00             0
8  NF1898 34.054325          0 66.027334           0    0.00             0
9  NF1899 20.657632          0 79.424024           0    0.00             0
10 NF1982 94.857605          1 45.368606           0    0.00             0

...

#obtaining the number of combinations present in the data base#

library(survival)

b <- strata(a$pUrb_class, a$pAgri_class, a$pNatFor_class, sep = ",")
table(b)
b
a$pUrb_class=0,a$pAgri_class=0,a$pNatFor_class=0 
                                           17698 
a$pUrb_class=0,a$pAgri_class=0,a$pNatFor_class=1 
                                             112 
a$pUrb_class=0,a$pAgri_class=1,a$pNatFor_class=0 
                                            4360 
a$pUrb_class=1,a$pAgri_class=0,a$pNatFor_class=0 
                                             160

median(table(b))
[1] 2260


In this stage I have 3 questions:

1st:
how can I obtain the combinations which occur more often than the median (in this 
case the first and the second combination)?

2nd:
how can I obtain the combinations which occur more often than the median and have at 
least one condition present (in this case only the second combination)?

3rd:
how can I select/extract from the original table the rows which comply with the 
2nd question, in this case:


      UTM      pUrb pUrb_class     pAgri pAgri_class pNatFor pNatFor_class
10 NF1982 94.857605          1 45.368606           0    0.00             0

...



Thanks in advance,

Carlos Guerra
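
One possible way to approach the three questions with base R only is sketched below. The small data frame is a made-up stand-in for the classified table, and the helper names (cls, combo, counts, above, with_cond) are illustrative assumptions.

```r
# Made-up stand-in for the three class columns of `a`
a <- data.frame(
  pUrb_class    = c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0),
  pAgri_class   = c(0, 0, 0, 0, 1, 1, 1, 0, 0, 1),
  pNatFor_class = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0)
)
cls <- c("pUrb_class", "pAgri_class", "pNatFor_class")

# One label per row, e.g. "0,1,0", and how often each label occurs
combo  <- do.call(paste, c(a[cls], sep = ","))
counts <- table(combo)

# 1st: combinations occurring more often than the median count
above <- names(counts)[counts > median(counts)]

# 2nd: of those, combinations with at least one class equal to 1
with_cond <- above[grepl("1", above, fixed = TRUE)]

# 3rd: rows of the original table belonging to those combinations
rows <- a[combo %in% with_cond, ]
rows
```

paste() also works when the class columns are factors, as produced by cut(), because factors are converted to their labels.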


Re: [R] tcltk

2010-03-08 Thread Rubén Roa

I doubt it. Guess why?

 

Dr. Rubén Roa-Ureta
AZTI - Tecnalia / Marine Research Unit
Txatxarramendi Ugartea z/g
48395 Sukarrieta (Bizkaia)
SPAIN


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
behalf of Vasco Cadavez
Sent: Monday, 08 March 2010 15:46
To: r-help@r-project.org
Subject: [R] tcltk


Re: [R] Data.frame issue (pls help)

2010-03-08 Thread Ivan Calandra


Hi,

I cannot really test since your dataframe is completely distorted, but 
what happens if you try: format(QuoteBID, nsmall=5)?
I think it's just a matter of printing, which uses the number of digits 
from options(digits=).


See: ?options, ?format, etc.
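
A small sketch of the printing behaviour, assuming the BID values from the question are held in a plain numeric vector (the object names bid and one are illustrative). The stored value is unchanged; only the number of significant digits shown differs between printing one number and printing the whole vector.

```r
bid <- c(99.92969, 108.53125, 100.10156, 115.56250, 116.93750)
one <- bid[2]

print(one)               # a single value, shown with getOption("digits") significant digits
print(bid)               # the vector shares a common number of decimals
print(one, digits = 8)   # ask for more significant digits explicitly
format(one, nsmall = 5)  # at least 5 decimals, returned as a string
sprintf("%.5f", one)     # fixed 5 decimals, returned as a string

# The value itself was never truncated
stored_ok <- identical(one, 108.53125)
stored_ok
```

Setting options(digits = 8) changes the default for the rest of the session.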

But I'm not an expert and cannot really test for this.
HTH
Ivan

On 3/8/2010 14:15, GULATI, BRIJESH (Global Markets FFO NY) wrote:


   


--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php


Re: [R] [help] deleting rows which contain more than 2 NAs or zeros

2010-03-08 Thread sjaffe


If the data is a data frame or matrix 'd':

d <- d[apply(d, 1, function(v) sum(is.na(v)) <= 2 & sum(v == 0, na.rm=TRUE) <= 2), ]

which can be deconstructed as follows:

i1 <- apply(d, 1, function(v) sum(is.na(v)) <= 2) ## true for rows with 2
## or fewer NAs
i2 <- apply(d, 1, function(v) sum(v == 0, na.rm=TRUE) <= 2) ## true for rows
## with 2 or fewer 0's

i1 & i2 ## logical vector, true for rows satisfying both conditions

d[i1 & i2, ] ## only those rows satisfying the condition, and all columns
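
The same filter can also be written without apply(), using rowSums() on logical matrices. A minimal sketch on a made-up data frame (the data and the name keep are illustrative assumptions):

```r
d <- data.frame(
  a = c(1, NA, 0, 4),
  b = c(NA, NA, 0, 5),
  c = c(3, NA, 0, 6),
  d = c(0, 1, 2, 7)
)

# Keep rows with at most 2 NAs and at most 2 zeros
keep <- rowSums(is.na(d)) <= 2 & rowSums(d == 0, na.rm = TRUE) <= 2
d[keep, ]
```

Here the second row (three NAs) and the third row (three zeros) are dropped.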



[R] quickest way convert 1-col df to vector?

2010-03-08 Thread sjaffe


anything shorter than as.vector(as.matrix( df ) )?

Re: [R] (box-) plot annotation: italic within paste?

2010-03-08 Thread Miguel Porto

Hello,

Try this way (not sure if it's the best way, but it works):

boxplot(x[,i],
main=substitute(expression(paste(a, " ", italic(b),
" ", c)), list(a=mainlabel1, b=predictor[i], c=mainlabel2)),
ylab=paste(ylabel),cex.lab=cexalabel,cex.main=cexmlabel,cex.axis=1.5)

Best,
Miguel


On Mon, Mar 8, 2010 at 2:27 PM, Bernd Panassiti bernd.panass...@rivm.nl wrote:


Re: [R] quickest way convert 1-col df to vector?

2010-03-08 Thread Erik Iverson




sjaffe wrote:

anything shorter than as.vector(as.matrix( df ) )?


df[[1]]
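
For comparison, a tiny sketch with a made-up one-column data frame (the column name v is an assumption) showing that the double-bracket extractor and two common alternatives return the same plain vector:

```r
df <- data.frame(v = c(2.5, 3.5, 4.5))

v1 <- df[[1]]   # by position
v2 <- df$v      # by name
v3 <- df[, 1]   # matrix-style indexing; drops to a vector for a single column

identical(v1, v2) && identical(v1, v3)
```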


Re: [R] quickest way convert 1-col df to vector?

2010-03-08 Thread Steve Jaffe

D'oh -- thanks! I'm always forgetting the double-bracket extractor...

-Original Message-
From: Erik Iverson [mailto:er...@ccbr.umn.edu] 
Sent: Monday, March 08, 2010 10:50 AM
To: Steve Jaffe
Cc: r-help@r-project.org
Subject: Re: [R] quickest way convert 1-col df to vector?


Re: [R] using sprintf to pass a variable to a RMySQL query

2010-03-08 Thread jim holtman

Try this:

i <- 1
sqlcmd_ScaffLen <- sprintf('SELECT scaffold.length
FROM scaffold,scaffold2contig,contig2read
WHERE scaffold.scaffold_id=scaffold2contig.scaffold_id AND
scaffold2contig.contig_id=contig2read.contig_id AND contig2read.read_id LIKE
\'%%MG%d%%\'', i)
sqlcmd_ScaffLen

Your problems:
1. You need %% to produce a literal % when using sprintf.
2. You need %d rather than %s for integer values.
3. You need to escape the quote marks.
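
To iterate over i = 1 to 72, the query strings can be built up front and run in a loop. The sketch below only constructs the strings; the helper name make_query is hypothetical, and the commented-out last line assumes an already open connection object called con.

```r
# Build the query text for one sample index.
# Double quotes around the R string avoid having to escape the SQL quotes.
make_query <- function(i) {
  sprintf(paste(
    "SELECT scaffold.length",
    "FROM scaffold, scaffold2contig, contig2read",
    "WHERE scaffold.scaffold_id = scaffold2contig.scaffold_id",
    "AND scaffold2contig.contig_id = contig2read.contig_id",
    "AND contig2read.read_id LIKE '%%MG%d%%'"
  ), i)
}

queries <- vapply(1:72, make_query, character(1))
queries[1]

# With a live connection (assumed, not shown here) the queries could be run as:
# results <- lapply(queries, function(q) dbGetQuery(con, q))
```

Note that the pattern %MG1% also matches MG10 to MG19, so a tighter pattern may be needed depending on how the read ids are formed.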

On Mon, Mar 8, 2010 at 8:06 AM, alison waller alison.wal...@embl.de wrote:





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: [R] using sprintf to pass a variable to a RMySQL query

Another possibility is to use fn$ in the gsubfn package. Just preface
any command with fn$ to enable a quasi-perl-like string interpolation.
In this example $i is replaced with 1:

> library(gsubfn)
> library(sqldf)
> i <- 1
> fn$sqldf("select count(*) from CO2 where Plant like '%n$i%'")
  count(*)
1       14

> # as seen here:
> fn$identity("select count(*) from CO2 where Plant like '%n$i%'")
[1] "select count(*) from CO2 where Plant like '%n1%'"

See http://gsubfn.googlecode.com for more.




[R] Help with Hmisc, cut2, split and quantile

2010-03-08 Thread Guy Green


Hello,
I have a set of data with two columns: Target and Actual.  A 
http://n4.nabble.com/file/n1584647/Sample_table.txt Sample_table.txt  is
attached but the data looks like this:

Actual  Target
-0.125  0.016124906
0.135   0.120799865
... ...
... ...

I want to be able to break the data into tables based on quantiles in the
Target column.  I can see (using cut2, and also quantile) how to get the
barrier points between the different quantiles, and I can see how I would
achieve this if I was just looking to split up a vector.  However I am
trying to break up the whole table based on those quantiles, not just the
vector.

The following code shows me the ranges for the deciles of the Target data:
library(Hmisc)
read_data = read.table("C:/Sample table.txt", header = TRUE)
table(cut2(read_data$Target, g = 10))

However I would like to be able to break the table into ten separate tables,
each with both Actual and Target data, based on the Target data
deciles:

top_decile = ...(top decile of read_data, based on Target data)
next_decile = ...and so on...
bottom_decile = ...

That way I could manipulate the deciles, graph them separately (and
together) and so on, just as easily as I can the whole table.  I'm sure this
must be simple, but I can't see the way forward.  I have also looked at
split() and quantile() but have not been able to get them to achieve what I
am after.  Can anybody see a simple way forward on this?

Thanks,
Guy
-- 
View this message in context: 
http://n4.nabble.com/Help-with-Hmisc-cut2-split-and-quantile-tp1584647p1584647.html
Sent from the R help mailing list archive at Nabble.com.


[R] [help] deleting rows which contain more than 2 NAs or zeros

2010-03-08 Thread AuriDUL


Hello.

I have just started learning how to work with R, but I have
encountered a problem.

I can't work out how to remove the rows which contain two (2) or more NA or
zero (0) values.

I would be glad if you could help me, because I only have some basic
knowledge so far and haven't even mastered all the basics yet.

Thanks in advance.
-- 
View this message in context: 
http://n4.nabble.com/help-deleting-rows-which-contain-more-than-2-NAs-or-zeros-tp1584613p1584613.html
Sent from the R help mailing list archive at Nabble.com.


[R] compare tables

2010-03-08 Thread Laetitia Schmid

Hi!
I need some help to finish my script.

I have two tables that I combine randomly to produce a third table.  
This I do for hundreds of iterations. In the output file I get all the  
simulated tables after each other. It looks like this (in this case 3  
iterations):

output file:

[[1]]
   [,1]   [,2]   [,3]
  [1,] GM030005 WI920024 
  [2,] GM930026 WI920362 
  [3,] GM980051 WI920007 CGCC
  [4,] GM970009 WI920417 
  [5,] GM920089 WI920023 
  [6,] GM930109 WI920359 
  [7,] GM980007 WI920428 CGCC
  [8,] GM940039 WI920430 
  [9,] GM990027 WI920349 
[10,] GM920222 WI920410 CGCC
[11,] GM930029 WI920001 CGCC
[12,] GM990105 WI920431 
[13,] GM050009 WI920430 
[14,] GM920224 WI920369 
[15,] GM920224 WI920352 
[16,] GM960028 WI920427 
[17,] GM940031 WI920004 
[18,] GM930040 WI920441 
[19,] GM930040 WI920441 
[20,] GM050099 WI920417 
[21,] GM050099 WI920423 CCCG
[22,] GM920096 WI920370 
[23,] GM920034 WI920437 
[24,] GM960023 WI920017 
[25,] GM920031 WI920430 
[26,] GM920202 WI920367 CCCG
[27,] GM990066 WI920410 

[[2]]
   [,1]   [,2]   [,3]
  [1,] GM030005 WI920017 
  [2,] GM930026 WI920415 
  [3,] GM980051 WI920028 CGCC
  [4,] GM970009 WI920017 
  [5,] GM920089 WI920028 
  [6,] GM930109 WI920353 
  [7,] GM980007 WI920009 CGCT
  [8,] GM940039 WI920415 
  [9,] GM990027 WI920423 CCCG
[10,] GM920222 WI920423 CGCG
[11,] GM930029 WI920363 CGCC
[12,] GM990105 WI920362 
[13,] GM050009 WI920365 
[14,] GM920224 WI920362 
[15,] GM920224 WI920410 
[16,] GM960028 WI920355 CCCG
[17,] GM940031 WI920361 
[18,] GM930040 WI920356 
[19,] GM930040 WI920353 
[20,] GM050099 WI920360 
[21,] GM050099 WI920353 
[22,] GM920096 WI920023 
[23,] GM920034 WI920426 
[24,] GM960023 WI920024 
[25,] GM920031 WI920022 
[26,] GM920202 WI920009 CCCG
[27,] GM990066 WI920001 

[[3]]
   [,1]   [,2]   [,3]
  [1,] GM030005 WI920433 
  [2,] GM930026 WI920408 
  [3,] GM980051 WI920352 CGCC
  [4,] GM970009 WI920416 
  [5,] GM920089 WI920022 
  [6,] GM930109 WI920369 
  [7,] GM980007 WI920415 CGCC
  [8,] GM940039 WI920022 
  [9,] GM990027 WI920361 
[10,] GM920222 WI920024 CGCC
[11,] GM930029 WI920437 CGCC
[12,] GM990105 WI920423 CCCG
[13,] GM050009 WI920416 
[14,] GM920224 WI920423 CCCG
[15,] GM920224 WI920427 
[16,] GM960028 WI920437 
[17,] GM940031 WI920441 
[18,] GM930040 WI920417 
[19,] GM930040 WI920370 
[20,] GM050099 WI920015 
[21,] GM050099 WI920428 
[22,] GM920096 WI920007 
[23,] GM920034 WI920009 CCCG
[24,] GM960023 WI920410 
[25,] GM920031 WI920430 
[26,] GM920202 WI920015 
[27,] GM990066 WI920415 

Now I would like to compare one of the tables used to create the  
output tables with every output table, one after the other. In detail,  
I am comparing row 1 of the creator table with row 1 of the first  
output table and then row 2 of the creator table with row 2 of the  
first output table and so on until row 27 and each row for all  
columns. Then, when the first output table is finished I go on  
comparing the first creator table with the second table in the  
output, row for row for all columns. I do this for all iterations.

The first creator table is called data_mc.

# apply similarity function (lettermatch) to my data
for (i in 1:(nrow(data_mc))){
   for (y in 1:(ncol(data_mc))) {
 creator_table <- data_mc[data_mc$Status=="mother",y]
 output_tables <- ???
 output[i,y] <- (lettermatch(creator_table, output_tables))
   }
}

Could you please help me how I have to call up the output tables in  
the way I need them (described above) for the function lettermatch?  
Maybe I need to change the format of the output file?

Thank you.
Laetitia

Re: [R] (box-) plot annotation: italic within paste?


Here's a variation on the theme:

 boxplot(x[,i])
 title(main =
   bquote(.(mainlabel1)~~italic(.(predictor[i]))~~.(mainlabel2))
 )
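
For reference, the same idea as a self-contained sketch. The data and label values are copied from the original post quoted further down (only two of the genera are used), and the helper name make_title is an illustrative assumption:

```r
x <- data.frame(
  Acrobeles    = c(65.1, 0.0, 0.0, 0.0, 0.0, 0.0),
  Acrobeloides = c(0.0, 9.8, 76.7, 51.1, 93.9, 43.9)
)
predictor  <- colnames(x)
mainlabel1 <- "Boxplot for"
mainlabel2 <- "sp."

# bquote() substitutes the values wrapped in .() and leaves italic()
# untouched, so plotmath renders only the genus name in italics.
make_title <- function(i) {
  bquote(.(mainlabel1)~~italic(.(predictor[i]))~~.(mainlabel2))
}

par(mfrow = c(1, 2))
for (i in seq_along(predictor)) {
  boxplot(x[, i])
  title(main = make_title(i))
}
```

Building the title in a small function keeps the loop body short; the other graphical parameters from the original call (ylab, cex.lab and so on) can be added to boxplot() as before.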

 -Peter Ehlers

On 2010-03-08 8:46, Miguel Porto wrote:

Hello,

Try this way (not sure if it's the best way, but it works):

boxplot(x[,i],
main=substitute(expression(paste(a, " ",italic(b),
" ",c)),list(a=mainlabel1,b=predictor[i],c=mainlabel2)),
ylab=paste(ylabel),cex.lab=cexalabel,cex.main=cexmlabel,cex.axis=1.5)

Best,
Miguel


On Mon, Mar 8, 2010 at 2:27 PM, Bernd Panassiti bernd.panass...@rivm.nl wrote:


Dear R users,

in the example below the name of the genus will be displayed in the main
titles using the variable predictor[i] and paste.
I would like to have the genus name in italic. However all my attempts
using expression and substitute failed.
Does anybody know a solution?
Thanks a lot in advance. bernd


Acrobeles <- c(65.1,0.0,0.0,0.0,0.0,0.0)
Acrobeloides <- c(0.0,9.8,76.7,51.1,93.9,43.9)
Alaimus <- c(0.0,4.9,0.0,0.0,0.0,6.3)
Aphelenchoides <- c(126.5,29.3,76.7,134.1,176.7,87.9)

x <- data.frame(Acrobeles,Acrobeloides,Alaimus,Aphelenchoides)

predictor <- colnames(x)
ylabel <- "Numerical abundance"
mainlabel1 <- "Boxplot for"
mainlabel2 <- "sp."
cexalabel <- 1.8 # axis label
cexmlabel <- 1.6 # main label

par(oma=c(6,6,3,3),mar = c(6, 4, 4, 2) + 0.1,mfrow=c(2,2))

for (i in 1:ncol(x)){
  boxplot(x[,i],
    main=paste(mainlabel1,predictor[i],mainlabel2),ylab=paste(ylabel),
    cex.lab=cexalabel,cex.main=cexmlabel,cex.axis=1.5)
}



---

Bernd Panassiti

National Institute of Public Health & the Environment (RIVM)
Laboratory for Ecological Risk Assessment (LER)
P.O. Box 1
3720 BA Bilthoven
The Netherlands
e-mail: bernd.panass...@rivm.nl
tel. +31 30 274 3647

Radboud University Nijmegen
Department of Environmental Science
b.panass...@science.ru.nl







--
Peter Ehlers
University of Calgary


Re: [R] fit a gamma pdf using Residual Sum-of-Squares

2010-03-08 Thread Matthew Dowle

Thanks for making it quickly reproducible - I was able to see that message 
in English within a few seconds.
The start has x=86, but the data is also called x.  Remove x=86 from start 
and you get a different error.
P.S. - please do include the R version information. It saves time for us, 
and we like it if you save us time.

vincent laperriere vincent_laperri...@yahoo.fr wrote in message 
news:883644.16455...@web24106.mail.ird.yahoo.com...
Hi all,

I would like to fit a gamma pdf to my data using the method of RSS (Residual 
Sum-of-Squares). Here are the data:

 x <- c(86,  90,  94,  98, 102, 106, 110, 114, 118, 122, 126, 130, 134, 138, 
142, 146, 150, 154, 158, 162, 166, 170, 174)
 y <- c(2, 5, 10, 17, 26, 60, 94, 128, 137, 128, 77, 68, 65, 60, 51, 26, 17, 
9, 5, 2, 3, 7, 3)

I have typed the following code, using nls method:

fit <- nls(y ~ (1/((s^a)*gamma(a))*x^(a-1)*exp(-x/s)), start = c(s=3, a=75, 
x=86))

But I have the following message error (sorry, this is in German):


Fehler in qr(.swts * attr(rhs, "gradient")) :
  Dimensionen [Produkt 3] passen nicht zur Länge des Objektes [23]
Zusätzlich: Warnmeldung:
In .swts * attr(rhs, "gradient") : Länge des längeren Objektes
  ist kein Vielfaches der Länge des kürzeren Objektes

(In English: Error in qr(.swts * attr(rhs, "gradient")) : dims [product 3]
do not match the length of object [23]. In addition: Warning message:
longer object length is not a multiple of shorter object length.)

Could anyone help me with the code?
I would greatly appreciate it.
Sincerely yours,
Vincent Laperrière.




Re: [R] black cluster in salt and pepper image

2010-03-08 Thread Gregoire Pau


Hello,

The function bwlabel() in the Bioconductor package EBImage extracts the 
connected components of an image. Denoting your binary matrix by x, the 
following code gives you the 10 largest clusters (by size).


 library(EBImage)
 y = bwlabel(x)
 sort(table(y), decreasing=TRUE)[1:10]

See http://www.bioconductor.org/packages/release/bioc/html/EBImage.html 
for how to download/install EBImage.


Best regards,

Greg
---
Gregoire Pau
EMBL Research Officer
http://www.ebi.ac.uk/~gpau/


Sylvain Sardy wrote:

Hi,

on a lattice, I have binary 0/1 data. 1s are rare and may form clusters. 
I would like to know the size/length of the largest cluster. Any help warmly welcome,

Sylvain.


Re: [R] Is there an equivalence of lm's anova for an rpart object ?

2010-03-08 Thread Tal Galili

Thanks Liaw!

I just implemented it using tapply:
tapply(fit$splits[, "improve"], rownames(fit$splits), sum)
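
A self-contained version of this, as a rough sketch using the kyphosis data shipped with rpart (the object name imp is illustrative). As far as I understand the rpart.object documentation, fit$splits also lists competing and surrogate splits, and for surrogate rows the improve column holds an agreement measure rather than an improvement, so the totals below mix the two and should be read as a crude indication only. More recent versions of rpart also store a variable.importance component in the fitted object, which may be worth comparing.

```r
library(rpart)

# Classification tree on the kyphosis data that ships with rpart
fit <- rpart(Kyphosis ~ Age + Number + Start,
             method = "class", data = kyphosis)

# Sum the "improve" column of the splits matrix by splitting variable;
# the row names of fit$splits are the variable names.
imp <- tapply(fit$splits[, "improve"], rownames(fit$splits), sum)
sort(imp, decreasing = TRUE)
```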

If you can reference me to any other source / example and so on - it would
be great.  but either way - you helped me a lot, thank you !

Tal


Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Mon, Mar 8, 2010 at 4:52 PM, Liaw, Andy andy_l...@merck.com wrote:

 One way to do it (no p-values) is explained in the original CART book.
 You basically add up all the improvement (in fit$splits[, "improve"])
 due to each splitting variable.

 Andy

 From: Tal Galili
 
  Simple example:
 
  # Classification Tree with rpart
 
  library(rpart)
 
  # grow tree
 
  fit <- rpart(Kyphosis ~ Age + Number + Start,
   method="class", data=kyphosis)
 
  Now I would like to know how can I measure the importance
  of each of my
  three explanatory variables (Age, Number, Start) in the model?
 
  If this was a regression model, I could have looked at p
  values from the
  anova F test (between lm models with and without the
  variable). But what
  is the equivalence of using anova on lm to an rpart object ?
 
  Any pointers, insights and references to this question will
  be helpful.
 
  Thanks,
 
  Tal
 
 
 

Re: [R] fit a gamma pdf using Residual Sum-of-Squares


On 2010-03-08 8:24, vincent laperriere wrote:

Hi all,

I would like to fit a gamma pdf to my data using the method of RSS (Residual 
Sum-of-Squares). Here are the data:

  x <- c(86,  90,  94,  98, 102, 106, 110, 114, 118, 122, 126, 130, 134, 138, 
142, 146, 150, 154, 158, 162, 166, 170, 174)
  y <- c(2, 5, 10, 17, 26, 60, 94, 128, 137, 128, 77, 68, 65, 60, 51, 26, 17, 9, 
5, 2, 3, 7, 3)

I have typed the following code, using nls method:

fit <- nls(y ~ (1/((s^a)*gamma(a))*x^(a-1)*exp(-x/s)), start = c(s=3, a=75, 
x=86))



There are a couple of problems:
1) don't include a start value for x; it's not a 'parameter';
2) you're trying to fit a *density* function to data that's
   clearly not normalized.

A quick check shows that your empirical curve integrates to
about 4000:

 sum(y[-1] * diff(x))
 # 3992

This almost works:

 fit <- nls(y ~ 4000*(1/((s^a)*gamma(a))*x^(a-1)*exp(-x/s)),
   start = c(s=3, a=75))

but not quite; I still get an error. So let's do the right
thing and plot the data and some test fits:

 plot(y ~ x)
 curve(4000 * dgamma(x, shape=75, scale=3), add=TRUE)
 # no good
 curve(4000 * dgamma(x, shape=75, scale=1), add=TRUE)
 # no good
 curve(4000 * dgamma(x, shape=75, scale=1.6), add=TRUE)
 # pretty good!

 fit <- nls(y ~ 4000*(1/((s^a)*gamma(a))*x^(a-1)*exp(-x/s)),
   start = c(s=1.6, a=75))

 coef(fit)
 #        s         a
 # 1.399638 86.395409

 xx <- seq(86, 174, length=100)
 yy <- predict(fit, data.frame(x=xx))
 plot(y ~ x)
 lines(yy ~ xx, col='red')

 -Peter Ehlers



But I have the following message error (sorry, this is in German):


Fehler in qr(.swts * attr(rhs, "gradient")) :
   Dimensionen [Produkt 3] passen nicht zur Länge des Objektes [23]
Zusätzlich: Warnmeldung:
In .swts * attr(rhs, "gradient") : Länge des längeren Objektes
   ist kein Vielfaches der Länge des kürzeren Objektes

Could anyone help me with the code?
I would greatly appreciate it.
Sincerely yours,
Vincent Laperrière.





--
Peter Ehlers
University of Calgary


[R] Combinations and table selection problem (reviewed)

2010-03-08 Thread Carlos Guerra

Dear all,

I have the following dataset:

t <- structure(list(pUrb = c(20.160307, 51.965649, 26.009581, 3.141484, 
64.296826
), pUrb_class = structure(c(1L, 1L, 1L, 1L, 1L), .Label = c("0", 
"1"), class = "factor"), pAgri = c(79.921386, 46.657713, 
40.269204, 0, 0.440691), pAgri_class = structure(c(1L, 1L, 
1L, 1L, 1L), .Label = c("0", "1"), class = "factor"), pNatFor = c(0, 
0, 0, 0, 0), pNatFor_class = structure(c(1L, 1L, 1L, 1L, 
1L), .Label = c("0", "1"), class = "factor"), pArtFor = c(0, 
0, 0, 0, 24.566125), pArtFor_class = structure(c(1L, 1L, 
1L, 1L, 1L), .Label = c("0", "1"), class = "factor"), pMixFor = c(0, 
0, 33.578923, 96.940185, 10.655666), pMixFor_class = structure(c(1L, 
1L, 1L, 2L, 1L), .Label = c("0", "1"), class = "factor"), 
pPioMo = c(0, 0, 0, 0, 0), pPioMo_class = structure(c(1L, 
1L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor"), 
SwiLU = c(0.565419, 0.937982, 1.071812, 0.056002, 0.831812
), SwiLU_class = structure(c(1L, 2L, 2L, 1L, 2L), .Label = c("1", 
"0", "2"), class = "factor"), NumP = c(4L, 7L, 3L, 4L, 6L
), NumP_class = structure(c(2L, 3L, 1L, 2L, 3L), .Label = c("1", 
"0", "2"), class = "factor"), Roaddist = c(1615.55, 2140.09, 
2308.68, 2088.06, 2000), Roaddist_class = c(NA, NA, NA, NA, 
NA), Roaddens = c(0, 0, 0, 0, 0), Roaddens_class = c(NA, 
NA, NA, NA, NA), SwiSlo = c(0, 0, 0, 0, 0), SwiSlo_class = structure(c(1L, 
1L, 1L, 1L, 1L), .Label = c("1", "2", "0", "3"), class = "factor"), 
SwiAlt = c(0, 0, 0, 0, 0), SwiAlt_class = structure(c(1L, 
1L, 1L, 1L, 1L), .Label = c("1", "2", "0", "3"), class = "factor")), .Names 
= c( 
"pUrb", "pUrb_class", "pAgri", "pAgri_class", "pNatFor", "pNatFor_class", 
"pArtFor", "pArtFor_class", "pMixFor", "pMixFor_class", "pPioMo", 
"pPioMo_class", "SwiLU", "SwiLU_class", "NumP", "NumP_class", 
"Roaddist", "Roaddist_class", "Roaddens", "Roaddens_class", "SwiSlo", 
"SwiSlo_class", "SwiAlt", "SwiAlt_class"), row.names = c("NF1885", "NF1886", 
"NF1893", "NF1894", "NF1895"), class = "data.frame")


#obtaining the number of combinations present in the data base#

library(survival)

w <- strata(t$NumP_class,t$SwiLU_class,t$pMixFor_class, sep=",")
table(w)
w
t$NumP_class=1,t$SwiLU_class=0,t$pMixFor_class=0 
   1 
t$NumP_class=0,t$SwiLU_class=1,t$pMixFor_class=0 
   1 
t$NumP_class=0,t$SwiLU_class=1,t$pMixFor_class=1 
   1 
t$NumP_class=2,t$SwiLU_class=0,t$pMixFor_class=0 
   2

#obtaining median value#

median(table(w))
[1] 1


At this stage I have 3 questions:

1st:
how can I obtain the combinations which are present over the median (in this 
case the fourth combination)?

2nd:
how can I obtain the combinations which are present over the median and have at 
least one condition  2 (in this case the fourth combination)?

3rd:
how can I select/extract from the original table the rows which comply with the 
2nd question, in this case (row id: NF1895):


pUrb pUrb_class pAgri pAgri_class pNatFor pNatFor_class  
pArtFor pArtFor_class  pMixFor
NF1895 64.296826  0  0.440691   0   0 0 
24.56612 0 10.65567
   pMixFor_class pPioMo pPioMo_classSwiLU SwiLU_class NumP NumP_class 
Roaddist Roaddist_class Roaddens
NF1895 0  00 0.831812   06  2  
2000.00 NA0
   Roaddens_class SwiSlo SwiSlo_class SwiAlt SwiAlt_class
NF1895 NA  01  01




Best Regards,

Carlos

[R] how to convert character variables into numeric variables directly

2010-03-08 Thread Xumin Zeng

Here is the example.

 age=18:29
 height=c(76.1,77,78.1,78.2,78.8,79.7,79.9,81.1,81.2,81.8,82.8,83.5)
 type=c("A", "B", "C", "D","A", "B", "C", "D","A", "B", "C", "D")
 typec=c("0","4","2","9","0","7","2","3","0","1","2","3")
 typen=c(0,1,2,3,0,1,2,3,0,1,2,3)
 data1=data.frame(age=age,height=height, type=type, typec=typec, 
typen=typen)

 data1[,3]=as.numeric(data1[,3])
 data1[,4]=as.numeric(data1[,4])
 data1[,5]=as.numeric(data1[,5])

 print(data1)

and I got the output as:

   age height type typec typen
1   18   76.1    1     1     0
2   19   77.0    2     5     1
3   20   78.1    3     3     2
4   21   78.2    4     7     3
5   22   78.8    1     1     0
6   23   79.7    2     6     1
7   24   79.9    3     3     2
8   25   81.1    4     4     3
9   26   81.2    1     1     0
10  27   81.8    2     2     1
11  28   82.8    3     3     2
12  29   83.5    4     4     3

The typec is not what I expected. How can I get the direct conversion 
from character to numeric and get the following output? 

   age height type typec typen
1   18   76.1    1     0     0
2   19   77.0    2     4     1
3   20   78.1    3     2     2
4   21   78.2    4     9     3
5   22   78.8    1     0     0
6   23   79.7    2     7     1
7   24   79.9    3     2     2
8   25   81.1    4     3     3
9   26   81.2    1     0     0
10  27   81.8    2     1     1
11  28   82.8    3     2     2
12  29   83.5    4     3     3

Thanks.

Xumin

[R] questions

2010-03-08 Thread amitava


Can somebody tell me how I can calculate derivatives of some functions
in R, citing some examples?
How can I use Taylor's expansion formula in an R script to handle complicated
functions which cannot be handled properly by hand?
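
As a starting point, base R can differentiate simple expressions symbolically with D() and deriv(). A small sketch, with the function x^2 * sin(x) and the expansion point x0 = 1 chosen purely for illustration; the second-order Taylor approximation is built by hand from the derivatives:

```r
f_expr <- expression(x^2 * sin(x))

d1 <- D(f_expr[[1]], "x")  # first derivative, returned as an unevaluated call
d2 <- D(d1, "x")           # second derivative

# Evaluate the function and its derivatives at the expansion point
x0 <- 1
f0 <- eval(f_expr[[1]], list(x = x0))
f1 <- eval(d1, list(x = x0))
f2 <- eval(d2, list(x = x0))

# Second-order Taylor polynomial around x0
taylor2 <- function(x) f0 + f1 * (x - x0) + f2 * (x - x0)^2 / 2

taylor2(1.1)                      # approximation near x0
eval(f_expr[[1]], list(x = 1.1))  # exact value for comparison
```

D() only knows a limited table of functions, so for more complicated expressions a numerical approach may be needed instead.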
-- 
View this message in context: 
http://n4.nabble.com/questions-tp1584771p1584771.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] how to convert character variables into numeric variables directly

2010-03-08 Thread milton ruser

Try: as.numeric(as.character( typec))
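
Why this works, as a small sketch: when data.frame() stores a character column as a factor (the default before R 4.0.0), as.numeric() returns the internal level codes rather than the values. The vector below mirrors typec from the question; the names codes and values are illustrative:

```r
typec <- factor(c("0", "4", "2", "9", "0", "7", "2", "3", "0", "1", "2", "3"))

levels(typec)                       # sorted unique labels

# Level codes: position of each label within levels(typec)
codes <- as.numeric(typec)

# Going through character first recovers the actual numbers
values <- as.numeric(as.character(typec))

# An equivalent that converts the levels once and then indexes by code
values2 <- as.numeric(levels(typec))[typec]
```

Here codes reproduces the unexpected typec column from the question, while values and values2 give the intended numbers.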

milton

Re: [R] how to convert character variables into numeric variables directly

2010-03-08 Thread Xumin Zeng

Thanks, it works!

Xumin





Re: [R] Help with Hmisc, cut2, split and quantile


On 2010-03-08 8:47, Guy Green wrote:


Hello,
I have a set of data with two columns: Target and Actual.  A
http://n4.nabble.com/file/n1584647/Sample_table.txt Sample_table.txt  is
attached but the data looks like this:

Actual  Target
-0.125  0.016124906
0.135   0.120799865
... ...
... ...

I want to be able to break the data into tables based on quantiles in the
Target column.  I can see (using cut2, and also quantile) how to get the
barrier points between the different quantiles, and I can see how I would
achieve this if I was just looking to split up a vector.  However I am
trying to break up the whole table based on those quantiles, not just the
vector.

The following code shows me the ranges for the deciles of the Target data:
library(Hmisc)
read_data = read.table("C:/Sample table.txt", header = TRUE)
table(cut2(read_data$Target, g = 10))

However I would like to be able to break the table into ten separate tables,
each with both Actual and Target data, based on the Target data
deciles:

top_decile = ...(top decile of read_data, based on Target data)
next_decile = ...and so on...
bottom_decile = ...


I would just add a factor variable indicating to which decile
a particular observation belongs:

 dat$DEC <- with(dat, cut(Target, breaks=10, labels=1:10))

If you really want to have separate data frames you can then
split on the decile:

 L <- split(dat, dat$DEC)
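
One caveat: cut() with breaks=10 divides the range of Target into ten equal-width intervals, which are generally not deciles. A hedged sketch of splitting on quantile-based breaks instead, using simulated data in place of the attached file (dat, brks and the seed are illustrative assumptions):

```r
set.seed(1)
# Simulated stand-in for the real table
dat <- data.frame(Actual = rnorm(200), Target = rnorm(200))

# Break points at the 0%, 10%, ..., 100% quantiles of Target
brks <- quantile(dat$Target, probs = seq(0, 1, by = 0.1))

# include.lowest = TRUE keeps the minimum value in the first decile
dat$DEC <- cut(dat$Target, breaks = brks, labels = 1:10,
               include.lowest = TRUE)

# One data frame per decile, each with both Actual and Target
L <- split(dat, dat$DEC)
sapply(L, nrow)   # rows per decile
```

L[["10"]] would then be the top decile and L[["1"]] the bottom one. With heavily tied data the quantile breaks may not be unique, in which case cut() stops with an error and Hmisc::cut2(Target, g = 10) may be the more convenient choice.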


   -Peter Ehlers



That way I could manipulate the deciles, graph them separately (and
together) and so on, just as easily as I can the whole table.  I'm sure this
must be simple, but I can't see the way forward.  I have also looked at
split() and quantile() but have not been able to get them to achieve what I
am after.  Can anybody see a simple way forward on this?

Thanks,
Guy


--
Peter Ehlers
University of Calgary


Re: [R] Working with combinations

2010-03-08 Thread Erich Neuwirth

I had a bug in my last solution.
This one should work.

next.combin <- function(oldcomb,n){
  lcomb <- length(oldcomb)
  hole.pos <- last.hole.pos(oldcomb,n)
  if ((hole.pos == lcomb) && oldcomb[lcomb]==n) {
    return(NA)
  }
  newcomb <- oldcomb
  newcomb[hole.pos:lcomb] <- oldcomb[hole.pos]+(1:(lcomb-hole.pos+1))
  return(newcomb)
}

last.hole.pos <- function(comb,n){
  lcomb <- length(comb)
  if (comb[lcomb]<n) {
    return(lcomb)
  }
  diffs <- comb[-1]-comb[-lcomb]
  if (max(diffs)==1) {
    return(lcomb)
  }
  diffpos <- which(diffs>1)
  return(diffpos[length(diffpos)])
}

demo <- function(n,k){
  currCombin <- 1:k
  #this is the first possible combination k out of n.
  selectedSet <- NULL
  while (!any(is.na(currCombin))){
#  if(currCombin passes test) {
    selectedSet <- rbind(selectedSet,currCombin)
#  }
    currCombin <- next.combin(currCombin,n)
  }
  rownames(selectedSet) <- NULL
  selectedSet
}
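
To illustrate the "test each combination as it is generated" pattern discussed in this thread, here is a self-contained sketch. It does not reuse the functions above; next_comb and filtered_combinations are hypothetical names, and the successor rule is the textbook lexicographic one (increment the rightmost position that still has room, then reset everything to its right).

```r
# Lexicographic successor of an increasing k-subset of 1:n, or NULL at the end
next_comb <- function(comb, n) {
  k <- length(comb)
  i <- k
  # Walk left past positions that already hold their largest possible value
  while (i >= 1 && comb[i] == n - k + i) i <- i - 1
  if (i < 1) return(NULL)
  comb[i:k] <- comb[i] + seq_len(k - i + 1)
  comb
}

# Step through all combinations, keeping only those that pass keep()
filtered_combinations <- function(n, k, keep) {
  out <- list()
  comb <- seq_len(k)
  while (!is.null(comb)) {
    if (keep(comb)) out[[length(out) + 1L]] <- comb
    comb <- next_comb(comb, n)
  }
  do.call(rbind, out)
}

# Example filter: keep pairs from 1:5 whose sum is even
even_sum <- filtered_combinations(5, 2, function(cmb) sum(cmb) %% 2 == 0)
all_pairs <- filtered_combinations(5, 2, function(cmb) TRUE)
```

Only the combinations that pass the test are stored, which is the point of single-stepping instead of building the full table with combn() first.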

On Mar 6, 2010, at 9:50 PM, Herm Walsh wrote:

 The usage here is exactly what I am looking for, thanks.  However this 
 function seems to omit some combinations.  Continuing with the example below, 
 given an always true condition in #***, it will only produce 7 combinations 
 (omitting 1,5 1,4 and 2,5).
 Am I overlooking something that makes it produces all of the combinations?
 
  
 
 From: Erich Neuwirth erich.neuwi...@univie.ac.at
 To: Herm Walsh hermwa...@yahoo.com; r-help r-help@r-project.org
 Sent: Sat, March 6, 2010 9:12:13 AM
 Subject: Re: [R] Working with combinations
 
 currCombin <- c(1,2) #this is the first possible combination 2 out of 5. 
 Since the vector has length 2, we are doing 2 out of x here.
 selectedSet <- NULL
 while (!is.na(currCombin)){
if( #*** currCombin passes test ***#   ) {
   selectedSet <- rbind(selectedSet,currCombin)
}
  currCombin <- next.combin(currCombin,5) #here we say that we want 2 out of 5
  }
 
 
 
 On Mar 6, 2010, at 5:54 PM, Herm Walsh wrote:
 
 Erich-
 This approach would be great for my context.  However, in the code below I 
 do not see how to restrict the output to the set of combinations I am 
 looking for.  For example, suppose I am looking for the 10 two element 
 combinations of 1:5.  Can you give me some pseudocode that shows how to do 
 this?
 After putting the code below in a loop I do not see how to restrict the 
 output to containing only numbers from 1:5.
 thanks very much.
 
 From: Erich Neuwirth erich.neuwi...@univie.ac.at
 To: Herm Walsh hermwa...@yahoo.com
 Cc: David Winsemius dwinsem...@comcast.net; r-help@r-project.org
 Sent: Wed, March 3, 2010 2:10:34 PM
 Subject: Re: [R] Working with combinations
 
 The following code takes a combination of type n over k represented by an 
 increasing sequence
 as input and produces the lexicographically next combination.
 So you can single step through all possible combinations and apply your 
 filter criteria
 before you produce the next combination.
 
 
 next.combin <- function(oldcomb,n){
   lcomb <- length(oldcomb)
   hole.pos <- last.hole.pos(oldcomb,n)
   if ((hole.pos == lcomb) && oldcomb[lcomb]==n) {
     return(NA)
   }
   newcomb <- oldcomb
   newcomb[hole.pos] <- oldcomb[hole.pos]+1
   return(newcomb)
 }
 
 last.hole.pos <- function(comb,n){
   lcomb <- length(comb)
   diffs <- comb[-1]-comb[-lcomb]
   if (max(diffs)==1) {
     return(lcomb)
   } 
   diffpos <- which(diffs>1)
   return(diffpos[length(diffpos)])  
 }
 
 On Mar 3, 2010, at 7:35 PM, Herm Walsh wrote:
 
 Thanks David for the thoughts.  The challenge I have with this approach is 
 that the criteria I have is defined by a series of tests--which I do not 
 think I could substitute in in place of the logical indexing.
 
 In the combinations code I was hoping there is a step where, each new 
 combination is added to the current list of combinations.  If this were the 
 case, I could put my series of tests in the code right there and then store 
 the combination if appropriate.
 
 However, evaluating the code--which uses recursion--I am not sure if this 
 approach will work.  The combinations code is listed below.  Is there a 
 simple place(s) where I could insert my tests, operating on the current 
 combination?
 
 function (n, r, v = 1:n, set = TRUE, repeats.allowed = FALSE) 
 {
     if (mode(n) != "numeric" || length(n) != 1 || n < 1 || (n%%1) != 
         0) 
         stop("bad value of n")
     if (mode(r) != "numeric" || length(r) != 1 || r < 1 || (r%%1) != 
         0) 
         stop("bad value of r")
     if (!is.atomic(v) || length(v) < n) 
         stop("v is either non-atomic or too short")
     if ((r > n) & repeats.allowed == FALSE) 
         stop("r > n and repeats.allowed=FALSE")
     if (set) {
         v <- unique(sort(v))
         if (length(v) < n) 
             stop("too few different elements")
     }
     v0 <- vector(mode(v), 0)
     if (repeats.allowed) 
         sub <- function(n, r, v) {
             if (r == 0) 
                 v0
             else if (r == 1) 
                 matrix(v, n, 1)
             else if (n == 1)

[R] Executable for Production Use

2010-03-08 Thread Ma Ismail - NewYork-MEAG-NY

Hi,

A few of the developers on our Quant team are using R for data calculation and 
to generate a resulting CSV file.  They have R installed on their workstations. 
 We are interested in having this deployed to user workstations where the users 
will not have R installed on their workstations.  Is there a way to create an 
executable that the users can just run without R installed on their workstation?

Thanks in advance for your help.

Ismail Ma
Head of IT Applications
MEAG New York Corporation
Telephone: (212) 583-4850
Fax: (646) 521-7950
E-Mail: i...@meag-ny.com


Re: [R] Executable for Production Use

2010-03-08 Thread Barry Rowlingson

On Mon, Mar 8, 2010 at 6:44 PM, Ma Ismail - NewYork-MEAG-NY
i...@meag-ny.com wrote:
 Hi,

 A few of the developers on our Quant team are using R for data calculation 
 and to generate a resulting CSV file.  They have R installed on their 
 workstations.  We are interested in having this deployed to user workstations 
 where the users will not have R installed on their workstations.  Is there a 
 way to create an executable that the users can just run without R installed 
 on their workstation?

 No, not for any meaningful value of the word 'installed'. If you want
to run R, you need the R interpreter and all the functions and stuff
it carries around with it. Maybe you could bundle this all up into one
monster executable, but why bother? R install is easy and portable
(stick it anywhere, it runs). Even probably a network share.

 Oh, you didn't say what OS you were using. Did you read the posting
guide? Also, this has been asked before. Did you search the mailing
lists?

 I've noticed a lot of financial corporates getting into R and getting
free help from R-help. From now on, I'm charging $50 per email to
answer questions from anyone advertising as a 'quant'. Who do I
invoice?

Barry


Re: [R] using sprintf to pass a variable to a RMySQL query

2010-03-08 Thread Don MacQueen


I always use paste()

i <- 1
sqlcmd_ScaffLen <- paste("SELECT scaffold.length
FROM scaffold, scaffold2contig, contig2read
WHERE scaffold.scaffold_id=scaffold2contig.scaffold_id AND
scaffold2contig.contig_id=contig2read.contig_id AND
contig2read.read_id LIKE '%MG", i, "%'", sep='')

That should create bits like
   LIKE '%MG1%'
   LIKE '%MG2%'
and so on.

You just have to get the nesting of the single and double quotes 
correct - the SQL requires single quotes, so use double quotes for 
the fixed character strings inside paste(). That, and use sep='' to 
get rid of unwanted space characters.


Using paste is also effective for constructs like
  IN (3,4,5)
or
  IN ('a','b','c')
though it can be necessary to nest one paste within another
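
For completeness, the sprintf() route asked about in the original post can also work: inside a sprintf format a literal percent sign is written as %%, so the SQL wildcards have to be doubled. The following is a minimal sketch; make_query and queries are illustrative names, and the table and column names are simply copied from the post.

```r
# "%%" in the format yields a literal "%"; "%d" is replaced by the integer i
make_query <- function(i) {
  sprintf("SELECT scaffold.length
FROM scaffold, scaffold2contig, contig2read
WHERE scaffold.scaffold_id=scaffold2contig.scaffold_id AND
scaffold2contig.contig_id=contig2read.contig_id AND
contig2read.read_id LIKE '%%MG%d%%'", i)
}

# One query string per value of i
queries <- vapply(1:72, make_query, character(1))
```

Note that a pattern such as '%MG1%' also matches read ids containing MG10, MG11 and so on; whether that matters depends on how the ids are laid out.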

-Don

At 2:06 PM +0100 3/8/10, alison waller wrote:

Hello,

I am using RmySQL and would like to iterate through a few queries.

I would like to use sprintf but I think I'm having problems mixing and
matching the sprintf syntax and the SQL regex.

I have checked my sqlcmd and it works when I want to match %MG1%, but how
do I iterate for i in 1-72?  Escape characters?

thanks in advance

i <- 1
sqlcmd_ScaffLen <- sprintf('SELECT scaffold.length
FROM scaffold,scaffold2contig,contig2read
WHERE scaffold.scaffold_id=scaffold2contig.scaffold_id AND
scaffold2contig.contig_id=contig2read.contig_id AND contig2read.read_id LIKE
'%MG%s%' ,i)

= Here is my vague error message

Error: unexpected input in:




--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062


Re: [R] Monetary support to the R-project (Was: Re: Executable for Production Use)

2010-03-08 Thread Henrik Bengtsson

On Mon, Mar 8, 2010 at 8:46 PM, Barry Rowlingson
b.rowling...@lancaster.ac.uk wrote:
 On Mon, Mar 8, 2010 at 6:44 PM, Ma Ismail - NewYork-MEAG-NY
 i...@meag-ny.com wrote:
 Hi,

 A few of the developers on our Quant team are using R for data calculation 
 and to generate a

[snip]

  I've noticed a lot of financial corporates getting into R and getting
 free help from R-help. From now on, I'm charging $50 per email to
 answer questions from anyone advertising as a 'quant'. Who do I
 invoice?

For companies and others wondering how to give something back, it is
possible to support R and the R Foundation either through a donation:

http://www.r-project.org/ - Foundation - Donations
[http://www.r-project.org/foundation/donations.html]

or via a membership:

http://www.r-project.org/ - Foundation - Membership
[http://www.r-project.org/foundation/membership.html]

or both.

/Henrik


 Barry


Re: [R] Executable for Production Use

2010-03-08 Thread Stefan

Ma Ismail - NewYork-MEAG-NY ima at meag-ny.com writes:

 
 Hi,
 
 A few of the developers on our Quant team are using R for data 
calculation and to generate a resulting CSV file. They have R 
installed on their workstations. We are interested in having this 
deployed to user workstations where the users will not have R 
installed on their workstations. Is there a way to create an 
executable that the users can  just run without R installed on 
their workstation?
 
 Thanks in advance for your help.
 
 Ismail Ma
 Head of IT Applications
 MEAG New York Corporation
 Telephone: (212) 583-4850
 Fax: (646) 521-7950
 E-Mail: ima at meag-ny.com
 
 
 
 

Maybe I've misunderstood you, but you can manage a single R 
installation on a central server, and let clients hook up to 
it from Emacs+ESS or Eclipse+StatET. I've used both solutions, 
and they work like a charm... The Emacs solution uses SSH and 
Eclipse a Java server thingy.

Please note that I'm from the Linux side of things. So my server 
is a Debian, and my clients are all Ubuntu. I have no clue how 
(or even if) these setups work on Windows.


[R] likelihood

2010-03-08 Thread Ashta

Hi all,

Does anyone know how to write the likelihood function for the Poisson distribution
in R when  P(x=0).

 For the normal case, it can be written as follows,


  n  *  log(lambda)  -  lambda  *  n  *  mean(dat)



Any help is highly appreciated

Ashta
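
A sketch of the log-likelihood, assuming the question is about the zero-truncated Poisson (observations restricted to x > 0); the condition in the post is garbled, so that reading is a guess. Note that the ordinary Poisson log-likelihood, dropping the term that does not involve lambda, is n * mean(dat) * log(lambda) - n * lambda, so log(lambda) and lambda appear swapped in the expression quoted in the post. Function names and the sample counts are illustrative.

```r
# Ordinary Poisson log-likelihood, without the constant -sum(lfactorial(counts))
pois_loglik <- function(lambda, counts) {
  n <- length(counts)
  n * mean(counts) * log(lambda) - n * lambda
}

# Zero-truncated Poisson: each term is additionally divided by P(X > 0),
# which is 1 - exp(-lambda)
ztpois_loglik <- function(lambda, counts) {
  stopifnot(all(counts > 0))
  pois_loglik(lambda, counts) - length(counts) * log(1 - exp(-lambda))
}

# Made-up strictly positive counts, and a numerical maximum-likelihood estimate
counts <- c(1, 2, 2, 3, 1, 4, 2)
fit <- optimize(ztpois_loglik, interval = c(0.01, 20),
                counts = counts, maximum = TRUE)
lambda_hat <- fit$maximum
```

For the truncated model the estimate of lambda is smaller than the sample mean, because the missing zeros inflate the mean of what was observed.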


[R] error when using svm routine: Error in if (any(co)) { : missing value where TRUE/FALSE needed

2010-03-08 Thread Xumin Zeng

Hi,

I met with this error message with the following data set. Do you know how 
to resolve it? Thanks.


 data <- read.table("c://temp3//abc.csv", sep = ",", header=T)
 classwt <- c( 0.5806452, 0.4193548)
 y <- data[,1]
 x <- data[,2:ncol(data)]
 print(y)
 [1] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 1 1
[36] 1 1 1 1 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2
 print(x)
   rs2289472 rs1551398 rs7927894
1         CT        AA        CT
2         TT        AA        CC
3         TT        AG        TT
4         TT        AA        CT
5         CC        AA        CT
6         CT        AA        CT
7         TT        AA        CT
8         CT        AG        CT
9         CC        AA        CT
10        TT        AG        TT
11        CT        GG        CC
12        CT        GG        CC
13        CT        AG        CC
14        CT        AG        TT
15        CT        AA        CT
16        TT        AA        CT
17        CT        AG        CT
18        TT        AA        CT
19        TT        AA        TT
20        CT        AG        CT
21        CT        AG        TT
22        CT        AA        CT
23        TT        AA        CT
24        CC        AA        CC
25        CT        AG        CC
26        CT        AG        CT
27        CT        GG        CT
28        CT        AG        CC
29        CT        GG        CC
30        TT        GG        CT
31        CT        AG        CT
32        TT        AG        TT
33        CC        GG        CC
34        TT        AA        CT
35        CT        GG        TT
36        TT        AG        CT
37        CT        AG        CC
38        TT        AA        CC
39        TT        AA        TT
40        TT        AA        TT
41        CT        GG        CT
42        CT        AG        TT
43        TT        AA        CT
44        CT        AG        CC
45        TT        AG        CC
46        CT        AA        CC
47        CC        AA        CT
48        CT        AA        CT
49        CC        AA        CC
50        TT        AA        TT
51        TT        GG        CT
52        CT        AG        CT
53        TT        AG        TT
54        TT        AA        CT
55        CT        AA        CC
56        CT        AG        CT
57        CT        AG        CC
58        CT        AA        CC
59        CT        AG        CC
60        CT        AG        TT
61        TT        AA        CC
62        TT        GG        CT
 svm.fit=svm(y=as.factor(y),x=x,class.weights = classwt)
Error in if (any(co)) { : missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In FUN(newX[, i], ...) : NAs introduced by coercion
2: In FUN(newX[, i], ...) : NAs introduced by coercion
3: In FUN(newX[, i], ...) : NAs introduced by coercion
 


Xumin
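
A possible explanation, offered as a sketch and not a confirmed diagnosis: the x/y interface of svm() in e1071 expects a numeric matrix, and genotype strings such as "CT" become NA when coerced to numbers, which would match the "NAs introduced by coercion" warnings. One way around this is to dummy-code the genotypes first. The toy data below are made up, and the svm() call is only attempted if e1071 is installed.

```r
# Toy genotype data in the same shape as the post (values are made up)
geno <- data.frame(
  rs1 = factor(c("CT", "TT", "CC", "CT", "TT", "CC")),
  rs2 = factor(c("AA", "AG", "GG", "AA", "AG", "GG"))
)
y <- factor(c(1, 1, 1, 2, 2, 2))

# Coercing genotype strings to numbers yields NA, as in the warnings
bad <- suppressWarnings(as.numeric(as.character(geno$rs1)))

# Dummy-code the factors into a numeric design matrix (no intercept column)
x_num <- model.matrix(~ . - 1, data = geno)

if (requireNamespace("e1071", quietly = TRUE)) {
  # class.weights is a named vector; names must match the levels of y
  classwt <- c("1" = 0.5, "2" = 0.5)
  svm.fit <- e1071::svm(x = x_num, y = y, class.weights = classwt)
}
```

Alternatively, the formula interface, svm(y ~ ., data = ...), with the genotype columns stored as factors, should do the dummy coding internally.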

[R] Making FTP operations with R

2010-03-08 Thread Orvalho Augusto

Dear all, I need to perform some very basic FTP operations with R.

I need to issue many get commands, each followed by a corresponding delete
command, all on the same connection.

How can I do that?

Thanks in advance

Caveman


[R] why this function does not run correctly?

2010-03-08 Thread Marco Bressan

Hi, 
my name is Marco Bressan and I'm working to improve the ADati package. I study 
psychology at Padua University (Italy). I have this problem: why does the 
bartlett.test function run fine while my anova.welch function does not? I don't 
understand why the function I wrote gives me the error shown at the end.

this is anova.welch:

anova.welch <- function(x, ...) UseMethod("anova.welch")

anova.welch.default <-
## this is the algorithm, I think it's ok (I copied this from Welch() inside ADati)
function (x, y = NULL, nu = c(0,0) ,...)
{
mx <- tapply(x,y,mean)
s2x <- tapply(x,y,var)
k <- length(nu)
w <- nu/s2x
Xp <- sum(w * mx)/sum(w)
Fnum <- sum(w * (mx - Xp)^2)/(k - 1)
Fden <- 0
for (h in 1:k) {
a <- 1/(nu[h] - 1)
b <- (1 - (w[h]/sum(w)))^2
Fden <- Fden + (a * b)
}
gl2.den <- Fden * 3
Fden <- Fden * ((2 * (k - 2))/(k^2 - 1)) + 1
Fw <- Fnum/Fden
STATISTIC <- Fw
gl1 <- k - 1
gl2.num <- k^2 - 1
gl2 <- gl2.num/gl2.den
PARAMETER <- c(gl1, gl2)
PVAL <- pf(Fw, gl1, gl2, lower.tail = FALSE)
METHOD <- "Welch ANOVA"
DNAME <- NA
names(STATISTIC) <- "F"
names(PARAMETER) <- c("num df", "denom df")
RVAL <- list(statistic = STATISTIC, parameter = PARAMETER, 
p.value = PVAL, method = METHOD)
class(RVAL) <- "htest"
return(RVAL)
}

anova.welch.formula <-  
  ## I copied this from bartlett.test
function(formula, data, subset, na.action, ...)
{
if(missing(formula) || (length(formula) != 3L))
stop("'formula' missing or incorrect")
m <- match.call(expand.dots = FALSE)
if(is.matrix(eval(m$data, parent.frame())))
m$data <- as.data.frame(data)
m[[1L]] <- as.name("model.frame")
mf <- eval(m, parent.frame())
DNAME <- paste(names(mf), collapse = " by ")
names(mf) <- NULL
y <- do.call("anova.welch", as.list(mf))
y$data.name <- DNAME
y
}

 n1=10 
  ## this is the test
 n2=15
 n3=20
 y=c(rnorm((n1+n2),5,2),rnorm(n3,7,8))
 A=factor(c(rep(1,n1),rep(2,n2),rep(3,n3)))
 anova.welch(y,A,c(n1,n2,n3))

Welch ANOVA

data:  
F = 2.3025, num df = 2.000, denom df = 27.384, p-value = 0.1191

 anova.welch(y~A,nu=c(n1,n2,n3))
Error in model.frame.default(formula = y ~ A, ... = list(nu = c(n1, n2,  : 
  invalid type (pairlist) for variable '(...)'


Sorry for my English, and thank you if you can help me :)
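
A likely cause, sketched here under assumptions: match.call(expand.dots = FALSE) stores the extra arguments (here nu) in an element named "..." of the call, and that element is then handed on to model.frame(), which produces the "invalid type (pairlist)" error. Dropping it before building the model frame, and passing the extra arguments on to the default method separately, avoids this. The default method below is a condensed rewrite of the one in the post, not the ADati code itself.

```r
anova.welch <- function(x, ...) UseMethod("anova.welch")

# Welch's heteroscedastic one-way ANOVA; nu holds the group sizes, as in the post
anova.welch.default <- function(x, y, nu, ...) {
  k   <- length(nu)
  mx  <- tapply(x, y, mean)
  w   <- nu / tapply(x, y, var)
  Xp  <- sum(w * mx) / sum(w)
  tmp <- sum((1 - w / sum(w))^2 / (nu - 1))
  Fw  <- (sum(w * (mx - Xp)^2) / (k - 1)) / (1 + 2 * (k - 2) / (k^2 - 1) * tmp)
  df2 <- (k^2 - 1) / (3 * tmp)
  structure(list(statistic = c(F = Fw),
                 parameter = c("num df" = k - 1, "denom df" = df2),
                 p.value = pf(Fw, k - 1, df2, lower.tail = FALSE),
                 method = "Welch ANOVA", data.name = NA_character_),
            class = "htest")
}

anova.welch.formula <- function(formula, data, subset, na.action, ...) {
  if (missing(formula) || length(formula) != 3L)
    stop("'formula' missing or incorrect")
  m <- match.call(expand.dots = FALSE)
  m$... <- NULL                      # keep nu and friends away from model.frame
  m[[1L]] <- quote(stats::model.frame)
  mf <- eval(m, parent.frame())
  DNAME <- paste(names(mf), collapse = " by ")
  names(mf) <- NULL
  # Response and grouping factor positionally, extra arguments by name
  res <- do.call("anova.welch", c(as.list(mf), list(...)))
  res$data.name <- DNAME
  res
}

set.seed(1)
n <- c(10, 15, 20)
y <- c(rnorm(n[1] + n[2], 5, 2), rnorm(n[3], 7, 8))
A <- factor(rep(1:3, times = n))
res <- anova.welch(y ~ A, nu = n)
```

The same statistic is available in base R as oneway.test(y ~ A), which does not assume equal variances by default, so it can serve as a cross-check.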

Re: [R] setClass or setValidity?

2010-03-08 Thread Martin Morgan

On 03/08/2010 07:18 AM, Albert-Jan Roskam wrote:
 Sorry: there was an error in the last sentence:
 And, inside those validity checks, is most of the checking done with 'if' 
 'else' computations, or is it also common to use try()?

For me it's a matter of taste, and usual to use if... (because you know
explicitly what you're trying to validate, whereas try() implies a kind
of 'something might go wrong...'). I find myself using setValidity() to
 separate out class definition from implementation.
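
As an illustration of the style described above (explicit if checks, with validity kept apart from the class definition), here is a minimal sketch. The Interval class and its slots are invented for the example.

```r
library(methods)

# Class definition: structure only
setClass("Interval", representation(lo = "numeric", hi = "numeric"))

# Validity lives separately and returns TRUE or a vector of complaints
setValidity("Interval", function(object) {
  msgs <- character(0)
  if (length(object@lo) != 1 || length(object@hi) != 1) {
    msgs <- c(msgs, "lo and hi must have length 1")
  } else if (object@lo > object@hi) {
    msgs <- c(msgs, "lo must not exceed hi")
  }
  if (length(msgs)) msgs else TRUE
})

ok <- new("Interval", lo = 0, hi = 1)

# An invalid object makes new() signal an error carrying the message
bad <- tryCatch(new("Interval", lo = 2, hi = 1),
                error = function(e) conditionMessage(e))
```

Here tryCatch() appears only on the caller's side, to capture the failure; the validity function itself uses plain conditionals.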

Best,

Martin

 
 Cheers!!
 Albert-Jan
 
 ~~
 In the face of ambiguity, refuse the temptation to guess.
 ~~
 
 --- On Mon, 3/8/10, Albert-Jan Roskam fo...@yahoo.com wrote:
 
 
 From: Albert-Jan Roskam fo...@yahoo.com
 Subject: [R] setClass or setValidity?
 To: r-help@r-project.org
 Date: Monday, March 8, 2010, 4:14 PM
 
 
 Hi, 
  
 I'm reading up on S4 classes *). There seem to be at least two ways of input 
 validation:
 setClass() (using the 'validity' argument)  and setValidity(). Is it a matter 
 of taste which function is used? Or should more complex validation code 
 better be put in a setValidity call?
 
 *) A (Not So) Short Introduction to S4 Object Oriented Programming in R 
 V0.5.1 Christophe Genolini August 20, 2008
  
 And, inside those validity checks, is most of the checking done with 'if' 
 'else' computations, or is it also common to use except()?
 
 Cheers!!
 Albert-Jan
 
 ~~
 In the face of ambiguity, refuse the temptation to guess.
 ~~
 
 
   


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


Re: [R] questions about Cusum

2010-03-08 Thread Christopher W. Ryan

I've found the surveillance package useful for monitoring walk-in 
clinic visits in our county as the influenza pandemic evolved. It might 
serve your needs for monitoring IFI (invasive fungal infections?) in 
your hospital.


--Chris
Christopher W. Ryan, MD
SUNY Upstate Medical University Clinical Campus at Binghamton
425 Robinson Street, Binghamton, NY  13904
cryanatbinghamtondotedu

If you want to build a ship, don't drum up the men to gather wood, 
divide the work and give orders. Instead, teach them to yearn for the 
vast and endless sea.  [Antoine de St. Exupery]


Achim Zeileis wrote:

On Sun, 7 Mar 2010, sdzhangping wrote:


Dear friends:
  I have just read an article entitled  Monitoring of nosocomial 
invasive aspergillosis and early evidence of an outbreak using 
cumulative sum tests (CUSUM), which is published in Clinical 
Microbiology and Infection. We have great need to estimate the 
fluctuation of incidence of IFI in our hospital. But I don't know the 
details of the statistical method and don't know where can I get a 
Cusum package. Can you  give me some materials about Cusum test? An 
example is more appreciated.


There are many techniques with the label CUSUM and I didn't look up what 
the specific meaning is in the article you cite. There are tests based 
on (mostly linear) regression models with that label, many (but not all 
conceivable) of these are implemented in the strucchange package. See

  vignette("strucchange-intro", package = "strucchange")
as well as
  citation("strucchange")
for pointers to papers explaining the ideas behind it. But beyond that 
there are also other techniques, in particular from statistical process 
control. See the spc, qcc, IQCC packages among others.
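
To give a flavour of the process-control variety, here is a sketch of a one-sided tabular CUSUM in base R. It is one of many schemes carrying the CUSUM label and is not necessarily the one used in the cited article; the function name, the counts, and the choices of target, k and h are all illustrative.

```r
# Upper CUSUM: accumulate excesses over target + k, resetting at zero;
# an alarm is raised whenever the statistic exceeds the threshold h
cusum_upper <- function(x, target, k, h) {
  s <- numeric(length(x))
  prev <- 0
  for (t in seq_along(x)) {
    prev <- max(0, prev + x[t] - target - k)
    s[t] <- prev
  }
  list(statistic = s, alarm = which(s > h))
}

# Made-up monthly case counts with an increase from the 7th month on
counts <- c(1, 0, 2, 1, 0, 1, 4, 5, 3, 6)
res <- cusum_upper(counts, target = 1, k = 0.5, h = 4)
```

In practice target, k and h are chosen from the in-control incidence and the desired false-alarm rate, which is where the packages mentioned above help.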


hth,
Z


Yours sincerely Ping Zhang
March 7, 2010

[R] page boundaries for latex printing of summary.formula objects in Hmisc

2010-03-08 Thread Erik Iverson


Hello,

Warning, I'm guessing only those who have used the Hmisc package's 
summary.formula function with LaTeX will be able to offer much help here.


I am using the Hmisc package's summary.formula function to produce 
tables for a LaTeX report.  The latex function in the same package 
supports longtables in LaTeX.  Ideally, I would like for page breaks in 
the LaTeX output to only occur at variable boundaries.


## Sample R code

## load the Hmisc package
library(Hmisc)

## create an example data.frame
test.df <- data.frame(sex = gl(2, 110, labels = c("Male", "Female")),
fac1 = sample(gl(11, 100, labels = paste("V1 Level", 1:11))),
fac2 = sample(gl(11, 100, labels = paste("V2 Level", 1:11))),
fac3 = sample(gl(11, 100, labels = paste("V3 Level", 1:11))),
fac4 = sample(gl(11, 100, labels = paste("V4 Level", 1:11))),
fac5 = sample(gl(11, 100, labels = paste("V5 Level", 1:11))),
fac6 = sample(gl(11, 100, labels = paste("V6 Level", 1:11))))


## create the summary.formula object
sf - summary.formula(sex ~ fac1 + fac2 + fac3 + fac4 + fac5 + fac6,
  data = test.df,
  method = "reverse")


## print out the LaTeX code to the screen, not a file
latex(sf, file = "", longtable = TRUE)


Notice how the LaTeX output puts the newline in the middle of factor 4, 
instead of before or after.


excerpt of LaTeX output follows

fac4~:~V4~Level~1&11\%~{\scriptsize~(59)}&~7\%~{\scriptsize~(41)}\tabularnewline
V4~Level~2&10\%~{\scriptsize~(56)}&~8\%~{\scriptsize~(44)}\tabularnewline
V4~Level~3&~9\%~{\scriptsize~(47)}&10\%~{\scriptsize~(53)}\tabularnewline
V4~Level~4&~9\%~{\scriptsize~(49)}&~9\%~{\scriptsize~(51)}\tabularnewline
V4~Level~5&~9\%~{\scriptsize~(51)}&~9\%~{\scriptsize~(49)}\tabularnewline
V4~Level~6&~9\%~{\scriptsize~(50)}&~9\%~{\scriptsize~(50)}\tabularnewline
V4~Level~7&~7\%~{\scriptsize~(39)}&11\%~{\scriptsize~(61)}\tabularnewline
\newpage
V4~Level~8&~9\%~{\scriptsize~(51)}&~9\%~{\scriptsize~(49)}\tabularnewline
V4~Level~9&~9\%~{\scriptsize~(51)}&~9\%~{\scriptsize~(49)}\tabularnewline
V4~Level~10&~9\%~{\scriptsize~(47)}&10\%~{\scriptsize~(53)}\tabularnewline
V4~Level~11&~9\%~{\scriptsize~(50)}&~9\%~{\scriptsize~(50)}\tabularnewline


I expect this given the documentation in ?latex of lines.page, which is 
set to 40 by default.


lines.page:
Applies if ‘longtable=TRUE’. No more than ‘lines.page’
  lines in the body of a table will be placed on a single page.
  Page breaks will only occur at ‘rgroup’ boundaries.

The problem is that variable boundaries don't in general correspond to 
constants, like 40 lines.


So, rgroup sounds promising.  I want the n.rgroup values to equal the 
number of lines each variable occupies, but since my tables are dynamic, 
in that the variables and number of levels in them change over time, I 
can't think of a way to define n.rgroup without specifying it per 
variable.  My first thought was to compute it from the number of levels 
per variable in the formula.


In fact, I did try this but immediately ran into some misconceptions I 
had about how continuous variables are represented internally within the 
latex function.
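
The "compute it from the number of levels" idea could be sketched in base R as follows. This only counts rows per variable; whether latex() lays out summary.formula objects with exactly these counts is not confirmed here, and the rule of one row per non-factor variable is an assumption. The function name and toy data are illustrative.

```r
# Rows each right-hand-side variable is expected to occupy:
# one per level for factors, a single row otherwise (assumption)
rows_per_variable <- function(formula, data) {
  vars <- all.vars(formula)[-1]   # drop the left-hand side
  vapply(data[vars],
         function(v) if (is.factor(v)) nlevels(v) else 1L,
         integer(1))
}

# Toy data: one 3-level factor and one continuous variable
toy <- data.frame(grp = gl(2, 3),
                  f   = factor(c("a", "b", "c", "a", "b", "c")),
                  x   = c(1.5, 2.0, 0.3, 4.1, 2.2, 3.3))
n_rows <- rows_per_variable(grp ~ f + x, toy)
```

Because the counts are derived from the data, they follow along when variables or levels change between reports.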


Is there any easier way to accomplish this breaking of pages on variable 
boundaries using this set of functions?  I suspect not, but thought I'd 
ask.  I think I can figure out the approach I suggested in the preceding 
2 paragraphs, but just want to make sure I'm not missing something ...


Thanks a lot!
Erik Iverson


Re: [R] aggregate for zoo or its?

2010-03-08 Thread Jeffrey J. Hallman

Or for real power and flexibility, see the 'convert()' function in package
tis.

Jeff

Gabor Grothendieck ggrothendi...@gmail.com writes:
 See ?aggregate.zoo, e.g.

 library(zoo)
 z <- zoo(1:1000, as.Date("2000-01-01") + 0:999)
 aggregate(z, as.yearmon, mean)

 or replace mean with whatever summarization you want.
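
For readers without zoo at hand, the same daily-to-monthly summary can be sketched in base R by grouping on a year-month label. This is an alternative to, not a description of, aggregate.zoo; the variable names are illustrative.

```r
# Sixty days of made-up daily values starting 2000-01-01
dates  <- as.Date("2000-01-01") + 0:59
values <- 1:60

# Group by "YYYY-MM" and average within each month
month_label  <- format(dates, "%Y-%m")
monthly_mean <- tapply(values, month_label, mean)
```

Any other summary function can be substituted for mean in the tapply() call.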

 On Sun, Mar 7, 2010 at 5:29 PM, Erin Hodgess erinm.hodg...@gmail.com wrote:
 Dear R People:

 The aggregate function works very well on regular time series.

 Is there a version for zoo or its that would take daily data and
 convert it to monthly, please?


-- 
Jeff


[R] confused by classes and methods.

2010-03-08 Thread Rob Forler

Hello, I have a simple class that looks like:

setClass("statisticInfo",
representation( max = "numeric",
min = "numeric",
beg = "numeric",
current = "numeric",
avg = "numeric",
obs = "vector"
   )
 )

and the following function

updateStatistic <- function(statistic, newData){
statistic@obs = c(statistic@obs, newData)
statistic@max = max(newData, statistic@max, na.rm=T)
statistic@min = min(newData, statistic@min, na.rm=T)
statistic@avg = mean(statistic@obs)
statistic@current = newData
if(length(statistic@obs)==1 || is.na(statistic@beg)){
statistic@beg = newData
}
return(statistic)
}

Firstly,

I know you can use methods which seems to add some value. I looked at
http://developer.r-project.org/methodDefinition.html but I try

setMethod("update", signature(statistic="statisticInfo", newData="numeric"),

function(statistic, newData){
statistic@obs = c(statistic@obs, newData)
statistic@max = max(newData, statistic@max, na.rm=T)
statistic@min = min(newData, statistic@min, na.rm=T)
statistic@avg = mean(statistic@obs)
statistic@current = newData
if(length(statistic@obs)==1 || is.na(statistic@beg)){
statistic@beg = newData
}
return(statistic)
}
)

Creating a new generic function for "update" in ".GlobalEnv"
Error in match.call(fmatch, fcall) :
  unused argument(s) (statistic = "statisticInfo", newData = "numeric")
 1: source("tca.init.R", chdir = T)
 2: eval.with.vis(ei, envir)
 3: eval.with.vis(expr, envir, enclos)
 4: source("../../studies/tca.tradeClassifyFuncs.R")
 5: eval.with.vis(ei, envir)
 6: eval.with.vis(expr, envir, enclos)
 7: setMethod("update", signature(statistic = "statisticInfo", newData =
"numeric"), function(statistic, newData) {
 8: isSealedMethod(f, signature, fdef, where = where)
 9: getMethod(f, signature, optional = TRUE, where = where, fdef = fGen)
10: matchSignature(signature, f

I don't understand this; any help would be appreciated.
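
A probable cause, offered as a sketch: update already exists in the stats package with the arguments (object, ...), so the generic that setMethod() creates from it has no arguments called statistic or newData, and the signature cannot be matched. Defining a fresh generic whose formals match the method avoids the clash. The generic name updateStatistic is reused from the plain function in the post; declaring obs as "numeric" instead of "vector" is a simplification made for this example.

```r
library(methods)

setClass("statisticInfo",
         representation(max = "numeric", min = "numeric", beg = "numeric",
                        current = "numeric", avg = "numeric", obs = "numeric"))

# A new generic whose formal arguments match the method signature
setGeneric("updateStatistic",
           function(statistic, newData) standardGeneric("updateStatistic"))

setMethod("updateStatistic",
          signature(statistic = "statisticInfo", newData = "numeric"),
          function(statistic, newData) {
            statistic@obs <- c(statistic@obs, newData)
            statistic@max <- max(newData, statistic@max, na.rm = TRUE)
            statistic@min <- min(newData, statistic@min, na.rm = TRUE)
            statistic@avg <- mean(statistic@obs)
            statistic@current <- newData
            if (length(statistic@obs) == 1 || is.na(statistic@beg)) {
              statistic@beg <- newData
            }
            statistic
          })

# The method returns a modified copy, so the result must be assigned back
s <- new("statisticInfo")
s <- updateStatistic(s, 3)
s <- updateStatistic(s, 5)
```

One benefit of methods beyond argument checking is that further setMethod() calls can give the same generic different behaviour for other classes without touching this code.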

Secondly, can anyone give any examples of where methods are used that makes
sense besides just checking the class inputs?

Thirdly, I've looked into passing by reference in R, and some options come
up, but in general they seem to be fairly complicated.

I would like update to work more like my updateStatistic function, but without
having to return a new object.

Something like
 statList = list(new("statisticInfo"))
 updateStatistic(statList[[1]],3)
 statList[[1]]

#this would then have the updated one and not the old one.
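
R copies ordinary objects on modification, so the call above leaves statList[[1]] unchanged unless the result is assigned back. Environments are the exception: they are not copied when passed to a function. The following sketch uses an environment to get update-in-place behaviour; new_stat and update_stat are hypothetical names, and only a few of the fields from the class are tracked.

```r
# A statistic held in an environment, which is passed by reference
new_stat <- function() {
  e <- new.env()
  e$obs <- numeric(0)
  e
}

# Modifies the environment it is given; nothing needs to be assigned back
update_stat <- function(stat, newData) {
  stat$obs     <- c(stat$obs, newData)
  stat$max     <- max(stat$obs)
  stat$min     <- min(stat$obs)
  stat$avg     <- mean(stat$obs)
  stat$current <- newData
  invisible(stat)
}

statList <- list(new_stat())
update_stat(statList[[1]], 3)
update_stat(statList[[1]], 7)
observed <- statList[[1]]$obs
```

With the S4 object from the post, the by-value idiom is statList[[1]] <- updateStatistic(statList[[1]], 3), which is often simpler to reason about than shared mutable state.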

Anyways,
The main reason I'm asking these questions is because I can't really find a
good online resource for this. Any help would be greatly appreciated.

Thanks,
Rob


[R] variance of discrete uniform distribution

2010-03-08 Thread casperyc


Hi all,

I am REALLY confused with the variance right now.

for a discrete uniform distribution on [1,12]

the mean is (1+12)/2=6.5
which is ok.

y=1:12
mean(y)

then var(y) 
gives me 13

1- on  http://en.wikipedia.org/wiki/Uniform_distribution_%28discrete%29 wiki 
the variance is (12^2-1)/12=143/12

2- 
http://www.solvemymath.com/online_math_calculator/statistics/continuous_distributions/uniform/param_uniform.php
here 
which used (12-1)^2/12=121/12

all different 3 answers!!!

All I am looking for is the variance of a random variable from discrete
uniform distribution.

Can someone clarify that for me please?

Thanks.


Re: [R] variance of discrete uniform distribution

2010-03-08 Thread Rolf Turner

On 9/03/2010, at 12:13 PM, casperyc wrote:

Hi all,

I am REALLY confused with the variance right now.

You need to learn the difference

(a) Between sample variance (*estimate* of population variance)
and
population variance.

and

(b) Between discrete and continuous distributions.

Given that you understand those differences you will see that all three
answers are correct.
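
The three numbers can be reproduced directly. This is a small sketch with illustrative variable names: var() uses the n - 1 denominator (sample variance), the discrete uniform on 1..12 has population variance (n^2 - 1)/12, and the continuous uniform on [1, 12] has variance (b - a)^2/12.

```r
y <- 1:12
n <- length(y)

# Sample variance: sum of squared deviations divided by n - 1
sample_var <- var(y)

# Population variance of the discrete uniform on 1..12: divide by n instead
pop_var <- mean((y - mean(y))^2)

# Variance of the continuous uniform on [1, 12]
cont_var <- (12 - 1)^2 / 12
```

The sum of squared deviations is 143 in both of the first two cases; dividing by 11 gives 13, dividing by 12 gives 143/12.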

cheers,

Rolf Turner

for a discrete uniform distribution on [1,12]

the mean is (1+12)/2=6.5
which is ok.

y=1:12
mean(y)

then var(y)
gives me 13

1- on http://en.wikipedia.org/wiki/Uniform_distribution_%28discrete%29 wiki
the variance is (12^2-1)/12=143/12

2-
http://www.solvemymath.com/online_math_calculator/statistics/continuous_distributions/uniform/param_uniform.php

***LOOK*** at the above. Does it or does it not contain the string
``continuous_distributions''??? And doesn't your question involve
the ***discrete*** uniform distribution???

R. T.
here
which used (12-1)^2/12=121/12

all different 3 answers!!!

All I am looking for is the variance of a random variable from discrete
uniform distribution.

Can someone clearify that for me please?

Thanks.


Re: [R] variance of discrete uniform distribution

2010-03-08 Thread casperyc


Hi Rolf Turner ,

God, it directed me to the wrong page.

I first found the formula on Wikipedia, then tried to verify the answer in R.
Now, given that 143/12, i.e. (n^2-1)/12, is the correct answer for a discrete
uniform random variable,
I am still not sure what R is calculating there.
Why does it give me 13?

Thanks!

Re: [R] test the goodness of it for negative binomial type 2

2010-03-08 Thread casperyc


Hi Achim Zeileis-4,

That's very helpful.

Thanks!


Re: [R] variance of discrete uniform distribution

2010-03-08 Thread Michael Erickson

On Mon, Mar 8, 2010 at 3:44 PM, casperyc caspe...@hotmail.co.uk wrote:

 Hi Rolf Turner ,

 God, it directed to the wrong page.

 I firstly find the formula in wiki, than tried to verify the answer in R,
 now, given that 143/12 ((n^2-1)/12 ) is the correct answer for a discrete
 uniform random variable,
 I am still not sure what R is calculating there?
 why it gives me 13?

Of RT's two points, you addressed (b), continuous vs. discrete, but you
have yet to address (a), the population estimate based on a sample.  Hint:
var(1:12) tries to estimate the population variance based on a sample.
You are interested in the population variance.  The two are calculated
with formulas that differ *only in the denominator*.
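
The three numbers in this thread can be reproduced in a few lines of R. This is only a sketch of the arithmetic; the helper name pop_var is made up for the illustration and is not a base R function.

```r
y <- 1:12
n <- length(y)

# Sample variance: divides the sum of squared deviations by n - 1.
sample_var <- var(y)                      # 143/11

# Population variance of the discrete uniform on 1..12: divides by n.
# pop_var is a hypothetical helper, not part of base R.
pop_var <- function(v) mean((v - mean(v))^2)
discrete_var <- pop_var(y)                # 143/12, same as (n^2 - 1)/12

# Variance of the *continuous* uniform distribution on [1, 12].
continuous_var <- (12 - 1)^2 / 12         # 121/12

c(sample = sample_var, discrete = discrete_var, continuous = continuous_var)
```

Multiplying var(y) by (n - 1)/n converts the sample estimate into the population value, which is the "only in the denominator" difference mentioned above.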

Michael




Re: [R] compare tables

2010-03-08 Thread jim holtman

You can just access the data from the list:

result <- lapply(output, function(.data){
    lettermatch(creator, .data)
})

You can then take the result and possibly 'cbind' it back into the matrix you
want.
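
A minimal sketch of that idea on toy data follows. It assumes that output is a list of character matrices with the same shape as the creator table. The real lettermatch() is not shown in the thread, so the version below is a hypothetical stand-in that simply tests cells for equality.

```r
# Hypothetical stand-in for the poster's lettermatch(); the real function is
# not shown in the thread. This one returns TRUE where two cells are equal.
lettermatch <- function(a, b) a == b

# Toy "creator" table and three simulated tables of the same shape.
creator <- matrix(c("GM01", "GM02", "WI01", "WI02"), nrow = 2)
output <- list(
  creator,
  matrix(c("GM01", "GM02", "WI09", "WI02"), nrow = 2),
  matrix(c("GM01", "GM05", "WI01", "WI07"), nrow = 2)
)

# One comparison matrix per simulated table, compared cell by cell.
result <- lapply(output, function(.data) lettermatch(creator, .data))

# Summarise: number of matching cells per row, one column per iteration.
matches_per_row <- sapply(result, rowSums)
matches_per_row
```

Keeping the simulated tables in a list means no explicit loop indices are needed; each list element is handed to the comparison function in turn.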

On Mon, Mar 8, 2010 at 10:59 AM, Laetitia Schmid laetitia.sch...@gmx.ch wrote:

 Hi!
 I need some help to finish my script.

 I have two tables that I combine randomly to produce a third table.
 This I do for hundreds of iterations. In the output file I get all the
 simulated tables after each other. It looks like this (in this case 3
 iterations):

 output file:

 [[1]]
   [,1]   [,2]   [,3]
  [1,] GM030005 WI920024 
  [2,] GM930026 WI920362 
  [3,] GM980051 WI920007 CGCC
  [4,] GM970009 WI920417 
  [5,] GM920089 WI920023 
  [6,] GM930109 WI920359 
  [7,] GM980007 WI920428 CGCC
  [8,] GM940039 WI920430 
  [9,] GM990027 WI920349 
 [10,] GM920222 WI920410 CGCC
 [11,] GM930029 WI920001 CGCC
 [12,] GM990105 WI920431 
 [13,] GM050009 WI920430 
 [14,] GM920224 WI920369 
 [15,] GM920224 WI920352 
 [16,] GM960028 WI920427 
 [17,] GM940031 WI920004 
 [18,] GM930040 WI920441 
 [19,] GM930040 WI920441 
 [20,] GM050099 WI920417 
 [21,] GM050099 WI920423 CCCG
 [22,] GM920096 WI920370 
 [23,] GM920034 WI920437 
 [24,] GM960023 WI920017 
 [25,] GM920031 WI920430 
 [26,] GM920202 WI920367 CCCG
 [27,] GM990066 WI920410 

 [[2]]
   [,1]   [,2]   [,3]
  [1,] GM030005 WI920017 
  [2,] GM930026 WI920415 
  [3,] GM980051 WI920028 CGCC
  [4,] GM970009 WI920017 
  [5,] GM920089 WI920028 
  [6,] GM930109 WI920353 
  [7,] GM980007 WI920009 CGCT
  [8,] GM940039 WI920415 
  [9,] GM990027 WI920423 CCCG
 [10,] GM920222 WI920423 CGCG
 [11,] GM930029 WI920363 CGCC
 [12,] GM990105 WI920362 
 [13,] GM050009 WI920365 
 [14,] GM920224 WI920362 
 [15,] GM920224 WI920410 
 [16,] GM960028 WI920355 CCCG
 [17,] GM940031 WI920361 
 [18,] GM930040 WI920356 
 [19,] GM930040 WI920353 
 [20,] GM050099 WI920360 
 [21,] GM050099 WI920353 
 [22,] GM920096 WI920023 
 [23,] GM920034 WI920426 
 [24,] GM960023 WI920024 
 [25,] GM920031 WI920022 
 [26,] GM920202 WI920009 CCCG
 [27,] GM990066 WI920001 

 [[3]]
   [,1]   [,2]   [,3]
  [1,] GM030005 WI920433 
  [2,] GM930026 WI920408 
  [3,] GM980051 WI920352 CGCC
  [4,] GM970009 WI920416 
  [5,] GM920089 WI920022 
  [6,] GM930109 WI920369 
  [7,] GM980007 WI920415 CGCC
  [8,] GM940039 WI920022 
  [9,] GM990027 WI920361 
 [10,] GM920222 WI920024 CGCC
 [11,] GM930029 WI920437 CGCC
 [12,] GM990105 WI920423 CCCG
 [13,] GM050009 WI920416 
 [14,] GM920224 WI920423 CCCG
 [15,] GM920224 WI920427 
 [16,] GM960028 WI920437 
 [17,] GM940031 WI920441 
 [18,] GM930040 WI920417 
 [19,] GM930040 WI920370 
 [20,] GM050099 WI920015 
 [21,] GM050099 WI920428 
 [22,] GM920096 WI920007 
 [23,] GM920034 WI920009 CCCG
 [24,] GM960023 WI920410 
 [25,] GM920031 WI920430 
 [26,] GM920202 WI920015 
 [27,] GM990066 WI920415 

 Now I would like to compare one of the tables used to create the
 output tables with every output table, one after the other. In detail,
 I am comparing row 1 of the creator table with row 1 of the first
 output table and then row 2 of the creator table with row 2 of the
 first output table and so on until row 27 and each row for all
 columns. Then, when the first output table is finished I go on
 comparing the first creator table with the second table in the
 output, row for row for all columns. I do this for all iterations.

 The first creator table is called data_mc.

 # apply similarity function (lettermatch) to my data
 for (i in 1:(nrow(data_mc))){
   for (y in 1:(ncol(data_mc))) {
     creator_table <- data_mc[data_mc$Status == "mother", y]
     output_tables <- ???
     output[i,y] <- lettermatch(creator_table, output_tables)
   }
 }

 Could you please help me how I have to call up the output tables in
 the way I need them (described above) for the function lettermatch?
 Maybe I need to change the format of the output file?

 Thank you.
 Laetitia



















-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: [R] How can I understand this sentence,and express it by means of Mathematical approach？

2010-03-08 Thread Rolf Turner


On 9/03/2010, at 3:48 AM, Liaw, Andy wrote:

(in response to a question on the meaning of the sentence:

 Independent variables whose correlation with the response 
 variable was not significant at 5% level were removed)


 If your ultimate interest is in real scientific progress, I'd suggest that you
 ignore that sentence (and any conclusion drawn subsequent to it).

Surely a fortune candidate.

cheers,

Rolf Turner


Re: [R] conditioning variable in panel.xyplot?

2010-03-08 Thread Seth W Bigelow

Ah, wonderful, thank you for the code, Deepayan. To recap for posterity: I
have two datasets, d and q; each has x-y coordinates that are conditioned
by site (the actual data, for me, are maps of parent trees and their
seedlings). I wanted to superimpose the xy plots of d and q, by site,
without going to the trouble of merging the d and q datasets into a single
dataset. The solution is to use which.packet() inside the panel function:


library(lattice)

d <- data.frame(site = c(rep("A", 12), rep("B", 12)),
                x = rnorm(24), y = rnorm(24))   # Create the main xy dataset
q <- data.frame(site = c(rep("A", 7), rep("B", 7)),
                x = rnorm(14), y = rnorm(14))   # Create the alternate xy dataset


q.split <- split(q, q$site)   # Split up the alternate dataset by site

mypanel <- function(..., alt.data) {
    # which.packet() gives the index of the relevant data subset...
    with(alt.data[[ which.packet()[1] ]],
         panel.xyplot(x = x, y = y, col = "red"))   # ...which is passed to panel.xyplot()
    panel.xyplot(...)
}

xyplot(y ~ x | site, d, alt.data = q.split,   # After providing the alternative dataset and the panel...
       panel = mypanel)                       # ...everything prints out properly, like magic!



Dr. Seth  W. Bigelow
Biologist, USDA-FS Pacific Southwest Research Station

Re: [R] conditioning variable in panel.xyplot?

2010-03-08 Thread Felix Andrews

Alternatively

library(latticeExtra)

xyplot(y ~ x | site, d) +
  xyplot(y ~ x | site, q, col = "red")



(which is a shortcut for:)

xyplot(y ~ x | site, d) +
  as.layer(xyplot(y ~ x | site, q, col = "red"))






-- 
Felix Andrews / 安福立
Postdoctoral Fellow
Integrated Catchment Assessment and Management (iCAM) Centre
Fenner School of Environment and Society [Bldg 48a]
The Australian National University
Canberra ACT 0200 Australia
M: +61 410 400 963
T: + 61 2 6125 4670
E: felix.andr...@anu.edu.au
CRICOS Provider No. 00120C
-- 
http://www.neurofractal.org/felix/


[R] varComb in gls/lme

2010-03-08 Thread Yu, Xuesong

Dear R-help members,

I have a question about how to use the varComb function to specify a
variance function for the weights argument in gls.  I need to fit a
linear model with heteroscedasticity. The variance function is
exp(c0+nu0*W+nu1*W^2), where W is a covariate.  Initially I wanted to use
varFunc to define my own variance function, following the instructions in
Pinheiro and Bates (2000), but I could not make it work. Then I used
varComb in gls with weights=varComb(varExp(form=~W),
varExp(form=~I(W^2))).  But the estimated variance parameters seem to
differ considerably from the true values (I used simulated data).  This
makes me wonder whether this is the right way to model the variance
function exp(c0+nu0*W+nu1*W^2) using varComb.  The code and output are
copied below.

Any suggestions and help are very much appreciated.

library(nlme)

simulate.pilot = function(m, mn, sigma, alpha0, alpha1, c0, nu0, nu1) {
   pilot.dat=data.frame(W=rnorm(m, mean=mn, sd=sigma))
   pilot.dat=transform(pilot.dat, Y=rnorm(m, mean=alpha0 + alpha1*W,
                       sd=sqrt(exp(c0+nu0*W+nu1*W^2))))
   pilot.dat
}

mn=3.3
sigma=sqrt(0.5)

alpha0=0.1
alpha1=3

m=200
n=200

c0=-2.413;   nu0=-0.2; nu1=0.3

simu.dat=simulate.pilot(m, mn, sigma, alpha0, alpha1, c0, nu0, nu1)

fit1=try(gls(Y~W, data=simu.dat, weights=varComb(varExp(form=~W),
             varExp(form=~I(W^2)))))

c0.hat=log(fit1$sigma^2)
nu0.hat=2*fit1$modelStruct$varStruct$A[1]
nu1.hat=2*fit1$modelStruct$varStruct$B[1]

> c0.hat
[1] -1.570104
> nu0.hat
[1] -0.787264
> nu1.hat
[1] 0.4057129

Thanks

Xuesong
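
As a sanity check on the parameterisation only (it does not diagnose the discrepancy itself), the sketch below verifies numerically that a product of two exponential variance functions reproduces exp(c0+nu0*W+nu1*W^2). It relies on the convention documented for nlme's varExp, s2(v) = exp(2*t*v), and on varComb multiplying its component variance functions. Only base R arithmetic is used; sigma2, theta1 and theta2 are names chosen for the illustration.

```r
# True values used in the simulation above.
c0 <- -2.413; nu0 <- -0.2; nu1 <- 0.3

# Implied gls parameters under the varExp convention s2(v) = exp(2 * t * v).
# sigma2, theta1 and theta2 are illustrative names, not nlme objects.
sigma2 <- exp(c0)
theta1 <- nu0 / 2
theta2 <- nu1 / 2

W <- seq(1, 5, by = 0.5)

# Target variance function and the product form built from two varExp terms.
target   <- exp(c0 + nu0 * W + nu1 * W^2)
combined <- sigma2 * exp(2 * theta1 * W) * exp(2 * theta2 * W^2)

all.equal(target, combined)
```

If this identity is the relevant one, the back-transformation in the post (c0.hat = log(sigma^2), nu.hat = 2 * theta.hat) matches the convention, and the gap between estimates and true values may come from elsewhere, possibly from sampling variability with m = 200 and the strong correlation between W and W^2.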

 


Re: [R] A slight trap in read.table/read.csv.

2010-03-08 Thread Mike Prager

Rolf Turner r.tur...@auckland.ac.nz wrote:
 
 I solved the problem by putting in a colClasses argument in my
 call to read.csv().  But I really think that the read functions
 are being too clever by half here.  If field entries are surrounded
 by quotes, shouldn't they be left as character?  Even if they are
 all F's and T's?
 
 Furthermore using F's and T's to represent TRUE's and FALSE's is
 bad practice anyway.  Since FALSE and TRUE are reserved words it
 would make sense for the read function to assume that a field is
 logical if it consists entirely of these words.  But T's and F's
  I don't think so.
 
 I would argue that this behaviour should be changed.  I can see no
 downside to such a change.
 

I agree with you, Rolf, that this is horrid behavior. It is such
automatic devices that have made people hate (e.g.) Microsoft
Word with a passion. 

Yet, in R this is a designed-in bug (i.e., feature) that
probably can't be changed without making some legacy code not
work. But at least, T and F could be removed soon as synonyms for
TRUE and FALSE. We have seen that _ was removed as an
assignment operator, and the world did not crumble. The use of T
and F is no less error-prone, and possibly more.

The only immediate solution to this accretion of overly clever
behavior would be for someone to write new functions (say,
Read.csv) that didn't do all those conversions behind the
scenes. I'm not about to do that. Are you?
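
For readers who have not met the trap, here is a small sketch of the behaviour under discussion and of the colClasses workaround Rolf mentions. The CSV text and the column names id and flag are invented for the example.

```r
# A small CSV where the second column holds quoted "T"/"F" codes.
csv_text <- 'id,flag\n1,"T"\n2,"F"\n3,"T"'

# Default behaviour: the quoted T/F strings are converted to logical.
auto <- read.csv(text = csv_text)
class(auto$flag)

# Declaring the column classes keeps the values as character strings.
kept <- read.csv(text = csv_text,
                 colClasses = c(id = "integer", flag = "character"))
class(kept$flag)
kept$flag
```

Naming the entries of colClasses ties each class to a column by name, so the call keeps working if the column order in the file changes.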

Best of luck!

-- 
Mike Prager, NOAA, Beaufort, NC
* Opinions expressed are personal and not represented otherwise.
* Any use of tradenames does not constitute a NOAA endorsement.


Re: [R] A slight trap in read.table/read.csv.

2010-03-08 Thread Rolf Turner


On 9/03/2010, at 11:17 AM, Mike Prager wrote:

 Rolf Turner r.tur...@auckland.ac.nz wrote:
 
 I solved the problem by putting in a colClasses argument in my
 call to read.csv().  But I really think that the read functions
 are being too clever by half here.  If field entries are surrounded
 by quotes, shouldn't they be left as character?  Even if they are
 all F's and T's?
 
 Furthermore using F's and T's to represent TRUE's and FALSE's is
 bad practice anyway.  Since FALSE and TRUE are reserved words it
 would make sense for the read function to assume that a field is
 logical if it consists entirely of these words.  But T's and F's
  I don't think so.
 
 I would argue that this behaviour should be changed.  I can see no
 downside to such a change.
 
 
 I agree with you, Rolf, that this is horrid behavior. It is such
 automatic devices that have made people hate (e.g.) Microsoft
 Word with a passion. 
 
 Yet, in R this is a designed-in bug (e.g., feature) that
 probably can't be changed without making some legacy code not
 work. But at least, T and F could be removed soon as synonms for
 TRUE and FALSE. We have seen that _ was removed as an
 assignment operator, and the world did not crumble. The use of T
 and F is no less error-prone, and possibly more.

I would definitely support the removal of the use of T
and F for TRUE and FALSE.  Some code would break, but
it would be easy to trace the source of the problem and
easy to fix.
 
 The only immediate solution to this accretion of overly clever
 behavior would be for someone to write new functions (say,
 Read.csv) that didn't do all those conversions behind the
 scenes. I'm not about to do that. Are you?


NFL!!!

cheers,

Rolf


Re: [R] why this function does not run correctly?


On 2010-03-08 14:45, Marco Bressan wrote:

Hi,
my name is Marco Bressan and I'm working to improve the ADati package. I study
psychology at Padua University (Italy). I have this problem: why does the
bartlett.test function run fine while my anova.welch function does not? I
don't understand why the function I wrote gives me that error.

this is anova.welch:

anova.welch <- function(x, ...) UseMethod("anova.welch")

## this is the algorithm, I think it's ok (I copied it from Welch() inside ADati)
anova.welch.default <-
function (x, y = NULL, nu = c(0,0), ...)
{
 mx <- tapply(x,y,mean)
 s2x <- tapply(x,y,var)
 k <- length(nu)
 w <- nu/s2x
 Xp <- sum(w * mx)/sum(w)
 Fnum <- sum(w * (mx - Xp)^2)/(k - 1)
 Fden <- 0
 for (h in 1:k) {
  a <- 1/(nu[h] - 1)
  b <- (1 - (w[h]/sum(w)))^2
  Fden <- Fden + (a * b)
 }
 gl2.den <- Fden * 3
 Fden <- Fden * ((2 * (k - 2))/(k^2 - 1)) + 1
 Fw <- Fnum/Fden
 STATISTIC <- Fw
 gl1 <- k - 1
 gl2.num <- k^2 - 1
 gl2 <- gl2.num/gl2.den
 PARAMETER <- c(gl1, gl2)
 PVAL <- pf(Fw, gl1, gl2, lower.tail = FALSE)
 METHOD <- "Welch ANOVA"
 DNAME <- NA
 names(STATISTIC) <- "F"
 names(PARAMETER) <- c("num df", "denom df")
 RVAL <- list(statistic = STATISTIC, parameter = PARAMETER,
  p.value = PVAL, method = METHOD)
 class(RVAL) <- "htest"
 return(RVAL)
}

## I copied this from bartlett.test
anova.welch.formula <-
function(formula, data, subset, na.action, ...)
{
 if(missing(formula) || (length(formula) != 3L))
  stop("'formula' missing or incorrect")
 m <- match.call(expand.dots = FALSE)
 if(is.matrix(eval(m$data, parent.frame())))
  m$data <- as.data.frame(data)
 m[[1L]] <- as.name("model.frame")
 mf <- eval(m, parent.frame())
 DNAME <- paste(names(mf), collapse = " by ")
 names(mf) <- NULL
 y <- do.call("anova.welch", as.list(mf))
 y$data.name <- DNAME
 y
}


bartlett.test doesn't have a 'nu' argument. Try this:

anova.welch.formula <-
function(formula, data, subset, na.action, nu = c(0,0), ...)
{
    if(missing(formula) || (length(formula) != 3L))
        stop("'formula' missing or incorrect")
    m <- match.call(expand.dots = FALSE)
    m[["nu"]] <- NULL  ## needed so that eval(m,) will work
    if(is.matrix(eval(m$data, parent.frame())))
        m$data <- as.data.frame(data)
    m[[1L]] <- as.name("model.frame")
    mf <- eval(m, parent.frame())
    DNAME <- paste(names(mf), collapse = " by ")
    names(mf) <- NULL
    y <- do.call("anova.welch", c(as.list(mf), list(nu)))
        ## include 'nu' in the parameters passed to
        ## anova.welch.default
    y$data.name <- DNAME
    y
}

(You might also find the code for lm() instructive.)

I assume that you're aware of the oneway.test() function in
package:stats. So why re-invent the wheel?

 -Peter Ehlers
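
To make the oneway.test() suggestion concrete, here is a sketch on simulated data shaped like the test case in the original post. It runs Welch's ANOVA through stats::oneway.test() and recomputes the statistic by hand with the formulas from anova.welch.default. The seed and the variable names n, F_hand and df2_hand are assumptions made for the illustration.

```r
set.seed(42)
n <- c(10, 15, 20)
y <- c(rnorm(n[1] + n[2], 5, 2), rnorm(n[3], 7, 8))
A <- factor(rep(1:3, times = n))

# Welch's ANOVA via stats::oneway.test (var.equal = FALSE is the default).
res <- oneway.test(y ~ A, var.equal = FALSE)

# The same statistic computed by hand, following the formulas in the post.
mx  <- tapply(y, A, mean)
s2x <- tapply(y, A, var)
k   <- length(n)
w   <- n / s2x
xp  <- sum(w * mx) / sum(w)
tmp <- sum((1 - w / sum(w))^2 / (n - 1))
F_hand   <- (sum(w * (mx - xp)^2) / (k - 1)) /
            (1 + 2 * (k - 2) / (k^2 - 1) * tmp)
df2_hand <- (k^2 - 1) / (3 * tmp)

res
c(F = F_hand, denom_df = df2_hand)
```

Because oneway.test() takes the group sizes from the data, no separate nu argument is needed, which sidesteps the model.frame problem entirely.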




n1=10                              ## this is the test
n2=15
n3=20
y=c(rnorm((n1+n2),5,2),rnorm(n3,7,8))
A=factor(c(rep(1,n1),rep(2,n2),rep(3,n3)))
anova.welch(y,A,c(n1,n2,n3))


 Welch ANOVA

data:
F = 2.3025, num df = 2.000, denom df = 27.384, p-value = 0.1191


anova.welch(y~A,nu=c(n1,n2,n3))

Error in model.frame.default(formula = y ~ A, ... = list(nu = c(n1, n2,  :
   invalid type (pairlist) for variable '(...)'


Sorry for my English, and thank you if you can help me :)


--
Peter Ehlers
University of Calgary


Re: [R] A slight trap in read.table/read.csv.


Ditching T/F for TRUE/FALSE would get my vote, too.

 -Peter Ehlers





--
Peter Ehlers
University of Calgary


Re: [R] Help with Hmisc, cut2, split and quantile

2010-03-08 Thread David Freedman


try 
as.numeric(read_data$DEC)

this should turn it into a numeric variable that you can work with
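
A sketch of the difference between equal-width and equal-frequency binning, and of why as.numeric() helps, is given below. It uses base R only, with quantile() break points instead of Hmisc's cut2(), and toy data in place of the poster's file; both choices are assumptions made to keep the example self-contained.

```r
set.seed(1)
# Toy data standing in for the poster's file (63 rows, as in the thread).
read_data <- data.frame(Actual = rnorm(63), Target = runif(63))

# Equal-width bins: cut() with breaks = 10 splits the *range* into ten parts,
# so group sizes can be very uneven.
equal_width <- cut(read_data$Target, breaks = 10, labels = 1:10)
table(equal_width)

# Equal-frequency bins: use the deciles themselves as break points.
dec_breaks <- quantile(read_data$Target, probs = seq(0, 1, by = 0.1))
read_data$DEC <- cut(read_data$Target, breaks = dec_breaks,
                     include.lowest = TRUE, labels = FALSE)
table(read_data$DEC)

# Splitting on the integer codes gives list names "1" to "10".
L <- split(read_data, read_data$DEC)
names(L)

# as.numeric() on a factor returns its integer codes, which is what makes an
# interval-labelled factor usable as 1..10.
f <- cut(read_data$Target, breaks = dec_breaks, include.lowest = TRUE)
identical(as.numeric(f), as.numeric(read_data$DEC))
```

With the decile codes in place, a single group can be pulled out as L[["10"]] or L$`10`.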

hth
David Freedman
CDC, Atlanta



Re: [R] RGtk2:::gdkColorToString throws an error

2010-03-08 Thread Wincent

Dear Michael, thanks. I have installed GTK+ 2.12.9 revision 2 from
http://gladewin32.sourceforge.net/, but it still doesn't work.
I also tried gtk2-runtime-2.16.6-2010-02-24-ash.exe from
http://gtk-win.sourceforge.net/home/index.php/en/Downloads, but it did not
solve the issue either.

Could you please give me more hints?  My OS is Windows Vista.

Regards

Ronggui

On 26 October 2009 23:27, Michael Lawrence mflaw...@fhcrc.org wrote:
 Hi and sorry for the late reply. The Gdk library is part of the GTK+ bundle
 (GTK+, Gdk and GdkPixbuf are distributed together and have synchronized
 versions). So you'll just need GTK+ 2.12 or higher.

 Michael

 On Tue, Oct 20, 2009 at 8:16 AM, Ronggui Huang ronggui.hu...@gmail.com
 wrote:

 Dear all,

 I try to use RGtk2:::gdkColorToString, but it throws an error:
 Error in .RGtkCall("S_gdk_color_to_string", object, PACKAGE = "RGtk2") :
  gdk_color_to_string exists only in Gdk >= 2.12.0

 I know what it means, but I don't know how to solve this problem because I
 don't know where I can download the referred Gdk library. Any
 information? Thank you.

 --
 HUANG Ronggui, Wincent
 Doctoral Candidate
 Dept of Public and Social Administration
 City University of Hong Kong
 Home page: http://asrr.r-forge.r-project.org/rghuang.html





-- 
Wincent Ronggui HUANG
Doctoral Candidate
Dept of Public and Social Administration
City University of Hong Kong
http://asrr.r-forge.r-project.org/rghuang.html


[R] ctree - party package multivariate response variables

2010-03-08 Thread valeriano . parravicini

Hi,

I have a problem with ctree in the party package.
I have data on the distribution of more than one species (about 50 species) and I
would like to identify the relation of this multivariate object (species
distribution) with a number of explanatory variables.

rs is the name of my data frame containing the species (columns 2 to 51) and
the explanatory variables (columns 52 and 53). Rows are my sampling sites.

I wrote:

species <- rs[,2:51]
v1 <- rs[,52]
v2 <- rs[53]
tree <- ctree(species~v1+v2)

It does not work, but when I use the same formula for the univariate case (i.e.
a single column, e.g. the total number of species at each sampling site) it
works. I know that ctree can handle multivariate response variables, but I
cannot figure out how to do that.

Can someone help me?

Thank you



Valeriano
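
One possible approach, offered as a sketch rather than a confirmed fix, is to name every response column on the left-hand side of the formula and pass the data frame through the data argument. This mirrors the multivariate example in the ctree help page, which uses Ozone + Temp ~ . on the airquality data. The toy data frame below, with five species columns sp1 to sp5, is invented for the illustration, and the ctree call is only attempted if the party package is installed.

```r
set.seed(1)
# Toy stand-in for rs: a site id, five species columns, two predictors.
rs <- data.frame(site = 1:30,
                 matrix(rpois(30 * 5, 3), nrow = 30,
                        dimnames = list(NULL, paste0("sp", 1:5))),
                 v1 = rnorm(30), v2 = rnorm(30))

# Build "sp1 + sp2 + ... ~ v1 + v2" from the column names.
species_cols <- names(rs)[2:6]
fml <- as.formula(paste(paste(species_cols, collapse = " + "), "~ v1 + v2"))
fml

# Fit only if party is available; the variables are looked up in rs.
if (requireNamespace("party", quietly = TRUE)) {
  tree <- party::ctree(fml, data = rs)
}
```

For the real data the same construction would use names(rs)[2:51] for the species columns and the actual names of columns 52 and 53 on the right-hand side.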


Re: [R] Help with Hmisc, cut2, split and quantile

2010-03-08 Thread Guy Green


Hi Peter  others,

Thanks (Peter) - that gets me really close to what I was hoping for.

The one problem I have is that the cut approach breaks the data into
intervals based on the absolute value of the Target data, rather than
their frequency.  In other words, if the data ranged from 0 to 50, the data
would be separated into 0-5, 5-10 and so on, regardless of the frequency
within those categories.  However I want to get the data into deciles.

The code that does this (incorporating Peter's) is:

read_data <- read.table("C:/Sample table.txt", header = TRUE)
read_data$DEC <- with(read_data, cut(Target, breaks=10, labels=1:10))
L <- split(read_data, read_data$DEC)

This means that I can get separate data frames, such as L$'10', which comes
out tidy, but only containing 2 data items (the sample has 63 rows, so each
decile should have 6+ data items):
   Actual    Target DEC
9   0.572 0.3778386  10
31  0.299 0.3546606  10

If I try to adjust this to get deciles using cut2(), I can break the data
into deciles as follows:

read_data <- read.table("C:/Sample table.txt", header = TRUE)
read_data$DEC <- with(read_data, cut2(read_data$Target, g=10), labels=1:10)
L <- split(read_data, read_data$DEC)

However, this time, while the data is broken into evenly sized data frames, the
labels for the separate data frames are unusable, e.g.:
$`[ 0.26477, 0.37784]`
   Actual    Target                 DEC
6   0.243 0.2650960 [ 0.26477, 0.37784]
9   0.572 0.3778386 [ 0.26477, 0.37784]
10 -0.049 0.3212681 [ 0.26477, 0.37784]
15  0.780 0.2778518 [ 0.26477, 0.37784]
31  0.299 0.3546606 [ 0.26477, 0.37784]
33  0.105 0.2647676 [ 0.26477, 0.37784]

Could anyone suggest a way of rearranging this to make the labels usable
again?  Sample data is reattached:
http://n4.nabble.com/file/n1585427/Sample_table.txt

Thanks,
Guy



Peter Ehlers wrote:
 
 On 2010-03-08 8:47, Guy Green wrote:

 Hello,
 I have a set of data with two columns: Target and Actual.  A sample table
 (http://n4.nabble.com/file/n1584647/Sample_table.txt) is
 attached but the data looks like this:

 Actual   Target
 -0.125   0.016124906
  0.135   0.120799865
 ...      ...
 ...      ...

 I want to be able to break the data into tables based on quantiles in the
 Target column.  I can see (using cut2, and also quantile) how to get the
 barrier points between the different quantiles, and I can see how I would
 achieve this if I was just looking to split up a vector.  However I am
 trying to break up the whole table based on those quantiles, not just the
 vector.

 However I would like to be able to break the table into ten separate tables,
 each with both Actual and Target data, based on the Target data deciles:

 top_decile = ...(top decile of read_data, based on Target data)
 next_decile = ...and so on...
 bottom_decile = ...
 
 I would just add a factor variable indicating to which decile
 a particular observation belongs:
 
    dat$DEC <- with(dat, cut(Target, breaks=10, labels=1:10))
 
 If you really want to have separate data frames you can then
 split on the decile:
 
    L <- split(dat, dat$DEC)
 
 -Peter Ehlers
 -- 
 Peter Ehlers
 University of Calgary
 
 

-- 
View this message in context: 
http://n4.nabble.com/Help-with-Hmisc-cut2-split-and-quantile-tp1584647p1585427.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Making FTP operations with R

2010-03-08 Thread Duncan Temple Lang


R does provide support for basic FTP requests, but not for DELETE
requests, and not for several requests on the same connection.

I think your best approach is to use the RCurl package
(http://www.omegahat.org/RCurl).

  D.
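
A rough sketch of how this might look with RCurl, untested against a real server and offered only as a starting point. It assumes that the libcurl option postquote (raw FTP commands sent after a transfer) is passed through by getBinaryURL(), and that the server resolves the DELE argument relative to the directory of the downloaded file. The helper names ftp_delete_command and fetch_then_delete, and any URL or credentials used with them, are hypothetical.

```r
# Sketch only: download a file over FTP and ask the server to delete it
# afterwards, reusing one curl handle across calls.

# Build the raw FTP command that deletes a file (hypothetical helper).
ftp_delete_command <- function(remote_file) {
  paste("DELE", remote_file)
}

# Hypothetical helper: fetch 'file_url' into 'local_file', then delete the
# remote copy. Pass the returned handle back in to reuse the connection.
fetch_then_delete <- function(file_url, local_file, userpwd, handle = NULL) {
  if (!requireNamespace("RCurl", quietly = TRUE)) {
    stop("the RCurl package is required")
  }
  if (is.null(handle)) {
    handle <- RCurl::getCurlHandle()
  }
  # Assumption: 'postquote' commands run after a successful transfer, so the
  # file is removed only once its bytes have been received.
  bytes <- RCurl::getBinaryURL(file_url, curl = handle, userpwd = userpwd,
                               postquote = ftp_delete_command(basename(file_url)))
  writeBin(bytes, local_file)
  invisible(handle)
}
```

For many files, one would call fetch_then_delete() in a loop and pass the handle returned by the first call to the later ones, so that libcurl can keep the connection open.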

Orvalho Augusto wrote:
 Dear all, I need to perform some very basic FTP operations with R.
 
 I need to do a lot of gets and issue a corresponding delete command
 for each, on the same connection.
 
 How can I do that?
 
 Thanks in advance
 
 Caveman
 

Re: [R] Help with Hmisc, cut2, split and quantile


On 2010-03-08 18:00, Guy Green wrote:


Hi Peter & others,

Thanks (Peter) - that gets me really close to what I was hoping for.

The one problem I have is that the cut approach breaks the data into
intervals based on the absolute value of the Target data, rather than
their frequency.  In other words, if the data ranged from 0 to 50, the data
would be separated into 0-5, 5-10 and so on, regardless of the frequency
within those categories.  However I want to get the data into deciles.

The code that does this (incorporating Peter's) is:

read_data <- read.table("C:/Sample table.txt", header = TRUE)
read_data$DEC <- with(read_data, cut(Target, breaks=10, labels=1:10))
L <- split(read_data, read_data$DEC)

This means that I can get separate data frames, such as L$'10', which comes
out tidy, but only containing 2 data items (the sample has 63 rows, so each
decile should have 6+ data items):
   Actual    Target DEC
9   0.572 0.3778386  10
31  0.299 0.3546606  10

If I try to adjust this to get deciles using cut2(), I can break the data
into deciles as follows:

read_data <- read.table("C:/Sample table.txt", header = TRUE)
read_data$DEC <- with(read_data, cut2(read_data$Target, g=10), labels=1:10)
L <- split(read_data, read_data$DEC)

However, this time, while the data is broken into evenly sized data frames, the
labels for the separate data frames are unusable, e.g.:
$`[ 0.26477, 0.37784]`
   Actual    Target                 DEC
6   0.243 0.2650960 [ 0.26477, 0.37784]
9   0.572 0.3778386 [ 0.26477, 0.37784]
10 -0.049 0.3212681 [ 0.26477, 0.37784]
15  0.780 0.2778518 [ 0.26477, 0.37784]
31  0.299 0.3546606 [ 0.26477, 0.37784]
33  0.105 0.2647676 [ 0.26477, 0.37784]

Could anyone suggest a way of rearranging this to make the labels usable
again?  Sample data is reattached:
http://n4.nabble.com/file/n1585427/Sample_table.txt


I think that the easiest way would be to relabel the levels of DEC:

 read_data$DEC <- factor(read_data$DEC, labels = 1:10)

or, since I would prefer letters as factor levels:

 read_data$DEC <- factor(read_data$DEC, labels = LETTERS[1:10])
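
A small self-contained illustration of the relabelling, on made-up values. Here cut() with four equal-width bins stands in for cut2(), which is an assumption made only to keep the example in base R. One caveat: factor() drops unused levels when applied to a factor, so the number of labels has to match the number of levels that actually occur.

```r
# Sketch: replace interval-style factor labels with short ones.
# The values are made up; cut() with 4 bins stands in for cut2().
x <- c(0.02, 0.11, 0.27, 0.35, 0.36, 0.48, 0.52, 0.61, 0.77, 0.93)
f <- cut(x, breaks = 4)
print(levels(f))   # interval-style labels

# factor(f, labels = ...) keeps the level order and only renames the levels.
g <- factor(f, labels = LETTERS[1:4])
print(table(g))

# An in-place alternative: assign to levels() directly.
levels(f) <- 1:4
print(levels(f))
```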

Another way would be to use cut2() with onlycuts=TRUE to get the
breaks and then use these with cut() as in my original post:

 brks <- cut2(read_data$Target, g=10, onlycuts=TRUE)
 read_data$DEC <- with(read_data,
                       cut(Target, breaks=brks, labels=1:10))
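
The same idea can be sketched in base R alone, with quantile() standing in for cut2(..., onlycuts=TRUE). This is a substitution made for the example and may place boundary values differently from cut2(). The data are simulated and only the column names follow the thread. Note that when the breaks start at the minimum of the data, cut() needs include.lowest = TRUE, otherwise the smallest observation falls outside the first interval and becomes NA.

```r
# Sketch: decile groups in base R, using quantile() to get the breaks.
# 'read_data' is simulated; only the column names follow the thread.
set.seed(42)
read_data <- data.frame(Actual = rnorm(63), Target = rnorm(63))

# Eleven break points: the minimum, nine inner deciles, the maximum.
brks <- quantile(read_data$Target, probs = seq(0, 1, by = 0.1))

# include.lowest = TRUE keeps the minimum in the first interval.
read_data$DEC <- cut(read_data$Target, breaks = brks,
                     labels = 1:10, include.lowest = TRUE)

# Each decile should now hold 6 or 7 of the 63 rows.
counts <- table(read_data$DEC)
print(counts)
```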

But I still don't see why you want a list of separate data
frames. For most analyses, it's more convenient to just use the
factor variable to subset the data as needed.
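
To illustrate the point about using the factor directly, here is a minimal sketch on simulated data. The column names follow the thread and everything else is made up. Rows of any one decile are selected on demand, and per-decile summaries come from a single tapply() call, without building a list of data frames.

```r
# Sketch: work with the decile factor instead of a list of data frames.
# Simulated data; only the column names follow the thread.
set.seed(7)
read_data <- data.frame(Actual = rnorm(63), Target = rnorm(63))
read_data$DEC <- cut(read_data$Target,
                     breaks = quantile(read_data$Target, seq(0, 1, by = 0.1)),
                     labels = 1:10, include.lowest = TRUE)

# Rows of the top decile, selected when needed.
top_decile <- read_data[read_data$DEC == "10", ]
print(top_decile)

# Mean of Actual within each decile, in one call.
mean_by_decile <- tapply(read_data$Actual, read_data$DEC, mean)
print(mean_by_decile)
```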

 -Peter Ehlers



Thanks,
Guy



Peter Ehlers wrote:


On 2010-03-08 8:47, Guy Green wrote:


Hello,
I have a set of data with two columns: Target and Actual.  A sample table
(http://n4.nabble.com/file/n1584647/Sample_table.txt) is
attached but the data looks like this:

Actual  Target
-0.125  0.016124906
 0.135  0.120799865
...     ...
...     ...

I want to be able to break the data into tables based on quantiles in the
Target column.  I can see (using cut2, and also quantile) how to get the
barrier points between the different quantiles, and I can see how I would
achieve this if I was just looking to split up a vector.  However I am
trying to break up the whole table based on those quantiles, not just the
vector.

However I would like to be able to break the table into ten separate tables,
each with both Actual and Target data, based on the Target data deciles:

top_decile = ...(top decile of read_data, based on Target data)
next_decile = ...and so on...
bottom_decile = ...


I would just add a factor variable indicating to which decile
a particular observation belongs:

   dat$DEC <- with(dat, cut(Target, breaks=10, labels=1:10))

If you really want to have separate data frames you can then
split on the decile:

   L <- split(dat, dat$DEC)

 -Peter Ehlers
--
Peter Ehlers
University of Calgary






--
Peter Ehlers
University of Calgary


Re: [R] Help! I turned my data into junk!

2010-03-08 Thread Ravi Kulkarni


It's probably binary data - which implies that you can only read it with
the application that created it. What is the filename extension?

Ravi
-- 
View this message in context: 
http://n4.nabble.com/Help-I-turned-my-data-into-junk-tp1585481p1585585.html
Sent from the R help mailing list archive at Nabble.com.


[R] Odp: how to convert character variables into numeric variables directly