date:20111026

Re: [R] difftime producing NA values in R 2.12.2

2011-10-26 Thread Jeff Newmiller

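This is a daylight saving time issue: in US time zones that observe daylight 
saving time, 02:00 on 11 March 2007 does not exist, because the clocks jump 
from 02:00 straight to 03:00 on that date.
Use chron or set your TZ environment variable to a standard-time-only timezone 
(or don't enter nonexistent time values for the timezone in which you wish to 
compute).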
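
A minimal sketch of the standard-time-only approach, assuming it is acceptable 
to treat the recorded clock times as UTC (UTC has no daylight saving 
transitions). The variable names obs and hours_since are illustrative only and 
do not come from the original post:

```r
# Parse all timestamps in UTC so that 02:00 on 2007-03-11 is a valid time.
fmt <- "%m/%d/%Y %H:%M"
preciptime <- as.POSIXct("01/10/2007 14:00", format = fmt, tz = "UTC")
obs <- as.POSIXct(c("03/11/2007 01:00",
                    "03/11/2007 02:00",
                    "03/11/2007 03:00"),
                  format = fmt, tz = "UTC")

# Elapsed hours since the last rainfall, as plain numbers.
hours_since <- as.numeric(difftime(obs, preciptime, units = "hours"))
print(hours_since)
```

If the data were actually logged in local time with daylight saving shifts, 
this treatment would be off by an hour after the transition, so it only fits 
loggers that stay on standard time all year.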
---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Adrienne Wootten amwoo...@ncsu.edu wrote:

R-listers,

I have noticed several posts on issues with difftime producing NA's
but they have been for older versions of R. Here's the issue
associated with difftime that I am dealing with in R 2.12.2.

> preciptime = strptime("01/10/2007 14:00", format="%m/%d/%Y %H:%M")
> class(preciptime)
[1] "POSIXlt" "POSIXt"
> # Now using difftime, this is what happens

> difftime(strptime("03/11/2007 01:00", format="%m/%d/%Y %H:%M"),
+          preciptime, units="hours")
Time difference of 1427 hours

> difftime(strptime("03/11/2007 02:00", format="%m/%d/%Y %H:%M"),
+          preciptime, units="hours")
Time difference of NA hours

> difftime(strptime("03/11/2007 03:00", format="%m/%d/%Y %H:%M"),
+          preciptime, units="hours")
Time difference of 1428 hours

This doesn't make sense to me since both times used in difftime are in
the same format after using strptime, but the differences are coming
out wrong. It should be 1427, 1428, and 1429, so I'm confused as to
how to fix this. The idea with the program is to compute the time in
hours since last rainfall, so everything gets thrown off with this
producing NA's. For reference, Operating system is Windows 7
Enterprise, R is version 2.12.2 (64-bit), any guidance is appreciated.

Thanks in advance!

A
-- 
Adrienne Wootten
Graduate Research Assistant
State Climate Office of North Carolina
Department of Marine, Earth and Atmospheric Sciences
North Carolina State University

_

R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Example(chron) doesn't work

2011-10-26 Thread hchui

Hi, there,

I have a similar problem. The chron example gives NA. dates doesn't work but
times does.

I would appreciate it if there's a fix for it.

Thanks,
Helena

> example(chron)

chron> dts <- dates(c("02/27/92", "02/27/92", "01/14/92",
chron+                "02/28/92", "02/01/92"))

chron> dts
[1] NA NA NA NA NA

chron> # [1] 02/27/92 02/27/92 01/14/92 02/28/92 02/01/92
chron> tms <- times(c("23:03:20", "22:29:56", "01:03:30",
chron+                "18:21:03", "16:56:26"))

chron> tms
[1] 23:03:20 22:29:56 01:03:30 18:21:03 16:56:26

chron> # [1] 23:03:20 22:29:56 01:03:30 18:21:03 16:56:26
chron> x <- chron(dates = dts, times = tms)

chron> x
[1] (NA NA) (NA NA) (NA NA) (NA NA) (NA NA)

chron> # [1] (02/27/92 23:03:19) (02/27/92 22:29:56) (01/14/92 01:03:30)
chron> # [4] (02/28/92 18:21:03) (02/01/92 16:56:26)
chron> 
chron> # We can add or subtract scalars (representing days) to dates or
chron> # chron objects:
chron> c(dts[1], dts[1] + 10)
Error in y + ifelse(m > 2, 0, -1) : 
  non-numeric argument to binary operator
In addition: Warning message:
In matrix(unlist(lapply(dots, origin)), nrow = 3) :
  data length [2] is not a sub-multiple or multiple of the number of rows [3]
> packageDescription("chron")$Version
[1] "2.3-42"
> R.version.string
[1] "R version 2.13.1 (2011-07-08)"
> win.version()
[1] "Windows 7 x64 (build 7600)"



Re: [R] Library chron

2011-10-26 Thread hchui

if else worked. many thanks.


[R] Help with a scatter plot

2011-10-26 Thread RanRL

Hi everyone,

I have some market research data that I want to arrange in one plot for easy
viewing. The data look something like:

Product  Color  StoreA  StoreB  StoreC  StoreD  Price

ProdA    R      NA      4.33    2       4.33    35
         G      NA      4.33    2       4.33    35
         B      NA      4.33    2       3.76    58
         Y      NA      3.72    3       5.33    23

ProdB    B      5.44    NA      4.22    3.76    87

ProdC    G      4.77    3.22    4.77    2.10    65
         B      ...     ...     ...     ...     ..

..

And so on...

I want to create a plot where the colors of the points represent the Product
(A,B,C..), the plotting character represents the color (X for yellow, box for
green, etc.), the X axis is the price and the Y axis is the number (0-5) from
the different Stores (A,B,C,D). I've thought either of creating a matrix of 4
plots (one for each of the 4 stores) or of combining them in some creative way
into one plot.

Please help me or point me in the right direction as to which functions to
look into. I've been playing around with ggplot for a few days, but can't
seem to wrap my head around it yet...

Thanks


Re: [R] Building package/DESCRIPTION file not existing?

2011-10-26 Thread Uwe Ligges


As a first step, try to get rid of the warning by doing what it says:
"CYGWIN environment variable option nodosfilewarning turns off this 
warning."


So set (at least):
CYGWIN=nodosfilewarning

and go ahead.
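
One way to set the variable for the current R session only is sketched below. 
It assumes the Cygwin-based tools are started from that same R session, so 
that they inherit its environment. It only silences the warning and is not 
claimed to fix the DESCRIPTION error:

```r
# Set the CYGWIN option named in the warning for this R process and
# for any child processes it launches afterwards.
Sys.setenv(CYGWIN = "nodosfilewarning")

# Confirm the value that child processes will see.
cygwin_value <- Sys.getenv("CYGWIN")
print(cygwin_value)
```

Setting the variable in the Windows environment settings instead would make it 
apply to every session.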

Uwe Ligges





On 25.10.2011 02:10, Francois Rousseu wrote:



Hello useRs

I am trying to build a package for personal use and to make working with 
other people easier, but I keep getting the same error message about the 
DESCRIPTION file not existing.

when trying to install from a source tar.gz file:

Error in .read_description(dfile) :
   file 
'C:/Users/Propriétaire/AppData/Local/Temp/RtmpHFMONb/R.INSTALL647a3535/mypkg/DESCRIPTION'
 does not exist

when trying to build a binary version:

Error in .read_description(dfile) :
   file 'C:/Users/Propriétaire/Documents/RETROBIRD/mypkg/DESCRIPTION' does not 
exist

In this last case, the DESCRIPTION file is certainly there! Also, help and 
DESCRIPTION files are edited and my path variable seems to be set correctly as 
I can access R and tex (from MiKTeX 2.9) from the console. I feel it might be 
related to language issues (Windows on my system is in French, see 
sessionInfo() at bottom of message) or something about temporary directories, 
but I really can't find the problem. I've looked into the cygwin warning, but 
it didn't seem to be the problem, though I may be wrong.

Any hints? Below is the complete sequence with errors.

Thanks,
Francois Rousseu



setwd("C:/Users/Propriétaire/Documents/RETROBIRD/")
library(devtools)
f <- function(x, y) x + y
d <- data.frame(a = 1, b = 2)
package.skeleton(list = c("f", "d"), name = "mypkg")


## editing of help and description files

Creating directories ...
Creating DESCRIPTION ...
Creating Read-and-delete-me ...
Saving functions and data ...
Making help files ...
Done.
Further steps are described in './mypkg/Read-and-delete-me'.


build("C:/Users/Propriétaire/Documents/RETROBIRD/mypkg")


* checking for file 
'C:\Users\Propriétaire\Documents\RETROBIRD\mypkg/DESCRIPTION' ... OK
* preparing 'mypkg':
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* looking to see if a 'data/datalist' file should be added
* building 'mypkg_1.0.tar.gz'
cygwin warning:
   MS-DOS style path detected: 
C:/Users/Propri\xC3\xA9taire/Documents/RETROBIRD/mypkg_1.0.tar.gz
   Preferred POSIX equivalent is: 
/cygdrive/c/Users/Propri\xC3\xA9taire/Documents/RETROBIRD/mypkg_1.0.tar.gz
   CYGWIN environment variable option nodosfilewarning turns off this warning.
   Consult the user's guide for more details about POSIX paths:
 http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
[1] "C:/Users/Propriétaire/Documents/RETROBIRD/mypkg_1.0.tar.gz"


install.packages(pkgs = "mypkg_1.0.tar.gz",
                 lib = "C:/Users/Propriétaire/Documents/R/win-library/2.13",
                 repos = NULL, type = "source")


* installing *source* package 'mypkg' ...
Error in .read_description(dfile) :
   file 
'C:/Users/Propriétaire/AppData/Local/Temp/RtmpHFMONb/R.INSTALL647a3535/mypkg/DESCRIPTION'
 does not exist
ERROR: installing package DESCRIPTION failed for package 'mypkg'
* removing 'C:/Users/Propriétaire/Documents/R/win-library/2.13/mypkg'
Warning messages:
1: running command 'C:/PROGRA~1/R/R-213~1.0/bin/x64/R CMD INSTALL -l 
C:/Users/Propriétaire/Documents/R/win-library/2.13   mypkg_1.0.tar.gz' had 
status 1
2: In install.packages(pkgs = "mypkg_1.0.tar.gz", lib = 
"C:/Users/Propriétaire/Documents/R/win-library/2.13",  :
   installation of package 'mypkg_1.0.tar.gz' had non-zero exit status


build("C:/Users/Propriétaire/Documents/RETROBIRD/mypkg", binary = TRUE)


* installing to library 'C:/Users/Propriétaire/Documents/R/win-library/2.13'
* installing *source* package 'mypkg' ...
Error in .read_description(dfile) :
   file 'C:/Users/Propriétaire/Documents/RETROBIRD/mypkg/DESCRIPTION' does not 
exist
ERROR: installing package DESCRIPTION failed for package 'mypkg'
* removing 'C:/Users/Propriétaire/Documents/R/win-library/2.13/mypkg'
Error: Command failed (1)
In addition: Warning message:
running command 'C:/PROGRA~1/R/R-213~1.0/bin/x64/R CMD INSTALL 
C:\Users\Propriétaire\Documents\RETROBIRD\mypkg --build' had status 1


sessionInfo()


R version 2.13.0 (2011-04-13)
Platform: x86_64-pc-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=French_Canada.1252  LC_CTYPE=French_Canada.1252
[3] LC_MONETARY=French_Canada.1252 LC_NUMERIC=C
[5] LC_TIME=French_Canada.1252
attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base
other attached packages:
[1] roxygen2_2.1 digest_0.5.1 devtools_0.4
loaded via a namespace (and not attached):
[1] brew_1.0-6 plyr_1.6   RCurl_1.6-10.1 stringr_0.5tools_2.13.0


Re: [R] Reading in and modifying multiple datasets in a loop

2011-10-26 Thread Uwe Ligges




On 24.10.2011 23:10, Debs Majumdar wrote:

Thanks Uwe. This works perfectly.

###


owd <- setwd(pth)
fls <- list.files(pattern = "^chr")
ufls <- unique(sapply(strsplit(fls, "_"), "[", 1))
for(i in ufls){
  of <- strsplit(i, "\\.")[[1]]
  of <- paste(of[1], tail(of, 1), sep = ".")
  impute2databel(genofile = i,
                 samplefile = paste(i, "info", sep = "_"),
                 outfile = of,
                 makeprob = TRUE, old = FALSE)
}
setwd(owd)




I have a question regarding how strsplit works.

When my files are the following:

 chr1.one.phased.impute2.chunk1
 chr1.one.phased.impute2.chunk1_info
 chr1.one.phased.impute2.chunk1_info_by_sample
 chr1.one.phased.impute2.chunk1_summary
 chr1.one.phased.impute2.chunk1_warnings
ufls <- unique(sapply(strsplit(fls, "_"), "[", 1))

This works like a charm.

I have another dataset where the files are


 study1_chr1.one.phased.impute2.chunk1
 study1_chr1.one.phased.impute2.chunk1_info
 study1_chr1.one.phased.impute2.chunk1_info_by_sample
 study1_chr1.one.phased.impute2.chunk1_summary
 study1_chr1.one.phased.impute2.chunk1_warnings

... and so on.

and I wanted to run the same loop, but I was unable to change strsplit so that 
it works when the files are named as above.

I tried

ufls <- unique(sapply(strsplit(fls, "_"), "[", 2))



unique(gsub("(_.*)_.*", "\\1", x))

Should do if there is a first underscore.
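
A hedged alternative sketch, assuming every file name has the shape 
study_chunk[_suffix...] so that the wanted base name is always the first two 
underscore-separated fields. Because .* is greedy, the gsub() pattern above 
appears to keep everything up to the last underscore, so a name ending in 
_info_by_sample may come back as ..._info_by rather than the bare chunk name. 
The file names below are copied from the question; first_two and by_split are 
illustrative names:

```r
fls <- c("study1_chr1.one.phased.impute2.chunk1",
         "study1_chr1.one.phased.impute2.chunk1_info",
         "study1_chr1.one.phased.impute2.chunk1_info_by_sample",
         "study1_chr1.one.phased.impute2.chunk1_summary",
         "study1_chr1.one.phased.impute2.chunk1_warnings")

# Keep the first two underscore-separated fields and drop the rest.
first_two <- unique(sub("^([^_]*_[^_]*).*$", "\\1", fls))

# The same idea with strsplit(): glue fields 1 and 2 back together.
by_split <- unique(sapply(strsplit(fls, "_"),
                          function(p) paste(p[1:2], collapse = "_")))

print(first_two)
print(by_split)
```

Either vector can then be used in place of ufls in the loop.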

Uwe Ligges




but this knocks off study1 (modified code below).  What modification do I 
need to make to make this run:



fls <- list.files(pattern = "study1_chr")
ufls <- unique(sapply(strsplit(fls, "_"), "[", 2))

library(GenABEL)

for(i in ufls){
  of <- strsplit(i, "\\.")[[1]]
  of <- paste(of[1], tail(of, 1), sep = ".")
  impute2databel(genofile = i,
                 samplefile = paste(i, "info", sep = "_"),
                 outfile = of,
                 makeprob = TRUE, old = FALSE)

}

#

Thanks,

  Debs


- Original Message -
From: Debs Majumdar debs_st...@yahoo.com
To: r-help@r-project.org
Cc:
Sent: Friday, October 21, 2011 2:32 PM
Subject: Reading in and modifying multiple datasets in a loop



Hi,

   I have been given a set of around 300 files where there are 5 files 
corresponding to each chunk.

E.g. Chunk 1 for chr1 contains these 5 files:

 chr1.one.phased.impute2.chunk1
 chr1.one.phased.impute2.chunk1_info
 chr1.one.phased.impute2.chunk1_info_by_sample
 chr1.one.phased.impute2.chunk1_summary
 chr1.one.phased.impute2.chunk1_warnings

For chr 1 there are 47 chunks, chr2 has 42 chunks...and it ends at chr22 with 
23 chunks.

I am using the DatABEL package to convert them to databel format using the 
following command:


impute2databel(genofile = "chr1.one.phased.impute2.chunk1",
               samplefile = "chr1.one.phased.impute2.chunk1_info",
               outfile = "chr1.chunk1", makeprob = TRUE,
               old = FALSE)

which uses two files per chunk.


Is there a way I can automate this so that the code goes through each chunk of 
each chromosome and does the conversion to databel format.


Thanks,

  -Debs


Re: [R] Simulation from discrete uniform

2011-10-26 Thread BSanders

If you want a discrete uniform on 1-10, use: ceiling(10*runif(1))
If you want one on 0-12, use: ceiling(13*runif(1))-1
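
For comparison, a short sketch that draws many values at once with base R's 
sample() next to the ceiling() construction above. The seed, the sample size 
of 1000 and the variable names are arbitrary choices made for illustration:

```r
set.seed(42)

# Discrete uniform on 1..10 and on 0..12 via sample().
draws_1_10 <- sample(1:10, size = 1000, replace = TRUE)
draws_0_12 <- sample(0:12, size = 1000, replace = TRUE)

# The ceiling() construction, vectorised over 1000 uniform draws.
draws_ceiling <- ceiling(13 * runif(1000)) - 1

print(range(draws_1_10))
print(range(draws_0_12))
print(range(draws_ceiling))
```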


Re: [R] Help with a scatter plot

2011-10-26 Thread Jim Lemon


On 10/26/2011 05:48 PM, RanRL wrote:

Hi everyone,

I have some data about a market research which I want to arrange in one plot
for easy viewing,
the data looks something like:

Product  Color  StoreA  StoreB  StoreC  StoreD  Price

ProdA    R      NA      4.33    2       4.33    35
         G      NA      4.33    2       4.33    35
         B      NA      4.33    2       3.76    58
         Y      NA      3.72    3       5.33    23

ProdB    B      5.44    NA      4.22    3.76    87

ProdC    G      4.77    3.22    4.77    2.10    65
         B      ...     ...     ...     ...     ..
..

And so on...

I want to create a plot where the colors of the hits represent the Product
(A,B,C..), the character represent the color (X for yellow, box for green,
etc..), the X axis is the price and the Y axis is the number (0-5) from the
different Stores (A,B,C,D). I've thought either to create a matrix of 4
plots ( for the 4 stores) or in some creative way combine them into one
plot?

Please help me or point me in the right direction as to which functions to
look into, I've been playing around with ggplot for a few days, but can't
seem to wrap my head around it yet...


Hi RanRL,
I swapped the colors and product names, but this rather inelegant code 
might do what you want:


ranrl <- read.table("ranrl.dat", header=TRUE)
plot(ranrl$Price, ranrl$StoreC, ylim=range(ranrl[,3:5], na.rm=TRUE),
  type="n", xlab="Price", ylab="Number sold")
text(ranrl$Price[1], ranrl[1,3:5],
  paste("ProdA", names(ranrl)[3:5], sep="\n"),
  col="red")
text(ranrl$Price[2], ranrl[2,3:5],
  paste("ProdA", names(ranrl)[3:5], sep="\n"),
  col="green")
text(ranrl$Price[3], ranrl[3,3:5],
  paste("ProdA", names(ranrl)[3:5], sep="\n"),
  col="blue")
text(ranrl$Price[4], ranrl[4,3:5],
  paste("ProdA", names(ranrl)[3:5], sep="\n"),
  col="yellow")
text(ranrl$Price[5], ranrl[5,3:5],
  paste("ProdB", names(ranrl)[3:5], sep="\n"),
  col="blue")
text(ranrl$Price[6], ranrl[6,3:5],
  paste("ProdC", names(ranrl)[3:5], sep="\n"),
  col="green")

Jim


[R] Random Forest Classification

2011-10-26 Thread Mohammed Rashad

Hi All,
I want to do Random Forest classification. I have installed R and the
randomForest package, but I don't know how to use it.

Is there any open source remote sensing application that does RF
classification on satellite images?

Does anyone have a random forest classification example? An example in any
language or package would be fine.

Has anyone done this in R? If yes, how?

I googled RF classification, but most of the results are for medical and
disease research, not for remote sensing.


-- 
Regards,
   Mohammed Rashad K M
   M.S. (By Research) student
   Lab for Spatial Informatics
   Department of CSE
   International Institute of Information Technology
   Hyderabad, India


[R] Want to exclude axis numbering in plot.ca

2011-10-26 Thread Mark Webb

plot.ca gives numbers on each axis. How do I exclude these? 
I have read the R documentation for plot.ca but see no option to exclude axis 
numbers.

Any suggestions?

--
Mark Webb

Line +27 (21) 786 4379
Cell +27 (72) 199 1000 [Poor reception]
Fax  +27 (86) 260 1946

Skype   tomarkwebb
Email   targetlinkm...@gmail.com
Client ftp  http://targetlinkresearch.co.za/cftp/


Re: [R] lock a package to specific R version

2011-10-26 Thread Uwe Ligges




On 25.10.2011 11:42, Mehmet Suzen wrote:

Hi,

I was wondering if it is possible to lock a package to a specific
version of R. Dependency attribute in the package DESCRIPTION
only accepts >= AFAIU
(http://cran.r-project.org/doc/manuals/R-exts.html#fn-3 )

Any work around?


Intervals are possible, and you can restrict them to one version as 
follows:


Depends: R (>= 2.13.2), R (<= 2.13.2)
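
The same closed interval can also be tested at run time. This is only a 
sketch: r_version_in_range is a hypothetical helper invented for illustration 
and is not part of R or of the DESCRIPTION mechanism, which remains the place 
to declare the dependency:

```r
# TRUE if version v lies in the closed interval [lo, hi].
# v defaults to the running R version; lo and hi are version strings.
r_version_in_range <- function(lo, hi, v = getRversion()) {
  v >= lo && v <= hi
}

# Checks against fixed versions, so the result does not depend on
# which R is running this example.
exact_ok  <- r_version_in_range("2.13.2", "2.13.2",
                                v = package_version("2.13.2"))
newer_bad <- r_version_in_range("2.13.2", "2.13.2",
                                v = package_version("2.14.0"))
print(c(exact_ok, newer_bad))
```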

Uwe Ligges





Thanks,

Mehmet

Re: [R] Building package/DESCRIPTION file not existing?

2011-10-26 Thread Martin Maechler

Francois Rousseu francoisrous...@hotmail.com
on Mon, 24 Oct 2011 20:10:27 -0400 writes:

Hello useRs

I am trying to build a package for personal use and for
making easier working with other people but I keep getting
the same error message about the DESCRIPTION file not
existing.

when trying to install from a source tar.gz file:

Error in .read_description(dfile) : file

'C:/Users/Propriétaire/AppData/Local/Temp/RtmpHFMONb/R.INSTALL647a3535/mypkg/DESCRIPTION'
does not exist

when trying to build a binary version:

Error in .read_description(dfile) : file
'C:/Users/Propriétaire/Documents/RETROBIRD/mypkg/DESCRIPTION'
does not exist

In this last case, the DESCRIPTION file is certainly
there! Also, help and DESCRIPTION files are edited and my
path variable seems to be set correctly as I can access R
and tex (from MiKTeX 2.9) from the console. I feel it
might be related to language issues (Windows on my system
is in French, see sessionInfo() at bottom of message) or
something about temporary directories, but I really can't
find the problem. I've looked into the cygwin warning, but
it didn't seem to be the problem, though I may be wrong.

Yes, I'm almost sure it's the language issues.

I've recently taught a course on R Package building
and on Windows, the user had problems because of an 'ä'
(a-Umlaut) in one of the directories in her 'path'.

So if you work from another place than
'C:/Users/Propriétaire/' this may solve the main problem.
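
A small sketch for spotting such paths before building. has_non_ascii is a 
hypothetical helper written for this illustration, and C:/Rpkgs/mypkg is an 
invented ASCII-only location, not a path from the thread:

```r
# TRUE if the path contains any character outside the ASCII range.
has_non_ascii <- function(path) {
  any(utf8ToInt(enc2utf8(path)) > 127L)
}

problem_path <- "C:/Users/Propri\u00e9taire/Documents/RETROBIRD/mypkg"
plain_path   <- "C:/Rpkgs/mypkg"   # assumed alternative location

checks <- c(problem = has_non_ascii(problem_path),
            plain   = has_non_ascii(plain_path))
print(checks)
```

Note that the temporary directory used during installation also sits under 
the user profile in the error messages above, so moving only the package 
sources may not be enough.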

Bonnes salutations,
Martin Maechler, ETH Zurich

Any hints? Below is the complete sequence with errors.

Thanks, Francois Rousseu

[.]


[R] gam predictions with negbin model

2011-10-26 Thread Kari Ruohonen


Hi,
I wonder if predict.gam is supposed to work with family=negbin() 
definition? It seems to me that the values returned by type="response" 
are far off the observed values. Here is an example output from the 
negbin examples:


> set.seed(3)
> n <- 400
> dat <- gamSim(1, n=n)
> g <- exp(dat$f/5)
> dat$y <- rnbinom(g, size=3, mu=g)
> b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3), family=negbin(3), data=dat)
> summary(y)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.6061  1.6340  2.8120  2.7970  3.9250  4.9830
> summary(predict(b, type="response"))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.8972  3.1610  4.8140  6.1170  8.1300 28.0100

I.e. the range and mean of observed values (y) are smaller than those of 
the predictions from the gam model. Should I somehow apply the estimated 
theta on these predictions?


regards, Kari


Re: [R] lock a package to specific R version

2011-10-26 Thread Prof Brian Ripley


On Wed, 26 Oct 2011, Uwe Ligges wrote:




On 25.10.2011 11:42, Mehmet Suzen wrote:

Hi,

I was wondering if it is possible to lock a package to a specific
version of R. Dependency attribute in the package DESCRIPTION
only accepts >= AFAIU
(http://cran.r-project.org/doc/manuals/R-exts.html#fn-3 )

Any work around?


Intervals are possible, and you can restrict them to one version as follows:

Depends: R (>= 2.13.2), R (<= 2.13.2)


Or even use ==

The point of the footnote is that install.packages() will download a 
package only checking any >= requirements (and I suspect it will then 
install a binary version of a package).  R CMD INSTALL will not 
install it from the sources, and library() will not load it.


I don't see why you would want to do this: why would a package work 
with 2.13.1 and not 2.13.2, or 2.13.2 and not 2.14.0?  Ranges may make 
sense.




Uwe Ligges





Thanks,

Mehmet



--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [R] gam predictions with negbin model

2011-10-26 Thread Achim Zeileis


On Wed, 26 Oct 2011, Kari Ruohonen wrote:


Hi,
I wonder if predict.gam is supposed to work with family=negbin() definition? 
It seems to me that the values returned by type="response" are far off the 
observed values. Here is an example output from the negbin examples:



> set.seed(3)
> n <- 400
> dat <- gamSim(1, n=n)
> g <- exp(dat$f/5)
> dat$y <- rnbinom(g, size=3, mu=g)
> b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3), family=negbin(3), data=dat)
> summary(y)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.6061  1.6340  2.8120  2.7970  3.9250  4.9830

> summary(predict(b, type="response"))

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.8972  3.1610  4.8140  6.1170  8.1300 28.0100

I.e. the range and mean of observed values (y)


What exactly is y in the code above? I guess you mean dat$y:

R> summary(dat$y)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.000   2.000   4.000   6.235   8.000  68.000

which looks rather reasonable...
Z

are smaller than those of the 
predictions from the gam model. Should I somehow apply the estimated theta on 
these predictions?


regards, Kari


[R] nls -- singular gradient problem

2011-10-26 Thread kv

Hi list,

I see this question is quite a regular feature, 
but searching the past instances I could not 
find an answer to my specific problem.

Simply, trying to optimize this model gives 
a singular gradient problem. optim() seems to 
be able to solve it, though I would like to 
do these things in nls().

Treated <- Puromycin[Puromycin$state == "treated", ]
weighted.MM <- function(resp, conc, K){
  pred <- K[1] + (1 - exp(K[2])) * conc
  (resp - pred)
}
Pur.wt <- nls(~weighted.MM(rate, conc, K), data = Treated,
              start = list(K = c(0, 0.1)))
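
One possible reading, offered as an assumption rather than a diagnosis from 
the thread: the model as posted is a straight line in conc whose slope is 
1 - exp(K[2]), and that slope is below 1 for every finite K[2]. The sketch 
below compares this bound with the slope of an ordinary straight-line fit to 
the same built-in data. ls_slope and slope_at are illustrative names:

```r
Treated <- Puromycin[Puromycin$state == "treated", ]

# Slope of a plain least-squares line of rate on conc.
ls_slope <- unname(coef(lm(rate ~ conc, data = Treated))["conc"])

# Slope the posted model can express for a given K[2].
slope_at <- function(k2) 1 - exp(k2)

print(ls_slope)
print(slope_at(c(-50, -5, 0, 0.1)))
```

If the least-squares slope is well above 1, the fit can only improve by 
pushing K[2] towards minus infinity, where the derivative with respect to 
K[2] vanishes. That would be consistent with a singular gradient, but it has 
not been confirmed by anyone in this thread.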


Please advise,

Best,


Re: [R] gam predictions with negbin model

2011-10-26 Thread Kari Ruohonen



Thanks - what a stupid mistake, an old .RData hanging around even if I 
started a new R instance. Terribly sorry and many apologies.
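
A small sketch of how a leftover workspace object can mask a data frame 
column of the same name. The values are made up for illustration and are not 
the data from the thread:

```r
set.seed(1)

# Stands in for a stale 'y' restored from an old .RData file.
y <- runif(5, min = 0.5, max = 5)

# The data actually being modelled keeps its own 'y' column.
dat <- data.frame(y = rpois(5, lambda = 6))

stale  <- summary(y)             # summarises the workspace object
fresh  <- summary(dat$y)         # summarises the column
in_dat <- with(dat, summary(y))  # with() looks inside dat first

print(stale)
print(fresh)
```

Starting R with --no-restore (or --vanilla) avoids loading a saved workspace 
in the first place.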


Kari


Re: [R] Help with a scatter plot

2011-10-26 Thread Dennis Murphy

Hi:

The sort of thing you appear to want is fairly straightforward to do in
lattice and ggplot2, as both have ways to automate conditioning plots.
Since you were looking at ggplot2, let's consider that problem. You
don't really show enough data to provide a useful demonstration, but
let's see if we can capture the essential structure.

> I want to create a plot where the colors of the hits represent the Product
> (A,B,C..), the character represents the color (X for yellow, box for green,
> etc..), the X axis is the price and the Y axis is the number (0-5) from the
> different Stores (A,B,C,D). I've thought either to create a matrix of 4
> plots ( for the 4 stores) or in some creative way combine them into one
> plot?

The first step is to melt the data so that Store becomes a factor
variable and its corresponding values are assigned to another
variable. To that end, one can invoke the very useful melt() function
in the reshape2 package:

library('reshape2')
mdata <- melt(mydata, id = c('Product', 'Color', 'Price'))

This creates a new data frame with variables Product, Color, Price,
variable and value. variable contains StoreA, ... StoreD as factor
levels and value is a numeric variable consisting of the corresponding
values. For its structure, see
str(mdata)
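
For readers without reshape2, the same long layout can be sketched in base R. 
This is a stand-in for melt(), not the reshape2 implementation. The data frame 
mydata below is typed in from the rows shown in the question, on the 
assumption that blank Product cells repeat the product above them. The 
incomplete last row is left out:

```r
mydata <- data.frame(
  Product = c("ProdA", "ProdA", "ProdA", "ProdA", "ProdB", "ProdC"),
  Color   = c("R", "G", "B", "Y", "B", "G"),
  StoreA  = c(NA, NA, NA, NA, 5.44, 4.77),
  StoreB  = c(4.33, 4.33, 4.33, 3.72, NA, 3.22),
  StoreC  = c(2, 2, 2, 3, 4.22, 4.77),
  StoreD  = c(4.33, 4.33, 3.76, 5.33, 3.76, 2.10),
  Price   = c(35, 35, 58, 23, 87, 65)
)

stores <- c("StoreA", "StoreB", "StoreC", "StoreD")
n <- nrow(mydata)

# Repeat the id columns once per store and stack the store columns.
mdata <- data.frame(
  mydata[rep(seq_len(n), times = length(stores)),
         c("Product", "Color", "Price")],
  variable = factor(rep(stores, each = n), levels = stores),
  value    = unlist(mydata[stores], use.names = FALSE),
  row.names = NULL
)

str(mdata)
```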

If you want to change StoreA - StoreD to A - D, say, then you could
optionally do

mdata$variable <- factor(mdata$variable, labels = LETTERS[1:4])


Assuming that you've done enough reading to understand what aesthetics
are about, the problem is essentially this:

x = Price
y = value
shape = Color
faceting variable = variable (the stores)

So a template of a ggplot2 graph for the melted data might look something like

library('ggplot2')
ggplot(mdata, aes(x = Price, y = value, shape = Color)) +
    geom_point() +
    facet_wrap( ~ variable, ncol = 2) +
    scale_shape_manual('Color', breaks = levels(mdata$Color),
                       values = c(4, 0, 2, 8),
                       labels = c('Blue', 'Green', 'Red', 'Yellow'))

This assigns  x, box, triangle and asterisk as shapes via their
numeric codes (see Hadley Wickham's ggplot2 book, p. 197 for the
reference). The labels = argument lets you change the letters B, G, R,
Y (which would comprise the default labels) to something more
evocative.

If for some odd reason you wanted to add corresponding colors to the
shapes, you could also do that, as follows:

ggplot(mdata, aes(x = Price, y = value, shape = Color, colour = Color)) +
    geom_point() +
    facet_wrap( ~ variable, ncol = 2) +
    scale_shape_manual('Color', breaks = levels(mdata$Color),
                       values = c(4, 0, 2, 8),
                       labels = c('Blue', 'Green', 'Red', 'Yellow')) +
    scale_colour_manual('Color', breaks = levels(mdata$Color),
                        values = c('blue', 'green', 'red', 'yellow'),
                        labels = c('Blue', 'Green', 'Red', 'Yellow'))

This should color the shapes in the graph and provide one (merged)
legend with colored shapes as symbols.

HTH,
Dennis



[R] set different font family for strings in mtext or text?

2011-10-26 Thread Jinsong Zhao


Hi there,

Is it possible to set different font families for strings in mtext or text?

For example, on the Windows platform with the windows() device:

plot(1:10, type = "n")
text(5,5, "Chinese (English)") #Chinese for Chinese characters

gives the correct Chinese and English characters in two 
different font families, i.e., English characters in the default sans family 
and Chinese characters in the system default font family (it seems that 
the Chinese font family cannot be set or changed).


However, when using pdf() or postscript() with the font family set to 
Times, an error message appears:

conversion failure on '...' in 'mbcsToSbcs': dot substituted for...

When the family is set to "song" (a CJK font family), the English characters 
are displayed in that CJK font family.


I would like to know whether there is a mechanism for setting different 
font families within one string, e.g., if a character cannot be found in 
the default font family, then another font family is searched.


Any suggestions or comments will be really appreciated.

Regards,
Jinsong

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] strucchange Nyblom-Hansen Test?

2011-10-26 Thread buehlerman

Thank you, things seem to be clearer :-)

 Hansen extended this to the linear regression model and proposed to either
 compute one test statistic per parameter (which you can do with the parm
 argument of gefp) or a joint statistic for all parameters. Hansen included
 in all parameters also the variance,

The parm argument of gefp is a nice feature, but what about the
significance level in the test statistic computation (sctest)? Is a multiple
testing correction applied, or should I rather use the double max statistic
for this case, as recommended below?

An excerpt from page 5 of the paper "A Unified Approach to Structural Change
Tests Based on F Statistics, OLS Residuals, and ML Scores" (Achim Zeileis):

Hansen (1992) suggests to compute this statistic for the full process efp(t)
to test all coefficients simultaneously and also for each component of the
process (efp(t))j (denoting the j-th component of the process efp(t),
j = 1, . . . , k) individually to assess which parameter causes the
instability. *Note, that this approach leads to a violation of the
significance level of the procedure if no multiple testing correction is
applied.* This can be avoided if a functional is applied to the empirical
fluctuation process which aggregates over time first yielding k independent
test statistics (see Zeileis and Hornik 2003, for more details).

--
View this message in context: 
http://r.789695.n4.nabble.com/strucchange-Nyblom-Hansen-Test-tp3887208p3940055.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] set different font family for strings in mtext or text?

2011-10-26 Thread Prof Brian Ripley


See ?par: check the 'family' parameter.
You can select 'family' for each call to mtext or text.
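
A minimal sketch of per-call families, assuming only the device-independent family names "serif", "sans" and "mono" (not a CJK setup); the null PDF device is used just so that no file is written:

```r
# Open a PDF device without an output file (assumption made only to keep
# the sketch self-contained)
pdf(file = NULL)
plot(1:10, type = "n")

# Each call to text() or mtext() can carry its own 'family'
text(5, 6, "serif label", family = "serif")
text(5, 4, "sans label", family = "sans")
mtext("monospaced margin text", side = 3, family = "mono")

# strwidth() accepts the same 'family', which helps when several pieces
# of one line have to be placed next to each other
w <- strwidth("serif label", family = "serif")
dev.off()
```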

However, mixing families is rather ugly, and there are font families 
that cover both English and Chinese.


Note that the main problem with postscript() and pdf() is the limited 
support in those languages for non-8-bit character encodings: R cannot 
magically remove restrictions of languages designed in the 1970s.
See also http://cran.r-project.org/doc/Rnews/Rnews_2006-2.pdf 
(referenced from ?pdf)


Users of other OSes have the option of using cairographics-based 
devices (e.g. cairo_pdf), and so will Windows' users as from 2.14.0 
(which is in RC): however, the font flexibility is far less on 
Windows.


On Wed, 26 Oct 2011, Jinsong Zhao wrote:


Hi there,

Is it possible to set different font family for strings in mtext or text?

For example, on windows platform with windows() device:

plot(1:10, type = "n")
text(5,5, "Chinese (English)") #Chinese for Chinese characters

it will give the correct Chinese and English characters with two different 
font family, i.e., English character in default sans family, and Chinese 
character in the system default font family (it seems that the Chinese font 
family can not be set or changed).


It certainly can, and the rw-FAQ describes how to do so.

However, when using pdf() or postscript(), if setting the font family to 
Times, then error message will appear:

conversion failure on '...' in 'mbcsToSbcs': dot substituted for...

When set the family song (a CJK font family), the English character will be 
displayed in that CJK font family.


I hope to know, is there a mechanism that can be used to set different font 
family for one string, e.g., if one character can not be find in the default 
font family, then search for another font family?


You have to specify the family: R will not guess what you wanted.


Any suggestions or comments will be really appreciated?

Regards,
Jinsong




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [R] difficulties with MuMIn model generation with coxph

2011-10-26 Thread Kamil Bartoń


Dear Sophie,

The answer is 'typo'. 'dredge' does not have an argument named 'marge.ex'.
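
One plausible way a misspelled argument can end up as a logLik error is sketched below. This is an assumption about the mechanism, not MuMIn's actual code: `select_models` and `rank_models` are hypothetical stand-ins, and `marg.ex` is only assumed to be the intended spelling. The idea is that an unrecognised argument falls into `...` and is then treated as one more fitted model.

```r
# Hypothetical ranking helper: every extra argument is treated as another
# fitted model and passed to logLik()
rank_models <- function(object, ...) {
  models <- list(object, ...)
  sapply(models, function(m) as.numeric(logLik(m)))
}

# Hypothetical selection function: 'marg.ex' is a known argument, anything
# else is forwarded through '...'
select_models <- function(model, marg.ex = FALSE, ...) {
  rank_models(model, ...)
}

fit <- lm(dist ~ speed, data = cars)

# Correct spelling: the flag is matched by name and not forwarded
ok <- select_models(fit, marg.ex = TRUE)

# Misspelling: TRUE is forwarded and logLik() is called on a logical
err <- tryCatch(select_models(fit, marge.ex = TRUE),
                error = function(e) conditionMessage(e))
err
```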

k


Dnia 2011-10-25 12:00, r-help-requ...@r-project.org pisze:

Message: 131
Date: Mon, 24 Oct 2011 17:08:41 -0700 (PDT)
From: sgilbertsophielgilb...@gmail.com
To:r-help@r-project.org
Subject: [R] difficulties with MuMIn model generation with coxph
Message-ID:1319501321733-3935078.p...@n4.nabble.com
Content-Type: text/plain; charset=us-ascii

Hi All,

I'm having trouble with the automated model generation (dredge) function
in the MuMIn package. I'm trying to use it to automatically generate subsets
of models from a global Cox proportional hazards model, and rank them based
on AICc. This seems like it's possible, and the MuMIn documentation says
that coxph is supported. However, when I run the code (see below), it gives
me the following error message:

Error in UseMethod("logLik") :
   no applicable method for 'logLik' applied to an object of class "logical"

##RCode

#read in the data
data1 <- read.table('MaleData500.csv', sep=',', header=T)
survival <- Surv(data1$Wks.at.dth, data1$Died)

#create the full (global) model, a coxph object
globemodel <- coxph(survival ~ edgeden + pctroad + pctcc90 + pctcc80 +
                    pctcrsog + ravine + canfrag + pctoldc, data=data1)

#evaluate all subsets of models using dredge
exhausting <- dredge(globemodel, eval=TRUE, fixed=c("pctroad"), m.max=3,
                     marge.ex=TRUE, rank=AICc)

Error in UseMethod("logLik") :
   no applicable method for 'logLik' applied to an object of class "logical"


any suggestions would be greatly appreciated. The globemodel works on its
own, and prints out a summary just fine. The only thing I can think of is
that in the names of globemodel, there is an attribute called loglik, not
logLik?

Thank you,

Sophie




Re: [R] strucchange Nyblom-Hansen Test?

2011-10-26 Thread Achim Zeileis


On Wed, 26 Oct 2011, buehlerman wrote:


Thank you, things seem to be clearer :-)


Great.


Hansen extended this to the linear regression model and proposed to either
compute one test statistic per parameter (which you can do with the parm
argument of gefp) or a joint statistic for all parameters. Hansen included
in all parameters also the variance,


The parm argument of gefp is a nice feature, but what about the
significance level in the test statistic computation (sctest)? Is a multiple
testing correction applied, or should I rather use the double max statistic
for this case, as recommended below?


By applying the functional in sctest(), you implicitly correct for the 
number of parameters tested. Thus, you don't need to apply another 
correction for multiple testing. (The only caveat with the p-values from 
sctest() is that these are always asymptotic p-values and may not be exact 
in finite samples. And for many functionals these have been determined by 
simulation.)


This is discussed in a little bit more detail in

 Zeileis A. (2006), Implementing a Class of Structural Change
 Tests: An Econometric Computing Approach. _Computational
 Statistics & Data Analysis_, *50*, 2987-3008.
 doi:10.1016/j.csda.2005.07.001.

The comment quoted below pertains to the fact that Hansen (1992) suggested 
to compute one p-value for each individual parameter as well as another 
p-value for all parameters jointly. In such a situation, you would have to 
apply some multiple testing procedure. The meanL2BB functional in 
strucchange only computes the joint p-value.


hth,
Z


An excerpt from page 5 of the paper "A Unified Approach to Structural Change
Tests Based on F Statistics, OLS Residuals, and ML Scores" (Achim Zeileis):

Hansen (1992) suggests to compute this statistic for the full process efp(t)
to test all coefficients simultaneously and also for each component of the
process (efp(t))j (denoting the j-th component of the process efp(t),
j = 1, . . . , k) individually to assess which parameter causes the
instability. *Note, that this approach leads to a violation of the
significance level of the procedure if no multiple testing correction is
applied.* This can be avoided if a functional is applied to the empirical
fluctuation process which aggregates over time first yielding k independent
test statistics (see Zeileis and Hornik 2003, for more details).

--
View this message in context: 
http://r.789695.n4.nabble.com/strucchange-Nyblom-Hansen-Test-tp3887208p3940055.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] set different font family for strings in mtext or text?

2011-10-26 Thread Jinsong Zhao


Thank you very much for the quick reply.

On 2011-10-26 18:24, Prof Brian Ripley wrote:

See ?par: check the 'family' parameter.
You can select 'family' for each call to mtext or text.


Yes, I can select 'family' for each call to mtext or text. However, when 
it's necessary to put both Chinese and English on one line, I have to 
call text or mtext several times with explicit positions, which is 
really tedious. The following code has been used for this purpose, 
but I don't like this design:


put.text <- function(x, y, text, family, font, ...) {
   str.n <- length(text)
   sw.n <- numeric(length = str.n+1)
   sw.n[1] <- 0

   if (missing(family)) family <- rep("", str.n)
   if (missing(font)) font <- rep(1, str.n)

   for (i in 1:str.n)
      sw.n[i+1] <- strwidth(text[i], family = family[i], font = font[i])

   sw <- sum(sw.n)
   for (i in 1:str.n)
      text(x+sum(sw.n[1:i]), y, text[i], family = family[i],
           font = font[i], adj = c(0,0.5), ...)

}

## usage
## plot 中文(English) with different font family
## 'song' is a user defined font family for CJK.
pdf()
plot(1:10, type = "n")
put.text(5, 5, c("中文", "(English)"), c("song", "Times"))
dev.off()




However, mixing families is rather ugly, and there are font families
that cover both English and Chinese.


Yes, there are some font families that cover both English and Chinese, 
however, in those font families, the English characters are ugly...




Note that the main problem with postscript() and pdf() is the limited
support in those languages for non-8-bit character encodings: R cannot
magically remove restrictions of languages designed in the 1970s.
See also http://cran.r-project.org/doc/Rnews/Rnews_2006-2.pdf
(referenced from ?pdf)


Well, I have read this paper very carefully, so I can draw CJK on the plot 
in postscript() and pdf().




Users of other OSes have the option of using cairographics-based devices
(e.g. cairo_pdf), and so will Windows' users as from 2.14.0 (which is in
RC): however, the font flexibility is far less on Windows.


I will try this device. Thanks for the information.

Regards,
Jinsong


Re: [R] Example(chron) doesn't work

2011-10-26 Thread Gabor Grothendieck

On Wed, Oct 26, 2011 at 12:21 AM, hchui helena.c...@flinders.edu.au wrote:
 Hi, there,

 I have a similar problem. The chron example gives NA. dates doesn't work but
 times does.

 I would appreciate it if there's a fix for it.

 Thanks,
 Helena

 > example(chron)

 chron> dts <- dates(c("02/27/92", "02/27/92", "01/14/92",
 chron+                "02/28/92", "02/01/92"))

 chron> dts
 [1] NA NA NA NA NA

 chron> # [1] 02/27/92 02/27/92 01/14/92 02/28/92 02/01/92
 chron> tms <- times(c("23:03:20", "22:29:56", "01:03:30",
 chron+                "18:21:03", "16:56:26"))

 chron> tms
 [1] 23:03:20 22:29:56 01:03:30 18:21:03 16:56:26

 chron> # [1] 23:03:20 22:29:56 01:03:30 18:21:03 16:56:26
 chron> x <- chron(dates = dts, times = tms)

 chron> x
 [1] (NA NA) (NA NA) (NA NA) (NA NA) (NA NA)

 chron> # [1] (02/27/92 23:03:19) (02/27/92 22:29:56) (01/14/92 01:03:30)
 chron> # [4] (02/28/92 18:21:03) (02/01/92 16:56:26)
 chron>
 chron> # We can add or subtract scalars (representing days) to dates or
 chron> # chron objects:
 chron> c(dts[1], dts[1] + 10)
 Error in y + ifelse(m > 2, 0, -1) :
   non-numeric argument to binary operator
 In addition: Warning message:
 In matrix(unlist(lapply(dots, origin)), nrow = 3) :
   data length [2] is not a sub-multiple or multiple of the number of rows [3]
 > packageDescription("chron")$Version
 [1] "2.3-42"
 > R.version.string
 [1] "R version 2.13.1 (2011-07-08)"
 > win.version()
 [1] "Windows 7 x64 (build 7600)"




Does it still occur if you start R in vanilla mode?  From Windows console:

Rgui --vanilla

(If you don't have Rgui.exe's directory on your path then cd to the
directory where Rgui.exe is located first.).

Also, does it still occur in the most recent version of R?


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


[R] dotPlot with diagonal

2011-10-26 Thread Jörg Reuter

Hi,
I want to draw a dotPlot. All works fine:
(Seq <- matrix(c(1, 1, 6, 1, 2, 2, 5, 4, 3, 3, 4, 3,
4, 4, 3, 2, 5, 5, 2, 5, 6, 6, 1, 6), ncol = 6))
dotPlot(Seq[1,], Seq[2,], main = "Sequenz 1 und Sequenz 2", asp = 1)
Is there a way to draw a small diagonal from (0/0) to (6/6)
(perhaps in red), or must I use GIMP? I have many dotPlots, so it would be
good if R can do this.

Thanks Joerg


Re: [R] Simulation from discrete uniform

2011-10-26 Thread Mehmet Suzen

Why don't you use sample()?
 sample(1:10,10,replace=TRUE)
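
For comparison, a short sketch that draws from both ranges with sample() and with the ceiling()/runif() approach from the quoted message; the seed and the number of draws are arbitrary choices made for illustration:

```r
set.seed(1)
n <- 10000

# Discrete uniform on 1..10, two ways
a <- sample(1:10, n, replace = TRUE)
b <- ceiling(10 * runif(n))

# Discrete uniform on 0..12, two ways
c0 <- sample(0:12, n, replace = TRUE)
d <- ceiling(13 * runif(n)) - 1

# Each value should appear with roughly equal frequency
table(a)
table(d)
```

sample() states the support directly, which avoids off-by-one slips when the range does not start at 1.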

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of BSanders
Sent: 26 October 2011 08:49
To: r-help@r-project.org
Subject: Re: [R] Simulation from discrete uniform

If you wanted a discrete uniform from 1-10 use: ceiling(10*runif(1))
if you wanted from 0-12, use: ceiling(13*runif(1))-1

--
View this message in context:
http://r.789695.n4.nabble.com/Simulation-from-discrete-uniform-tp3434980
p3939694.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] merging two dataframes

2011-10-26 Thread dividend

Hello.

Now I tried to do what you told me.
I used the str() function, and data$date1 and data3$date1 were both listed
as character. I changed name to character but it did not work either.
I also changed all variables to character, with no positive result.

str(data)
'data.frame':   14446 obs. of  15 variables:
 $ id     : chr  "1" "1" "1" "1" ...
 $ compid : chr  "2514" "2514" "2514" "2514" ...
 $ secid  : chr  "15856" "15856" "15856" "15856" ...
 $ name   : chr  "A-pressen" "A-pressen" "A-pressen" "A-pressen" ...
 $ period : chr  "1" "2" "3" "4" ...
 $ date   : chr  "17.05.1980" "17.05.1981" "17.05.1982" "17.05.1983" ...
 $ enddate: chr  "17.05.1981" "17.05.1982" "17.05.1983" "17.05.1984" ...
 $ div    : chr  NA NA NA NA ...
 $ ndivs  : chr  NA NA NA NA ...
 $ posdiv : chr  NA NA NA NA ...
 $ ddiv2  : chr  NA NA NA NA ...
 $ ddiv3  : chr  NA NA NA NA ...
 $ ddiv4  : chr  NA NA NA NA ...
 $ ddiv5  : chr  NA NA NA NA ...
 $ ddiv6  : chr  NA NA NA NA ...

str(data3)
'data.frame':   812354 obs. of  9 variables:
 $ date                  : chr  "02.01.1996" "03.01.1996" "04.01.1996" "05.01.1996" ...
 $ Securityid            : chr  "6001" "6001" "6001" "6001" ...
 $ Symbol                : chr  "AAV" "AAV" "AAV" "AAV" ...
 $ name                  : chr  "Adresseavisen" "Adresseavisen" "Adresseavisen" "Adresseavisen" ...
 $ Securitytype          : chr  "Ordinary Shares" "Ordinary Shares" "Ordinary Shares" "Ordinary Shares" ...
 $ Unadjusted            : chr  "200" "200" "200" "200" ...
 $ Event.adjusted        : chr  "200" "200" "200" "200" ...
 $ Div.and.Event.adjusted: chr  "109,7595375" "109,7595375" "109,7595375" "109,7595375" ...
 $ Sharesissued          : chr  "1901646" "1901646" "1901646" "1901646" ...

Here is some suitable data for data

 dput(data[1:20,])

structure(list(id = c(1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), 
compid = c(2514, 2514, 2514, 2514, 2514, 2514, 
2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 
2514, 2514, 2514, 2514, 2514, 2514), secid = c(15856, 
15856, 15856, 15856, 15856, 15856, 15856, 15856, 
15856, 15856, 15856, 15856, 15856, 15856, 15856, 
15856, 15856, 15856, 15856, 15856), name = c(A-pressen, 
A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, 
A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, 
A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, 
A-pressen, A-pressen, A-pressen, A-pressen), period = c(1, 
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
13, 14, 15, 16, 17, 18, 19, 20), date = c(17.05.1980, 
17.05.1981, 17.05.1982, 17.05.1983, 17.05.1984, 17.05.1985, 
17.05.1986, 17.05.1987, 17.05.1988, 17.05.1989, 17.05.1990, 
17.05.1991, 17.05.1992, 17.05.1993, 17.05.1994, 17.05.1995, 
17.05.1996, 17.05.1997, 17.05.1998, 17.05.1999), 
enddate = c(17.05.1981, 17.05.1982, 17.05.1983, 17.05.1984, 
17.05.1985, 17.05.1986, 17.05.1987, 17.05.1988, 17.05.1989, 
17.05.1990, 17.05.1991, 17.05.1992, 17.05.1993, 17.05.1994, 
17.05.1995, 17.05.1996, 17.05.1997, 17.05.1998, 17.05.1999, 
17.05.2000), div = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 
0, 0, 0, 0, 0, 5, 0, 1.1, 1.2, 1, 0
), ndivs = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 
0, 0, 0, 1, 0, 1, 1, 1, 0), posdiv = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, 
NA, 1, 1, 1, NA), ddiv2 = c(NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, 0, 0, 0, 0, 0, NA, 0, 1, NA, 
NA), ddiv3 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
0, 0, 0, 0, 0, 0, 0, 0, -1), ddiv4 = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 
0, 0, 0, 0, 0), ddiv5 = c(NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0, 
0), ddiv6 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, 0, 0, 0, 0, 0, 0)), .Names = c(id, 
compid, secid, name, period, date, enddate, div, 
ndivs, posdiv, ddiv2, ddiv3, ddiv4, ddiv5, ddiv6
), row.names = c(NA, 20L), class = data.frame)




Here is some suitable data for data3:

 dput(data3[1:20,])

structure(list(date = c(02.01.1996, 03.01.1996, 04.01.1996, 
05.01.1996, 08.01.1996, 09.01.1996, 10.01.1996, 11.01.1996, 
12.01.1996, 15.01.1996, 16.01.1996, 17.01.1996, 18.01.1996, 
19.01.1996, 22.01.1996, 23.01.1996, 24.01.1996, 25.01.1996, 
26.01.1996, 29.01.1996), Securityid = c(6001, 6001, 6001, 
6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 
6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 
6001), Symbol = c(AAV, AAV, AAV, AAV, AAV, AAV, 
AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV, AAV, 
AAV, AAV, AAV, AAV, AAV), name = c(Adresseavisen, 
Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, 
Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, 
Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, 
Adresseavisen, Adresseavisen, Adresseavisen, Adresseavisen, 
Adresseavisen, Adresseavisen, Adresseavisen), Securitytype =
c(Ordinary Shares, 
Ordinary Shares, Ordinary Shares, Ordinary Shares, Ordinary Shares, 
Ordinary Shares, Ordinary Shares, Ordinary Shares, Ordinary Shares, 
Ordinary Shares, Ordinary Shares, Ordinary Shares, Ordinary Shares, 
Ordinary

[R] New column of data filled with the larger value from 2 columns

2011-10-26 Thread robgriffin247

Hi,
I'm sure there is a pretty simple answer to this but I have had my head
buried in the R book and on help pages for a while now and I've not made any
progress.

In simple terms:
I have 2 columns of data, column A and column B. I want to create a new
column (C) and fill it with the larger value of A or B on each row.


So I want C to contain A if B < A
and C to contain B if A <= B


Like I said I have tried to look for an answer and I'm sure there is one (or
many) out there but I am looking in the wrong places or for the wrong terms
so I would really appreciate this help! 


Thanks,
Rob.

(I promise that once I have mastered R- hopefully in the near future- I will
make up for my sins of asking a basic Q by answering many on here!)

--
View this message in context: 
http://r.789695.n4.nabble.com/New-column-of-data-filled-with-the-larger-value-from-2-columns-tp3940020p3940020.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Correlation Matrix in R

2011-10-26 Thread AlexC

Thank you for your quick reply and helpful advice.

Using this argument allows me to do what I needed to do.

Now the only other thing I wanted to accomplish was to obtain a matrix with
p-values in the top half and correlations in the bottom half, to observe the
significant correlations. I have been able to use a few functions such as
rcorr and cor.matrix to get such information, but it isn't output in a format
that I can save with the write.table function or write.clipboard.

The pair function, on the other hand, allows a graphical display of the data
(with correlation graphics on the bottom half), and I have added an argument
which allows viewing the significant p-values. But I wanted to know if I
could also do the above easily.

--
View this message in context: 
http://r.789695.n4.nabble.com/Correlation-Matrix-in-R-tp3938274p3940170.html
Sent from the R help mailing list archive at Nabble.com.


[R] Calculate the difference using ave

2011-10-26 Thread Patrick Hausmann


Dear R users,

It may be very simple, but it is proving difficult for me.
I'd like to calculate the difference in percent between two measures.
My data looks like this:

set.seed(123)
df1 <- data.frame(measure = rep(c("A1", "A2", "A3"), each=3),
                  water = sample(c(100:200), 9),
                  tide  = sample(c(-10:+10), 9))
df1

# What I want to calculate is:
# tide_[A2] / water_[A1],
# tide_[A3] / water_[A2]

# This 'works' for the example, but I am
# looking for a more general solution.

df1$tide_diff <- ave(df1$tide, FUN=function(L) L /
                     c(NA, NA, NA, df1$water)) * 100
df1

Thanks for any help!
Patrick


Re: [R] set different font family for strings in mtext or text?

2011-10-26 Thread Prof Brian Ripley


On Wed, 26 Oct 2011, Jinsong Zhao wrote:


Thank you very much for the quick reply.

On 2011-10-26 18:24, Prof Brian Ripley wrote:

See ?par: check the 'family' parameter.
You can select 'family' for each call to mtext or text.


Yes, I can select 'family' for each call to mtext or text. however, when it's 
necessary to put both Chinese and English in one line, I should call text or 
mtext several times with position explicitly. It will be really tedious. The 
following code have been used for this purpose, however, I don't like this 
design:


put.text <- function(x, y, text, family, font, ...) {
  str.n <- length(text)
  sw.n <- numeric(length = str.n+1)
  sw.n[1] <- 0

  if (missing(family)) family <- rep("", str.n)
  if (missing(font)) font <- rep(1, str.n)

  for (i in 1:str.n)
    sw.n[i+1] <- strwidth(text[i], family = family[i], font = font[i])

  sw <- sum(sw.n)
  for (i in 1:str.n)
    text(x+sum(sw.n[1:i]), y, text[i], family = family[i],
         font = font[i], adj = c(0,0.5), ...)

}

## usage
## plot 中文(English) with different font family
## 'song' is a user defined font family for CJK.
pdf()
plot(1:10, type = "n")
put.text(5, 5, c("中文", "(English)"), c("song", "Times"))
dev.off()




However, mixing families is rather ugly, and there are font families
that cover both English and Chinese.


Yes, there are some font families that cover both English and Chinese, 
however, in those font families, the English characters are ugly...


Not to my eyes in Arial Unicode MS (nor to millions of writers of Word 
documents).  Not elegant, but not ugly.


And that is one of the recommended choices in several places in the R 
documentation.






Note that the main problem with postscript() and pdf() is the limited
support in those languages for non-8-bit character encodings: R cannot
magically remove restrictions of languages designed in the 1970s.
See also http://cran.r-project.org/doc/Rnews/Rnews_2006-2.pdf
(referenced from ?pdf)


Well, I have read this paper very careful, so I can draw CJK on the plot in 
postscript() and pdf().




Users of other OSes have the option of using cairographics-based devices
(e.g. cairo_pdf), and so will Windows' users as from 2.14.0 (which is in
RC): however, the font flexibility is far less on Windows.


I will try this device. Thanks for the information.

Regards,
Jinsong




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Re: [R] New column of data filled with the larger value from 2 columns

2011-10-26 Thread Ben Bolker

robgriffin247 robgriffin247 at hotmail.com writes:

 
 Hi,
 I'm sure there is a pretty simple answer to this but I have had my head
 buried in the R book and on help pages for a while now and I've not made any
 progress.
 
 In simple terms:
 I have 2 columns of data, column A and column B. I want to create a new
 column (C) and fill it with the largest value from of A or B on each row.
 

  sounds like you want data$C <- pmax(data$A,data$B)
(or data <- transform(data,C=pmax(A,B)))
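
A small worked example of both forms on made-up numbers (the data frame `dat` and the column `C2` are hypothetical; na.rm = TRUE is an optional extra, shown in case one of the columns has missing values):

```r
# Toy data; the last row has a missing value in A
dat <- data.frame(A = c(1, 5, 3, NA), B = c(2, 4, 3, 7))

# Row-wise (parallel) maximum of the two columns
dat$C <- pmax(dat$A, dat$B)

# Same idea via transform(), ignoring NA when the other value is present
dat <- transform(dat, C2 = pmax(A, B, na.rm = TRUE))
dat
```

Note that max(A, B) would return a single number for the whole data set, whereas pmax() works element by element.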


Re: [R] Calculate the difference using ave

2011-10-26 Thread Dimitris Rizopoulos


Maybe one approach could be:

set.seed(123)
df1 <- data.frame(measure = rep(c("A1", "A2", "A3"), each=3),
                  water = sample(c(100:200), 9),
                  tide  = sample(c(-10:+10), 9))


100 * tail(df1$tide, -3) / head(df1$water, -3)
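
To store the result as a column, the shifted ratio can be padded with NA for the first measure. This is only a sketch, assuming every measure has the same number of rows (`n`, a name introduced here) and that the rows are ordered by measure:

```r
set.seed(123)
df1 <- data.frame(measure = rep(c("A1", "A2", "A3"), each = 3),
                  water = sample(c(100:200), 9),
                  tide  = sample(c(-10:+10), 9))

n <- 3  # rows per measure (assumed constant)

# tide of the current measure divided by water of the previous measure,
# in percent; the first measure has no predecessor, hence NA
df1$tide_diff <- c(rep(NA, n),
                   100 * tail(df1$tide, -n) / head(df1$water, -n))
df1
```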


I hope it helps.

Best,
Dimitris


On 10/26/2011 12:02 PM, Patrick Hausmann wrote:

Dear R users,

It may be very simple, but it is proving difficult for me.
I'd like to calculate the difference in percent between two measures.
My data looks like this:

set.seed(123)
df1 <- data.frame(measure = rep(c("A1", "A2", "A3"), each=3),
                  water = sample(c(100:200), 9),
                  tide  = sample(c(-10:+10), 9))
df1

# What I want to calculate is:
# tide_[A2] / water_[A1],
# tide_[A3] / water_[A2]

# This 'works' for the example, but I am
# looking for a more general solution.

df1$tide_diff <- ave(df1$tide, FUN=function(L) L /
                     c(NA, NA, NA, df1$water)) * 100
df1

Thanks for any help!
Patrick




--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/


Re: [R] merging two dataframes

2011-10-26 Thread Timothy Bates

I think you want something like this (I like to be explicit about what you are 
merging)

df3 = merge(df1, df2, by = "date", all=T)

You can be explicit about what you are merging on in each file:

df3 = merge(df1,df2, by.x = "date", by.y="date", all=T)

You were trying to merge on “date1” but it looks to me like your data frames 
actually contain columns called “date”, not “date1”.

As Petr says, in the vanilla situation where there is no overlap of data and 
the ID column has the same name in both frames, then 
merge(frame1, frame2) works by itself.
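
As a small illustration of `by.x`/`by.y`, here is a sketch with two made-up frames whose key columns have different names (the frames, the column `day`, and all values are assumptions for the example):

```r
# Two toy frames; the key is called "date" in one and "day" in the other.
df1 <- data.frame(date  = c("02.01.1996", "03.01.1996", "04.01.1996"),
                  price = c(200, 201, 199),
                  stringsAsFactors = FALSE)
df2 <- data.frame(day = c("03.01.1996", "04.01.1996", "05.01.1996"),
                  div = c(0, 1.1, 0),
                  stringsAsFactors = FALSE)

# all = TRUE keeps unmatched rows from both sides (a full outer join);
# the key column in the result takes its name from by.x.
df3 <- merge(df1, df2, by.x = "date", by.y = "day", all = TRUE)
print(df3)
```

Naming a `by` column that does not exist in one of the frames makes `merge()` stop with an error, which matches the "date1" symptom described above.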

tip: don’t use words like “data” as variable names, as that is also a function

On 26 Oct 2011, at 11:59 AM, dividend wrote:

 Hello.
 
 Now I tried to do what you told me.
 I used the str() function, and data$date1 and data3$date1 were both listed
 as character. I changed name to character but it did not work either.
 I also changed all variables to character, with no positive result.
 
 str(data)
 'data.frame':   14446 obs. of  15 variables:
 $ id : chr  1 1 1 1 ...
 $ compid : chr  2514 2514 2514 2514 ...
 $ secid  : chr  15856 15856 15856 15856 ...
 $ name   : chr  A-pressen A-pressen A-pressen A-pressen ...
 $ period : chr  1 2 3 4 ...
 $ date   : chr  17.05.1980 17.05.1981 17.05.1982 17.05.1983 ...
 $ enddate: chr  17.05.1981 17.05.1982 17.05.1983 17.05.1984 ...
 $ div: chr  NA NA NA NA ...
 $ ndivs  : chr  NA NA NA NA ...
 $ posdiv : chr  NA NA NA NA ...
 $ ddiv2  : chr  NA NA NA NA ...
 $ ddiv3  : chr  NA NA NA NA ...
 $ ddiv4  : chr  NA NA NA NA ...
 $ ddiv5  : chr  NA NA NA NA ...
 $ ddiv6  : chr  NA NA NA NA ...
 
 str(data3)
 'data.frame':   812354 obs. of  9 variables:
 $ date  : chr  02.01.1996 03.01.1996 04.01.1996
 05.01.1996 ...
 $ Securityid: chr  6001 6001 6001 6001 ...
 $ Symbol: chr  AAV AAV AAV AAV ...
 $ name  : chr  Adresseavisen Adresseavisen
 Adresseavisen Adresseavisen ...
 $ Securitytype  : chr  Ordinary Shares Ordinary Shares
 Ordinary Shares Ordinary Shares ...
 $ Unadjusted: chr  200 200 200 200 ...
 $ Event.adjusted: chr  200 200 200 200 ...
 $ Div.and.Event.adjusted: chr  109,7595375 109,7595375 109,7595375
 109,7595375 ...
 $ Sharesissued  : chr  1901646 1901646 1901646 1901646 ...
 
 Here is some suitable data for data
 
 dput(data[1:20,])
 
 structure(list(id = c(1, 1, 1, 1, 1, 1, 1, 1, 
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), 
compid = c(2514, 2514, 2514, 2514, 2514, 2514, 
2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 
2514, 2514, 2514, 2514, 2514, 2514), secid = c(15856, 
15856, 15856, 15856, 15856, 15856, 15856, 15856, 
15856, 15856, 15856, 15856, 15856, 15856, 15856, 
15856, 15856, 15856, 15856, 15856), name = c(A-pressen, 
A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, 
A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, 
A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, 
A-pressen, A-pressen, A-pressen, A-pressen), period = c(1, 
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
13, 14, 15, 16, 17, 18, 19, 20), date = c(17.05.1980, 
17.05.1981, 17.05.1982, 17.05.1983, 17.05.1984, 17.05.1985, 
17.05.1986, 17.05.1987, 17.05.1988, 17.05.1989, 17.05.1990, 
17.05.1991, 17.05.1992, 17.05.1993, 17.05.1994, 17.05.1995, 
17.05.1996, 17.05.1997, 17.05.1998, 17.05.1999), 
enddate = c(17.05.1981, 17.05.1982, 17.05.1983, 17.05.1984, 
17.05.1985, 17.05.1986, 17.05.1987, 17.05.1988, 17.05.1989, 
17.05.1990, 17.05.1991, 17.05.1992, 17.05.1993, 17.05.1994, 
17.05.1995, 17.05.1996, 17.05.1997, 17.05.1998, 17.05.1999, 
17.05.2000), div = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 
0, 0, 0, 0, 0, 5, 0, 1.1, 1.2, 1, 0
), ndivs = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 
0, 0, 0, 1, 0, 1, 1, 1, 0), posdiv = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, 
NA, 1, 1, 1, NA), ddiv2 = c(NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, 0, 0, 0, 0, 0, NA, 0, 1, NA, 
NA), ddiv3 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
0, 0, 0, 0, 0, 0, 0, 0, -1), ddiv4 = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 
0, 0, 0, 0, 0), ddiv5 = c(NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0, 
0), ddiv6 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, 0, 0, 0, 0, 0, 0)), .Names = c(id, 
 compid, secid, name, period, date, enddate, div, 
 ndivs, posdiv, ddiv2, ddiv3, ddiv4, ddiv5, ddiv6
 ), row.names = c(NA, 20L), class = data.frame)
 
 
 
 
 Here is some suitable data for data3:
 
 dput(data3[1:20,])
 
 structure(list(date = c(02.01.1996, 03.01.1996, 04.01.1996, 
 05.01.1996, 08.01.1996, 09.01.1996, 10.01.1996, 11.01.1996, 
 12.01.1996, 15.01.1996, 16.01.1996, 17.01.1996, 18.01.1996, 
 19.01.1996, 22.01.1996, 23.01.1996, 24.01.1996, 25.01.1996, 
 26.01.1996, 29.01.1996), Securityid = c(6001, 6001, 6001, 
 6001, 6001, 6001, 6001, 6001, 6001, 6001, 6001, 
 6001, 6001, 6001,

Re: [R] Random Forest Classification

2011-10-26 Thread Steve_Friedman

Explore the ModelMap package. It might offer some useful tools for your
application.



Steve Friedman Ph. D.
Ecologist  / Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147


   
 From:     Mohammed Rashad mohammedrashadkm@gmail.com
 Sent by:  r-help-bounces@r-project.org
 To:       r-help@r-project.org
 Date:     10/26/2011 02:50 AM
 Subject:  [R] Random Forest Classification




Hi All,
I want to do Random Forest classification. I installed R and the
randomForest classifier package for R,
but I don't know how to use it.

Is there any open source remote sensing application which does RF
classification on satellite images?

Does anyone have a random forest classification example in R?

An example in any language or package is no problem.

Has anyone done this in R?
If yes, how?

I googled RF classification, but most of the results are for medical disease
research, not for remote sensing.

--
Regards,
   Mohammed Rashad K M
   M.S. (By Research) student
   Lab for Spatial Informatics
   Department of CSE
   International Institute of Information Technology
   Hyderabad, India


Re: [R] [BioC] comparing two tables

2011-10-26 Thread Assa Yeroslaviz

Hi David,

your function works just fine if I take only the region into account. But
unfortunately it does not consider the first column, the chromosomes.
There can be an overlap between the two tables only if the regions are on
the same chromosome. This is why the first column of both tables is a
prerequisite for the analysis.

I tried somehow to create a second argument to consider this, but so far
without success.

If you have any ideas I will be grateful.

Thanks
Assa

(I am sending this only to r-help, as it is basically an R question and not
specific to Bioconductor, though I still think it has something to do with
BioC as it deals with chromosome regions. But anyway, I think you were right
about it.)

On Tue, Oct 25, 2011 at 18:01, David Winsemius dwinsem...@comcast.net wrote:


 On Oct 25, 2011, at 10:40 AM, Assa Yeroslaviz wrote:

  Hi all,

 @Martin - thanks for the help, it works very well.

 @David - sorry for the misunderstanding. I will see to it that it won't
 happen again.
 BTW, unfortunately your function is not working.
 It is partially my error, as I gave no regions with overlaps, but even after
 changing them it just doesn't fit.

 Here is the new data with an overlap in the third gene:

 genetable <- rd.txt("name chr start end str accession Length
 gen1 4 646752 646838 + MI0005806 86
 gen12 2L 243035 243141 - MI0005821 106
 gen3 2L 159838 159928 + MI0005813 90
 gen7 2L 1831685 1831799 - MI0011290 114
 gen4 2L 2737568 2737661 + MI0017696 93")
 loctable <- rd.txt("Chr Start End length
 4 136532 138654 2122
 3 139870 141970 2100
 2L 157838 160440 2602
 X 160834 162966 2132
 4 204040 208536 4496")

 But I still get:

 apply(genetable, 1, function(x) inregion(x, loctable[, c("Start", "End")]) )
 [1] FALSE FALSE FALSE FALSE FALSE


 You just want to pass the start and end columns of genetable


  # Helper function
  inregion <- function(vec, locs) {
    any( apply(locs, 1, function(x) vec["start"] > x[1] &
                                    vec["end"] <= x[2])) }
  # Test the function
  inregion(genetable[2, ], loctable[, c("Start", "End")])
 [1] FALSE
  # [1] FALSE
 
  apply(genetable[, 3:4], 1, function(x) inregion(x, loctable[, c("Start",
 "End")]) )
 [1] FALSE FALSE  TRUE FALSE FALSE

 ( I really wish that you would stop crossposting. I am only following your
 bad practice because you posted my code on BioC.)

 --
 David


 for the single queries I get TRUE:

  inregion(genetable[3, ], loctable[, c("Start", "End")])

 [1] TRUE

 Do you have any idea how I can fix this problem?

 Thanks and again sorry for the trouble.

 Assa

 On Tue, Oct 25, 2011 at 15:48, Martin Morgan mtmor...@fhcrc.org wrote:

  On 10/25/2011 03:42 AM, Assa Yeroslaviz wrote:

  Hi everybody,

 I would like to know whether it is possible to compare two tables for
 certain
 parameters.
 I have these two tables:
 gene table
 name chr start end str accession Length
 gen1 4 646752 646838 + MI0005806 86
 gen12 2L 243035 243141 - MI0005821 106
 gen3 2L 159838 159928 + MI0005813 90
 gen7 2L 1831685 1831799 - MI0011290 114
 gen4 2L 2737568 2737661 + MI0017696 93
 ...

 localization table:
 Chr Start End length
 4 136532 138654 2122
 3 139870 141970 2100
 2L 157838 158440 602
 X 160834 162966 2132
 4 204040 208536 4496
 ...

 I would like to check whether a specific gene lies within a certain
 region.
 For example I want to see if gene 3 on chromosome 2L lies within the
 region
 given in the second table.


 Hi Assa --

 In Bioconductor, use the GenomicRanges package. Create two GRanges
 objects

 genes = with(genetable, GRanges(chr, IRanges(start, end), str,
    accession=accession, Length=Length))
 locations = with(locationtable, GRanges(Chr, IRanges(Start, End)))

 then

 olaps = findOverlaps(genes, locations)

 queryHits(olaps) and subjectHits(olaps) index each gene with all
 locations
 it overlaps. The definition of 'overlap' is flexible, see ?findOverlaps.

 Martin



  What I would like to do is:
 1. check if the gene lies on a specific chromosome
 1.a if no - go to the next line
 1.b if yes - go to 2
 2. check if the start position of the gene is bigger than the start
 position
 of the localization table AND if it smaller than the end position (if it
 lies between the start and end positions in the localization table)
 2.a if no - go to the next gene
 2.b if yes - give it to me.

 I was having difficulties doing it without running into three
 interleaved
 conditional loops (if).

 I would appreciate any help.

 Thanks

 Assa


Re: [R] RGtk2 problems

2011-10-26 Thread Michael Lawrence

The gain from updating will be that RGtk2 now looks in a specific (internal)
place for the libraries, so you should no longer need to worry about library
conflicts and PATH settings. In theory.
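
On older setups where PATH still decides which Gtk+ build gets loaded, a quick check from within R can show which PATH directories contain the DLL named in the error dialog quoted below. This is only a diagnostic sketch under that assumption; nothing in it is part of RGtk2, and the variable names are made up for the example:

```r
# DLL name taken from the error message quoted in this thread.
dll <- "libgdk-win32-2.0-0.dll"

# Split PATH into its directories using the platform's separator.
path_dirs <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]

# Keep only the directories that actually contain the DLL.
found <- path_dirs[file.exists(file.path(path_dirs, dll))]

# More than one hit suggests competing Gtk+ installations; on Windows the
# earliest directory on PATH normally wins.
print(found)
```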

Michael

On Tue, Oct 25, 2011 at 5:46 PM, Aref arefnamm...@gmail.com wrote:

 Thank you for the response and I am sorry about the html--will
 remember next time.
 The version of RGtk2 installed is 2.20.8 I installed it through R from
 CRAN repository.
 I believe that the problem is that during the installation the
 environment variable GTK_BASEPATH was set to some other location than
 where GTK+ was installed--overwritten by the R installation process. I
 found this after I fixed the issue by copying the libraries into R
 \bin. This is probably not the best solution but it works. I will be
 updating R soon to 2.14 when it comes out and hopefully things will
 work better now that I have the environment variable pointing to the
 right place for the GTK+ libraries.


 On Oct 24, 12:12 am, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:
  Please update your R (and probably your RGtk2: you did not tell us its
  version), as the posting guide asked you to do before posting.
 
  On Sun, 23 Oct 2011, Aref Nammari wrote:
   Hello,
 
   I hope this is the right place to ask for help with a problem I am
   having with RGtk2 installation with R on Windows XP.
   I am running R 2.11.1 and have installed the package RGtk2 from CRAN.
 
  As a binary package, I guess, but please tell us (it matters).
 
   I also have GTK 2.10.11 installed as well as GTK2-runtime 2.22.0. I
   have added the environment variable GTK_PATH and set its value to the
   root location where GTK is installed.
 
  But you need the Gtk+ bin directory in your PATH.  Environment
  variable GTK_PATH is only needed when RGtk2 is installed from the
  sources.
 
  Which Gtk+ you need in your path depends on the version of RGtk2 you
  have and how you installed it.  For current binary versions, see
 
  http://cran.r-project.org/bin/windows/contrib/2.13/@ReadMe
 
 
 
   When I try to run RGtk2 in R by
   typing library(RGtk2) a popup dialog appears with the following error
   message:
 
   The procedure entry point gdk_app_launch_context_get_type could not be
   located in the dynamic link library libgdk-win32-2.0-0.dll
 
   In the R window I get :
 
   Error in inDL(x, as.logical(local), as.logical(now), ...) :
unable to load shared library 'C:/PROGRA~1/R/R-211~1.1/library/RGtk2/
   libs/RGtk2.dll':
LoadLibrary failure:  The specified procedure could not be found.
 
   Failed to load RGtk2 dynamic library, attempting to install it.
   Error : .onLoad failed in loadNamespace() for 'RGtk2', details:
call: install_all()
error: This platform is not yet supported by the automatic
   installer. Please install GTK+ manually, if necessary. See:
  http://www.gtk.org
   Error: package/namespace load failed for 'RGtk2'
 
   Any help in figuring out what could be the problem is greatly
   appreciated.
 
   Cheers,
 
  [[alternative HTML version deleted]]
 
  Please do as the posting guide asked of you and not send HTML.
 
 
  --
  Brian D. Ripley,  rip...@stats.ox.ac.uk
  Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
  University of Oxford, Tel:  +44 1865 272861 (self)
  1 South Parks Road, +44 1865 272866 (PA)
  Oxford OX1 3TG, UK                Fax:  +44 1865 272595
 



Re: [R] dotPlot with diagonal

2011-10-26 Thread Dennis Murphy

Let's see: there is a dotPlot() function in each of the following packages:
BHH2, caret, mosaic, qualityTools
Would you be kind enough to share which of these packages (if any) you
are using?

Dennis

On Wed, Oct 26, 2011 at 4:25 AM, Jörg Reuter jo...@reuter.at wrote:
 Hi,
 I want to draw a dotPlot. All works fine:
 (Seq <- matrix(c(1, 1, 6, 1, 2, 2, 5, 4, 3, 3, 4, 3,
 4, 4, 3, 2, 5, 5, 2, 5, 6, 6, 1, 6), ncol = 6))
 dotPlot(Seq[1,], Seq[2,], main = "Sequenz 1 und Sequenz 2", asp = 1)
 Is there a way to draw a small diagonal, begin at (0/0) to (6/6)
 (perhaps in red??) or must I use gimp? I have many dotPlots, so it is
 fine if R can do this.

 Thanks Joerg


Re: [R] lock a package to specific R version

2011-10-26 Thread Mehmet Suzen

-Original Message-
From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
Sent: 26 October 2011 10:12
To: Uwe Ligges
Cc: Mehmet Suzen; r-help@r-project.org
Subject: Re: [R] lock a package to specific R version

On Wed, 26 Oct 2011, Uwe Ligges wrote:

 On 25.10.2011 11:42, Mehmet Suzen wrote:
 Hi,

 I was wondering if it is possible to lock a package to a specific
 version of R. Dependency attribute in the package DESCRIPTION
 only accepts >= AFAIU
 Depends: R (>= 2.13.2), R (<= 2.13.2)

Or even use ==

Dear Professor Ripley,

Thank you for the reply. We are maintaining
internal R packages and build binaries for different
versions of base R, ranging from 2.8.x to 2.13.x.
We need to prevent users from using versions other than
the ones we tested (we distribute binaries only, and the package
source base is evolving as well).

Not sure how to address this. Initially I was thinking
of putting the R version in the package version, but the package version
in DESCRIPTION files only allows the x.x.x format, which doesn't
leave room for it.

I don't see why you would want to do this: why would a package work
with 2.13.1 and not 2.13.2, or 2.13.2 and not 2.14.0?  Ranges may make
sense.

Ranges would be much more sensible than ==.
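
Besides the DESCRIPTION field, a range can also be enforced when the package is loaded. The following is only a sketch of that idea: `r_version_in_range()` is a hypothetical helper, and the bounds 2.8.0 and 2.13.2 are placeholders for whatever range was actually tested:

```r
# Hypothetical helper: TRUE if `current` lies within [min_version, max_version].
# getRversion() returns a version object that compares correctly with strings
# such as "2.13.2" (component by component, not alphabetically).
r_version_in_range <- function(min_version, max_version,
                               current = getRversion()) {
  current >= min_version && current <= max_version
}

# Sketch of a load hook that refuses to load outside the tested range.
# .onLoad is the standard hook name; it would live in the package's R/ code.
.onLoad <- function(libname, pkgname) {
  if (!r_version_in_range("2.8.0", "2.13.2")) {
    stop("this build was only tested with R 2.8.0 to 2.13.2; found R ",
         as.character(getRversion()))
  }
}

# Checking the helper against explicit versions:
r_version_in_range("2.8.0", "2.13.2", current = package_version("2.13.1"))
r_version_in_range("2.8.0", "2.13.2", current = package_version("2.14.0"))
```

A runtime check like this gives a readable error message, whereas a DESCRIPTION constraint is enforced at install and load time by R itself; the two can be combined.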

Best Regards,

Mehmet

Re: [R] dotPlot with diagonal

2011-10-26 Thread Jörg Reuter

Oh, sorry.
library(lattice)
(Seq <- matrix(c(1, 1, 6, 1, 2, 2, 5, 4, 3, 3, 4, 3,
 4, 4, 3, 2, 5, 5, 2, 5, 6, 6, 1, 6), ncol = 6))
dotPlot(Seq[1,], Seq[2,], main = "Sequenz 1 und Sequenz 2", asp = 1)
Is there a way to draw a small diagonal, begin at (0/0) to (6/6)
(perhaps in red??) or must I use gimp? I have many dotPlots, so it is
fine if R can do this.
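
One possible approach, sketched under the assumption that the `dotPlot()` in use draws with base graphics: anything drawn afterwards with `segments()` or `abline()` lands on the same plot. A plain `plot()` call stands in for the dot plot here, and `add_diagonal()` is a made-up helper name. For a lattice plot this would not apply; there the line would have to be drawn inside a panel function, for example with `panel.abline()`.

```r
# Stand-in for the dot plot: any base-graphics plot on a 0..6 scale.
plot(c(1, 2, 3, 4, 5, 6), c(1, 2, 3, 4, 5, 6),
     xlim = c(0, 6), ylim = c(0, 6), asp = 1,
     main = "Sequenz 1 und Sequenz 2")

# Hypothetical helper: overlay a diagonal from (from, from) to (to, to)
# on the plot that is currently open.
add_diagonal <- function(from = 0, to = 6, col = "red") {
  segments(from, from, to, to, col = col)
  invisible(c(from = from, to = to))
}

diag_ends <- add_diagonal()
```

Since the helper only adds to the current plot, it can be called after each of many dot plots in a loop.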


2011/10/26 Dennis Murphy djmu...@gmail.com:
 Let's see: there is a dotPlot() function in each of the following packages:
 BHH2, caret, mosaic, qualityTools
 Would you be kind enough to share which of these packages (if any) you
 are using?

 Dennis

 On Wed, Oct 26, 2011 at 4:25 AM, Jörg Reuter jo...@reuter.at wrote:
 Hi,
 I want to draw a dotPlot. All works fine:
 (Seq <- matrix(c(1, 1, 6, 1, 2, 2, 5, 4, 3, 3, 4, 3,
 4, 4, 3, 2, 5, 5, 2, 5, 6, 6, 1, 6), ncol = 6))
 dotPlot(Seq[1,], Seq[2,], main = "Sequenz 1 und Sequenz 2", asp = 1)
 Is there a way to draw a small diagonal, begin at (0/0) to (6/6)
 (perhaps in red??) or must I use gimp? I have many dotPlots, so it is
 fine if R can do this.

 Thanks Joerg


Re: [R] [BioC] comparing two tables

2011-10-26 Thread Steve Lianoglou

Hi,

On Wed, Oct 26, 2011 at 8:17 AM, Assa Yeroslaviz fry...@gmail.com wrote:
 Hi David,

 your function works just fine if I take only the region into account. But
 unfortunately it does not consider the first column, the chromosomes.
 There can be an overlap between the two tables only if the regions are on
 the same chromosome. This is why the first column of both tables is a
 prerequisite for the analysis.

 I tried somehow to create a second argument to consider this, but so far
 without success.

Well, bioconductor has packages to deal with this type of data, and
these type of queries (overlaps) very efficiently.

Martin Morgan had sent you an email earlier explaining how you can use
the GenomicRanges packages to get what you're after ... I (highly)
suggest you go that route.

HTH,

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: [R] merging two dataframes

2011-10-26 Thread Petr PIKAL

 
 Hello.
 
 Now I tried to do what you told me.
 I used the str() function, and data$date1 and data3$date1 were both 
listed as

You have no date1 only date. Therefore

result <- merge(data, data3, by=c("date", "name"), all=T)

takes all values from both data frames 

 dim(data)
[1] 20 15
 dim(data3)
[1] 20  9

altogether 24 columns, of which 4 are date and name columns; therefore 20 
columns contain data.

 dim(result)
[1] 40 22

So the result has all 20 columns from both data frames plus one name and 
one date column and all rows from both data frames = 40. Those two sets 
are disjoint. If you had some common date and name in both data frames 
these rows would be merged on the same row in result.

Let us try this.

 data3$name[1:5] <- data$name[1:5]
 data3$date[3:5] <- data$date[3:5]

result <- merge(data, data3, by=c("date", "name"), all=T)
dim(result)
[1] 37 22
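
The row arithmetic (40 rows with no common keys, 37 once three key pairs coincide) can be checked on a toy case. This is a sketch with invented frames `a` and `b`; it assumes each date/name pair occurs at most once per frame, otherwise matches multiply:

```r
# Toy frames sharing exactly one (date, name) pair: ("d3", "A").
a <- data.frame(date = c("d1", "d2", "d3"), name = "A", x = 1:3,
                stringsAsFactors = FALSE)
b <- data.frame(date = c("d3", "d4"), name = c("A", "B"), y = c(10, 20),
                stringsAsFactors = FALSE)

# Full outer join on both key columns.
res <- merge(a, b, by = c("date", "name"), all = TRUE)

# Inner join: only the rows whose keys match in both frames.
n_match <- nrow(merge(a, b, by = c("date", "name")))

# rows = rows of a + rows of b - matched rows
# cols = key columns + non-key columns of a + non-key columns of b
dim(res)
nrow(a) + nrow(b) - n_match
```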

Regards
Petr


 character. I changed name to character but it did not work either.
 I also changed all variables to character, with no positive result.
 
 str(data)
 'data.frame':   14446 obs. of  15 variables:
  $ id : chr  1 1 1 1 ...
  $ compid : chr  2514 2514 2514 2514 ...
  $ secid  : chr  15856 15856 15856 15856 ...
  $ name   : chr  A-pressen A-pressen A-pressen A-pressen ...
  $ period : chr  1 2 3 4 ...
  $ date   : chr  17.05.1980 17.05.1981 17.05.1982 17.05.1983 ...
  $ enddate: chr  17.05.1981 17.05.1982 17.05.1983 17.05.1984 ...
  $ div: chr  NA NA NA NA ...
  $ ndivs  : chr  NA NA NA NA ...
  $ posdiv : chr  NA NA NA NA ...
  $ ddiv2  : chr  NA NA NA NA ...
  $ ddiv3  : chr  NA NA NA NA ...
  $ ddiv4  : chr  NA NA NA NA ...
  $ ddiv5  : chr  NA NA NA NA ...
  $ ddiv6  : chr  NA NA NA NA ...
 
 str(data3)
 'data.frame':   812354 obs. of  9 variables:
  $ date  : chr  02.01.1996 03.01.1996 04.01.1996
 05.01.1996 ...
  $ Securityid: chr  6001 6001 6001 6001 ...
  $ Symbol: chr  AAV AAV AAV AAV ...
  $ name  : chr  Adresseavisen Adresseavisen
 Adresseavisen Adresseavisen ...
  $ Securitytype  : chr  Ordinary Shares Ordinary Shares
 Ordinary Shares Ordinary Shares ...
  $ Unadjusted: chr  200 200 200 200 ...
  $ Event.adjusted: chr  200 200 200 200 ...
  $ Div.and.Event.adjusted: chr  109,7595375 109,7595375 
109,7595375
 109,7595375 ...
  $ Sharesissued  : chr  1901646 1901646 1901646 1901646 
...
 
 Here is some suitable data for data
 
  dput(data[1:20,])
 
 structure(list(id = c(1, 1, 1, 1, 1, 1, 1, 1, 
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), 
 compid = c(2514, 2514, 2514, 2514, 2514, 2514, 
 2514, 2514, 2514, 2514, 2514, 2514, 2514, 2514, 
 2514, 2514, 2514, 2514, 2514, 2514), secid = c(15856, 
 15856, 15856, 15856, 15856, 15856, 15856, 15856, 
 15856, 15856, 15856, 15856, 15856, 15856, 15856, 
 15856, 15856, 15856, 15856, 15856), name = c(A-pressen, 
 A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, 
 A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, 
 A-pressen, A-pressen, A-pressen, A-pressen, A-pressen, 
 A-pressen, A-pressen, A-pressen, A-pressen), period = c(1, 

 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
 13, 14, 15, 16, 17, 18, 19, 20), date = 
c(17.05.1980, 
 17.05.1981, 17.05.1982, 17.05.1983, 17.05.1984, 
17.05.1985, 
 17.05.1986, 17.05.1987, 17.05.1988, 17.05.1989, 
17.05.1990, 
 17.05.1991, 17.05.1992, 17.05.1993, 17.05.1994, 
17.05.1995, 
 17.05.1996, 17.05.1997, 17.05.1998, 17.05.1999), 
 enddate = c(17.05.1981, 17.05.1982, 17.05.1983, 17.05.1984, 
 17.05.1985, 17.05.1986, 17.05.1987, 17.05.1988, 
17.05.1989, 
 17.05.1990, 17.05.1991, 17.05.1992, 17.05.1993, 
17.05.1994, 
 17.05.1995, 17.05.1996, 17.05.1997, 17.05.1998, 
17.05.1999, 
 17.05.2000), div = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 
 0, 0, 0, 0, 0, 5, 0, 1.1, 1.2, 1, 0
 ), ndivs = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 
 0, 0, 0, 1, 0, 1, 1, 1, 0), posdiv = c(NA, 
 NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, 
 NA, 1, 1, 1, NA), ddiv2 = c(NA, NA, NA, NA, NA, NA, 
 NA, NA, NA, NA, 0, 0, 0, 0, 0, NA, 0, 1, NA, 
 NA), ddiv3 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
 0, 0, 0, 0, 0, 0, 0, 0, -1), ddiv4 = c(NA, 
 NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 
 0, 0, 0, 0, 0), ddiv5 = c(NA, NA, NA, NA, NA, NA, 
 NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0, 0, 
 0), ddiv6 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
 NA, NA, NA, 0, 0, 0, 0, 0, 0)), .Names = c(id, 
 compid, secid, name, period, date, enddate, div, 
 ndivs, posdiv, ddiv2, ddiv3, ddiv4, ddiv5, ddiv6
 ), row.names = c(NA, 20L), class = data.frame)
 
 
 
 
 Here is some suitable data for data3:
 
  dput(data3[1:20,])
 
 structure(list(date = c(02.01.1996, 03.01.1996, 04.01.1996, 
 05.01.1996, 08.01.1996, 09.01.1996, 10.01.1996, 11.01.1996, 
 12.01.1996, 15.01.1996, 16.01.1996, 17.01.1996, 18.01.1996, 
 19.01.1996, 22.01.1996, 23.01.1996,

Re: [R] [BioC] comparing two tables

2011-10-26 Thread Assa Yeroslaviz

Thanks Steve,

I already did it and it went perfectly well.

I was just trying to understand the functions David wrote, so that I can use
them maybe for other queries.
Unfortunately I wasn't able to add a condition for the fact that there is a
third parameter to be compared.

I would still love to know whether there is a way of adding such a parameter.

I tried to do it with a third argument in this line:   any( apply(locs,
1, function(x){vec["start"] > x[2] & vec["start"] <= x[3] &
as.character(vec["chr"])==as.character(x["chr"])
but it doesn't seem to work at all.
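
One way the chromosome condition could be added in base R is sketched below. It is not David's function: `in_region()` is a made-up helper, and `read.table(text = ...)` replaces the `rd.txt()` helper used earlier in the thread. The sketch avoids `apply()` on purpose, because `apply()` first converts a data frame to a matrix, and once a character column such as the chromosome is included the whole matrix becomes character, so positions no longer compare as numbers.

```r
genetable <- read.table(text = "name chr start end str accession Length
gen1 4 646752 646838 + MI0005806 86
gen12 2L 243035 243141 - MI0005821 106
gen3 2L 159838 159928 + MI0005813 90
gen7 2L 1831685 1831799 - MI0011290 114
gen4 2L 2737568 2737661 + MI0017696 93",
  header = TRUE, stringsAsFactors = FALSE)

loctable <- read.table(text = "Chr Start End length
4 136532 138654 2122
3 139870 141970 2100
2L 157838 160440 2602
X 160834 162966 2132
4 204040 208536 4496",
  header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: does one gene lie inside any region on its chromosome?
# Uses the same comparisons as the helper quoted earlier (start > Start,
# end <= End) plus an equality test on the chromosome.
in_region <- function(chr, start, end, locs) {
  any(locs$Chr == chr & start > locs$Start & end <= locs$End)
}

# Walk over the genes column-wise so each column keeps its own type.
hits <- mapply(in_region, genetable$chr, genetable$start, genetable$end,
               MoreArgs = list(locs = loctable), USE.NAMES = FALSE)

genetable$name[hits]
```

For large tables the GenomicRanges route recommended in this thread will be far faster; this version is only meant to show where the extra condition goes.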

Thanks for the help anyway
Assa

On Wed, Oct 26, 2011 at 15:33, Steve Lianoglou 
mailinglist.honey...@gmail.com wrote:

 Hi,

 On Wed, Oct 26, 2011 at 8:17 AM, Assa Yeroslaviz fry...@gmail.com wrote:
  Hi David,
 
  your function works just fine if I take only the region into account. But
  unfortunately it does not consider the first column, the chromosomes.
  There can be an overlap between the two tables only if the regions are on
  the same chromosome. This is why the first column of both tables is a
  prerequisite for the analysis.
 
  I tried somehow to create a second argument to consider this, but so far
  without success.

 Well, bioconductor has packages to deal with this type of data, and
 these type of queries (overlaps) very efficiently.

 Martin Morgan had sent you an email earlier explaining how you can use
 the GenomicRanges packages to get what you're after ... I (highly)
 suggest you go that route.

 HTH,

 -steve

 --
 Steve Lianoglou
 Graduate Student: Computational Systems Biology
  | Memorial Sloan-Kettering Cancer Center
  | Weill Medical College of Cornell University
 Contact Info: http://cbio.mskcc.org/~lianos/contact



[R] Error in summary.mlm: formula not subsettable

2011-10-26 Thread Helios de Rosario

When I fit a multivariate linear model, and the formula is defined
outside the call to lm(), the method summary.mlm() fails.

This works well:
 y <- matrix(rnorm(20),nrow=10)
 x <- matrix(rnorm(10))
 mod1 <- lm(y~x)
 summary(mod1)
...

But this does not:
 f <- y~x
 mod2 <- lm(f)
 summary(mod2)
Error in object$call$formula[[2L]] <- object$terms[[2L]] <-
as.name(ynames[i]) :
  object of type 'symbol' is not subsettable

I would say that the problem is in the following difference:
 class(mod1$call$formula)
[1] "call"
 class(mod2$call$formula)
[1] "name"

As far as I understand, summary.mlm() creates a list of .lm objects
from the individual columns of the matrices in the .mlm object, and then
it tries to change the second element of object$call$formula, to present
the name of the corresponding column as the response variable. But if
the formula has been defined outside the call to lm(), that element
cannot be modified that way.

A bug, perhaps?

 sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Spanish_Spain.1252  LC_CTYPE=Spanish_Spain.1252
[3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base




-- 
Helios de Rosario Martínez
  Researcher


[R] Power mixed model ordinal logistic regression

2011-10-26 Thread Scott Raynaud

Is there a package that will perform power calculations for mixed model ordinal 
logistic regression?  I searched an came up with nothing.

Re: [R] Error in summary.mlm: formula not subsettable

2011-10-26 Thread Duncan Murdoch


On 26/10/2011 9:48 AM, Helios de Rosario wrote:

When I fit a multivariate linear model, and the formula is defined
outside the call to lm(), the method summary.mlm() fails.

This works well:
  y <- matrix(rnorm(20),nrow=10)
  x <- matrix(rnorm(10))
  mod1 <- lm(y~x)
  summary(mod1)
...

But this does not:
  f <- y~x
  mod2 <- lm(f)
  summary(mod2)
Error in object$call$formula[[2L]] <- object$terms[[2L]] <-
as.name(ynames[i]) :
   object of type 'symbol' is not subsettable

I would say that the problem is in the following difference:
  class(mod1$call$formula)
[1] "call"
  class(mod2$call$formula)
[1] "name"

As far as I understand, summary.mlm() creates a list of .lm objects
from the individual columns of the matrices in the .mlm object, and then
it tries to change the second element of object$call$formula, to present
the name of the corresponding column as the response variable. But if
the formula has been defined outside the call to lm(), that element
cannot be modified that way.

A bug, perhaps?


Yes, I'd say it's a bug.  summary.lm handles this situation fine, but 
summary.mlm does not.


I'll take a look...

Duncan Murdoch


Re: [R] Error in summary.mlm: formula not subsettable

2011-10-26 Thread Duncan Murdoch


On 26/10/2011 9:48 AM, Helios de Rosario wrote:

When I fit a multivariate linear model, and the formula is defined
outside the call to lm(), the method summary.mlm() fails.

This works well:
  y <- matrix(rnorm(20),nrow=10)
  x <- matrix(rnorm(10))
  mod1 <- lm(y~x)
  summary(mod1)
...

But this does not:
  f <- y~x
  mod2 <- lm(f)
  summary(mod2)
Error in object$call$formula[[2L]] <- object$terms[[2L]] <-
as.name(ynames[i]) :
   object of type 'symbol' is not subsettable

I would say that the problem is in the following difference:
  class(mod1$call$formula)
[1] "call"
  class(mod2$call$formula)
[1] "name"

As far as I understand, summary.mlm() creates a list of .lm objects
from the individual columns of the matrices in the .mlm object, and then
it tries to change the second element of object$call$formula, to present
the name of the corresponding column as the response variable. But if
the formula has been defined outside the call to lm(), that element
cannot be modified that way.

A bug, perhaps?


Yes, it was a bug.  A simple workaround is the following:

mod2$call$formula <- formula(mod2)

I'll add that to summary.mlm, but in the meantime, you can just do it 
yourself.
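
A minimal sketch of that workaround applied to the example from the original post. The set.seed() call and the object name s are only illustrative; on R versions where summary.mlm already handles this case, the extra assignment should be harmless.

```r
set.seed(1)                          # illustrative only, for reproducibility
y <- matrix(rnorm(20), nrow = 10)
x <- matrix(rnorm(10))

f <- y ~ x
mod2 <- lm(f)                        # the stored call holds the symbol `f`

# Workaround: replace the stored symbol with the actual formula.
mod2$call$formula <- formula(mod2)

s <- summary(mod2)                   # one summary per response column
length(s)
```

After the assignment, mod2$call$formula is a formula rather than a bare name, which is what summary.mlm needs in order to swap in each response.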


Duncan Murdoch


Re: [R] merging two dataframes

2011-10-26 Thread dividend

I pasted the wrong function; I have changed date1 to date (ignore that).
I think there has to be something wrong with my data format. I can't
understand why it doesn't work. I know I can use by.x= and by.y=, but
since both datasets have the same variable name it should be unnecessary to
do that. 



--
View this message in context: 
http://r.789695.n4.nabble.com/merging-two-dataframes-tp3932869p3940396.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Adding rows to a table with a loop

2011-10-26 Thread MJS

Thanks for the response, and the advice, glmulti looks like it could be quite
a good alternative.

As for the adding to the results table problem from within the loop, this
webpage:
http://ryouready.wordpress.com/2009/01/23/r-combining-vectors-or-data-frames-of-unequal-length-into-one-data-frame/
answered a number of my questions.

--
View this message in context: 
http://r.789695.n4.nabble.com/Adding-rows-to-a-table-with-a-loop-tp3933634p3940293.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] New column of data filled with the larger value from 2 columns

2011-10-26 Thread robgriffin247

data$C <- pmax(data$A,data$B)

worked perfectly, thank you very much.
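
For readers landing on this thread, a small sketch of what that line does. The data frame below is made up; only the column names A, B and C come from the post, and C_narm is an extra illustrative column.

```r
# Hypothetical data; column names A, B, C follow the post.
data <- data.frame(A = c(1, 5, 3, NA), B = c(4, 2, 3, 7))

# pmax() compares element by element (max() would return a single value).
data$C <- pmax(data$A, data$B)

# With na.rm = TRUE the non-missing value is kept when one of the pair is NA.
data$C_narm <- pmax(data$A, data$B, na.rm = TRUE)
data
```

Without na.rm = TRUE, a missing value in either column gives NA in the new column.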

--
View this message in context: 
http://r.789695.n4.nabble.com/New-column-of-data-filled-with-the-larger-value-from-2-columns-tp3940020p3940399.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] merging two dataframes

2011-10-26 Thread Petr PIKAL

 
 I pasted wrong function, I have changed from date1 to date (ignore that).
 I think it have to be something wrong with my data format. I can`t
 understand why it don't work. I know I can use by.x= and by.y=, but
 since both datasets have the same variable name it should be unnecessary to
 do that. 

Again:

Check the dimensions of your three data frames. The number of rows in the final 
data frame should be at least the number of rows in the bigger data 
frame and no more than the sum of the rows of both merged data frames.
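
A small sketch of that check, using made-up data frames (a, b and their columns are hypothetical). Note that the bound holds for a merge that keeps unmatched rows (all = TRUE) when the key values are unique within each frame; the default merge() keeps only matching rows and can return fewer.

```r
# Hypothetical data: two frames sharing a `date` column.
a <- data.frame(date = as.Date("2011-10-01") + 0:3, x = 1:4)
b <- data.frame(date = as.Date("2011-10-01") + 2:6, y = 11:15)

inner <- merge(a, b)              # default: only dates present in both
outer <- merge(a, b, all = TRUE)  # every date from either frame

# Compare the dimensions of the inputs and both results.
c(a = nrow(a), b = nrow(b), inner = nrow(inner), outer = nrow(outer))
```

If the outer merge has exactly nrow(a) + nrow(b) rows, no key matched at all, which usually points to a type or format mismatch in the key column.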

Regards
Petr


 
 
 

Re: [R] Building package/DESCRIPTION file not existing?

2011-10-26 Thread Francois Rousseu

Thanks to both of you.

Indeed, it was a language issue. I eventually noticed a check warning stating
that the DESCRIPTION file had non-ASCII characters and an unknown encoding, although no
special characters were in the file.
After reading various messages on mailing lists, I added Encoding: latin1 and it
worked. Then, when installing the package tarball with install.packages, the
é in the Propriétaire part of the library
directory was changed to ii . So I used its DOS equivalent, Program~1, and
that also worked. I haven't noticed any warning about using non-English Windows in
the Writing R Extensions manual, but
I may have missed it. Anyway, I realize now that using non-English Windows is
probably a really bad idea in general.

Cheers,
Francois Rousseu

From: maech...@stat.math.ethz.ch
Date: Wed, 26 Oct 2011 10:37:30 +0200
To: francoisrous...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] Building package/DESCRIPTION file not existing?

Francois Rousseu francoisrous...@hotmail.com
on Mon, 24 Oct 2011 20:10:27 -0400 writes:

Hello useRs

I am trying to build a package for personal use and for
making easier working with other people but I keep getting
the same error message about the DESCRIPTION file not
existing.

when trying to install from a source tar.gz file:

Error in .read_description(dfile) : file
'C:/Users/Propriétaire/AppData/Local/Temp/RtmpHFMONb/R.INSTALL647a3535/mypkg/DESCRIPTION'
does not exist

when trying to build a binary version:

Error in .read_description(dfile) : file
'C:/Users/Propriétaire/Documents/RETROBIRD/mypkg/DESCRIPTION'
does not exist

Yes, I'm almost sure it's the language issues.

I've recently taught a course on R Package building
and on Windows, the user had problems because of an 'ä'
(a-Umlaut) in one of the directories in her 'path'.

So if you work from another place than
'C:/Users/Propriétaire/' this may solve the main problem.

Bonnes salutations,
Martin Maechler, ETH Zurich

Any hints? Below is the complete sequence with errors.

Thanks, Francois Rousseu

[.]


Re: [R] merging two dataframes

2011-10-26 Thread Timothy Bates

So when I do the merge on your example frames, I get the expected result.

But the example component dataframes you sent are already full of NAs, and 
there are no rows which are present in both data sets. So I think perhaps, that 
merge is just highlighting a problem that has its roots in your component data.

t


On 26 Oct 2011, at 1:16 PM, dividend wrote:
 I pasted wrong function, I have changed from date1 to date (ignore that).
 I think it have to be something wrong with my data format. I can`t
 understand why it don't work. I know I can use by.x= and by.y=, but
 since both datasets have the same variable name it should be unnecessary to
 do that. 


Re: [R] [BioC] comparing two tables

2011-10-26 Thread Steve Lianoglou

Hi Assa,


On Wed, Oct 26, 2011 at 9:44 AM, Assa Yeroslaviz fry...@gmail.com wrote:
 Thanks Steve,

 I already did it and it went perfectly well.

 I was just trying to understand the functions David wrote, so that I can use
 them maybe for other queries.
 Unfortunately I wasn't able to add a condition for the fact that there is a
 third parameter to be compared.

 I would still love to know whether there is a way of adding such a parameter.

Sorry, I didn't realize you were after some personal R study

 I tried to do it with a third argument in this line:   any( apply(locs,
 1, function(x){vec["start"]>x[2] & vec["start"]<=x[3] &
 as.character(vec["chr"])==as.character(x["chr"])
 but it doesn't seem to work at all.

You have to change the table you are sending to the second param of
your inregion function.

Currently you are sending into the `locs` parameter a two-column table
that just has c("Start", "End"), e.g.:

Think about:

R> inregion(genetable[2, ], loctable[, c("Start", "End")])

Look at what `loctable[, c("Start", "End")]` gives you.

It looks like your change to inregion should work once you pass in the
"Chr" column from your loctable (barring case-sensitivity issues: you
have "chr" and "Chr" in your separate tables), e.g. use your modified
inregion function and call it like so:

R> inregion(genetable[2, ], loctable[, c("Chr", "Start", "End")])

modulo this or that.
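
A rough sketch of the three-way comparison, under stated assumptions: the tables below are invented, only the column names (chr, start, Chr, Start, End) come from the thread, and in_region() is an illustrative stand-in, not the inregion function discussed above. It compares whole columns instead of using apply(), because apply() first converts a data frame with a character column into a character matrix, so the numeric comparisons would then be done on strings.

```r
# Hypothetical tables; only the column names come from the thread.
genetable <- data.frame(chr = c("chr1", "chr2", "chr1"),
                        start = c(150, 150, 900),
                        stringsAsFactors = FALSE)
loctable <- data.frame(Chr = c("chr1", "chr2"),
                       Start = c(100, 500),
                       End = c(200, 600),
                       stringsAsFactors = FALSE)

# in_region() is an illustrative name, not the function from the thread.
# `gene` is a one-row data frame; `locs` holds Chr, Start and End.
in_region <- function(gene, locs) {
  any(locs$Chr == gene$chr &
      gene$start > locs$Start &
      gene$start <= locs$End)
}

# One logical value per row of genetable.
hits <- vapply(seq_len(nrow(genetable)),
               function(i) in_region(genetable[i, ], loctable),
               logical(1))
hits
```

The second gene falls inside the chr1 interval by position but sits on chr2, so the chromosome condition is what excludes it.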


-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


[R] sometimes removing NAs from code

2011-10-26 Thread Schatzi

Sometimes I have NA values within specific columns of a dataframe (in this
example, the first two columns can have NAs). If there are NA values, I
would like them to be removed.

I have been using the code:

y<-c(NA,5,4,2,5,6,NA)
z<-c(NA,3,4,NA,1,3,7)
x<-1:7
adata<-data.frame(y,z,x)
adata<-adata[-which(apply(adata[,1:2],1,function(x)any(is.na(x)))),]

This works well if there are NA values, but when a dataset doesn't have NA
values, this code messes up the dataframe. I was trying to pick apart this
code and could not understand why it didn't work when there were no NA
values.


If there are no NA values and I run just the part:
apply(adata[,1:2],1,function(x)any(is.na(x)))
it results in:
    2     3     5     6 
FALSE FALSE FALSE FALSE 

I was thinking that I can put in an if statement, but I think there has to
be a better way.

Any ideas/help? Thank you.

-
In theory, practice and theory are the same. In practice, they are not - Albert 
Einstein
--
View this message in context: 
http://r.789695.n4.nabble.com/sometimes-removing-NAs-from-code-tp3941009p3941009.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Using abline in lattice

2011-10-26 Thread weiflo

Oops, sorry, I just realized the first code is wrong; it's the one that already
has a panel function. The right code would be:

Tuvalu <- c(9,3,4,0,3,0,0)
Singapor <- c(38,0,0,0,12,19,0)
Samoa <- c(26,16,2,0,5,2,0)
PNG <- c(56,4,0,5,2,0,56)
Micronesia <- c(6,0,0,0,0,0,0)
graph4 <- data.frame(rbind(Tuvalu,Singapor,Samoa,PNG,Micronesia))
graph4$country <- c("Tuvalu","Singapore","Samoa","Papua New Guinea","Micronesia")
barchart(country ~ X1 + X2 + X3 + X4 + X5 + X6 + X7, data=graph4,
 stack=T,
 xlim=c(0,130),
 scales = list(alternating = 1, cex=1.2),
 xlab="",

#col=c("grey1","grey17","grey33","grey50","grey67","grey83","grey100")

col=c("grey20","grey100","grey50","grey83","grey33","grey67","grey0"))

Apologies!

--
View this message in context: 
http://r.789695.n4.nabble.com/Using-abline-in-lattice-tp3941012p3941024.html
Sent from the R help mailing list archive at Nabble.com.


[R] Using abline in lattice

2011-10-26 Thread weiflo

Dear all,

being a relative beginner in R, I apologize for posting the second question
within two days.

So I want a stacked barchart, which should look like the one produced by
this code:
Tuvalu <- c(9,3,4,0,3,0,0)
Singapor <- c(38,0,0,0,12,19,0)
Samoa <- c(26,16,2,0,5,2,0)
PNG <- c(56,4,0,5,2,0,56)
Micronesia <- c(6,0,0,0,0,0,0)
graph4 <- data.frame(rbind(Tuvalu,Singapor,Samoa,PNG,Micronesia))
graph4$country <- c("Tuvalu","Singapore","Samoa","Papua New Guinea","Micronesia")
graph4$country <- factor(graph4$country)
xyplot(country ~ X1 + X2 + X3 + X4 + X5 + X6 + X7, data=graph4,
 xlim=c(0,130),
 #scales = list(alternating = 1, cex=1.2),
 xlab="",
 panel = function (x,y) {
   stack=F
   groups=country
   panel.barchart(x,y,
col=c("grey20","grey100","grey50","grey83","grey33","grey67","grey0"))
   panel.abline(v = 20, lty = 2, col = "blue")
 }
   )


But now I would like to add vertical lines at certain values (20, 40, etc.),
but because I couldn't make the abline command work with the above code, I
wrote a panel function. Then the vertical lines work quite well, but now the
bars are plotted on top of each other. See for yourself, here is the code
(the first four lines of recoding I did to make sure that it's not a problem
of the formula with the pluses, but it turns out just the same):

test <- data.frame(rep(graph4$country,7),
 c(graph4$X1,graph4$X2,graph4$X3,graph4$X4,graph4$X5,graph4$X6,graph4$X7))
names(test) <- c("country","X1")
names(test)


xyplot(country ~ X1, data=test,
 xlim=c(0,130),
 #scales = list(alternating = 1, cex=1.2),
 xlab="",
 panel = function (x,y) {
   #groups=country
   panel.abline(v = 20, lty = 2, col = "grey70")
   panel.abline(v = 40, lty = 2, col = "grey70")
   panel.barchart(x,y,
col=c("grey20","grey100","grey50","grey83","grey33","grey67","grey0"))
   },
   )

I played around a lot with the stack argument in the second code, but nothing
worked. My question now would be: how can I either make the vertical lines
work with the first code, or make the bars look like those in the first example
using the second code?

Thanks a lot for your help!
Florian


--
View this message in context: 
http://r.789695.n4.nabble.com/Using-abline-in-lattice-tp3941012p3941012.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] dotPlot with diagonal

2011-10-26 Thread David L Carlson

Try again. There is no dotPlot() function in lattice and dotplot() does not 
take two separate rows so the example you gave us generates an error message if 
dotPlot is changed to dotplot.


--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Jörg Reuter
Sent: Wednesday, October 26, 2011 8:13 AM
To: Dennis Murphy
Cc: r-help@r-project.org
Subject: Re: [R] dotPlot with diagonal

Oh, sorry.
library(lattice)
(Seq <- matrix(c(1, 1, 6, 1, 2, 2, 5, 4, 3, 3, 4, 3,
 4, 4, 3, 2, 5, 5, 2, 5, 6, 6, 1, 6), ncol = 6))
dotPlot(Seq[1,], Seq[2,], main = "Sequenz 1 und Sequenz 2", asp = 1)
Is there a way to draw a small diagonal, begin at (0/0) to (6/6)
(perhaps in red??) or must I use gimp? I have many dotPlots, so it is
fine if R can do this.


2011/10/26 Dennis Murphy djmu...@gmail.com:
 Let's see: there is a dotPlot() function in each of the following packages:
 BHH2, caret, mosaic, qualityTools
 Would you be kind enough to share which of these packages (if any) you
 are using?

 Dennis

 On Wed, Oct 26, 2011 at 4:25 AM, Jörg Reuter jo...@reuter.at wrote:
 Hi,
 I want draw a dotPlot. All works fine:
 (Seq <- matrix(c(1, 1, 6, 1, 2, 2, 5, 4, 3, 3, 4, 3,
 4, 4, 3, 2, 5, 5, 2, 5, 6, 6, 1, 6), ncol = 6))
 dotPlot(Seq[1,], Seq[2,], main = "Sequenz 1 und Sequenz 2", asp = 1)
 Is there a way to draw a small diagonal, begin at (0/0) to (6/6)
 (perhaps in red??) or must I use gimp? I have many dotPlots, so it is
 fine if R can do this.

 Thanks Joerg


Re: [R] Power mixed model ordinal logistic regression

2011-10-26 Thread Marc Schwartz


On Oct 26, 2011, at 8:57 AM, Scott Raynaud wrote:

 Is there a package that will perform power calculations for mixed model 
 ordinal logistic regression?  I searched and came up with nothing.


I am not sure that there is a canned package or function that will do that. 
More than likely, you will need to use simulation.

I would suggest that you subscribe to and post your query to the 
r-sig-mixed-models list:

  https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

That will provide you with a focused audience in this domain and somebody might 
know of alternatives.

HTH,

Marc Schwartz


Re: [R] sometimes removing NAs from code

2011-10-26 Thread Natalie Van Zuydam


Hi,

Why don't you give subset a try:

adata <- subset(adata, is.na(z)==FALSE & is.na(y)==FALSE)

I'm not sure if you want to use AND or OR for this statement.

Best wishes,
Natalie
On 26/10/2011 16:25, Schatzi wrote:

Sometimes I have NA values within specific columns of a dataframe (in this
example, the first two columns can have NAs). If there are NA values, I
would like them to be removed.

I have been using the code:

y<-c(NA,5,4,2,5,6,NA)
z<-c(NA,3,4,NA,1,3,7)
x<-1:7
adata<-data.frame(y,z,x)
adata<-adata[-which(apply(adata[,1:2],1,function(x)any(is.na(x)))),]

This works well if there are NA values, but when a dataset doesn't have NA
values, this code messes up the dataframe. I was trying to pick apart this
code and could not understand why it didn't work when there were no NA
values.


If there are no NA values and I run just the part:
apply(adata[,1:2],1,function(x)any(is.na(x)))
it results in:
 2 3 5 6
FALSE FALSE FALSE FALSE

I was thinking that I can put in an if statement, but I think there has to
be a better way.

Any ideas/help? Thank you.


Re: [R] sometimes removing NAs from code

2011-10-26 Thread jim holtman

?complete.cases

> y<-c(NA,5,4,2,5,6,NA)
> z<-c(NA,3,4,NA,1,3,7)
> x<-1:7
> adata<-data.frame(y,z,x)
> adata
   y  z x
1 NA NA 1
2  5  3 2
3  4  4 3
4  2 NA 4
5  5  1 5
6  6  3 6
7 NA  7 7
> adata[complete.cases(adata),]
  y z x
2 5 3 2
3 4 4 3
5 5 1 5
6 6 3 6
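
One caveat, sketched below with slightly altered data (the NA in x is my own addition, not part of the original post): complete.cases(adata) checks every column, whereas the question was about the first two columns only. Restricting the check to those columns is one way to match the original intent, and it also leaves a data frame without any NA values untouched.

```r
adata <- data.frame(y = c(NA, 5, 4, 2, 5, 6, NA),
                    z = c(NA, 3, 4, NA, 1, 3, 7),
                    x = c(1, NA, 3, 4, 5, 6, 7))   # x has an NA here, unlike the post

all_cols  <- adata[complete.cases(adata), ]          # also drops row 2
first_two <- adata[complete.cases(adata[, 1:2]), ]   # checks y and z only

# A frame with no NA values comes back unchanged.
noNA <- data.frame(y = 1:3, z = 4:6, x = 7:9)
unchanged <- noNA[complete.cases(noNA[, 1:2]), ]
```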


On Wed, Oct 26, 2011 at 11:25 AM, Schatzi adele_thomp...@cargill.com wrote:
 Sometimes I have NA values within specific columns of a dataframe (in this
 example, the first two columns can have NAs). If there are NA values, I
 would like them to be removed.

 I have been using the code:

 y<-c(NA,5,4,2,5,6,NA)
 z<-c(NA,3,4,NA,1,3,7)
 x<-1:7
 adata<-data.frame(y,z,x)
 adata<-adata[-which(apply(adata[,1:2],1,function(x)any(is.na(x)))),]

 This works well if there are NA values, but when a dataset doesn't have NA
 values, this code messes up the dataframe. I was trying to pick apart this
 code and could not understand why it didn't work when there were no NA
 values.


 If there are no NA values and I run just the part:
 apply(adata[,1:2],1,function(x)any(is.na(x)))
 it results in:
    2     3     5     6
 FALSE FALSE FALSE FALSE

 I was thinking that I can put in an if statement, but I think there has to
 be a better way.

 Any ideas/help? Thank you.





-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


Re: [R] sometimes removing NAs from code

2011-10-26 Thread Marc Schwartz

On Oct 26, 2011, at 10:25 AM, Schatzi wrote:

 Sometimes I have NA values within specific columns of a dataframe (in this
 example, the first two columns can have NAs). If there are NA values, I
 would like them to be removed.
 
 I have been using the code:
 
 y<-c(NA,5,4,2,5,6,NA)
 z<-c(NA,3,4,NA,1,3,7)
 x<-1:7
 adata<-data.frame(y,z,x)
 adata<-adata[-which(apply(adata[,1:2],1,function(x)any(is.na(x)))),]
 
 This works well if there are NA values, but when a dataset doesn't have NA
 values, this code messes up the dataframe. I was trying to pick apart this
 code and could not understand why it didn't work when there were no NA
 values.
 
 
 If there are no NA values and I run just the part:
 apply(adata[,1:2],1,function(x)any(is.na(x)))
 it results in:
     2     3     5     6 
 FALSE FALSE FALSE FALSE 
 
 I was thinking that I can put in an if statement, but I think there has to
 be a better way.
 
 Any ideas/help? Thank you.


Presuming that you want to remove an entire row, if any of the elements in that 
row are NA's, see ?na.omit

> na.omit(adata)
  y z x
2 5 3 2
3 4 4 3
5 5 1 5
6 6 3 6

HTH,

Marc Schwartz


Re: [R] sometimes removing NAs from code

2011-10-26 Thread Sarah Goslee

Hi,

On Wed, Oct 26, 2011 at 11:25 AM, Schatzi adele_thomp...@cargill.com wrote:
 Sometimes I have NA values within specific columns of a dataframe (in this
 example, the first two columns can have NAs). If there are NA values, I
 would like them to be removed.

 I have been using the code:

 y<-c(NA,5,4,2,5,6,NA)
 z<-c(NA,3,4,NA,1,3,7)
 x<-1:7
 adata<-data.frame(y,z,x)
 adata<-adata[-which(apply(adata[,1:2],1,function(x)any(is.na(x)))),]

 This works well if there are NA values, but when a dataset doesn't have NA
 values, this code messes up the dataframe. I was trying to pick apart this
 code and could not understand why it didn't work when there were no NA
 values.

Thanks for the example. Your problem is because of the which() statement.

If there are NA values, which() returns the row numbers where the NAs are:

> which(apply(adata[,1:2],1,function(x)any(is.na(x))))
[1] 1 4 7

> bdata <- data.frame(1:7, 1:7, 1:7)
> which(apply(bdata[,1:2],1,function(x)any(is.na(x))))
integer(0)

But if there aren't any, which() returns integer(0), a zero-length vector, and
-integer(0) is still integer(0). How does R subset on an empty row index?
Unhelpfully: it selects zero rows.

Fortunately you don't need the which() at all: the logical vector
returned by your apply statement is entirely sufficient (with added negation):

> adata[apply(adata[,1:2],1,function(x)!any(is.na(x))), ]
  y z x
2 5 3 2
3 4 4 3
5 5 1 5
6 6 3 6
> bdata[apply(bdata[,1:2],1,function(x)!any(is.na(x))), ]
  X1.7 X1.7.1 X1.7.2
1    1      1      1
2    2      2      2
3    3      3      3
4    4      4      4
5    5      5      5
6    6      6      6
7    7      7      7
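
To make the failure mode concrete, here is a short sketch on a made-up data frame with no NA values at all (clean and the other object names are hypothetical):

```r
# Hypothetical data frame without any NA values.
clean <- data.frame(y = c(5, 4, 5, 6), z = c(3, 4, 1, 3), x = 1:4)

has_na <- apply(clean[, 1:2], 1, function(r) any(is.na(r)))

drop_by_which   <- clean[-which(has_na), ]  # which() is empty, so nothing is selected
drop_by_logical <- clean[!has_na, ]         # negated logical vector keeps every row

nrow(drop_by_which)
nrow(drop_by_logical)
```

The negative-index form silently empties the data frame, while the logical form behaves the same whether or not any NA is present.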

Sarah


 If there are no NA values and I run just the part:
 apply(adata[,1:2],1,function(x)any(is.na(x)))
 it results in:
    2     3     5     6
 FALSE FALSE FALSE FALSE

 I was thinking that I can put in an if statement, but I think there has to
 be a better way.

 Any ideas/help? Thank you.



-- 
Sarah Goslee
http://www.functionaldiversity.org


Re: [R] sometimes removing NAs from code

2011-10-26 Thread William Dunlap

Instead of
   d[-which(condition)]
use
   d[!condition]
where 'condition' is a logical vector.

which(condition) returns integer(0) (an integer vector
of length 0) if there are no TRUEs in 'condition'.
-integer(0) is identical to integer(0) and d[integer(0)]
means to select zero elements from d.

!condition means to flip the senses of all the TRUEs and
FALSEs (and to leave NAs alone) so d[!condition] returns
the elements of d for which condition is not TRUE (along
with NA's for NA's in condition, but you won't have any
of them in your example).

By the way, your use of apply() slows things down and
might lead to errors.  Try replacing
  apply(adata[,1:2],1,function(x)any(is.na(x)))
by
  is.na(adata$y) | is.na(adata$z)
or
  rowSums(is.na(adata[,1:2])) > 0
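
As a quick check, the three conditions can be compared on the data from the original post. The object names by_apply, by_or, by_rowsums and kept are only illustrative.

```r
adata <- data.frame(y = c(NA, 5, 4, 2, 5, 6, NA),
                    z = c(NA, 3, 4, NA, 1, 3, 7),
                    x = 1:7)

by_apply   <- apply(adata[, 1:2], 1, function(r) any(is.na(r)))
by_or      <- is.na(adata$y) | is.na(adata$z)
by_rowsums <- rowSums(is.na(adata[, 1:2])) > 0

# All three flag the same rows; keep the rows that are not flagged.
kept <- adata[!by_rowsums, ]
kept
```

The vectorized forms avoid the row-by-row conversion to a matrix that apply() performs on a data frame.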

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

 -----Original Message-----
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf Of Schatzi
 Sent: Wednesday, October 26, 2011 8:25 AM
 To: r-help@r-project.org
 Subject: [R] sometimes removing NAs from code
 
 Sometimes I have NA values within specific columns of a dataframe (in this
 example, the first two columns can have NAs). If there are NA values, I
 would like them to be removed.
 
 I have been using the code:
 
 y<-c(NA,5,4,2,5,6,NA)
 z<-c(NA,3,4,NA,1,3,7)
 x<-1:7
 adata<-data.frame(y,z,x)
 adata<-adata[-which(apply(adata[,1:2],1,function(x)any(is.na(x)))),]
 
 This works well if there are NA values, but when a dataset doesn't have NA
 values, this code messes up the dataframe. I was trying to pick apart this
 code and could not understand why it didn't work when there were no NA
 values.
 
 
 If there are no NA values and I run just the part:
 apply(adata[,1:2],1,function(x)any(is.na(x)))
 it results in:
 2 3 5 6
 FALSE FALSE FALSE FALSE
 
 I was thinking that I can put in an if statement, but I think there has to
 be a better way.
 
 Any ideas/help? Thank you.
 

Re: [R] Creating data frame with residuals of a data frame

2011-10-26 Thread jim holtman

try this:

> age <- c(5,6,10,14,16,NA,18)
> value1 <- c(30,70,40,50,NA,NA,NA)
> value2 <- c(2,4,1,4,4,4,4)
> df <- data.frame(age, value1, value2)

> #Run linear regression to adjust for age and get residuals:

> lm_f <- function(x) {
+ x <- residuals(lm(data=df, formula= x ~ age))
+ }
> resid <- apply(df,2,lm_f)
> resid <- resid[-1]
> for (i in names(resid)){
+ newCol <- paste(i, 'res', sep = '')
+ df[[newCol]] <- NA  # initialize
+ df[[newCol]][as.integer(names(resid[[i]]))] <- resid[[i]]
+ }
> df
  age value1 value2  value1res   value2res
1   5     30      2 -16.945813 -0.37398374
2   6     70      4  22.906404  1.50406504
3  10     40      1  -7.684729 -1.98373984
4  14     50      4   1.724138  0.52845528
5  16     NA      4         NA  0.28455285
6  NA     NA      4         NA          NA
7  18     NA      4         NA  0.04065041
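
An alternative sketch, not part of the reply above: fitting with na.action = na.exclude makes residuals() pad the dropped rows with NA, so each residual vector already has one entry per row of the data frame. The names res and out are illustrative.

```r
age    <- c(5, 6, 10, 14, 16, NA, 18)
value1 <- c(30, 70, 40, 50, NA, NA, NA)
value2 <- c(2, 4, 1, 4, 4, 4, 4)
df <- data.frame(age, value1, value2)

# na.exclude drops incomplete rows for fitting, but residuals() then
# returns a vector of the original length with NA in those positions.
res <- lapply(df[c("value1", "value2")], function(v) {
  residuals(lm(v ~ age, data = df, na.action = na.exclude))
})
names(res) <- paste0(names(res), "res")

out <- cbind(df, as.data.frame(res))
out
```

Because every residual vector has seven entries, cbind() works directly and no index bookkeeping is needed.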


On Mon, Oct 24, 2011 at 10:23 AM, francesca casalino
francy.casal...@gmail.com wrote:
 Dear experts,

 I am trying to create a data frame from the residuals I get after
 having applied a linear regression to each column of a data frame, but
 I don't know how to create this data frame from the resulting list
 since the list has differing numbers of rows.

 So for example:
 age <- c(5,6,10,14,16,NA,18)
 value1 <- c(30,70,40,50,NA,NA,NA)
 value2 <- c(2,4,1,4,4,4,4)
 df <- data.frame(age, value1, value2)

 #Run linear regression to adjust for age and get residuals:

 lm_f <- function(x) {
 x <- residuals(lm(data=df, formula= x ~ age))
 }
 resid <- apply(df,2,lm_f)
 resid <- resid[-1]

 Then resid is a list with different row numbers:

 $value1
         1          2          3          4
 -16.945813  22.906404  -7.684729   1.724138

 $value2
          1           2           3           4           5           7
 -0.37398374  1.50406504 -1.98373984  0.52845528  0.28455285  0.04065041

 I am trying to get both the original variable and their residuals in
 the same data frame like this:

 age, value1, value2, resid_value1, resid_value2

 But when I try cbind or other operations I get an error message
 because they do not have the same number of rows. Can you please help
 me figure out how to solve this?

 Thank you.





-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


Re: [R] Want to exclude axis numbering in plot.ca

2011-10-26 Thread R. Michael Weylandt

I don't know what plot.ca is (it's not in base and you gave no package
citation), but the usual way is to add xaxt = "n" to a plot call.
Assuming plot.ca is an appropriately defined generic, this should
work.

E.g.,
layout(1:2)
plot(1:5)
plot(1:5, xaxt = "n")

Michael

On Wed, Oct 26, 2011 at 3:59 AM, Mark Webb targetlinkm...@gmail.com wrote:
 plot.ca gives numbers on each axis. How do I stipulate to exclude these.
 Have read the R Documentation plot.ca but see no option to exclude axis
 numbers.
 Any suggestions?

 --
 Mark Webb

 Line +27 (21) 786 4379
 Cell +27 (72) 199 1000 [Poor reception]
 Fax  +27 (86) 260 1946

 Skype       tomarkwebb
 Email       targetlinkm...@gmail.com
 Client ftp  http://targetlinkresearch.co.za/cftp/


[R] help with means using tail()

2011-10-26 Thread Iara Faria

Hi all,
 
I have 5 series  (5 ts objects: rp, igpm, ereal, jurosreal, crescpib), and want 
to create a vector with the means of the last values of each variable.
What I did was this:
 
mrp1 <- mean(tail(rp,9))
migpm1 <- mean(tail(igpm,9))
mereal1 <- mean(tail(ereal,9))
mjr1 <- mean(tail(jurosreal,9))
mcp1 <- mean(tail(crescpib,9))
means=rbind(mrp1,migpm1,mereal1,mjr1,mcp1)
 
They are monthly series, from 1995.1 to 2011.6.
So what I did was generate the mean of each variable for [2010.10 to 2011.6] (9 
months, as I wanted).
But now I want to create a vector with the means of the last 9 values [2010.10 
to 2011.6] AND the means of the 9 months shifted back by one month, that is, 
[2010.9 to 2011.5].
 
I looked for examples of this but found none.
Can anyone give me a hand?
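
A minimal sketch of one way to do this, assuming monthly ts objects that run from 1995.1 to 2011.6 as described. The series below is a made-up stand-in for rp; the shifted window is taken by dropping the final observation from the last ten.

```r
# Stand-in for one of the series (198 months: Jan 1995 to Jun 2011)
rp <- ts(seq_len(198), start = c(1995, 1), frequency = 12)

# Mean of the last 9 months [2010.10 to 2011.6]
m_last <- mean(tail(rp, 9))

# Mean of the 9 months ending one month earlier [2010.9 to 2011.5]:
# take the last 10 values, then keep the first 9 of those
m_prev <- mean(head(tail(rp, 10), 9))

# The same window selected by date, as a cross-check
m_prev_window <- mean(window(rp, start = c(2010, 9), end = c(2011, 5)))

means <- c(last9 = m_last, shifted9 = m_prev)
print(means)
```

The same two lines can be repeated for igpm, ereal, jurosreal and crescpib, or wrapped in sapply() over a list of the five series.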
 
Thanks in advance.
Regards,
Iara
[[alternative HTML version deleted]]


Re: [R] lock a package to specific R version

2011-10-26 Thread Uwe Ligges

On 26.10.2011 15:12, Mehmet Suzen wrote:

-Original Message-
From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
Sent: 26 October 2011 10:12
To: Uwe Ligges
Cc: Mehmet Suzen; r-help@r-project.org
Subject: Re: [R] lock a package to specific R version

On Wed, 26 Oct 2011, Uwe Ligges wrote:

On 25.10.2011 11:42, Mehmet Suzen wrote:

Hi,

I was wondering if it is possible to lock a package to a specific
version of R. Dependency attribute in the package DESCRIPTION
only accepts >=, AFAIU.

Depends: R (>= 2.13.2), R (<= 2.13.2)

Or even use ==

Dear Professor Ripley,

Thank you for the reply. We maintain internal R packages and build
binaries for different versions of base R, ranging from 2.8.x to
2.13.x. We need to prevent users from running a package under any R
version other than the ones we tested (we distribute binaries only,
and the package source base is evolving as well).

I am not sure how to address this. Initially I was thinking of putting
the R version in the package version, but the package version in
DESCRIPTION files only allows the x.x.x format, which leaves no room.

No, you can have more, if you really want to.

I don't see why you would want to do this: why would a package work
with 2.13.1 and not 2.13.2, or 2.13.2 and not 2.14.0?  Ranges may make
sense.

Ranges would be much more sensible than ==.

Do go ahead with my suggestion.

Best wishes,
Uwe
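
As a complement to the Depends field, a range check can also be done at run time. This is only a sketch under assumptions: the function name r_in_range and the idea of calling it from a package's load hook are illustrative, not something proposed in the thread.

```r
# Hypothetical helper: TRUE if the R version lies in [lo, hi]
r_in_range <- function(lo, hi, current = getRversion()) {
  current >= package_version(lo) && current <= package_version(hi)
}

# Example: the tested range mentioned in the thread, 2.8.x to 2.13.x
r_in_range("2.8.0", "2.13.2", current = package_version("2.13.1"))
r_in_range("2.8.0", "2.13.2", current = package_version("2.14.0"))
```

A package could call such a helper when it is loaded and stop() with an informative message if the running version falls outside the tested range.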

Best Regards,

Mehmet

[R] SpatialLines

2011-10-26 Thread Mark Newcomb

I'm hoping to use R for spatial analysis.  In working through examples in 
Chapt. 4 of Applied Spatial Data Analysis with R I've come across the following 
error in trying to plot lines with the meuse data set.  The text is verbatim 
from the book.

 m.sl <- SpatialLines(list(Lines(list(Line(cc)))))
Error in Lines(list(Line(cc))) : Single ID required

What does "Single ID required" mean?

Thanks.

Mark


Re: [R] Logistic Regression - Variable Selection Methods With Prediction

2011-10-26 Thread RAJ

Can I at least get help with which package to use for logistic
regression with all possible models and prediction? I know I can
use regsubsets, but I am not sure whether it has any prediction
functions to go with it.

Thanks

On Oct 25, 6:54 pm, RAJ dheerajathr...@gmail.com wrote:
 Hello,

 I am pretty new to R; I have always used SAS and SAS products. My
 target variable is binary ('Y' and 'N') and I have about 14 predictor
 variables. My goal is to compare different variable selection methods
 such as forward, backward, and all possible subsets. I am using
 the misclassification rate to pick the winning method.

 This is what I have as of now:

 Reg <- glm(Graduation ~ ., DFtrain, family=binomial(link="logit"))
 step <- extractAIC(Reg, direction="forward")
 pred <- predict(Reg, DFtest, type="response")
 mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})

 This program actually works, but I need to check that I am
 doing this right. Also, I am getting the same misclassification rates
 for all the different methods.

 I also tried to use

 Reg <- leaps(Graduation ~ ., DFtrain)
 pred <- predict(Reg, DFtest, type="response")
 mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})
 #print(summary(mis))

 which doesn't work,

 and

 Reg <- regsubsets(Graduation ~ ., DFtrain)
 pred <- predict(Reg, DFtest, type="response")
 mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})
 #print(summary(mis))

 regsubsets will run, but the 'predict' function does not work with
 it. Is there any other way to do predictions when using regsubsets?

 Any help is appreciated.
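
For comparison, here is a hedged sketch of how forward and backward selection are usually run with step() from the stats package, followed by the same misclassification calculation. The data are simulated and the names (d, train, test, misclass) are invented for the example. Note that extractAIC() only reports an AIC value and does not select variables, which may be why all the rates above come out identical.

```r
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$Graduation <- factor(ifelse(runif(n) < plogis(1.5 * d$x1 - d$x2), "Y", "N"))
train <- d[1:150, ]
test  <- d[151:200, ]

null <- glm(Graduation ~ 1, data = train, family = binomial)
full <- glm(Graduation ~ ., data = train, family = binomial)

# Forward selection starts from the intercept-only model;
# backward elimination starts from the full model.
fwd <- step(null, scope = formula(full), direction = "forward", trace = 0)
bwd <- step(full, direction = "backward", trace = 0)

# Misclassification rate on the held-out rows at a 0.5 cutoff
misclass <- function(fit) {
  pred <- predict(fit, newdata = test, type = "response")
  mean((pred > 0.5) != (test$Graduation == "Y"))
}

rates <- c(full = misclass(full), forward = misclass(fwd), backward = misclass(bwd))
print(rates)
```

The prediction step uses the model returned by step(), not the original fit, so each selection method can yield its own rate.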

 Thanks,


[R] Guidance with PCA and Regression using complex categorical variables

2011-10-26 Thread sean st.clair

Hello.  I need some guidance.

I would like to run PCA and regression, and my predictor variables are
mainly complex categorical variables (hundreds of levels for some of
them).

What packages and functions are useful for this?

Thanks.
sean


[R] Plot complete dataset

2011-10-26 Thread RMSOPS


Hello,

 I am a new user of R, so I still have some basic difficulties.
I'm trying to create a bar graph directly from data read from a file.
The idea is to have the columns of the table on the x axis
(Married, Single, Divorced, widower)
and the age groups in the legend:
18-34
35-45
46-64
65-69
70-74

The dataset:
dataset
   Ages Married Single Divorced widower
1 18-34    10.5   35.7      8.5     3.2
2 35-45    12.4   22.4     22.2    12.6
3 46-64    25.4   22.2     33.4    12.4
4 65-69    36.7   31.4     12.4    35.2
5 70-74    26.4   15.1      8.5    43.2


The code for barplot 

barplot(dataset,dataset$Single, col = c(rainbow(dataset$Ages)), legend =
rownames(dataset$Ages), ylim = c(0, 100))

but I cannot get it to work.

Thanks





--
View this message in context: 
http://r.789695.n4.nabble.com/Plot-complete-dataset-tp3941346p3941346.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] sometimes removing NAs from code

2011-10-26 Thread Schatzi

Thank you for the help and explanations. I used the complete.cases function
and it is working great.

adata[complete.cases(adata[,1:2]),]



-
In theory, practice and theory are the same. In practice, they are not - Albert 
Einstein
--
View this message in context: 
http://r.789695.n4.nabble.com/sometimes-removing-NAs-from-code-tp3941009p3941431.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Logistic Regression - Variable Selection Methods With Prediction

2011-10-26 Thread Steve_Friedman

Try the glm package

Steve Friedman Ph. D.
Ecologist  / Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147


Re: [R] SpatialLines

2011-10-26 Thread Duncan Murdoch


On 26/10/2011 1:11 PM, Mark Newcomb wrote:

I'm hoping to use R for spatial analysis.  In working through examples in 
Chapt. 4 of Applied Spatial Data Analysis with R I've come across the following 
error in trying to plot lines with the meuse data set.  The text is verbatim 
from the book.

  m.sl <- SpatialLines(list(Lines(list(Line(cc)))))
Error in Lines(list(Line(cc))) : Single ID required

What does "Single ID required" mean?


That message is coming from a contributed package, not from base R.  You 
should say what package you're using, and you may need to contact the 
author or maintainer of it to get an answer.


Duncan Murdoch
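
For readers hitting the same message: it appears to come from the sp package, whose Lines() constructor takes an ID argument alongside the list of Line objects. The sketch below assumes sp is installed and uses made-up coordinates in place of the book's cc object; the ID value "a" is arbitrary.

```r
if (requireNamespace("sp", quietly = TRUE)) {
  # Made-up coordinates standing in for the book's 'cc' matrix
  cc <- cbind(x = c(0, 1, 2), y = c(0, 1, 0))

  # Supplying a single character ID to Lines() avoids the error
  m.sl <- sp::SpatialLines(list(sp::Lines(list(sp::Line(cc)), ID = "a")))
  print(class(m.sl))
}
```

Each Lines object in the list passed to SpatialLines() needs its own distinct ID.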


Re: [R] Logistic Regression - Variable Selection Methods With Prediction

2011-10-26 Thread Steve Lianoglou

Hi,

On Wed, Oct 26, 2011 at 12:35 PM, RAJ dheerajathr...@gmail.com wrote:
 Can I atleast get help with what pacakge to use for logistic
 regression with all possible models and do prediction. I know i can
 use regsubsets but i am not sure if it has any prediction functions to
 go with it.

Maybe you could try glmnet instead.

It doesn't give you all possible models, but rather the best one at
a given value for the penalty (lambda) parameter.

HTH,

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: [R] Logistic Regression - Variable Selection Methods With Prediction

2011-10-26 Thread Weidong Gu

Check glmulti package for all subset selection.

Weidong Gu

On Wed, Oct 26, 2011 at 12:35 PM, RAJ dheerajathr...@gmail.com wrote:
 Can I atleast get help with what pacakge to use for logistic
 regression with all possible models and do prediction. I know i can
 use regsubsets but i am not sure if it has any prediction functions to
 go with it.

 Thanks

 On Oct 25, 6:54 pm, RAJ dheerajathr...@gmail.com wrote:
 Hello,

 I am pretty new to R, I have always used SAS and SAS products. My
 target variable is binary ('Y' and 'N') and i have about 14 predictor
 variables. My goal is to compare different variable selection methods
 like Forward, Backward, All possible subsests. I am using
 misclassification rate to pick the winner method.

 This is what i have as of now,

 Reg <- glm(Graduation ~ ., DFtrain, family=binomial(link="logit"))
 step <- extractAIC(Reg, direction="forward")
 pred <- predict(Reg, DFtest, type="response")
 mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})
 This program actually works but I needed to check to make sure am
 doing this right. Also, I am getting the same misclassification rates
 for all different methods.

 I also tried to use

 Reg <- leaps(Graduation ~ ., DFtrain)
 pred <- predict(Reg, DFtest, type="response")
 mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})
                 #print(summary(mis))
 which doesnt work

 and

 Reg <- regsubsets(Graduation ~ ., DFtrain)
 pred <- predict(Reg, DFtest, type="response")
 mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})
                 #print(summary(mis))

 The Regsubsets will work but the 'predict' function does not work with
 it. Is there any other way to do predictions when using regsubsets

 Any help is appreciated.

 Thanks,


Re: [R] Logistic Regression - Variable Selection Methods With Prediction

2011-10-26 Thread Bert Gunter

You mean the glm()  _function_ in the stats package.

?glm

(just to avoid confusion)

-- Bert

On Wed, Oct 26, 2011 at 10:31 AM, steve_fried...@nps.gov wrote:

 Try the glm package

 Steve Friedman Ph. D.
 Ecologist  / Spatial Statistical Analyst
 Everglades and Dry Tortugas National Park
 950 N Krome Ave (3rd Floor)
 Homestead, Florida 33034

 steve_fried...@nps.gov
 Office (305) 224 - 4282
 Fax (305) 224 - 4147





-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

[[alternative HTML version deleted]]


Re: [R] Logistic Regression - Variable Selection Methods With Prediction

2011-10-26 Thread Marc Schwartz

The reason that you are not likely getting replies is that what you propose to 
do is considered a poor way of building models. 

You need to get out of the SAS Mindset.

I would suggest you obtain a copy of Frank Harrell's book:

  http://www.amazon.com/exec/obidos/ASIN/0387952322/

and then consider using his 'rms' package on CRAN for model-building 
strategies and validation.

Regards,

Marc Schwartz

On Oct 26, 2011, at 11:35 AM, RAJ wrote:

 Can I atleast get help with what pacakge to use for logistic
 regression with all possible models and do prediction. I know i can
 use regsubsets but i am not sure if it has any prediction functions to
 go with it.
 
 Thanks
 
 On Oct 25, 6:54 pm, RAJ dheerajathr...@gmail.com wrote:
 Hello,
 
 I am pretty new to R, I have always used SAS and SAS products. My
 target variable is binary ('Y' and 'N') and i have about 14 predictor
 variables. My goal is to compare different variable selection methods
 like Forward, Backward, All possible subsests. I am using
 misclassification rate to pick the winner method.
 
 This is what i have as of now,
 
 Reg <- glm(Graduation ~ ., DFtrain, family=binomial(link="logit"))
 step <- extractAIC(Reg, direction="forward")
 pred <- predict(Reg, DFtest, type="response")
 mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})
 This program actually works but I needed to check to make sure am
 doing this right. Also, I am getting the same misclassification rates
 for all different methods.
 
 I also tried to use
 
 Reg <- leaps(Graduation ~ ., DFtrain)
 pred <- predict(Reg, DFtest, type="response")
 mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})
 #print(summary(mis))
 which doesnt work
 
 and
 
 Reg <- regsubsets(Graduation ~ ., DFtrain)
 pred <- predict(Reg, DFtest, type="response")
 mis <- mean({pred > 0.5} != {DFtest[,"Graduation"] == "Y"})
 #print(summary(mis))
 
 The Regsubsets will work but the 'predict' function does not work with
 it. Is there any other way to do predictions when using regsubsets
 
 Any help is appreciated.
 
 Thanks,


Re: [R] Plot complete dataset

2011-10-26 Thread Nordlund, Dan (DSHS/RDA)

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of RMSOPS
 Sent: Wednesday, October 26, 2011 9:59 AM
 To: r-help@r-project.org
 Subject: [R] Plot complete dataset

 Hello,

  I am a new user of R, so I still have some basic difficulties.
 I'm trying to create a bar graph completely from reading a file.
The idea was on the x axis have the columns of the table
 Married ,Single,Divorced, widower
  the legend Ages
 18-34
 35-45
 46-64
 65-69
 70-74

 the dataset
 dataset
    Ages Married Single Divorced widower
 1 18-34    10.5   35.7      8.5     3.2
 2 35-45    12.4   22.4     22.2    12.6
 3 46-64    25.4   22.2     33.4    12.4
 4 65-69    36.7   31.4     12.4    35.2
 5 70-74    26.4   15.1      8.5    43.2

 The code for barplot

 barplot(dataset,dataset$Single, col = c(rainbow(dataset$Ages)), legend
 =
 rownames(dataset$Ages), ylim = c(0, 100))

 but I am not able to resolve.

 Thanks

You should go back and read the help for barplot.  Do you really want to plot 
the whole dataset (say as a stacked barplot)? Then something like this should 
do it.

barplot(as.matrix(dataset[,2:5]),
  col = c("lightblue", "mistyrose", "lightcyan", "lavender", "cornsilk"),
  legend = dataset$Ages, ylim = c(0, 100))

Your values don't add to 100, so I'm not sure what you actually want.  If this 
isn't what you want then give us more information.
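
If side-by-side bars per marital status are closer to what was intended, a grouped plot is one alternative. This is a sketch built from the values in the original post; the object name m, the rainbow palette and the y-axis limit are choices made for the example.

```r
dataset <- data.frame(
  Ages     = c("18-34", "35-45", "46-64", "65-69", "70-74"),
  Married  = c(10.5, 12.4, 25.4, 36.7, 26.4),
  Single   = c(35.7, 22.4, 22.2, 31.4, 15.1),
  Divorced = c(8.5, 22.2, 33.4, 12.4, 8.5),
  widower  = c(3.2, 12.6, 12.4, 35.2, 43.2)
)

# barplot() wants a numeric matrix: one column per x-axis group,
# one row per bar within the group
m <- as.matrix(dataset[, 2:5])
rownames(m) <- as.character(dataset$Ages)

barplot(m, beside = TRUE, col = rainbow(nrow(m)),
        legend.text = rownames(m), ylim = c(0, 50))
```

With beside = TRUE each age group gets its own bar inside every marital-status cluster, so the bars are not stacked and the totals no longer matter for the axis range.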

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204


[R] Extra Sums of Squares from an anova table - why are the values different?

2011-10-26 Thread Stephen Sefick

#For full disclosure- I am working on a homework problem.  However, my 
question revolves around computer rounding, I think.



x <- (structure(list(y = c(0.222, 0.395, 0.422, 0.437, 0.428, 0.467,
0.444, 0.378, 0.494, 0.456, 0.452, 0.112, 0.432, 0.101, 0.232,
0.306, 0.0923, 0.116, 0.0764, 0.439, 0.0944, 0.117, 0.0726, 0.0412,
0.251, 2e-05), x1 = c(7.3, 8.7, 8.8, 8.1, 9, 8.7, 9.3, 7.6, 10,
8.4, 9.3, 7.7, 9.8, 7.3, 8.5, 9.5, 7.4, 7.8, 7.7, 10.3, 7.8,
7.1, 7.7, 7.4, 7.3, 7.6), x2 = c(0, 0, 0.7, 4, 0.5, 1.5, 2.1,
5.1, 0, 3.7, 3.6, 2.8, 4.2, 2.5, 2, 2.5, 2.8, 2.8, 3, 1.7, 3.3,
3.9, 4.3, 6, 2, 7.8), x3 = c(0, 0.3, 1, 0.2, 1, 2.8, 1, 3.4,
0.3, 4.1, 2, 7.1, 2, 6.8, 6.6, 5, 7.8, 7.7, 8, 4.2, 8.5, 6.6,
9.5, 10.9, 5.2, 20.7), x11 = c(53.29, 75.69, 77.44, 65.61, 81,
75.69, 86.49, 57.76, 100, 70.56, 86.49, 59.29, 96.04, 53.29,
72.25, 90.25, 54.76, 60.84, 59.29, 106.09, 60.84, 50.41, 59.29,
54.76, 53.29, 57.76), x22 = c(0, 0, 0.49, 16, 0.25, 2.25, 4.41,
26.01, 0, 13.69, 12.96, 7.84, 17.64, 6.25, 4, 6.25, 7.84, 7.84,
9, 2.89, 10.89, 15.21, 18.49, 36, 4, 60.84), x33 = c(0, 0.09,
1, 0.04, 1, 7.84, 1, 11.56, 0.09, 16.81, 4, 50.41, 4, 46.24,
43.56, 25, 60.84, 59.29, 64, 17.64, 72.25, 43.56, 90.25, 118.81,
27.04, 428.49), x12 = c(0, 0, 6.16, 32.4, 4.5, 13.05, 19.53,
38.76, 0, 31.08, 33.48, 21.56, 41.16, 18.25, 17, 23.75, 20.72,
21.84, 23.1, 17.51, 25.74, 27.69, 33.11, 44.4, 14.6, 59.28),
x13 = c(0, 2.61, 8.8, 1.62, 9, 24.36, 9.3, 25.84, 3, 34.44,
18.6, 54.67, 19.6, 49.64, 56.1, 47.5, 57.72, 60.06, 61.6,
43.26, 66.3, 46.86, 73.15, 80.66, 37.96, 157.32), x23 = c(0,
0, 0.7, 0.8, 0.5, 4.2, 2.1, 17.34, 0, 15.17, 7.2, 19.88,
8.4, 17, 13.2, 12.5, 21.84, 21.56, 24, 7.14, 28.05, 25.74,
40.85, 65.4, 10.4, 161.46)), .Names = c("y", "x1", "x2",
"x3", "x11", "x22", "x33", "x12", "x13", "x23"), row.names = c(NA,
-26L), class = "data.frame")
)

x$x11 <- x$x1^2
x$x22 <- x$x2^2
x$x33 <- x$x3^2
x$x12 <- x$x1*x$x2
x$x13 <- x$x1*x$x3
x$x23 <- x$x2*x$x3

x.lm <- lm(y~x1+x2+x3+x11+x22+x33+x12+x13+x23, data=x)

anova(lm(y~x1+x2+x3,data=x), x.lm)

anova(x.lm)

#I want to test

#Ho:y~x1+x2+x3
#Ha:y~x1+x2+x3+x11+x22+x33+x12+x13+x23

((0.00945+0.01340+0.00200+0.00568+0.00489+0.00050)/6)/(0.00371)

#Thanks

#Stephen Sefick


Re: [R] Correlation Matrix in R

2011-10-26 Thread William Revelle

Alex,

corr.test in psych will give you a matrix of correlations, a matrix of sample 
sizes, and a matrix of probabilities.

You can combine the correlations and the probabilities to form what you want:

  try the following:

 library(psych)
 examp <- corr.test(sat.act)
 mat.c.p <- lower.tri(examp$r)*examp$r + t(lower.tri(examp$p)*examp$p)
 mat.c.p

Bill
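
The same layout can be sketched in base R without psych, in case it helps to see what the combined matrix contains. The function name cor_p_matrix is invented for the example and mtcars is used only as convenient built-in data; correlations go below the diagonal and unadjusted cor.test() p values above it.

```r
# Hypothetical helper: correlations in the lower triangle,
# p values from cor.test() in the upper triangle
cor_p_matrix <- function(df) {
  out <- cor(df, use = "pairwise.complete.obs")
  k <- ncol(df)
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) {
      out[i, j] <- cor.test(df[[i]], df[[j]])$p.value
    }
  }
  out
}

m <- cor_p_matrix(mtcars[, 1:4])
print(round(m, 4))

# An ordinary matrix, so it can be saved with write.table()
write.table(m, file = tempfile(fileext = ".txt"), sep = "\t", quote = FALSE)
```

The diagonal stays at 1 here, whereas the psych one-liner above leaves it at 0.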



On Oct 26, 2011, at 6:03 AM, AlexC wrote:

 Thank you for your quick reply and helpful advice.
 
 Using this argument allows me to do what I needed to do
 
 Now the only other thing I wanted to accomplish was to obtain the top half
 of the matrix with p values 
 and the bottom half with the correlations, to observe the significant
 correlations.  I have been able to use a few functions such as rcorr, and
 cor.matrix to get such information but it isn't output in a format that I
 can save with the write.table function or write.clipboard
 
 the pair function allows a graphical display of the data on the other hand
 (with correlation graphics on the bottom half) and I have added an argument
 which allows to view the significant p values.  But I wanted to know if I
 could also do the above easily.
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Correlation-Matrix-in-R-tp3938274p3940170.html
 Sent from the R help mailing list archive at Nabble.com.
 
 

William Revelle             http://personality-project.org/revelle.html
Professor                   http://personality-project.org
Department of Psychology    http://www.wcas.northwestern.edu/psych/
Northwestern University     http://www.northwestern.edu/
Use R for psychology        http://personality-project.org/r
It is 6 minutes to midnight http://www.thebulletin.org


[R] survival: fitting equation to survival curve?

2011-10-26 Thread Lancaster, Robert (Orbitz)

Given a survfit object, is it possible to fit an equation to the resulting 
survival curve?  What about with a coxph or survreg object?
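
One common reading of "fitting an equation" is a parametric survival model. The sketch below assumes the survival package is available and uses its lung data purely for illustration; it fits an intercept-only Weibull model with survreg() and converts the coefficients into the closed form S(t) = exp(-(t/scale)^shape). The names shape, scale_w and S are made up for the example.

```r
library(survival)

# Intercept-only Weibull fit; 'lung' ships with the survival package
fit <- survreg(Surv(time, status) ~ 1, data = lung, dist = "weibull")

# survreg() models log(time), so convert to the usual Weibull parameters
shape   <- 1 / fit$scale
scale_w <- exp(coef(fit)[[1]])

# Fitted survival function as an explicit equation
S <- function(t) exp(-(t / scale_w)^shape)

# Compare with the Kaplan-Meier estimate from survfit() by eye
km <- survfit(Surv(time, status) ~ 1, data = lung)
plot(km, conf.int = FALSE, xlab = "Days", ylab = "Survival")
curve(S(x), from = 0, to = max(lung$time), add = TRUE, lty = 2)
```

A coxph object leaves the baseline hazard unspecified, so it does not yield a closed-form curve in this way; the Weibull choice here is an assumption, not the only option.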

TIA,
Rob

[[alternative HTML version deleted]]


Re: [R] extract data for specific levels factor

2011-10-26 Thread Andrés Aragón

Dear all,
Thanks for your help.
The option from Sarah and Dan is just what I wanted:

ff1 <- mydata[mydata$cat %in% c("wish1", "wish2", "wish3"), ]

I then used ff1 in ggplot2 without problems.

Dennis's reshape2 option produced an incoherent result: the resulting
object contained the data for the categorical variable but not the
data for "ind".

Dennis's ggplot2 option worked well, but when I put more than one
classification in categ I get the following output:

 ggplot(subset(mydata, categ=='folic tot', 'porc fol marc'), 
 aes(age,ind))+geom_point()
Error in `[.data.frame`(x, r, vars, drop = drop) :
  undefined columns selected


I tried the same with ddply:
qq <- ddply(mydata, .(ind), subset, categ=("porc fol tot", "porc fol marc"))
Error: unexpected ',' in "qq <- ddply(mydata, .(ind), subset,
categ=("porc fol tot","

Any idea about the last one?
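
Both errors seem to stem from how the levels are passed: the second level ends up as a separate argument to subset(), and a bare comma inside parentheses is not valid R. A sketch of the usual pattern with %in% follows; the small data frame reuses a few rows from the original post, with the category column named categ as in the calls above.

```r
mydata <- data.frame(
  ind   = c(40.2, 41.9, 68.9, 71.5, 67.5, 76.9),
  categ = c("por fol peq", "por fol med", "por fol preov",
            "por fol peq", "por fol med", "por fol preov"),
  tx    = c("vh", "vh", "vh", "ser", "ser", "ser"),
  age   = c(35, 35, 35, 37, 37, 37)
)

# Several levels at once: put them in a vector and test with %in%
wanted <- c("por fol peq", "por fol med")
sel <- subset(mydata, categ %in% wanted)
print(sel)
```

The result can be passed directly as the data argument of ggplot(), in the same way as ff1 above.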

Andrés AM

2011/10/25, Dennis Murphy djmu...@gmail.com:
 Are you trying to separate the substrings in cat? If so, one way is to
 use the colsplit() function in the reshape2 package, something like
 (untested since you did not provide a suitable data format with which
 to work):

 library('reshape2')
 splitcat <- colsplit(mydata$cat, ' ', names = c('fat', 'bat', 'rat'))
 moredat <- cbind(mydata, splitcat)

 Other options that might pertain to your request include:

 (i) subset(mydata, cat == 'por fol pec')
 which you can use as a data argument inside ggplot2 - e.g.,

 ggplot(subset(mydata, cat == 'por fol pec'), aes(x = age, y = ind)) +
geom_point()

 (ii) use faceting to get individual plots by factor level of cat - e.g.,
 ggplot(mydata, aes(x = age, y = ind)) +
geom_point() +
facet_grid( ~ cat)

 Hope that one of these is close to the bullseye...
 Dennis


 2011/10/25 Andrés Aragón armand...@gmail.com:
 Dear all,

 I'm trying to analyze data with the following structure:

 ind   cat            tx   age
 40.2  por fol peq    vh   35
 41.9  por fol med    vh   35
 68.9  por fol preov  vh   35
 71.5  por fol peq    ser  37
 67.5  por fol med    ser  37
 76.9  por fol preov  ser  37
 78.7  por fol peq    otr  37
 78.3  por fol med    otr  37
 82.1  por fol preov  otr  37
 83.9  por fol peq    vh   37
 80.6  por fol med    vh   37
 76.1  por fol preov  vh   37
 86.9  por fol peq    ser  35
 97.7  por fol med    ser  35
 62.3  por fol preov  ser  35



 I want to select only some of the factor levels (e.g. "por fol peq"
 in the "cat" column). I am using ggplot2 and I can only plot all of
 the levels together, not separately. I tried ddply without success.
 Any help is welcome.

 Andrés





Re: [R] Extra Sums of Squares from an anova table - why are the values different?

2011-10-26 Thread Joshua Wiley

Hi Stephen,

Thanks for the disclosure.  If you are referring to the difference in
the third decimal place between your calculated F value and what R
gives, yes, it is due to rounding.  Try this:

## extract the mean squares from anova() and store in msq
msq <- anova(x.lm)[, "Mean Sq"]

mean(msq[4:9])/msq[10]

Cheers,

Josh
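
To see why the two routes agree apart from rounding, here is a small sketch on simulated data (the data frame d and the model names are made up; this is not the homework data). With one degree of freedom per term, the mean of the six extra mean squares divided by the residual mean square equals the F statistic reported by anova(reduced, full).

```r
set.seed(42)
d <- data.frame(x1 = runif(30), x2 = runif(30), x3 = runif(30))
d$y <- 1 + d$x1 - d$x2 + rnorm(30, sd = 0.1)

reduced <- lm(y ~ x1 + x2 + x3, data = d)
full <- lm(y ~ x1 + x2 + x3 + I(x1^2) + I(x2^2) + I(x3^2) +
             x1:x2 + x1:x3 + x2:x3, data = d)

# Sequential table: rows 1-3 main effects, rows 4-9 the extra terms,
# row 10 the residuals
msq <- anova(full)[, "Mean Sq"]
F_by_hand <- mean(msq[4:9]) / msq[10]

# The same test done directly as a model comparison
F_anova <- anova(reduced, full)[2, "F"]

print(c(by_hand = F_by_hand, anova = F_anova))
```

Using the unrounded mean squares from the table, rather than the printed values, removes the discrepancy in the third decimal place.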

On Wed, Oct 26, 2011 at 11:19 AM, Stephen Sefick ssef...@gmail.com wrote:
 #For full disclosure- I am working on a homework problem.  However, my
 question revolves around computer rounding, I think.


 x <- (structure(list(y = c(0.222, 0.395, 0.422, 0.437, 0.428, 0.467,
 0.444, 0.378, 0.494, 0.456, 0.452, 0.112, 0.432, 0.101, 0.232,
 0.306, 0.0923, 0.116, 0.0764, 0.439, 0.0944, 0.117, 0.0726, 0.0412,
 0.251, 2e-05), x1 = c(7.3, 8.7, 8.8, 8.1, 9, 8.7, 9.3, 7.6, 10,
 8.4, 9.3, 7.7, 9.8, 7.3, 8.5, 9.5, 7.4, 7.8, 7.7, 10.3, 7.8,
 7.1, 7.7, 7.4, 7.3, 7.6), x2 = c(0, 0, 0.7, 4, 0.5, 1.5, 2.1,
 5.1, 0, 3.7, 3.6, 2.8, 4.2, 2.5, 2, 2.5, 2.8, 2.8, 3, 1.7, 3.3,
 3.9, 4.3, 6, 2, 7.8), x3 = c(0, 0.3, 1, 0.2, 1, 2.8, 1, 3.4,
 0.3, 4.1, 2, 7.1, 2, 6.8, 6.6, 5, 7.8, 7.7, 8, 4.2, 8.5, 6.6,
 9.5, 10.9, 5.2, 20.7), x11 = c(53.29, 75.69, 77.44, 65.61, 81,
 75.69, 86.49, 57.76, 100, 70.56, 86.49, 59.29, 96.04, 53.29,
 72.25, 90.25, 54.76, 60.84, 59.29, 106.09, 60.84, 50.41, 59.29,
 54.76, 53.29, 57.76), x22 = c(0, 0, 0.49, 16, 0.25, 2.25, 4.41,
 26.01, 0, 13.69, 12.96, 7.84, 17.64, 6.25, 4, 6.25, 7.84, 7.84,
 9, 2.89, 10.89, 15.21, 18.49, 36, 4, 60.84), x33 = c(0, 0.09,
 1, 0.04, 1, 7.84, 1, 11.56, 0.09, 16.81, 4, 50.41, 4, 46.24,
 43.56, 25, 60.84, 59.29, 64, 17.64, 72.25, 43.56, 90.25, 118.81,
 27.04, 428.49), x12 = c(0, 0, 6.16, 32.4, 4.5, 13.05, 19.53,
 38.76, 0, 31.08, 33.48, 21.56, 41.16, 18.25, 17, 23.75, 20.72,
 21.84, 23.1, 17.51, 25.74, 27.69, 33.11, 44.4, 14.6, 59.28),
    x13 = c(0, 2.61, 8.8, 1.62, 9, 24.36, 9.3, 25.84, 3, 34.44,
    18.6, 54.67, 19.6, 49.64, 56.1, 47.5, 57.72, 60.06, 61.6,
    43.26, 66.3, 46.86, 73.15, 80.66, 37.96, 157.32), x23 = c(0,
    0, 0.7, 0.8, 0.5, 4.2, 2.1, 17.34, 0, 15.17, 7.2, 19.88,
    8.4, 17, 13.2, 12.5, 21.84, 21.56, 24, 7.14, 28.05, 25.74,
     40.85, 65.4, 10.4, 161.46)), .Names = c("y", "x1", "x2",
 "x3", "x11", "x22", "x33", "x12", "x13", "x23"), row.names = c(NA,
 -26L), class = "data.frame")
 )

 x$x11 <- x$x1^2
 x$x22 <- x$x2^2
 x$x33 <- x$x3^2
 x$x12 <- x$x1*x$x2
 x$x13 <- x$x1*x$x3
 x$x23 <- x$x2*x$x3

 x.lm <- lm(y~x1+x2+x3+x11+x22+x33+x12+x13+x23, data=x)

 anova(lm(y~x1+x2+x3,data=x), x.lm)

 anova(x.lm)

 #I want to test

 #Ho:y~x1+x2+x3
 #Ha:y~x1+x2+x3+x11+x22+x33+x12+x13+x23

 ((0.00945+0.01340+0.00200+0.00568+0.00489+0.00050)/6)/(0.00371)

 #Thanks

 #Stephen Sefick





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


[R] survival: fitting equation to survival curve?

2011-10-26 Thread Lancaster, Robert (Orbitz)

Given a survfit object, is it possible to fit an equation to the resulting 
survival curve?  Is this possible?  What about with a coxph or survreg object?

TIA,
Rob


[[alternative HTML version deleted]]


Re: [R] Extra Sums of Squares from an anova table - why are the values different?

2011-10-26 Thread Stephen Sefick

I was referring to the 3rd decimal place and beyond.  Thanks that did 
the trick.  I was trying to compare the two to make sure that I knew how 
to do it by hand.  Thanks for all of your help.


Stephen

On Wed 26 Oct 2011 02:23:02 PM CDT, Joshua Wiley wrote:


Hi Stephen,

Thanks for the disclosure. If you are referring to the difference in
the third decimal place between your calculated F value and what R
gives, yes, it is due to rounding. Try this:

## extract the mean squares from anova() and store in msq
msq <- anova(x.lm)[, "Mean Sq"]

mean(msq[4:9])/msq[10]
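
To see why the two numbers agree once full precision is kept, here is a small self-contained sketch of the same pattern. The data frame d and its columns are simulated stand-ins (not Stephen's data), and the models are hypothetical; the only thing taken from the reply above is the idea of averaging the sequential mean squares of the added terms and dividing by the residual mean square. This shortcut assumes each added term has 1 degree of freedom.

```r
set.seed(1)
# Simulated stand-in data (an assumption, not the data from the thread)
d <- data.frame(x1 = rnorm(30), x2 = rnorm(30), x3 = rnorm(30))
d$y <- 1 + d$x1 + 0.5 * d$x2 + rnorm(30)

reduced <- lm(y ~ x1 + x2 + x3, data = d)
full    <- lm(y ~ x1 + x2 + x3 + I(x1^2) + I(x2^2), data = d)

# Sequential (Type I) mean squares: rows 1-3 are x1..x3,
# rows 4-5 are the added terms, row 6 is the residual
msq <- anova(full)[, "Mean Sq"]

# Extra-sum-of-squares F computed "by hand" at full precision
f_by_hand <- mean(msq[4:5]) / msq[6]

# The same F statistic from the model comparison
f_from_anova <- anova(reduced, full)$F[2]

all.equal(f_by_hand, f_from_anova)
```

Typing the rounded mean squares from the printed table into a calculator, as in the original post, loses digits, which is where a third-decimal discrepancy comes from.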

Cheers,

Josh

On Wed, Oct 26, 2011 at 11:19 AM, Stephen Sefick ssef...@gmail.com wrote:


#For full disclosure- I am working on a homework problem. However, my
question revolves around computer rounding, I think.


x <- (structure(list(y = c(0.222, 0.395, 0.422, 0.437, 0.428, 0.467,
0.444, 0.378, 0.494, 0.456, 0.452, 0.112, 0.432, 0.101, 0.232,
0.306, 0.0923, 0.116, 0.0764, 0.439, 0.0944, 0.117, 0.0726, 0.0412,
0.251, 2e-05), x1 = c(7.3, 8.7, 8.8, 8.1, 9, 8.7, 9.3, 7.6, 10,
8.4, 9.3, 7.7, 9.8, 7.3, 8.5, 9.5, 7.4, 7.8, 7.7, 10.3, 7.8,
7.1, 7.7, 7.4, 7.3, 7.6), x2 = c(0, 0, 0.7, 4, 0.5, 1.5, 2.1,
5.1, 0, 3.7, 3.6, 2.8, 4.2, 2.5, 2, 2.5, 2.8, 2.8, 3, 1.7, 3.3,
3.9, 4.3, 6, 2, 7.8), x3 = c(0, 0.3, 1, 0.2, 1, 2.8, 1, 3.4,
0.3, 4.1, 2, 7.1, 2, 6.8, 6.6, 5, 7.8, 7.7, 8, 4.2, 8.5, 6.6,
9.5, 10.9, 5.2, 20.7), x11 = c(53.29, 75.69, 77.44, 65.61, 81,
75.69, 86.49, 57.76, 100, 70.56, 86.49, 59.29, 96.04, 53.29,
72.25, 90.25, 54.76, 60.84, 59.29, 106.09, 60.84, 50.41, 59.29,
54.76, 53.29, 57.76), x22 = c(0, 0, 0.49, 16, 0.25, 2.25, 4.41,
26.01, 0, 13.69, 12.96, 7.84, 17.64, 6.25, 4, 6.25, 7.84, 7.84,
9, 2.89, 10.89, 15.21, 18.49, 36, 4, 60.84), x33 = c(0, 0.09,
1, 0.04, 1, 7.84, 1, 11.56, 0.09, 16.81, 4, 50.41, 4, 46.24,
43.56, 25, 60.84, 59.29, 64, 17.64, 72.25, 43.56, 90.25, 118.81,
27.04, 428.49), x12 = c(0, 0, 6.16, 32.4, 4.5, 13.05, 19.53,
38.76, 0, 31.08, 33.48, 21.56, 41.16, 18.25, 17, 23.75, 20.72,
21.84, 23.1, 17.51, 25.74, 27.69, 33.11, 44.4, 14.6, 59.28),
x13 = c(0, 2.61, 8.8, 1.62, 9, 24.36, 9.3, 25.84, 3, 34.44,
18.6, 54.67, 19.6, 49.64, 56.1, 47.5, 57.72, 60.06, 61.6,
43.26, 66.3, 46.86, 73.15, 80.66, 37.96, 157.32), x23 = c(0,
0, 0.7, 0.8, 0.5, 4.2, 2.1, 17.34, 0, 15.17, 7.2, 19.88,
8.4, 17, 13.2, 12.5, 21.84, 21.56, 24, 7.14, 28.05, 25.74,
40.85, 65.4, 10.4, 161.46)), .Names = c("y", "x1", "x2",
"x3", "x11", "x22", "x33", "x12", "x13", "x23"), row.names = c(NA,
-26L), class = "data.frame")
)

x$x11 <- x$x1^2
x$x22 <- x$x2^2
x$x33 <- x$x3^2
x$x12 <- x$x1*x$x2
x$x13 <- x$x1*x$x3
x$x23 <- x$x2*x$x3

x.lm <- lm(y~x1+x2+x3+x11+x22+x33+x12+x13+x23, data=x)

anova(lm(y~x1+x2+x3,data=x), x.lm)

anova(x.lm)

#I want to test

#Ho:y~x1+x2+x3
#Ha:y~x1+x2+x3+x11+x22+x33+x12+x13+x23

((0.00945+0.01340+0.00200+0.00568+0.00489+0.00050)/6)/(0.00371)

#Thanks

#Stephen Sefick


[R] FOR loop with statistical analysis for microarray data

2011-10-26 Thread Seb

Hi all,

I recently started using R and I am stuck trying to analyze microarray
data.

I use the affy package to obtain the probe intensities. I have two
controls (CTRs) and two treated samples:

          HG.U133A.Experiment1.CEL HG.U133A.Experiment2.CEL HG.U133A_Control1.CEL HG.U133A_Control2.CEL
1007_s_at               2156.23115                467.75615             364.60615             362.11865
1053_at                   88.76368                 93.58436             438.49365             357.75615
117_at                   144.00743                101.26120              95.7                 107.01623
121_at                   551.36865                639.45615             456.66865             435.95615
1255_g_at                 65.33164                 18.39570              14.22565              20.74632
1294_at                  106.19083                169.69369              78.15722              81.14689

I split the columns into two data.frames to separate the experiment
samples from the controls.

Then I wrote a for loop that builds, for each row (gene), one vector of
two values per group, and I wanted to run wilcox.test to find the
significant genes. But I get a list of NULLs, as you can see here: the
first row works, but then I get NULL all the way down to the end of the
array...

               fc        pv
[1,] 1007_s_at -20.248   0.4664612
[2,] 1053_at   -344.7132 NULL
[3,] 117_at    NULL      NULL
[4,] 121_at    NULL      NULL
[5,] 1255_g_at NULL      NULL
[6,] 1294_at   NULL      NULL

the script i used is:
===
fc=0
pv=0
for (i in 1:nrow(data))
{
v1= c(y1[i,1], y1[i,2]) 
v2= c(y2[i,1], y2[1,2])
fc=v1-v2
w=t.test(v1,v2)
pv=w$p.value
fc[i]= w[1]
pv[i]= w[2]
}

results = cbind(row.names(y1), fc, pv)

head(results)



what did i do wrong? i can't find a way around this!!!

thanks so much!!!

Seb


Re: [R] Correlation Matrix in R

2011-10-26 Thread Mark Podolsky

Hi,

rcor.test in library(ltm) will provide a correlation matrix with p-values on 
the bottom-half of the matrix.
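
If the goal is a single numeric matrix that write.table can save, a base-R sketch along these lines may also work. It does not use ltm: cor_p_matrix is a hypothetical helper written for this illustration, mtcars is just example data, and the layout (correlations below the diagonal, p-values above it, 1 on the diagonal) follows the arrangement asked for in the quoted message.

```r
# Hypothetical helper: correlations in the lower triangle,
# p-values (from cor.test) in the upper triangle
cor_p_matrix <- function(d) {
  k <- ncol(d)
  out <- diag(1, k)                       # 1 on the diagonal (a layout choice)
  dimnames(out) <- list(names(d), names(d))
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) {
      ct <- cor.test(d[[i]], d[[j]])      # Pearson by default
      out[j, i] <- ct$estimate            # lower triangle: correlation
      out[i, j] <- ct$p.value             # upper triangle: p-value
    }
  }
  out
}

m <- cor_p_matrix(mtcars[, c("mpg", "hp", "wt")])

# Prints a tab-separated table; pass file = "..." to save it instead
write.table(round(m, 4), sep = "\t", quote = FALSE)
```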


Mark

On 2011-10-26, at 7:03 AM, AlexC wrote:

 Thank you for your quick reply and helpful advice.
 
 Using this argument allows me to do what I needed to do
 
 Now the only other thing I wanted to accomplish was to obtain the top half
 of the matrix with p values 
 and the bottom half with the correlations, to observe the significant
 correlations.  I have been able to use a few functions such as rcorr, and
 cor.matrix to get such information but it isn't output in a format that I
 can save with the write.table function or write.clipboard
 
 the pair function allows a graphical display of the data on the other hand
 (with correlation graphics on the bottom half) and I have added an argument
 which allows to view the significant p values.  But I wanted to know if I
 could also do the above easily.
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Correlation-Matrix-in-R-tp3938274p3940170.html
 Sent from the R help mailing list archive at Nabble.com.
 

Re: [R] FOR loop with statistical analysis for microarray data

2011-10-26 Thread David Winsemius

affy is a bioconductor package. You should be asking this question  
on the bioc mailing list.


--
David.


David Winsemius, MD
West Hartford, CT


Re: [R] Counting the number of marginals

2011-10-26 Thread Jim Silverton

Dear all,

I have two matrices, let's call them A and B, each of which is a 100 x 3
matrix. I take the corresponding row from each matrix and form 100 2 x 3
tables. If we call the column sums of each 2 x 3 table n1, n2 and n3, I
would like to compute the following probability:

the number of tables with (n1 = a, n2 = b) divided by the total number
of tables. It's the probability that a table has a particular pair of
column totals.

So if A was

0  2  3
0  2  3
1  2  4
2  3  3


B was
1  0  3
1  0  2
1  2  4
2  3  3


The 2 x 3 tables would be:

0  2  3
1  0  3
Totals (1,2)  # first 2 totals

0  2  3
1  0  2
Totals (1,2)

1  2  4
1  2  4
Totals (2,4)

2  3  3
2  3  3
Totals (4,6)



The expected probabilities I should get are:

0.5, 0.5, 0.25 and 0.25 for each of the 2 x 3 tables.
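
For what it's worth, the arithmetic in the small example above can be sketched as follows. The matrices are the 4 x 3 toy versions from this post, and the variable names tot, key and prob are made up for the illustration. Because each 2 x 3 table stacks one row of A on one row of B, its column totals are simply the matching row of A + B.

```r
A <- matrix(c(0, 2, 3,
              0, 2, 3,
              1, 2, 4,
              2, 3, 3), ncol = 3, byrow = TRUE)
B <- matrix(c(1, 0, 3,
              1, 0, 2,
              1, 2, 4,
              2, 3, 3), ncol = 3, byrow = TRUE)

# Row i of tot holds the column totals (n1, n2, n3) of the i-th 2 x 3 table
tot <- A + B

# Label each table by its first two totals, then count matching labels
key  <- paste(tot[, 1], tot[, 2])
prob <- as.vector(table(key)[key]) / nrow(tot)
prob
```

For the toy data this gives 0.5, 0.5, 0.25 and 0.25, matching the expected values listed above.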

Any help is greatly appreciated.


-- 
Thanks,
Jim.


[R] Where can I find cmeans {e1071} package?

2011-10-26 Thread Rui Esteves

Hello,

I need a Fuzzy C Means algorithm.
I found some documentation about cmeans  {e1071} at
http://rss.acs.unt.edu/Rdoc/library/e1071/html/cmeans.html
Does anyone know where I can find it?

Thank you
Rui


Re: [R] Where can I find cmeans {e1071} package?

2011-10-26 Thread Sarah Goslee

On Wed, Oct 26, 2011 at 6:40 PM, Rui Esteves ruimax...@gmail.com wrote:
 Hello,

 I need a Fuzzy C Means algorithm.
 I found some documentation about cmeans  {e1071} at
 http://rss.acs.unt.edu/Rdoc/library/e1071/html/cmeans.html
 Does anyone know where I can find it?

e1071 is a package, and  you can use install.packages() from R to install
it, or download it directly from the CRAN mirror nearest you.

http://cran.r-project.org/
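
As a rough sketch of what that looks like in practice (the data here are simulated, and the choices of two clusters and fuzziness m = 2 are assumptions for illustration only):

```r
# One-time installation from CRAN (commented out so the script can be re-run)
# install.packages("e1071")
library(e1071)

set.seed(42)
# Simulated two-dimensional data with two loose groups (not real data)
x <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))

# Fuzzy c-means with 2 clusters and fuzziness exponent m = 2
cl <- cmeans(x, centers = 2, m = 2)

cl$centers            # cluster centres
head(cl$membership)   # degree of membership of each point in each cluster
```

See ?cmeans after loading the package for the full list of arguments.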

This is a very basic question; I suspect you'd benefit from reading one of
the many Introduction to R documents available online.

Sarah

 Thank you
 Rui





-- 
Sarah Goslee
http://www.functionaldiversity.org


Re: [R] FOR loop with statistical analysis for microarray data

2011-10-26 Thread Weidong Gu

If you had provided example data (y1 and y2 in the loop), you might
already have received specific help. A few things in your loop look
suspicious: fc and pv are vectors, yet in each iteration you reassign
the whole vector and then a specific index of it. That may be causing
your problems.
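
One way the loop could be restructured along these lines is sketched below. The y1 and y2 here are made-up stand-ins built from the first three probes in the post, and defining fc as the difference of group means is an assumption, since the original script does not say which measure was intended. Note also that the posted script indexes y2[1,2] where y2[i,2] was presumably meant.

```r
# Hypothetical stand-ins for the poster's y1 (treated) and y2 (controls)
y1 <- data.frame(e1 = c(2156.23115, 88.76368, 144.00743),
                 e2 = c(467.75615, 93.58436, 101.26120),
                 row.names = c("1007_s_at", "1053_at", "117_at"))
y2 <- data.frame(c1 = c(364.60615, 438.49365, 95.7),
                 c2 = c(362.11865, 357.75615, 107.01623),
                 row.names = row.names(y1))

n  <- nrow(y1)
fc <- numeric(n)   # preallocate once, outside the loop
pv <- numeric(n)

for (i in seq_len(n)) {
  v1 <- unlist(y1[i, ])            # the two treated values for gene i
  v2 <- unlist(y2[i, ])            # the two control values for gene i
  w  <- t.test(v1, v2)             # same test as in the posted script
  fc[i] <- mean(v1) - mean(v2)     # assumed definition of fc
  pv[i] <- w$p.value               # store only element i, never the whole vector
}

results <- data.frame(probe = row.names(y1), fc = fc, pv = pv)
results
```

With only two values per group, any such test has very little power, so the p-values should be read with caution.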

Weidong Gu




Re: [R] Example(chron) doesn't work

2011-10-26 Thread hchui

It works with Rgui vanilla, R version 2.13.1. I'll check it again when I
install R version 2.13.2. 

Many thanks!

> "C:\\Program Files\\R\\R-2.13.1\\bin\\x64\\Rgui.exe --vanilla"
[1] "C:\\Program Files\\R\\R-2.13.1\\bin\\x64\\Rgui.exe --vanilla"
> library(chron)
Warning message:
package 'chron' was built under R version 2.13.2 
> example(chron)

chron> dts <- dates(c("02/27/92", "02/27/92", "01/14/92",
chron+                "02/28/92", "02/01/92"))

chron> dts
[1] 02/27/92 02/27/92 01/14/92 02/28/92 02/01/92

chron> # [1] 02/27/92 02/27/92 01/14/92 02/28/92 02/01/92
chron> tms <- times(c("23:03:20", "22:29:56", "01:03:30",
chron+                "18:21:03", "16:56:26"))

chron> tms
[1] 23:03:20 22:29:56 01:03:30 18:21:03 16:56:26

chron> # [1] 23:03:20 22:29:56 01:03:30 18:21:03 16:56:26
chron> x <- chron(dates = dts, times = tms)

chron> x
[1] (02/27/92 23:03:20) (02/27/92 22:29:56) (01/14/92 01:03:30)
[4] (02/28/92 18:21:03) (02/01/92 16:56:26)

chron> # [1] (02/27/92 23:03:19) (02/27/92 22:29:56) (01/14/92 01:03:30)
chron> # [4] (02/28/92 18:21:03) (02/01/92 16:56:26)
chron> 
chron> # We can add or subtract scalars (representing days) to dates or
chron> # chron objects:
chron> c(dts[1], dts[1] + 10)
[1] 02/27/92 03/08/92

chron> # [1] 02/27/92 03/08/92
chron> dts[1] - 31
[1] 01/27/92

chron> # [1] 01/27/92
chron> 
chron> # We can substract dates which results in a times object that
chron> # represents days between the operands:
chron> dts[1] - dts[3]
Time in days:
[1] 44

chron> # Time in days:
chron> # [1] 44
chron> 
chron> # Logical comparisons work as expected:
chron> dts[dts > "01/25/92"]
[1] 02/27/92 02/27/92 02/28/92 02/01/92

chron> # [1] 02/27/92 02/27/92 02/28/92 02/01/92
chron> dts > dts[3]
[1]  TRUE  TRUE FALSE  TRUE  TRUE

chron> # [1]  TRUE  TRUE FALSE  TRUE  TRUE
chron> 
chron> # Summary operations which are sensible are permitted and work as
chron> # expected:
chron> range(dts)
[1] 01/14/92 02/28/92

chron> # [1] 01/14/92 02/28/92
chron> diff(x)
Time in days:
[1]  -0.02319444 -44.89335648  45.72052083 -27.05876157

chron> # Time in days:
chron> # [1]  -0.02319444 -44.89335648  45.72052083 -27.05876157
chron> sort(dts)[1:3]
[1] 01/14/92 02/01/92 02/27/92

chron> # [1] 01/14/92 02/01/92 02/27/92
chron> 


--
View this message in context: 
http://r.789695.n4.nabble.com/Example-chron-doesn-t-work-tp801580p3942640.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] SpatialLines

2011-10-26 Thread MacQueen, Don

In addition to which, R-sig-geo would be a better place to ask.
-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 10/26/11 10:39 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 26/10/2011 1:11 PM, Mark Newcomb wrote:
 I'm hoping to use R for spatial analysis.  In working through examples
in Chapt. 4 of Applied Spatial Data Analysis with R I've come across the
following error in trying to plot lines with the meuse data set.  The
text is verbatim from the book.

   m.sl <- SpatialLines(list(Lines(list(Line(cc)))))
 Error in Lines(list(Line(cc))) : Single ID required

 What does "Single ID required" mean?

That message is coming from a contributed package, not from base R.  You
should say what package you're using, and you may need to contact the
author or maintainer of it to get an answer.

Duncan Murdoch
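
For readers who land here with the same error: assuming the call comes from the sp package (the package that accompanies the book), Lines() takes an ID argument, and the message suggests it was not supplied. A minimal sketch, with made-up coordinates standing in for the book's cc object:

```r
library(sp)

# Hypothetical coordinates; in the book, cc is built from the meuse data
cc <- cbind(c(0, 1, 2), c(0, 1, 0))

# Give the Lines object an explicit single ID
m.sl <- SpatialLines(list(Lines(list(Line(cc)), ID = "1")))

class(m.sl)
```

Whether this matches the exact version of the code in the book has not been checked here; see ?Lines in sp for the current argument list.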


[R] Webscraping - How to Scrape Out Text Into R As If Copied Pasted From Webpage?

2011-10-26 Thread Moser, Gary

Greetings,

I am trying to get all of the text from a web page as if I had selected
everything on the page, pasted it into a text file, and then read the
text file in with read.csv().

# this is the actual page I'm trying to acquire text from:
web.pg <- readLines("http://www.airweb.org/?page=574")

# then parsed in hopes of an easier structure to work with:
web.pg <- htmlTreeParse(file=web.pg, ignoreBlanks=TRUE)

Now I have a lovely html tree, but don't know the best way to get just
the text components (job descriptions, job titles, etc...) as they
appear on the web site. I'd like to do a little text mining and make a
wordcloud using the text. Can anybody suggest a method to achieve this
result?

Thank you,

Gary R. Moser
Institutional Research Analyst
Heald College
p - 415.808.1533
f - 415.808.1598
gary_mo...@heald.edu




Re: [R] Webscraping - How to Scrape Out Text Into R As If Copied Pasted From Webpage?

2011-10-26 Thread Henrique Dallazuanna

Use an XPath query:

web.pg <- htmlTreeParse(file=web.pg, ignoreBlanks=TRUE, useInternalNodes = TRUE)

# Job title
xpathApply(web.pg, "//span[@class='normal']//b", xmlValue)
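
The same idea can be tried offline on a small inline page. The HTML string below is invented for the illustration (it is not the airweb.org markup), and the second query, which collects every non-blank text node under body, is one possible way to approximate "select all and copy"; it is a sketch, not a guaranteed match for how a browser renders the page.

```r
library(XML)

# Made-up HTML standing in for the real page
html <- paste0(
  "<html><body>",
  "<span class='normal'><b>Research Analyst</b> Full-time post.</span>",
  "<p>Apply by mail.</p>",
  "</body></html>"
)

doc <- htmlTreeParse(html, asText = TRUE, useInternalNodes = TRUE)

# Job titles: bold text inside span.normal, as in the query above
titles <- xpathSApply(doc, "//span[@class='normal']//b", xmlValue)

# All visible text: every non-blank text node under <body>
all_text <- xpathSApply(doc, "//body//text()[normalize-space()]", xmlValue)

titles
all_text
```

The resulting character vector can then be passed to whatever text-mining or wordcloud code is used downstream.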





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O


[R] Consistant test for NAs in a factor when exclude = NULL?

2011-10-26 Thread andrewH

Dear folks,

Is there a function to correctly find (and count) the NAs in a factor when
exclude=NULL, regardless of whether they come from the original data or
from subsequent assignment?

In example number 1 below, where NAs are assigned by is.na()<-, testing the
factor with is.na() finds the correct number of NAs.  In example number 2,
where the NAs are from the data, neither is.na(), ==NA, nor =="NA" correctly
identifies the NAs.  In example number 3, which mixes NAs from assignment
with NAs from data, is.na() does not even find the NAs created by assignment,
as it did in example 1.

I'm running R 2.13.2 on Windows XP with ServicePack 3

Any assistance would be greatly appreciated.

Appreciatively, andrewH


Example #1

> # Origin: is.na()<-  Exclude: NULL
> KK <- factor(c("A","A","B","B","C","C"), exclude=NULL)
> KK[KK=="C"]
[1] C C
Levels: A B C
> is.na(KK[KK=="C"]) <- TRUE
> KK
[1] A    A    B    B    <NA> <NA>
Levels: A B C
> levels(KK)
[1] "A" "B" "C"
> levels(KK)[KK]
[1] "A" "A" "B" "B" NA  NA 
> KK==NA
[1] NA NA NA NA NA NA
> sum(KK==NA)
[1] NA
> KK=="NA"
[1] FALSE FALSE FALSE FALSE    NA    NA
> sum(KK=="NA")
[1] NA
> is.na(KK)
[1] FALSE FALSE FALSE FALSE  TRUE  TRUE
> sum(is.na(KK))
[1] 2

Example #2

> # Origin: data Exclude: NULL
> GG <- factor(c("A","A","B","B", NA, NA), exclude=NULL)
> GG
[1] A    A    B    B    <NA> <NA>
Levels: A B <NA>
> levels(GG)
[1] "A" "B" NA 
> levels(GG)[GG]
[1] "A" "A" "B" "B" NA  NA 
> GG==NA
[1] NA NA NA NA NA NA
> sum(GG==NA)
[1] NA
> GG=="NA"
[1] FALSE FALSE FALSE FALSE FALSE FALSE
> sum(GG=="NA")
[1] 0
> is.na(GG)
[1] FALSE FALSE FALSE FALSE FALSE FALSE
> sum(is.na(GG))

Example #3

> MM <- factor(c("A","A","B","B","C","C", NA), exclude=NULL)
> is.na(MM[MM=="C"]) <- TRUE
> MM
[1] A    A    B    B    <NA> <NA> <NA>
Levels: A B C <NA>
> levels(MM)
[1] "A" "B" "C" NA 
> levels(MM)[MM]
[1] "A" "A" "B" "B" NA  NA  NA 
> MM==NA
[1] NA NA NA NA NA NA NA
> sum(MM==NA)
[1] NA
> MM=="NA"
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> sum(MM=="NA")
[1] 0
> is.na(MM)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> sum(is.na(MM))
[1] 0
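
A possible workaround, offered as a sketch rather than a definitive answer: the levels(f)[f] lines in the transcripts above return NA in every case, whether the NA is a missing integer code or an NA level. Testing the character version of the factor therefore counts both kinds. The helper name count_na and the two small factors below are made up for the illustration.

```r
# Hypothetical helper: count NAs whether they are missing codes or an NA level
count_na <- function(f) sum(is.na(as.character(f)))

# NA present in the data, kept as a level because exclude = NULL
from_data <- factor(c("A", "A", "B", "B", NA, NA), exclude = NULL)

# NA assigned afterwards to a factor that has no NA level
assigned <- factor(c("A", "A", "B", "B", "C", "C"))
assigned[assigned == "C"] <- NA

sum(is.na(from_data))   # is.na() alone misses NAs stored as a level
count_na(from_data)
count_na(assigned)
```

as.character(f) is equivalent to levels(f)[f] here, so the same test can be written either way.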

--
View this message in context: 
http://r.789695.n4.nabble.com/Consistant-test-for-NAs-in-a-factor-when-exclude-NULL-tp3942755p3942755.html
Sent from the R help mailing list archive at Nabble.com.
