Re: [R] Where is gdata?

2010-11-29 Thread Stephen Liu
Hi Liviu,

> Not if you
> library(gdata)
> first. Then
> ?read.xls
>
> should work.

Yes, I did.


I found something strange here which I can't explain.

Win 7 64bit
R 32/64 bit

Just rebooted Win 7 and R

> library(gdata)
gdata: Unable to locate valid perl interpreter
gdata: 
gdata: read.xls() will be unable to read Excel XLS and XLSX files
gdata: unless the 'perl=' argument is used to specify the location of a
gdata: valid perl intrpreter.
gdata: 
gdata: (To avoid display of this message in the future, please ensure
gdata: perl is installed and available on the executable search path.)
gdata: Unable to load perl libaries needed by read.xls()
gdata: to support 'XLX' (Excel 97-2004) files.

gdata: Unable to load perl libaries needed by read.xls()
gdata: to support 'XLSX' (Excel 2007+) files.

gdata: Run the function 'installXLSXsupport()'
gdata: to automatically download and install the perl
gdata: libaries needed to support Excel XLS and XLSX formats.

Attaching package: 'gdata'

The following object(s) are masked from 'package:utils':

object.size


It complains.


> ?read.xls
starting httpd help server ... done

Read Excel files


Both 32 and 64 bit R worked.


If there is NO complaint when running

> library(gdata)

then

> ?read.xls

doesn't work.


Perl seems to have been installed, but I can't recall when or how:

C:\>dir C:\Users\satimiswin764\Documents\R\win-library\2.12\gdata\
.
11/22/2010  10:44 AM    <DIR>          perl
11/22/2010  10:44 AM    <DIR>          R
11/22/2010  10:44 AM    <DIR>          unitTests
11/22/2010  10:44 AM    <DIR>          xls


dir C:\Users\satimiswin764\My Documents\R\win-library\2.12\gdata\perl

11/22/2010  10:44 AM    <DIR>          .
11/22/2010  10:44 AM    <DIR>          ..
11/22/2010  10:44 AM    <DIR>          Archive
11/22/2010  10:44 AM               418 install_modules.pl
11/22/2010  10:44 AM    <DIR>          IO
11/22/2010  10:44 AM             2,710 module_tools.pl
11/22/2010  10:44 AM    <DIR>          OLE
11/22/2010  10:44 AM             2,019 sheetCount.pl
11/22/2010  10:44 AM             2,019 sheetNames.pl
11/22/2010  10:44 AM    <DIR>          Spreadsheet
11/22/2010  10:44 AM               550 supportedFormats.pl
11/22/2010  10:44 AM               114 VERSIONS
11/22/2010  10:44 AM             5,512 xls2csv.pl
11/22/2010  10:44 AM             5,512 xls2tab.pl
11/22/2010  10:44 AM             5,512 xls2tsv.pl
               9 File(s)         24,366 bytes
               6 Dir(s)  16,776,032,256 bytes free


B.R.
Stephen L






From: Liviu Andronic landronim...@gmail.com

Cc: Gabor Grothendieck ggrothendi...@gmail.com; r-help r-help@r-project.org
Sent: Mon, November 29, 2010 2:40:16 PM
Subject: Re: [R] Where is gdata?


 ?read.xls
 I must run ??read.xls

Not if you
 library(gdata)

first. Then
 ?read.xls

should work.

Regards
Liviu



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help Please!!!!!!!!!

2010-11-29 Thread Melissa Waldman
Hi,

I have been working with R for my stats class and I keep running into
the same error. I have read many sites about reading data from a text
file into R, and I'm using the data to do a correspondence analysis.  I feel
like I have read everything, and nothing explains why the error
message keeps coming up. I have used the exact examples I have seen in
articles and the same error keeps popping up:

Error in sum(N) : invalid 'type' (character) of argument

I have spent a long time trying to figure this out without success.
I am sure it has to do with the fact that my rows have names in them.  I
have attached the text file I have been using. If you have any ideas as
to how I can get R to plot the data using correspondence analysis with the
column and row names, that would be really helpful!  Or if you could pass
this email to someone who may know how to help me, that would be much
appreciated.

Thank you,
Melissa Waldman

my email: melissawald...@gmail.com
        None    Light   Medium  Heavy
SM      4       2       3       2
JM      4       3       7       4
SE      25      10      12      4
JE      18      24      33      13
S       10      6       7       2


[R] Evaluation of survival analysis

2010-11-29 Thread He Zhang
Dear all,

May I ask whether there are any functions in R to evaluate the fit of coxph
and survreg models in survival analysis?

For example, the results from Cox regression and parametric survival
analysis are shown below. Which method is preferred, and how can I compare
the two methods?

1. coxph(formula = y ~ pspline(x1, df = 2))

                           coef   se(coef) se2     Chisq DF   p
pspline(x1, df = 2), line  0.0522 0.00867  0.00866 36.23 1.00 1.8e-09
pspline(x1, df = 2), nonl                           3.27 1.04 7.5e-02

Iterations: 4 outer, 13 Newton-Raphson
 Theta= 0.91
Degrees of freedom for terms= 2
Likelihood ratio test=34.6  on 2.04 df, p=3.24e-08

2. survreg(formula = y ~ pspline(x1, df = 2))

                           coef    se(coef) se2     Chisq  DF  p
(Intercept)                 2.8199 0.15980  0.09933 311.37 1.0 0.0e+00
pspline(x1, df = 2), line  -0.0193 0.00248  0.00248  60.35 1.0 8.0e-15
pspline(x1, df = 2), nonl                             1.43 1.1 2.6e-01

Scale= 0.304

Iterations: 6 outer, 20 Newton-Raphson
 Theta= 0.991
Degrees of freedom for terms= 0.4 2.1 1.0
Likelihood ratio test=48.2  on 1.5 df, p=1.18e-11


I really appreciate your help. Thank you very much in advance.

Best wishes,
He



Re: [R] Where is gdata?

2010-11-29 Thread Spencer Graves

Hi, Stephen:


  The directory C:\Users\satimiswin764\My 
Documents\R\win-library\2.12\gdata\perl is NOT the perl interpreter but 
only perl code in the gdata package for R, invoked by certain R 
commands.  You need to install something like Strawberry perl, as I've 
previously stated.
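
A quick way to check whether R can see a perl interpreter at all is sketched below. It uses only base R; the commented-out read.xls() call shows the 'perl=' argument that the gdata start-up message mentions, and both the file name and the perl path in it are placeholders, not values from this thread.

```r
# Ask the operating system where (if anywhere) "perl" is on the search path.
perl_path <- Sys.which("perl")

if (nzchar(perl_path)) {
  message("perl found at: ", perl_path)
} else {
  message("no perl on the search path; install one or pass perl= to read.xls()")
}

# Hypothetical call; the file name and the perl location are placeholders:
# library(gdata)
# dat <- read.xls("myfile.xls", perl = "C:/strawberry/perl/bin/perl.exe")
```

If Sys.which() returns an empty string after installing perl, restarting R (so that it picks up the updated search path) is usually worth trying first.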



  Spencer


On 11/29/2010 12:44 AM, Stephen Liu wrote:

--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567



Re: [R] Help Please!!!!!!!!!

2010-11-29 Thread Edwin Groot
On Sun, 28 Nov 2010 21:29:08 -0800
 Melissa Waldman melissawald...@gmail.com wrote:

Hello Melissa,
First of all, you need a descriptive subject, such as "Cannot read
tabular data in R". R-help is a high-volume list (100 to 200 messages per
day), and each person who can help you is a specialist in one area or
another.
Secondly, please include in your mail an excerpt of the relevant code
you used to read the data in and that produced the error.

From looking at your text file, I would delete the white space before
None, save the file, and use the following function to read your data
into a data.frame:

read.delim("smokedata.txt")

This assumes you used a tab character between each field.
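
As a rough, self-contained illustration of that suggestion, the sketch below writes the posted table to a temporary tab-separated file (standing in for smokedata.txt, with no white space before None) and reads it back. Because the header line then has one field fewer than the data lines, read.delim() uses the first column as row names. The object names are made up for the example.

```r
# Recreate the posted table as tab-separated lines, without leading
# white space in the header (a stand-in for the attached file).
lines <- c("None\tLight\tMedium\tHeavy",
           "SM\t4\t2\t3\t2",
           "JM\t4\t3\t7\t4",
           "SE\t25\t10\t12\t4",
           "JE\t18\t24\t33\t13",
           "S\t10\t6\t7\t2")
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

# The header has 4 fields and the data rows 5, so the first column
# becomes the row names and the remaining columns are read as numbers.
smoke <- read.delim(path)
str(smoke)

# All columns are numeric, so sum() no longer fails on character data.
total <- sum(as.matrix(smoke))
```

Checking str() right after reading is a cheap way to spot a column that came in as character or factor before passing the table on to an analysis function.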

HTH, Edwin
-- 
Dr. Edwin Groot, postdoctoral associate
AG Laux
Institut fuer Biologie III
Schaenzlestr. 1
79104 Freiburg, Deutschland
+49 761-2032945



[R] cross tabulate variables by subject id

2010-11-29 Thread Marianne Promberger
Dear list,

I have data like this:

dat1 <- data.frame(subject=rep(1:10,2),
                   cond1=rep(c("A","B"),each=5),
                   cond2=rep(c("C","D"),each=10),
                   choice=sample(0:1,10,replace=TRUE))

I would like to compare subjects' choice for (cond1=="A" &
cond2=="C") vs (cond1=="A" & cond2=="D"), using mcnemar.test

The ?mcnemar.test example has the data in a matrix:

     Performance
            2nd Survey
1st Survey   Approve Disapprove
  Approve        794        150
  Disapprove      86        570


So for my case, I need something like:

   Choice
         AC
AD       0      1
  0     ...
  1

Where ... would be the sum of subjects who answered 0 or 1 to AC
and/or AD respectively.

I can get the first step by making an extra variable:

dat1$condnew <- paste(dat1$cond1,dat1$cond2,sep="")

although I am sure there are more elegant ways, and especially, I am
stumped how to fill in the cells of the table.

Thanks,

Marianne

-- 
Marianne Promberger PhD, King's College London
http://promberger.info
R version 2.12.0 (2010-10-15)
Ubuntu 9.04



Re: [R] Help Please!!!!!!!!!

2010-11-29 Thread Paul

On 29/11/10 05:29, Melissa Waldman wrote:

   

Hi Melissa,

Welcome to the world of R.  You didn't tell us which commands you were 
running that gave an error, but the error 'invalid 'type'' suggests to 
me you were trying to sum a variable that R thought was a character, and 
not a number.


I would recommend you (re)read the introduction to R 
(http://cran.r-project.org/doc/manuals/R-intro.pdf), especially chapter 
2, which deals with this.


As a quick example, if you've read your file into a dataframe called 
foo, with columns none, light etc then doing


class(foo$none)

will tell you what R thinks this field is.  If it is character then you 
can do


foo$none <- as.numeric(foo$none)

to tell R to treat it as numbers.
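
One caveat that may be worth checking: if class() reports "factor" rather than "character" (text columns were turned into factors by default in R versions before 4.0), as.numeric() returns the internal level codes instead of the numbers. A small sketch with a made-up data frame, where foo and none are only illustrative names:

```r
# Hypothetical stand-in for a column that was read in as text/factor.
foo <- data.frame(none = c("4", "4", "25", "18", "10"),
                  stringsAsFactors = TRUE)

class(foo$none)   # "factor" here, because of stringsAsFactors = TRUE

# Converting the factor directly gives the level codes, not the values.
wrong <- as.numeric(foo$none)

# Going through character first recovers the numbers that were typed.
right <- as.numeric(as.character(foo$none))
right
```

For a plain character column the direct as.numeric() call is fine; the detour through as.character() only matters for factors.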

Regards,

Paul.



[R] Bayes factor for a Welch or Yuen t-test

2010-11-29 Thread Wilson, Andrew
Although I have located a number of solutions for the Student t-test
(equal variances), I have been unable to find any code for calculating a
Bayes Factor for a Welch (unequal variances) or Yuen (trimmed mean)
t-test.  I wonder if anyone could help me with this?

Many thanks,

Andrew Wilson



[R] RODBC read all columns as character

2010-11-29 Thread juerg.dietrich
I'm using sqlQuery() to import Excel data (.xlsx). Almost everything works 
perfectly, but a column that contains both numeric and character values is read 
as numeric, and the character values are unfortunately converted to NA.
How can I read all columns as character? I've already tried 'as.is=TRUE'.

Thanks,
juerg




  


Re: [R] Array help

2010-11-29 Thread Patrick Burns

Instead of:

  7:11&23:34

I think you mean:

  c(7:11, 23:34)

Using '&' for concatenation
is not an unreasonable idea,
but it is decidedly not what
R does.

It would be instructive to do:

  7:11 & 23:34

at the R prompt to see what you
get.
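
A small sketch of the difference, using a made-up 34-row data frame in place of StatTemps (the PASWR data set is not loaded here, and the column names are invented):

```r
# '&' is the element-wise logical AND: every non-zero number counts as TRUE,
# and the shorter vector is recycled (with a warning, suppressed here).
and_result <- suppressWarnings(7:11 & 23:34)
and_result                      # a logical vector, not row numbers

# c() concatenates the two index ranges into one integer vector.
idx_9am <- c(7:11, 23:34)
idx_8am <- c(1:6, 12:22)

# Hypothetical stand-in for StatTemps: 34 rows with an id and a temperature.
temps <- data.frame(id = 1:34, temp = seq(60, 93))

nine_am  <- temps[idx_9am, ]    # rows 7-11 and 23-34
eight_am <- temps[idx_8am, ]    # rows 1-6 and 12-22
```

The same index vectors work for matrices and arrays, e.g. x[c(7:11, 23:34), ] for a matrix x.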



On 29/11/2010 03:56, bfhancock wrote:


Josh, the data set is called StatTemps and is in the PASWR package.  I want
to make an array that involves only the 8 a.m. and a separate array that
involves only the 9 a.m. so I can get info on the temperatures in those
groups. So I still want it in the format of StatTemps but in two arrays that
are based on 8 a.m. or 9 a.m. I have been messing with this for a while.
And no, it's not homework, I am just trying to learn R so I am more appealing
out in the field eventually.  The book I am using is confusing!  Hopefully
what I am trying to do isn't confusing.  I want to do an array that holds
the info from 1:6 & 12:22 for 8am and then 7:11 & 23:34 for 9am.  I was able
to easily make two arrays based on sex by just putting in StatTemps[1:11,,]
and StatTemps[12:34,,] and assumed I could just go StatTemps[1:6&12:22,,] and
StatTemps[7:11&23:34,,] but that didn't work. Any ideas? Thanks so much!

-B


--
Patrick Burns
pbu...@pburns.seanet.com
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')



[R] HELPPPPPP

2010-11-29 Thread piccino

Please, I have a big problem.
I have to do an econometrics / quantitative methods assignment about the
Canadian lynx. The problem is that I really don't know how to use R or how
to apply all the steps.
I have started with the time plot, ACF and PACF, but I am not able to decide
which model (ARIMA, Holt-Winters, etc.) is the correct one to forecast the
next 20 years of the Canadian lynx cycle...
If someone can help me, I would really appreciate it.
Thanks...

-- 
View this message in context: 
http://r.789695.n4.nabble.com/HELPP-tp3063358p3063358.html
Sent from the R help mailing list archive at Nabble.com.



[R] weighted Spearman correlation coefficient

2010-11-29 Thread Daniel Rabczenko

Hello,
I would be grateful if anybody can help me in finding an R function to 
compute weighted Spearman correlation coefficient?

Kind regards,
Daniel



[R] Plot data inside matrix

2010-11-29 Thread alcesgabbo

Hi, I have this problem:

I have this matrix:
                    result property procProperty
2010-10-01 07:32:00     40        A      sensor1
2010-10-01 17:32:00     15        A      sensor3
2010-10-02 07:32:00     32        A      sensor2
2010-10-03 04:33:21     20        B      sensor1
2010-10-03 04:33:21     33        B      sensor2
2010-10-03 14:33:21     12        A      sensor3
2010-10-05 07:32:00     31        B      sensor1
2010-10-05 07:32:00     15        B      sensor2
2010-10-06 17:32:00      4        A      sensor3

I would like to plot this matrix in this way:

create in this case 2 plots (one for each property: A and B )
for each plot there will be 3 lines (one for each procProperty:
sensor1,sensor2,sensor3) composed by the result.

How can I do this with few commands??

Thanks Alberto

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Plot-data-inside-matrix-tp3063417p3063417.html
Sent from the R help mailing list archive at Nabble.com.



[R] in regards of plotting using functions.

2010-11-29 Thread PRAVIN
Hello,

I am using a basic plotting technique to get a graph,
but I want to color the points plotted on the graph depending on a few
mathematical conditions.
values  x should be colored blue.
values  y should be colored green.

How can I go about programming these plots from
a single file?

Please do let me know as soon as possible

Regards,
-- 
Pravin Nilawe
Bioinformatics,
+91 9869739671



Re: [R] Where is gdata?

2010-11-29 Thread Stephen Liu
Hi Spencer,

I downloaded and installed Strawberry Perl from http://strawberryperl.com

Installation went through without problem.

Start R
> library(gdata)
gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.
gdata: read.xls support for 'XLSX' (Excel 2007+) files ENABLED.
Attaching package: 'gdata'
The following object(s) are masked from 'package:utils':
object.size

> ?read.xls
now opens the "Read Excel files" help page.  Thanks


B.R.
Stephen L






From: Spencer Graves spencer.gra...@structuremonitoring.com

Cc: Liviu Andronic landronim...@gmail.com; r-help r-help@r-project.org
Sent: Mon, November 29, 2010 4:57:53 PM
Subject: Re: [R] Where is gdata?

Hi, Stephen:


   The directory C:\Users\satimiswin764\My 
Documents\R\win-library\2.12\gdata\perl is NOT the perl interpreter but 
only perl code in the gdata package for R, invoked by certain R 
commands.  You need to install something like Strawberry perl, as I've 
previously stated.


   Spencer




[R] suppressing tickmarks / labels on the x-axis for two sets of boxplots (plotted as stacked boxplots)

2010-11-29 Thread Karin
Hello,

I am trying to plot two sets of boxplots together. These are estimates from two 
experiments and seven factors.
I want to plot the results of the two experiments as boxplots next to each 
other.
I therefore plot the results of the first experiment first, and then add the 
second set of boxplots with the add option.
The boxplots are plotted at at = 1:7 - 0.15 for the first experiment and at 
at = 1:7 + 0.15 for the second.
I successfully suppress the tickmarks and labels for the first boxplot 
with xaxt="n".
But for the second this does not work! 
I want to plot the tickmarks and labels at positions at = 1:7, using the 
axis function as below.
But with this code, tickmarks and labels are also plotted at positions 
at = 1:7 + 0.15.

boxplot(coefs ~ factor, data = temp,
        boxwex = 0.25, at = 1:7 - 0.15,
        subset = experiment == "first", col = "red",
        xlab = "factor", xaxt="n",
        ylab = "individual estimates")

boxplot(coefs ~ factor, data = temp, naxt="n", add = TRUE,
        boxwex = 0.25, at = 1:7 + 0.15,
        subset = experiment == "second", col = "green")

axis(at=1:7, side=1, c("fac1","fac2","fac3","fac4","fac5","fac6","fac7"))

legend(6, -0.5, c("experiment1", "experiment2"),
       fill = c("red", "green"))

Does anyone know how I can suppress these labels for the second boxplot?

Thanks in advance,

Karin
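
One thing that stands out in the posted code is that the second boxplot() call spells the argument naxt rather than xaxt, which would explain why the axis is still drawn there. A minimal sketch of the corrected calls follows; the data frame temp is simulated here purely for illustration (the real data were not posted), and the plot is sent to a null PDF device so the example runs without a display.

```r
# Simulated stand-in for 'temp': 7 factors x 2 experiments x 10 estimates.
set.seed(42)
temp <- data.frame(
  coefs      = rnorm(140),
  factor     = factor(rep(paste0("fac", 1:7), times = 20)),
  experiment = rep(c("first", "second"), each = 70)
)

pdf(NULL)  # null device: nothing is written to disk

boxplot(coefs ~ factor, data = temp, subset = experiment == "first",
        boxwex = 0.25, at = 1:7 - 0.15, col = "red",
        xaxt = "n", xlab = "factor", ylab = "individual estimates")

# xaxt = "n" (not naxt) suppresses the x-axis of the added boxplots too.
b2 <- boxplot(coefs ~ factor, data = temp, subset = experiment == "second",
              boxwex = 0.25, at = 1:7 + 0.15, col = "green",
              xaxt = "n", add = TRUE)

# One shared axis at the integer positions.
axis(side = 1, at = 1:7, labels = paste0("fac", 1:7))
legend("topright", legend = c("experiment1", "experiment2"),
       fill = c("red", "green"))

dev.off()
```

Passing axes = FALSE to the second call should be another way to get the same effect.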







Re: [R] cross tabulate variables by subject id

2010-11-29 Thread Michael Bedward
Hi Marianne,

How about this...

ac.ad <- unstack(dat1, choice ~ cond1:cond2)[, c("A.C", "A.D")]
acad.xtab <- with(ac.ad, table(A.C, A.D))
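
The cond1:cond2 term in the unstack() call relies on cond1 and cond2 being factors, which was the data.frame() default at the time; in R 4.0 and later they would be character unless converted. A slightly longer sketch that avoids this, using subset() and merge() instead of unstack(), is shown below. The fixed choice vector replaces sample() only to make the example reproducible and is not from the original post.

```r
# Same layout as dat1 in the question, with a fixed (made-up) choice vector.
dat1 <- data.frame(subject = rep(1:10, 2),
                   cond1   = rep(rep(c("A", "B"), each = 5), 2),
                   cond2   = rep(c("C", "D"), each = 10),
                   choice  = c(0, 1, 1, 0, 1,  0, 0, 1, 1, 0,
                               1, 1, 0, 0, 1,  1, 0, 1, 0, 0))

# Pick out the two conditions of interest, keeping subject and choice.
ac <- subset(dat1, cond1 == "A" & cond2 == "C", select = c(subject, choice))
ad <- subset(dat1, cond1 == "A" & cond2 == "D", select = c(subject, choice))

# Pair the two choices of each subject.
paired <- merge(ac, ad, by = "subject", suffixes = c(".AC", ".AD"))

# Cross-tabulate; fixing the levels keeps the table 2 x 2 even if a cell is empty.
xtab <- table(AC = factor(paired$choice.AC, levels = 0:1),
              AD = factor(paired$choice.AD, levels = 0:1))
xtab

mcnemar.test(xtab)
```

Merging by subject makes the pairing explicit, so it does not depend on the rows of the two conditions being in the same order.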

Michael


On 29 November 2010 20:18, Marianne Promberger
marianne.promber...@kcl.ac.uk wrote:


[R] Odp: HELPPPPPP

2010-11-29 Thread Petr PIKAL
Hi

What does your teacher say about the procedures you should use?

You should go through the help pages for ?spectrum, ?acf, ?ar and maybe some 
others.

Regards
Petr

r-help-boun...@r-project.org wrote on 29.11.2010 11:33:58:

 


Re: [R] Performance tuning tips when working with wide datasets

2010-11-29 Thread Andreas Borg

Richard Vlasimsky wrote:

Does anyone have any performance tuning tips when working with datasets that 
are extremely wide (e.g. 20,000 columns)?

In particular, I am trying to perform a merge like below:

merged_data <- merge(data1, data2, by.x="date", by.y="date", all=TRUE, sort=TRUE);

This statement takes about 8 hours to execute on a pretty fast machine.  The dataset data1 contains daily data going back to 1950 (20,000 rows) and has 25 columns.  The dataset data2 contains annual data (only 60 observations), however there are lots of columns (20,000 of them).  

I have to do a lot of these kinds of merges so need to figure out a way to speed it up.  


I have tried  a number of different things to speed things up to no avail.  
I've noticed that rbinds execute much faster using matrices than dataframes.  
However the performance improvement when using matrices (vs. data frames) on 
merges were negligible (8 hours down to 7).  I tried casting my merge field 
(date) into various different data types (character, factor, date).  This 
didn't seem to have any effect. I tried the hash package, however, merge 
couldn't coerce the class into a data.frame.  I've tried various ways to 
parallelize computation in the past, and found that to be problematic for a 
variety of reasons (runaway forked processes, doesn't run in a GUI environment, 
doesn't run on Macs, etc.).

I'm starting to run out of ideas, anyone?  Merging a 60 row dataset shouldn't 
take that long.

Thanks,
Richard

  


Hi Richard,

I had similar problems (even with much less data) and found out that 
most of the running time was caused by memory swapping rather than CPU 
usage. If you do not need all of the merged data at once, block-wise 
processing can help, which means that you generate only as much merged 
data at a time as fits into main memory. I ended up using package RSQLite 
(an embedded database) in the following way:


-create a database connection (explained in the package docs)
-copy data to database tables via dbWriteTable()
-create indices on the columns which are used for merging, something like: 
dbGetQuery(con, 'create index index_year on table2(year)') - this 
speeds up joining significantly
-construct an SQL query to do the join / merge operation and send it to 
SQLite via dbSendQuery()
-retrieve the result in blocks of reasonable size with fetch()

Unless there is an operation in the query which requires SQLite to 
process the whole result (e.g. sorting), the result rows will be created 
on the fly for every call of fetch() instead of a huge table being 
allocated in addition to the original data.
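
The steps above could look roughly like the following. This is only a minimal sketch under assumptions: the table names (daily, annual), the column names and the toy data are made up for illustration, and it uses the current DBI function names (dbExecute(), dbFetch()) in place of dbGetQuery() and fetch().

```r
library(DBI)

# Toy stand-ins for the two tables (names, columns and sizes are assumptions).
daily <- data.frame(
  date = as.character(seq(as.Date("2009-12-30"), by = "day", length.out = 6))
)
daily$year  <- as.integer(substr(daily$date, 1, 4))
daily$price <- seq_len(nrow(daily))
annual <- data.frame(year = c(2009L, 2010L), gdp = c(1.5, 2.5))

# In-memory database for the sketch; a file path would be used for real data.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "daily", daily)
dbWriteTable(con, "annual", annual)

# Index the join column of the table that is looked up.
dbExecute(con, "CREATE INDEX index_year ON annual(year)")

# Send the join and pull the result in blocks instead of all at once.
res <- dbSendQuery(con, "
  SELECT d.date, d.price, a.gdp
  FROM daily d LEFT JOIN annual a ON d.year = a.year")
n_rows <- 0L
while (!dbHasCompleted(res)) {
  block  <- dbFetch(res, n = 4)      # block size of 4 rows, for illustration
  n_rows <- n_rows + nrow(block)     # process each block here
}
dbClearResult(res)
dbDisconnect(con)
```

One caveat, as far as I know: SQLite builds limit the number of columns per table (2000 by default), so a table with 20,000 columns would probably have to be split or reshaped before it can be stored this way.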


I am not sure if this works with other database engines (there are a 
couple of database interfaces on CRAN); when I tried to use RPostgreSQL, 
it created the whole result set at once, leading to the same memory 
problem. Maybe that behavior can be changed by some config variable.


Best regards,

Andreas

--
Andreas Borg
Medizinische Informatik

UNIVERSITÄTSMEDIZIN
der Johannes Gutenberg-Universität
Institut für Medizinische Biometrie, Epidemiologie und Informatik
Obere Zahlbacher Straße 69, 55131 Mainz
www.imbei.uni-mainz.de

Telefon +49 (0) 6131 175062
E-Mail: b...@imbei.uni-mainz.de

This e-mail contains confidential and/or legally protected information. 
If you are not the intended recipient or have received this e-mail in error, 
please notify the sender immediately and delete this e-mail. Unauthorized 
copying or forwarding of this e-mail and the information it contains is 
not permitted.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Odp: in regards of plotting using functions.

2010-11-29 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org napsal dne 29.11.2010 11:48:07:

 Hello,
 
 I am using basic plotting technique to get a graph.
 but i want to color the points plotted onto the graph depending upon few
 mathematical logics.
 values  x should be colored blue.
 values  y should be colored green.
 

Untested

plot(a, b, pch=19, col=c("green", "blue")[ifelse(valuesx, 2, 
ifelse(valuesy, 1, NA))])
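
The comparison operators in the line above were lost in the archive. A runnable sketch of the same idea, assuming the intent was "values above x are blue, values below y are green" (both the directions and the toy data are assumptions):

```r
# Hypothetical data and thresholds; the comparison directions are assumed.
values <- c(1, 5, 9, 3, 7)
x <- 6   # values above x are drawn blue
y <- 4   # values below y are drawn green

# Index 2 picks "blue", index 1 picks "green", NA leaves the point undrawn.
idx       <- ifelse(values > x, 2, ifelse(values < y, 1, NA))
point_col <- c("green", "blue")[idx]

plot(seq_along(values), values, pch = 19, col = point_col)
```

Points that meet neither condition get the colour NA and are simply not drawn; replacing NA with 3 and adding a third colour would show them as well.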

Regards
Petr


 how can i go forward with the programming part in drawing these plots 
from
  a single file.
 
 Please do let me know as soon as possible
 
 Regards,
 -- 
 Pravin Nilawe
 Bioinformatics,
 +91 9869739671
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] surpressing tickmarks / labels x-as for two sets of boxplot (plotted as stacked boxplots)

2010-11-29 Thread Peter Ehlers

On 2010-11-29 03:17, Karin wrote:

Hello,

I am trying to plot two sets of boxplots together. These are estimates of two
experiments and seven factors.
The results of the two experiments I want to plot as boxplots stacked to each
other.
Therefore I plot first the results of the first experiment; and next with the
add option the second set of boxplots.
The boxplots are plotted at at = 1:7 - 0.15 for the first experiment and at = 1:7
+ 0.15 for the second.
I suppress plotting the tickmarks and labels successfully for the first boxplot
with xaxt="n".
But for the second this does not work!
I want to plot the tickmarks and labels  at position at=1:7, as below using the
axis function.
But with this code also tickmarks and labels are plotted at position
at=1:7+0.15.

boxplot(coefs ~ factor, data = temp,
 boxwex = 0.25, at = 1:7 - 0.15,
 subset = experiment == "first", col = "red",
  xlab = "factor",xaxt="n",
 ylab = "individual estimates")

boxplot(coefs ~ factor, data = temp, naxt="n",add = TRUE,
 boxwex = 0.25, at = 1:7 + 0.15,
 subset = experiment == "second", col = "green")

axis(at=1:7,side=1,c("fac1","fac2","fac3","fac4","fac5","fac6","fac7"))

legend(6,-0.5, c("experiment1", "experiment2"),
fill = c("red", "green"))

Does anyone know how I can suppress these labels for the second boxplot?



Perhaps all you need is a bit more care in typing: naxt???
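
For reference, a sketch with xaxt spelled the same way in both calls. The data frame temp is simulated here as a stand-in for the poster's data, and the legend position is an arbitrary choice:

```r
# Simulated stand-in for the poster's data: 7 factors, 2 experiments.
set.seed(1)
temp <- data.frame(
  coefs      = rnorm(140),
  factor     = factor(rep(paste0("fac", 1:7), times = 20)),
  experiment = rep(c("first", "second"), each = 70)
)

# First set of boxes, shifted left, x axis suppressed.
b1 <- boxplot(coefs ~ factor, data = temp, subset = experiment == "first",
              boxwex = 0.25, at = 1:7 - 0.15, col = "red", xaxt = "n",
              xlab = "factor", ylab = "individual estimates")

# Second set, shifted right; xaxt = "n" is needed here too.
b2 <- boxplot(coefs ~ factor, data = temp, subset = experiment == "second",
              boxwex = 0.25, at = 1:7 + 0.15, col = "green", xaxt = "n",
              add = TRUE)

# One shared axis at the integer positions.
axis(side = 1, at = 1:7, labels = levels(temp$factor))
legend("topright", c("experiment1", "experiment2"), fill = c("red", "green"))
```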

Peter Ehlers


Thanks in advance,

Karin






[R] Moran I for very large data set

2010-11-29 Thread Watmough G.
Hi

Are there any more efficient ways of calculating the neighbourhood object for 
large datasets?

I am trying to compute Moran I statistics for a very large data set (over 
14,000 points).  I have been using moran.test from the spdep package and 
everything works fine for a small data set (200 points).  However, applying the 
same script to the whole dataset is taking days to compute (it so far has been 
going for 5 days and still no results).  This is no surprise due to the number 
of computations required.

I have found that calculating planar distances works much quicker, but 
Great Circle distances are required.

Thanks

Gary Watmough


[[alternative HTML version deleted]]



Re: [R] in regards of plotting using functions.

2010-11-29 Thread PRAVIN
Thanks for your help & guidance.

I am taking these values from a file as co-ordinates.
I tried using the code for the same but it gave an error, saying X object
error.
plot((x=[,1],y=[,2]),

so how should I embed the expression within the plot?

regards,
Pravin

On Mon, Nov 29, 2010 at 5:20 PM, Petr PIKAL petr.pi...@precheza.cz wrote:

 Hi

 r-help-boun...@r-project.org napsal dne 29.11.2010 11:48:07:

  Hello,
 
  I am using basic plotting technique to get a graph.
  but i want to color the points plotted onto the graph depending upon few
  mathematical logics.
  values  x should be colored blue.
  values  y should be colored green.
 

 Untested

 plot(a, b, pch=19, col=c(green, blue)[ifelse(valuesx, 2,
 ifelse(valuesy, 1, NA))])

 Regards
 Petr


  how can i go forward with the programming part in drawing these plots
 from
   a single file.
 
  Please do let me know as soon as possible
 
  Regards,
  --
  Pravin Nilawe
  Bioinformatics,
  +91 9869739671
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.




-- 
Pravin Nilawe
Bioinformatics,
+91 9869739671

[[alternative HTML version deleted]]



[R] Troubles in plotting to a postscript file (not to png)

2010-11-29 Thread pilchat

Dear R users,

I am trying to produce some plots in a postscript file, but I am 
experiencing some issues. I open the device with


-
setPS()
postscript (file='gs_mcmc_dust.ps',width=5*3,height=5*3,horizontal = 
FALSE, paper = "special",family = 
"ComputerModern",encoding="TeXtext.enc")#, onefile = FALSE

-

(it's a 9x9 multiplot) and I close it with

-
  dev.off()
-

Here are my problems:

-) xlab=expression(paste(lambda,"(",~mu,"m",")")) : it correctly prints 
the mu greek letter but it fails for the lambda letter, leaving a 
blank space
-) text(min(mchain[2,]),max(tdensity$y),substitute( T[disk,med] == tmed 
%+-% tstd (K),list(tmed=tmed,tstd=tstd)),pos=4,cex=1.5  ) : it prints 
everything, but the +- symbol and (K) overlap with the substitute 
for tmed and tstd respectively. How can I force a blank space between 
numbers and symbols? Also, how can I set the number of decimal digits 
for tmed and tstd? ( option(digits=4) does not work )
-) once I close the R session without saving it (I answer n when 
quitting), the content of the ps file is erased. Do you know why?


I solve these problems plotting to a PNG device, but a postscript file 
is what I need.


Can you help me, please?

Thank you very much in advance

Cheers

Gaetano



Re: [R] Where is gdata?

2010-11-29 Thread Gabor Grothendieck
On Mon, Nov 29, 2010 at 3:44 AM, Stephen Liu sati...@yahoo.com wrote:
 Hi Liviu,

 Not if you
 library(gdata)
 first. Then
 ?read.xls

 should work.

 Yes, I did.


 I found something strange here which I can't explain.

 Win 7 64bit
 R 32/64 bit

 Just rebooted Win 7 and R

 library(gdata)
 gdata: Unable to locate valid perl interpreter
 gdata:
 gdata: read.xls() will be unable to read Excel XLS and XLSX files
 gdata: unless the 'perl=' argument is used to specify the location of a
 gdata: valid perl intrpreter.
 gdata:
 gdata: (To avoid display of this message in the future, please ensure
 gdata: perl is installed and available on the executable search path.)
 gdata: Unable to load perl libaries needed by read.xls()
 gdata: to support 'XLX' (Excel 97-2004) files.

 gdata: Unable to load perl libaries needed by read.xls()
 gdata: to support 'XLSX' (Excel 2007+) files.

 gdata: Run the function 'installXLSXsupport()'
 gdata: to automatically download and install the perl
 gdata: libaries needed to support Excel XLS and XLSX formats.

 Attaching package: 'gdata'

 The following object(s) are masked from 'package:utils':

     object.size

This is just a message that it can't find perl.  If you don't need to
use read.xls then you don't need perl so you can  ignore the message.
If you do need to use read.xls then install perl and once you have
done that then run installXLSXsupport().



 It complains.


 ?read.xls
 starting httpd help server ... done

 Read Excel files


 Both 32 and 64 bit R worked.


 If there is NO complaint on running;

 library(gdata)

 Then
 ?read.xls

 can't work.

Can you clarify when ?read.xls works for you and when it does not?



 Perl seems has been installed.  But I can't recall, when and how;

 C:\dir C:\Users\satimiswin764\Documents\R\win-library\2.12\gdata\
 .
 11/22/2010  10:44 AM    DIR  perl
 11/22/2010  10:44 AM    DIR  R
 11/22/2010  10:44 AM    DIR  unitTests
 11/22/2010  10:44 AM    DIR  xls


The gdata\perl folder contains perl libraries that come with gdata.
Perl itself is not distributed with gdata and you don't need perl at
all to use gdata except for read.xls and related functions.

My understanding is that this question has nothing to do with perl nor
with read.xls and that the problem is that you seem to be able to run
this:

library(gdata)
?read.xls

and sometimes it works and at other times it does not work.  Is that
right?  Does it occur with any other package?  How about removing
gdata and reinstalling it?

remove.packages("gdata")
... exit R and check if gdata has been removed ...
... restart R ...
install.packages("gdata")


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Plot data inside matrix

2010-11-29 Thread David Winsemius


On Nov 29, 2010, at 6:22 AM, alcesgabbo wrote:



Hi, I have this problem:

I have this matrix:


Doubtful that this is a matrix. In R, matrices hold elements of a single  
type. This looks more like a zoo object since it has a time  
index. How was it created and what does str() show?



                    result property procProperty
2010-10-01 07:32:00     40        A      sensor1
2010-10-01 17:32:00     15        A      sensor3
2010-10-02 07:32:00     32        A      sensor2
2010-10-03 04:33:21     20        B      sensor1
2010-10-03 04:33:21     33        B      sensor2
2010-10-03 14:33:21     12        A      sensor3
2010-10-05 07:32:00     31        B      sensor1
2010-10-05 07:32:00     15        B      sensor2
2010-10-06 17:32:00      4        A      sensor3

I would like to plot this matrix in this way:

create in this case 2 plots (one for each property: A and B )
for each plot there will be 3 lines (one for each procProperty:
sensor1,sensor2,sensor3) composed by the result.


So, what do you want:  a dotplot,  a barchart , a time-series  or  
what?





How can I do this with few commands??


Possibly depending on the correctness of my assumptions:

xyplot.zoo
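
If the data can be held in an ordinary data frame, the plain lattice xyplot() is one alternative that gives a panel per property and a line per procProperty. A sketch only, with the values copied from the table in this thread; the data-frame layout and column types are my assumptions, not how the original object was built:

```r
library(lattice)

# Values from the table in this thread, stored as a plain data frame
# (an assumption; the original object is probably a zoo series).
sensor <- data.frame(
  time = as.POSIXct(c("2010-10-01 07:32:00", "2010-10-01 17:32:00",
                      "2010-10-02 07:32:00", "2010-10-03 04:33:21",
                      "2010-10-03 04:33:21", "2010-10-03 14:33:21",
                      "2010-10-05 07:32:00", "2010-10-05 07:32:00",
                      "2010-10-06 17:32:00"), tz = "UTC"),
  result = c(40, 15, 32, 20, 33, 12, 31, 15, 4),
  property = factor(c("A", "A", "A", "B", "B", "A", "B", "B", "A")),
  procProperty = factor(c("sensor1", "sensor3", "sensor2", "sensor1",
                          "sensor2", "sensor3", "sensor1", "sensor2",
                          "sensor3"))
)

# One panel per property; within a panel, one line per procProperty.
p <- xyplot(result ~ time | property, data = sensor, groups = procProperty,
            type = "b", auto.key = list(columns = 3))
print(p)
```

Sensors with a single observation inside a panel show up as a lone point rather than a line.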





--

David Winsemius, MD
West Hartford, CT



Re: [R] Array help

2010-11-29 Thread bfhancock

if you can load the PASWR package and pull up StatTemps you will see what I
am talking about.  Otherwise I fear that my question will just be confusing. 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Array-help-tp3062992p3063535.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Help Please!!!!!!!!!

2010-11-29 Thread jim holtman
Your data seems to read in just fine, so what is the problem you are
trying to solve?

> x <- read.table('clipboard', sep='\t', header=TRUE)
> str(x)
'data.frame':   5 obs. of  5 variables:
 $ X     : Factor w/ 5 levels "JE","JM","S",..: 5 2 4 1 3
 $ None  : int  4 4 25 18 10
 $ Light : int  2 3 10 24 6
 $ Medium: int  3 7 12 33 7
 $ Heavy : int  2 4 4 13 2
> summary(x)
  X          None          Light        Medium         Heavy
 JE:1   Min.   : 4.0   Min.   : 2   Min.   : 3.0   Min.   : 2
 JM:1   1st Qu.: 4.0   1st Qu.: 3   1st Qu.: 7.0   1st Qu.: 2
 S :1   Median :10.0   Median : 6   Median : 7.0   Median : 4
 SE:1   Mean   :12.2   Mean   : 9   Mean   :12.4   Mean   : 5
 SM:1   3rd Qu.:18.0   3rd Qu.:10   3rd Qu.:12.0   3rd Qu.: 4
        Max.   :25.0   Max.   :24   Max.   :33.0   Max.   :13
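
If the error comes from the label column being passed along with the counts (only a guess, since the failing call was not shown), reading the first column as row names leaves a purely numeric table. A sketch with the data rebuilt from the str() output above; MASS::corresp() is just one available routine and may not be the one used in the original attempt:

```r
# Data rebuilt from the str() output above; row order follows the factor codes.
txt <- "X\tNone\tLight\tMedium\tHeavy
SM\t4\t2\t3\t2
JM\t4\t3\t7\t4
SE\t25\t10\t12\t4
JE\t18\t24\t33\t13
S\t10\t6\t7\t2"

# row.names = 1 turns the label column into row names instead of data.
x <- read.table(text = txt, sep = "\t", header = TRUE, row.names = 1)
counts <- as.matrix(x)
sum(counts)                 # works now that every column is numeric

# One possible correspondence analysis, keeping row and column names.
fit <- MASS::corresp(counts, nf = 2)
biplot(fit)
```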


On Mon, Nov 29, 2010 at 12:29 AM, Melissa Waldman
melissawald...@gmail.com wrote:
 Hi,

 I have been working with Program R for my stats class and I keep coming upon
 the same error, I have read so many sites about inputting data from a text
 file into R and I'm using the data to do a correspondence analysis.  I feel
 like I have read everything and it is still not explaining why the error
 message keeps coming up, I have used the exact examples I have seen in
 articles and the same error keeps popping up: Error in sum(N) : invalid
 'type' (character) of argument

 I have spent so long trying to figure this out without success,
 I am sure it has to do with the fact that my rows have names in them.  I
 have attached the text file I have been using and if you have any ideas as
 to how I can get R to plot the data using correspondence analysis with the
 column and row names that would be really helpful!  Or if you could pass
 this email to someone who may know how to help me, that would be much
 appreciated.

 Thank you,
 Melissa Waldman

 my email: melissawald...@gmail.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] in regards of plotting using functions.

2010-11-29 Thread Petr PIKAL
Hi

PRAVIN 2pravinnil...@gmail.com napsal dne 29.11.2010 13:18:45:

 Thanks for your help  guidance.
 
 I am taking these values from a file as co-ordinates.
 I tried using the code for the same but it gave an error. saying X 
object error.
 plot((x=[,1],y=[,2]), 

What is [,1]? You need some data object from which to extract the 
first and second columns.

How about looking into chapter 2 of the R-intro manual?
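
A minimal sketch of what that could look like, assuming a plain two-column text file of coordinates. The file is created on the fly here only so that the example runs; dat and coord_file are illustrative names:

```r
# Hypothetical two-column coordinate file, written here so the sketch runs.
coord_file <- tempfile(fileext = ".txt")
writeLines(c("1 10", "2 14", "3 9", "4 17"), coord_file)

# Read the file into a data frame; columns are named V1 and V2 by default.
dat <- read.table(coord_file)

# The indexing needs an object in front of the brackets: dat[, 1], not [, 1].
plot(dat[, 1], dat[, 2], pch = 19,
     xlab = "first column", ylab = "second column")
```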

Regards
Petr

 
 so how should i embedded the expression within the plot?
 
 regards,
 Pravin
 
 On Mon, Nov 29, 2010 at 5:20 PM, Petr PIKAL petr.pi...@precheza.cz 
wrote:
 Hi
 
 r-help-boun...@r-project.org napsal dne 29.11.2010 11:48:07:
 
  Hello,
 
  I am using basic plotting technique to get a graph.
  but i want to color the points plotted onto the graph depending upon 
few
  mathematical logics.
  values  x should be colored blue.
  values  y should be colored green.
 

 Untested
 
 plot(a, b, pch=19, col=c(green, blue)[ifelse(valuesx, 2,
 ifelse(valuesy, 1, NA))])
 
 Regards
 Petr
 
 
  how can i go forward with the programming part in drawing these plots
 from
   a single file.
 
  Please do let me know as soon as possible
 
  Regards,
  --
  Pravin Nilawe
  Bioinformatics,
  +91 9869739671
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

 
 
 
 -- 
 Pravin Nilawe
 Bioinformatics,
 +91 9869739671



[R] attached file

2010-11-29 Thread Lorenzo Melchor

I forgot to attach it...



The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company 
Limited by Guarantee, Registered in England under Company No. 534147 with its 
Registered Office at 123 Old Brompton Road, London SW7 3RP.

This e-mail message is confidential and for use by the addressee only.  If the 
message is received by anyone other than the addressee, please return the 
message to the sender by replying to it and then delete the message from your 
computer and network.



Lorenzo Melchor, PhD
Mammary Stem Cell Team
The Breakthrough Breast Cancer Research Centre (ICR)
237 Fulham Road
SW3 6JB
lorenzo.melc...@icr.ac.uk





[R] Problems in running affylmGUI

2010-11-29 Thread Lorenzo Melchor

Hi,

I am trying to run affylmGUI on my mac computer. I have already  
installed the Tcl package as well as Bwidgets through ActiveTcl  
conversion installing files.


However, when running affylmGUI() on R, I keep getting the message in  
the attached file.


I have copied the tcl folders from the root library to the user  
library, and have obtained the same issue.


I would really appreciate if you could help me setting this up. Could  
you please send me an easy guideline to get affylmGUI working?  
Otherwise, we could use TeamViewer if that would be easier.


Kind regards,
Lorenzo


Lorenzo Melchor, PhD
Mammary Stem Cell Team
The Breakthrough Breast Cancer Research Centre (ICR)
237 Fulham Road
SW3 6JB
lorenzo.melc...@icr.ac.uk






[R] List of influential points?

2010-11-29 Thread Schwab,Wilhelm K
Hello all,

I fit a linear model to some data and used plot() to create diagnostic plots 
for the fit; I am having trouble reading the points that R is flagging as 
influential.  Is there a way to get the list of influential points from the fit 
or its summary, etc.?  Most likely, there are a few points appearing in almost 
the same place, making it difficult to read from the plots.

Bill



[R] selecting only corresponding categories from a confusion matrix

2010-11-29 Thread drflxms
Dear R colleagues,

as a result of my calculations regarding the inter-observer-variability
in bronchoscopy, I get a confusion matrix like the following:

       0   1 1001 1010  11
0    609  11   54   36   6
1      1   2    6    0   2
10    14   0    0    8   4
100    4   0    0    0   0
1000  23   7   12   10   5
1001   0   0    4    0   0
1010   4   0    0    3   0
1011   1   0    1    0   2
11     0   0    3    3   1
110    1   0    0    0   0
1100   2   0    0    0   0
1110   1   0    0    0   0

The first column represents the categories found among observers, the
top row represents the categories found by the reference (goldstandard).
I am looking for a way (general algorithm) to extract a data.frame with
only the corresponding categories among observers and reference from the
above confusion matrix. Corresponding means in this case, that a
category has been chosen by both: observers and reference.
In this example corresponding categories would be simply all categories
that have been chosen by the reference (0,1,1001,1010,11), but generally
there might also occur categories which are found by the reference only
(and not among observers - in the first column).
So the solution-dataframe for the above example would look like:

       0   1 1001 1010  11
0    609  11   54   36   6
1      1   2    6    0   2
1001   0   0    4    0   0
1010   4   0    0    3   0
11     0   0    3    3   1

All the categories found among observers only, were omitted.

If the solution algorithm would include a method to list the omitted
categories and to count their number as well as the number of omitted
cases, it would be just perfect for me.

I'd be happy to read from you soon! Thanks in advance for any kind of
help with this.
Greetings from snowy Munich, Felix



Re: [R] aftreg vs survreg loglogistic aft model (different intercept term)

2010-11-29 Thread Terry Therneau
 Survreg maximizes the log-likelihood to a relative tolerance of 1e-9
(?survreg.control).  The printout shows -379503.5; to see the rest of
the digits you need something like:
fit <- survreg(...)
print(fit$loglik, digits=9)

Aftreg printed even fewer digits; you would have to do the same with it
to see which routine got closer to maximizing the actual log-likelihood.
That is, if survreg showed -379503.5392 and aftreg -379503.6123 then
survreg wins.  

Likely all this means is that the default iteration tolerance is smaller
for one routine than for the other.  When you consider that
significant changes in a log-likelihood are on the order of 3.84/2 = 2
units, I do not get very excited by a .08 difference in convergence.
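
A small runnable illustration of the above. It uses the lung data that ships with the survival package purely as a stand-in (not the poster's data), and the tighter tolerance of 1e-12 is an arbitrary value chosen for the example:

```r
library(survival)

# Illustrative fit on the lung data shipped with survival (not the poster's data).
fit <- survreg(Surv(time, status) ~ age, data = lung, dist = "loglogistic")

# loglik holds two values: the intercept-only model, then the full model.
print(fit$loglik, digits = 9)

# The convergence tolerance can be tightened through survreg.control().
fit_tight <- survreg(Surv(time, status) ~ age, data = lung,
                     dist = "loglogistic",
                     control = survreg.control(rel.tolerance = 1e-12))
print(fit_tight$loglik, digits = 9)
```

Comparing the two printed values shows how much of the log-likelihood is affected by the tolerance alone.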

Terry Therneau

-- begin included message --
I add an example , all the variables are mutually excluding dummy
variables,
notice the different intercept: 5.627 vs 5.545:
survreg:
              Value Std. Error     z        p
(Intercept)   5.627    0.00887 634.3 0.00e+00
Var1.recR2   -0.108    0.01026 -10.5 1.00e-25
Var1.recR3   -0.490    0.01099 -44.5 0.00e+00
Var1.recR4   -0.542    0.01303 -41.6 0.00e+00
Var1.recR5   -0.891    0.01095 -81.3 0.00e+00
Log(scale)   -0.324    0.00350 -92.7 0.00e+00

Scale= 0.723 

Log logistic distribution
Loglik(model)= -379503.5   Loglik(intercept only)= -383388.9
Chisq= 7770.76 on 4 degrees of freedom, p= 0 

aftreg:
Covariate          W.mean      Coef Exp(Coef)  se(Coef)    Wald p
Var1.recR 
               1    0.253     0         1           (reference)
               2    0.330     0.108     1.114     0.010     0.000 
               3    0.191     0.490     1.632     0.011     0.000 
               4    0.106     0.542     1.720     0.013     0.000 
               5    0.120     0.891     2.437     0.011     0.000 

log(scale)                    5.545   256.029     0.008     0.000 
log(shape)                    0.324     1.383     0.003     0.000 

Max. log. likelihood  -379504 

 end inclusion ---



Re: [R] Custom ticks on x axis when dates are involved - many thanks!

2010-11-29 Thread Monica Pisica

Hi,
 
I am sorry i am sending this again, but my email was snatched over the weekend 
by a spam generator and i was not able to send any email out. But now things 
are again back to normal. Even if i wrote to those who answered my question, i 
would like to let the list know that i got actually 2 solutions, both correct, 
but the following one is amazingly clear, short and to the point. I still 
struggle with the concept of dates in R.
 
Thanks a lot,
 
Monica


 Date: Thu, 25 Nov 2010 18:58:08 +1100
 From: j...@bitwrit.com.au
 To: pisican...@hotmail.com
 CC: r-help@r-project.org
 Subject: Re: [R] Custom ticks on x axis when dates are involved

 On 11/25/2010 06:27 AM, Monica Pisica wrote:
 
  ...
  Now the graph looks very close to what i want, but i know that my ticks 
  actually are not exactly at 01/01/ as i would like, although i suppose 
  my error is not that much in this instance. However i would really 
  appreciate if i can get the ticks on my x axis how i want in a much more 
  elegant way - if possible (and if not at least in the correct way).
 
 Hi Monica,
 How about this?

 mpdates<-as.POSIXct(paste("1/1",1984:2009,sep="/"),
 format="%d/%m/%Y")
 axis(1,at=mpdates,1984:2009,las=2)

 Jim 


Re: [R] Problems in running affylmGUI

2010-11-29 Thread James W. MacDonald

Hi Lorenzo,

Your question pertains to a Bioconductor package, so you are better off 
posing the question on the BioC-help list (CC'ed).


Best,

Jim



On 11/29/2010 7:47 AM, Lorenzo Melchor wrote:

Hi,

I am trying to run affylmGUI on my mac computer. I have already
installed the Tlc package as well as Bwidgets through ActiveTcl
conversion installing files.

However, when running affylmGUI() on R, I keep getting the message in
the attached file.

I have copied the tcl folders from the root library to the user library,
and have obtained the same issue.

I would really appreciate if you could help me setting this up. Could
you please send me an easy guideline to get affylmGUI working?
Otherwise, we could use TeamViewer if that would be easier.

Kind regards,
Lorenzo


Lorenzo Melchor, PhD
Mammary Stem Cell Team
The Breakthrough Breast Cancer Research Centre (ICR)
237 Fulham Road
SW3 6JB
lorenzo.melc...@icr.ac.uk




The Institute of Cancer Research: Royal Cancer Hospital, a charitable
Company Limited by Guarantee, Registered in England under Company No.
534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP.

This e-mail message is confidential and for use by the a...{{dropped:2}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues 




Re: [R] weighted Spearman correlation coefficient

2010-11-29 Thread Łukasz Ręcławowicz
2010/11/29 Daniel Rabczenko dan...@medstat.waw.pl

 I would be grateful if anybody can help me in finding an R function to
 compute weighted Spearman correlation coefficient?


There is someone, he lives here http://finzi.psych.upenn.edu/search.html

But you can write R-code for this coefficient using:

?rank
?var
?cov.wt
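
One way those pieces might fit together, as a sketch only: rank both variables and take the weighted Pearson correlation of the ranks with cov.wt(). This is just one possible definition of a weighted Spearman coefficient (the ranks themselves are not weighted), and the function name weighted_spearman is made up here:

```r
# Weighted Spearman sketch: rank both variables, then take a weighted
# Pearson correlation of the ranks. The helper name is hypothetical.
weighted_spearman <- function(x, y, w) {
  ranks <- cbind(rank(x), rank(y))
  # Weights are rescaled to sum to one before being passed to cov.wt().
  cov.wt(ranks, wt = w / sum(w), cor = TRUE)$cor[1, 2]
}

x <- c(2.1, 3.4, 1.0, 5.6, 4.2, 3.9)
y <- c(1.5, 2.9, 0.7, 4.1, 5.0, 3.3)
w <- c(1, 2, 1, 3, 1, 2)

rho_w  <- weighted_spearman(x, y, w)                   # weighted version
rho_eq <- weighted_spearman(x, y, rep(1, length(x)))   # equal weights
```

With equal weights the result agrees with cor(x, y, method = "spearman"), which is a quick sanity check on the helper.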

[[alternative HTML version deleted]]



Re: [R] selecting only corresponding categories from a confusion matrix

2010-11-29 Thread David Winsemius


On Nov 29, 2010, at 8:32 AM, drflxms wrote:


Dear R colleagues,

as a result of my calculations regarding the inter-observer- 
variability

in bronchoscopy, I get a confusion matrix like the following:

       0   1 1001 1010  11
0    609  11   54   36   6
1      1   2    6    0   2
10    14   0    0    8   4
100    4   0    0    0   0
1000  23   7   12   10   5
1001   0   0    4    0   0
1010   4   0    0    3   0
1011   1   0    1    0   2
11     0   0    3    3   1
110    1   0    0    0   0
1100   2   0    0    0   0
1110   1   0    0    0   0

The first column represents the categories found among observers, the
top row represents the categories found by the reference  
(goldstandard).
I am looking for a way (general algorithm) to extract a data.frame  
with
only the corresponding categories among observers and reference from  
the

above confusion matrix. Corresponding means in this case, that a
category has been chosen by both: observers and reference.
In this example corresponding categories would be simply all  
categories
that have been chosen by the reference (0,1,1001,1010,11), but  
generally
there might also occur categories which are found by the reference  
only

(and not among observers - in the first column).
So the solution-dataframe for the above example would look like:

       0   1 1001 1010  11
0    609  11   54   36   6
1      1   2    6    0   2
1001   0   0    4    0   0
1010   4   0    0    3   0
11     0   0    3    3   1


I wasn't able to follow the confusing, er, confusion matrix  
explanation but it appears from a comparison of the input and output  
that you just want row indices that are the  column names:


> mtx[colnames(mtx), ]
       0  1 1001 1010 11
0    609 11   54   36  6
1      1  2    6    0  2
1001   0  0    4    0  0
1010   4  0    0    3  0
11     0  0    3    3  1

> # and the omitted

> mtx[!rownames(mtx) %in% colnames(mtx), ]
      0 1 1001 1010 11
10   14 0    0    8  4
100   4 0    0    0  0
1000 23 7   12   10  5
1011  1 0    1    0  2
110   1 0    0    0  0
1100  2 0    0    0  0
1110  1 0    0    0  0

> # and their number:

> NROW(mtx[!rownames(mtx) %in% colnames(mtx), ])
[1] 7
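
A self-contained variant of the same indexing on a made-up matrix, which also counts the omitted cases. It uses intersect() so that a reference category that never occurs among the observers would not trigger a subscript error; the counts below are invented for illustration:

```r
# Toy confusion matrix (made-up counts): rows are observer categories,
# columns are reference categories.
mtx <- matrix(c(50,  3,
                 2, 20,
                 4,  1,
                 0,  6),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("0", "1", "10", "100"), c("0", "1")))

# Categories chosen by both observers and reference.
common  <- intersect(rownames(mtx), colnames(mtx))
kept    <- mtx[common, common, drop = FALSE]

# Categories found among observers only.
omitted <- mtx[!rownames(mtx) %in% colnames(mtx), , drop = FALSE]

rownames(omitted)   # which categories were omitted
nrow(omitted)       # how many categories were omitted
sum(omitted)        # how many cases they contain
```

The drop = FALSE arguments keep the results as matrices even when only one row or column remains.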




All the categories found among observers only, were omitted.

If the solution algorithm would include a method to list the omitted
categories and to count their number as well as the number of omitted
cases, it would be just perfect for me.


David Winsemius, MD
West Hartford, CT



Re: [R] Plot data inside matrix

2010-11-29 Thread alcesgabbo

yes, this is a zoo object.

First of all I have:

procProperty: sensor3 sensor3 sensor3 sensor3 sensor3 sensor3 sensor3 sensor3
property:  A B B A B B A A A
data:  40 20 31 32 15 33 15 12  4

I create a matrix with these objects:

data <- cbind(data, property)
data <- cbind(data, procProperty)

now data is:

data:
 data property procProperty
 [1,] 40 A  sensor3  
 [2,] 20 B  sensor3  
 [3,] 31 B  sensor3  
 [4,] 32 A  sensor3  
 [5,] 15 B  sensor3  
 [6,] 33 B  sensor3  
 [7,] 15 A  sensor3  
 [8,] 12 A  sensor3  
 [9,] 4  A  sensor3  

index contains the dates: 2010-10-1 7:32:00, 2010-10-3 4:33:21,
2010-10-5 7:32:00, 2010-10-2 7:32:00, 2010-10-5 7:32:00,
2010-10-3 4:33:21, 2010-10-1 17:32:00, 2010-10-3 14:33:21,
2010-10-6 17:32:00

I modified it with this function: index <- as.POSIXlt(index)

then I do:

sensor <- zoo(data, index)

sensor is:
                    data property procProperty
2010-10-01 07:32:00 40   A        sensor3
2010-10-01 17:32:00 15   A        sensor3
2010-10-02 07:32:00 32   A        sensor3
2010-10-03 04:33:21 20   B        sensor3
2010-10-03 04:33:21 33   B        sensor3
2010-10-03 14:33:21 12   A        sensor3
2010-10-05 07:32:00 31   B        sensor3
2010-10-05 07:32:00 15   B        sensor3
2010-10-06 17:32:00 4    A        sensor3

str(sensor) is:
‘zoo’ series from 2010-10-01 07:32:00 to 2010-10-06 17:32:00
  Data: chr [1:9, 1:3] "40" "15" "32" "20" "33" "12" "31" "15" "4" "A" "A" "A" "B" "B" "A" ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:3] "data" "property" "procProperty"
  Index:  POSIXlt[1:9], format: "2010-10-01 07:32:00" "2010-10-01 17:32:00" "2010-10-02 07:32:00" "2010-10-03 04:33:21" ...

The type of plot doesn't matter; the problem is how to arrange the
data so that I can plot it.

How can I tell R that I want one plot for each property (A and
B) and one line for each procProperty?

Maybe I should use the function tapply, in order to have objects like
these:

table for A:
  sensor1  sensor2  sensor3
2010:   40  32 20
2011:   30   30   15


table for B:
  sensor1  sensor2  sensor3
2010:   14  3 12
2011:   10   30   15
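
One possible way to get there, as a sketch only: keep the values in a data.frame (cbind() on mixed types turns everything into character, as the str() output above shows), split by property, and let tapply() build one day-by-sensor table per property. The names sensor_df, by_prop and tabs are made up for this example, and averaging the readings within a day is an assumption about what the tables should contain.

```r
# The nine readings from the post, kept as typed columns
sensor_df <- data.frame(
  time = as.POSIXct(c("2010-10-01 07:32:00", "2010-10-03 04:33:21",
                      "2010-10-05 07:32:00", "2010-10-02 07:32:00",
                      "2010-10-05 07:32:00", "2010-10-03 04:33:21",
                      "2010-10-01 17:32:00", "2010-10-03 14:33:21",
                      "2010-10-06 17:32:00"), tz = "UTC"),
  data = c(40, 20, 31, 32, 15, 33, 15, 12, 4),
  property = c("A", "B", "B", "A", "B", "B", "A", "A", "A"),
  procProperty = "sensor3",
  stringsAsFactors = FALSE
)

# one data.frame per property, then one day x sensor table of means each
by_prop <- split(sensor_df, sensor_df$property)
tabs <- lapply(by_prop, function(d) {
  tapply(d$data,
         list(day = format(d$time, "%Y-%m-%d"), sensor = d$procProperty),
         mean)
})

# one panel per property, one line per column (procProperty)
op <- par(mfrow = c(1, length(tabs)))
for (p in names(tabs)) {
  matplot(tabs[[p]], type = "b", pch = 1, xaxt = "n",
          xlab = "day", ylab = "mean of data", main = paste("property", p))
  axis(1, at = seq_len(nrow(tabs[[p]])), labels = rownames(tabs[[p]]))
}
par(op)
```

With more than one procProperty each table gets one column per sensor, and matplot() draws one line per column.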
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Plot-data-inside-matrix-tp3063417p3063600.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] the first. from SAS in R

2010-11-29 Thread Nutter, Benjamin
My apologies for coming to the party so late.

I'm sure this question has been answered a couple of times.  The
attached function is one I pulled from the help archives, but I can't
seem to duplicate the search that led me to it.

In any case, I've attached the function I found, and an .Rd file I use
as part of a local package.  I've also attached a pair of accompanying
functions to retrieve the last record and the nth record.  These have the
advantage of not requiring data frames to be sorted prior to
extraction; the function will sort them for you.

Benjamin  

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of David Katz
Sent: Wednesday, November 24, 2010 10:17 AM
To: r-help@r-project.org
Subject: Re: [R] the first. from SAS in R


Often the purpose of first/last in SAS is to facilitate grouping of
observations in a sequential algorithm. This purpose is better served in
R by using vectorized methods like those in package plyr.

Also, note that first/last has different meanings in the context of
"by x;" versus "by x notsorted;". R's duplicated() does not address the
latter, which splits noncontiguous records with equal x.
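
To illustrate the distinction, here is a small sketch in base R on a made-up vector (x and all result names are hypothetical). duplicated() flags the first and last occurrence of each value overall, while rle() treats every contiguous run as its own group, which is closer to the "notsorted" behaviour:

```r
x <- c("a", "a", "b", "a", "a")

# first./last. per distinct value (what duplicated() can give you)
first_by_value <- !duplicated(x)
last_by_value  <- !duplicated(x, fromLast = TRUE)

# first./last. per contiguous run, as with "by x notsorted;"
runs       <- rle(x)
run_ends   <- cumsum(runs$lengths)
run_starts <- run_ends - runs$lengths + 1
first_by_run <- seq_along(x) %in% run_starts
last_by_run  <- seq_along(x) %in% run_ends

data.frame(x, first_by_value, last_by_value, first_by_run, last_by_run)
```

The second block of "a" values starts a new group only in the run-based version.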

Regards,
David
--
View this message in context:
http://r.789695.n4.nabble.com/the-first-from-SAS-in-R-tp3055417p3057476.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Troubles in plotting to a postscript file (not to png)

2010-11-29 Thread pilchat

Hi guys,

to make it easier, here is a simple case with the same issues. I use the 
short function below to make the attached PS file.


Things to fix:

-) the greek letter lambda is not printed, while mu is printed (see 
the plot command)
-) the annotation inside the plot area: the +- symbol and (K) 
overlap with the substitute for tmed and tstd respectively (see the text 
command). Also, how can I set the number of decimal digits for tmed and 
tstd? (option(digits=4) does not work )


Moreover, I'd like to make the characters thicker. Is there any way? 
Finally, once I close the R session without saving it (I answer n when 
quitting), the content of the ps file is erased. Do I miss something in 
writing the function?


Thanks

Gaetano


plot_example=function()
{

setPS()
postscript(file='plot_example.ps', width=5, height=5, horizontal=FALSE,
paper="special", family="ComputerModern", encoding="TeXtext.enc")


tmed <- 1.23456789
tstd <- 1.23456789

plot(c(0,1), c(0,1), xlab=expression(paste(lambda,mu,T)), main="", sub="(a)") #lambda not printed
text(.0, .8, substitute( T[disk,med] == tmed %+-% tstd (K), list(tmed=tmed,tstd=tstd)), pos=4, cex=1.5) #overlapping symbols and numbers


dev.off()

}


plot_example.ps
Description: PostScript document


Re: [R] FW: R encoding question

2010-11-29 Thread Xiaobo Gu
But Sys.setlocale tries to change the setting for the whole OS. I only want
R to use a specified encoding; how can I do this?


Xiaobo.Gu


-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
Sent: Monday, November 29, 2010 8:57 PM
To: Xiaobo Gu
Subject: Re: FW: R encoding question

I have never played with encodings myself.  Suggest you read the postgresql
documentation and try different arguments to Sys.setlocale in R.  You
probably have to do that before you initiate the database since it might not
have any effect afterwards. I am not sure this is the problem but it's worth a
try.
Here are some examples.

Sys.setlocale(locale="C")
Sys.setlocale(locale="en_NZ.iso88591")
Sys.setlocale("LC_ALL", "en_US")
Sys.setlocale("LC_TIME", "English")
Sys.setlocale('LC_ALL','fr_FR')
Sys.setenv(LANGUAGE="EN");Sys.setlocale("LC_ALL","EN")
Sys.setenv(LANGUAGE="FR");Sys.setlocale("LC_ALL","FR")
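
As a side note, and only as a sketch: Sys.setlocale() changes the locale of the running R process, not of the operating system. Independently of the locale, a string whose bytes are already UTF-8 can be marked as such with Encoding(). The example below builds such a string from raw bytes; whether this is what is needed for the database column in question is an assumption, and the column name in the last comment is hypothetical.

```r
# Three bytes that form one Chinese character in UTF-8
bytes <- as.raw(c(0xe9, 0xa1, 0xbe))
s <- rawToChar(bytes)

Encoding(s)             # typically "unknown" right after rawToChar()
Encoding(s) <- "UTF-8"  # declare how the bytes should be interpreted
Encoding(s)

utf8ToInt(s)            # the Unicode code point behind the bytes

# For a character column returned by a query, the same marking could be
# tried on the whole column, e.g. Encoding(df$b) <- "UTF-8"
# (df$b is a hypothetical column name).
```

Marking does not convert anything; it only tells R how to read the bytes it already has.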


2010/11/29 Xiaobo Gu guxiaobo1...@gmail.com:
 Hi,
Can you help with this.

 Regards,

 Xiaobo Gu


 -Original Message-
 From: Xiaobo Gu [mailto:guxiaobo1...@gmail.com]
 Sent: Wednesday, November 24, 2010 10:19 PM
 To: r-help@r-project.org
 Subject: R encoding question

 Hi,
  I am using RpgSQL to retrieve data from a PostgreSQL database which
 uses UTF8 encoding, and I have some Chinese characters in one of the
 columns; unfortunately R can't show them correctly.

 df <- dbGetQuery(con, "select * from test")
 df
  ab
 1 1 椤惧皬娉\xa2
 2 2   瑕冩 EURO\xa1

 I see the following option. Do I need to change the encoding option to
 show the corresponding text? If so, how should I set it in my case?

 $encoding
 [1] "native.enc"

 Thanks,
 Xiaobo Gu





--
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Problems in running affylmGUI

2010-11-29 Thread Ben Bolker
Lorenzo Melchor Lorenzo.Melchor at icr.ac.uk writes:

 I am trying to run affylmGUI on my mac computer. I have already  
 installed the Tlc package as well as Bwidgets through ActiveTcl  
 conversion installing files.
 
 However, when running affylmGUI() on R, I keep getting the message in  
 the attached file.
 
 I have copied the tcl folders from the root library to the user  
 library, and have obtained the same issue.
 
 I would really appreciate if you could help me setting this up. Could  
 you please send me an easy guideline to get affylmGUI working?  
 Otherwise, we could use TeamViewer if that would be easier.

  I strongly suggest that you send this e-mail to the Bioconductor
e-mail list instead: most people here have no idea about
affylmGUI or TeamViewer.  Also, most attachments are stripped from postings
to the list, so you may want to post it on the web in some public
place instead.



[R] data.frame and formula classes of aggregate

2010-11-29 Thread David Freedman

Hi - I apologize for the 2nd post, but I think my question from a few weeks
ago may have been overlooked on a Friday afternoon.

I might be missing something very obvious, but is it widely known that the
aggregate function handles missing values differently depending on whether a
data frame or a formula is the first argument?  For example,

(d <- data.frame(sex=rep(0:1,each=3),
wt=c(100,110,120,200,210,NA),ht=c(10,20,NA,30,40,50)))
x1 <- aggregate(d, by = list(d$sex), FUN = mean)
names(x1)[3:4] <- c('mean.dfcl.wt','mean.dfcl.ht')
x2 <- aggregate(cbind(wt,ht)~sex,FUN=mean,data=d)
names(x2)[2:3] <- c('mean.formcl.wt','mean.formcl.ht')
cbind(x1,x2)[,c(2,3,6,4,7)]

The output from the data.frame class has an NA if there are missing values
in the group for the variable with missing values.  But, the formula class
output seems to delete the entire row (missing and non-missing values) if
there are any NAs.  Wouldn't one expect that the 2 forms (data frame vs
formula) of aggregate would give the same result? 

thanks very much
david freedman, atlanta




-- 
View this message in context: 
http://r.789695.n4.nabble.com/data-frame-and-formula-classes-of-aggregate-tp3063668p3063668.html
Sent from the R help mailing list archive at Nabble.com.



[R] Filling in missing time samples with na.approx

2010-11-29 Thread Jason Edgecombe

Hi Everyone,

I have some data from a sports GPS device like the following:

        time latitude longitude altitude  distance heartrate
1 1277648884 0.304048 -0.793819      260  0.00            94
2 1277648885 0.304056 -0.793772      262  4.307615        95
3 127764     0.304060 -0.793696      263 11.262347        97
4 1277648894 0.304075 -0.793544      263 25.237911       103
5 1277648898 0.304085 -0.793455      263 33.322525       108
6 1277648902 0.304064 -0.793387      256 40.042988       115

As you can see, the samples have irregular holes in the time column. How
can I fill in the missing samples using na.approx?


I've tried creating a blank series with no gaps and combining them, but
merge just adds columns and rbind complains about duplicate indexes.


P.S. My GPS still has holes in the data when I turn off smart recording :(

Thanks,
Jason



Re: [R] Troubles in plotting to a postscript file (not to png)

2010-11-29 Thread David Winsemius


On Nov 29, 2010, at 9:00 AM, pilchat wrote:


Hi guys,

to make it easier, here is a simple case with the same issues. I use  
the short function below to make the attached PS file.


Things to fix:

-) the greek letter lambda is not printed, while mu is printed  
(see the plot command)
-) the annotation inside the plot area: the +- symbol and (K)  
overlap with the substitute for tmed and tstd respectively (see the  
text command). Also, how can I set the number of decimal digits for  
tmed and tstd? (option(digits=4) does not work )


I would have thought one would do any formatting (of digits) outside  
the text( ...substitute(...),...) setting.




Moreover, I'd like to make the characters thicker. Is there any way?


Which characters? There is a bold() option within plotmath.

Finally, once I close the R session without saving it (I answer n  
when quitting), the content of the ps file is erased.


Now _that_ is weird. A file should have been created in your default  
directory and closing R should not have made it go away.



Do I miss something in writing the function?


Perhaps. (But you certainly missed something in writing the question.)
When I remove the family="ComputerModern" from the postscript call, I
start seeing lambda.  And the other spacing weirdness also
resolves. I am on a Mac and ComputerModern is not one of the
pdfFonts() on my machine. The list of available fonts varies widely
across various OSes and devices about which you have given us no clues.
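
A sketch of how the two suggestions above (format the digits outside substitute(), use bold() from plotmath) could look with the default font family. The subscript text, the output location in tempdir() and the names lab and ps_file are choices made for this example, not part of the original function:

```r
tmed <- 1.23456789
tstd <- 1.23456789

# format the numbers first, then hand the strings to substitute()
lab <- substitute(bold(T["disk,med"]) == m %+-% s ~ (K),
                  list(m = format(tmed, digits = 4),
                       s = format(tstd, digits = 4)))

ps_file <- file.path(tempdir(), "plot_example.ps")
postscript(file = ps_file, width = 5, height = 5,
           horizontal = FALSE, paper = "special")   # default font family
plot(c(0, 1), c(0, 1),
     xlab = expression(paste(lambda, " ", mu, " ", T)),
     main = "", sub = "(a)")
text(0, 0.8, lab, pos = 4, cex = 1.5)
dev.off()   # closes the device so the file is written out
```

The "~" adds a space before "(K)", so the unit is no longer parsed as a function call on tstd.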


--
David.



Thanks

Gaetano


plot_example=function()
{

setPS()
postscript(file='plot_example.ps', width=5, height=5, horizontal=FALSE,
paper="special", family="ComputerModern", encoding="TeXtext.enc")


tmed <- 1.23456789
tstd <- 1.23456789

plot(c(0,1), c(0,1), xlab=expression(paste(lambda,mu,T)), main="", sub="(a)") #lambda not printed
text(.0, .8, substitute( T[disk,med] == tmed %+-% tstd (K), list(tmed=tmed,tstd=tstd)), pos=4, cex=1.5) #overlapping symbols and numbers


dev.off()

}


David Winsemius, MD
West Hartford, CT



Re: [R] data.frame and formula classes of aggregate

2010-11-29 Thread David Winsemius


On Nov 29, 2010, at 9:35 AM, David Freedman wrote:



Hi - I apologize for the 2nd post, but I think my question from a few
weeks ago may have been overlooked on a Friday afternoon.

I might be missing something very obvious, but is it widely known that
the aggregate function handles missing values differently depending if
a data frame or a formula is the first argument ?


I'm not sure if it is widely known, but it is certainly suggested by  
the documentation for aggregate, since aggregate.data.frame  has  
different defaults than aggregate.formula. See the Usage section at  
the very top of ?aggregate.
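
To make the difference concrete, here is a small sketch using the data from the question. The formula method drops incomplete rows by default (na.action = na.omit); passing na.action = na.pass keeps them, which reproduces the data.frame behaviour. The result names x_df, x_f_default, x_f_pass and x_f_narm are made up for this example.

```r
d <- data.frame(sex = rep(0:1, each = 3),
                wt  = c(100, 110, 120, 200, 210, NA),
                ht  = c(10, 20, NA, 30, 40, 50))

# data.frame method: NA propagates into the group mean
x_df <- aggregate(d[c("wt", "ht")], by = list(sex = d$sex), FUN = mean)

# formula method, default na.action = na.omit: whole rows with any NA
# are removed before the means are computed
x_f_default <- aggregate(cbind(wt, ht) ~ sex, data = d, FUN = mean)

# formula method keeping the NA rows: same result as the data.frame method
x_f_pass <- aggregate(cbind(wt, ht) ~ sex, data = d, FUN = mean,
                      na.action = na.pass)

# keep the rows but ignore NAs within each variable
x_f_narm <- aggregate(cbind(wt, ht) ~ sex, data = d, FUN = mean,
                      na.rm = TRUE, na.action = na.pass)

x_df; x_f_default; x_f_pass; x_f_narm
```

The last call is often what is wanted: each variable's mean uses all of its own non-missing values.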




 For example,

(d <- data.frame(sex=rep(0:1,each=3),
wt=c(100,110,120,200,210,NA),ht=c(10,20,NA,30,40,50)))
x1 <- aggregate(d, by = list(d$sex), FUN = mean)
names(x1)[3:4] <- c('mean.dfcl.wt','mean.dfcl.ht')
x2 <- aggregate(cbind(wt,ht)~sex,FUN=mean,data=d)
names(x2)[2:3] <- c('mean.formcl.wt','mean.formcl.ht')
cbind(x1,x2)[,c(2,3,6,4,7)]

The output from the data.frame class has an NA if there are missing
values in the group for the variable with missing values.  But, the
formula class output seems to delete the entire row (missing and
non-missing values) if there are any NAs.  Wouldn't one expect that the
2 forms (data frame vs formula) of aggregate would give the same result?

thanks very much
david freedman, atlanta



--

David Winsemius, MD
West Hartford, CT



Re: [R] periodic time series

2010-11-29 Thread pengcafe

http://r.789695.n4.nabble.com/file/n3063697/sample.xlsx sample.xlsx 

So, here are some sample data. 
1st column is a label for 12 hour long days
2nd one a time stamp
The rest are the actual measured values for 4 different groups.

What would be the best model for such periodic data?

Thanks,
Andy
-- 
View this message in context: 
http://r.789695.n4.nabble.com/periodic-time-series-tp3062866p3063697.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Filling in missing time samples with na.approx

2010-11-29 Thread Gabor Grothendieck
On Mon, Nov 29, 2010 at 9:45 AM, Jason Edgecombe
ja...@rampaginggeek.com wrote:
 Hi Everyone,

 I have a some data from a sports gps device like the following:

        time latitude longitude altitude  distance heartrate
 1 1277648884 0.304048 -0.793819      260  0.00        94
 2 1277648885 0.304056 -0.793772      262  4.307615        95
 3 127764 0.304060 -0.793696      263 11.262347        97
 4 1277648894 0.304075 -0.793544      263 25.237911       103
 5 1277648898 0.304085 -0.793455      263 33.322525       108
 6 1277648902 0.304064 -0.793387      256 40.042988       115

 As you can see, the samples have irregular holes in the time column. How can
 I fill in the missing samples using na.approx?

 I've tried to creating a blank series with no gaps and combine them, but
 merge just adds columns and rbind compains about duplicate indexes.

 P.S. My GPS still has holes in the data when I turn off smart recording :(


Try this:

Lines <- "time latitude longitude altitude  distance heartrate
1277648884 0.304048 -0.793819  260  0.00       94
1277648885 0.304056 -0.793772  262  4.307615   95
127764 0.304060 -0.793696  263 11.262347   97
1277648894 0.304075 -0.793544  263 25.237911   103
1277648898 0.304085 -0.793455  263 33.322525   108
1277648902 0.304064 -0.793387  256 40.042988   115"

# read in data
library(zoo)
z <- read.zoo(textConnection(Lines), header = TRUE)

na.approx(z, xout = seq(min(time(z)), max(time(z))))
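
For readers without zoo installed, the same idea (linear interpolation onto a gap-free one-second grid) can be sketched with base R's approx(). This is a swapped-in technique, not the na.approx call above; the names gps, grid and filled are made up, only the heartrate column is shown, and the row with the truncated time stamp is left out.

```r
# five of the six samples from the post (time in seconds, heart rate)
gps <- data.frame(time = c(1277648884, 1277648885, 1277648894,
                           1277648898, 1277648902),
                  heartrate = c(94, 95, 103, 108, 115))

# regular grid with one entry per second
grid <- seq(min(gps$time), max(gps$time), by = 1)

# linear interpolation of heartrate at every grid point
filled <- data.frame(time = grid,
                     heartrate = approx(gps$time, gps$heartrate,
                                        xout = grid)$y)

head(filled)
```

Other columns could be filled the same way, one approx() call per column.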



-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] RDA Triplot

2010-11-29 Thread Danielwc

Since I am doing an RDA with constraints I get this error message when trying
to use biplot.rda:

'biplot.rda' not suitable for models with constraints
Daniel
2010/11/26 Jari Oksanen [via R]


 Danielwc daniel.carstensen at gmail.com writes:

  Im using the VEGAN package to do an RDA ordination. In my plot I get my
  environmental scores as arrows/vectors, but my species scores as points.
 I
  would like to get the species scores as arrows as well.
 
  Is there not a way I can tell R to plot both environmental and species
  scores as arrows/vectors?

 See ?biplot.rda in vegan.

 Cheers, Jari Oksanen




-- 
View this message in context: 
http://r.789695.n4.nabble.com/RDA-Triplot-tp3055474p3063712.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] non-linear fourth-order differential equations

2010-11-29 Thread Ravi Varadhan
OP is asking about a system of fourth-order differential equations,
whereas you are telling her how to solve a single, algebraic nonlinear
equation.

Take a look at package deSolve, and the function `lsode' in that package
for solving a system of nonlinear ODEs (given initial values).
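
A sketch of the usual first step, rewriting a fourth-order equation as four first-order ones. The test equation y'''' = 24*y^5 is chosen here only because its exact solution y = 1/(1 - t) is known; it is not the OP's system. The derivative function follows the func(t, state, parms) convention that deSolve expects, and the small fixed-step Runge-Kutta loop (rk4_solve, a made-up helper) merely stands in for lsode so the example runs without the package.

```r
# State vector: (y, y', y'', y'''); returns the derivatives as a list
rhs <- function(t, state, parms) {
  list(c(state[2], state[3], state[4], 24 * state[1]^5))
}

# Minimal classical RK4 integrator (stand-in for an ODE solver)
rk4_solve <- function(y0, times, func, parms = NULL) {
  out <- matrix(NA_real_, nrow = length(times), ncol = length(y0))
  out[1, ] <- y0
  for (i in seq_along(times)[-1]) {
    h  <- times[i] - times[i - 1]
    t0 <- times[i - 1]
    y  <- out[i - 1, ]
    k1 <- func(t0,         y,              parms)[[1]]
    k2 <- func(t0 + h / 2, y + h / 2 * k1, parms)[[1]]
    k3 <- func(t0 + h / 2, y + h / 2 * k2, parms)[[1]]
    k4 <- func(t0 + h,     y + h * k3,     parms)[[1]]
    out[i, ] <- y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
  }
  cbind(time = times, out)
}

# initial values of y, y', y'', y''' for the exact solution 1/(1 - t)
y0    <- c(1, 1, 2, 6)
times <- seq(0, 0.5, by = 0.001)
sol   <- rk4_solve(y0, times, rhs)

sol[nrow(sol), 2]   # numerical y(0.5); the exact value is 2

# With deSolve installed, the same rhs could be passed to its solver:
# deSolve::lsode(y = y0, times = times, func = rhs, parms = NULL)
```

For a system of several fourth-order equations the state vector simply grows by four entries per equation.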

Ravi.

---
Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology School of Medicine Johns
Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Wu Gong
Sent: Sunday, November 28, 2010 6:31 PM
To: r-help@r-project.org
Subject: Re: [R] non-linear fourth-order differential equations


Hi Yanika,

Please try ?uniroot and ?polyroot

f <- function(x) x^4-16
uniroot(f, lower= -3, upper=0)
polyroot(c(-16,0,0,0,1))

-
A R learner.
-- 
View this message in context:
http://r.789695.n4.nabble.com/non-linear-fourth-order-differential-equations
-tp3062805p3062894.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] unexpected behavior using round to 2 digits on randomly generated numbers

2010-11-29 Thread Petr Savicky
On Sun, Nov 28, 2010 at 01:53:05PM -0800, Jeff Newmiller wrote:
 FAQ 7.31
 
 http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

Additional information concerning rounding errors of double precision and 
suggestions for R code, which avoids them in some situations, may be found
in the first section of
  http://rwiki.sciviews.org/doku.php?id=misc:r_accuracy
and in
  http://rwiki.sciviews.org/doku.php?id=misc:r_accuracy:decimal_numbers
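
A short illustration of the FAQ 7.31 point, as a sketch: decimal fractions such as 0.1 are not stored exactly in double precision, so exact comparison can fail where a comparison with tolerance succeeds.

```r
a <- 0.1 + 0.2
b <- 0.3

a == b                    # exact comparison of the stored doubles
isTRUE(all.equal(a, b))   # comparison with a small tolerance

sprintf("%.20f", a)       # shows the digits that are actually stored
sprintf("%.20f", b)

abs(a - b) < 1e-9         # an explicit tolerance works as well
```

all.equal() uses a relative tolerance of about 1.5e-8 by default.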

Petr Savicky.



Re: [R] Where is gdata?

2010-11-29 Thread Stephen Liu
Hi Gabor,

- snip -
..
  The following object(s) are masked from 'package:utils':

 : object.size
This is just a message that it can't find perl.  If you don't need to
 use read.xls then you don't need perl so you can  ignore the message.
 If you do need to use read.xls then install perl and once you have
 done that then run installXLSXsupport().

After having installed Strawberry perl the warning disappears.

library(gdata)
?read.xls
works starting Read Excel files

I haven't run installXLSXsupport() afterwards.

Just did it without success.

 installXLSXsupport()
Error: could not find function "installXLSXsupport"

Couldn't proceed further.


I can't resolve the following:

1)
library(AER)
data()
starts the datasets of AER

2)
library(gdata)
data()
gdata is added to the list of master dataset package ?


B.R.
Stephen L






From: Gabor Grothendieck ggrothendi...@gmail.com

Cc: Liviu Andronic landronim...@gmail.com; r-help r-help@r-project.org
Sent: Mon, November 29, 2010 8:39:55 PM
Subject: Re: [R] Where is gdata?


 Hi Liviu,

 Not if you
 library(gdata)
 first. Then
 ?read.xls

 should work.

 Yes, I did.


 I found something strange here which I can't explain.

 Win 7 64bit
 R 32/64 bit

 Just rebooted Win 7 and R

 library(gdata)
 gdata: Unable to locate valid perl interpreter
 gdata:
 gdata: read.xls() will be unable to read Excel XLS and XLSX files
 gdata: unless the 'perl=' argument is used to specify the location of a
 gdata: valid perl intrpreter.
 gdata:
 gdata: (To avoid display of this message in the future, please ensure
 gdata: perl is installed and available on the executable search path.)
 gdata: Unable to load perl libaries needed by read.xls()
 gdata: to support 'XLX' (Excel 97-2004) files.

 gdata: Unable to load perl libaries needed by read.xls()
 gdata: to support 'XLSX' (Excel 2007+) files.

 gdata: Run the function 'installXLSXsupport()'
 gdata: to automatically download and install the perl
 gdata: libaries needed to support Excel XLS and XLSX formats.

 Attaching package: 'gdata'

 The following object(s) are masked from 'package:utils':

 object.size

This is just a message that it can't find perl.  If you don't need to
use read.xls then you don't need perl so you can  ignore the message.
If you do need to use read.xls then install perl and once you have
done that then run installXLSXsupport().



 It complains.


 ?read.xls
 starting httpd help server ... done

 Read Excel files


 Both 32 and 64 bit R worked.


 If there is NO complaint on running;

 library(gdata)

 Then
 ?read.xls

 can't work.

Can you clarify when ?read.xls works for you and when it does not?



 Perl seems has been installed.  But I can't recall, when and how;

 C:\dir C:\Users\satimiswin764\Documents\R\win-library\2.12\gdata\
 .
 11/22/2010  10:44 AMDIR  perl
 11/22/2010  10:44 AMDIR  R
 11/22/2010  10:44 AMDIR  unitTests
 11/22/2010  10:44 AMDIR  xls


The gdata\perl folder contains perl libraries that come with gdata.
Perl itself is not distributed with gdata and you don't need perl at
all to use gdata except for read.xls and related functions.

My understanding is that this question has nothing to do with perl nor
with read.xls and that the problem is that you seem to be able to run
this:

library(gdata)
?read.xls

and sometimes it works and at other times it does not work.  Is that
right?  Does it occur with any other package?  How about removing
gdata and reinstalling it?

remove.packages("gdata")
... exit R and check if gdata has been removed ...
... restart R ...
install.packages("gdata")


-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com





[R] extracting P values from lm model

2010-11-29 Thread Rosario Garcia Gil
Hello

I am trying to extract the F statistics from an lm model; however, after I
run the model and type

 names(Model)

the F statistic does not appear, only these:

names(Model)
 [1] "coefficients"  "residuals"     "effects"       "rank"          "fitted.values"
 [6] "assign"        "qr"            "df.residual"   "xlevels"       "call"
[11] "terms"         "model"

How could I extract the P values? I have fitted a cbind of 1800 response
variables, so it is not easy to do it by hand.
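
A sketch of one way this could be done, on simulated data (x, Y and all result names are made up). The F statistic is stored in the summary() of the fit rather than in the lm object itself; for a model with several responses, summary() returns one summary per response, so sapply() can loop over them.

```r
set.seed(1)
x <- rnorm(30)
Y <- cbind(y1 = 2 * x + rnorm(30),   # depends on x
           y2 = rnorm(30))           # pure noise
Model <- lm(Y ~ x)

summ <- summary(Model)   # one summary.lm per response column

# overall F-test p-value for each response
p_overall <- sapply(summ, function(s) {
  f <- s$fstatistic
  unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
})

# p-value of the coefficient of x for each response
p_coef <- sapply(summ, function(s) coef(s)["x", "Pr(>|t|)"])

p_overall
p_coef
```

With a single predictor the two sets of p-values coincide; with more predictors the coefficient table has one row per term.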

Thanks in advance.
Rosario


Re: [R] non-linear fourth-order differential equations

2010-11-29 Thread Wu Gong

Hi Ravi,

Thank you for your correction. I hope I didn't mess up anything:)

Cheers.

Wu

-
A R learner.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/non-linear-fourth-order-differential-equations-tp3062805p3063761.html
Sent from the R help mailing list archive at Nabble.com.



[R] Significance of the difference between two correlation coefficients

2010-11-29 Thread syrvn

Hi,

based on the sample size I want to calculate whether two correlation
coefficients are significantly different or not. I know that as a first step
both coefficients
have to be converted to z values using Fisher's z transformation. I have
done this already but I don't know how to proceed from there.

Unlike for correlation coefficients, I know that the difference between z
values is mathematically defined, but I do not know how to incorporate the
sample size.

I found a couple of websites that provide that service but since I have huge
data sets I need to automate this procedure.

(http://faculty.vassar.edu/lowry/rdiff.html)
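
For two correlations from independent samples, the standard approach divides the difference of the Fisher z values by sqrt(1/(n1 - 3) + 1/(n2 - 3)) and compares the result with a standard normal distribution. A minimal sketch, assuming independent samples; the function name compare_cor and the example numbers are made up:

```r
# Two-sided test for the difference between two independent correlations
compare_cor <- function(r1, n1, r2, n2) {
  z1 <- atanh(r1)                       # Fisher z transformation
  z2 <- atanh(r2)
  se <- sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
  z  <- (z1 - z2) / se
  p  <- 2 * pnorm(-abs(z))              # two-sided p-value
  c(z = z, p.value = p)
}

compare_cor(r1 = 0.5, n1 = 100, r2 = 0.3, n2 = 100)
```

Because the arithmetic is vectorized, r1, n1, r2 and n2 can also be whole vectors, in which case z and p are computed for all pairs at once (the c() call then returns one long vector, so cbind() may be more convenient).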

Can anyone help?

Cheers,
syrvn

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Significance-of-the-difference-between-two-correlation-coefficients-tp3063765p3063765.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Where is gdata?

2010-11-29 Thread Gabor Grothendieck
On Mon, Nov 29, 2010 at 10:18 AM, Stephen Liu sati...@yahoo.com wrote:
 Hi Gabor,

 - snip -
 ..
  The following object(s) are masked from 'package:utils':

 :     object.size
This is just a message that it can't find perl.  If you don't need to
 use read.xls then you don't need perl so you can  ignore the message.
 If you do need to use read.xls then install perl and once you have
 done that then run installXLSXsupport().

 After having installed Strawberry perl the warning disappears.

 library(gdata)
 ?read.xls
 works starting Read Excel files

 I haven't run installXLSXsupport() afterwards.

 Just did it without success.

 installXLSXsupport()
 Error: could not find function installXLSXsupport

 Couldn't proceed further.

Please start at a fresh version of R.  Copy and paste your session
from the R console rather than relating what happened.  Also, what
version of gdata are you using?  Older versions did not have
installXLSXsupport.  Show:

packageDescription("gdata")$Version
win.version()
R.version.string



 I can't resolve follows;

 1)
 library(AER)
 data()
 starts the datasets of AER

 2)
 library(gdata)
 data()
 gdata is added to the list of master dataset package

This is not clear.  Please provide exact and complete output.



 B.R.
 Stephen L


 
 From: Gabor Grothendieck ggrothendi...@gmail.com
 To: Stephen Liu sati...@yahoo.com
 Cc: Liviu Andronic landronim...@gmail.com; r-help r-help@r-project.org
 Sent: Mon, November 29, 2010 8:39:55 PM
 Subject: Re: [R] Where is gdata?

 On Mon, Nov 29, 2010 at 3:44 AM, Stephen Liu sati...@yahoo.com wrote:
 Hi Liviu,

 Not if you
 library(gdata)
 first. Then
 ?read.xls

 should work.

 Yes, I did.


 I found something strange here which I can't explain.

 Win 7 64bit
 R 32/64 bit

 Just rebooted Win 7 and R

 library(gdata)
 gdata: Unable to locate valid perl interpreter
 gdata:
 gdata: read.xls() will be unable to read Excel XLS and XLSX files
 gdata: unless the 'perl=' argument is used to specify the location of a
 gdata: valid perl intrpreter.
 gdata:
 gdata: (To avoid display of this message in the future, please ensure
 gdata: perl is installed and available on the executable search path.)
 gdata: Unable to load perl libaries needed by read.xls()
 gdata: to support 'XLX' (Excel 97-2004) files.

 gdata: Unable to load perl libaries needed by read.xls()
 gdata: to support 'XLSX' (Excel 2007+) files.

 gdata: Run the function 'installXLSXsupport()'
 gdata: to automatically download and install the perl
 gdata: libaries needed to support Excel XLS and XLSX formats.

 Attaching package: 'gdata'

 The following object(s) are masked from 'package:utils':

     object.size

 This is just a message that it can't find perl.  If you don't need to
 use read.xls then you don't need perl so you can  ignore the message.
 If you do need to use read.xls then install perl and once you have
 done that then run installXLSXsupport().



 It complains.


 ?read.xls
 starting httpd help server ... done

 Read Excel files


 Both 32 and 64 bit R worked.


 If there is NO complaint on running;

 library(gdata)

 Then
 ?read.xls

 can't work.

 Can you clarify when ?read.xls works for you and when it does not?



 Perl seems has been installed.  But I can't recall, when and how;

 C:\dir C:\Users\satimiswin764\Documents\R\win-library\2.12\gdata\
 .
 11/22/2010  10:44 AM    DIR  perl
 11/22/2010  10:44 AM    DIR  R
 11/22/2010  10:44 AM    DIR  unitTests
 11/22/2010  10:44 AM    DIR  xls


 The gdata\perl folder contains perl libraries that come with gdata.
 Perl itself is not distributed with gdata and you don't need perl at
 all to use gdata except for read.xls and related functions.

 My understanding is that this question has nothing to do with perl nor
 with read.xls and that the problem is that you seem to be able to run
 this:

 library(gdata)
 ?read.xls

 and sometimes it works and at other times it does not work.  Is that
 right?  Does it occur with any other package?  How about removing
 gdata and reinstalling it?

 remove.packages("gdata")
 ... exit R and check if gdata has been removed ...
 ... restart R ...
 install.packages("gdata")


 --
 Statistics & Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com





-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



[R] Setting default path to library

2010-11-29 Thread Steve_Friedman

Hello

I recently upgraded from 2.11.1 to 2.12.0 on a Windows machine. When I
launch R via Tinn-R (2.3.7.0), most things appear correct. The exception
is the path to the library.

I store all of the packages in C:\Program_Files\R\R-2.12.0\library

Last week, after I upgraded, I received an error:
Error in loadNamespace(i[[1L]], c(lib.loc, .libPaths())) :
 there is no package called 'cluster'

The cluster package was there, but I installed a new version from a
local *.zip file to be sure, and things worked fine.

This morning, when I launched R again, I received the same error message, also
indicating that the package Hmisc could not be loaded.  It too is in the
library folder.

I think the system is searching for some packages in
C:\Program_Files\R\R-2.12.0\bin\i386

Why would it do that, and what are the appropriate commands to tell R where
to look for installed packages?
I've searched the archive and have not found a clear, understandable answer.

As always, Thanks for the assistance

Steve




Steve Friedman Ph. D.
Ecologist  / Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147



Re: [R] How to remove a package.

2010-11-29 Thread Gavin Simpson
On Sun, 2010-11-28 at 07:58 -0800, Stephen Liu wrote:
 Hi David,
 
 Thanks for your advice.  I got it.
 
 But I can't resolve:

*sigh*

The packages are listed in alphabetical sort order, hence AER comes
before car comes before datasets comes before Ecdat...

You just need to page-down through the list of package data sets to see
later ones.
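
As a sketch of a way to avoid paging through the full listing: data() takes a
package argument, so the listing can be limited to a single package. The
example below uses the 'datasets' package only because it ships with R;
substituting "Ecdat" or "AER" assumes those packages are installed.

```r
# List only the data sets of one package instead of every attached package.
# "datasets" is used here because it ships with R; replace it with "Ecdat"
# or "AER" only if those packages are installed.
listing <- data(package = "datasets")

# The result holds a character matrix with one row per data set.
items <- listing$results[, "Item"]
head(items)

# A data set can then be loaded by name into the workspace.
data(list = "cars", package = "datasets")
str(cars)
```

The same call with package = "Ecdat" would show the Ecdat data sets on their
own rather than mixed into the alphabetical list of all packages.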

Seriously, and I don't mean to be rude, but instead of scatter-gun
replies to the list, try engaging your brain and actually look
*properly* at what is displayed...

G

  library(AER)
 Loading required package: car
 Loading required package: MASS
 Loading required package: nnet
 Loading required package: survival
 Loading required package: splines
 Loading required package: Formula
 Loading required package: lmtest
 Loading required package: zoo
 Loading required package: sandwich
 Loading required package: strucchange
 
  data()
 displays Data sets in package ‘AER’:
 
 But;
  library(Ecdat)
  data()
 displays Data sets in package ‘datasets’:
 a large datasets including those in package Ecdat?  NOt only Ecdat 
 separately.
 
 B.R.
 Stephen L
 
 
 
 
 
 From: David Winsemius dwinsem...@comcast.net
 
 Cc: Stefan Grosse singularit...@gmx.net; r-help@r-project.org
 Sent: Sun, November 28, 2010 11:16:34 PM
 Subject: Re: [R] How to remove a package.
 
 
 On Nov 28, 2010, at 7:16 AM, Stephen Liu wrote:
 
  Hi Stefan,
 
  Tks for your advice.
 
 snipped
 
  Installation went through w/o problem.
 
  library(Ecdat)
  data()
 
  Data sets in package ‘datasets’  NOT 'Ecdat'
 
  ??Ecdat
  ...
  Ecdat::Caschool   The California Test Score Data Set
  Ecdat::Griliches  Wage Datas
  Ecdat::MCAS       The Massashusets Test Score Data Set
  Ecdat::MunExp     Municipal Expenditure Data
  Ecdat::Orange     The Orange Juice Data Set
  Ecdat::Solow      Solow's Technological Change Data
  Ecdat::TranspEq   Statewide Data on Transportation Equipment
                    Manufacturing
  
 
 
  Those files are in Ecdat packages.
 
  Caschool
  Error: object 'Caschool' not found
 
  MCAS
  Error: object 'MCAS' not found
 
 Because loading the package does not necessarily register the datasets:
 
  require(Ecdat)
 Loading required package: Ecdat
  data()  #-- produces a large list including
 Car       Stated Preferences for Car Choice
 Caschool  The California Test Score Data Set
 Catsup    Choice of Brand for Catsup
 Cigar     Cigarette Consumption
  str(Caschool)
 Error in str(Caschool) : object 'Caschool' not found
 
  data(Car)
  str(Car)
 'data.frame':   4654 obs. of  70 variables:
   $ choice    : Factor w/ 6 levels "choice1","choice2",..: 1 2 5 5 5 5 2 5 5 2 ...
   $ college   : num  0 1 0 0 0 0 1 1 0 1 ...
   $ hsg2      : num  0 1 1 0 1 0 1 0 0 0 ...
 
 Note that this dataset was NEVER  spelled car.
 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



[R] Updates for xlsReadWrite (1.5.3) and xlsReadWritePro (1.6.1/3)

2010-11-29 Thread Hans-Peter Suter
The xlsReadWrite[Pro] packages read and write Excel files (.xls) natively
on the Win 32-bit platform.

About a week ago, new package versions were released:
* xlsReadWrite 1.5.3 is available at CRAN (for R2.11/2.12) and from
www.swissr.org/download (binary builds for R2.9 - R2.12)
* xlsReadWritePro 1.6.3 is available from www.swissr.org/download
(binary builds for R2.9 - R2.12)
* (the full download listing is here:
http://dl.dropbox.com/u/2602516/swissrpkg/index.html)

## Changes in xlsReadWrite 1.5.3 (0b78c1) ##

  - *important*: fix AV when reading large data (issue #110): in a subroutine, a
pointer (pSExp) had been 'riUnprotect'ed (Rf_unprotect) too early. The total
protect/unprotect count was correct, but when 'anyDuplicated' was
called in the subroutine, control switched to R, which then had the
chance to free 'my' pointer.
Thanks to the (anonymous) user who submitted the nice bug report!
  - NaN values will be written as 'NaN' and behaviour (e.g. coercion) better
accounts R (see read.xls.Rd, write.xls.Rd and unitTests/runitNaNaN.R).
  - 'dateTimeAs' argument in read.xls has been renamed to 'dateTime'. When using
the default (which should be fine in most cases) this change won't
affect you.
  - fix startup message scrambling in R2.12.0 (LF instead of CRLF -
reported to Rdevel)
  - simplified file (unitTests/execManually.R) to run RUnit tests
  - misc. small/cosmetic changes (see github commits)
  - (internal) update makefile: support R2.12, much simplify targets, set/modify
Windows System Path from within the makefile


## Changes in xlsReadWritePro 1.6.3 (93a6d7) ##

  - fix for startup message scrambling (use LF instead of CRLF -
reported to Rdevel)
  - some more RUnit tests, cosmetic changes (typos, formatting, etc.)
  - (internal) update makefile: also support R2.12/2.9, much simplify targets,
set/modify Windows System Path from within the makefile

## Changes in xlsReadWritePro version 1.6.1 ##

This is a significant update and may require some small adjustments in
your code.

* Precompiled binary packages for R2.10 and R2.11 (see downloads)
* While the package runs on R2.9 and R2.12, we have some issues
with our automated 'binary-package-building-and-releasing' makefile.
This will be fixed later. Please send us an email if you need the pkg
for these versions now and we try to help.
* Note: works with existing keys (even when already 4 years old;)

Changes:

--- 'meta' ---

o test/ensure functionality with RUnit tests (180)
o improve consistency and add examples
o issue tracking is public now and a forum has been added

--- important ---

o 'KEEP' argument/functionality DROPPED
  - reason: redundant, differences between the so-called 'keep-obj'
and 'xls-obj'
are confusing, complicates lowlevel code and hinders future enhancements
  - resolution: use xls.open, xls.new, xls.save, xls.cancel and
xls.close instead

o area-related arguments (from, rows, cells, ...) in read.xls/write.xls
  - CELLS argument SPLIT into 'CELLS' and 'RANGE':
- cells: pick single cell values and give them back as a vector or as a
  data.frame (the latter is new, type will be determined for each
cell individually)
- range: read ranges either by name or by a numeric 4-elem-vector
(R1,C1,R2,C2)
  (A1 style, i.e. 'A1:C3', 'Sheet2!B42' could eventually be added here).

o xls.sheet, NAMEORINDEX argument renamed
  - 'nameOrIndex' becomes 'sheet' and the default is the first/active sheet
(depends if file is a physical file or an xls-obj)
  - 'copyAndInsert' action copies from the active sheet

o xls.image, SHEET argument is needed! In light of this obligatory change the
  arguments have been reworked. Here are the current and older declaration:
  - curr.: xls.image(file, action, sheet = NA|NULL, img = NA, range =
NA, target = NA)
  - beta:  xls.image(file, action, img = NA, range = NA, name = NA)
  - old:   xls.image(file = NA, action, nameOrIdx = NA, miscData = NA,
keep = NA)

o xls.range, NAMEORINDEX argument renamed
  - 'nameorindex' becomes 'range'

o template location moved
  - new: R_HOME/library/xlsReadWrite/template/TemplateNew.xls
  - (old/erroneous: R_HOME/library/xlsReadWrite/libs/template,
 reported by B. Ripley for free version)
  - (the template location in APPDATA remains unaffected)

--- normal ---

o read.xls
  - colClasses: recognizes boolean strings as logical, recognizes isodatetime
formatted strings, isodatetime/isotime/isodate work for double and character
string values. ??? todo: re-read formula values when one or more
formulas have
been modified: gives back 0 (instead of #NULL!).
  - rownames for data.frames are integers (when not read from Excel)
  - NEW ARGUMENT 'checkNames' to optionally treat colnames with 'make.names'
  - NEW ARGUMENT 'strictArea'
- background: when the library opens an Excel file it determines
the area which
  is 

Re: [R] extracting P values from lm model

2010-11-29 Thread David L Lorenz
Rosario,
 The summary function will compute the f-statistic, from which you can 
compute the attained p-value. Here's a snippet that shows the f-stat.

summary(lm(Y ~ X))$fstatistic
   value    numdf    dendf
34.23125      1.0      8.0
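
As a hedged sketch of the remaining step, the p-value can be taken from the
upper tail of the F distribution with pf(). The data below are simulated
purely for illustration, and the helper name f_pvalue is made up for this
example. For a cbind() of responses, summary() returns one summary per
response, so the helper can be applied to each in turn.

```r
# Simulated data, purely for illustration (not the poster's data).
set.seed(1)
X <- rnorm(10)
Y <- cbind(y1 = 2 * X + rnorm(10), y2 = rnorm(10))

fit <- lm(Y ~ X)   # one regression per column of Y

# Turn the stored F statistic into its upper-tail p-value.
# f_pvalue is a hypothetical helper name, not part of any package.
f_pvalue <- function(s) {
  f <- s$fstatistic
  unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
}

# summary() of a multi-response fit is a list with one summary per response.
pvals <- sapply(summary(fit), f_pvalue)
pvals
```

With a single predictor, each of these p-values should agree with the
Pr(>|t|) entry for X in the corresponding coefficient table.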

Dave



From:
Rosario Garcia Gil m.rosario.gar...@genfys.slu.se
To:
r-help r-help@r-project.org
Date:
11/29/2010 09:30 AM
Subject:
[R] extracting P values from lm model
Sent by:
r-help-boun...@r-project.org



Hello

I am trying to get out of an lm model the fstatistics, however after I run 
the model I write 
 names(Model)

and the fstatistic does not appear only these.

names(Model)
 [1] "coefficients"  "residuals"     "effects"       "rank"
 [5] "fitted.values" "assign"        "qr"            "df.residual"
 [9] "xlevels"       "call"          "terms"         "model"

How could I extract the P values? I have run a cbind of 1800 response 
variables so is not easy to do it by hand.

Thanks in advance.
Rosario


[R] Setting Values of Elements in a Dataframe

2010-11-29 Thread Lorenzo Isella

Dear All,
I am experiencing some problems in resetting the values of some selected 
elements in a dataframe.


Consider


d <- seq(-1, 1, length = 16)
dim(d) <- c(4, 4)
d <- as.data.frame(d)

sel_pos <- which(d > 0, arr.ind = TRUE)

d[sel_pos] <- -9

which returns the error

Error in `[<-.data.frame`(`*tmp*`, sel_pos, value = -9) :
  only logical matrix subscripts are allowed in replacement

which is obscure to me. I am correctly selecting the positive elements
in a data.frame and I'd like to reset them to another numerical value.

What am I misunderstanding?
Many thanks

Lorenzo



Re: [R] Where is gdata?

2010-11-29 Thread Stephen Liu
Hi Gabor,

 Please start at a fresh version of R.  Copy and paste your session
 from the R console rather than relating what happened.  Also, what
 version of gdata are you using?  Older versions did not have
 installXLSXsupport.  Show:

 packageDescription(gdata)$Version
 win.version()
 R.version.string

> packageDescription("gdata")$Version
[1] "2.8.1"

> win.version()
[1] "Windows 7 x64 (build 7600)"

> R.version.string
[1] "R version 2.12.0 (2010-10-15)"

Both 32 and 64 bits


 2)
 library(gdata)
 data()
 gdata is added to the list of master dataset package

 This is not clear.  Please provide exact and complete output.

File gdata_output.txt is attached to this email.

Following lines are added to the bottom of the file:-
Data sets in package ‘gdata’:

MedUnits    Table of conversions between Intertional
            Standard (SI) and US
- end -


B.R.
Stephen L









Re: [R] Setting Values of Elements in a Dataframe

2010-11-29 Thread Henrique Dallazuanna
Try this:

 d[d > 0] <- -9





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] Setting Values of Elements in a Dataframe

2010-11-29 Thread Ivan Calandra

Hi,

Not sure why it doesn't work (I would say it's because of the structure 
of sel_pos, but I don't know how to deal with it).


But just do:
d[d > 0] <- -9
It does work
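
A possible explanation, offered as a sketch rather than a definitive account:
which(..., arr.ind = TRUE) returns a two-column numeric matrix of (row, column)
positions, while the error message says the data-frame replacement method
accepts only logical matrix subscripts. The two options below avoid the
problem; the names d1, d2 and m are introduced only for this example.

```r
d <- as.data.frame(matrix(seq(-1, 1, length = 16), 4, 4))

# Option 1: index with a logical matrix of the same shape as d.
d1 <- d
d1[d1 > 0] <- -9

# Option 2: keep the (row, col) index matrix from which(), but apply it
# to a plain matrix, where two-column numeric indexing is supported.
sel_pos <- which(d > 0, arr.ind = TRUE)
m <- as.matrix(d)
m[sel_pos] <- -9
d2 <- as.data.frame(m)
```

Option 2 only makes sense when all columns have the same type, since
as.matrix() coerces a mixed data frame to a single type.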

HTH,
Ivan




--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php



[R] Two dimensional Array defined on intevals

2010-11-29 Thread Leon Adams
Hi, I am new to R and am trying to set up a two-dimensional array/matrix with
the elements defined by a function similar to the one below. I have been trying
to use outer with apply but can't seem to get the indexing quite right. Is there
a simple way of accomplishing this task?

            / 1   x < 0.5, y < 0.5
            |
            | 2   x < 0.5, y > 0.5
G(x,y) =   <
            | 3   x > 0.5, y < 0.5
            |
            \ 4   x > 0.5, y > 0.5
-- 
Thanks in advance
Leon Adams



Re: [R] Setting default path to library

2010-11-29 Thread Henrique Dallazuanna
Take a look in .libPaths()
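
A minimal sketch of how .libPaths() can be used, both to inspect the library
trees R searches and to add one. The directory below is a hypothetical
stand-in created under tempdir() so that the example is harmless; a real setup
would point at the folder that actually holds the packages.

```r
# Show the library trees R currently searches, in order.
old_paths <- .libPaths()
print(old_paths)

# Hypothetical extra library directory, created under tempdir() so the
# example is harmless; a real setup would point at an existing folder.
extra <- file.path(tempdir(), "Rlibs")
dir.create(extra, showWarnings = FALSE)

# Put the extra directory at the front of the search path.
.libPaths(c(extra, old_paths))
print(.libPaths())

# A single call can also be pointed at one library explicitly, e.g.
#   library(cluster, lib.loc = "C:/path/to/library")   # hypothetical path
```

A change made with .libPaths() lasts only for the current session. Setting the
R_LIBS or R_LIBS_USER environment variable is one way to make it apply at
every startup.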





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] Where is gdata?

2010-11-29 Thread Gabor Grothendieck
On Mon, Nov 29, 2010 at 11:00 AM, Stephen Liu sati...@yahoo.com wrote:
 Hi Gabor,

 Please start at a fresh version of R.  Copy and paste your session
 from the R console rather than relating what happened.  Also, what
 version of gdata are you using?  Older versions did not have
 installXLSXsupport.  Show:

 packageDescription(gdata)$Version
 win.version()
 R.version.string

 packageDescription(gdata)$Version
 [1] 2.8.1

 win.version()
 [1] Windows 7 x64 (build 7600)

 R.version.string
 [1] R version 2.12.0 (2010-10-15)

 Both 32 and 64 bits


That is the correct version of gdata.  I am using Windows 32 so there
could be some differences due to that.  At any rate, can you start a
fresh R session and show the console output of the problems you have
seen.  Use Rgui --vanilla to start R to be sure you don't have
anything else that might interfere and show everything including the R
startup message in the R console output.
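
As a rough sketch, the checks discussed in this thread could be collected in
one place as below. The variable name perl_path and the file path in the final
comment are assumptions made for illustration; the 'perl=' argument itself is
the one named in the gdata startup message.

```r
# Is a perl executable on the search path? Sys.which() returns "" if not.
perl_path <- Sys.which("perl")
if (nzchar(perl_path)) {
  cat("perl found at:", perl_path, "\n")
} else {
  cat("no perl on the executable search path\n")
}

# Only query gdata if it is actually installed.
if (requireNamespace("gdata", quietly = TRUE)) {
  print(packageDescription("gdata")$Version)
  # Older versions of gdata did not have installXLSXsupport().
  print(exists("installXLSXsupport", envir = asNamespace("gdata")))
}

print(R.version.string)

# If perl is installed but not on the path, read.xls() can be told where
# it is through its 'perl=' argument, e.g. (hypothetical paths):
#   read.xls("myfile.xls", perl = "C:/strawberry/perl/bin/perl.exe")
```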

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Two dimensional Array defined on intevals

2010-11-29 Thread Jonathan P Daily
Does this work for you?

g <- function(x, y) ifelse(x < .5, 0, 2) + ifelse(y < .5, 1, 2)
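
A short sketch of how such a function could be combined with outer() to build
the matrix. The direction of the comparisons (x < 0.5, y < 0.5) is an
assumption, and the grid values are made up for the example. Because ifelse()
is vectorised, outer() can call g directly without apply.

```r
# Assumed form of the piecewise function: the '<' comparisons are a guess.
g <- function(x, y) ifelse(x < 0.5, 0, 2) + ifelse(y < 0.5, 1, 2)

# Made-up grid points, one on each side of 0.5.
x <- c(0.25, 0.75)
y <- c(0.25, 0.75)

# Rows follow x, columns follow y.
G <- outer(x, y, g)
G
```

For a finer grid, x and y could be replaced by something like
seq(0, 1, by = 0.1); the result is then a length(x) by length(y) matrix.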

--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it.
 - Jubal Early, Firefly




Re: [R] Significance of the difference between two correlation coefficients

2010-11-29 Thread Adaikalavan Ramasamy
Thanks for providing the example but it would be useful to know who I am 
communicating with or from which institute, but nevermind ...


I don't know much about this subject but a quick google search gives me 
the following site: http://davidmlane.com/hyperstat/A50760.html


Using the info from that website, I can code up the following to give 
the two-tailed p-value of difference in correlations:


 diff.corr <- function( r1, n1, r2, n2 ){

   # Fisher's z transformation of each correlation
   Z1 <- 0.5 * log( (1+r1)/(1-r1) )
   Z2 <- 0.5 * log( (1+r2)/(1-r2) )

   diff   <- Z1 - Z2
   SEdiff <- sqrt( 1/(n1 - 3) + 1/(n2 - 3) )
   diff.Z <- diff/SEdiff

   p <- 2*pnorm( abs(diff.Z), lower.tail = FALSE )
   cat( "Two-tailed p-value", p , "\n" )
 }

 diff.corr( r1=0.5, n1=100, r2=0.40, n2=80 )
 ## Two-tailed p-value 0.4103526

 diff.corr( r1=0.1, n1=100, r2=-0.1, n2=80 )
 ## Two-tailed p-value 0.1885966

The p-value here is slightly different from the Vassar website because
the website rounds its diff.Z values to 2 digits.
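
Since the original question mentions huge data sets, here is a sketch of a
variant that returns the p-values instead of printing them and works on whole
vectors at once. The name diff_corr_p is made up for this example. It relies
on atanh(r) being equal to 0.5 * log((1 + r) / (1 - r)), which is Fisher's z
transformation.

```r
# Vectorised two-tailed p-value for the difference between two correlations.
# diff_corr_p is a hypothetical name; the formula is the same as above.
diff_corr_p <- function(r1, n1, r2, n2) {
  z <- (atanh(r1) - atanh(r2)) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
  2 * pnorm(abs(z), lower.tail = FALSE)
}

# Several comparisons in one call: each position is one pair of correlations.
p <- diff_corr_p(r1 = c(0.5, 0.1), n1 = c(100, 100),
                 r2 = c(0.4, -0.1), n2 = c(80, 80))
p
```

The two values should match the two p-values printed by diff.corr() above.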


Regards, Adai



On 29/11/2010 15:30, syrvn wrote:


Hi,

based on the sample size I want to calculate whether to correlation
coefficients are significantly different or not. I know that as a first step
both coefficients
have to be converted to z values using fisher's z transformation. I have
done this already but I dont know how to further proceed from there.

unlike for correlation coefficients I know that the difference for z values
is mathematically defined but I do not know how to incorporate the sample
size.

I found a couple of websites that provide that service but since I have huge
data sets I need to automate this procedure.

(http://faculty.vassar.edu/lowry/rdiff.html)

Can anyone help?

Cheers,
syrvn





Re: [R] Array help

2010-11-29 Thread Joshua Wiley
Hi Brian,

I believe there was some miscommunication earlier due to R's array
class for objects and the colloquial usage of array (the idea that
'array' is used colloquially is a bit odd, but I digress).  In any
case, here are some steps I take (certainly not the only ones) when
exploring a new dataset that I am not familiar with:


## load the package
library(PASWR)

## look at the str()ucture of the object of interest
str(StatTemps)

## Hmm, it is a 'data.frame' with 3 variables
## one variable is 'num' and the other two are 'Factor'
## let's see if we can find out more about those data classes
## (pull up the documentation on each, it can be hard to know at first
##  that 'num' stands for numeric and 'Factor' needs to be lowercase)
?data.frame
?numeric
?factor

## in this case, it is easy to print the whole data set so
StatTemps # print to screen
## but you can also get a nice little summary
summary(StatTemps)

## For the documentation on extraction/indexing
?Extract

## and some examples
StatTemps$temperature
StatTemps$gender
StatTemps$class
## now using a different operator than '$'
## You can call by name by quoting
StatTemps[ , "temperature"]
## or since we know it is column 1
StatTemps[ , 1]
## conversely, we can get row 1
StatTemps[1, ]
## or some combination of rows
StatTemps[c(1:7, 22:34), ]
## or rows and columns
StatTemps[c(1:7, 22:34), c(1, 3)]

## But since you have a factor, there may be an easier way
subset(StatTemps, gender == "Male")
subset(StatTemps, gender == "Female")

subset(StatTemps, class == "8 a.m.")
subset(StatTemps, class == "9 a.m.")

## on more than one variable
subset(StatTemps, class == "8 a.m." & gender == "Male")

## with a continuous variable
subset(StatTemps, temperature > 94)

## and we can do calculations by() groups
by(data = StatTemps$temperature, INDICES = StatTemps$gender, FUN = mean)
## but typing the name is annoying
with(StatTemps, by(data = temperature, INDICES = gender, FUN = mean))
## even more detailed (but leaving off the explicit argument names)
with(StatTemps, by(temperature, list(gender, class), mean))

## A couple visual summaries
boxplot(temperature ~ gender, data = StatTemps)
boxplot(temperature ~ class, data = StatTemps)
## or hop on over to lattice for something a little more advanced
library(lattice)
bwplot(temperature ~ gender | class, data = StatTemps)

## and you can select certain parts without subset()
## first let's see what happens with
StatTemps$gender == "Female"
## now if you pass a logical vector to the extraction operator, '['
StatTemps[StatTemps$gender == "Female", ]
## same thing but just the first column
StatTemps[StatTemps$gender == "Female", 1]
## That came out as a vector, but
StatTemps[StatTemps$gender == "Female", 1, drop = FALSE]
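
For readers without PASWR installed, the last few steps can be tried on a
small made-up data frame. This is only a sketch: 'temps' and its values are
hypothetical stand-ins for StatTemps, not the real data.

```r
# Hypothetical stand-in for StatTemps, so PASWR is not needed here
temps <- data.frame(
  temperature = c(97.1, 98.6, 99.0, 96.8),
  gender = factor(c("Male", "Female", "Female", "Male"))
)

keep <- temps$gender == "Female"     # logical vector, one entry per row
v  <- temps[keep, 1]                 # a plain numeric vector
df <- temps[keep, 1, drop = FALSE]   # still a one-column data.frame

str(v)
str(df)
```

drop = FALSE matters mostly in functions, where code that expects a
data.frame would otherwise fail when only one column is selected.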


HTH,

Josh


On Mon, Nov 29, 2010 at 5:01 AM, bfhancock brianfhanc...@gmail.com wrote:

 if you can load the PASWR package and pull up StatTemps you will see what I
 am talking about.  Otherwise I fear that my question will just be confusing.
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Array-help-tp3062992p3063535.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] Significance of the difference between two correlation coefficients

2010-11-29 Thread syrvn

Hi,

thanks a lot. that's what i tried to figure out!
it works great and is exactly what i need.

Best,
syrvn

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Significance-of-the-difference-between-two-correlation-coefficients-tp3063765p3063997.html
Sent from the R help mailing list archive at Nabble.com.



[R] how to use by() ?

2010-11-29 Thread Jim Moon
Hello, All!

How might one accomplish this using the by() function?
m1 is a data frame.

# populate column m1$major_allele
for ( i in 1:length(m1$major_allele)) {
  if ( m1$Freq1[i] == m1$MAF[i]){
m1$major_allele[i] = m1$Al1[i]
  }
  else{
 m1$major_allele[i] = m1$Al2[i]
  }
}


Jim



Re: [R] Issues with nnet.default for regression/classification

2010-11-29 Thread Jude Ryan
Hi Georg,

The documentation (?nnet) says that y should be a matrix or data frame, but in 
your case it is a vector. This is most likely the problem, if you do not have 
other data issues going on. Convert y to a matrix (or data frame) using 
'as.matrix' and see if this solves your problem. Library 'nnet' can do both 
classification and regression. I was able to replicate your problem, using an 
example from Modern Applied Statistics with S, Venables and Ripley, pages 246 
and 247), by turning y into a vector and verifying that all the predicted 
values are the same when y is a vector. This is not the case when y is part of 
a data frame. You can see this by running the code below. I tried about 4 
neural network packages in the past, including AMORE, but found 'nnet' to be 
the best for my needs.

Hope this helps.

Jude

# Neural Network model in Modern Applied Statistics with S, Venables and
# Ripley, pages 246 and 247
library(nnet)
attach(rock)
dim(rock)
area1 <- area/10000; peri1 <- peri/10000
rock1 <- data.frame(perm, area = area1, peri = peri1, shape)
dim(rock1)
head(rock1,15)
# skip = T
rock.nn <- nnet(log(perm) ~ area + peri + shape, rock1, size=3, decay=1e-3,
                linout=T, skip=T, maxit=1000, Hess=T)
rock1$actual <- log(perm)
rock1$predicted <- predict(rock.nn)
head(rock1,15)
summary(rock.nn)
sum((log(perm) - predict(rock.nn))^2)

y <- as.vector(log(rock1$perm))
head(rock1[,c(2:4)])
test.nn <- nnet(x=rock1[,c(2:4)], y=y, size=3, linout=T, maxit=1000)
head(predict(test.nn))

Georg wrote:


Hi,

I'm currently trying desperately to get the nnet function for training a
neural network (with one hidden layer) to perform a regression task.

So I run it like the following:

trainednet <- nnet(x=traindata, y=trainresponse, size = 30, linout = TRUE,
maxit=1000)

(where x is a matrix and y a numerical vector consisting of the target
values for one variable)

To see whether the network learnt anything at all, I checked the network
weights and those have definitely changed. However, when examining the
trainednet$fitted.values, those are all the same so it rather looks as if
the network is doing a classification. I can even set linout=FALSE and
then it outputs 1 (the class?) for each training example. The
trainednet$residuals are correct (difference between predicted/fitted
example and actual response), but rather useless.

The same happens if I run nnet with the formula/data.frame interface, btw.

As per the suggestion in the ?nnet page: If the response is not a factor,
it is passed on unchanged to 'nnet.default', I assume that the network is
doing regression since my trainresponse variable is a numerical vector and
_not_ a factor.

I'm currently lost and I can't see that the AMORE/neuralnet packages are
any better (moreover, they don't implement the formula/dataframe/predict
things). I've read the manpages of nnet and predict.nnet a gazillion
times, but I can't really find an answer there. I don't want to do
classification, but regression.

Thanks for any help.

Georg.
--
Research Assistant
Otto-von-Guericke-Universität Magdeburg
resea...@georgruss.de
http://research.georgruss.de


Jude Ryan
MarketShare Partners
1270 Avenue of the Americas, Suite # 2702
New York, NY 10020
http://www.marketsharepartners.com
Work: (646)-745-9916 ext: 222
Cell: (973)-943-2029




Re: [R] data.frame and formula classes of aggregate

2010-11-29 Thread Peter Ehlers

On 2010-11-29 06:35, David Freedman wrote:


Hi - I apologize for the 2nd post, but I think my question from a few weeks
ago may have been overlooked on a Friday afternoon.

I might be missing something very obvious, but is it widely known that the
aggregate function handles missing values differently depending if a data
frame or a formula is the first argument ?  For example,

(d <- data.frame(sex=rep(0:1,each=3),
wt=c(100,110,120,200,210,NA),ht=c(10,20,NA,30,40,50)))
x1 <- aggregate(d, by = list(d$sex), FUN = mean);
names(x1)[3:4] <- c('mean.dfcl.wt','mean.dfcl.ht')
x2 <- aggregate(cbind(wt,ht)~sex,FUN=mean,data=d);
names(x2)[2:3] <- c('mean.formcl.wt','mean.formcl.ht')
cbind(x1,x2)[,c(2,3,6,4,7)]

The output from the data.frame class has an NA if there are missing values
in the group for the variable with missing values.  But, the formula class
output seems to delete the entire row (missing and non-missing values) if
there are any NAs.  Wouldn't one expect that the 2 forms (data frame vs
formula) of aggregate would give the same result?



Wasn't there some discussion of this not long ago? Maybe I'm getting
senile. Anyway, as David W. points out, the defaults differ. Here's
how you can get the same result from both methods:

1. use na.action = na.pass in aggregate.formula;
   this will duplicate your x1 result.

2. use d <- d[complete.cases(d), ] in your x1 calculation;
   this will duplicate your x2 result.
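
Spelled out on David's data, the two options might look like the sketch
below. The object names (f.pass, df.cc, f.default) are only illustrative, and
the data.frame call is restricted to the wt and ht columns to keep the output
small.

```r
d <- data.frame(sex = rep(0:1, each = 3),
                wt  = c(100, 110, 120, 200, 210, NA),
                ht  = c(10, 20, NA, 30, 40, 50))

# 1. formula method keeping the NAs, to mimic the data.frame method
f.pass <- aggregate(cbind(wt, ht) ~ sex, data = d, FUN = mean,
                    na.action = na.pass)

# 2. data.frame method on complete rows only, to mimic the formula default
dc <- d[complete.cases(d), ]
df.cc <- aggregate(dc[, c("wt", "ht")], by = list(sex = dc$sex), FUN = mean)

# the formula method with its default na.action = na.omit, for comparison
f.default <- aggregate(cbind(wt, ht) ~ sex, data = d, FUN = mean)

f.pass
df.cc
f.default
```

A third route, if per-variable means ignoring NAs are wanted, is to pass
na.rm = TRUE through to mean() together with na.action = na.pass.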

Peter Ehlers


thanks very much
david freedman, atlanta








Re: [R] how to use by() ?

2010-11-29 Thread Greg Johnson
Jim Moon moonja at ohsu.edu writes:

 How might one accomplish this using the by() function?
 m1 is a data frame.
 
 # populate column m1$major_allele
 for ( i in 1:length(m1$major_allele)) {
   if ( m1$Freq1[i] == m1$MAF[i]){
 m1$major_allele[i] = m1$Al1[i]
   }
   else{
  m1$major_allele[i] = m1$Al2[i]
   }
 }

You could use:

m1$major_allele <- ifelse(  m1$Freq1 == m1$MAF, m1$Al1,  m1$Al2 )

Greg



Re: [R] Filling in missing time samples with na.approx

2010-11-29 Thread Jason Edgecombe

On 11/29/2010 10:00 AM, Gabor Grothendieck wrote:

On Mon, Nov 29, 2010 at 9:45 AM, Jason Edgecombe
ja...@rampaginggeek.com  wrote:
   

Hi Everyone,

I have a some data from a sports gps device like the following:

        time latitude longitude altitude  distance heartrate
1 1277648884 0.304048 -0.793819      260  0.000000        94
2 1277648885 0.304056 -0.793772      262  4.307615        95
3 1277648888 0.304060 -0.793696      263 11.262347        97
4 1277648894 0.304075 -0.793544      263 25.237911       103
5 1277648898 0.304085 -0.793455      263 33.322525       108
6 1277648902 0.304064 -0.793387      256 40.042988       115

As you can see, the samples have irregular holes in the time column. How can
I fill in the missing samples using na.approx?

I've tried to creating a blank series with no gaps and combine them, but
merge just adds columns and rbind compains about duplicate indexes.

P.S. My GPS still has holes in the data when I turn off smart recording :(

 

Try this:

Lines <- "time latitude longitude altitude  distance heartrate
1277648884 0.304048 -0.793819  260  0.000000    94
1277648885 0.304056 -0.793772  262  4.307615    95
1277648888 0.304060 -0.793696  263 11.262347    97
1277648894 0.304075 -0.793544  263 25.237911   103
1277648898 0.304085 -0.793455  263 33.322525   108
1277648902 0.304064 -0.793387  256 40.042988   115"

# read in data
library(zoo)
z <- read.zoo(textConnection(Lines), header = TRUE)

na.approx(z, xout = seq(min(time(z)), max(time(z))))



   

No change:
> na.approx(z, xout = seq(min(time(z)), max(time(z))))
           latitude longitude altitude  distance heartrate
1277648884 0.304048 -0.793819      260  0.000000        94
1277648885 0.304056 -0.793772      262  4.307615        95
1277648888 0.304060 -0.793696      263 11.262347        97
1277648894 0.304075 -0.793544      263 25.237911       103
1277648898 0.304085 -0.793455      263 33.322525       108
1277648902 0.304064 -0.793387      256 40.042988       115

There should be 19 samples after the na.approx.

I'm guessing that na.approx is what I need, but I'm open to suggestions.

Thanks,
Jason



[R] FW: how to use by() ?

2010-11-29 Thread Jim Moon
Thank you for the suggestion, Bill.  The result is not quite what I would like. 
 Here's sample code for you or anyone else who may be interested:

Al1 = c('A','C','C','C')
Al2 = c('G','G','G','T')
Freq1 = c(0.0078,0.0567,0.9434,0.9908)
MAF = c(0.0078,0.0567,0.0566,0.0092)
m1 = data.frame(Al1=Al1, Al2=Al2,Freq1=Freq1,MAF=MAF,major_allele='')  
m1

  Al1 Al2  Freq1    MAF major_allele
1   A   G 0.0078 0.0078
2   C   G 0.0567 0.0567
3   C   G 0.9434 0.0566
4   C   T 0.9908 0.0092


Using the suggestion involving with() (I swapped Al1 and Al2 from before, but 
this does not affect the nature of the output):

m1$major_allele <- with(m1, ifelse(Freq1==MAF, Al2, Al1));m1

  Al1 Al2  Freq1    MAF major_allele
1   A   G 0.0078 0.0078            1
2   C   G 0.0567 0.0567            1
3   C   G 0.9434 0.0566            2
4   C   T 0.9908 0.0092            2


The output I desire is:
  Al1 Al2  Freq1    MAF major_allele
1   A   G 0.0078 0.0078            G
2   C   G 0.0567 0.0567            G
3   C   G 0.9434 0.0566            C
4   C   T 0.9908 0.0092            C


Jim


-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com] 
Sent: Monday, November 29, 2010 10:02 AM
To: Jim Moon
Subject: RE: [R] how to use by() ?

m1$major_allele <- with(m1, ifelse(Freq1==MAF, Al1, Al2))

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Moon
 Sent: Monday, November 29, 2010 9:44 AM
 To: r-help@r-project.org
 Subject: [R] how to use by() ?
 
 Hello, All!
 
 How might one accomplish this using the by() function?
 m1 is a data frame.
 
 # populate column m1$major_allele
 for ( i in 1:length(m1$major_allele)) {
   if ( m1$Freq1[i] == m1$MAF[i]){
 m1$major_allele[i] = m1$Al1[i]
   }
   else{
  m1$major_allele[i] = m1$Al2[i]
   }
 }
 
 
 Jim
 
   [[alternative HTML version deleted]]
 


Re: [R] Issues with nnet.default for regression/classification

2010-11-29 Thread Georg Ruß
On 29/11/10 11:57:31, Jude Ryan wrote:
Hi Georg,


The documentation (?nnet) says that y should be a matrix or data frame,
but in your case it is a vector. This is most likely the problem, if
you do not have other data issues going on. Convert y to a matrix (or
data frame) using ‘as.matrix’ and see if this solves your problem.
Library ‘nnet’ can do both classification and regression. I was able to
replicate your problem, using an example from Modern Applied Statistics
with S, Venables and Ripley, pages 246 and 247), by turning y into a
vector and verifying that all the predicted values are the same when y
is a vector. This is not the case when y is part of a data frame. You
can see this by running the code below. I tried about 4 neural network
packages in the past, including AMORE, but found ‘nnet’ to be the best
for my needs.

Hi Jude,

thanks for the hint. I lately experimented both with the nnet(x,y, ...)
and the nnet(formula, dataframe ...) interfaces to nnet and both yielded
the same results. So changing the format of y from a vector to a matrix or
a data frame didn't change anything at all. However, what _did_ change the
outcome is to introduce the decay parameter (which I didn't have at all
before). By default it is set to 0 which doesn't seem appropriate in my
case. Setting it to decay=1e-3 magically turned my output into an
acceptable regression response instead of spitting out fixed values.

I really love the predict interface for regression in each of the models
I'm using. Clear code :-)

So, for the record, the call for nnet for the regression problem is as
follows:

net.fitted <- nnet(formula, data = sp...@data[-testset,], decay=1e-3, size = 
20, linout = TRUE)

(where sp...@data is the data part of a SpatialPointsDataFrame. And yes,
in selecting the [-testset,] data points I'm taking into account the existing
spatial autocorrelation.)
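
For anyone who wants to see the effect of decay on data they can actually
load, here is a rough sketch on the rock data from the earlier message. It is
not my real call: the formula, size = 3 and the seed are arbitrary choices,
and area and peri are divided by 10000, which I believe is what the MASS
example does. Results depend on the random starting weights, so no particular
output is claimed.

```r
library(nnet)   # recommended package shipped with R

# rescale area and peri (assumed to follow the MASS rock example)
rock1 <- with(rock, data.frame(perm, area = area/10000,
                               peri = peri/10000, shape))

set.seed(1)     # nnet starts from random weights
fit.decay <- nnet(log(perm) ~ area + peri + shape, data = rock1,
                  size = 3, decay = 1e-3, linout = TRUE,
                  maxit = 1000, trace = FALSE)

set.seed(1)
fit.nodecay <- nnet(log(perm) ~ area + peri + shape, data = rock1,
                    size = 3, linout = TRUE,      # decay defaults to 0
                    maxit = 1000, trace = FALSE)

# compare the spread of the fitted values; constant predictions would
# show up as a range of zero width
range(predict(fit.decay))
range(predict(fit.nodecay))
```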

# Neural Network model in Modern Applied Statistics with S, Venables
and Ripley, pages 246 and 247

Thanks for your help and the reference, I'm likely to order the book now
:-) Leaving out the decay parameter changes the fitted.values in the
rock example you mentioned as well, although not that much. Convergence
speed does change as expected, so the parameter is working. I guess my
problem is solved now, the rest is due to the specialties with my data
sets.

Georg.
-- 
Research Assistant
Otto-von-Guericke-Universität Magdeburg
resea...@georgruss.de
http://research.georgruss.de



Re: [R] how to use by() ?

2010-11-29 Thread Bert Gunter
... or slightly less verbose:

m1 <- within(m1, major_allele <- ifelse(Freq1 == MAF, Al1, Al2))

?within

Cheers,
Bert

On Mon, Nov 29, 2010 at 10:25 AM, Greg Johnson g...@nosnhoj.org wrote:
 Jim Moon moonja at ohsu.edu writes:

 How might one accomplish this using the by() function?
 m1 is a data frame.

 # populate column m1$major_allele
 for ( i in 1:length(m1$major_allele)) {
   if ( m1$Freq1[i] == m1$MAF[i]){
     m1$major_allele[i] = m1$Al1[i]
   }
   else{
      m1$major_allele[i] = m1$Al2[i]
   }
 }

 You could use:

 m1$major_allele <- ifelse(  m1$Freq1 == m1$MAF, m1$Al1,  m1$Al2 )

 Greg





-- 
Bert Gunter
Genentech Nonclinical Biostatistics



Re: [R] FW: how to use by() ?

2010-11-29 Thread William Dunlap
ifelse(cond,ifTrue,ifFalse) doesn't do what you
want when ifTrue or ifFalse is a factor.
You can use as.character on the factors
   > with(m1, ifelse(Freq1==MAF, as.character(Al2), as.character(Al1)))
   [1] "G" "G" "C" "C"
or use the stringsAsFactors=FALSE argument to
data.frame (or read.table) when you make the
data.frame.
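
A small self-contained sketch of both behaviours, using Jim's data. The
object names (m.fac, m.chr, codes) are only illustrative, and
stringsAsFactors is set explicitly in both calls so the result does not
depend on the default of whatever R version is in use.

```r
Al1   <- c("A", "C", "C", "C")
Al2   <- c("G", "G", "G", "T")
Freq1 <- c(0.0078, 0.0567, 0.9434, 0.9908)
MAF   <- c(0.0078, 0.0567, 0.0566, 0.0092)

# stringsAsFactors = TRUE reproduces the problem: ifelse() returns the
# underlying integer codes of the factors, not their labels
m.fac <- data.frame(Al1, Al2, Freq1, MAF, stringsAsFactors = TRUE)
codes <- with(m.fac, ifelse(Freq1 == MAF, Al2, Al1))

# keeping the alleles as character avoids the problem
m.chr <- data.frame(Al1, Al2, Freq1, MAF, stringsAsFactors = FALSE)
m.chr$major_allele <- with(m.chr, ifelse(Freq1 == MAF, Al2, Al1))

codes
m.chr
```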

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Moon
 Sent: Monday, November 29, 2010 10:37 AM
 To: r-help@r-project.org
 Subject: [R] FW: how to use by() ?
 
 Thank you for the suggestion, Bill.  The result is not quite 
 what I would like.  Here's sample code for you or anyone else 
 who may be interested:
 
 Al1 = c('A','C','C','C')
 Al2 = c('G','G','G','T')
 Freq1 = c(0.0078,0.0567,0.9434,0.9908)
 MAF = c(0.0078,0.0567,0.0566,0.0092)
 m1 = data.frame(Al1=Al1, 
 Al2=Al2,Freq1=Freq1,MAF=MAF,major_allele='')  
 m1
 
   Al1 Al2  Freq1    MAF major_allele
 1   A   G 0.0078 0.0078
 2   C   G 0.0567 0.0567
 3   C   G 0.9434 0.0566
 4   C   T 0.9908 0.0092
 
 
 Using the suggestion involving with() (I swapped Al1 and 
 Al2 from before, but this does not affect the nature of the output):
 
 m1$major_allele <- with(m1, ifelse(Freq1==MAF, Al2, Al1));m1
 
   Al1 Al2  Freq1    MAF major_allele
 1   A   G 0.0078 0.0078            1
 2   C   G 0.0567 0.0567            1
 3   C   G 0.9434 0.0566            2
 4   C   T 0.9908 0.0092            2
 
 
 The output I desire is:
   Al1 Al2  Freq1    MAF major_allele
 1   A   G 0.0078 0.0078            G
 2   C   G 0.0567 0.0567            G
 3   C   G 0.9434 0.0566            C
 4   C   T 0.9908 0.0092            C
 
 
 Jim
 
 
 -Original Message-
 From: William Dunlap [mailto:wdun...@tibco.com] 
 Sent: Monday, November 29, 2010 10:02 AM
 To: Jim Moon
 Subject: RE: [R] how to use by() ?
 
 m1$major_allele <- with(m1, ifelse(Freq1==MAF, Al1, Al2))
 
 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com  
 
  -Original Message-
  From: r-help-boun...@r-project.org 
  [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Moon
  Sent: Monday, November 29, 2010 9:44 AM
  To: r-help@r-project.org
  Subject: [R] how to use by() ?
  
  Hello, All!
  
  How might one accomplish this using the by() function?
  m1 is a data frame.
  
  # populate column m1$major_allele
  for ( i in 1:length(m1$major_allele)) {
if ( m1$Freq1[i] == m1$MAF[i]){
  m1$major_allele[i] = m1$Al1[i]
}
else{
   m1$major_allele[i] = m1$Al2[i]
}
  }
  
  
  Jim
  
  [[alternative HTML version deleted]]
  


Re: [R] how to use by() ?

2010-11-29 Thread Jim Moon
Jim Moon moonja at ohsu.edu writes:

 How might one accomplish this using the by() function?
 m1 is a data frame.

 # populate column m1$major_allele
 for ( i in 1:length(m1$major_allele)) {
   if ( m1$Freq1[i] == m1$MAF[i]){
 m1$major_allele[i] = m1$Al1[i]
   }
   else{
  m1$major_allele[i] = m1$Al2[i]
   }
 }

You could use:

m1$major_allele <- ifelse(  m1$Freq1 == m1$MAF, m1$Al1,  m1$Al2 )

Greg
---

Thank you for the suggestion, Greg.  The result is not quite what I would like.
Here's sample code for you or anyone else who may be interested:

Al1 = c('A','C','C','C')
Al2 = c('G','G','G','T')
Freq1 = c(0.0078,0.0567,0.9434,0.9908)
MAF = c(0.0078,0.0567,0.0566,0.0092)
m1 = data.frame(Al1=Al1, Al2=Al2,Freq1=Freq1,MAF=MAF,major_allele='')
m1

  Al1 Al2  Freq1    MAF major_allele
1   A   G 0.0078 0.0078
2   C   G 0.0567 0.0567
3   C   G 0.9434 0.0566
4   C   T 0.9908 0.0092


Using the suggestion involving ifelse (I swapped Al1 and Al2 from before, but
this does not affect the nature of the output):

m1$major_allele <- ifelse(  m1$Freq1 == m1$MAF, m1$Al2,  m1$Al1 );m1

  Al1 Al2  Freq1    MAF major_allele
1   A   G 0.0078 0.0078            1
2   C   G 0.0567 0.0567            1
3   C   G 0.9434 0.0566            2
4   C   T 0.9908 0.0092            2


The output I desire is:

  Al1 Al2  Freq1    MAF major_allele
1   A   G 0.0078 0.0078            G
2   C   G 0.0567 0.0567            G
3   C   G 0.9434 0.0566            C
4   C   T 0.9908 0.0092            C


Jim




Re: [R] Filling in missing time samples with na.approx

2010-11-29 Thread Gabor Grothendieck
On Mon, Nov 29, 2010 at 1:33 PM, Jason Edgecombe
ja...@rampaginggeek.com wrote:
 On 11/29/2010 10:00 AM, Gabor Grothendieck wrote:

 On Mon, Nov 29, 2010 at 9:45 AM, Jason Edgecombe
 ja...@rampaginggeek.com  wrote:


 Hi Everyone,

 I have some data from a sports gps device like the following:

         time latitude longitude altitude  distance heartrate
 1 1277648884 0.304048 -0.793819      260  0.000000        94
 2 1277648885 0.304056 -0.793772      262  4.307615        95
 3 1277648888 0.304060 -0.793696      263 11.262347        97
 4 1277648894 0.304075 -0.793544      263 25.237911       103
 5 1277648898 0.304085 -0.793455      263 33.322525       108
 6 1277648902 0.304064 -0.793387      256 40.042988       115

 As you can see, the samples have irregular holes in the time column. How
 can
 I fill in the missing samples using na.approx?

 I've tried to creating a blank series with no gaps and combine them, but
 merge just adds columns and rbind compains about duplicate indexes.

 P.S. My GPS still has holes in the data when I turn off smart recording
 :(



 Try this:

 Lines <- "time latitude longitude altitude  distance heartrate
 1277648884 0.304048 -0.793819      260  0.000000        94
 1277648885 0.304056 -0.793772      262  4.307615        95
 1277648888 0.304060 -0.793696      263 11.262347        97
 1277648894 0.304075 -0.793544      263 25.237911       103
 1277648898 0.304085 -0.793455      263 33.322525       108
 1277648902 0.304064 -0.793387      256 40.042988       115"

 # read in data
 library(zoo)
 z <- read.zoo(textConnection(Lines), header = TRUE)

 na.approx(z, xout = seq(min(time(z)), max(time(z))))





 No change:
 > na.approx(z, xout = seq(min(time(z)), max(time(z))))
            latitude longitude altitude  distance heartrate
 1277648884 0.304048 -0.793819      260  0.000000        94
 1277648885 0.304056 -0.793772      262  4.307615        95
 1277648888 0.304060 -0.793696      263 11.262347        97
 1277648894 0.304075 -0.793544      263 25.237911       103
 1277648898 0.304085 -0.793455      263 33.322525       108
 1277648902 0.304064 -0.793387      256 40.042988       115


It works for me.

> Lines <- "time latitude longitude altitude  distance heartrate
+ 1277648884 0.304048 -0.793819  260  0.000000    94
+ 1277648885 0.304056 -0.793772  262  4.307615    95
+ 1277648888 0.304060 -0.793696  263 11.262347    97
+ 1277648894 0.304075 -0.793544  263 25.237911   103
+ 1277648898 0.304085 -0.793455  263 33.322525   108
+ 1277648902 0.304064 -0.793387  256 40.042988   115"
>
> # read in data
> library(zoo)
> z <- read.zoo(textConnection(Lines), header = TRUE)
>
> na.approx(z, xout = seq(min(time(z)), max(time(z))))
            latitude  longitude altitude  distance heartrate
1277648884 0.3040480 -0.7938190 260.0000  0.000000  94.00000
1277648885 0.3040560 -0.7937720 262.0000  4.307615  95.00000
1277648886 0.3040573 -0.7937467 262.3333  6.625859  95.66667
1277648887 0.3040587 -0.7937213 262.6667  8.944103  96.33333
1277648888 0.3040600 -0.7936960 263.0000 11.262347  97.00000
1277648889 0.3040625 -0.7936707 263.0000 13.591608  98.00000
1277648890 0.3040650 -0.7936453 263.0000 15.920868  99.00000
1277648891 0.3040675 -0.7936200 263.0000 18.250129 100.00000
1277648892 0.3040700 -0.7935947 263.0000 20.579390 101.00000
1277648893 0.3040725 -0.7935693 263.0000 22.908650 102.00000
1277648894 0.3040750 -0.7935440 263.0000 25.237911 103.00000
1277648895 0.3040775 -0.7935218 263.0000 27.259065 104.25000
1277648896 0.3040800 -0.7934995 263.0000 29.280218 105.50000
1277648897 0.3040825 -0.7934773 263.0000 31.301371 106.75000
1277648898 0.3040850 -0.7934550 263.0000 33.322525 108.00000
1277648899 0.3040797 -0.7934380 261.2500 35.002641 109.75000
1277648900 0.3040745 -0.7934210 259.5000 36.682756 111.50000
1277648901 0.3040693 -0.7934040 257.7500 38.362872 113.25000
1277648902 0.3040640 -0.7933870 256.0000 40.042988 115.00000
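
If na.approx() with xout does not behave this way on your installation, the
same linear interpolation can be sketched with approx() from base R. This is a
swapped-in technique, not the zoo call itself; only two of the columns are
shown, the object names (secs, full, filled) are illustrative, and the third
time stamp is taken to be 1277648888, which is what the spacing of the
interpolated rows implies.

```r
secs      <- c(1277648884, 1277648885, 1277648888,
               1277648894, 1277648898, 1277648902)
heartrate <- c(94, 95, 97, 103, 108, 115)
altitude  <- c(260, 262, 263, 263, 263, 256)

full <- seq(min(secs), max(secs))          # every second, no gaps

# approx() interpolates linearly between the observed points
filled <- data.frame(
  time      = full,
  heartrate = approx(secs, heartrate, xout = full)$y,
  altitude  = approx(secs, altitude,  xout = full)$y
)
filled
```

The result has one row per second between the first and last sample, 19 rows
for this data.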



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Issues with nnet.default for regression/classification

2010-11-29 Thread Jude Ryan
Good to know that you solved your problem. I did not realize that the default 
decay parameter = 0 was the cause of the problem. Since I have the MASS book, I 
was always setting this parameter, in my own work, as indicated in the book, 
and had no reason to change it. This is probably the first time I have left 
this parameter out! I am not sure that the effect of leaving out the decay 
parameter is documented anywhere. I will have to dig out the book and check, 
but the book is rather terse and to the point and it would not surprise me if 
there is no mention of when to override the default of decay = 0.

Jude Ryan
MarketShare Partners
1270 Avenue of the Americas, Suite # 2702
New York, NY 10020
http://www.marketsharepartners.com
Work: (646)-745-9916 ext: 222
Cell: (973)-943-2029


-Original Message-
From: Georg Ruß [mailto:resea...@georgruss.de] 
Sent: Monday, November 29, 2010 10:37 AM
To: Jude Ryan; R-help@r-project.org
Subject: Re: [R] Issues with nnet.default for regression/classification

On 29/11/10 11:57:31, Jude Ryan wrote:
Hi Georg,


The documentation (?nnet) says that y should be a matrix or data frame,
but in your case it is a vector. This is most likely the problem, if
you do not have other data issues going on. Convert y to a matrix (or
data frame) using ‘as.matrix’ and see if this solves your problem.
Library ‘nnet’ can do both classification and regression. I was able to
replicate your problem, using an example from Modern Applied Statistics
with S, Venables and Ripley, pages 246 and 247), by turning y into a
vector and verifying that all the predicted values are the same when y
is a vector. This is not the case when y is part of a data frame. You
can see this by running the code below. I tried about 4 neural network
packages in the past, including AMORE, but found ‘nnet’ to be the best
for my needs.

Hi Jude,

thanks for the hint. I lately experimented both with the nnet(x,y, ...)
and the nnet(formula, dataframe ...) interfaces to nnet and both yielded
the same results. So changing the format of y from a vector to a matrix or
a data frame didn't change anything at all. However, what _did_ change the
outcome is to introduce the decay parameter (which I didn't have at all
before). By default it is set to 0 which doesn't seem appropriate in my
case. Setting it to decay=1e-3 magically turned my output into an
acceptable regression response instead of spitting out fixed values.

I really love the predict interface for regression in each of the models
I'm using. Clear code :-)

So, for the record, the call for nnet for the regression problem is as
follows:

net.fitted <- nnet(formula, data = sp...@data[-testset,], decay=1e-3, size = 
20, linout = TRUE)

(where sp...@data is the data part of a SpatialPointsDataFrame. And yes,
in selecting the [-testset,] data points I'm taking into account the existing
spatial autocorrelation.)

# Neural Network model in Modern Applied Statistics with S, Venables
and Ripley, pages 246 and 247

Thanks for your help and the reference, I'm likely to order the book now
:-) Leaving out the decay parameter changes the fitted.values in the
rock example you mentioned as well, although not that much. Convergence
speed does change as expected, so the parameter is working. I guess my
problem is solved now; the rest is due to the peculiarities of my data
sets.

Georg.
-- 
Research Assistant
Otto-von-Guericke-Universität Magdeburg
resea...@georgruss.de
http://research.georgruss.de
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] FW: how to use by() ?

2010-11-29 Thread David Winsemius


On Nov 29, 2010, at 1:36 PM, Jim Moon wrote:

Thank you for the suggestion, Bill.  The result is not quite what I  
would like.  Here's sample code for you or anyone else who may be  
interested:


Al1 = c('A','C','C','C')
Al2 = c('G','G','G','T')
Freq1 = c(0.0078,0.0567,0.9434,0.9908)
MAF = c(0.0078,0.0567,0.0566,0.0092)
m1 = data.frame(Al1=Al1, Al2=Al2,Freq1=Freq1,MAF=MAF,major_allele='')
m1

  Al1 Al2  Freq1    MAF major_allele
1   A   G 0.0078 0.0078
2   C   G 0.0567 0.0567
3   C   G 0.9434 0.0566
4   C   T 0.9908 0.0092


Using the suggestion involving with() (I swapped Al1 and Al2 from  
before, but this does not affect the nature of the output):


m1$major_allele <- with(m1, ifelse(Freq1==MAF, Al2, Al1));m1

  Al1 Al2  Freq1    MAF major_allele


I suspect that you have just been bitten by the data.frame- 
stringsAsFactors=TRUE crocodile. Since you are comparing floating  
point numbers, you are also wading in rivers where floating-point  
crocodiles are also hungry and searching out their next victim.


--
David.



1   A   G 0.0078 0.0078            1
2   C   G 0.0567 0.0567            1
3   C   G 0.9434 0.0566            2
4   C   T 0.9908 0.0092            2


The output I desire is:
  Al1 Al2  Freq1    MAF major_allele
1   A   G 0.0078 0.0078            G
2   C   G 0.0567 0.0567            G
3   C   G 0.9434 0.0566            C
4   C   T 0.9908 0.0092            C


Jim


-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com]
Sent: Monday, November 29, 2010 10:02 AM
To: Jim Moon
Subject: RE: [R] how to use by() ?

m1$major_allele <- with(m1, ifelse(Freq1==MAF, Al1, Al2))

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Jim Moon
Sent: Monday, November 29, 2010 9:44 AM
To: r-help@r-project.org
Subject: [R] how to use by() ?

Hello, All!

How might one accomplish this using the by() function?
m1 is a data frame.

# populate column m1$major_allele
for ( i in 1:length(m1$major_allele)) {
 if ( m1$Freq1[i] == m1$MAF[i]){
   m1$major_allele[i] = m1$Al1[i]
 }
 else{
m1$major_allele[i] = m1$Al2[i]
 }
}


Jim



David Winsemius, MD
West Hartford, CT



Re: [R] FW: how to use by() ?

2010-11-29 Thread Jim Moon
Thank you, Bill.  That fixed it.

Jim


-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com] 
Sent: Monday, November 29, 2010 10:46 AM
To: Jim Moon; r-help@r-project.org
Subject: RE: [R] FW: how to use by() ?

ifelse(cond,ifTrue,ifFalse) doesn't do what you
want when ifTrue or ifFalse is a factor: the result
carries the factor's integer codes, not its labels.
You can use as.character on the factors
   > with(m1, ifelse(Freq1==MAF, as.character(Al2), as.character(Al1)))
   [1] "G" "G" "C" "C"
or use the stringsAsFactors=FALSE argument to
data.frame (or read.table) when you make the
data.frame.
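
A minimal sketch of the second option, combined with a tolerance-based comparison to sidestep the floating-point issue raised elsewhere in this thread. The tolerance 1e-8 and the helper name `same` are assumptions chosen for illustration.

```r
# Build the data frame with character columns instead of factors
m1 <- data.frame(Al1   = c("A", "C", "C", "C"),
                 Al2   = c("G", "G", "G", "T"),
                 Freq1 = c(0.0078, 0.0567, 0.9434, 0.9908),
                 MAF   = c(0.0078, 0.0567, 0.0566, 0.0092),
                 stringsAsFactors = FALSE)

# Compare doubles with a tolerance rather than with ==
same <- with(m1, abs(Freq1 - MAF) < 1e-8)

# With character columns, ifelse() returns the allele labels themselves
m1$major_allele <- with(m1, ifelse(same, Al2, Al1))
m1
```

Exact equality happens to work for these four rows because both columns were typed in as the same literals; the tolerance matters once the frequencies come out of a calculation.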

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  



Re: [R] FW: how to use by() ?

2010-11-29 Thread Jim Moon
Well-phrased, David.  :-)

Jim


-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Monday, November 29, 2010 10:53 AM
To: Jim Moon
Cc: r-help@r-project.org
Subject: Re: [R] FW: how to use by() ?




Re: [R] data.frame and formula classes of aggregate

2010-11-29 Thread David Freedman

Thanks for the information.  

There was a discussion of different results obtained with the formula and
data.frame methods for a paired t-test -- there are many threads, but one is
at 
http://r.789695.n4.nabble.com/Paired-t-tests-td2325956.html#a2326291

david freedman
-- 
View this message in context: 
http://r.789695.n4.nabble.com/data-frame-and-formula-classes-of-aggregate-tp3063668p3064177.html
Sent from the R help mailing list archive at Nabble.com.



[R] drop levels problem

2010-11-29 Thread Felipe Carrillo
Hi all:
I am having trouble dropping levels; I got a few hints online without success.
Please consider the dataset below.
I was under the impression that subset(..drop=TRUE) would work, but it
doesn't.

library(ggplot2)
library(Hmisc)

x <- structure(list(first = c(38.2086, 43.1768, 43.146, 41.8044, 42.4232,
46.3646, 38.0813, 40.0745, 40.4889, 38.6246, 40.2826, 41.6056,
34.5353, 40.0768), second = c(43.3295, 42.4326, 38.8994, 37.0894,
42.3218, 46.1726, 39.1206, 41.2072, 42.4874, 40.2657, 38.7766,
40.8822, 42.0165, 49.2055), third = c(42.24, 42.992, 37.7419,
42.3448, 41.9131, 44.385, 42.7811, 44.1963, 40.8088, 43.9634,
38.7079, 38.0791, 44.3136, 39.5333)), .Names = c("first", "second",
"third"), class = "data.frame", row.names = c(NA, -14L))

head(x);str(x)
xmelt <- melt(x)
names(xmelt) <- c("year","fatPerc")

# Year variable is a factor with three levels
# Subset to plot only 'first' year
firstyear <- subset(xmelt,year=='first');str(firstyear)
# Plot showing three levels still after I made the subset
ggplot(firstyear,aes(year,fatPerc)) + geom_boxplot() + geom_jitter()

# Try to drop the levels but dropUnusedLevels() doesn't seem to work here
dropUnusedLevels()
ggplot(firstyear,aes(year,fatPerc)) + geom_boxplot() + geom_jitter()

# code below also should drop levels but it doesn't
#data.frame(lapply(firstyear, function(x) if (is.factor(x)){ factor(x)} else{x}))
str(firstyear)
 
Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA






Re: [R] Filling in missing time samples with na.approx

2010-11-29 Thread Felipe Carrillo
Strange, I don't see any change either. Could it be that we have an older
version of zoo?
 
Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA



- Original Message 
 From: Gabor Grothendieck ggrothendi...@gmail.com
 To: Jason Edgecombe ja...@rampaginggeek.com
 Cc: r-h...@stat.math.ethz.ch
 Sent: Mon, November 29, 2010 10:51:07 AM
 Subject: Re: [R] Filling in missing time samples with na.approx
 
 On Mon, Nov 29, 2010 at 1:33 PM, Jason Edgecombe
 ja...@rampaginggeek.com wrote:
  On 11/29/2010 10:00 AM, Gabor Grothendieck wrote:
 
  On Mon, Nov 29, 2010 at 9:45 AM, Jason Edgecombe
  ja...@rampaginggeek.com  wrote:
 
 
  Hi Everyone,
 
  I have a some data from a sports gps device like the following:
 
         time latitude longitude altitude  distance heartrate
  1 1277648884 0.304048 -0.793819      260  0.000000        94
  2 1277648885 0.304056 -0.793772      262  4.307615        95
  3 1277648888 0.304060 -0.793696      263 11.262347        97
  4 1277648894 0.304075 -0.793544      263 25.237911       103
  5 1277648898 0.304085 -0.793455      263 33.322525       108
  6 1277648902 0.304064 -0.793387      256 40.042988       115
 
  As you can see, the samples have irregular holes in the time column. How
  can
  I fill in the missing samples using na.approx?
 
  I've tried to creating a blank series with no gaps and combine them, but
  merge just adds columns and rbind compains about duplicate indexes.
 
  P.S. My GPS still has holes in the data when I turn off smart recording
  :(
 
 
 
  Try this:
 
  Lines <- "time latitude longitude altitude  distance heartrate
  1277648884 0.304048 -0.793819      260  0.000000        94
  1277648885 0.304056 -0.793772      262  4.307615        95
  1277648888 0.304060 -0.793696      263 11.262347        97
  1277648894 0.304075 -0.793544      263 25.237911       103
  1277648898 0.304085 -0.793455      263 33.322525       108
  1277648902 0.304064 -0.793387      256 40.042988       115"

  # read in data
  library(zoo)
  z <- read.zoo(textConnection(Lines), header = TRUE)

  na.approx(z, xout = seq(min(time(z)), max(time(z))))
 
 
 
 
 
  No change:
  na.approx(z, xout = seq(min(time(z)), max(time(z))))
             latitude longitude altitude  distance heartrate
  1277648884 0.304048 -0.793819      260  0.000000        94
  1277648885 0.304056 -0.793772      262  4.307615        95
  1277648888 0.304060 -0.793696      263 11.262347        97
  1277648894 0.304075 -0.793544      263 25.237911       103
  1277648898 0.304085 -0.793455      263 33.322525       108
  1277648902 0.304064 -0.793387      256 40.042988       115
 
 
 It works for me.
 
 > Lines <- "time latitude longitude altitude  distance heartrate
 + 1277648884 0.304048 -0.793819      260  0.000000        94
 + 1277648885 0.304056 -0.793772      262  4.307615        95
 + 1277648888 0.304060 -0.793696      263 11.262347        97
 + 1277648894 0.304075 -0.793544      263 25.237911      103
 + 1277648898 0.304085 -0.793455      263 33.322525      108
 + 1277648902 0.304064 -0.793387      256 40.042988      115"
 >
 > # read in data
 > library(zoo)
 > z <- read.zoo(textConnection(Lines), header = TRUE)
 >
 > na.approx(z, xout = seq(min(time(z)), max(time(z))))
             latitude  longitude altitude  distance heartrate
 1277648884 0.3040480 -0.7938190 260.0000  0.000000  94.00000
 1277648885 0.3040560 -0.7937720 262.0000  4.307615  95.00000
 1277648886 0.3040573 -0.7937467 262.3333  6.625859  95.66667
 1277648887 0.3040587 -0.7937213 262.6667  8.944103  96.33333
 1277648888 0.3040600 -0.7936960 263.0000 11.262347  97.00000
 1277648889 0.3040625 -0.7936707 263.0000 13.591608  98.00000
 1277648890 0.3040650 -0.7936453 263.0000 15.920868  99.00000
 1277648891 0.3040675 -0.7936200 263.0000 18.250129 100.00000
 1277648892 0.3040700 -0.7935947 263.0000 20.579390 101.00000
 1277648893 0.3040725 -0.7935693 263.0000 22.908650 102.00000
 1277648894 0.3040750 -0.7935440 263.0000 25.237911 103.00000
 1277648895 0.3040775 -0.7935218 263.0000 27.259065 104.25000
 1277648896 0.3040800 -0.7934995 263.0000 29.280218 105.50000
 1277648897 0.3040825 -0.7934773 263.0000 31.301371 106.75000
 1277648898 0.3040850 -0.7934550 263.0000 33.322525 108.00000
 1277648899 0.3040797 -0.7934380 261.2500 35.002641 109.75000
 1277648900 0.3040745 -0.7934210 259.5000 36.682756 111.50000
 1277648901 0.3040693 -0.7934040 257.7500 38.362872 113.25000
 1277648902 0.3040640 -0.7933870 256.0000 40.042988 115.00000
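
 For anyone whose zoo version does not act on xout, a possible alternative is to merge the series with an empty zoo object on a regular grid and then interpolate. This is only a sketch on a cut-down, single-column version of the data; the object names `times`, `grid` and `filled` are made up for illustration.

```r
library(zoo)

# First four time stamps and heart rates from the example, as a univariate series
times <- c(1277648884, 1277648885, 1277648888, 1277648894)
z <- zoo(c(94, 95, 97, 103), order.by = times)

# Zero-column zoo object on a regular 1-second grid
grid <- zoo(, seq(start(z), end(z), by = 1))

# merge() introduces NA at the new time points; na.approx() fills them linearly
filled <- na.approx(merge(z, grid))
filled
```

 This also addresses the original complaint that merge() only adds columns: merging with a zero-column object adds rows (time points) instead.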
 
 
 
 -- 
 Statistics & Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com
 

Re: [R] Filling in missing time samples with na.approx

2010-11-29 Thread Gabor Grothendieck
On Mon, Nov 29, 2010 at 2:08 PM, Felipe Carrillo
mazatlanmex...@yahoo.com wrote:
 strange,,I don't see any change either, could it be that we have an older
 version of zoo?


I am using the most recent one on CRAN, which is zoo 1.6-4:

> packageDescription("zoo")$Version
[1] "1.6-4"

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] drop levels problem

2010-11-29 Thread Joshua Wiley
Hi Felipe,

On Mon, Nov 29, 2010 at 11:01 AM, Felipe Carrillo
mazatlanmex...@yahoo.com wrote:
 Hi all:
 I am having trouble dropping levels, got a few hints online without success.
 Please consider the dataset below:
  I was under the inpression that subset(..drop=TRUE) would work but it
 doesn't

Here drop is referring to:

data.frame(1:10)[, 1]
data.frame(1:10)[, 1, drop = FALSE]

not to levels of a factor.


 library(ggplot2)
 library(Hmisc)

 x <- structure(list(first = c(38.2086, 43.1768, 43.146, 41.8044, 42.4232,
 46.3646, 38.0813, 40.0745, 40.4889, 38.6246, 40.2826, 41.6056,
 34.5353, 40.0768), second = c(43.3295, 42.4326, 38.8994, 37.0894,
 42.3218, 46.1726, 39.1206, 41.2072, 42.4874, 40.2657, 38.7766,
 40.8822, 42.0165, 49.2055), third = c(42.24, 42.992, 37.7419,
 42.3448, 41.9131, 44.385, 42.7811, 44.1963, 40.8088, 43.9634,
 38.7079, 38.0791, 44.3136, 39.5333)), .Names = c("first", "second",
 "third"), class = "data.frame", row.names = c(NA, -14L))

Thanks for the nice example!


 head(x);str(x)
 xmelt <- melt(x)
 names(xmelt) <- c("year","fatPerc")

 # Year variable is a factor with three levels
 # Subset to plot only 'first' year
 firstyear <- subset(xmelt,year=='first');str(firstyear)
 # Plot showing three levels still after I made the subset
 ggplot(firstyear,aes(year,fatPerc)) + geom_boxplot() + geom_jitter()

right, because it is possible to have levels of a factor that have no
observations---sometimes these are the most interesting (e.g., if you
subset by smoking and found that there were no instances of lung
cancer in non-smokers (not that extreme, but you get the point)).


 # Try to drop the levels but dropUnusedLevels() doesn't seem to work here
   dropUnusedLevels()

sorry, I have had some difficulty installing Hmisc on my linux system
and never gotten around to working it out.

 ggplot(firstyear,aes(year,fatPerc)) + geom_boxplot() + geom_jitter()

 # code below also should drop levels but it doesn't
 #data.frame(lapply(firstyear, function(x) if (is.factor(x)){ factor(x)}
 else{x}))

it would if you assigned it back to firstyear.  You do it, and then
just print to screen and the changed data goes off to oblivion.

firstyear <- data.frame(lapply(firstyear, function(x) if(is.factor(x))
{factor(x)} else {x}))
str(firstyear) # should now just have one level

Cheers,

Josh

 str(firstyear)



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] drop levels problem

2010-11-29 Thread Henrique Dallazuanna
Take a look at the droplevels() function (R >= 2.12).
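
A minimal sketch of what that looks like, using a small made-up data frame `d` (an assumption for illustration) in place of the melted data from the original post:

```r
# Toy data: a three-level factor plus a numeric column
d <- data.frame(year    = factor(rep(c("first", "second", "third"), each = 2)),
                fatPerc = c(38.2, 43.2, 43.3, 42.4, 42.2, 43.0))

firstyear <- subset(d, year == "first")
levels(firstyear$year)      # all three levels are still attached

# droplevels() removes unused levels from every factor in the data frame;
# the result has to be assigned back
firstyear <- droplevels(firstyear)
levels(firstyear$year)      # only "first" remains
```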


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] drop levels problem

2010-11-29 Thread Felipe Carrillo
Thanks Joshua, I get it now, levels sometimes drive me loco
 
Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA





Re: [R] Subset by using multiple values

2010-11-29 Thread clangkamp

Hi
I would like to extend this item to the following:
I have the following table

  X1   X2   X3 value
1   BVEq AGR 11412 954.75
2 CA_Tot AGR 11412 970.59
...
> str(DC2_m)
'data.frame':   104160 obs. of  4 variables:
 $ X1   : Factor w/ 62 levels "BVEq","CA_Tot",..: 1 2 3 4 5 6 45 46 47 48 ...
  ..- attr(*, "names")= chr  "Figure.1" "Figure.995" "Figure.17873" "Figure.17874" ...
 $ X2   : Factor w/ 48 levels "AGR","AKZ","ALB",..: 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "names")= chr  "1" "1" "1" "1" ...
 $ X3   : int  11412 11412 11412 11412 11412 11412 11412 11412 11412 11412 ...
 $ value: num  955 971 NA NA NA ...

And I have a second (manual) table with entries of combinations of X2 and X3
which I want to exclude:
> str(Exclude_Data)
'data.frame':   8 obs. of  2 variables:
 $ Code : Factor w/ 5 levels "ALB","ALQ","BAY",..: 3 3 2 4 5 3 1 2
 $ Dates: int  12052 12233 12508 11960 13056 12142 12691 12783

subset(DC2_m, cbind(X2,X3) %in% Exclude_Data[])

Now the trick is to exclude precisely the combinations chosen, and not
all combinations of Exclude_Data[1] and Exclude_Data[2], which is what
happens when using two conditions: X2 in ED[1] AND X3 in ED[2].
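
One possible approach, sketched on small stand-in data frames (the values below are invented; only the column names follow the post): build a single key per row from the two columns and match on that key, so that only the listed pairs are excluded.

```r
# Stand-ins for DC2_m and Exclude_Data with made-up values
DC2_m <- data.frame(X2 = c("AGR", "BAY", "BAY", "ALB"),
                    X3 = c(11412L, 12052L, 11412L, 12691L),
                    value = 1:4)
Exclude_Data <- data.frame(Code  = c("BAY", "ALB"),
                           Dates = c(12052L, 12691L))

# One key per row: "<code> <date>"
key_main <- paste(DC2_m$X2, DC2_m$X3)
key_excl <- paste(Exclude_Data$Code, Exclude_Data$Dates)

# Keep rows whose (X2, X3) pair is NOT among the excluded pairs
kept <- DC2_m[!(key_main %in% key_excl), ]
kept
```

Note that the row with BAY and 11412 survives, because only the exact pair (BAY, 12052) is listed for exclusion.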

Any takers ? Thanks in advance
Christian
-- 
View this message in context: 
http://r.789695.n4.nabble.com/R-Subset-by-using-multiple-values-tp815278p3064226.html
Sent from the R help mailing list archive at Nabble.com.



[R] map() and pdf clipping

2010-11-29 Thread Ben Tupper

Hello,

Below is a function (test.map) that draws the same map on any of
three different devices.  The pdf device doesn't clip polygons to
the plot region, whereas both the native device (in my case
Quartz) and the png device do.


test.map("pdf")  # produces test-map.pdf with no clipping
test.map("png")  # produces test-map.png with clipping
test.map(NA)     # draws on the window device with clipping

It doesn't appear to matter what the value of the fill argument is -  
the pdf output shows that the polygons are not being clipped to the  
plot region.


I have viewed the pdf output using Mac OSX's Preview, PDFPen, Adobe  
Reader and Safari and they all render the same way. So my hunch is  
that it is not a viewer issue (although I suppose they might be using  
the same rendering engine under the hood.)


Any help would be greatly appreciated.

Thanks and cheers,
Ben

## BEGIN
library(maps)

test.map <- function(to.file = c("pdf", "png", NA)[1], fill = TRUE){

   # open the requested file device, if any
   if (!is.na(to.file)){
      ofile = paste("test-map", to.file, sep = ".")
      do.call(to.file, list(file=ofile))
   }
   xr <- c(-185, -155)
   yr <- c(45, 70)

   map(xlim = xr, ylim = yr)
   map.axes()

   m <- matrix(seq(0, 1, length = 40*40), nrow = 40)
   mr <- as.raster(m)
   rasterImage(m, -180, 50, -160, 65)

   map(xlim = xr, ylim = yr, fill = fill, add = TRUE)

   # close the file device and report the file name
   if (!is.na(to.file)){
      cat("wrote:", ofile, "\n")
      dev.state <- dev.off()
   }

}


## END


 sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] mapproj_1.1-8.2    akima_0.5-4        RColorBrewer_1.0-2
[4] mapdata_2.1-3      maps_2.1-5

loaded via a namespace (and not attached):
[1] tools_2.12.0



Ben Tupper
Bigelow Laboratory for Ocean Sciences
180 McKown Point Rd. P.O. Box 475
West Boothbay Harbor, Maine   04575-0475
http://www.bigelow.org/



Re: [R] drop levels problem

2010-11-29 Thread Joshua Wiley
Just to follow up on my own post a bit:

xmelt$year[xmelt$year == "first", drop = TRUE]

will do what you want.  I think because in the subset there are
multiple columns not all of which are factor, the method for '[' being
used is not the factor one that would drop unused levels.  I did not
make that clear at all the first time around (and probably still
butchered it, which some knowledgeable soul may correct me on).  Also
I did get Hmisc installed, but I think dropUnusedLevels() does not
work in this case for a similar reason.

Henrique's solution is, as usual, the shortest :)

Josh

[snip]



[R] accuracy of GLM dispersion parameters

2010-11-29 Thread Timothy_Handley

I'm confused as to the trustworthiness of the dispersion parameters
reported by glm. Any help or advice would be greatly appreciated.

Context: I'm interested in using a fitted GLM to make some predictions.
Along with the predicted values, I'd also like to have estimates of
variance for each of those predictions. For a Gamma-family model, I believe
this can be done as Var[y] = dispersion parameter * predicted value ^ 2.
Thus, I'm interested in knowing the dispersion parameter for this fitted
model.

Specifics: The summary() function says that my fitted GLM has a dispersion
parameter of 15.8. On the other hand, the gamma.dispersion() function (MASS)
reports a dispersion parameter of 1.86. I could understand
some modest difference, as the help for gamma.shape() says that the MASS
functions return a more accurate dispersion value than summary(). However,
these two numbers differ by a factor of 8, which is quite a lot. Is this
normal? Would you folks expect such a large difference? Which value should
I trust?
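
To make the comparison concrete, here is a minimal sketch on simulated
data (not the w.combo data; the names x, y and fit are made up for
illustration). As far as I understand it, summary() estimates the
dispersion as the Pearson chi-square statistic divided by the residual
degrees of freedom, while gamma.dispersion() returns the reciprocal of
the maximum-likelihood shape estimate from gamma.shape(), so the two
need not agree:

```r
library(MASS)

# Simulated Gamma data with known shape 2, i.e. true dispersion 1/2.
set.seed(1)
x  <- runif(200, 1, 5)
mu <- 0.5 + 0.4 * x
y  <- rgamma(200, shape = 2, rate = 2 / mu)

fit <- glm(y ~ x, family = Gamma(link = "identity"), start = c(0.5, 0.4))

# Moment (Pearson) estimate, as reported by summary().
phi_pearson <- summary(fit)$dispersion
phi_manual  <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

# Maximum-likelihood estimate from MASS: 1 / alpha.
phi_mle <- gamma.dispersion(fit)
alpha   <- gamma.shape(fit)$alpha

# Per-prediction variance under the Gamma model: Var[y] = phi * mu^2.
mu_hat  <- predict(fit, type = "response")
var_hat <- phi_pearson * mu_hat^2
```

On well-behaved simulated data like this the two estimates should be
close; a large gap on real data may be a sign that the Gamma
assumption fits poorly, though that is a guess on my part.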

R terminal excerpt:


> summary(tempglm_g2)

Call:
glm(formula = precip_sbi ~ precip_oxx + precip_oxx_sq, family = Gamma(link = identity),
    data = w.combo, start = c(0.1, 0.4, 0.02))

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.9      -1.63183  -1.00720   0.04878   8.93461

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)     0.09236    0.04834   1.911   0.0583 .
precip_oxx      0.26848    0.35891   0.748   0.4558
precip_oxx_sq   0.05138    0.13418   0.383   0.7024
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 15.78978)

Null deviance: 528.73  on 130  degrees of freedom
Residual deviance: 305.81  on 128  degrees of freedom
AIC: -100.33

Number of Fisher Scoring iterations: 5

> library(MASS)
> gamma.shape(tempglm_g2)

Alpha: 0.53807358
SE:    0.05526108
> gamma.dispersion(tempglm_g2)
[1] 1.858482


Thanks,

Tim Handley
Research Assistant
Channel Islands National Park
(Will be working from both CHIS and SAMO)
CHIS Phone: 805-658-5759
SAMO Phone: 805-370-2300 x2412


Re: [R] Subset by using multiple values

2010-11-29 Thread Phil Spector

One possibility would be to paste together the values before
subsetting:

subset(DC2_m, !paste(as.character(X2), X3, sep = '\\0') %in%
       paste(as.character(Exclude_Data$Code), Exclude_Data$Dates, sep = '\\0'))

(untested due to lack of a reproducible example).
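
For what it is worth, here is a small self-contained sketch of the same
idea on made-up data. The toy values and the helper key() are
assumptions for illustration only; the column names follow the
question quoted below:

```r
# Toy versions of the two tables; the values are invented.
DC2_m <- data.frame(
  X2    = factor(c("AGR", "BAY", "BAY", "ALB")),
  X3    = c(11412L, 12052L, 11412L, 12691L),
  value = c(954.75, 970.59, 1.5, 2.5)
)
Exclude_Data <- data.frame(
  Code  = factor(c("BAY", "ALB")),
  Dates = c(12052L, 12691L)
)

# Hypothetical helper: build one key per row from the code and the date,
# so that %in% compares pairs rather than each column separately.
key <- function(code, date) paste(as.character(code), date, sep = "\\0")

kept <- subset(DC2_m,
               !key(X2, X3) %in% key(Exclude_Data$Code, Exclude_Data$Dates))
kept
```

Only the exact (code, date) pairs listed in Exclude_Data are removed;
the BAY row dated 11412 survives because that pair is not listed.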

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Mon, 29 Nov 2010, clangkamp wrote:



Hi
I would like to extend this item to the following:
I have the following table

      X1  X2    X3  value
1   BVEq AGR 11412 954.75
2 CA_Tot AGR 11412 970.59
...

str(DC2_m)

'data.frame':   104160 obs. of  4 variables:
 $ X1   : Factor w/ 62 levels "BVEq","CA_Tot",..: 1 2 3 4 5 6 45 46 47 48 ...
  ..- attr(*, "names")= chr  "Figure.1" "Figure.995" "Figure.17873" "Figure.17874" ...
 $ X2   : Factor w/ 48 levels "AGR","AKZ","ALB",..: 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "names")= chr  "1" "1" "1" "1" ...
 $ X3   : int  11412 11412 11412 11412 11412 11412 11412 11412 11412 11412 ...
 $ value: num  955 971 NA NA NA ...


And I have a second (manual) table with entries of combinations of X2 and X3
which I want to exclude:

str(Exclude_Data)

'data.frame':   8 obs. of  2 variables:
 $ Code : Factor w/ 5 levels "ALB","ALQ","BAY",..: 3 3 2 4 5 3 1 2
 $ Dates: int  12052 12233 12508 11960 13056 12142 12691 12783

subset(DC2_m, cbind(X2,X3) %in% Exclude_Data[])

Now the trick is to exclude precisely the combinations chosen, and not
all combinations of Exclude_Data[1] and Exclude_Data[2], which is what
happens when doing two statements X2 in ED[1] AND X3 in ED[2].

Any takers? Thanks in advance
Christian
--
View this message in context: 
http://r.789695.n4.nabble.com/R-Subset-by-using-multiple-values-tp815278p3064226.html
Sent from the R help mailing list archive at Nabble.com.





