Re: [R] Efficiency of C Compiler in R CMD SHLIB

2010-04-16 Thread Prof Brian Ripley
The context is missing from this message; judging from the others, this is 
about 32-bit Windows.


On Thu, 15 Apr 2010, yehengxin wrote:


Thanks for your response.  I found the folder to modify the compiler for C
source codes.  C++ 6.0 is an old C programming environment (1994~1998) but
it is efficient.   When compiling C source codes in C programming
environment, one needs to choose between debug or release modes.


That's true for just one family of compilers in my experience.


release mode is much faster than debug mode.  But in R's R CMD SHLIB,
I did not see such an option.


Then I suspect you did not look in the obvious place (mentioned on 
the help page), for I see


% R CMD SHLIB --help
...
Windows only:
  -d, --debug   build a debug DLL

An optimized ('release') build is standard, and in any case gcc is 
capable of both optimizing and including debug information, unlike 
some other compilers.  With gcc debug code is normally the same speed, 
just a larger compiled file.



I want to try alternative compilers to see if I can reach that level of
efficiency in R's DLL.


This is *your* DLL, not one in R, surely?  Note that people have 
compiled R with Visual C++ 6.0 (to use the correct name) and it ran 
slower and less accurately than using gcc.  So finding VC++ to produce 
faster code is not usual, and this seems to be something special about 
your C code. The default level of optimization for gcc in R for 
Windows is -O3, and you could try raising it; also, if you want to 
target only recent non-Atom chips, set -mtune= appropriately.


x86 is a very widely used architecture with a competitive field of 
commercial compilers.  On Linux (and AFAIK on Windows) gcc produces 
some of the best-performing code (see the comments in the 'R 
Administration and Installation Manual').  Most of the ways to produce 
faster code lose compliance with IEC60559 and accuracy (VC++ 6 never 
has those).  And the same code compiled with gcc runs on the same 
hardware only slightly slower on Windows than on Linux unless I/O is 
involved (where Windows is much slower).


Later, I may try using OPENMP in my C codes to do parallel 
computing.


gcc 4.2.1 supports OpenMP, and later versions support it better 
(OpenMP 3).



So I need to figure out how to change compiler to
generate DLL for R.  Could you give me some suggestions?  Thanks a lot!


A DLL is a DLL: you can compile it any way you like (although cdecl 
calling conventions work best, and compilers do differ in their 
conventions for function return values -- but those are not used in 
the .C interface).  There is a file README.packages in the R 
distribution with notes about using other compilers under Windows -- 
but the R developers have not used other than VC++ and Intel's ICC 
(not mentioned there) for several years.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regression using R

2010-04-16 Thread Dieter Menne


Samuel Bravo wrote:
 
 
 I'm working on a very large project in which we do many calculations, which
 include many types of regression such as linear, quadratic, cubic,
 exponential, sinusoidal, and logarithmic.
 

Students often look in the wrong place. It is not intuitive that
quadratic and cubic regression can be fitted with lm, because these are often
termed non-linear in basic university courses.
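
As a minimal sketch of this point (the simulated data and the object names are assumptions for illustration, not part of the original question): polynomial terms remain linear in the coefficients, so lm() can fit them directly.

```r
set.seed(42)
# Hypothetical data with a quadratic trend plus noise
x <- seq(1, 10, length.out = 50)
y <- 2 + 0.5 * x - 0.3 * x^2 + rnorm(50, sd = 0.5)

fit_lin  <- lm(y ~ x)                        # straight line
fit_quad <- lm(y ~ x + I(x^2))               # quadratic: I() protects the power
fit_cub  <- lm(y ~ poly(x, 3, raw = TRUE))   # cubic via raw polynomial terms
fit_log  <- lm(y ~ log(x))                   # logarithmic in x

coef(fit_quad)
```

Models that are non-linear in their parameters, such as a sinusoid with unknown frequency, would need something like nls() instead.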

Dieter

-- 
View this message in context: 
http://n4.nabble.com/Regression-using-R-tp1934475p1951658.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Replace / with - in date

2010-04-16 Thread arnaud Gaboury
Why don't you try something like :

xd$x <- as.Date(xd$x, format = "%Y/%m/%d")
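
A small self-contained sketch of the Date conversion, assuming the column holds character strings in year/month/day order (the sample values are taken from the thread):

```r
# Character dates with slashes, as in the original question
xd <- data.frame(x = c("2000/01/01", "2001/02/01"), stringsAsFactors = FALSE)

# %Y is the four-digit year; the result is a true Date column
xd$x <- as.Date(xd$x, format = "%Y/%m/%d")

# Dates print in ISO form, with dashes
formatted <- format(xd$x)
formatted
```

If the column is a factor, wrapping it in as.character() before the conversion should give the same result.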





 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Christian Raschke
 Sent: Thursday, April 15, 2010 8:28 PM
 To: r-help@r-project.org
 Subject: Re: [R] Replace / with - in date
 
 Is there anything that speaks against just applying gsub to the factor
 levels if one would like to keep everything as factors (and not
 consider
 true Date classes or character vectors)? I.e:
 
   x <- c("2000/01/01", "2001/02/01")
   xd <- as.data.frame(x)
   levels(xd$x) <- gsub("/", "-", levels(xd$x))
 
 Christian
 
 
 On 04/15/2010 01:08 PM, David Winsemius wrote:
 
  On Apr 15, 2010, at 1:51 PM, prem_R wrote:
 
 
  Hi,every one .I have searched the solutions in the forum  for
  replacing my
  date value which is in a data frame  ,01/01/2000 to 01-01-2000 using
  replace
  function but got the following warning message
  x <- "2000/01/01"
  xd <- as.data.frame(x)
  xd$x <- replace(xd$x, xd$x == "/", "-")
 
  The replace function does not work with factors, it works with
  (complete) vectors, not substrings. It's also a real hassle to do
 such
  operations on factors, so just use character vectors and try gsub
  instead:
 
   x <- "2000/01/01"
   xd <- as.data.frame(x, stringsAsFactors = FALSE)
   xd$x2 <- gsub("/", "-", xd$x)
   xd
              x         x2
   1 2000/01/01 2000-01-01
 
 
 
  Warning message:
  In `[<-.factor`(`*tmp*`, list, value = "-") :
   invalid factor level, NAs generated
 
  Is there any other method of doing it? or am i missing something?.
  please
  let me know if you need any more information.
 
  Thanks.
 
  Prem
  --
  View this message in context:
  http://n4.nabble.com/Replace-with-in-date-tp1911391p1911391.html
  Sent from the R help mailing list archive at Nabble.com.
 
 
  David Winsemius, MD
  West Hartford, CT
 
 
 
 --
 Christian Raschke
 Department of Economics
 and
 ISDS Research Lab (HSRG)
 Louisiana State University
 Patrick Taylor Hall, Rm 2128
 Baton Rouge, LA 70803
 cras...@lsu.edu
 



Re: [R] Does sink stand for anything?

2010-04-16 Thread Barry Rowlingson
On Fri, Apr 16, 2010 at 1:49 AM, Sharpie ch...@sharpsteen.net wrote:

 Sink captures R output and directs it elsewhere- common places are a file or
 device such as /dev/null

 Personally I always connected it with the concept of a sink in a
 mathematical system as something that removes constituents from the system.

 Also, note that 'sink' has nothing to do with floating point numbers...
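
For readers unfamiliar with the function, a minimal sketch of sink() diverting console output to a file (the temporary file and the printed object are assumptions for illustration):

```r
out_file <- tempfile(fileext = ".txt")

sink(out_file)                 # start diverting output to the file
print(summary(c(1, 2, 3)))     # goes to the file, not the console
sink()                         # stop diverting; output returns to the console

captured <- readLines(out_file)
captured
```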

-- 
blog: http://geospaced.blogspot.com/
web: http://www.maths.lancs.ac.uk/~rowlings
web: http://www.rowlingson.com/
twitter: http://twitter.com/geospacedman
pics: http://www.flickr.com/photos/spacedman



Re: [R] glmer with non integer weights

2010-04-16 Thread Kay Cichini

thanks thierry,

i considered these transformations already, but variance is not stabilized
and/or normality is not achieved either.
i guess i'll have to look into non-parametric methods?

best regards,
kay
-- 
View this message in context: 
http://n4.nabble.com/glmer-with-non-integer-weights-tp1837179p1965623.html
Sent from the R help mailing list archive at Nabble.com.



[R] data frame manipulation

2010-04-16 Thread arnaud Gaboury
Dear group,

Here is my data.frame :


df <-
structure(list(DESCRIPTION = c("PRM HGH GD ALU", "PRM HGH GD ALU", 
"PRIMARY NICKEL", "PRIMARY NICKEL", "PRIMARY NICKEL", "PRIMARY NICKEL", 
"STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ", 
"STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ", 
"SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD", 
"SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD", 
"SPCL HIGH GRAD", "SPCL HIGH GRAD"), CREATED.DATE = structure(c(14708, 
14708, 14672, 14673, 14678, 14678, 14700, 14700, 14700, 14700, 
14700, 14700, 14700, 14705, 14707, 14707, 14707, 14708, 14708, 
14708, 14708, 14708, 14622, 14634), class = "Date"), QUANITY = c(-1, 
1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, 
-1, 1, 1, 1, -1), CLOSING.PRICE = c("2,415.9000", "2,415.9000", 
"25,755.7100", "25,755.7100", "25,760.8600", "25,760.8600", "2,355.9600", 
"2,355.9600", "2,355.9600", "2,355.9600", "2,355.9600", "2,355.9600", 
"2,355.9600", "2,357.1200", "2,420.7300", "2,420.7300", "2,420.7300", 
"2,421.0500", "2,421.0500", "2,421.0500", "2,421.0500", "2,421.0500", 
"2,388.4300", "2,388.4300")), .Names = c("DESCRIPTION", "CREATED.DATE", 
"QUANITY", "CLOSING.PRICE"), row.names = 26:49, class = "data.frame")

I would like to summarize it as something like this:

> op
     DESCRIPTION POSITION       DATE
1 PRIMARY NICKEL        0 2010-03-10
2 PRM HGH GD ALU        0 2010-04-09
3 SPCL HIGH GRAD        2 2010-04-09
4 STANDARD LEAD         0 2010-04-06



To obtain op, I wrote the following line:

   op=ddply(df, c("DESCRIPTION"), summarise, POSITION=
sum(QUANITY),DATE=max(CREATED.DATE))

So far, fine. But I need one more column, CLOSING.PRICE. If I
write this line:


   op1=ddply(df, c("DESCRIPTION","CLOSING.PRICE"), summarise, POSITION=
sum(QUANITY),DATE=max(CREATED.DATE))

Here is what I get:


> op1
     DESCRIPTION CLOSING.PRICE POSITION       DATE
1 PRIMARY NICKEL   25,755.7100        0 2010-03-05
2 PRIMARY NICKEL   25,760.8600        0 2010-03-10
3 PRM HGH GD ALU    2,415.9000        0 2010-04-09
4 SPCL HIGH GRAD    2,388.4300        0 2010-01-25
5 SPCL HIGH GRAD    2,420.7300        1 2010-04-08
6 SPCL HIGH GRAD    2,421.0500        1 2010-04-09
7 STANDARD LEAD     2,355.9600       -1 2010-04-01
8 STANDARD LEAD     2,357.1200        1 2010-04-06

Not exactly what I want. Can anyone help?
TY



Re: [R] glmer with non integer weights

2010-04-16 Thread Kay Cichini

thank you thomas for the helpful hint!

yours,
kay
-- 
View this message in context: 
http://n4.nabble.com/glmer-with-non-integer-weights-tp1837179p1965827.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Error in library(gplots) : there is no package called 'gplots'

2010-04-16 Thread Vava

Thanks for your suggestion Tal. Unfortunately, still no luck with me ...
still get the usual error message:

 Error in library(gplots) : there is no package called 'gplots' , whatever
I try to install.

This is a mystery to me with respect to why /how. I am really stuck with
that problem.

Best, 

Valère
-- 
View this message in context: 
http://n4.nabble.com/Error-in-library-gplots-there-is-no-package-called-gplots-tp1690367p1968197.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Error in library(gplots) : there is no package called 'gplots'

2010-04-16 Thread james

Hi Vava,
What version of R are you using? I'm not sure, but I think that R will 
refuse to install a package in this way if the version of gplots is 
incompatible with the version of R you're using. You can check the 
Depends field of packages on CRAN.
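
A few checks along these lines can be run from within R. This is only a sketch: the helper name is_installed is hypothetical, and "gplots" stands for whichever package fails to load.

```r
R.version.string     # version of the running R
.libPaths()          # library directories that library() searches

# Hypothetical helper: is the package present in any of those libraries?
is_installed <- function(pkg) pkg %in% rownames(installed.packages())

is_installed("stats")    # a base package, for comparison
is_installed("gplots")   # the package that fails to load

# With network access, the Depends field on CRAN could be inspected with
# available.packages()["gplots", "Depends"]
```

If is_installed() reports FALSE right after an apparently successful install, comparing the installation directory with .libPaths() may show whether the package landed in a library that this R session does not search.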


Regards,
James

Vava wrote:

Thanks for your suggestion Tal. Unfortunately, still no luck with me ...
still get the usual error message:

 Error in library(gplots) : there is no package called 'gplots' , whatever
I try to install.

This is a mystery to me with respect to why /how. I am really stuck with
that problem.

Best, 


Valère





[R] Return a variable name

2010-04-16 Thread soeren . vogel

Hello,

how can I return the name of a variable, say a$b, from a function?

fun <- function(x){
  return(substitute(x));
}
a <- data.frame(b=1:10);
fun(a$b)

... returns a$b, but this is of type "language", thus I can't use it as a  
character string, can I? How?


Thanks for help,

Sören



[R] merge

2010-04-16 Thread n.via...@libero.it
I have a problem with the merge function:
I need to merge the data.frames that you will find as attachments. I tried all 
the possible combinations, but none seems to work properly.
Does anyone know how to do it?
thanks


[R] error at R CMD check

2010-04-16 Thread carol white
Hi,
I generated an R package but at running R CMD check, I got the following error 
message for the first data file:

*** installing help indices
  Building/Updating help pages for package 'jamda'
 Formats: text html latex example 
  f1    text    html    latex   example
  f2    text    html    latex   example
  f3    text    html    latex   example
  f4    text    html    latex   example
  f5    text    html    latex   example
  f6    text    html    latex   example
too many pairs of braces in file 'data1.Rd' at /usr/lib64/R/share/perl/R/Rdconv.pm line 295, $rdfile line 7076.
ERROR: building help failed for package ‘my_package’

Should the data sets be in a specific format? Mine contains floating-point data 
separated by tabs, with column names and row names. There is no description in the 
DESCRIPTION file yet.

Thanks 

Carol







[R] problem with the version of R

2010-04-16 Thread arindam fadikar
Dear users,

I am using R in UBUNTU , but the version is 9.1. How can I upgrade it to R
10.1?

-- 
Arindam Fadikar
M.Stat
Indian Statistical Institute.
New Delhi, India



[R] Bootstrapping a repeated measures ANOVA

2010-04-16 Thread Fischer, Felix
Hello everyone,

I have a question regarding the sampling process in boot().

I am trying to bootstrap F-values for a repeated measures ANOVA to get a confidence 
interval of F-values. Unfortunately, while the aov works fine, it fails in the 
boot()-function. I think the problem might be that the resampling process fails 
to select both lines of data representing the 2 measuring times for one subject 
and I therefore get missing cases.

The data is organised like this:
subject ort mz  PHQ
1       1   1   x
1       1   2   y
2       1   1   z
2       1   2   zz
...


Is there any way to specify that both lines need to be selected?
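
One way this could be approached, sketched under stated assumptions: pass the vector of subject ids to boot() as its data, so that whole subjects are resampled, and rebuild the long-format data inside the statistic. The simulated data, the argument name long_data, and the relabelling of repeated subjects are illustrative choices, not the poster's code.

```r
library(boot)  # recommended package shipped with R

set.seed(1)
n_subj <- 20
# Hypothetical long-format data: two measuring times (mz) per subject,
# 'ort' varies between subjects
anova_data <- data.frame(
  subject = rep(seq_len(n_subj), each = 2),
  ort     = factor(rep(rep(1:2, each = 2), length.out = 2 * n_subj)),
  mz      = factor(rep(1:2, times = n_subj)),
  PHQ     = rnorm(2 * n_subj)
)

# 'ids' holds one entry per subject, so boot() resamples subjects, not rows
F_values <- function(ids, indices, long_data) {
  sampled <- ids[indices]
  # Collect both rows of each sampled subject; relabel so that a subject
  # drawn twice is treated as two distinct subjects
  d <- do.call(rbind, lapply(seq_along(sampled), function(i) {
    rows <- long_data[long_data$subject == sampled[i], ]
    rows$subject <- i
    rows
  }))
  d$subject <- factor(d$subject)
  fit <- aov(PHQ ~ mz * ort + Error(subject / mz), data = d)
  s <- summary(fit)
  f <- c(s[[1]][[1]][["F value"]], s[[2]][[1]][["F value"]])
  f[!is.na(f)]  # drop the NA entries that belong to the residual rows
}

results <- boot(data = unique(anova_data$subject), statistic = F_values,
                R = 10, long_data = anova_data)
dim(results$t)
```

Each bootstrap replicate then contains complete pairs of rows. Whether a percentile interval of F-values is a meaningful quantity is a separate statistical question that this sketch does not address.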


Thanks a lot!
Felix Fischer

P.S. In case you need to have a look at my code:

library(boot)

F_values <- function(formula, data, indices) {
  d <- data[indices, ]           # allows boot to select sample
  fit <- aov(formula, data = d)  # fit model
  # return the F-values from both error strata
  return(c(summary(fit)[1][[1]][[1]]$`F value`,
           summary(fit)[2][[1]][[1]]$`F value`))
}

results <- boot(data = anova.daten, statistic = F_values,
                R = 10, formula = PHQ_Sum_score ~ mz * ort + Error(subject / mz))


Dipl. Psych. Felix Fischer

Medizinische Klinik mit Schwerpunkt Psychosomatik
Charité -- Universitätsmedizin Berlin
Luisenstr. 13a
10117 Berlin

Tel.: 030 - 450 553575
Email: felix.fisc...@charite.de




Re: [R] [SPAM] Re: Error in library(gplots) : there is no package called 'gplots'

2010-04-16 Thread Martin Valere
Dear James,

  I have tried to install the package RODBC_1.3-1 with R versions 2.9.0 and 
2.10.1 (using openSUSE 11.1, 64-bit). I have tried install.packages locally 
(package downloaded and stored locally on the computer) and directly from 
the Internet (using different mirrors!). Same result each time ...

The outcome is the same if I try to install other packages, for instance e1071. 
So it appears this is not linked to the package itself but rather to R-base 
or R-devel (both are installed) ...

Regards,

Valère

-----Original Message-----
From: james [mailto:ja...@ipec.co.uk]
Sent: Friday, 16 April 2010 10:47
To: Martin Valere
Cc: R Help List
Subject: [SPAM] Re: [R] Error in library(gplots) : there is no package
called 'gplots'
Importance: Low


Hi Vava,
What version of R are you using? I'm not sure, but I think that R will 
refuse to install a package in this way if the version of gplots is 
incompatible with the version of R you're using. You can check the 
Depends field of packages on CRAN.

Regards,
James

Vava wrote:
 Thanks for your suggestion Tal. Unfortunately, still no luck with me ...
 still get the usual error message:

  Error in library(gplots) : there is no package called 'gplots' , whatever
 I try to install.

 This is a mystery to me with respect to why /how. I am really stuck with
 that problem.

 Best, 

 Valère
   

 -----Original Message-----
From: Martin Valere
Sent: Thursday, 25 March 2010 10:58
To: 'r-help@R-project.org'
Subject: Error in library(gplots) : there is no package called 'gplots'

Dear all,

   I have an issue trying to install new packages (have tried with RODBC_1.3-1, 
gplots_2.6.1, gtools_2.7.4 packages) and get the same error message :
Error in library(gplots) : there is no package called 'gplots'

Only clue I have found so far on the Web is related to Perl (Perl modules are 
installed on my computer, but which one is related to gplots if any ?); no 
gplots in usr/lib or /usr/lib64 at least ... I am somewhat lost here, having no 
idea about Perl (if Perl is really the issue ?).

I am using OpenSuse 11.1 (64bits); and R version 2.9.0. Installation of package 
 is performed offline as Root.




Valère, Switzerland


Re: [R] error at R CMD check

2010-04-16 Thread Duncan Murdoch

carol white wrote:

Hi,
I generated an R package but at running R CMD check, I got the following error 
message for the first data file:

*** installing help indices
  Building/Updating help pages for package 'jamda'
 Formats: text html latex example 
  f1    text    html    latex   example
  f2    text    html    latex   example
  f3    text    html    latex   example
  f4    text    html    latex   example
  f5    text    html    latex   example
  f6    text    html    latex   example
too many pairs of braces in file 'data1.Rd' at /usr/lib64/R/share/perl/R/Rdconv.pm line 295, $rdfile line 7076.
ERROR: building help failed for package ‘my_package’

Should the data sets be in a specific format? Mine contains floating-point data 
separated by tabs, with column names and row names. There is no description in the 
DESCRIPTION file yet.
  


data1.Rd shouldn't be a dataset, it should be a help file describing a 
dataset.


In a more recent version of R you might get a more informative error 
message, telling you where the error was in that file.
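
To illustrate the distinction: the data itself normally lives in the package's data/ directory, while man/data1.Rd only documents it. A sketch of generating and checking such a help file follows; the toy data frame and the temporary path are assumptions for illustration.

```r
# Hypothetical dataset standing in for the package's 'data1'
data1 <- data.frame(a = c(1.5, 2.5), b = c(3.5, 4.5),
                    row.names = c("r1", "r2"))

rd_file <- file.path(tempdir(), "data1.Rd")

# Write a skeleton help file that documents the dataset
utils::promptData(data1, filename = rd_file, name = "data1")

# Parsing the file is a quick way to confirm that its braces balance
rd <- tools::parse_Rd(rd_file)
class(rd)
```

The generated skeleton still needs its placeholder text edited by hand before it is moved into man/.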


Duncan Murdoch



Re: [R] R CMD REMOVE etc. query

2010-04-16 Thread Gabor Grothendieck
Assuming it's the last package that you loaded, detach() without
arguments will detach it.

On Thu, Apr 15, 2010 at 1:45 PM, Prof. John C Nash nas...@uottawa.ca wrote:
 Brian Ripley pointed out that the library() documentation (third screen,
 however) says that library() and require() check current environment to see
 if a package is loaded and only load if it is not present. I may have
 oversimplified, and clarifications welcome. But this is clearly NOT what I
 want, since I need the latest package version to test.

 Tentative solution is outlined, but suggestions welcome on string cleanup
 issue mentioned.

 As I need to remove a package and its dependencies before reloading, I can
 use tools::pkgDepends to get a list.

 I found that a character string extracted from the dependency vector gives
 an 'invalid name' error in detach(). That is, I can create a variable
 myfoo="package:foo", but detach(myfoo) gives the error while typing
 detach(package:foo) works fine.  Workaround seems to be

   slist<-search()
   idx<-which(slist==myfoo)
   detach(idx)
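
A sketch of a more direct route: detach() has a character.only argument that makes it use the string stored in a variable. The splines package is used here only because it ships with R; it stands in for the hypothetical package foo.

```r
library(splines)              # any attached package will do for the demo
myfoo <- "package:splines"

# Plain detach(myfoo) looks for a search-path entry literally named "myfoo";
# character.only = TRUE tells detach() to use the value of the variable
detach(myfoo, character.only = TRUE)

still_attached <- myfoo %in% search()
still_attached
```

A numeric position can likewise be passed explicitly as detach(pos = idx).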

 There's still a nuisance issue of how to strip off the (>=0.7.11)
 descriptors in the dependency list. strsplit() will work, but I seem to need
 to loop through the list to use it when only some of the packages are
 restricted by qualifiers.
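
Because sub() is vectorized, the qualifiers can be stripped without a loop. A minimal sketch, with made-up dependency strings:

```r
# Hypothetical dependency entries; only some carry a version qualifier
deps <- c("foo (>= 0.7.11)", "bar", "baz (>=1.0)")

# Remove an optional parenthesised qualifier, then trim leftover whitespace
pkg_names <- trimws(sub("\\(.*\\)", "", deps))
pkg_names
```

Entries without a qualifier pass through unchanged.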

 If someone has already dealt with this type of issue, I'd be happy to know.
 For example, if there is a forceLoad() somewhere, it would save the effort
 above and could be useful for developers to ensure they are using the right
 version of a package.

 JN



 From: Prof. John C Nash nashjc_at_uottawa.ca
 Date: Thu, 15 Apr 2010 10:17:46 -0400

 I've been working on a fairly complex package that is a wrapper for
 several optimization routines. In this work, I've attempted to do the
 following:

    * edit the package code foo.R
    * in a root terminal at the right directory location

 R CMD REMOVE foo

 R CMD INSTALL foo
 However, I don't get the right code. In fact, if I just do the remove,

     library(foo)

 does not throw an error. If I stop my R session and restart it, I do.

 Is this expected behaviour?

 For information, I run scripted tests that start with

    rm(list=ls())
    library(foo)

 to ensure I'm getting new code each time.

 If desired I can provide a minimal package to show this, but I expect that
 it is a known issue for which I've missed the documentation. Perhaps there
 is a command to reset the session. I did a brief search, but appropriate
 keywords pick up a lot of irrelevant material.

 JN





Re: [R] data frame manipulation

2010-04-16 Thread Ista Zahn
Hi,
I'm not sure I understand what you want exactly. My best guess is that you
want something like

op=ddply(df, c("DESCRIPTION"), summarise, POSITION=
sum(QUANITY),DATE=max(CREATED.DATE), CLOSING.PRICE =
CLOSING.PRICE[CREATED.DATE == max(CREATED.DATE)])

op <- unique(op)

Does that do it?
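
For comparison, roughly the same summary can be sketched in base R with split() and lapply(). The shortened data frame below is an assumption made to keep the example small; the column names follow the thread.

```r
# Reduced, hypothetical version of the poster's data
df <- data.frame(
  DESCRIPTION   = c("PRIMARY NICKEL", "PRIMARY NICKEL",
                    "STANDARD LEAD", "STANDARD LEAD", "STANDARD LEAD"),
  CREATED.DATE  = as.Date(c("2010-03-05", "2010-03-10",
                            "2010-04-01", "2010-04-01", "2010-04-06")),
  QUANITY       = c(1, -1, 1, -1, 1),
  CLOSING.PRICE = c("25,755.7100", "25,760.8600",
                    "2,355.9600", "2,355.9600", "2,357.1200"),
  stringsAsFactors = FALSE
)

# One output row per DESCRIPTION: net position, latest date, and the
# closing price recorded on that latest date
op <- do.call(rbind, lapply(split(df, df$DESCRIPTION), function(g) {
  latest <- g[which.max(as.numeric(g$CREATED.DATE)), ]
  data.frame(DESCRIPTION   = latest$DESCRIPTION,
             POSITION      = sum(g$QUANITY),
             DATE          = latest$CREATED.DATE,
             CLOSING.PRICE = latest$CLOSING.PRICE,
             stringsAsFactors = FALSE)
}))
rownames(op) <- NULL
op
```

Using which.max() picks a single row per group, so no unique() step is needed afterwards.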

-Ista

On Fri, Apr 16, 2010 at 4:16 AM, arnaud Gaboury arnaud.gabo...@gmail.com wrote:

 Dear group,

 Here is my data.frame :


 df <-
 structure(list(DESCRIPTION = c("PRM HGH GD ALU", "PRM HGH GD ALU",
 "PRIMARY NICKEL", "PRIMARY NICKEL", "PRIMARY NICKEL", "PRIMARY NICKEL",
 "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ",
 "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ",
 "SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD",
 "SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD",
 "SPCL HIGH GRAD", "SPCL HIGH GRAD"), CREATED.DATE = structure(c(14708,
 14708, 14672, 14673, 14678, 14678, 14700, 14700, 14700, 14700,
 14700, 14700, 14700, 14705, 14707, 14707, 14707, 14708, 14708,
 14708, 14708, 14708, 14622, 14634), class = "Date"), QUANITY = c(-1,
 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1,
 -1, 1, 1, 1, -1), CLOSING.PRICE = c("2,415.9000", "2,415.9000",
 "25,755.7100", "25,755.7100", "25,760.8600", "25,760.8600", "2,355.9600",
 "2,355.9600", "2,355.9600", "2,355.9600", "2,355.9600", "2,355.9600",
 "2,355.9600", "2,357.1200", "2,420.7300", "2,420.7300", "2,420.7300",
 "2,421.0500", "2,421.0500", "2,421.0500", "2,421.0500", "2,421.0500",
 "2,388.4300", "2,388.4300")), .Names = c("DESCRIPTION", "CREATED.DATE",
 "QUANITY", "CLOSING.PRICE"), row.names = 26:49, class = "data.frame")

 I would like to summarize it as something like this:

 > op
      DESCRIPTION POSITION       DATE
 1 PRIMARY NICKEL        0 2010-03-10
 2 PRM HGH GD ALU        0 2010-04-09
 3 SPCL HIGH GRAD        2 2010-04-09
 4 STANDARD LEAD         0 2010-04-06



 To obtain op, I wrote the following line:

    op=ddply(df, c("DESCRIPTION"), summarise, POSITION=
 sum(QUANITY),DATE=max(CREATED.DATE))

 So far, fine. But I need one more column, CLOSING.PRICE. If I
 write this line:


    op1=ddply(df, c("DESCRIPTION","CLOSING.PRICE"), summarise, POSITION=
 sum(QUANITY),DATE=max(CREATED.DATE))

 Here is what I get:


 > op1
      DESCRIPTION CLOSING.PRICE POSITION       DATE
 1 PRIMARY NICKEL   25,755.7100        0 2010-03-05
 2 PRIMARY NICKEL   25,760.8600        0 2010-03-10
 3 PRM HGH GD ALU    2,415.9000        0 2010-04-09
 4 SPCL HIGH GRAD    2,388.4300        0 2010-01-25
 5 SPCL HIGH GRAD    2,420.7300        1 2010-04-08
 6 SPCL HIGH GRAD    2,421.0500        1 2010-04-09
 7 STANDARD LEAD     2,355.9600       -1 2010-04-01
 8 STANDARD LEAD     2,357.1200        1 2010-04-06

 Not exactly what I want. Can anyone help?
 TY





-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org



[R] Counting number of values by row (text, not numbers)

2010-04-16 Thread Laura Ferrero-Miliani
Hi everyone!
I am very new to R and I am having some difficulties.
My data set looks something like this:

A    B       C    D    E
cat  monkey  cat  dog  cat
cat



Re: [R] Return a variable name

2010-04-16 Thread Ista Zahn
Hi Sören

Somehow this feels dirty, but you can do

fun <- function(x){
  result <- capture.output(print(substitute(x)))
  return(result);
}
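
A commonly used alternative, sketched here with the names from the question, is to combine substitute() with deparse(), which converts the captured expression to a character string directly:

```r
fun <- function(x) {
  # substitute() captures the unevaluated argument; deparse() turns it into text
  deparse(substitute(x))
}

a <- data.frame(b = 1:10)
var_name <- fun(a$b)
var_name
```

For very long expressions deparse() can return more than one string, which could be joined with paste(collapse = "") if a single string is needed.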

-Ista

On Fri, Apr 16, 2010 at 5:26 AM, soeren.vo...@eawag.ch wrote:

 Hello,

 how can I return the name of a variable, say a$b, from a function?

 fun <- function(x){
  return(substitute(x));
 }
 a <- data.frame(b=1:10);
 fun(a$b)

 ... returns a$b, but this is of type "language", thus I can't use it as a
 character string, can I? How?

 Thanks for help,

 Sören





-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org



Re: [R] problem with the version of R

2010-04-16 Thread Ista Zahn
Hi Arindam,

Follow the instructions at http://lib.stat.cmu.edu/R/CRAN/bin/linux/ubuntu/

-Ista

On Fri, Apr 16, 2010 at 5:54 AM, arindam fadikar
arindam.fadi...@gmail.com wrote:

 Dear users,

 I am using R in UBUNTU , but the version is 9.1. How can I upgrade it to R
 10.1?

 --
 Arindam Fadikar
 M.Stat
 Indian Statistical Institute.
 New Delhi, India





-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org



[R] Fwd: Counting number of values by row (text, not numbers)

2010-04-16 Thread Laura Ferrero-Miliani
Hi everyone!
I am very new to R and I am having some difficulties.
My data set looks something like this:

subject  A    B       C    D    E
1        cat  monkey  cat  dog  cat
2        cat  cat     cat  cat  dog


I want to create three new variables that count the number of cat,
monkey and dog entries per subject:

subject  A    B       C    D    E    cat  dog  monkey
1        cat  monkey  cat  dog  cat  3    1    1
2        cat  cat     cat  cat  dog  4    1    0


I have been looking at rowSums, rowsum, apply, grep, and doing some
searches, but I can only find count for numerical values or NA values.
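
One possible approach, sketched with the example data above (the data frame name animals is an assumption): comparing the columns with a value gives a logical matrix, and rowSums() counts the TRUE entries in each row.

```r
# The example data from the question
animals <- data.frame(
  subject = 1:2,
  A = c("cat", "cat"),
  B = c("monkey", "cat"),
  C = c("cat", "cat"),
  D = c("dog", "cat"),
  E = c("cat", "dog"),
  stringsAsFactors = FALSE
)

cols <- c("A", "B", "C", "D", "E")

# For each animal, count how often it appears in columns A to E of each row
for (v in c("cat", "dog", "monkey")) {
  animals[[v]] <- rowSums(animals[cols] == v)
}
animals
```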

Thanks in advance,

L



Re: [R] data frame manipulation

2010-04-16 Thread arnaud Gaboury
When I run your command, here is what I get:

op=ddply(df,c("DESCRIPTION"),summarise,POSITION=sum(QUANITY),DATE=max(CREATED.DATE),SETTLEMENT=CLOSING.PRICE[CREATED.DATE=max(CREATED.DATE)])
> op

     DESCRIPTION POSITION       DATE SETTLEMENT
1 PRIMARY NICKEL        0 2010-03-10         NA
2 PRM HGH GD ALU        0 2010-04-09         NA
3 SPCL HIGH GRAD        2 2010-04-09         NA
4 STANDARD LEAD         0 2010-04-06         NA


That is exactly what I want, but without the NA! The SETTLEMENT column
should show the CLOSING.PRICE corresponding to the CREATED.DATE.

***
Arnaud Gaboury
Mobile: +41 79 392 79 56
BBM: 255B488F
***

From: Ista Zahn [mailto:istaz...@gmail.com] 
Sent: Friday, April 16, 2010 1:05 PM
To: arnaud Gaboury
Cc: r-help@r-project.org
Subject: Re: [R] data frame manipulation

Hi,
I'm not sure I understand what you want exactly. My best guess is that you
want something like

op=ddply(DF, c("DESCRIPTION"), summarise, POSITION=
sum(QUANITY),DATE=max(CREATED.DATE), CLOSING.PRICE =
CLOSING.PRICE[CREATED.DATE == max(CREATED.DATE)])

op <- unique(op)

Does that do it?

-Ista
On Fri, Apr 16, 2010 at 4:16 AM, arnaud Gaboury arnaud.gabo...@gmail.com
wrote:
Dear group,

Here is my data.frame :


df <-
structure(list(DESCRIPTION = c("PRM HGH GD ALU", "PRM HGH GD ALU",
"PRIMARY NICKEL", "PRIMARY NICKEL", "PRIMARY NICKEL", "PRIMARY NICKEL",
"STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ",
"STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ",
"SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD",
"SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD",
"SPCL HIGH GRAD", "SPCL HIGH GRAD"), CREATED.DATE = structure(c(14708,
14708, 14672, 14673, 14678, 14678, 14700, 14700, 14700, 14700,
14700, 14700, 14700, 14705, 14707, 14707, 14707, 14708, 14708,
14708, 14708, 14708, 14622, 14634), class = "Date"), QUANITY = c(-1,
1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1,
-1, 1, 1, 1, -1), CLOSING.PRICE = c("2,415.9000", "2,415.9000",
"25,755.7100", "25,755.7100", "25,760.8600", "25,760.8600", "2,355.9600",
"2,355.9600", "2,355.9600", "2,355.9600", "2,355.9600", "2,355.9600",
"2,355.9600", "2,357.1200", "2,420.7300", "2,420.7300", "2,420.7300",
"2,421.0500", "2,421.0500", "2,421.0500", "2,421.0500", "2,421.0500",
"2,388.4300", "2,388.4300")), .Names = c("DESCRIPTION", "CREATED.DATE",
"QUANITY", "CLOSING.PRICE"), row.names = 26:49, class = "data.frame")

I would like to summarize it as something like this:

> op
    DESCRIPTION POSITION       DATE
1 PRIMARY NICKEL        0 2010-03-10
2 PRM HGH GD ALU        0 2010-04-09
3 SPCL HIGH GRAD        2 2010-04-09
4 STANDARD LEAD         0 2010-04-06



To obtain op, I wrote the following line:

   op=ddply(df, c("DESCRIPTION"), summarise, POSITION=
sum(QUANITY),DATE=max(CREATED.DATE))

So far, so good. But I need one more column, CLOSING.PRICE. If I
write this line:


   op1=ddply(df, c("DESCRIPTION","CLOSING.PRICE"), summarise, POSITION=
sum(QUANITY),DATE=max(CREATED.DATE))

Here is what I get:


> op1
    DESCRIPTION CLOSING.PRICE POSITION       DATE
1 PRIMARY NICKEL   25,755.7100        0 2010-03-05
2 PRIMARY NICKEL   25,760.8600        0 2010-03-10
3 PRM HGH GD ALU    2,415.9000        0 2010-04-09
4 SPCL HIGH GRAD    2,388.4300        0 2010-01-25
5 SPCL HIGH GRAD    2,420.7300        1 2010-04-08
6 SPCL HIGH GRAD    2,421.0500        1 2010-04-09
7 STANDARD LEAD     2,355.9600       -1 2010-04-01
8 STANDARD LEAD     2,357.1200        1 2010-04-06

Not exactly what I want. Can anyone help?
TY



Re: [R] Return a variable name

2010-04-16 Thread Duncan Murdoch

On 16/04/2010 5:26 AM, soeren.vo...@eawag.ch wrote:

Hello,

how can I return the name of a variable, say a$b, from a function?


Use deparse(substitute(x)), not just substitute(x).  By the way, to be 
picky, a$b is not the name of a variable.  It is an expression that 
extracts the b element of a.
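
A minimal sketch of this suggestion, reusing the names fun and a from the question below. The variable label is introduced here only for illustration.

```r
# Return the expression supplied as `x` as a character string
# instead of an unevaluated "language" object.
fun <- function(x) {
  deparse(substitute(x))
}

a <- data.frame(b = 1:10)
label <- fun(a$b)

is.character(label)  # the result can be used like any other string
label
```

Because substitute() captures the expression before it is evaluated, this should work for any expression passed to fun, not only for a$b.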


Duncan Murdoch



fun <- function(x){
   return(substitute(x));
}
a  <- data.frame(b=1:10);
fun(a$b)

... returns a$b, but this is of type "language", so I can't use it as a
character string, can I? How?


Thanks for help,

Sören



Re: [R] Fwd: Counting number of values by row (text, not numbers)

2010-04-16 Thread Ista Zahn
Hi Laura,
Usually this kind of thing is easier if you put your data into a long
format. I would use something like

Dat <- read.table(textConnection("subject A B C D E
1 cat monkey cat dog cat
2 cat cat cat cat dog"), header=TRUE)

library(reshape)
m.Dat <- melt(Dat, id="subject")
xtabs(~subject+value, m.Dat)
       value
subject cat monkey dog
      1   3      1   1
      2   4      0   1

but you could use the reshape function instead of reshape::melt.
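
If reshaping is not wanted, the counts can also be added directly as new columns in base R. This is only a sketch under the assumption that the data look like the two rows posted in the question; the names answer_cols, animals, counts and result are made up for the illustration.

```r
Dat <- data.frame(
  subject = 1:2,
  A = c("cat", "cat"),
  B = c("monkey", "cat"),
  C = c("cat", "cat"),
  D = c("dog", "cat"),
  E = c("cat", "dog"),
  stringsAsFactors = FALSE
)

answer_cols <- c("A", "B", "C", "D", "E")
animals <- c("cat", "dog", "monkey")

# For each animal, compare every answer column with that animal and
# count the matches per row; sapply() returns one column per animal.
counts <- sapply(animals, function(a) rowSums(Dat[, answer_cols] == a))

# Append the three count columns to the original data.
result <- cbind(Dat, counts)
result
```

Listing the animals explicitly means that a value missing from a given row is counted as 0 rather than dropped.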

-Ista



Re: [R] data frame manipulation

2010-04-16 Thread Ista Zahn
It works for me...

> DF <-
+ structure(list(DESCRIPTION = c("PRM HGH GD ALU", "PRM HGH GD ALU",
+ "PRIMARY NICKEL", "PRIMARY NICKEL", "PRIMARY NICKEL", "PRIMARY NICKEL",
+ "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ",
+ "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ",
+ "SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD",
+ "SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD", "SPCL HIGH GRAD",
+ "SPCL HIGH GRAD", "SPCL HIGH GRAD"), CREATED.DATE = structure(c(14708,
+ 14708, 14672, 14673, 14678, 14678, 14700, 14700, 14700, 14700,
+ 14700, 14700, 14700, 14705, 14707, 14707, 14707, 14708, 14708,
+ 14708, 14708, 14708, 14622, 14634), class = "Date"), QUANITY = c(-1,
+ 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1,
+ -1, 1, 1, 1, -1), CLOSING.PRICE = c("2,415.9000", "2,415.9000",
+ "25,755.7100", "25,755.7100", "25,760.8600", "25,760.8600", "2,355.9600",
+ "2,355.9600", "2,355.9600", "2,355.9600", "2,355.9600", "2,355.9600",
+ "2,355.9600", "2,357.1200", "2,420.7300", "2,420.7300", "2,420.7300",
+ "2,421.0500", "2,421.0500", "2,421.0500", "2,421.0500", "2,421.0500",
+ "2,388.4300", "2,388.4300")), .Names = c("DESCRIPTION", "CREATED.DATE",
+ "QUANITY", "CLOSING.PRICE"), row.names = 26:49, class = "data.frame")

> library(plyr)

> op=ddply(DF, c("DESCRIPTION"), summarise, POSITION=
+ sum(QUANITY),DATE=max(CREATED.DATE), SETTLEMENT =
+ CLOSING.PRICE[CREATED.DATE == max(CREATED.DATE)])

> op <- unique(op)
> op
      DESCRIPTION POSITION       DATE  SETTLEMENT
1  PRIMARY NICKEL        0 2010-03-10 25,760.8600
3  PRM HGH GD ALU        0 2010-04-09  2,415.9000
5  SPCL HIGH GRAD        2 2010-04-09  2,421.0500
10 STANDARD LEAD         0 2010-04-06  2,357.1200
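
One plausible explanation for the NA values in the other attempt, judging only from the command as it was posted in this thread: that command has a single "=" inside the square brackets where this one has "==". Inside [ ], argument names are ignored, so a single "=" passes the date itself as a position instead of a logical test. The sketch below uses two made-up rows (price and created are illustrative names, not objects from the thread).

```r
price <- c("2,355.9600", "2,357.1200")
created <- as.Date(c(14700, 14705), origin = "1970-01-01")

# Logical comparison: picks the price recorded on the latest date.
with_double_equals <- price[created == max(created)]

# Single "=": the name `created` is ignored and the date's underlying
# number (14705) is used as a position, far beyond the end of `price`.
with_single_equals <- price[created = max(created)]

with_double_equals
with_single_equals
```

If this is indeed the cause, changing "=" to "==" in the subscript should be the only fix needed.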


-Ista


Re: [R] data frame manipulation

2010-04-16 Thread arnaud Gaboury
Excellent!! You saved me hours and hours of turning around and around.
TY so much.





[R] Help on rKward

2010-04-16 Thread Ronaldo Reis Junior
Hi,

I'm testing rKward, and it has become a great GUI for R on Linux, mainly
for new Linux users. I'm not a new Linux user, and I use Emacs for my own
R scripts, but I always try new GUIs and IDEs so that I can recommend them
to my students. The most common difficulty for new R and Linux users is:
"I have installed R on my Linux machine but I can't find the R icon." This
happens because R on Linux provides only the R console in its basic
installation. Once past this barrier, users can install JGR, Rcmdr, etc.
In any case, rKward is the best way for new Linux users to start using R.
Now to my specific problem.

The problem is the use of rKward by someone who is a heavy R user but a new
Linux user. Heavy R users don't use menus (they use rKward more as a script
IDE than as an R GUI) and normally don't need the rKward output, which is
produced automatically when using the rKward menus or manually with the
rk.print() and rk.header() functions. They normally use Sweave or ASCII
output with comments. This is my problem: is there any way to save my ASCII
output from rKward to a file without having to copy and paste?

Thanks a lot.
See you,
Ronaldo

-- 
14th law - Generally, only when you can publish your results
   are they good enough to be part of your dissertation.

   --Herman, I. P. 2007. Following the law. NATURE, Vol 445, p. 228.
  Prof. Ronaldo Reis Júnior
|  .''`. UNIMONTES/DBG/Lab. Ecologia Comportamental e Computacional
| : :'  : Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia
| `. `'` CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil
|   `- Fone: (38) 3229-8192 | ronaldo.r...@unimontes.br
| http://www.ppgcb.unimontes.br/lecc | LinuxUser#: 205366




[R] merge

2010-04-16 Thread n.via...@libero.it
I have a problem with the merge command.
I have to merge two data frames that look like the following example:

CODPROD  N1  N3  N4
23        3  55   4
24        5  67  36
25        3  73  24



second data frame


CODPROD  N1  N2
30       34  45
45        0  78
65        0  56


The result should be:

CODPROD  N1  N2  N3  N4
23        3  NA  55   4
24        5  NA  67  36
25        3  NA  73  24
30       34  45  NA  NA
45        0  78  NA  NA
65        0  56  NA  NA

Does anyone know how to do it?



[R] Weights in binomial glm

2010-04-16 Thread Jan van der Laan
I have some questions about the use of weights in binomial glm as I am
not getting the results I would expect. In my case the weights I have
can be seen as 'replicate weights'; one respondent i in my dataset
corresponds to w[i] persons in the population. From the documentation
of the glm method, I understand that the weights can indeed be used
for this: "For a binomial GLM prior weights are used to give the
number of trials when the response is the proportion of successes."
From "Modern Applied Statistics with S-Plus", 3rd ed., I understand the
same.

However, I am getting some strange results. I generated an example:

Generate some data which is similar to my dataset:
> Z <- rbinom(1000, 1, 0.1)
> W <- round(rnorm(1000, 100, 40))
> W[W < 1] <- 1

Probability of success can either be estimated using:
> sum(Z*W)/sum(W)
[1] 0.09642109

Or using glm:
> model <- glm(Z ~ 1, weights=W, family=binomial())
Warning message:
In glm.fit(x = X, y = Y, weights = weights, start = start, etastart =
etastart,  :
  fitted probabilities numerically 0 or 1 occurred
> predict(model, type="response")[1]
           1
2.220446e-16

These two results are obviously not the same. The strange thing is
that when I scale the weights, such that the total equals one, the
probability is correctly estimated:

> model <- glm(Z ~ 1, weights=W/sum(W), family=binomial())
Warning message:
In eval(expr, envir, enclos) : non-integer #successes in a binomial glm!
> predict(model, type="response")[1]
         1
0.09642109


However scaling of the weights should, as far as I am aware, not have
an effect on the estimated parameters. I also tried some other
scalings. And, for example scaling the weights by 20 also gives me the
correct result.

> model <- glm(Z ~ 1, weights=W/20, family=binomial())
Warning message:
In eval(expr, envir, enclos) : non-integer #successes in a binomial glm!
> predict(model, type="response")[1]
         1
0.09642109


Am I misinterpreting the weights? Could this be a numerical problem?

Regards,

Jan



[R] Odp: merge

2010-04-16 Thread Petr PIKAL
Hi



merge(data1, data2, by="CODPROD", all=TRUE)

should work. So what does not work in your case?
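
A small sketch of how the choice of key columns affects the outcome, using the example values from the question. The object names by_code and by_shared are made up here. Since both data frames contain N1, merging on CODPROD alone would be expected to give separate N1.x and N1.y columns, whereas merging on all shared columns (the default for by) keeps a single N1.

```r
data1 <- data.frame(CODPROD = c(23, 24, 25), N1 = c(3, 5, 3),
                    N3 = c(55, 67, 73), N4 = c(4, 36, 24))
data2 <- data.frame(CODPROD = c(30, 45, 65), N1 = c(34, 0, 0),
                    N2 = c(45, 78, 56))

# Keying on CODPROD alone keeps the two N1 columns apart (N1.x, N1.y).
by_code <- merge(data1, data2, by = "CODPROD", all = TRUE)

# Keying on every shared column (CODPROD and N1) keeps one N1 column.
by_shared <- merge(data1, data2, all = TRUE)

# Put the columns in the order shown in the question.
by_shared <- by_shared[, c("CODPROD", "N1", "N2", "N3", "N4")]
by_shared
```

With all = TRUE, rows that have no partner in the other data frame are kept and padded with NA.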

Regards
Petr




 


Re: [R] merge

2010-04-16 Thread Gabor Grothendieck
Try this:

library(plyr)
rbind.fill(DF1, DF2)
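
For readers without plyr, the same idea can be sketched in base R: add the missing columns as NA, then stack the rows. This is an illustration of the approach, not plyr's implementation; fill_missing and stacked are hypothetical names, and DF1 and DF2 hold the example values from the question.

```r
DF1 <- data.frame(CODPROD = c(23, 24, 25), N1 = c(3, 5, 3),
                  N3 = c(55, 67, 73), N4 = c(4, 36, 24))
DF2 <- data.frame(CODPROD = c(30, 45, 65), N1 = c(34, 0, 0),
                  N2 = c(45, 78, 56))

all_cols <- union(names(DF1), names(DF2))

# Add every column that `d` lacks as NA, then return the columns
# in a common order so that rbind() can stack the pieces.
fill_missing <- function(d, cols) {
  for (nm in setdiff(cols, names(d))) {
    d[[nm]] <- NA
  }
  d[cols]
}

stacked <- rbind(fill_missing(DF1, all_cols), fill_missing(DF2, all_cols))
stacked
```

Unlike merge, this approach never matches rows on a key; it only appends them, which fits the question because the two data frames share no CODPROD values.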



Re: [R] hugene10stv1cdf

2010-04-16 Thread James MacDonald

Hi Christoph,

Christoph Knapp wrote:

Hi all,
I have just started analysing some microarray chips, and R asked
for this package. When I tried to install it, it said:

Using R version 2.10.1, biocinstall version 2.5.10.
Installing Bioconductor version 2.5 packages:
[1] hugene10stv1cdf
Please wait...

Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
  package ‘hugene10stv1cdf’ is not available

What am I doing wrong, and where can I get this package from?


This is a Bioconductor package, so the correct list to ask this is the 
Bioconductor-help list, not R-help.


But what you want is

dat <- ReadAffy(cdfname="hugene10stv1.r3cdf")

and go from there. Affy has the unfortunate habit of naming related data 
with inconsistent names, and when I built this package last time I 
didn't notice the inconsistency.


Best,

Jim




Thanks

Christoph



--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues 




Re: [R] Exporting an rgl graph

2010-04-16 Thread Michael Friendly

l...@stat.uiowa.edu wrote:

The current issue of JCGS (Vol 19 No 1,
http://pubs.amstat.org/toc/jcgs/19/1) has an editorial on including
animations, 3D visualizations, and movies in on-line PDF files
supporting JCGS articles. The online supplements to the editorial
include examples.  The 3D examples related to the misc3d packages are
also available in
http://www.stat.uiowa.edu/~luke/R/misc3d/misc3d-pdf/.  At some point
the code there will be added to misc3d.  It should be possible to
adapt these ideas to other objects rendered with rgl.

luke


Luke,
Your misc3d-pdf example is very instructive and the .tex file shows how
to embed in LaTeX.  Thanks! (JCGS 19(1) is actually one of the nicest
issues in a long time.)
Of the two approaches you
describe, the Asymptote route seems easier and more capable than the
MeshLab one.

It would be particularly useful to have this capability available for 
rgl.  Any plans for this?


One note:  With Adobe Acrobat Pro 9.3.1, the U3D and PRC images display
on screen, but do not print (replaced by the filename).  Is this your
experience too?

-Michael


--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    http://www.math.yorku.ca/SCS/friendly.html
Toronto, ONT  M3J 1P3 CANADA



[R] problem with FUN in Hmisc::summarize

2010-04-16 Thread arnaud chozo
Hi all,

I'd like to use the Hmisc::summarize function, but it uses a function (FUN)
of a single vector argument to create the statistical summaries.

Consider an easy case: I'd like to compute the correlation between two
variables in my dataframe, grouped according to other variables in the same
dataframe.

For example, consider the following dataframe D:
V1  V2  V3
A    1  -1
A    1   1
A   -1  -1
B    1   1
B    1   1

I'd like to use Hmisc::summarize(X=D, by=llist(myvar=D$V1), FUN=corr.V2.V3)

where corr.V2.V3 is defined as follows:

corr.V2.V3 = function(x) {
  d = cbind(x$V2, x$V3)

  out = c(cor(d))
  names(out) = c("CORR")
  return(out)
}

I was not able to use Hmisc::summarize in this case because my FUN needs a
matrix (or data frame) argument rather than a single vector. Any idea?

Thanks in advance,
Arnaud
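
One way to get a per-group correlation without Hmisc is to split the data frame and apply cor() to each piece. The following is only a base R sketch using the five rows posted above; corr_by_group is a made-up name, and nothing here relies on how Hmisc::summarize handles its X argument.

```r
D <- data.frame(V1 = c("A", "A", "A", "B", "B"),
                V2 = c(1, 1, -1, 1, 1),
                V3 = c(-1, 1, -1, 1, 1))

# Split the rows by V1, then compute one correlation per group.
# Group B is constant in both columns, so cor() returns NA with a
# warning; the warning is silenced only to keep the sketch quiet.
corr_by_group <- suppressWarnings(
  sapply(split(D, D$V1), function(g) cor(g$V2, g$V3))
)
corr_by_group
```

Because each piece produced by split() is itself a data frame, the function applied to it can use as many columns as it needs.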



Re: [R] Weights in binomial glm

2010-04-16 Thread ONKELINX, Thierry
Jan,

It looks like you did not understand the line "For a binomial GLM prior
weights are used to give the number of trials when the response is the
proportion of successes."

Weights must be a number of trials (hence integers), not a proportion of
a population. Here is an example that clarifies the use of weights.

library(boot)
library(reshape)
dataset <- data.frame(Person = c(rep("A", 20), rep("B", 10)), Success =
c(rbinom(20, 1, 0.25), rbinom(10, 1, 0.75)))
Aggregated <- cast(Person ~ ., data = dataset, value = "Success", fun =
list(mean, length))

m0 <- glm(Success ~ 1, data = dataset, family = binomial)
m1 <- glm(mean ~ 1, data = Aggregated, family = binomial, weights =
length)

inv.logit(coef(m0))
inv.logit(coef(m1))

Have a look at the survey package if you want to analyse stratified
data.
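
The same comparison can be sketched with base R only, using fixed outcomes so that the result is reproducible. The data values and the names aggregated, prop, trials, p0 and p1 are invented for this illustration; plogis() plays the role of inv.logit().

```r
# Fixed 0/1 outcomes for two hypothetical persons:
# A has 5 successes in 20 trials, B has 7 successes in 10 trials.
dataset <- data.frame(
  Person  = rep(c("A", "B"), times = c(20, 10)),
  Success = c(rep(c(1, 0), times = c(5, 15)), rep(c(1, 0), times = c(7, 3)))
)

# One row per person: proportion of successes and number of trials.
aggregated <- data.frame(
  Person = c("A", "B"),
  prop   = as.vector(tapply(dataset$Success, dataset$Person, mean)),
  trials = as.vector(tapply(dataset$Success, dataset$Person, length))
)

# Individual 0/1 responses versus proportions weighted by trial counts.
m0 <- glm(Success ~ 1, data = dataset, family = binomial)
m1 <- glm(prop ~ 1, data = aggregated, family = binomial, weights = trials)

p0 <- plogis(coef(m0))  # inverse logit of the intercept
p1 <- plogis(coef(m1))
c(p0, p1)
```

Both fits should recover the overall success proportion, 12 out of 30, which illustrates that the weights act as trial counts attached to a proportion.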

Thierry



ir. Thierry Onkelinx
Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
  



[R] how to use the neural networks package for time series prediction

2010-04-16 Thread david sabine
Hello all,
Does anyone know how to use the neural networks package for time series
prediction? Do you have a similar example in R?
Thanks in advance,
David



[R] data.frame and ddply

2010-04-16 Thread arnaud Gaboury
Dear group,

Here is my df :


futures <-
structure(list(CONTRAT = c("WHEAT May/10 ", "WHEAT May/10 ", 
"WHEAT May/10 ", "WHEAT May/10 ", "COTTON NO.2 May/10 ", 
"COTTON NO.2 May/10 ", "COTTON NO.2 May/10 ", "PLATINUM Jul/10 ", 
" SUGAR NO.11 May/10 ", " SUGAR NO.11 May/10 ", " SUGAR NO.11 May/10 ", 
" SUGAR NO.11 May/10 ", " SUGAR NO.11 May/10 ", 
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ", 
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ", 
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ", 
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ", 
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ", 
"ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 "), 
QUANTITY = c(1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 2, 1, 
1, 1, 2, 1, 1, 1, 1, 2, 1, 1), SETTLEMENT = c("467.7500", 
"467.7500", "467.7500", "467.7500", "78.1300", "78.1300", 
"78.1300", "1,739.4000", "16.5400", "16.5400", "16.5400", 
"16.5400", "16.5400", "1,353.", "1,353.", "1,353.", 
"1,353.", "1,353.", "1,353.", "1,353.", "1,353.", 
"1,353.", "1,353.", "1,353.", "1,353.")), .Names =
c("CONTRAT", 
"QUANTITY", "SETTLEMENT"), row.names = c(NA, 25L), class = "data.frame")

Here is my code :

opfut=ddply(futures, c("CONTRAT","SETTLEMENT"), summarise, POSITION=
sum(QUANTITY))

Here is the output:

> opfut
                      CONTRAT SETTLEMENT POSITION
1          SUGAR NO.11 May/10    16.5400        5
2          COTTON NO.2 May/10    78.1300        3
3             PLATINUM Jul/10 1,739.4000       -1
4  ROBUSTA COFFEE (10) May/10     1,353.       15
5                WHEAT May/10   467.7500        4

It is almost exactly what I want, except that I expect the POSITION column
before the SETTLEMENT column. How can I modify my code to obtain this?

TY



***
Arnaud Gaboury
Mobile: +41 79 392 79 56
BBM: 255B488F



Re: [R] data.frame and ddply

2010-04-16 Thread Felipe Carrillo
You can do something like this after the output from opfut:
opfut <- data.frame(opfut$CONTRAT,opfut$POSITION,opfut$SETTLEMENT)
names(opfut) <- c('CONTRAT','POSITION','SETTLEMENT')
opfut
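
A shorter route is to index the data frame by column name in the order wanted. The following is a minimal sketch on a small made-up data frame (`opfut_demo` and its values are illustrative stand-ins, not the poster's ddply() result):

```r
# Illustrative data frame standing in for the ddply() result (made-up values)
opfut_demo <- data.frame(
  CONTRAT    = c("SUGAR", "COTTON", "WHEAT"),
  SETTLEMENT = c("16.5400", "78.1300", "467.7500"),
  POSITION   = c(5, 3, 4),
  stringsAsFactors = FALSE
)

# Reorder the columns by naming them in the order wanted
opfut_demo <- opfut_demo[, c("CONTRAT", "POSITION", "SETTLEMENT")]
names(opfut_demo)
```

Indexing keeps the original column names, so no separate names() assignment is needed.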
 
Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA





[R] vector matching

2010-04-16 Thread Michael Nestrud
Hello all,

I have searched the archives for a similar problem to no avail.  I
could use  your help.

I have a bunch of vectors organized into two matrices, x and y.  These
vectors (as rows) consist of combinations of elements such that order
does not matter.

I want to create a third matrix from the first two, which is basically
all the rows in x and all the rows in y, excluding the rows that they
both have in common.

%in% seems to match individual elements, not entire rows, so something
else is needed.
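
One possible approach, sketched in base R under the assumption that two rows count as equal when they hold the same elements in any order: build an order-independent key per row and compare keys. The matrices below are made up for illustration, and `row_key` is a hypothetical helper name.

```r
# Hypothetical example matrices; each row is an unordered combination
x <- rbind(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9))
y <- rbind(c(3, 2, 1), c(10, 11, 12))

# Build an order-independent key for each row by sorting its elements
row_key <- function(m) apply(m, 1, function(r) paste(sort(r), collapse = "_"))
kx <- row_key(x)
ky <- row_key(y)

# Keep the rows of x not found in y, and the rows of y not found in x
z <- rbind(x[!(kx %in% ky), , drop = FALSE],
           y[!(ky %in% kx), , drop = FALSE])
z
```

Here the row (1, 2, 3) in x matches (3, 2, 1) in y, so both are dropped. The `drop = FALSE` keeps the result a matrix even when only one row survives.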

Any help is appreciated.

Thanks,

-Michael

-- 
Michael A. Nestrud
Cornell U. Sensory Science PhD Candidate
m...@ataraxis.org
All that you taste... all that you eat.



Re: [R] how can I plot the histogram like this using R?

2010-04-16 Thread bbslover

 Thanks for your reply. I just want to produce a figure like y1.jpg using the
data from y1.txt.
 From the figure I want to obtain the split point, as in y1.jpg, and
take 2.5 as the split point. The figure was drawn by someone else; I
want to draw it in R but have not managed to, so I hope someone can help me.
 
Best wishes!
kevin http://n4.nabble.com/file/n1965378/y1.jpg 
http://n4.nabble.com/file/n1965378/y1.txt y1.txt 
-- 
View this message in context: 
http://n4.nabble.com/how-can-I-plot-the-histogram-like-this-using-R-tp1839303p1965378.html
Sent from the R help mailing list archive at Nabble.com.



[R] R loop.

2010-04-16 Thread mhalsham

Hi everyone, I'm new to R and I can't figure out how to use a loop to do the
following task; any help would be much appreciated.
I have a file called table3.txt that contains over 1000 rows and over 40
columns.
For example, the first row looks like this:

Deafness, EYA4, DIAPH1, MYO7A, TECTA, COL11A2, POU4F3, MYH9, ACTG1, MYO6

I want the loop to go through each row, take the first cell (Deafness) and
the second (EYA4), and put that pair at the bottom of the file; then take
the first cell again (Deafness) with the third cell (DIAPH1) and put that
pair at the bottom of the file, and so on, until I end up with two columns:
one containing all the diseases and one containing all the genes.
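
A minimal base R sketch of this reshaping, assuming each line holds a disease name followed by its genes, separated by commas. The second input line is made up for illustration; for the real file the lines would come from something like readLines("table3.txt").

```r
# Hypothetical rows in the layout described: disease first, then its genes
lines <- c("Deafness, EYA4, DIAPH1, MYO7A",
           "Anemia, HBB, HBA1")

# Split each line on commas and trim the surrounding whitespace
fields <- lapply(strsplit(lines, ","), trimws)

# Pair the first field (disease) with each remaining field (gene)
pairs <- do.call(rbind, lapply(fields, function(f) {
  data.frame(disease = f[1], gene = f[-1], stringsAsFactors = FALSE)
}))
pairs
```

No explicit loop over cells is needed because `f[1]` is recycled against every gene in `f[-1]`. Rows padded with empty cells would need those cells filtered out first.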

-- 
View this message in context: http://n4.nabble.com/R-loop-tp1979620p1979620.html
Sent from the R help mailing list archive at Nabble.com.



[R] Image RGB calculation

2010-04-16 Thread ole_roessler

Dear all,

I need to read an image (mostly JPEG), split it into its colour channels,
and compute a per-pixel value from them like this:

sqrt(R²+G²+B²)


Do you know which package I need for this, and is it possible?

Thanks a lot

Ole
-- 
View this message in context: 
http://n4.nabble.com/Image-RGB-calculation-tp1989864p1989864.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Problems getting symbols() to show table data

2010-04-16 Thread Guy Green

Thanks.  I don't think I would ever have worked that twist out.  It is
perfect.

Guy
-- 
View this message in context: 
http://n4.nabble.com/Problems-getting-symbols-to-show-table-data-tp1839676p1989384.html
Sent from the R help mailing list archive at Nabble.com.



[R] generating a SpatialLinesDataFrame (rgdal)

2010-04-16 Thread Simon Goodman

Could somebody give me a pointer on how to generate a SpatialLinesDataFrame
from a data frame that contains lat, long coordinates as separate variables?

At the moment the data looks like this:

 lat long
[1] 53.   1.

whereas the SpatialLinesDataFrame consists of

Coordinates
[1] (53.xxx, 1.xxx)

This is probably a trivial issue, but I'm a relatively new user and searching
the documentation hasn't yielded an obvious way to do it so far.

Thanks, Simon
-- 
View this message in context: 
http://n4.nabble.com/generating-a-SpatialLinesDataFrame-rgdal-tp1990352p1990352.html
Sent from the R help mailing list archive at Nabble.com.



[R] Removing empty (or very underpopulated) sub-populations

2010-04-16 Thread Kamil Sijko
Hi,

I'm trying to develop a function that will simplify the most common analyses
in my area of interest (social sciences) by computing all required
statistics in one run (for example, in the case of a factor and a numeric
variable: 1) a normality test, then, if the variables are normal, 2) ANOVA
3) with effect-size estimation and an appropriate graph).
I test normality in each group with this code:

are.normal <- c()
group <- as.factor(group)
for (i in 1:length(levels(factor(group)))) {
 are.normal[i] <- normality(response[group==levels(factor(group))[i]])
}

where: 1) response is the response (numeric variable), 2) group is the
grouping variable (factor), 3) normality is a function which takes one
variable as its argument and tries to figure out whether it's normal (TRUE)
or not (FALSE).

My problem is that sometimes some combinations of response~group produce
empty or very underpopulated groups (e.g. when you examine the relation
between country of origin and age of respondents, and it turns out that
you have only one person from some country). This causes my function to
fail.

I've been wondering whether there is some way to exclude those
underpopulated groups from the analysis?
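
One way to do this, as a minimal sketch: count the observations per level with table(), keep only levels at or above a threshold, and drop the emptied levels. The data, the threshold `min_n`, and the use of shapiro.test() as a stand-in for the poster's normality() function are all assumptions made for illustration.

```r
# Made-up data: group "c" has a single observation
response <- c(5.1, 4.9, 5.6, 6.2, 5.8, 6.0, 7.3)
group    <- factor(c("a", "a", "a", "b", "b", "b", "c"))

min_n  <- 3                                # assumed minimum group size
counts <- table(group)
keep   <- group %in% names(counts)[counts >= min_n]

response_kept <- response[keep]
group_kept    <- droplevels(group[keep])   # remove the now-empty level "c"

# Per-group test; shapiro.test() stands in for normality()
are.normal <- tapply(response_kept, group_kept,
                     function(v) shapiro.test(v)$p.value > 0.05)
are.normal
```

Using tapply() also replaces the explicit loop over levels, and droplevels() ensures later steps such as aov() do not see empty groups.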

Best regards,
Kamil Sijko



[R] format() method

2010-04-16 Thread kafkaz

Hello,
I use format() function to get number of the week, like this:
format(tmp,'%U')
Recently, I have spotted something bizarre. For example, I have such object:
(index(tmp$x.delta['2009'][1:16]))
 [1] 2009-01-02 CET  2009-01-09 CET  2009-01-16 CET  2009-01-23 CET 
 [5] 2009-01-30 CET  2009-02-06 CET  2009-02-13 CET  2009-02-20 CET 
 [9] 2009-02-27 CET  2009-03-06 CET  2009-03-13 CET  2009-03-20 CET 
[13] 2009-03-27 CET  2009-04-03 CEST 2009-04-09 CEST 2009-04-17 CEST
dput(index(tmp$x.delta['2009'][1:16]),'%U',file='as.date')
structure(c(1230850800, 1231455600, 1232060400, 1232665200, 123327, 
1233874800, 1234479600, 1235084400, 1235689200, 1236294000, 1236898800, 
1237503600, 1238108400, 1238709600, 1239228000, 1239919200), tzone =
structure("", .Names = "TZ"), class = c("POSIXt", 
"POSIXct"))
To get number of the week I run:
format(index(tmp$x.delta['2009'][1:16]),'%U')
Here is the output. The odd thing is that the first week number is 00.
 [1] "00" "01" "02" "03" "04" "05" "06" "07" "08" "09" "10" "11" "12" "13" "14"
[16] "15"

Is this a bug, my mistake, or is it supposed to be like that?
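
This looks like documented behaviour rather than a bug: per ?strptime, %U is the week of the year (00 to 53) with Sunday as the first day of the week, so days before the first Sunday of the year fall in week 00. A small sketch with three dates from early 2009 (the middle one is added here for illustration):

```r
# 2009-01-02 is a Friday, 2009-01-04 is the first Sunday of 2009
d <- as.Date(c("2009-01-02", "2009-01-04", "2009-01-09"))

day_of_week <- format(d, "%w")   # 0 = Sunday
week_U      <- format(d, "%U")   # week of year, weeks starting on Sunday

data.frame(date = d, day_of_week, week_U)
```

If weeks should be counted from 01, ?strptime also documents %V (the ISO 8601 week number), though support for it on output has varied between platforms.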
Thank you,
kafka
-- 
View this message in context: 
http://n4.nabble.com/format-method-tp1999753p1999753.html
Sent from the R help mailing list archive at Nabble.com.



[R] xyplot ontop a contourplot (package: lattice)

2010-04-16 Thread Jay
Hello,

I have a contourplot that shows the data I want. However, I would
like to mark a number of points on this plot via
xyplot().

Example:

x <- seq(pi/4, 5 * pi, length.out = 100)
y <- seq(pi/4, 5 * pi, length.out = 100)
r <- as.vector(sqrt(outer(x^2, y^2, "+")))
grid <- expand.grid(x=x, y=y)
grid$z <- cos(r^2) * exp(-r/(pi^3))
levelplot(z~x*y, grid, cuts = 50, panel.xyplot(x~y))


But the points do not show up. What is the correct way to achieve
this?
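
In lattice, extra layers are usually added through a custom panel function passed as the `panel` argument, rather than by passing a panel call as a positional argument. The following is a sketch under that assumption; the overlay points in `pts` are made up for illustration.

```r
library(lattice)

x <- seq(pi/4, 5 * pi, length.out = 100)
y <- seq(pi/4, 5 * pi, length.out = 100)
grid <- expand.grid(x = x, y = y)
r <- sqrt(grid$x^2 + grid$y^2)
grid$z <- cos(r^2) * exp(-r / (pi^3))

# Hypothetical points to overlay on the surface
pts <- data.frame(x = c(3, 8, 12), y = c(4, 10, 6))

p <- levelplot(z ~ x * y, grid, cuts = 50,
               panel = function(...) {
                 panel.levelplot(...)              # draw the surface first
                 panel.points(pts$x, pts$y,        # then the points on top
                              pch = 19, col = "black")
               })
```

Printing `p` (or typing it at the prompt) draws the plot. The order inside the panel function matters: whatever is drawn last ends up on top.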



Re: [R] problem with FUN in Hmisc::summarize

2010-04-16 Thread Ista Zahn
Hi Arnaud,
I'm not sure how to do this with Hmisc::summarize, but it's pretty easy with
plyr::ddply:

D <- read.table(textConnection("V1  V2   V3
A 1 -1
A 1 1
A -1 -1
B 1 1
B 1 1"), header=TRUE)
closeAllConnections()

corr.V2.V3 = function(x) {
 out = cor(x$V2, x$V3)
 names(out) = "CORR"
 return(out)
}

library(plyr)

ddply(D, .(V1), corr.V2.V3)
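
For comparison, the same per-group correlation can be sketched in base R with split() and sapply(), with no extra packages. The name `corr_by_group` is introduced here for illustration only.

```r
D <- data.frame(V1 = c("A", "A", "A", "B", "B"),
                V2 = c(1, 1, -1, 1, 1),
                V3 = c(-1, 1, -1, 1, 1))

# One correlation per level of V1; group B is constant, so cor() gives NA
# (with a warning about zero standard deviation, suppressed here)
corr_by_group <- suppressWarnings(
  sapply(split(D, D$V1), function(g) cor(g$V2, g$V3))
)
corr_by_group
```

Because split() hands each group to the function as a data frame, the function can use any number of columns, which is the restriction that makes Hmisc::summarize awkward here.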

-Ista

On Fri, Apr 16, 2010 at 9:21 AM, arnaud chozo <arnaud.ch...@gmail.com> wrote:

 Hi all,

 I'd like to use the Hmisc::summarize function, but it uses a function (FUN)
 of a single vector argument to create the statistical summaries.

 Consider an easy case: I'd like to compute the correlation between two
 variables in my dataframe, grouped according to other variables in the same
 dataframe.

 For example, consider the following dataframe D:
 V1  V2   V3
 A    1   -1
 A    1    1
 A   -1   -1
 B    1    1
 B    1    1

 I'd like to use Hmisc::summarize(X=D, by=llist(myvar=D$V1), FUN=corr.V2.V3)

 where corr.V2.V3 is defined as follows:

 corr.V2.V3 = function(x) {
  d = cbind(x$V2, x$V3)

  out = c(cor(d))
  names(out) = c("CORR")
  return(out)
 }

 I was not able to use Hmisc::summarize in this case because FUN should be a
 function of a matrix argument. Any idea?

 Thanks in advance,
 Arnaud





-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org



Re: [R] Weights in binomial glm

2010-04-16 Thread Jan van der Laan
Thierry,

Thank you for your answer.

From the documentation it looks like it is valid to assume that the
weights can be used for replicate weights. Continuing your example:

dataset$Success2 <- dataset$Success
Aggregated2 <- cast(Person+Success ~ ., data = dataset, value =
"Success2", fun = list(mean, length))
m2 <- glm(mean ~ 1, data = Aggregated2, family = binomial, weights = length)

In this case the weights can be seen as replicate weights. In my case
the proportion of successes for each group is either 0 or 1.

I am familiar with the survey package. However, in this case there
should not be difference between the two as far as the parameter
estimates are concerned (the standard errors are incorrect for glm).

The strange thing in this case is that the estimates seem to depend on
the scaling of the weights, which should not be the case. Also in your
example scaling the weights gives the same estimate:

m1 <- glm(mean ~ 1, data = Aggregated, family = binomial, weights = length/10)

Regards,
Jan



On Fri, Apr 16, 2010 at 3:19 PM, ONKELINX, Thierry
<thierry.onkel...@inbo.be> wrote:
 Jan,

 It looks like you did not understand the line "For a binomial GLM prior
 weights are used to give the number of trials when the response is the
 proportion of successes."

 Weights must be a number of trials (hence integer), not a proportion of
 a population. Here is an example that clarifies the use of weights.

 library(boot)
 library(reshape)
 dataset <- data.frame(Person = c(rep("A", 20), rep("B", 10)), Success =
 c(rbinom(20, 1, 0.25), rbinom(10, 1, 0.75)))
 Aggregated <- cast(Person ~ ., data = dataset, value = "Success", fun =
 list(mean, length))

 m0 <- glm(Success ~ 1, data = dataset, family = binomial)
 m1 <- glm(mean ~ 1, data = Aggregated, family = binomial, weights =
 length)

 inv.logit(coef(m0))
 inv.logit(coef(m1))

 Have a look at the survey package if you want to analyse stratified
 data.

 Thierry

 
 
 ir. Thierry Onkelinx
 Instituut voor natuur- en bosonderzoek
 team Biometrie & Kwaliteitszorg
 Gaverstraat 4
 9500 Geraardsbergen
 Belgium

 Research Institute for Nature and Forest
 team Biometrics & Quality Assurance
 Gaverstraat 4
 9500 Geraardsbergen
 Belgium

 tel. + 32 54/436 185
 thierry.onkel...@inbo.be
 www.inbo.be

 To call in the statistician after the experiment is done may be no more
 than asking him to perform a post-mortem examination: he may be able to
 say what the experiment died of.
 ~ Sir Ronald Aylmer Fisher



Re: [R] data.frame and ddply

2010-04-16 Thread arnaud Gaboury
I found a way using the subset command:

opfut=subset(ddply(futures, c("CONTRAT","SETTLEMENT"), summarise, POSITION=
sum(QUANTITY)),select=c(CONTRAT,POSITION,SETTLEMENT))

> opfut
                      CONTRAT POSITION SETTLEMENT
1          SUGAR NO.11 May/10        5    16.5400
2          COTTON NO.2 May/10        3    78.1300
3             PLATINUM Jul/10       -1 1,739.4000
4  ROBUSTA COFFEE (10) May/10       15     1,353.
5                WHEAT May/10        4   467.7500






[R] multiple variables pointing to single dataframe?

2010-04-16 Thread Alex Bryant
Hi, I need to have 2 variables point to the same data frame (d1). I 
don't want to simply copy the data frame ( d2 <- d1 ), as my understanding is
that this will create a second data frame. Any suggestions on best practice here?
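
Some background that may help: R uses copy-on-modify, so d2 <- d1 does not duplicate the data until one of the two names is modified. If the goal is a genuinely shared object, where a change made through one name is visible through the other, one option is to hold the data frame in an environment. A minimal sketch, with `store` and `alias` as made-up names:

```r
d1 <- data.frame(id = 1:3, value = c(10, 20, 30))

# Put the data frame in an environment; environments are not copied on assignment
store <- new.env()
store$df <- d1

alias <- store            # 'alias' and 'store' refer to the same environment
alias$df$value[2] <- 99   # modify through one name

store$df$value            # the change is visible through the other name
d1$value                  # the original d1 is unchanged
```

The trade-off is that every access goes through `store$df`, and functions that modify it have side effects, which is unusual in R code.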

Thank You,

//
// Alex Bryant
// Software Developer
// Integrated Clinical Systems, Inc.
// 908-996-7208





[R] Blocking and Nested ANOVA Design. Am I using the aov() function correctly?

2010-04-16 Thread Eleftheria Dalmaris
Dear list members,

I am a new member and fairly new to the R world! I hope my question is not
beyond the scope of this list. I first searched for similar
experimental designs, without success.

I want to perform an ANOVA analysis using the aov() function. I am not
100% sure that I have it right. If anyone can help me, that will be
greatly appreciated. My design is not balanced for any of the factors.

My main aim is to compare the 5 different AI (aridity index) groups
and to identify a pattern of the response of the AI groups to the
treatment that I applied for the parameter that I measured. In total I
have 24 different populations of a specific tree species. The
population refers to the geographical area that I choose to collect
seeds from and for every population I know the annual rainfall and
annual evapotranspiration. I started with an equal number of replicate
plants per population per treatment, but some died and some were not
healthy enough to include in the experiment.

My design is as follows:

-  Blocks (6 blocks; these are the different days on which I planted my
plants. At the beginning every block had at least one plant for every
population for every treatment. By the end some had died or were not
healthy enough, which is why I have an unbalanced design.)
-  Treatments (2 treatments that I selected, therefore fixed)
-  AI (5 AI groups; the Aridity Index is the ratio of rainfall
to evapotranspiration for each of my populations, so each
population goes into the appropriate AI group. When I selected my
populations I did not select them so as to have a balanced design
from the AI perspective.)
-  Populations nested in AI

and I am interested in the interactions as well.

So if OP is one of the parameters that I measured, I write the following
call, and when I run it I get the ANOVA table shown below:


b <- aov(OP ~ Block + Treat*factor(AI)*(factor(AI)/factor(Pop)))
summary(b)
                              Df  Sum Sq Mean Sq  F value  Pr(>F)
Block                          5   2.187   0.437   2.6350 0.02423 *
Treat                          1 126.656 126.656 762.8590  <2e-16 ***
factor(AI)                     4   2.098   0.525   3.1598 0.01478 *
Treat:factor(AI)               4   1.057   0.264   1.5912 0.17721
factor(AI):factor(Pop)        19   2.990   0.157   0.9478 0.52430
Treat:factor(AI):factor(Pop)  19   2.811   0.148   0.8912 0.59429
Residuals                    245  40.677   0.166

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1



If this is right, I need to correct the F values. Since Pop is nested
within AI, I need to use different F-ratios. The ratios that I am
using are the ones shown in the following table. For simplicity
I use the row numbers 1 to 7 rather than writing out the mean squares (MS).


   Source of variation   df   F-ratio
1  Block                  5   1/7
2  Treat                  1   2/5
3  AI                     4   3/5
4  Treat:AI               4   4/5
5  AI:Pop                19   5/7
6  Treat:AI:Pop          19   6/5
7  Residuals            254


Have I used the aov() function correctly? Can anyone comment on that?



That’s the first thing that I need to confirm.

The other thing is:

If I exclude the factor(AI) that is outside of the parenthesis, I get
the following:


b1 <- aov(OP ~ Block + Treat*(factor(AI)/factor(Pop)))
summary(b1)
                              Df  Sum Sq Mean Sq  F value  Pr(>F)
Block                          5   2.187   0.437   2.6350 0.02423 *
Treat                          1 126.656 126.656 762.8590  <2e-16 ***
factor(AI)                     4   2.098   0.525   3.1598 0.01478 *
factor(AI):factor(Pop)        19   3.056   0.161   0.9689 0.49862
Treat:factor(AI)               4   0.990   0.248   1.4909 0.20551
Treat:factor(AI):factor(Pop)  19   2.811   0.148   0.8912 0.59429
Residuals                    245  40.677   0.166

---

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


The differences between the two tables are very small, but
I’m guessing that one is more correct than the other. Which
one is preferable?
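
One way to see how the two formulas relate, sketched with terms(), which expands a formula without needing the data: both expand to the same set of model terms, and what differs is the order in which the two-way terms are listed. Since aov() reports sequential sums of squares, term order affects the individual rows when the design is unbalanced, which would be consistent with the small differences between the two tables.

```r
f_b  <- OP ~ Block + Treat * factor(AI) * (factor(AI) / factor(Pop))
f_b1 <- OP ~ Block + Treat * (factor(AI) / factor(Pop))

# Expanded term labels for each formula (no data needed)
labels_b  <- attr(terms(f_b),  "term.labels")
labels_b1 <- attr(terms(f_b1), "term.labels")

setequal(labels_b, labels_b1)   # do both formulas contain the same terms?
labels_b
labels_b1
```

Comparing the two label vectors against the row order of the two ANOVA tables shows where the Treat:factor(AI) and factor(AI):factor(Pop) rows swap places.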

I continue with using TukeyHSD, but I’m not going to get into that now.

Not sure if the raw data are necessary but I have attached them.

Thanking you in advance,

Eleftheria
Pop AI  Block   Treat   OP
2   0.2 A   C   1.13
22  0.2 A   C   2.31
3   0.2 A   C   1.56
6   0.2 

Re: [R] Image RGB calculation

2010-04-16 Thread Tobias Verbeke

Hi Ole,

ole_roessler wrote:


I need to read an image (mostly jpg) and split the channel of this image to
an colour channel calculation like this:

sqrt(R²+G²+B²)


Do you have an idea what package I need to use for it, and is it possible?


For general image processing capabilities within R,
I would recommend the EBImage package, which you can
find in the Bioconductor repositories.
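
Once the image is available as a numeric array, the calculation itself needs only base R. The sketch below assumes a height x width x 3 array with one slice per colour channel; the random array `img` is a stand-in for a decoded image, and how the file is read depends on the package used.

```r
# Stand-in for a decoded image: height x width x 3 array with values in [0, 1]
set.seed(1)
img <- array(runif(4 * 5 * 3), dim = c(4, 5, 3))

R <- img[, , 1]
G <- img[, , 2]
B <- img[, , 3]

# Per-pixel magnitude of the colour vector, sqrt(R^2 + G^2 + B^2)
magnitude <- sqrt(R^2 + G^2 + B^2)
dim(magnitude)
```

The arithmetic is vectorised over the whole matrix, so no loop over pixels is needed.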


Hope this helps,
Tobias



[R] Yet Testing rKward

2010-04-16 Thread Ronaldo Reis Junior
Hi,

I am continuing to test RKWard. I don't know how to save the results of a 
script run without using copy and paste or the RKWard output system.

Now I am trying to understand how to adapt my script to use the RKWard output
system.

Example:

I have this script:
--
## Load the richness and evenness table
library(gdata)
dadosriq <- read.xls("Panalise.xls",h=T,sheet=2)

## Data summary
summary(dadosriq)
--

Using the RKWard output system I tried:
--
rk.header("Load the richness and evenness table")
library(gdata)

dadosriq <- read.xls("Panalise.xls",h=T,sheet=2)

rk.header("Data summary")
rk.print(summary(dadosriq))
--

The problems:

1) My script becomes RKWard-specific, which is not a good idea.

2) I can't print the command in the output unless I repeat the command as a 
string:
rk.header("dadosriq <- read.xls('Panalise.xls',h=T,sheet=2)"), which is 
also not a good idea.

Does anyone know whether there is a global RKWard command to send everything
(commands and results) to the output? That way, if I'm an RKWard user I use
this global command, and if I'm not, I comment it out and my 
script still works.

Is this possible, or do I need to forget RKWard as an R script IDE on Linux?

Thanks
Ronaldo

-- 
8th law - Collect your data today as if you knew your equipment would 
break tomorrow.

   --Herman, I. P. 2007. Following the law. NATURE, Vol 445, p. 228.
  Prof. Ronaldo Reis Júnior
|  .''`. UNIMONTES/DBG/Lab. Ecologia Comportamental e Computacional
| : :'  : Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia
| `. `'` CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil
|   `- Fone: (38) 3229-8192 | ronaldo.r...@unimontes.br
| http://www.ppgcb.unimontes.br/lecc | LinuxUser#: 205366




Re: [R] Weights in binomial glm

2010-04-16 Thread ONKELINX, Thierry
Jan,

You misread the documentation in ?glm. Note that glm works with different kinds 
of families, so the first statement about weights is rather general: it holds 
for most of the families. The documentation explicitly tells you that this is 
not the case for the binomial family: "For a binomial GLM prior weights are 
used to give the number of trials when the response is the proportion of 
successes." Nothing more, nothing less.

Scaling the weights will change the results because you change the NUMBER OF 
TRIALS. More trials = more information = lower variances. So you only need to 
give the weights when the response is expressed as a proportion. If you have it 
as a binary variable or as cbind(NumberOfSuccesses, NumberOfFailures) then you 
don't need weights.
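
The three ways of specifying a binomial response mentioned above can be compared on a small made-up data set. This is only a sketch: the data frame `d`, the aggregated table `agg`, and all column names are assumptions for illustration, not objects from this thread.

```r
# Made-up data: person A has 5 successes in 20 trials, B has 7 in 10
d <- data.frame(
  person  = rep(c("A", "B"), times = c(20, 10)),
  success = c(rep(1, 5), rep(0, 15), rep(1, 7), rep(0, 3))
)

# Aggregate to one row per person: trials (n), successes (k), proportion
agg <- data.frame(
  person = c("A", "B"),
  n = as.vector(table(d$person)),
  k = as.vector(tapply(d$success, d$person, sum))
)
agg$prop <- agg$k / agg$n

# 1) binary response, no weights
m_bin <- glm(success ~ 1, family = binomial, data = d)
# 2) two-column response (successes, failures), no weights
m_cbind <- glm(cbind(k, n - k) ~ 1, family = binomial, data = agg)
# 3) proportion response, weights = number of trials
m_prop <- glm(prop ~ 1, family = binomial, weights = n, data = agg)

# Rescaled weights no longer count trials; glm() warns about
# non-integer numbers of successes, which is silenced here on purpose
m_scaled <- suppressWarnings(
  glm(prop ~ 1, family = binomial, weights = n / 10, data = agg)
)

# Standard error of the intercept for a fitted model
se <- function(m) sqrt(diag(vcov(m)))
```

In this intercept-only sketch the first three fits agree, while `se(m_scaled)` comes out roughly sqrt(10) times larger than `se(m_prop)`: dividing the weights by 10 tells glm() that it saw a tenth of the trials.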

Thierry


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey
  

 -----Original Message-----
 From: Jan van der Laan [mailto:djvanderl...@gmail.com] 
 Sent: Friday 16 April 2010 16:09
 To: ONKELINX, Thierry
 CC: r-help@r-project.org
 Subject: Re: [R] Weights in binomial glm
 
 Thierry,
 
 Thank you for your answer.
 
 From the documentation it looks like it is valid to assume 
 that the weights can be used for replicate weights.
 Continuing your example:
 
 dataset$Success2 <- dataset$Success
 Aggregated2 <- cast(Person + Success ~ ., data = dataset,
   value = "Success2", fun = list(mean, length))
 m2 <- glm(mean ~ 1, data = Aggregated2, family = binomial,
   weights = length)
 
 In this case the weights can be seen as replicate weights. In 
 my case the proportion of successes for each group is either 0 or 1.
 
 I am familiar with the survey package. However, in this case 
 there should not be difference between the two as far as the 
 parameter estimates are concerned (the standard errors are 
 incorrect for glm).
 
 The strange thing in this case is that the estimates seem to 
 depend on the scaling of the weights, which should not be the 
 case. Also in your example scaling the weights gives the same 
 estimate:
 
 m1 <- glm(mean ~ 1, data = Aggregated, family = binomial,
   weights = length/10)
 
 Regards,
 Jan
 
 
 
 On Fri, Apr 16, 2010 at 3:19 PM, ONKELINX, Thierry
 thierry.onkel...@inbo.be wrote:
  Jan,
 
  It looks like you did not understand the line "For a binomial GLM
  prior weights are used to give the number of trials when the response
  is the proportion of successes."

  Weights must be a number of trials (hence integer), not a proportion
  of a population. Here is an example that clarifies the use of weights.
 
  library(boot)
  library(reshape)
  dataset <- data.frame(Person = c(rep("A", 20), rep("B", 10)),
    Success = c(rbinom(20, 1, 0.25), rbinom(10, 1, 0.75)))
  Aggregated <- cast(Person ~ ., data = dataset, value = "Success",
    fun = list(mean, length))

  m0 <- glm(Success ~ 1, data = dataset, family = binomial)
  m1 <- glm(mean ~ 1, data = Aggregated, family = binomial,
    weights = length)
 
  inv.logit(coef(m0))
  inv.logit(coef(m1))
 
  Have a look at the survey package if you want to analyse stratified
  data.
 
  Thierry
 
  
 
 
  -----Original Message-----
  From: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org] On Behalf Of Jan van der Laan
  Sent: Friday 16 April 2010 14:11
  To: r-help@r-project.org
  Subject: [R] Weights in binomial glm
 
  I have some questions about the use 

Re: [R] vector matching

2010-04-16 Thread Henrique Dallazuanna
If I understand:

unique(t(apply(rbind(x, y), 1, sort)))
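
Since the question can be read in two ways, a small sketch on made-up matrices may help. The matrices `x` and `y` below are assumptions for illustration; the first result is what the one-liner above computes (all distinct rows, ignoring element order), the second keeps only rows that occur in exactly one of the two matrices.

```r
# Made-up input: row 1 of x and row 1 of y hold the same elements
x <- rbind(c(1, 2, 3), c(4, 5, 6))
y <- rbind(c(3, 2, 1), c(7, 8, 9))

# Sort within each row so that element order no longer matters
sx <- t(apply(x, 1, sort))
sy <- t(apply(y, 1, sort))

# Reading 1: all distinct rows of x and y (same as the one-liner)
all_distinct <- unique(rbind(sx, sy))

# Reading 2: rows found in only one of the two matrices.
# Each row is collapsed to a single string key so %in% compares whole rows.
kx <- apply(sx, 1, paste, collapse = "\r")
ky <- apply(sy, 1, paste, collapse = "\r")
only_one <- rbind(sx[!kx %in% ky, , drop = FALSE],
                  sy[!ky %in% kx, , drop = FALSE])
```

Here `all_distinct` has three rows and `only_one` has two (4 5 6 and 7 8 9).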

On Fri, Apr 16, 2010 at 11:05 AM, Michael Nestrud m...@ataraxis.org wrote:
 Hello all,

 I have searched the archives for a similar problem to no avail.  I
 could use  your help.

 I have a bunch of vectors organized into two matrices, x and y.  These
 vectors (as rows) consist of combinations of elements such that order
 does not matter.

 I want to create a third matrix from the first two, which is basically
 all the rows in x and all the rows in y, excluding the rows that they
 both have in common.

 %in% seems to match individual elements, not entire rows, so something
 else is needed.

 Any help is appreciated.

 Thanks,

 -Michael

 --
 Michael A. Nestrud
 Cornell U. Sensory Science PhD Candidate
 m...@ataraxis.org
 All that you taste... all that you eat.





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



[R] score counts in an aggregate function

2010-04-16 Thread KDT

Dear R-Users,
I have a big data set, mydata, with repeated observations and some missing
values. It looks like this:

userid  sex  item  score1  score2
1       0    1     1       1
1       0    2     0       1
1       0    3     NA      1
1       0    4     1       0
2       1    1     0       1
2       1    2     NA      1
2       1    3     1       NA
2       1    4     NA      0
3       0    1     1       0
3       0    2     1       NA
3       0    3     1       0
3       0    4     0       0

I would like to summarise the dataset so that I get something like this:

userid  sumscore1  countscore1  meanscore1  sumscore2  countscore2  meanscore2
1       2          3            0.67        3          4            0.75
2       1          2            0.50        2          3            0.67
3       3          4            0.75        0          3            0.00

I tried using:
means <- data.frame(aggregate(mydata[, 4:5], by=list(mydata$userid),
                              FUN=mean, na.rm=TRUE))
and
sums <- data.frame(aggregate(mydata[, 4:5], by=list(mydata$userid),
                             FUN=sum, na.rm=TRUE))

so that I could merge the two data frames later. This works quite well, but I
still cannot find a function that gives me a data frame of the counts.
Something like this:
counts <- data.frame(aggregate(mydata[, 4:5], by=list(mydata$userid),
                               FUN=count, na.rm=TRUE))

Any advice?

Trevor
Belgium

-- 
View this message in context: 
http://n4.nabble.com/score-counts-in-an-aggregate-function-tp2007152p2007152.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] GAMM : how to use a smoother for some levels of a variable, and a linear effect for other levels?

2010-04-16 Thread Simon Wood
Both versions of this should have worked, but you are right that the first 
version didn't when used with `gamm', I've fixed this for mgcv 1.6-2 
(`mgcv:gam' was ok). Thanks for this.

best,
Simon

On Wednesday 14 April 2010 09:03, JANSEN, Ivy wrote:
 Hi,

 I was reading the book on Mixed Effects Models and Extensions in
 Ecology with R by Zuur et al.
 In Section 6.2, an example is discussed where a gamm-model is fitted,
 with a smoother for time, which differs for each value of ID (4
 different bird species). In earlier versions of R, the following code
 was used

 BM2<-gamm(Birds~Rain+ID+
     s(Time,by=as.numeric(ID=="Stilt.Oahu"))+
     s(Time,by=as.numeric(ID=="Stilt.Maui"))+
     s(Time,by=as.numeric(ID=="Coot.Oahu"))+
     s(Time,by=as.numeric(ID=="Coot.Maui")),
   correlation=corAR1(form=~Time |ID ),
   weights=varIdent(form=~1|ID))

 However, in the current version of R, this does not work anymore, and
 should be changed into

 BM2<-gamm(Birds~Rain+ID+
     s(Time,by=ID),
   correlation=corAR1(form=~Time |ID ),
   weights=varIdent(form=~1|ID))

 It turns out that 2 of the 4 smoothers have estimated degrees of freedom
 of 1, so a linear effect would be sufficient.
 Now my question is how I need to change the code in order to have a time
 smoother for ID=Coot.Oahu and ID=Coot.Maui, and a linear time effect for
 ID=Stilt.Oahu and ID=Stilt.Maui. With the old R-code, this seems
 trivial, but I don't have any idea how to do it in the newest R-version
 (interactions with a dummy variable do not work in gamm).

 Thanks,
 Ivy


-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283



Re: [R] format() method

2010-04-16 Thread Don MacQueen

Were you expecting 01, and is that why you are puzzled?

See ?strftime and the explanation of the %U format. It depends on 
where the first Sunday of the year happens to fall.
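
A short check with plain Date objects shows the effect. This is only a sketch: the vector `d` is made up for illustration, using dates around the first Sunday of 2009 (4 January).

```r
# 2009-01-01 was a Thursday, so the first Sunday of 2009 is 2009-01-04
d <- as.Date(c("2009-01-02", "2009-01-03", "2009-01-04", "2009-01-09"))

# %U: week of the year with Sunday as first day; days before the
# first Sunday of the year fall in week 00
week_U <- format(d, "%U")

# %W is the Monday-based counterpart of %U
week_W <- format(d, "%W")

# %V is the ISO 8601 week number; whether it is supported on output
# may depend on the platform and R version, see ?strftime
week_V <- format(d, "%V")

data.frame(date = d, week_U, week_W, week_V)
```

With `%U`, the first two dates are in week "00" and the last two in week "01", which matches the output reported in the question.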


-Don

At 5:28 AM -0800 4/16/10, kafkaz wrote:

Hello,
I use format() function to get number of the week, like this:
format(tmp,'%U')
Recently, I have spotted something bizarre. For example, I have such object:
(index(tmp$x.delta['2009'][1:16]))
 [1] 2009-01-02 CET  2009-01-09 CET  2009-01-16 CET  2009-01-23 CET
 [5] 2009-01-30 CET  2009-02-06 CET  2009-02-13 CET  2009-02-20 CET
 [9] 2009-02-27 CET  2009-03-06 CET  2009-03-13 CET  2009-03-20 CET
[13] 2009-03-27 CET  2009-04-03 CEST 2009-04-09 CEST 2009-04-17 CEST
dput(index(tmp$x.delta['2009'][1:16]),'%U',file='as.date')
structure(c(1230850800, 1231455600, 1232060400, 1232665200, 1233270000,
1233874800, 1234479600, 1235084400, 1235689200, 1236294000, 1236898800,
1237503600, 1238108400, 1238709600, 1239228000, 1239919200), tzone =
structure("", .Names = "TZ"), class = c("POSIXt",
"POSIXct"))
To get number of the week I run:
format(index(tmp$x.delta['2009'][1:16]),'%U')
Here is the output - the weird thing is that the first week number
is 00.
 [1] "00" "01" "02" "03" "04" "05" "06" "07" "08" "09" "10" "11" "12" "13" "14"
[16] "15"

Is it a bug, my mistake, or is it supposed to be like that?
Thank you,
kafka
--
View this message in context: 
http://n4.nabble.com/format-method-tp1999753p1999753.html

Sent from the R help mailing list archive at Nabble.com.




--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062



Re: [R] problem with FUN in Hmisc::summarize

2010-04-16 Thread hadley wickham
 corr.V2.V3 = function(x) {
  out = cor(x$V2, x$V3)
  names(out) = "CORR"
  return(out)
 }

A little more concisely:

corr.V2.V3 = function(x) {
 c(CORR = cor(x$V2, x$V3))
}
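
A quick usage check of the concise version, on a made-up data frame (the data frame `d` is an assumption for illustration; only the column names V2 and V3 come from the function above):

```r
# Concise version: build the named result in one step
corr.V2.V3 = function(x) {
  c(CORR = cor(x$V2, x$V3))
}

# Made-up data with a perfectly linear relation between V2 and V3
d <- data.frame(V2 = 1:5, V3 = c(2, 4, 6, 8, 10))

# Returns a length-one numeric vector whose element is named "CORR"
res <- corr.V2.V3(d)
res
```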


-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



[R] score counts in an aggregate function

2010-04-16 Thread Kadengye Trevor
Dear r-list,
I have a big data set, mydata, with repeated observations and some missing
values. It looks like this:

userid  sex  item  score1  score2
1       0    1     1       1
1       0    2     0       1
1       0    3     NA      1
1       0    4     1       0
2       1    1     0       1
2       1    2     NA      1
2       1    3     1       NA
2       1    4     NA      0
3       0    1     1       0
3       0    2     1       NA
3       0    3     1       0
3       0    4     0       0

I would like to summarise the dataset so that I get something like this:

userid  sumscore1  countscore1  meanscore1  sumscore2  countscore2  meanscore2
1       2          3            0.67        3          4            0.75
2       1          2            0.50        2          3            0.67
3       3          4            0.75        0          3            0.00

I tried using:
means <- data.frame(aggregate(mydata[, 4:5], by=list(mydata$userid),
                              FUN=mean, na.rm=TRUE))
and
sums <- data.frame(aggregate(mydata[, 4:5], by=list(mydata$userid),
                             FUN=sum, na.rm=TRUE))

so that I could merge the two data frames later. This works quite well, but I
still cannot find a function that gives me a data frame of the counts.
Something like this:
counts <- data.frame(aggregate(mydata[, 4:5], by=list(mydata$userid),
                               FUN=count, na.rm=TRUE))

Any advice?

Trevor
Belgium

-- 
NiceLovely



Re: [R] how can I plot the histogram like this using R?

2010-04-16 Thread Gustaf Rydevik
On Fri, Apr 16, 2010 at 10:13 AM, bbslover dlu...@yeah.net wrote:

  Thanks for your reply. I just want to get a figure like y1.jpg using the
 data from y1.txt.
  From the figure I want to obtain the split point as in y1.jpg, taking
 2.5 as the split point. This figure was drawn by someone else; I
 just want to draw it using R but cannot, so I hope friends here can help me.

 Best wishes!
 kevin http://n4.nabble.com/file/n1965378/y1.jpg
 http://n4.nabble.com/file/n1965378/y1.txt y1.txt
 --
 View this message in context: 
 http://n4.nabble.com/how-can-I-plot-the-histogram-like-this-using-R-tp1839303p1965378.html
 Sent from the R help mailing list archive at Nabble.com.



Hi,

Does this do what you want?

temp <- read.table(url("http://n4.nabble.com/file/n1965378/y1.txt"))
hist(temp$V1, breaks=seq(0,5.1,by=0.1))
abline(v=2.5, lty=2, lwd=2, col="red")
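
In case the file is not reachable, the same plot can be tried on simulated values. This is only a sketch: the vector `y` is a made-up stand-in for temp$V1, and the title and axis label are placeholders.

```r
set.seed(123)
# Made-up stand-in for temp$V1: two clusters on either side of 2.5
y <- c(rnorm(150, mean = 1.5, sd = 0.4), rnorm(150, mean = 3.5, sd = 0.4))
# hist() stops with an error if values fall outside the breaks, so trim first
y <- y[y > 0 & y < 5.1]

# Bins of width 0.1 from 0 to 5.1, as in the call above
h <- hist(y, breaks = seq(0, 5.1, by = 0.1),
          main = "Histogram with split point", xlab = "value")

# Dashed red vertical line marking the split point at 2.5
abline(v = 2.5, lty = 2, lwd = 2, col = "red")
```

The object returned by hist() holds the bin edges in `h$breaks` and the bin counts in `h$counts`, which can be handy for locating the gap between the two clusters numerically.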


Regards,
Gustaf


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] run R script from Excel VBA

2010-04-16 Thread KZ
I wrote an R script called, say, computeCovarMatrix.R, and I want to call and
run it from Excel Visual Basic. Does anyone know how to do that?

thanks,
KZ



Re: [R] score counts in an aggregate function

2010-04-16 Thread Ista Zahn
Hi Trevor,
You can do it like this:

count <- function(x) {
  length(na.omit(x))
}

counts <-
data.frame(aggregate(mydata[,4:5],by=list(mydata$userid),FUN=count))
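
Putting the pieces together, the full summary table from the question could be built roughly as follows. This is a sketch: `mydata` is re-created from the example in the question, and the helper names (`by_user`, `add_prefix`, `result`) are made up for illustration.

```r
# Example data as shown in the question
mydata <- data.frame(
  userid = rep(1:3, each = 4),
  sex    = rep(c(0, 1, 0), each = 4),
  item   = rep(1:4, times = 3),
  score1 = c(1, 0, NA, 1,   0, NA, 1, NA,   1, 1, 1, 0),
  score2 = c(1, 1, 1, 0,    1, 1, NA, 0,    0, NA, 0, 0)
)

# Number of non-missing values
count <- function(x) sum(!is.na(x))

by_user <- list(userid = mydata$userid)
sums   <- aggregate(mydata[, 4:5], by = by_user, FUN = sum,  na.rm = TRUE)
counts <- aggregate(mydata[, 4:5], by = by_user, FUN = count)
means  <- aggregate(mydata[, 4:5], by = by_user, FUN = mean, na.rm = TRUE)

# Prefix the score columns so they stay distinguishable after merging
add_prefix <- function(df, prefix) {
  names(df)[-1] <- paste0(prefix, names(df)[-1])
  df
}

# Merge the three summaries on userid
result <- Reduce(function(a, b) merge(a, b, by = "userid"),
                 list(add_prefix(sums, "sum"),
                      add_prefix(counts, "count"),
                      add_prefix(means, "mean")))
result
```

The columns come out grouped by statistic (sums, then counts, then means) rather than by score; they can be reordered by indexing `result` with the desired column names.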

-Ista

On Fri, Apr 16, 2010 at 10:35 AM, KDT dkaden...@gmail.com wrote:


 Dear R-Users,
 I have a big data set, mydata, with repeated observations and some missing
 values. It looks like this:

 userid  sex  item  score1  score2
 1       0    1     1       1
 1       0    2     0       1
 1       0    3     NA      1
 1       0    4     1       0
 2       1    1     0       1
 2       1    2     NA      1
 2       1    3     1       NA
 2       1    4     NA      0
 3       0    1     1       0
 3       0    2     1       NA
 3       0    3     1       0
 3       0    4     0       0

 I would like to summarise the dataset so that I get something like this:

 userid  sumscore1  countscore1  meanscore1  sumscore2  countscore2  meanscore2
 1       2          3            0.67        3          4            0.75
 2       1          2            0.50        2          3            0.67
 3       3          4            0.75        0          3            0.00

 I tried using:
 means <- data.frame(aggregate(mydata[, 4:5], by=list(mydata$userid),
                               FUN=mean, na.rm=TRUE))
 and
 sums <- data.frame(aggregate(mydata[, 4:5], by=list(mydata$userid),
                              FUN=sum, na.rm=TRUE))

 so that I could merge the two data frames later. This works quite well, but I
 still cannot find a function that gives me a data frame of the counts.
 Something like this:
 counts <- data.frame(aggregate(mydata[, 4:5], by=list(mydata$userid),
                                FUN=count, na.rm=TRUE))

 Any advice?

 Trevor
 Belgium

 --
 View this message in context:
 http://n4.nabble.com/score-counts-in-an-aggregate-function-tp2007152p2007152.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org



[R] Outlier detection from trayectory data

2010-04-16 Thread Usuario R
Hi all,

I am trying to analyze data from trajectories of moving objects. They
can be treated as a two-dimensional time series. The only method I've found is
this:

http://figment.cse.usf.edu/~sfefilat/data/papers/TuAT10.41.pdf

Does anyone know whether this method is already implemented in R, or whether
any other alternative is implemented?

Thanks in advance.
Patricia



Re: [R] format() method

2010-04-16 Thread David Winsemius


On Apr 16, 2010, at 9:28 AM, kafkaz wrote:



Hello,
I use format() function to get number of the week, like this:
format(tmp,'%U')
Recently, I have spotted something bizarre. For example, I have such  
object:

(index(tmp$x.delta['2009'][1:16]))
[1] 2009-01-02 CET  2009-01-09 CET  2009-01-16 CET  2009-01-23 CET
[5] 2009-01-30 CET  2009-02-06 CET  2009-02-13 CET  2009-02-20 CET
[9] 2009-02-27 CET  2009-03-06 CET  2009-03-13 CET  2009-03-20 CET
[13] 2009-03-27 CET  2009-04-03 CEST 2009-04-09 CEST 2009-04-17 CEST

dput(index(tmp$x.delta['2009'][1:16]),'%U',file='as.date')
structure(c(1230850800, 1231455600, 1232060400, 1232665200, 1233270000,
1233874800, 1234479600, 1235084400, 1235689200, 1236294000, 1236898800,
1237503600, 1238108400, 1238709600, 1239228000, 1239919200), tzone =
structure("", .Names = "TZ"), class = c("POSIXt",
"POSIXct"))
To get number of the week I run:
format(index(tmp$x.delta['2009'][1:16]),'%U')
Here is the output - the weird thing is that the first week number
is 00.


Appears to behave as documented. From ?formatPOSIXct (help page):

%U
Week of the year as decimal number (00–53) using Sunday as the first  
day 1 of the week...



[1] "00" "01" "02" "03" "04" "05" "06" "07" "08" "09" "10" "11" "12" "13" "14"
[16] "15"

Is it a bug, my mistake, or is it supposed to be like that?
Thank you,
kafka


--

David Winsemius, MD
West Hartford, CT



[R] piecewise nls?

2010-04-16 Thread Derek Ogle
I am looking into fitting a so-called "double" von Bertalanffy function to fish 
length-at-age data.  Attempting to simplify the situation, the model looks like 
this ...

Y ~ f(X; a,b,c) if X < Z
Y ~ g(X; a,d,e) if X >= Z

where

* f and g are non-linear functions (the traditional single von Bertalanffy 
growth function),
* Y (length) and X (age) are observed variables,
* a,b,c,d,e are parameters to be estimated, and
* Z is not a parameter but is a constant computed from b,c,d,e.

I usually fit the traditional single model with nls() but am unsure of how 
to fit this model with the "if" statement.  I tried searching the archives with 
"piecewise" and either "nls", "nonlinear", or "regression" but did not find 
anything that seemed to fit this situation.  One thought I had was to do 
something like this (mostly pseudo-code) ...

nls(Y~ifelse(X<Z,1,0)*f(X;a,b,c)+ifelse(X>=Z,1,0)*g(X;a,d,e), ...)

but am unsure if this makes sense.
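
The pseudo-code above could be turned into runnable R along these lines. This is a sketch under stated assumptions, not an established recipe: the parameterisation (a = Linf, b and c = K1 and t1, d and e = K2 and t2), the function names `vb` and `dvb`, the choice of Z as the age where the two curves cross, and the simulated data are all made up for illustration.

```r
# Single von Bertalanffy growth curve (assumed parameterisation)
vb <- function(age, Linf, K, t0) Linf * (1 - exp(-K * (age - t0)))

# Piecewise ("double") curve: first curve below Z, second at or above Z.
# Z is computed from the other parameters as the age where both curves meet:
#   K1 * (Z - t1) = K2 * (Z - t2)
dvb <- function(age, Linf, K1, t1, K2, t2) {
  Z <- (K2 * t2 - K1 * t1) / (K2 - K1)
  ifelse(age < Z, vb(age, Linf, K1, t1), vb(age, Linf, K2, t2))
}

# Simulated length-at-age data; with these values Z = 5.5
set.seed(42)
age <- rep(1:15, each = 5)
len <- dvb(age, 600, 0.30, 0, 0.15, -5.5) + rnorm(length(age), sd = 5)

# nls() accepts the piecewise function like any other model function.
# Whether it converges depends on the data and the starting values,
# so the call is wrapped in try() here.
fit <- try(
  nls(len ~ dvb(age, Linf, K1, t1, K2, t2),
      start = list(Linf = 600, K1 = 0.30, t1 = 0, K2 = 0.15, t2 = -5.5)),
  silent = TRUE
)
```

Writing the switch with a single ifelse() is equivalent to the sum of two indicator-weighted terms in the pseudo-code, since exactly one of the two conditions holds for every age.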

If anyone can offer some help I would be very appreciative.  Thank you in 
advance.



Re: [R] run R script from Excel VBA

2010-04-16 Thread Erich Neuwirth
Have a look at rcom.univie.ac.at.
We have an Excel addin which will allow you to do that.
Disclaimer: I am the author of the addin.



On 4/16/2010 4:57 PM, KZ wrote:
 I wrote a R script say called computeCovarMatrix.R and i want to call and
 run this piece from Excel visual basic. does anyone know how to do that?
 
 thanks,
 KZ
 
 
 

-- 
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459



Re: [R] Bootstrapping a repeated measures ANOVA

2010-04-16 Thread Charles C. Berry

On Fri, 16 Apr 2010, Fischer, Felix wrote:


Hello everyone,

i have a question regarding the sampling process in boot().


PLEASE ... provide commented, minimal, self-contained, reproducible 
code. Which means something a correspondent could actually run.


But before that, a careful reading of

?boot

should get you started. Note these bits:

Arguments:

data: The data as a vector, ...

   statistic: A function which when applied to data returns a vector
  containing the statistic(s) of interest.  When
  sim=parametric,   [snip]
  In all other cases
  statistic must take at least two arguments.  The first
  argument passed will always be the original data. The second
  will be a vector of indices, frequencies or weights which
  define the bootstrap sample. ...
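
One way to use the "data as a vector" form quoted above is to hand boot() the vector of subject ids, so that whole subjects are resampled and both rows of a subject always travel together. The following is only a sketch on simulated data: the data frame `dat`, the function `stat`, its extra argument `full`, and the statistic itself (a mean paired difference, standing in for the F values) are assumptions for illustration.

```r
library(boot)

set.seed(1)
# Simulated long-format data: two measurement times (mz) per subject
n_subj <- 20
dat <- data.frame(
  subject = rep(seq_len(n_subj), each = 2),
  mz      = rep(1:2, times = n_subj),
  PHQ     = rnorm(2 * n_subj, mean = rep(c(10, 8), n_subj), sd = 2)
)

# boot() resamples the subject ids; the statistic rebuilds the data set
# from the drawn subjects, keeping both rows of each one.
stat <- function(ids, indices, full) {
  picked <- ids[indices]
  d <- do.call(rbind, lapply(seq_along(picked), function(i) {
    rows <- full[full$subject == picked[i], ]
    rows$subject <- i   # relabel: a subject drawn twice counts as two subjects
    rows
  }))
  # Placeholder statistic: mean change from time 1 to time 2.
  # A model fit on d and extraction of its F values would go here instead.
  mean(d$PHQ[d$mz == 2] - d$PHQ[d$mz == 1])
}

res <- boot(data = unique(dat$subject), statistic = stat, R = 200, full = dat)
```

`res$t0` is the statistic on the original data and `res$t` holds the 200 bootstrap replicates, from which boot.ci() can compute intervals.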



HTH,

Chuck




I try to bootstrap F-values for a repeated measures ANOVA to get a 
confidence interval of F-values. Unfortunately, while the aov works 
fine, it fails in the boot()-function. I think the problem might be that 
the resampling process fails to select both lines of data representing 
the 2 measuring times for one subject and I therefore get missing cases.


The data is organised like this:
subject  ort  mz  PHQ
1        1    1   x
1        1    2   y
2        1    1   z
2        1    2   zz
...


Is there any way to specify, that both lines need to be selected?


Thanks a lot!
Felix Fischer

P.S. If you need to have a look to my code:

F_values <- function(formula, data, indices) {
   d <- data[indices,] # allows boot to select sample
   fit <- aov(formula, data=d) # fit model
   return(c(summary(fit)[1][[1]][[1]]$`F value`, summary(fit)[2][[1]][[1]]$`F value`)) # return F-values
  }

  results <- boot(data=anova.daten, statistic=F_values,
     R=10, formula=PHQ_Sum_score~mz*ort+Error(subject/mz))


Dipl. Psych. Felix Fischer

Medizinische Klinik mit Schwerpunkt Psychosomatik
Charité -- Universitätsmedizin Berlin
Luisenstr. 13a
10117 Berlin

Tel.: 030 - 450 553575
Email: felix.fisc...@charite.de






Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



[R] return of a function

2010-04-16 Thread Gustave Lefou
Dear R users,

I have a function which takes as arguments big arrays, say : w, x , y and z.

My function changes these arrays and I want them as result/output.

I have tried to write return(w,x,y,z), and thus to replace the previous w,
x, y and z. It does not seem to work.

What can I do ?

Thank you very much,
Gustave



Re: [R] return of a function

2010-04-16 Thread jim holtman
You can return a single object from a function.  If you want multiple
values, use a list:

f <- function(x,y,z){

return(list(x=x, y=y, z=z))
}

value <- f(x,y,z)
# now copy the values
x <- value$x
y <- value$y
z <- value$z



On Fri, Apr 16, 2010 at 12:02 PM, Gustave Lefou gustave5...@gmail.com wrote:

 Dear R users,

 I have a function which takes as arguments big arrays, say : w, x , y and
 z.

 My function changes these arrays and I want them as result/output.

 I have tried to write return(w,x,y,z), and thus to replace the previous w,
 x, y and z. It does not seem to work.

 What can I do ?

 Thank you very much,
 Gustave





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] return of a function

2010-04-16 Thread Bert Gunter
Below 

Bert Gunter
Genentech Nonclinical Statistics

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Gustave Lefou
Sent: Friday, April 16, 2010 9:03 AM
To: r-help@r-project.org
Subject: [R] return of a function

Dear R users,

I have a function which takes as arguments big arrays, say : w, x , y and z.

My function changes these arrays and I want them as result/output.

I have tried to write return(w,x,y,z), and thus to replace the previous w,
x, y and z. It does not seem to work.

What can I do ?

-- 1. Read the Help file? -- which says:

return(value)

Arguments:

value: An expression. 

-- and note that w,x,y,z is **not** a legal R expression

2. Have you read the online documentation, including an Introduction to R?
There you would find many examples.

3. return(list(w,x,y,z))   ## is what you want 
## or even 
list(w,x,y,z) ## without the return(), as the last R expression is by
default what is returned.




Thank you very much,
Gustave



Re: [R] return of a function

2010-04-16 Thread David Winsemius


On Apr 16, 2010, at 12:02 PM, Gustave Lefou wrote:


Dear R users,

I have a function which takes as arguments big arrays, say : w, x ,  
y and z.


My function changes these arrays and I want them as result/output.

I have tried to write return(w,x,y,z), and thus to replace the  
previous w,

x, y and z. It does not seem to work.


Right. Two misconceptions here. First, return() accepts one object,  
which could be a list of items. Second, just because you return it  
with a name that is the same as some object outside the function does  
not mean that the new values will be placed in the outside object.  
In fact, if you do not assign the returned value to something, it will  
be temporarily placed in .Last.value and then overwritten when the next  
top-level evaluation occurs. You need to assign the result of a  
function to some object.
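
A small example of both points, using toy objects (the names `double_it` and `modify_both` are made up for illustration):

```r
x <- c(1, 2, 3)

# Changing an argument inside a function only changes the local copy
double_it <- function(x) {
  x <- x * 2
  x              # the value of the last expression is returned
}

double_it(x)     # prints 2 4 6, but the result is not stored anywhere
unchanged <- x   # x outside the function is still 1 2 3

# To keep the new values, assign the result
x <- double_it(x)

# Several modified objects go back to the caller in one list
modify_both <- function(w, y) {
  list(w = w + 1, y = y * 10)
}
out <- modify_both(w = 1, y = 2)
w <- out$w
y <- out$y
```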




What can I do ?


Read more about functions and do more examples with small objects to  
see the effects on test cases.


--
David Winsemius, MD
West Hartford, CT



Re: [R] Exporting an rgl graph

2010-04-16 Thread Greg Snow
The easiest approach may be to just install R onto a USB drive 
(flash/thumb/...) then when you go to your coworkers computer just run R from 
the USB drive and show the rgl plot.  I think there is also a tool to create an 
animation from rgl, it is not interactive, but you could e-mail a movie file 
that they could play to see the plot from many angles.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of cgeno...@u-paris10.fr
 Sent: Thursday, April 15, 2010 6:02 AM
 To: ted.hard...@manchester.ac.uk; Barry Rowlingson
 Cc: r-help@r-project.org
 Subject: Re: [R] Exporting an rgl graph
 
 Thanks for your answer. Let me clarify my question.
 
 In fact, I do not want to capture the screen; I want to save an object
 that can be viewed in 3D. With rgl, I can rotate the object with my
 mouse. This is what I want to export: a real 3D object that my
 collaborator will be able to view in 3D.
 
 Christophe
 
 
  On 15-Apr-10 10:10:54, Barry Rowlingson wrote:
  On Thu, Apr 15, 2010 at 10:24 AM,  cgeno...@u-paris10.fr wrote:
  Hi the list,
 
  I use rgl to produce a 3D graph. I would like to show this graph
  to some collaborator. Is there a way to save it and send it to
  someone else?
 
  See ?rgl.postscript and ?rgl.snapshot
 
   Or use some kind of screen capture system - on Windows the 'Print
  Screen' key can copy the screen to the clipboard, paste into
 Photoshop
  or other graphics program.
 
   On Linux, I use 'scrot' from the command line - type 'scrot -s',
  click on a window, and it makes a PNG file of it.
 
  Again on Linux, since ImageMagick is installed, I use the 'import'
  programme from that suite. When you start that, it produces a
  +-shaped mouse cursor which you can use (selecting a top-left-hand
  corner to start with, and holding down the left mouse button) to
  drag out a bounding frame for the part of the screen you want to
  save. Then, when you release the button, an image of that portion
  of the screen is saved to a file of your choice, in any graphics
  format of your choice that is supported by ImageMagick (including
  PS and EPS, as well as all the common bitmap formats).
 
  See 'man import' for pointers to more information.
 
  I have this set up as an icon on my launch panel, so it is just
  a matter of clicking on that, and then doing the above. The command
  behind the icon is
 
   /usr/local/bin/mkscreengrab
 
  and my script file 'mkscreengrab' contains:
 
   #! /bin/bash
   export ScrGrbTmp=`mktemp /home/ted/Screengrabs/screengrab`
   import $ScrGrbTmp.jpg
   rm $ScrGrbTmp
 
  so this makes JPEGs (I could have chosen something else, but that's
  the default I mostly want for that activity). This produces a file
  with a name like screengrab4913.jpg which will be unique in that
  directory, and it can later be renamed to your taste.
 
  If I wanted a different file format, I would use 'import' from
  the command line, with appropriate filename extension (e.g. .png,
  .ps, .eps, ... ).
 
  I hadn't heard of scrot before, but now I've looked it up it
  seems that its output format is limited to PNG.
 
  I've now also located more info about various ways of taking
  screenshots in Linux:
 
  http://tips.webdesign10.com/how-to-take-a-screenshot-on-ubuntu-linux
 
  Ted.
 
  
  E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
  Fax-to-email: +44 (0)870 094 0861
  Date: 15-Apr-10   Time: 12:18:25
  -- XFMail --
 
 


Re: [R] Weights in binomial glm

2010-04-16 Thread Thomas Lumley



Jan,

Thierry is correct in saying that you are misusing glm(), but there is also a 
numerical problem.

You are misusing glm() because your model specification claims to have 
Binomial(n,p) observations with n (your weights w) in the vicinity of 100, where there is a 
single common p but the observed binomial proportion is either 1 or 0, never 
anything in between.  These data are a very poor fit to a binomial model.

The correct specification, if you have what you call replicate weights and I call 
frequency weights, is to produce a single data record for each covariate pattern that has 
both the 1 and 0 observations. This can be either two columns for successes and failures, 
or one column of proportions and one column of weights.  As your quote from MASS says, 
"weights are used to give the number of trials when the response is the proportion 
of successes". In your data the response is *not* the proportion of successes.


However, the MLE should still be equal to the weighted mean even with this 
misuse.  The reason it is not is because of the starting values.  R has to find 
some starting values for the iterative maximization of the likelihood, and for 
binomial data with y successes out of n it uses  starting values for the fitted 
means of  (y+0.5)/(n+1).  Starting the iteration at the data in this way 
usually makes the Fisher scoring algorithm very reliable -- it is correctly 
scaled to the data, in some sense.   Unfortunately, if you separate out the 
successes and failures, you have some points starting with values very close to 
0.  When I used your code the starting value for the point with the largest 
weight was 0.5/199.   At iteration 2, the estimated mean ends up very small for 
all observations, and then the iteration diverges.  However, if you provide a 
starting value then the fitting works, even if you start the iteration at, say, 
beta=1, corresponding to a fitted mean of over 70%.

So, the result is wrong in the sense that it is not the mle, because of a 
failure of convergence, which happens because specifying the weights the way 
you did rather than the documented way leads to bad default starting values for 
the iteration.  You need either to specify the data as recommended or supply 
starting values.
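
A minimal sketch of the two remedies described above, reusing the simulated
data from the original question. The seed and the names target, succ, fail,
fit1, fit2, p1 and p2 are assumptions added for illustration.

```r
set.seed(1)                          # assumed seed, only for reproducibility
Z <- rbinom(1000, 1, 0.1)
W <- round(rnorm(1000, 100, 40))
W[W < 1] <- 1

target <- sum(Z * W) / sum(W)        # weighted mean, the value the MLE should equal

# Remedy 1: one record per covariate pattern (here a single pattern),
# given as two columns of successes and failures.
succ <- sum(W[Z == 1])
fail <- sum(W[Z == 0])
fit1 <- glm(cbind(succ, fail) ~ 1, family = binomial())
p1 <- unname(plogis(coef(fit1)))

# Remedy 2: keep the separated 0/1 records, but supply a starting value
# for the intercept instead of relying on the default starting values.
fit2 <- glm(Z ~ 1, weights = W, family = binomial(), start = 1)
p2 <- unname(plogis(coef(fit2)))

c(target = target, aggregated = p1, with_start = p2)
```

Both fitted probabilities should agree with the weighted mean up to the
convergence tolerance of glm().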

=thomas


On Fri, 16 Apr 2010, Jan van der Laan wrote:


I have some questions about the use of weights in binomial glm as I am
not getting the results I would expect. In my case the weights I have
can be seen as 'replicate weights'; one respondent i in my dataset
corresponds to w[i] persons in the population. From the documentation
of the glm method, I understand that the weights can indeed be used
for this: "For a binomial GLM prior weights are used to give the
number of trials when the response is the proportion of successes."

From Modern Applied Statistics with S-Plus, 3rd ed., I understand the same.

However, I am getting some strange results. I generated an example:

Generate some data which is similar to my dataset:

Z <- rbinom(1000, 1, 0.1)
W <- round(rnorm(1000, 100, 40))
W[W < 1] <- 1


Probability of success can either be estimated using:

sum(Z*W)/sum(W)

[1] 0.09642109

Or using glm:

model <- glm(Z ~ 1, weights=W, family=binomial())

Warning message:
In glm.fit(x = X, y = Y, weights = weights, start = start, etastart =
etastart,  :
 fitted probabilities numerically 0 or 1 occurred

predict(model, type="response")[1]

  1
2.220446e-16

These two results are obviously not the same. The strange thing is
that when I scale the weights, such that the total equals one, the
probability is correctly estimated:


model <- glm(Z ~ 1, weights=W/sum(W), family=binomial())

Warning message:
In eval(expr, envir, enclos) : non-integer #successes in a binomial glm!

predict(model, type="response")[1]

1
0.09642109


However scaling of the weights should, as far as I am aware, not have
an effect on the estimated parameters. I also tried some other
scalings. And, for example scaling the weights by 20 also gives me the
correct result.


model <- glm(Z ~ 1, weights=W/20, family=binomial())

Warning message:
In eval(expr, envir, enclos) : non-integer #successes in a binomial glm!

predict(model, type="response")[1]

1
0.09642109


Am I misinterpreting the weights? Could this be a numerical problem?

Regards,

Jan




Thomas Lumley                   Assoc. Professor, Biostatistics
tlum...@u.washington.edu        University of Washington, Seattle



Re: [R] Bootstrapping a repeated measures ANOVA

2010-04-16 Thread Fischer, Felix

Thank you for your answer. Sorry for the missing example.

In fact, I think I solved the issue with some data manipulation in the 
function. I split the data (one set for each measuring time), selected the 
cases at random, and then combined the two measuring times again. The results look 
promising to me, but if someone is aware of problems, please let me know.

This code should run:

library(boot)
anova.daten=data.frame(subject=sort(rep(1:10,2)), mz=rep(1:2,10), 
ort=sort(rep(1:2,10)),PHQ_Sum_score=rnorm(20,10,2))  #generate data

summary(aov(PHQ_Sum_score~mz*ort+Error(subject/mz),data=anova.daten))



F_values <- function(formula, data1, indices) {
  data2=subset(data1, data1$mz==2)  #subsetting data for each measuring time
  data3=subset(data1, data1$mz==1)
  data4 <- data3[indices,] # allows boot to select sample
  subjekte=na.omit(data4$subject)
  data5=rbind(data3[subjekte,], data2[subjekte,]) #combine data
  data5$subject=factor(rep(1:length(subjekte),2)) #convert repeated subjects to unique subjects
  fit=aov(formula,data=data5) #fit model
  return(c(summary(fit)[1][[1]][[1]]$`F value`,
           summary(fit)[2][[1]][[1]]$`F value`)) #return F-values
}


results <- boot(data=anova.daten, statistic=F_values,
                R=10, formula=PHQ_Sum_score~mz*ort+Error(subject/mz))  #bootstrap



Thanks a lot,

Felix Fischer



Re: [R] Poblems wih EBImage

2010-04-16 Thread Gregoire Pau

Hello,

EBImage is a Bioconductor package: please post on the Bioconductor 
mailing list.


EBImage requires the libraries ImageMagick and GTK+ to be installed. Did 
you follow the instructions of the installation manual ?

http://www.bioconductor.org/packages/release/bioc/html/EBImage.html

It looks like EBImage cannot locate ImageMagick and GTK+. Are they 
working properly (try gtk-demo and convert on the command line)? Did 
you add the GTK path to the system path (most likely c:\gtk\bin)? Did 
you tick the "Install development headers and libraries" checkbox when 
installing ImageMagick?


Hope this helps,

Regards,

Greg
---
Gregoire Pau
EMBL Research Officer
http://www.embl.de/~gpau/


R Heberto Ghezzo, Dr wrote:

Hello, Working with Windows 7 in a HP laptop with R-2.10.1
I download and installed ImageMagick-6.3.7.7-Q16-Windows-dll.exe and GTK 
2.12.9-win32-2, then downloaded and installed from local file EBImage_3.2.0.zip 
and I got:

library(EBImage)

Loading required package: abind
Error in inDL(x, as.logical(local), as.logical(now), ...) : 
  unable to load shared library 'C:/Programs/R/Cran/EBImage/libs/EBImage.dll':

  LoadLibrary failure:  The specified module could not be found.

In addition: Warning message:
package 'abind' was built under R version c(2, 5, 0) and help will not work 
correctly
Please re-install it 
Error: package/namespace load failed for 'EBImage'

the location C:\Programs\R\Cran\EBImage\libs\EBImage.dll exists
Can somebody tell me what is wrong?
Thanks
Heberto Ghezzo
McGill University
Canada


Re: [R] TeachingDemos install bumps out with 'Out of memory!'

2010-04-16 Thread Greg Snow
I have no idea what is happening here (not an Ubuntu or Linux expert), but it 
seems unlikely that the particular package is the main problem; rather, it is 
the package you happen to be on when the problem manifests.  TeachingDemos does 
not have any compiled code (all straight R code), does not run any 
initialization procedures, and is not a huge package, so it seems unlikely to be 
the culprit (but if it is, let me know and I will try to fix it).  Have you 
tried restarting the computer and installing packages in a fresh session?

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: Uwe Dippel [mailto:udip...@uniten.edu.my]
 Sent: Thursday, April 15, 2010 8:56 PM
 To: r-help@r-project.org; Greg Snow
 Subject: TeachingDemos install bumps out with 'Out of memory!'
 
 The same thing that happened to my 'maptools'
 (http://permalink.gmane.org/gmane.comp.lang.r.general/177404)
 also hits me here: It eats all memory until the system dies.
 Alas, in this case, no Ubuntu package.
 Since I installed some tens of packages with the same method in the
 meantime, I guess something must be wrong with the install.packages; at
 least on Ubuntu9.10, amd64.
 
 (And I am not out of memory, really:
 Mem:   3347584k total,   812872k used,  2534712k free,12648k
 buffers
 Swap:  4305380k total,   618464k used,  3686916k free,   272476k
 cached)
 
 Uwe
 



[R] PCA scores

2010-04-16 Thread phoebe kong
Hi all,

I am having difficulty calculating the PCA scores. The PCA scores I calculated
don't match the scores generated by R,

mypca <- princomp(mymatrix, cor=T)

myscore <- as.matrix(mymatrix) %*% as.matrix(mypca$loadings)

Does anybody know how the mypca$scores were calculated? Is my formula not
correct?

Thanks a lot!

Phoebe



[R] VERY SIMPLE QUESTION

2010-04-16 Thread Kathie

Dear R users,

I am looking for a more efficient way to compute the following.

--

a <- matrix(c(1,1,1,1,2,2,2,2),4,2)
b <- matrix(c(1,2,3,4),4,1)

Eventually, I want to get this matrix, `c`.

c <- matrix(c(1/1,1/2,1/3,1/4,2/1,2/2,2/3,2/4),4,2)

--

In fact, the number of columns of `a` is very large.

Is there a more efficient way to compute this than using apply or
something similar? Or is apply the only way?

Any suggestion will be greatly appreciated.

Regards,

Kathryn Lord 
-- 
View this message in context: 
http://n4.nabble.com/VERY-SIMPLE-QUESTION-tp2013288p2013288.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] VERY SIMPLE QUESTION

2010-04-16 Thread Henrique Dallazuanna
Try this:

sweep(a, 1, b, '/')

On Fri, Apr 16, 2010 at 2:30 PM, Kathie kathryn.lord2...@gmail.com wrote:

 Dear R users,

 I am looking for more efficient way to compute the followings

 --

 a <- matrix(c(1,1,1,1,2,2,2,2),4,2)
 b <- matrix(c(1,2,3,4),4,1)

 Eventually, I want to get this matrix, `c`.

 c <- matrix(c(1/1,1/2,1/3,1/4,2/1,2/2,2/3,2/4),4,2)

 --

 In fact, #column of `a` is so big..

 Is there a more efficient way to compute this instead of using apply or
 something? or apply is only way..?

 Any suggestion will be greatly appreciated.

 Regards,

 Kathryn Lord
 --
 View this message in context: 
 http://n4.nabble.com/VERY-SIMPLE-QUESTION-tp2013288p2013288.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] PCA scores

2010-04-16 Thread Gavin Simpson
On Fri, 2010-04-16 at 10:23 -0700, phoebe kong wrote:
 Hi all,
 
 I have a difficulty to calculate the PCA scores. The PCA scores I calculated
 doesn't match with the scores generated by R,
 
 mypca <- princomp(mymatrix, cor=T)
 
 myscore <- as.matrix(mymatrix) %*% as.matrix(mypca$loadings)
 
 Does anybody know how the mypca$scores were calculated? Is my formula not
 correct?

You need to apply the centring and scaling done because you set 'cor =
TRUE' in your princomp call. Here's an example using the inbuilt 'swiss'
data set.

data(swiss)
pc <- princomp(swiss, cor = TRUE)
my.scr <- with(pc, scale(swiss, center = center, scale = scale) %*% 
   loadings(pc))
all.equal(my.scr, pc$scores)

You can see all of this in the princomp code if you look closely:

getAnywhere(princomp.default)
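
A small sketch, again on the built-in 'swiss' data, of why the raw product
from the question does not reproduce the scores while the standardized
product does. The names raw, std, L, raw_matches, std_matches and n are
introduced here for illustration. It also checks that the scale stored by
princomp() is the standard deviation computed with divisor n rather than
n - 1.

```r
pc <- princomp(swiss, cor = TRUE)
L  <- unclass(loadings(pc))          # plain matrix of loadings

# Formula from the question: no centring or scaling applied.
raw <- as.matrix(swiss) %*% L

# Centre and scale with the values stored in the fit before projecting.
std <- scale(swiss, center = pc$center, scale = pc$scale) %*% L

raw_matches <- isTRUE(all.equal(raw, pc$scores, check.attributes = FALSE))
std_matches <- isTRUE(all.equal(std, pc$scores, check.attributes = FALSE))

# princomp() uses divisor n, so its scale is sd() shrunk by sqrt((n - 1) / n).
n <- nrow(swiss)
scale_matches <- isTRUE(all.equal(unname(pc$scale),
                                  unname(apply(swiss, 2, sd) * sqrt((n - 1) / n))))
```

Here raw_matches should be FALSE, and std_matches and scale_matches TRUE.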

HTH

 
 Thanks a lot!
 
 Phoebe
 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] VERY SIMPLE QUESTION

2010-04-16 Thread Christian Raschke
Since b is only one column, just make it a vector. 

> a <- matrix(c(1,1,1,1,2,2,2,2),4,2)
> b <- c(1,2,3,4)

then

> result <- a/b
> result
      [,1]  [,2]
[1,] 1.000 2.000
[2,] 0.500 1.000
[3,] 0.333 0.667
[4,] 0.250 0.500

should be what you want. 

It is also a bad idea to name the resulting matrix c since c(...) is a
primitive function.
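
A short sketch of why plain division works here: R stores matrices column
by column and recycles the shorter vector down each column, so row i is
always divided by b[i]. The names expected, by_recycling and by_sweep are
introduced for illustration; the sweep() call is the one suggested earlier
in this thread.

```r
a <- matrix(c(1, 1, 1, 1, 2, 2, 2, 2), 4, 2)
b <- matrix(c(1, 2, 3, 4), 4, 1)      # one-column matrix, as in the question

expected <- matrix(c(1/1, 1/2, 1/3, 1/4, 2/1, 2/2, 2/3, 2/4), 4, 2)

# Drop the dim attribute so b is recycled down the columns of a.
by_recycling <- a / as.vector(b)

# Equivalent result via sweep() over rows (MARGIN = 1).
by_sweep <- sweep(a, 1, b, "/")

all.equal(by_recycling, expected)
all.equal(by_sweep, expected)
```

Storing the result under a name other than c also avoids masking c() in
the workspace.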

Christian



On Fri, 2010-04-16 at 09:30 -0800, Kathie wrote:
 Dear R users,
 
 I am looking for more efficient way to compute the followings
 
 --
 
 a <- matrix(c(1,1,1,1,2,2,2,2),4,2)
 b <- matrix(c(1,2,3,4),4,1)
 
 Eventually, I want to get this matrix, `c`.
 
 c <- matrix(c(1/1,1/2,1/3,1/4,2/1,2/2,2/3,2/4),4,2)
 
 --
 
 In fact, #column of `a` is so big..
 
 Is there a more efficient way to compute this instead of using apply or
 something? or apply is only way..?
 
 Any suggestion will be greatly appreciated.
 
 Regards,
 
 Kathryn Lord 

-- 
Christian Raschke
Department of Economics and
ISDS Research Lab (HSRG)
Louisiana State University
Patrick Taylor Hall, Rm 2128
Baton Rouge, LA 70803
cras...@lsu.edu



Re: [R] VERY SIMPLE QUESTION

2010-04-16 Thread Kathie

thanks a lot.

good day.

Kathie


On Fri, Apr 16, 2010 at 1:43 PM, Henrique Dallazuanna [via R] wrote:

 Try this:

 sweep(a, 1, b, '/')

 On Fri, Apr 16, 2010 at 2:30 PM, Kathie [hidden email] wrote:

 
  Dear R users,
 
  I am looking for more efficient way to compute the followings
 
 
 --
 
  a <- matrix(c(1,1,1,1,2,2,2,2),4,2)
  b <- matrix(c(1,2,3,4),4,1)
 
  Eventually, I want to get this matrix, `c`.
 
  c <- matrix(c(1/1,1/2,1/3,1/4,2/1,2/2,2/3,2/4),4,2)
 
 
 --
 
  In fact, #column of `a` is so big..
 
  Is there a more efficient way to compute this instead of using apply or

  something? or apply is only way..?
 
  Any suggestion will be greatly appreciated.
 
  Regards,
 
  Kathryn Lord
  --
  View this message in context:
 http://n4.nabble.com/VERY-SIMPLE-QUESTION-tp2013288p2013288.html
  Sent from the R help mailing list archive at Nabble.com.
 
 



 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O





-- 
View this message in context: 
http://n4.nabble.com/VERY-SIMPLE-QUESTION-tp2013288p2013312.html
Sent from the R help mailing list archive at Nabble.com.



[R] read xml

2010-04-16 Thread Alex Campos
Hi
I am trying to read selected fields from an XML file with R using the XML  
package. So far I have learned the basics of this package by going  
through the manual, examples, tutorial, and so on (www.omegahat.org/RSXML). 
The problem is that I get stuck when it comes to more  
complex XML files. I am a novice in R and XML, and was wondering if  
someone could help me out here.

Here is my xml file. I am only interested in the protein_group node.  
Therefore, I have omitted most of the information from the other two  
previous nodes (protein_summary_header, proteinprophet_details).

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="http://localhost/ISB/data/interact-LFA1_C18_PME5R1.prot.xsl"?>
<protein_summary xmlns="http://regis-web.systemsbiology.net/protXML"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://sashimi.sourceforge.net/schema_revision/protXML/protXML_v6.xsd"
 summary_xml="interact-LFA1_C18_PME5R1.prot.xml">
<protein_summary_header reference_database="EColi_decoy_v3.0.fasta">
<program_details analysis="proteinprophet">
<proteinprophet_details occam_flag="Y" run_options="XML">
<protein_group group_number="1" probability="1.">
   <protein protein_name="sp|P4|CYC_HORSE"
    n_indistinguishable_proteins="1" probability="1."
    percent_coverage="46.7"
    unique_stripped_peptides="EDLIAYLK+EETLMEYLENPK+KTGQAPGFTYTDANK+TEREDLIAYLK+TGPNLHGLFGR+TGQAPGFTYTDANK"
    group_sibling_id="a" total_number_peptides="226"
    pct_spectrum_ids="2.54" confidence="1.00">
      <parameter name="prot_length" value="107"/>
      <annotation protein_description="Cytochrome c OS=Equus caballus GN=CYCS PE=1 SV=2"/>
      <peptide peptide_sequence="KTGQAPGFTYTDANK" charge="2"
       initial_probability="0.9989" nsp_adjusted_probability="0.9998"
       peptide_group_designator="a" weight="1.00"
       is_nondegenerate_evidence="Y" n_enzymatic_termini="2"
       n_sibling_peptides="8.50" n_sibling_peptides_bin="6" n_instances="10"
       exp_tot_instances="9.94" is_contributing_evidence="Y"
       calc_neutral_pep_mass="1597.7737">
      </peptide>
      <peptide peptide_sequence="TGQAPGFTYTDANK" charge="2"
       initial_probability="0.9989" nsp_adjusted_probability="0.9998"
       weight="1.00" is_nondegenerate_evidence="Y" n_enzymatic_termini="2"
       n_sibling_peptides="8.50" n_sibling_peptides_bin="6" n_instances="90"
       exp_tot_instances="89.82" is_contributing_evidence="Y"
       calc_neutral_pep_mass="1469.6786">
      </peptide>
      <peptide peptide_sequence="KTGQAPGFTYTDANK" charge="3"
       initial_probability="0.9990" nsp_adjusted_probability="0.9998"
       peptide_group_designator="a" weight="1.00"
       is_nondegenerate_evidence="Y" n_enzymatic_termini="2"
       n_sibling_peptides="8.50" n_sibling_peptides_bin="6" n_instances="10"
       exp_tot_instances="9.89" is_contributing_evidence="Y"
       calc_neutral_pep_mass="1597.7737">
      </peptide>
   </protein>
</protein_group>
<protein_group group_number="2" probability="1.">
   <protein protein_name="sp|P00350|6PGD_ECOLI"
    n_indistinguishable_proteins="1" probability="1."
    percent_coverage="32.1"
    unique_stripped_peptides="AGAGTDAAIDSLKPYLDK+EAYELVAPILTK+EFVESLETPR+EKTEEVIAENPGK+GDIIIDGGNTFFQDTIR+GPSIMPGGQK+GYTVSIFNR+IAAVAEDGEPCVTYIGADGAGHYVK+IVSYAQGFSQLR+QIADDYQQALR+TEEVIAENPGK+VLSGPQAQPAGDK"
    group_sibling_id="a"
    total_number_peptides="32" pct_spectrum_ids="0.36" confidence="1.00">
      <parameter name="prot_length" value="474"/>
      <annotation protein_description="6-phosphogluconate deh ...


I did the following:
  doc <- xmlRoot(xmlTreeParse("myfile.xml"))
  xmlApply(doc, names)
$protein_summary_header
   program_details
program_details

$dataset_derivation
list()

$protein_group
   protein
protein

$protein_group
   protein
protein

[IN FACT, THE $protein_group APPEARS A COUPLE HUNDRED TIMES]

So, I want to create a data frame comprising selected information  
from my $protein_group as follows:

group_number  protein_name          probability  peptide_sequence  initial_probability  n_instances
1             sp|P4|CYC_HORSE       1.           KTGQAPGFTYTDANK   0.9989               10
1             sp|P4|CYC_HORSE       1.           TGQAPGFTYTDANK    0.9989               90
1             sp|P4|CYC_HORSE       1.           KTGQAPGFTYTDANK   0.9990               10
2             sp|P00350|6PGD_ECOLI  1.           NAPGTYCMR         0.9349               8
2             sp|P00350|6PGD_ECOLI  1.           TGAHPGPMK         0.9124               2
As I understand the variables from columns 4, 5 and 6 are children  
from protein_group. For each $protein_group, I need to retrieve some  
of its children.
I would greatly appreciate any help.
Thank you very much,
Alex


[R] How to get rid of extra areas on spatial data map

2010-04-16 Thread Changyou Sun
Hello All,

I am using the sp and maps libraries to map some fire data.
The region covered is the state of Mississippi in the US. I would then like to
add an ecoregion layer on top (the Omernik layer from
nationalatlas.gov). The problem is that the ecoregion layer extends beyond
the state boundary. How can I remove the ecoregion areas that fall
outside Mississippi?

The code is like this:

# Both data.fire and data.ecoregion are of class SpatialPolygonsDataFrame.
# Every fire is within the state boundary so it looks nice.
# Some ecoregions go beyond the state boundary.

library(sp); library(maps)
win.graph(width=4, height=6)
map('state', region = 'mississippi', col='red')
plot(data.fire, add=T) 
plot(data.ecoregion, add=T)

Any hint is greatly appreciated.



Edwin



[R] call R script from Excel VBA/macro

2010-04-16 Thread KZ
I wrote an R script, say called computeCovarMatrix.R, and I want to call and
run it from Excel Visual Basic. Does anyone know how to do that?

thanks,
KZ



Re: [R] run R script from Excel VBA

2010-04-16 Thread Guy Green

See RExcel, http://rcom.univie.ac.at/ and
especially the video demo http://rcom.univie.ac.at/RExcelDemo/

Guy
-- 
View this message in context: 
http://n4.nabble.com/run-R-script-from-Excel-VBA-tp2009478p2011942.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Image RGB calculation

2010-04-16 Thread ole_roessler

Thanks a lot, I will try this out!

Ole
-- 
View this message in context: 
http://n4.nabble.com/Image-RGB-calculation-tp1989864p2013203.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Efficiency of C Compiler in R CMD SHLIB

2010-04-16 Thread yehengxin

Thank you very much for your kind explanation.  I did find that my DLL compiled
using either VC++ 6.0 or the Intel Compiler (almost equally fast) is
significantly faster than the one compiled using gcc (55 seconds vs. 78
seconds), the default compiler in R.  I did not choose debug mode when
using gcc, so I suppose it generates a release version of the DLL.  I just
wonder how to switch to ICC or VC++ 6.0's compiler in R CMD SHLIB. 
Could you give me some advice?  Thanks!
-- 
View this message in context: 
http://n4.nabble.com/Efficiency-of-C-Compiler-in-R-CMD-SHLIB-tp1934429p2004312.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Efficiency of C Compiler in R CMD SHLIB

2010-04-16 Thread yehengxin

I wonder how to further raise the optimization level of gcc.  I thought
-O3 was already the highest.  
-- 
View this message in context: 
http://n4.nabble.com/Efficiency-of-C-Compiler-in-R-CMD-SHLIB-tp1934429p2008994.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] R loop.

2010-04-16 Thread Thomas Stewart
I'm not sure I completely understand your question, but I think the solution
to your problem is the reshape function in the reshape package.  Here is a
silly example of how it would work:

> V <- matrix(rbinom(15,4,.5),nrow=3)
> X <- data.frame(A=c("A","B","C"),V=V)
> X
  A V.1 V.2 V.3 V.4 V.5
1 A   1   2   3   3   3
2 B   4   3   0   2   2
3 C   2   3   2   1   2
> reshape(X,direction="long",varying=c("V.1","V.2","V.3","V.4","V.5"))
    A time V id
1.1 A    1 1  1
2.1 B    1 4  2
3.1 C    1 2  3
1.2 A    2 2  1
2.2 B    2 3  2
3.2 C    2 3  3
1.3 A    3 3  1
2.3 B    3 0  2
3.3 C    3 2  3
1.4 A    4 3  1
2.4 B    4 2  2
3.4 C    4 1  3
1.5 A    5 3  1
2.5 B    5 2  2
3.5 C    5 2  3

Your two columns of interest are A and V.  The time column lets you know
from which column the V came.
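
A sketch of the same idea applied to a disease/gene layout like the one
in the question, using the reshape() function that ships with R. The
second row (DiseaseB, GENE_X, GENE_Y) and the column names disease and
gene.1 to gene.3 are made up for illustration; only the Deafness genes
come from the original post.

```r
# Wide layout: one row per disease, one column per gene slot.
wide <- data.frame(
  disease = c("Deafness", "DiseaseB"),          # DiseaseB is hypothetical
  gene.1  = c("EYA4",     "GENE_X"),
  gene.2  = c("DIAPH1",   "GENE_Y"),
  gene.3  = c("MYO7A",    NA),                  # rows may have fewer genes
  stringsAsFactors = FALSE
)

# Stack the gene columns into a single column called "gene".
long <- reshape(wide, direction = "long",
                varying = c("gene.1", "gene.2", "gene.3"),
                idvar = "disease")

# Keep only the two columns of interest and drop the empty slots.
pairs <- long[!is.na(long$gene), c("disease", "gene")]
rownames(pairs) <- NULL
pairs
```

No explicit loop is needed; each disease is repeated once per gene it has.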

-tgs

On Fri, Apr 16, 2010 at 6:35 AM, mhalsham mhals...@bradford.ac.uk wrote:


 Hi everyone, I'm new to R and I can't figure out how to use a loop to do
 the following task; any help would be much appreciated.
 I have a file called (table3.txt) that contains over 1000 rows and over 40
 columns.
 So, for example, the first row would look like this:

 Deafness,   EYA4,   DIAPH1, MYO7A,  TECTA, COL11A2, POU4F3,
 MYH9,   ACTG1,
 MYO6

 I want the loop statement to loop through each row, take the first cell
 (Deafness) and the second (EYA4), and put them at the bottom of the file,
 then take the first cell (Deafness again) and the third cell (DIAPH1) and
 put them at the bottom of the file, and so on until I end up with two
 columns: one containing all the diseases and one containing all the
 genes.

 --
 View this message in context:
 http://n4.nabble.com/R-loop-tp1979620p1979620.html
 Sent from the R help mailing list archive at Nabble.com.



[R] Is it ok to apply the z.test this way?

2010-04-16 Thread Atte Tenkanen
Dear R-users,

I want to check whether certain values come from a random distribution that
contains values between 0 and 1. So it is not really normal, even though
shapiro.test says it is highly normal... Can I do something like this and
trust that the values given are right? z.test is from the package TeachingDemos.
---
SelectedVals=c()
for(i in seq(0,1,by=0.001))
{
if((z.test(i, mu=mean(Distribution), 
stdev=sd(Distribution))$p.value)=0.05) SelectedVals=c(SelectedVals,i)
}

---
I have marked the border values given by this script on the histogram of the 
original random distribution:

http://www.ag.fimug.fi/~Atte/62Hist100410.pdf

Atte Tenkanen
University of Turku, Finland
Department of Musicology
+35823335278
http://users.utu.fi/attenka/



Re: [R] R loop.

2010-04-16 Thread David Winsemius


On Apr 16, 2010, at 11:52 AM, Thomas Stewart wrote:

> I'm not sure I completely understand your question, but I think the
> solution to your problem is the reshape function in the reshape package.


Except there is no reshape function in the reshape package. Your code  
works because the reshape function is in the stats package which is  
loaded by default.
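
One way to confirm this from a fresh session is sketched below. It assumes the reshape package has not been attached; the variable names are only for illustration. (The reshape package's own main entry points are melt() and cast().)

```r
# Which namespace does a bare call to reshape() resolve to?
where <- environmentName(environment(reshape))
where

# reshape is exported by stats, which is attached by default
in.stats <- "reshape" %in% getNamespaceExports("stats")
in.stats
```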



> Here is a silly example of how it would work:
>
> V <- matrix(rbinom(15,4,.5),nrow=3)
> X <- data.frame(A=c("A","B","C"),V=V)
> X
>   A V.1 V.2 V.3 V.4 V.5
> 1 A   1   2   3   3   3
> 2 B   4   3   0   2   2
> 3 C   2   3   2   1   2
>
> reshape(X,direction="long",varying=c("V.1","V.2","V.3","V.4","V.5"))
>     A time V id
> 1.1 A    1 1  1
> 2.1 B    1 4  2
> 3.1 C    1 2  3
> 1.2 A    2 2  1
> 2.2 B    2 3  2
> 3.2 C    2 3  3
> 1.3 A    3 3  1
> 2.3 B    3 0  2
> 3.3 C    3 2  3
> 1.4 A    4 3  1
> 2.4 B    4 2  2
> 3.4 C    4 1  3
> 1.5 A    5 3  1
> 2.5 B    5 2  2
> 3.5 C    5 2  3
>
> Your two columns of interest are A and V.  The time column lets you know
> from which column the V came.
>
> -tgs
>
> On Fri, Apr 16, 2010 at 6:35 AM, mhalsham mhals...@bradford.ac.uk wrote:
>
>> Hi everyone, I'm new to R and I can't figure out how to use a loop to do
>> the following task; any help would be much appreciated.
>> I have a file called table3.txt that contains over 1000 rows and over 40
>> columns. For example, the first row looks like this:
>>
>> Deafness,   EYA4,   DIAPH1, MYO7A,  TECTA, COL11A2, POU4F3,
>> MYH9,   ACTG1,  MYO6
>>
>> I want the loop statement to loop through each row, take the first cell
>> (Deafness) and the second (EYA4), and put them at the bottom of the file;
>> then take the first cell again (Deafness) and the third cell (DIAPH1) and
>> put them at the bottom of the file, and so on, until I end up with two
>> columns: one containing all the diseases and one containing all the genes.



David Winsemius, MD
West Hartford, CT



Re: [R] Is it ok to apply the z.test this way?

2010-04-16 Thread David Winsemius


On Apr 16, 2010, at 12:11 PM, Atte Tenkanen wrote:


> Dear R-users,
>
> I want to check whether certain values come from a random distribution
> that contains values between 0 and 1. So it is not really normal, even
> though shapiro.test says it is highly normal... Can I do something like
> this and trust that the values given are right? z.test is from the
> package TeachingDemos.
>
> ---
> SelectedVals=c()
> for(i in seq(0,1,by=0.001))
> {
> 	if((z.test(i, mu=mean(Distribution), stdev=sd(Distribution))$p.value)=0.05) SelectedVals=c(SelectedVals,i)
> }



You are attempting to do statistics on a single number at a time. If  
you do not immediately appreciate the absurdity of this effort, then  
you should consult a real statistician without delay. There are many  
fine statisticians at your university.


--

David Winsemius, MD
West Hartford, CT



Re: [R] problem with FUN in Hmisc::summarize

2010-04-16 Thread Frank E Harrell Jr

arnaud chozo wrote:

> Hi all,
>
> I'd like to use the Hmisc::summarize function, but it uses a function (FUN)
> of a single vector argument to create the statistical summaries.
>
> Consider an easy case: I'd like to compute the correlation between two
> variables in my dataframe, grouped according to other variables in the same
> dataframe.
>
> For example, consider the following dataframe D:
>
> V1  V2  V3
> A    1  -1
> A    1   1
> A   -1  -1
> B    1   1
> B    1   1
>
> I'd like to use Hmisc::summarize(X=D, by=llist(myvar=D$V1), FUN=corr.V2.V3)
>
> where corr.V2.V3 is defined as follows:
>
> corr.V2.V3 = function(x) {
>   d = cbind(x$V2, x$V3)
>
>   out = c(cor(d))
>   names(out) = c("CORR")
>   return(out)
> }
>
> I was not able to use Hmisc::summarize in this case because FUN should be a
> function of a matrix argument. Any idea?
>
> Thanks in advance,
> Arnaud


See the Hmisc mApply or summary.formula functions, or use tapply using a 
vector of possible subscripts (1:n) as the first argument; then you can 
use the subscripts selected to address multiple variables.
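
A minimal sketch of the tapply approach, using base R only and the five-row data frame from the question. The object name corr.by.group is just for illustration, and suppressWarnings() is added here only because group B is constant in this toy data.

```r
# Data frame D from the question
D <- data.frame(
  V1 = c("A", "A", "A", "B", "B"),
  V2 = c(1, 1, -1, 1, 1),
  V3 = c(-1, 1, -1, 1, 1)
)

# Pass the row subscripts as the "data"; each group receives its own
# subscripts, which can then address any number of columns in D.
corr.by.group <- tapply(seq_len(nrow(D)), D$V1, function(idx) {
  suppressWarnings(cor(D$V2[idx], D$V3[idx]))
})

# A is 0.5; B is NA because V2 and V3 are constant within group B
corr.by.group
```

The same pattern works for any statistic that needs more than one column, since the function only ever sees row numbers.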


Frank

--
Frank E Harrell Jr   Professor and Chairman        School of Medicine
                     Department of Biostatistics   Vanderbilt University



Re: [R] Is it ok to apply the z.test this way?

2010-04-16 Thread Greg Snow
Several points:

1. The Shapiro test does not tell you that something is normal or highly 
normal, only that you don't have enough evidence to disprove that the data came 
from a normal population (powered for a certain type of deviation from 
normality).

2. The z.test function is intended to be used as a stepping stone in learning 
for students, a simple test with unrealistic assumptions to get the ideas, then 
relax the assumptions and learn about t tests and others.

3.  The z test is used only when the population standard deviation is known; 
you calculate the sd from the data, and that is what t tests are for.

4.  Calculating the hypothesized mean from the data is backwards.

5.  Using a sample size of 1 is questionable; doing this 1,000 times without 
correction is even more questionable.

6.  Your code is equivalent to:

tmp <- seq(0,1, by=0.001)
tmp2 <- tmp[ abs(tmp-mean(Distribution))/sd(Distribution)  1.96 ]

just slower and less memory efficient.

7. None of this establishes what is from an unknown distribution.
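
Point 6 can be checked numerically with base R alone. The sketch below rests on assumptions: Distribution is simulated here (the real data were not posted), and the p-value is computed by hand as a two-sided z test with n = 1 rather than by calling z.test. It shows that thresholding the p-value at 0.05 selects exactly the grid points obtained by thresholding |z| at qnorm(0.975), about 1.96, whichever direction the comparison takes.

```r
set.seed(1)
Distribution <- rbeta(500, 5, 5)   # hypothetical stand-in for the real data

tmp <- seq(0, 1, by = 0.001)
z   <- (tmp - mean(Distribution)) / sd(Distribution)
p   <- 2 * pnorm(-abs(z))          # two-sided p-value of a z test with n = 1

# Same selection, with no loop and no test function
same <- all((p <= 0.05) == (abs(z) >= qnorm(0.975)))
same

# The "border values" are just mean +/- qnorm(0.975) * sd
borders <- mean(Distribution) + c(-1, 1) * qnorm(0.975) * sd(Distribution)
borders
```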

If you can tell us what your real question is, then maybe we can help with a 
real solution.

So, to answer your question of whether it is OK to use z.test in that way: 
legally, the license says you can use it any way you want; ethically, morally, 
aesthetically, or following the intent of the author, no!

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Atte Tenkanen
> Sent: Friday, April 16, 2010 10:11 AM
> To: r-help@r-project.org
> Subject: [R] Is it ok to apply the z.test this way?
>
> Dear R-users,
>
> I want to check whether certain values come from a random distribution
> that contains values between 0 and 1. So it is not really normal, even
> though shapiro.test says it is highly normal... Can I do something like
> this and trust that the values given are right? z.test is from the
> package TeachingDemos.
> ---
>
> SelectedVals=c()
> for(i in seq(0,1,by=0.001))
> {
>   if((z.test(i, mu=mean(Distribution),
> stdev=sd(Distribution))$p.value)=0.05) SelectedVals=c(SelectedVals,i)
> }
>
> ---
>
> I have marked the border values given by this script on the histogram
> of the original random distribution:
>
> http://www.ag.fimug.fi/~Atte/62Hist100410.pdf
>
> Atte Tenkanen
> University of Turku, Finland
> Department of Musicology
> +35823335278
> http://users.utu.fi/attenka/
 


Re: [R] Is it ok to apply the z.test this way?

2010-04-16 Thread Christos Argyropoulos

So .. 

are you trying to figure out whether your data has a substantial number of 
outliers that call into question the adequacy of the normal distribution for 
your data?

 

If this is the case, note that you cannot check the values individually (as you 
are doing) without taking the Bonferroni fallacy into account, i.e. small 
p-values will be found with a respectable frequency as the size of the dataset 
grows (C. Robert discusses this in a preprint on arXiv; see 
http://arxiv.org/PS_cache/arxiv/pdf/1002/1002.2080v1.pdf ). So even though you 
could check each individual point for normality, testing the whole dataset 
requires that you apply a Bonferroni correction to your z.tests or use 
outlier.test from package car to reduce the amount of code you have to write.
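
A rough sketch of the effect, under assumptions: the data are simulated from a normal distribution (so there are no true outliers), the per-point p-values are computed by hand with pnorm rather than with z.test, and the correction uses p.adjust from the stats package.

```r
set.seed(42)
x <- rnorm(1000)                     # hypothetical data, truly normal
z <- (x - mean(x)) / sd(x)
p <- 2 * pnorm(-abs(z))              # one two-sided p-value per point

# Unadjusted: about 5% of the points are flagged purely by chance
n.raw <- sum(p <= 0.05)
n.raw

# Bonferroni: each p-value is multiplied by the number of tests (capped at 1)
p.bonf <- p.adjust(p, method = "bonferroni")
n.bonf <- sum(p.bonf <= 0.05)
n.bonf
```

With the correction, far fewer points (typically none or very few) stay below 0.05, which is the behaviour one would want when the data really are normal.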

 

Regards, 

Christos
 
> Date: Fri, 16 Apr 2010 19:11:19 +0300
> From: atte...@utu.fi
> To: r-help@r-project.org
> Subject: [R] Is it ok to apply the z.test this way?
>
> Dear R-users,
>
> I want to check whether certain values come from a random distribution
> that contains values between 0 and 1. So it is not really normal, even
> though shapiro.test says it is highly normal... Can I do something like
> this and trust that the values given are right? z.test is from the
> package TeachingDemos.
> ---
> SelectedVals=c()
> for(i in seq(0,1,by=0.001))
> {
> if((z.test(i, mu=mean(Distribution), stdev=sd(Distribution))$p.value)=0.05)
> SelectedVals=c(SelectedVals,i)
> }
>
> ---
> I have marked the border values given by this script on the histogram of
> the original random distribution:
>
> http://www.ag.fimug.fi/~Atte/62Hist100410.pdf
>
> Atte Tenkanen
> University of Turku, Finland
> Department of Musicology
> +35823335278
> http://users.utu.fi/attenka/
 


[R] [R-pkgs] formatR: farewell to ugly R code

2010-04-16 Thread Yihui Xie
This is an announcement of the release of an R package 'formatR',
which can help us format our R code to make it more human-readable. If
you have ugly (I mean unformatted) R code like this:

 # rotation of the word 'Animation'
# in a loop; change the angle and color
# step by step
for (i in 1:360) {
 # redraw the plot again and again
plot(1,ann=FALSE,type="n",axes=FALSE)
# rotate; use rainbow() colors
text(1,1,"Animation",srt=i,col=rainbow(360)[i],cex=7*i/360)
# pause for a while
Sys.sleep(0.01)}

There are no spaces, no appropriate indent... The package 'formatR'
provides a GUI (by gWidgets) to make messy R code clean and tidy, e.g.

# rotation of the word 'Animation'
# in a loop; change the angle and color
# step by step
for (i in 1:360) {
   # redraw the plot again and again
   plot(1, ann = FALSE, type = "n", axes = FALSE)
   # rotate; use rainbow() colors
   text(1, 1, "Animation", srt = i, col = rainbow(360)[i],
       cex = 7 * i/360)
   # pause for a while
   Sys.sleep(0.01)
}

The usage is simple:

# formatR depends on RGtk+; will be installed automatically
# better use the latest version of R (>= 2.10.1)
install.packages('formatR')
library(formatR)
# or formatR()

Screen-shots can be found here:
http://yihui.name/en/2010/04/formatr-farewell-to-ugly-r-code/

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-6609 Web: http://yihui.name
Department of Statistics, Iowa State University
3211 Snedecor Hall, Ames, IA

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages
