Re: [R] lattice: strip panel function question
On Mon, Dec 6, 2010 at 6:22 PM, Maarten van Iterson m.van_iterson...@lumc.nl wrote: Thanks Chris Campbell, I didn't think about that. Cheers, Maarten On Mon, 2010-12-06 at 10:08 +0000, Chris Campbell wrote:

data$subjectID <- paste(data$groups, data$subjects)  # create a character label
xyplot(responses ~ time | subjectID, groups = groups, data = data, aspect = "xy")

Another option is

xyplot(responses ~ time | groups:subjects, data = data, aspect = "xy")

-Deepayan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] t-test or ANOVA...who wins? Help please!
Hello Frodo, Some basics of your analysis are not clear to me from your question. If you only have two levels of a factor and one response, why does your ANOVA use more factors (and their interactions)? In that sense, it is obvious that your results would differ from the t-test. In either case, I am not sure that any of these methods are valid, since your data don't seem to be normal. Here is example code showing how to get the same results from aov and t.test, and also a nonparametric option (that might be a better fit):

flat_550_W_realism    <- c(3, 3, 5, 3, 3, 3, 3, 5, 3, 3, 5, 7, 5, 2, 3)
flat_550_W_realism_AH <- c(7, 4, 5, 3, 6, 5, 3, 5, 5, 7, 2, 7, 5, 5)
x <- c(rep(1, length(flat_550_W_realism)), rep(2, length(flat_550_W_realism_AH)))
y <- c(flat_550_W_realism, flat_550_W_realism_AH)
# equal results between t-test and anova
t.test(y ~ x, var.equal = TRUE)
summary(aov(y ~ x))
# plotting the data:
boxplot(y ~ x)  # group 1 is not at all symmetrical...
wilcox.test(y ~ x)  # a more fitting test

Contact Details: Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- On Wed, Jan 5, 2011 at 12:37 AM, Frodo Jedi frodo.j...@yahoo.com wrote: I kindly ask you for help because I really don't know how to solve this problem.
Re: [R] Cost-benefit/value for money analysis
Ben Perhaps you can specify your question more precisely, or differently. The way I interpret it, if there are no interactions in price (e.g. you get a discount for buying more than one book at a time) or in value (e.g. you learn more from one book having read another), then you get the best value/price ratio by taking only the book with the highest value/price. (If you take no books at all, your value/price ratio is undefined.) The algebra below shows that combining a lower value/price book with a higher one always lowers your overall value/price ratio. Thanks for the pointers on R functions. My question was as superficial as it sounded. I have a commercial programme that does this (one of several that are available), and wondered if there was an R package that provided the same tools. It's a common tool, and I had hoped to have explained enough to allow an appropriate package to be identified, so I could have a quick look at what it does. But having started this, I now feel obliged to clarify the question. I only chose books as an easy example; you could substitute alternative marketing strategies, monitoring programmes, choice of ornaments for a new house, holidays, etc. So there could be only a few potential combinations or hundreds. But to stick with the books, and only three options, A, B and C: Book A costs $100 and I have given it a subjective value of 50. Book B costs $36 and I have given it a subjective value of 60. Book C costs $50 and I have given it a subjective value of 80. So book A is costing me $2.00 per value unit, book B $0.60 per value unit and book C $0.63 per value unit. Buying books A+B gives $1.24 per value unit. Buying books A+C gives $1.15 per value unit. Buying books B+C gives $0.61 per value unit. Buying books A+B+C gives $0.98 per value unit. So in terms of value for money, there are three contenders: book B on its own, book C on its own, or buying both books B and C.
Book B at $36.00 and value 60. Book C at $50.00 and value 80. Books B+C at $86.00 and value 140. Depending on how you are using this tool, you can either use it to decide how to spend an existing budget, or use it to set a budget. It seems hardly worth the bother for three books, but if you are looking at 20 books or 30 different monitoring options etc., it gives a useful insight into how best to spend or set a budget. The commercial software graphs these costs vs. values, so you usually end up with some sort of asymptotic graph where you can see that spending below a certain budget gives a very poor return. Graham
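Graham's enumeration above is easy to reproduce in R. A small sketch (prices and values taken from the post; the variable names are mine): build every non-empty combination of the three books and rank them by cost per unit of subjective value.

```r
# Enumerate all non-empty combinations of the three books from the post
# and rank them by cost per unit of subjective value.
books <- data.frame(name  = c("A", "B", "C"),
                    price = c(100, 36, 50),
                    value = c(50, 60, 80))
combos <- expand.grid(A = c(FALSE, TRUE), B = c(FALSE, TRUE), C = c(FALSE, TRUE))
combos <- combos[rowSums(combos) > 0, ]                 # drop the empty set
cost   <- as.vector(as.matrix(combos) %*% books$price)  # total price per combo
value  <- as.vector(as.matrix(combos) %*% books$value)  # total value per combo
ranking <- data.frame(combos, cost, value,
                      cost_per_value = round(cost / value, 2))
ranking[order(ranking$cost_per_value), ]                # B alone and B+C lead
```

With 20 books this becomes 2^20 - 1 combinations, so for larger problems the greedy ordering discussed elsewhere in the thread is the practical route.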
Re: [R] openNLP package error
Apologies that I am late on this thread. On 02/12/10 17:39, Sascha Wolfer wrote: I seem to have a problem with the openNLP package; I'm actually stuck at the very beginning. Here's what I did:

install.packages("openNLP")
install.packages("openNLPmodels.de", repos = "http://datacube.wu.ac.at/", type = "source")
library(openNLPmodels.de)
library(openNLP)

So I installed the main package as well as the supplementary German model. Now I try to use the sentDetect function:

s <- c("Das hier ist ein Satz. Und hier ist noch einer - sogar mit Gedankenstrich. Ist das nicht toll?")
sentDetect(s, language = "de", model = "openNLPmodels.de")

I get the following error message, which I can't make any sense of:

Fehler in .jnew("opennlp/maxent/io/SuffixSensitiveGISModelReader", .jnew("java.io.File", : java.io.FileNotFoundException: openNLPmodels.de (No such file or directory)

The correct syntax seems to be

sentDetect(s, model = system.file("models", "de-sent.bin", package = "openNLPmodels.de"))

but unfortunately I get

Error in .jcall(.jnew("opennlp/maxent/io/SuffixSensitiveGISModelReader", : java.io.UTFDataFormatException: malformed input around byte 48

YMMV. But you get the idea of the syntax of the model= argument. This works:

sentDetect(s, model = system.file("models", "sentdetect", "EnglishSD.bin.gz", package = "openNLPmodels.en"))
# [1] "Das hier ist ein Satz."
# [2] "Und hier ist noch einer - sogar mit Gedankenstrich."
# [3] "Ist das nicht toll?"

Hope this helps you a little. Allan
[R] bwplot
I'm trying to use the function bwplot, but I receive a message that the function is not found. I charged the lattice, sm, and Hmrsc packages, but without success. What I am trying to do is a single box-plot with two levels, Season and Area, on the x-axis and abundance on the y-axis.
Re: [R] R(D) Com under R1070
I get the same trouble. Did you finally succeed in fixing it? Henri
Re: [R] Navigating web pages using R
You are talking about digging data out of a dynamic webpage. The data displayed to you is, I guess, fetched by filtering some database, and the dropdowns you see on the page must be widgets that drive that filtering. Some sites expose the filtering via URL parsing, where the final URL changes along with the chosen filters. But other sites only offer the data via embedded widgets and make no change to the URL you see. Maybe your case belongs to the second type. Maybe you can do some analysis of the source of that webpage. If you are lucky, you will find some code dealing with the filtering job; it might offer some help. :D 2011/1/5 Mike Marchywka marchy...@hotmail.com Date: Tue, 4 Jan 2011 10:54:19 -0800 From: egregory2...@yahoo.com To: r-help@r-project.org Subject: [R] Navigating web pages using R R-Help, I'm trying to obtain some data from a webpage which masks the URL from the user, so an explicit URL will not work. For example, when one navigates to the web page the URL looks something like: http://137.113.141.205/rpt34s.php?flags=1 (changed for privacy, but I'm not sure you could access it anyway, since it's internal to the agency I work for). LOL, presuming you are not a disgruntled employee, it is always amusing to see some entity with a fancy cryptic web design drink their own Koolaid :) This is the most annoying kind of code to write, especially when there is no reason, such as a revenue model, to make it hard to get. I've posted in other forums about the general need for an API if you are providing data to others in a non-hostile setting. The site has three drop-down menus for Site, Month, and Year.
When a combination of these is selected, the resulting URL is always http://137.113.141.205/rpt34s (nothing changes, except that flags=1 is dropped), so what I need to be able to do is write something that will navigate to the original URL, then select some combination of Site, Month, and Year, and then submit the query to the site to navigate to the page with the data. Is this a capability that R has as a language? Unfortunately, I'm unfamiliar with HTML or PHP programming, so if this question belongs in a forum on that, I apologize. I'm trying to centralize all of my code for my analysis in R! I'm sure that ultimately you can code this in R, but for digging out what you need there may be better approaches. First I would try to contact the page author or determine if there is a better way to get the same data. Failing that, you may be able to find a form section in the HTML and copy that. Firefox is supposed to have something called Firebug to let you see what the page does, but I've never actually used it. Generally I use Linux or Cygwin command-line tools to diagnose this junk; R may support some of these features, but this is a common issue outside of R too, so it may be worthwhile learning the other tools. If all else fails (downloading a local copy of the page, etc.), you may be able to do a packet capture and just see what it does by brute force. From what I have seen, the R tools are pretty much named after the Linux tools, curl for example. Thank you, -Erik Gregory Student Assistant, California EPA CSU Sacramento, Mathematics
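For the record, once the form's field names are known, one way to attack this from R is an HTTP POST. This is only a hedged sketch: the URL is the (privacy-mangled) one from the post, the field names site, month, and year are invented, and it assumes the RCurl package is installed. The real field names must be read from the <form> element in the page source.

```r
# Hypothetical sketch -- the field names are guesses; inspect the page's
# <form> action and <select> name= attributes for the real ones.
library(RCurl)
html <- postForm("http://137.113.141.205/rpt34s.php",
                 site = "Some site", month = "January", year = "2010")
# 'html' is the returned page as a character string; the data table can
# then be pulled out, e.g. with readHTMLTable() from the XML package.
```

This is essentially what a packet capture would reveal the browser doing; Firebug's network panel shows the same POST fields interactively.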
Re: [R] lattice: strip panel function question
Thanks, Deepayan, that solution is even more elegant! Maarten On Wed, 2011-01-05 at 14:24 +0530, Deepayan Sarkar wrote: On Mon, Dec 6, 2010 at 6:22 PM, Maarten van Iterson m.van_iterson...@lumc.nl wrote: Thanks Chris Campbell, I didn't think about that. Cheers, Maarten On Mon, 2010-12-06 at 10:08 +0000, Chris Campbell wrote:

data$subjectID <- paste(data$groups, data$subjects)  # create a character label
xyplot(responses ~ time | subjectID, groups = groups, data = data, aspect = "xy")

Another option is

xyplot(responses ~ time | groups:subjects, data = data, aspect = "xy")

-Deepayan -- Maarten van Iterson Center for Human and Clinical Genetics Leiden University Medical Center (LUMC) Research Building, Einthovenweg 20 Room S-04-038 Phone: 071-526 9439 E-mail: m.van_iterson...@lumc.nl --- Postal address: Postzone S-04-P Postbus 9600 2300 RC Leiden The Netherlands
Re: [R] Converting Fortran or C++ etc to R
I did a quick search for interfacing R and Fortran and found this past information; hope it helps. :D http://r.789695.n4.nabble.com/Conerned-about-Interfacing-R-with-Fortran-td887428.html As for your actual requirement to do the conversion, I guess there are no quick ways: you have to be familiar with both R and the other language to make the rewrite work. 2011/1/5 Murray Jorgensen m...@stats.waikato.ac.nz I'm going to try my hand at converting some Fortran programs to R. Does anyone know of any good articles giving hints on such tasks? I will post a selective summary of my gleanings. Cheers, Murray -- Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html Department of Statistics, University of Waikato, Hamilton, New Zealand Email: m...@waikato.ac.nz Fax 7 838 4155 Phone +64 7 838 4773 wk Home +64 7 825 0441 Mobile 021 0200 8350
[R] How to use S-Plus functions in R
Hi, I am very new to R. I used to work in S-Plus a lot, but that was years ago. I wrote a large number of functions that I now want to view and edit in R. I know I have to tell R where the functions are, but I have no idea how. The functions are stored on my laptop's C drive. I tried everything I could find, e.g. library(myfilepath), source(myfilepath), etc., but nothing seems to work. Hein -- View this message in context: http://r.789695.n4.nabble.com/How-to-use-S-Plus-functions-in-R-tp3174963p3174963.html Sent from the R help mailing list archive at Nabble.com.
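For what it's worth, the usual stumbling block here is that source() wants a quoted file path to a text file of R code, not a bare name. A minimal sketch (the file name and function are made up for illustration):

```r
# Write a function definition to a file, then source() it -- in practice
# you would point source() at your own file with a quoted path, e.g.
#   source("C:/SplusFunctions/myfuns.ssc")   # hypothetical Windows path
fpath <- file.path(tempdir(), "myfuns.R")
writeLines("cube <- function(x) x^3", fpath)
source(fpath)   # reads the definitions into the workspace
cube(3)         # 27
```

If the functions only exist inside an old S-Plus .Data directory rather than as plain text, the data.restore() function in the foreign package may be worth a look for dump files, though whether it applies depends on how the functions were saved.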
[R] What are the necessary Oracle software to install and run ROracle ?
Hello, I am running Linux. I have downloaded instantclient-basiclite-linux32-11.2.0.2.0.zip instantclient-sqlplus-linux32-11.2.0.2.0.zip instantclient-sdk-linux32-11.2.0.2.0.zip instantclient-precomp-linux32-11.2.0.2.0.zip All these archives are unzipped in /usr/local/lib/instantclient, and I have added this path to the library path of the host. I can run sqlplus and proc; they do not complain about missing symbols. Then I install ROracle:

install.packages("ROracle")

The compilation step is OK, but when the test step tries to load the ROracle.so library, it fails:

** testing if installed package can be loaded
Error in dyn.load(file, DLLpath = DLLpath, ...) : unable to load shared library '/opt/R-2.11.1/lib/R/library/ROracle/libs/ROracle.so': /opt/R-2.11.1/lib/R/library/ROracle/libs/ROracle.so: undefined symbol: sqlprc

Here is my list of libs in the instantclient directory:

$ find . -name '*.*o' -o -name '*.a'
./libsqlplusic.so
./sdk/demo/procobdemo.pco
./cobsqlintf.o
./libociicus.so
./libnnz11.so
./libocijdbc11.so
./libsqlplus.so

Do I need some more libs? From which Oracle tarball? Thanks for the help.
Re: [R] [R-downunder] Converting Fortran or C++ etc to R
Hi Murray, at first I thought you meant compiling existing Fortran or C++ for use in R with .Fortran() and so on, but do you mean literal conversion from Fortran to pure R code? I'm assuming pure R code for the rest of this. I've tried it with some fairly simple C++ and C code, and that's been fairly easy - there are a lot of details you can ignore while you just try to figure out the algorithm. It's nice if you have running software so you can compare outputs, but I did once eventually figure out some Pascal code from an old textbook - it had enough actual example data printed in the book to allow me eventually to figure it out. There were people around me who had once compiled Pascal, but it didn't sound like it was going to be much fun. Sometimes C and C++ chunks can be copied over directly and used with very few changes, but it will just depend. Good luck; I would just jump in at the deep end and send in questions if you get stuck. Cheers, Mike. On Wed, Jan 5, 2011 at 11:02 AM, Murray Jorgensen m...@stats.waikato.ac.nz wrote: I'm going to try my hand at converting some Fortran programs to R. Does anyone know of any good articles giving hints on such tasks? I will post a selective summary of my gleanings. Cheers, Murray -- Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html Department of Statistics, University of Waikato, Hamilton, New Zealand Email: m...@waikato.ac.nz Fax 7 838 4155 Phone +64 7 838 4773 wk Home +64 7 825 0441 Mobile 021 0200 8350 -- r-downun...@stat.auckland.ac.nz http://www.stat.auckland.ac.nz/r-downunder To unsubscribe send an email to r-downunder-unsubscr...@stat.auckland.ac.nz -- Michael Sumner Institute for Marine and Antarctic Studies, University of Tasmania Hobart, Australia e-mail: mdsum...@gmail.com
[R] Odp: bwplot
Hi, r-help-boun...@r-project.org wrote on 05.01.2011 09:20:35: I'm trying to use the function bwplot, but I receive a message that the function is not found. I charged the lattice, sm, and Hmrsc, package but Can you please explain how one can **charge** packages? I never did it. Besides, did you start with library(lattice) before trying to issue bwplot(anything...)? Regards, Petr without success. What I am trying to do is a single box-plot with two levels, Season and Area, on the x-axis and abundance on the y-axis.
Re: [R] Adding lines in ggplot2
Hi Bert: On Tue, Jan 4, 2011 at 8:39 PM, Bert Gunter gunter.ber...@gene.com wrote: Dennis: Can't speak to ggplot2, but your comments regarding lattice are not quite correct. Many if not all of lattice's basic plot functions are generic, which means that one has essentially complete latitude to define plotting methods for arbitrary data structures. For example, there is an xyplot.ts method for time series -- class ts -- data. Of course, for most lattice methods, the data do naturally come in a data frame, and a standard lattice argument is to give a frame from which to pull the data. But this is not required. I'm aware of that, but thank you for clarifying matters. I didn't state explicitly whether lattice required data frame input or not (my lattice example indicated no, and indeed it does not), but the message was evidently muddled further down the post. Your comments speak to some of the differences in the design and philosophy of lattice and ggplot2, and I have no disagreement with your remarks about lattice. The point I was trying to make was that by using data frames and the several packages/base functions that support their manipulation, one can simplify the coding of graphics within both ggplot2 and lattice. There are many things one can do with data frames that one cannot with vectors, as you well know - e.g., extensions with new data (rbind) or new variables (cbind/transform, etc.), or reshaping, among others. These features can be used to advantage in both ggplot2 and lattice. The OP's example is a simple one - had he used

df <- data.frame(x = sqrt(1:10), y = log(1:10))  # oops, forgot 1:10...
qplot(as.numeric(rownames(df)), x, data = df, geom = 'line', colour = I('darkgreen'))  # ...but it's OK
# or
xyplot(x ~ as.numeric(rownames(df)), data = df, type = 'l', col.line = 'darkgreen')

there would have been no problem. A little inconvenient for a new user, maybe, but hardly 'very restrictive'.
As for other types of R data objects that are not data frames, offhand I can't think of too many that are incapable of being converted to data frames somehow for the purposes of graphics, although I wouldn't be remotely surprised if some existed. [For example, one can extract fitted values, residuals and perhaps a model matrix from a model object and place the results in a data frame.] ggplot2 has a fortify() method to allow one to transform data objects for use in the package. There is some discussion in Chapter 9 of Hadley's book, but I'm not in a position to add insight as I haven't used it personally. I do think this is a fair statement, though, and it's been said before: if one wants *complete* control and flexibility of inputs and outputs, use base graphics. Both lattice and ggplot2, by virtue of being structured graphics systems, impose certain constraints (e.g., default actions) on the user which are system-dependent. Prof. Vardeman's quote still applies :) Dennis -- Bert Please explain to me how

df <- data.frame(x, y, index = 1:10)
qplot(index, x, geom = 'line', ...)

is 'very restrictive'. Lattice and ggplot2 are *structured* graphics systems - to get the gains that they provide, there are some costs. I don't perceive organization of data into a data frame as being restrictive - in fact, if you learn how to construct data for input into ggplot2 to simplify the code for labeling variables and legends, the data frame requirement is actually a benefit rather than a restriction. Moreover, one can use the plyr and reshape(2) packages to reshape or condense data frames to provide even more flexibility and freedom to produce ggplot2 and lattice graphics. In addition, the documentation for ggplot2 is quite explicit about requiring data frames for input, so it is behaving as documented. The complexity (and interaction) of the graphics code probably has something to do with that. Since Josh left you a quote, I'll supply another, from Prof.
Steve Vardeman in a class I took with him a long time ago: There is no free lunch in statistics: in order to get something, you've got to give something up. In this case, if you want the nice infrastructure provided by ggplot2, you have to create a data frame for input. Dennis Thanks in advance, and best regards! Eduardo Horta
Re: [R] how to subset unique factor combinations from a data frame.
Hi, You probably did not notice the xtabs I mentioned before: as.data.frame(xtabs(~ x + xx)).

u <- as.data.frame(table(x, xx))
head(u)
  x xx Freq
1 A  a   18
2 B  a   27
3 C  a   30
4 D  a   30
5 E  a   27
6 F  a   18
v <- as.data.frame(xtabs(~ x + xx))
head(v)
  x xx Freq
1 A  a   18
2 B  a   27
3 C  a   30
4 D  a   30
5 E  a   27
6 F  a   18

Regards, Petr r-help-boun...@r-project.org wrote on 05.01.2011 08:46:21: Hi Dennis, It worked! This is what I am looking for. Many thanks. Rgds, SNVK _ From: Dennis Murphy [mailto:djmu...@gmail.com] Sent: Tuesday, January 04, 2011 9:07 PM To: SNV Krishna Cc: r-help@r-project.org Subject: Re: [R] how to subset unique factor combinations from a data frame. Hi: Did you try something like summdf <- as.data.frame(with(df, table(Commodity, Attribute, Unit)))? The rows of the table should represent the unique combinations of the three variables. Here's a simple toy example to illustrate:

x <- sample(LETTERS[1:6], 1000, replace = TRUE)
xx <- sample(letters[1:6], 1000, replace = TRUE)
u <- as.data.frame(table(x, xx))
dim(u)
[1] 36 3
head(u)
  x xx Freq
1 A  a   26
2 B  a   29
3 C  a   25
4 D  a   25
5 E  a   27
6 F  a   29

HTH, Dennis On Tue, Jan 4, 2011 at 2:19 AM, SNV Krishna kris...@primps.com.sg wrote: Hi, Sorry that my example is not clear. I will give an example of what each variable holds. I hope this clearly explains the case. Names of the dataframe (df) and description: Year is the calendar year, from 1980 to 2010. Country is the country name; the total number (levels) of countries is ~190. Commodity is crude oil, sugar, rubber, coffee, etc.; the number (levels) of commodities is 20. Attribute is production, consumption, stock, import, export...; levels ~20. Unit is actually not a factor; it describes the unit of Attribute. Say the unit for Coffee (commodity) - Production (attribute) is 60 kgs.
While the unit for Crude oil - Production is 1000 barrels. Value is the value. tail(df, n = 10) // example data //

Year Country        Commodity    Attribute      Unit      Value
1991 United Kingdom Wheat, Durum Total Supply   (1000 MT)    70
1991 United Kingdom Wheat, Durum TY Exports     (1000 MT)     0
1991 United Kingdom Wheat, Durum TY Imp. from U (1000 MT)     0
1991 United Kingdom Wheat, Durum TY Imports     (1000 MT)    60
1991 United Kingdom Wheat, Durum Yield          (MT/HA)       5

Wish this is clear. Any suggestions? Regards, SNVK -----Original Message----- From: Petr PIKAL [mailto:petr.pi...@precheza.cz] Sent: Tuesday, January 04, 2011 4:06 PM To: SNV Krishna Cc: r-help@r-project.org Subject: Odp: [R] how to subset unique factor combinations from a data frame. Hi, r-help-boun...@r-project.org wrote on 04.01.2011 05:21:25: Hi All, I have these questions and request members' expert views on them. a) I have a dataframe (df) with five factors (identity variables) and value (measured value). The id variables are Year, Country, Commodity, Attribute, Unit. Value is a value for each combination of these. I would like to get just the unique combinations of Commodity, Attribute and Unit into a dataframe or a table. I know aggregate and subset but don't know how to use them in this context. aggregate(Value, list(Commodity, Attribute, Unit), function) b) Is it possible to include non-aggregate columns with the aggregate function, say in the above case aggregate(Value ~ Commodity + Attribute, data = df, FUN = count)? The use of count(Value) is just a roundabout way to return the combinations of Commodity and Attribute; I would like to include the 'Unit' column in the returned data frame. Hm. Maybe xtabs? But without any example it is only a guess. c) Is it possible to subset based on unique combinations, something like subset(df, unique(Commodity), select = c(Commodity, Attribute, Unit))? I know this is not correct, as it returns the error 'subset needs a logical evaluation'. Trying various ways to accomplish the task.
Probably the sqldf package has tools for doing it, but I do not use it, so you will have to try yourself. df[df$Commodity == "something", c("Commodity", "Attribute", "Unit")] can be another way. Anyway, your explanation is ambiguous. Let's say you have three rows with the same Commodity. Which row do you want to select? Regards, Petr I will be grateful for any ideas and help. Regards, SNVK
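If only the distinct combinations are wanted, rather than their counts, unique() on the three columns is a direct alternative to the table()/xtabs() route discussed above. A small sketch with made-up toy data:

```r
# Toy data frame with a duplicated Commodity/Attribute/Unit combination.
df <- data.frame(Commodity = c("Coffee", "Coffee", "Crude oil"),
                 Attribute = c("Production", "Production", "Production"),
                 Unit      = c("60 kg", "60 kg", "1000 barrels"),
                 Value     = c(10, 20, 30))
distinct <- unique(df[, c("Commodity", "Attribute", "Unit")])
distinct   # 2 rows: the duplicated combination appears only once
```

Unlike as.data.frame(table(...)), this returns only combinations that actually occur in the data, with no Freq column and no zero-count rows.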
Re: [R] Cost-benefit/value for money analysis
Hi: Are you perhaps thinking of conjoint analysis? Dennis On Wed, Jan 5, 2011 at 1:30 AM, Graham Smith myotis...@gmail.com wrote: Ben Perhaps you can specify your question more precisely, or differently. The way I interpret it, if there are no interactions in price (e.g. you get a discount for buying more than one book at a time) or in value (e.g. you learn more from one book having read another), then you get the best value/price ratio by taking only the book with the highest value/price. (If you take no books at all, your value/price ratio is undefined.) The algebra below shows that combining a lower value/price book with a higher one always lowers your overall value/price ratio. Thanks for the pointers on R functions. My question was as superficial as it sounded. I have a commercial programme that does this (one of several that are available), and wondered if there was an R package that provided the same tools. It's a common tool, and I had hoped to have explained enough to allow an appropriate package to be identified, so I could have a quick look at what it does. But having started this, I now feel obliged to clarify the question. I only chose books as an easy example; you could substitute alternative marketing strategies, monitoring programmes, choice of ornaments for a new house, or holidays etc. So there could be only a few potential combinations or hundreds. But to stick with the books, and only three options, A, B and C: Book A costs $100 and I have given it a subjective value of 50. Book B costs $36 and I have given it a subjective value of 60. Book C costs $50 and I have given it a subjective value of 80. So book A is costing me $2.00 per value unit, book B $0.60 per value unit and book C $0.63 per value unit.
Buying books A+B gives $1.24 per value unit. Buying books A+C gives $1.15 per value unit. Buying books B+C gives $0.61 per value unit. Buying books A+B+C gives $0.98 per value unit. So in terms of value for money, there are three contenders: book B on its own, book C on its own, or buying both books B and C. Book B at $36.00 and value 60. Book C at $50.00 and value 80. Books B+C at $86.00 and value 140. Depending on how you are using this tool, you can either use it to decide how to spend an existing budget, or use it to set a budget. It seems hardly worth the bother for three books, but if you are looking at 20 books or 30 different monitoring options etc., it gives a useful insight into how best to spend or set a budget. The commercial software graphs these costs vs. values, so you usually end up with some sort of asymptotic graph where you can see that spending below a certain budget gives a very poor return. Graham
Re: [R] Cost-benefit/value for money analysis
David, I think a similar argument at the margins would show that even if the task were specified as maximising value within a budget, simply ordering by value/price and buying until the cumsum of the prices exceeded the budget would solve the alternate statement of the problem. I suppose there might be situations where buying two books whose value/price was less than marginally maximal would be better, because the two marginally maximal choices would break the budget. This sounds like a homework problem and I don't see any student effort yet. Search terms include: decision analysis, cost-benefit analysis, or utility theory. Hopefully, my response to Ben will clarify my question, and why I am asking it. At the moment (and that may change) I'm not specifically interested in how you do it in R, just in whether there is a package aimed at this kind of cost-benefit analysis. Graham
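The greedy rule described above can be sketched in a few lines of R (the $90 budget is made up for illustration; as noted, greedy selection is not always optimal for this knapsack-type problem):

```r
# Greedy budget allocation: best value per dollar first, buy until broke
price  <- c(A = 100, B = 36, C = 50)
value  <- c(A = 50,  B = 60, C = 80)
budget <- 90                                    # illustrative budget

ord <- order(value / price, decreasing = TRUE)  # rank by value/price ratio
buy <- names(price)[ord][cumsum(price[ord]) <= budget]
buy  # with this $90 budget: "B" "C"
```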
Re: [R] Converting Fortran or C++ etc to R
On Wed, Jan 5, 2011 at 7:33 AM, lcn lcn...@gmail.com wrote: As for your actual requirement to do the conversion, I guess there aren't any quick ways. You have to be familiar with both R and the other language to make the rewrite work. To make the rewrite work _well_ is the bigger problem! The easiest route to big performance wins is going to be spotting vectorisation possibilities in the Fortran code. Any time you see a DO K=1,N loop, look to see if it's just a single vector operation in R. Another route to big wins is to write test code, so you can check whether your R code gives the same results as the Fortran (C/C++) code at every stage of the rewrite. Don't just write it all in one go and then hope it works! Small steps. Barry
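Both suggestions combine naturally: translate each loop into a vector operation, then assert that the two versions agree. A generic sketch (not code from the thread):

```r
set.seed(1)
x <- runif(1000)
y <- runif(1000)

# Literal translation of a Fortran 'DO K=1,N' loop
z_loop <- numeric(length(x))
for (k in seq_along(x)) z_loop[k] <- x[k] * y[k] + 1

# Vectorised rewrite: one R expression, no loop
z_vec <- x * y + 1

# Test at every stage of the rewrite, as suggested above
stopifnot(all.equal(z_loop, z_vec))
```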
Re: [R] Cost-benefit/value for money analysis
Dennis, Are you perhaps thinking of conjoint analysis? Thanks, but as far as I can make out, having just looked at conjoint analysis, it looks like some form of discriminant analysis, which is not what I am looking for. I only have two variables, cost and value. I am ignoring how you establish the value; I just need to be able to assess every possible combination of costs and values. It's a common technique in the decision analysis literature (and in specialist decision analysis software), but I have never seen it given a specific name. Of course it may have several names, and be used across different disciplines for different purposes. It's such a common tool that I was hoping someone would instantly recognise what I was describing and be able to say that it was available in a particular package. But I had never looked at conjoint analysis before, so it's nice to know it exists. Graham
[R] How to save graphs out of ACF ?
Hi, I want to save the autocorrelation plots resulting from acf (acf(ts)), not just by using the "Save as" command in the R GUI but using some sort of code which allows me to choose the format and the path. Thank you, Mihai
Re: [R] update.views(Spatial) does not seem to be able to find RPyGeo package
On Tue, 4 Jan 2011, Linder, Eric wrote: I have this problem with loading the RPyGeo package when using update.views. How can I fix this? Only by changing the operating system. You are using Linux, but the RPyGeo package requires Windows; see http://CRAN.R-project.org/package=RPyGeo update.views() (or actually the underlying call to install.packages()) just informs you about this through a warning. hth, Z I have tried to use other CRAN mirrors with the same result. Below is a copy of my session. -session--- R version 2.12.1 (2010-12-16) Copyright (C) 2010 The R Foundation for Statistical Computing ISBN 3-900051-07-0 Platform: i486-pc-linux-gnu (32-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. [Previously saved workspace restored] library(ctv) update.views('Spatial') --- Please select a CRAN mirror for use in this session --- Loading Tcl/Tk interface ... done Warning message: In update.views("Spatial") : The following packages are not available: RPyGeo -session--- The information contained in this communication may be C...{{dropped:11}}
[R] Odp: List to a summary table
Hi, r-help-boun...@r-project.org wrote on 05.01.2011 01:26:42: Hi Suppose you have the code below. The result I get from the cat function is from the avgs object. Now, I have 30 different objects like this and I wish I could stick to lists and not make 30 objects. e.g. avg.width from the avgs object can be extracted by

as.numeric(unlist(sapply(avgs, function(x) x[4])))[-1]

Regards Petr to make a summary table, something like: Avgs1 Avgs2 Avgs3 i= 2 average= 0.515983 i= 2 average= 0.746983 i= 2 average= 0.2665983 i= 3 average= 0.5135953 i= 3 average= 0.7345953 i= 3 average= 0.23455953 i= 4 average= 0.4998128 i= 4 average= 0.7233128 i= 4 average= 0.21398128

library(cluster)
d <- hclust(dist(iris[, -5]))
avgs <- sapply(1:20, function(x) summary(silhouette(cutree(d, x), dist(iris[, -5]))))
# str(avgs)
# print out the average widths
for (i in 2:length(avgs)) { # ignore first item
  cat('i=', i, 'average=', avgs[[i]]$avg.width, '\n')
}

i= 2 average= 0.515983 i= 3 average= 0.5135953 i= 4 average= 0.4998128 i= 5 average= 0.346174 i= 6 average= 0.3382031 i= 7 average= 0.3297649 i= 8 average= 0.324025 i= 9 average= 0.3191681 i= 10 average= 0.3028503 i= 11 average= 0.3072648 i= 12 average= 0.2834498 i= 13 average= 0.2776717 i= 14 average= 0.2855396 i= 15 average= 0.2745142 i= 16 average= 0.2578903 i= 17 average= 0.2531909 i= 18 average= 0.2473504 i= 19 average= 0.2484205 i= 20 average= 0.2545357 thanks A.Dias -- View this message in context: http://r.789695.n4.nabble.com/List-to-a-summary-table-tp3174698p3174698.html Sent from the R help mailing list archive at Nabble.com.
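One way to assemble Petr's per-object extraction into the kind of three-column table sketched above is to wrap the computation in a helper and cbind the results. A sketch using three hclust linkage methods as stand-ins for the poster's 30 objects:

```r
library(cluster)

dmat <- dist(iris[, -5])

# avg.width for k = 2..4 clusters under a given linkage method
avg_widths <- function(method, ks = 2:4) {
  d <- hclust(dmat, method = method)
  sapply(ks, function(k) summary(silhouette(cutree(d, k), dmat))$avg.width)
}

# Three stand-in objects; the poster would loop over the real 30
tab <- data.frame(i        = 2:4,
                  complete = avg_widths("complete"),
                  average  = avg_widths("average"),
                  single   = avg_widths("single"))
tab
```

The `complete` column reproduces the 0.515983, 0.5135953, 0.4998128 values printed in the thread, since hclust defaults to complete linkage.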
Re: [R] How to use S-Plus functions in R
On 11-01-05 3:17 AM, Hein wrote: Hi I am very new to R. I used to work in S-Plus a lot, but that was years ago. I wrote a large number of functions that I now want to view and edit in R. I know I have to tell R where the functions are, but I have no idea how. The functions are stored on my laptop's C: drive. I tried everything I could find, e.g. library("myfilepath"), source("myfilepath"), etc., but nothing seems to work. Hein Save their source as text, and source that. R can't read the binary S-Plus objects from recent S-Plus versions. Since R and S-Plus are not identical, you may need some modifications to the functions to get them to work in R: so test carefully. Duncan Murdoch
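Duncan's advice in miniature: functions written out as plain-text source (for example with dump(), which exists in both S-Plus and R; the paths here are illustrative) can be read back into R with source():

```r
# Round-trip sketch: write a function definition as readable text,
# then recreate it with source() -- the same route works for functions
# dumped as text from S-Plus
sq <- function(x) x^2
f <- tempfile(fileext = ".R")
dump("sq", file = f)   # writes the definition as plain-text source
rm(sq)
source(f)              # recreates sq in the workspace
sq(4)                  # -> 16
```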
Re: [R] Print plot to pdf, jpg or any other format when using scatter3d error
On 11-01-04 5:36 PM, Jurica Seva wrote: Thank you, Duncan, it works now with rgl.snapshot (I did have to upgrade to 2.12.1). Is there any way to manipulate the size of the created image? The created plots are a bit small (256*256). Sure, they're the size of the window: it's just a snapshot. Just make it bigger (by mouse, or using par3d(windowRect = ...)) before taking the snapshot. Duncan Murdoch Thank you for your help once again :) Best, Jurica On Tue, Jan 4, 2011 at 8:31 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 03/01/2011 8:17 PM, Jurica Seva wrote: Hi, I have been trying to output my graphs to a file (jpeg, pdf, ps, it doesn't matter) but I can't seem to be able to get it to output. As Uwe said, you are using rgl graphics, not base graphics. So none of the standard devices work; you need to use the tools built into rgl. Attach that package, and then read ?rgl.postscript (for graphics in various vector formats, not just Postscript) and ?rgl.snapshot (for bitmapped graphics). Some notes: - For a while rgl.snapshot wasn't working in the Windows builds with R 2.12.1; that is now fixed, so you should update rgl before getting frustrated. - rgl.snapshot just takes a copy of the graphics buffer that is showing on screen, so it is limited to the size you can display. - rgl.postscript does a better job for the parts of an image that it can handle, but it is not a perfect OpenGL emulator, so it doesn't always include all components of a graph properly. Duncan Murdoch I tried a few things but none of them worked and am at a loss as to what to do now. I am using the scatter3d function, and it prints out the graphs on to the screen without any problems, but when it comes to writing them to a file I can't make it work. Is there any other way of producing 3-dimensional graphs (they don't have to be rotatable/interactive after the print out)?
The code is fairly simple and is listed below:

# libraries
library(RMySQL)
library(rgl)
library(scatterplot3d)
library(Rcmdr)

# database connection
mycon <- dbConnect(MySQL(), user = 'root', dbname = 'test', host = 'localhost', password = '')

# distinct sessions
rsSessionsU01 <- dbSendQuery(mycon, "select distinct sessionID from actiontimes where userID = 'ID01'")
sessionU01 <- fetch(rsSessionsU01)
sessionU01[2, ]

# user01 data
mycon <- dbConnect(MySQL(), user = 'root', dbname = 'test', host = 'localhost', password = '')
rsUser01 <- dbSendQuery(mycon, "select a.userID, a.sessionID, a.actionTaken, a.timelineMSEC, a.durationMSEC, b.X, b.Y, b.Rel__dist_, b.Total_dist_ from `actiontimes` as a, `ulogdata` as b where a.originalRECNO = b.RECNO and a.userID = 'ID01'")
user01 <- fetch(rsUser01, n = -1)
user01[1, 1]

# plot loop
for (i in 1:10) {
  userSubset <- subset(user01, sessionID == sessionU01[i, ], select = c(timelineMSEC, X, Y))
  x <- as.numeric(userSubset$X)
  y <- as.numeric(userSubset$Y)
  scatter3d(x, y, userSubset$timeline, xlim = c(0, 1280), ylim = c(0, 1024), zlim = c(0, 180),
            type = "h", main = sessionU01[i, ], sub = sessionU01[i, ])
  tmp6 <- ".ps"
  tmp7 <- paste(sessionU01[i, ], tmp6, sep = "")
  rgl.postscript(tmp7, "ps", drawText = FALSE)
  # pdf(file = tmp7)
  # dev.print(file = tmp7, device = pdf, width = 600)
  # dev.off(2)
}
Re: [R] Adding lines in ggplot2
Dear Eduardo, This is a solution along the lines you seem to want:

n <- 1:10
x <- sqrt(n)
y <- log(n)
qplot(n, x, geom = "line", colour = I("darkgreen")) +
  geom_line(data = data.frame(n, x = y), colour = "red")

But please compare it with the solution (code + result) below. Formatting the data frame might be a bit more work, but formatting your graph is much easier.

n <- 1:10
dataset <- rbind(
  data.frame(Number = n, Function = "sqrt", Result = sqrt(n)),
  data.frame(Number = n, Function = "log", Result = log(n))
)
# Using the default colours
ggplot(dataset, aes(x = Number, y = Result, colour = Function)) + geom_line()
# Using user-specified colours
ggplot(dataset, aes(x = Number, y = Result, colour = Function)) + geom_line() +
  scale_colour_manual(values = c(sqrt = "darkgreen", log = "red"))

Think about the gain when you want to display many more than 2 lines...

dataset <- expand.grid(Number = n, Power = seq(0, 2, length = 21))
dataset$Result <- dataset$Number ^ dataset$Power
ggplot(dataset, aes(x = Number, y = Result, colour = factor(Power))) + geom_line()

HTH, Thierry ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek team Biometrie Kwaliteitszorg Gaverstraat 4 9500 Geraardsbergen Belgium Research Institute for Nature and Forest team Biometrics Quality Assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey ----- Original message ----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Eduardo de Oliveira Horta Sent: Wednesday, 5 January 2011 3:56 To: r-help Subject: [R] Adding lines in ggplot2 Hello, this is probably a recurrent question, but I couldn't find any answers that didn't involve the expression "data frame"... so perhaps I'm looking for something new here. I wanted to find a code equivalent to

x <- sqrt(1:10)
y <- log(1:10)
plot(1:10, x, type = "l", col = "darkgreen")
lines(1:10, y, col = "red")

to use with ggplot2. I've tried

x <- sqrt(1:10)
y <- log(1:10)
qplot(1:10, x, geom = "line", colour = I("darkgreen")) + geom_line(1:10, y, colour = "red")
Error: ggplot2 doesn't know how to deal with data of class numeric

but it seems that the data frame restriction is really very restrictive here. Any solutions that don't imply using as.data.frame on my data? Thanks in advance, and best regards! Eduardo Horta
Re: [R] Adding lines in ggplot2
It was gently suggested to me in a private message that to achieve *complete* control over the inputs and outputs in R graphics one should be using grid graphics. I concur with that suggestion and wish to amend my previous statement accordingly. With kindest thanks, Dennis On Wed, Jan 5, 2011 at 3:02 AM, Dennis Murphy djmu...@gmail.com wrote: Hi Bert: On Tue, Jan 4, 2011 at 8:39 PM, Bert Gunter gunter.ber...@gene.com wrote: Dennis: Can't speak to ggplot2, but your comments regarding lattice are not quite correct. Many if not all of lattice's basic plot functions are generic, which means that one has essentially complete latitude to define plotting methods for arbitrary data structures. For example, there is an xyplot.ts method for time series -- class ts -- data. Of course, for most lattice methods, the data do naturally come in a data frame, and a standard lattice argument is to give a frame from which to pull the data. But this is not required. I'm aware of that, but thank you for clarifying matters. I didn't state explicitly whether lattice required data frame input or not (my lattice example indicated no and indeed it does not), but the message was evidently muddled further down the post. Your comments speak to some of the differences in the design and philosophy of lattice and ggplot2, and I have no disagreement with your remarks about lattice. The point I was trying to make was that by using data frames and the several packages/base functions that support their manipulation, one can simplify the coding of graphics within both ggplot2 and lattice. There are many things one can do with data frames that one cannot with vectors, as you well know - e.g., extensions with new data (rbind) or new variables (cbind/transform, etc.), or reshaping, among others. These features can be used to advantage in both ggplot2 and lattice. The OP's example is a simple one - had he used df <- data.frame(x = sqrt(1:10), y = log(1:10)) # oops, forgot 1:10...
qplot(as.numeric(rownames(df)), x, data = df, geom = 'line', colour = I('darkgreen')) # ...but it's OK # or xyplot(x ~ as.numeric(rownames(df)), data = df, type = 'l', col.line = 'darkgreen') there would have been no problem. A little inconvenient for a new user, maybe, but hardly 'very restrictive'. As for other types of R data objects that are not data frames, offhand I can't think of too many that are incapable of being converted to data frames somehow for the purposes of graphics, although I wouldn't be remotely surprised if some existed. [For example, one can extract fitted values, residuals and perhaps a model matrix from a model object and place the results in a data frame.] ggplot2 has a fortify() method to allow one to transform data objects for use in the package. There is some discussion in Chapter 9 of Hadley's book, but I'm not in a position to add insight as I haven't used it personally. I do think this is a fair statement, though, and it's been said before: if one wants *complete* control and flexibility of inputs and outputs, use base graphics. Both lattice and ggplot2, by virtue of being structured graphics systems, impose certain constraints (e.g., default actions) on the user which are system-dependent. Prof. Vardeman's quote still applies :) Dennis -- Bert Please explain to me how df <- data.frame(x, y, index = 1:10) qplot(index, x, geom = 'line', ...) is 'very restrictive'. Lattice and ggplot2 are *structured* graphics systems - to get the gains that they provide, there are some costs. I don't perceive organization of data into a data frame as being restrictive - in fact, if you learn how to construct data for input into ggplot2 to simplify the code for labeling variables and legends, the data frame requirement is actually a benefit rather than a restriction. Moreover, one can use the plyr and reshape(2) packages to reshape or condense data frames to provide even more flexibility and freedom to produce ggplot2 and lattice graphics.
In addition, the documentation for ggplot2 is quite explicit about requiring data frames for input, so it is behaving as documented. The complexity (and interaction) of the graphics code probably has something to do with that. Since Josh left you a quote, I'll supply another, from Prof. Steve Vardeman in a class I took with him a long time ago: There is no free lunch in statistics: in order to get something, you've got to give something up. In this case, if you want the nice infrastructure provided by ggplot2, you have to create a data frame for input. Dennis Thanks in advance, and best regards! Eduardo Horta
Re: [R] dotchart for matrix data
Readers, The following commands were applied to create a dot chart with black dots and blue squares for the data:

library(lattice)
testdot
  category values
1        b     44
2        c     51
3        d     65
4        a     10
5        b     64
6        c     71
7        d     49
8        a     27
dotplot(category ~ values,
        col = c("black", "black", "black", "black", "blue", "blue", "blue", "blue"),
        bg  = c("black", "black", "black", "black", "blue", "blue", "blue", "blue"),
        pch = c(21, 21, 21, 21, 22, 22, 22, 22),
        xlab = NULL, data = testdot)

The resultant graph shows correctly coloured points, but they are not filled; only the border is coloured. The documentation for 'pch' (?pch) indicates that the commands shown above should produce appropriately coloured solid symbols. What is causing this error, please?
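For what it's worth, in lattice the interior of symbols 21-25 is controlled by the `fill` argument (passed through to panel.xyplot) rather than by base graphics' `bg`, which may explain the hollow symbols. A hedged sketch with made-up data shaped like the post's (untested against the poster's exact setup):

```r
library(lattice)

# Made-up data resembling the post's testdot
testdot <- data.frame(category = rep(c("b", "c", "d", "a"), 2),
                      values   = c(44, 51, 65, 10, 64, 71, 49, 27))

# fill (not bg) sets the interior colour of pch 21-25 in lattice
p <- dotplot(category ~ values, data = testdot,
             pch = 22, col = "blue", fill = "blue")
inherits(p, "trellis")  # the object draws when printed
```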
Re: [R] unique limited to 536870912
Could it be that you are running on a 32-bit version of R? 536870912 * 4 = 2 GB if those were integers, which would use up all of memory. You never did show what your error message was or what system you were using. On Wed, Jan 5, 2011 at 12:08 AM, Indrajeet Singh sin...@cs.ucr.edu wrote: Hi I am using R with igraph to analyze an edgelist that is larger than the said amount. Does anyone know a way around this? Thanks Inder -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve?
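Jim's memory arithmetic, spelled out: the reported limit is exactly 2^29 elements, and even stored as 4-byte integers that alone fills the 2 GB user address space of a typical 32-bit process:

```r
n <- 536870912
stopifnot(n == 2^29)  # the reported limit is exactly 2^29 elements
n * 4 / 2^30          # GB needed just for 2^29 4-byte integers: 2
```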
Re: [R] How to save graphs out of ACF ?
Mihai.Mirauta at bafin.de writes: Hi, I want to save the autocorrelation plots resulting from acf (acf(ts)), not just by using the "Save as" command in the R GUI but using some sort of code which allows me to choose the format and the path. Thank you, Mihai For example:

a <- acf(runif(10))
pdf("acf.pdf")
plot(a)
dev.off()
Re: [R] What are the necessary Oracle software to install and run ROracle ?
On Jan 5, 2011, at 2:55 AM, thomas.car...@bnpparibas.com wrote: Hello, I am running Linux. I have downloaded instantclient-basiclite-linux32-11.2.0.2.0.zip instantclient-sqlplus-linux32-11.2.0.2.0.zip instantclient-sdk-linux32-11.2.0.2.0.zip instantclient-precomp-linux32-11.2.0.2.0.zip All these tarballs are unzipped in /usr/local/lib/instantclient, and I have added this path to the library path of the host. I can run sqlplus and proc; they do not complain about missing symbols. Then I install ROracle: install.packages("ROracle") The compilation step is OK, but when the test step tries to load the ROracle.so library, it fails: ** testing if installed package can be loaded Error in dyn.load(file, DLLpath = DLLpath, ...) : unable to load shared library '/opt/R-2.11.1/lib/R/library/ROracle/libs/ROracle.so': /opt/R-2.11.1/lib/R/library/ROracle/libs/ROracle.so: undefined symbol: sqlprc Here is my list of libs in the instantclient directory: $ find -name *.*o -o -name *.a ./libsqlplusic.so ./sdk/demo/procobdemo.pco ./cobsqlintf.o ./libociicus.so ./libnnz11.so ./libocijdbc11.so ./libsqlplus.so Do I need some more libs? From which Oracle tarball? Thanks for the help. If you have not, read through the INSTALL file for the package: http://cran.r-project.org/web/packages/ROracle/INSTALL Past postings with similar issues regarding the inability to load shared libs would suggest that compiling and installing the package outside of R from the CLI using 'R CMD INSTALL ...' rather than from within R using install.packages("ROracle"), may resolve the issue. Also, be sure you are running all of this as root, since installation to default locations will require root privileges. Two more things to consider: 1. R 2.12.1 is the current version of R. If you can, I would recommend updating from 2.11.1. 2. Be sure that you don't have a conflict between 32 and 64 bit versions of R and the Oracle tool chain. All components need to be one or the other.
You seem to be using 32 bit versions of the Oracle components above. Check: .Machine$sizeof.pointer in R to see if you are running 32 or 64 bit R. If the former, the above will return 4, if the latter, 8. Another alternative would be to consider using Prof. Ripley's RODBC package and connecting to Oracle via ODBC. If you need further assistance, I would suggest subscribing and posting to r-sig-db or contacting the package author directly. More info on the list is here: https://stat.ethz.ch/mailman/listinfo/r-sig-db HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R not recognized in command line
Hi Aaditya, I assume you are running some variant of Windows and by the prompt in DOS you mean cmd.exe. Perhaps you are already, but from your examples it looks like either A) you are not in the same directory as R or B) you are not adding the path to R in the command. For example, on Windows I always install R under C:\R\ so for me inside cmd.exe: C:\directory> C:\R\R-devel\bin\x64\R [[[R starts here]]] Alternately, you could switch directories over and then just type R at the console: C:\directory> cd C:\R\R-devel\bin\x64\ C:\R\R-devel\bin\x64> R [[[R starts here]]] or, since you have set the environment variables: C:\directory> %R_HOME%\bin\x64\R [[[R starts here]]] Alternately, edit the PATH environment variable in Windows and add the path to R (i.e., R_HOME\bin\i386\ or whatever it is for you), and you should be able to just enter R at the command prompt and have it start. Cheers, Josh On Tue, Jan 4, 2011 at 9:39 PM, Aaditya Nanduri aaditya.nand...@gmail.com wrote: Hello all, I recently installed rpy2 so that I could use R through Python. However, R was not recognized in the command line, so I decided to add it to the PATH variables. But it just doesn't work. And what I mean by "it doesn't work" is: no matter what I type at the prompt in DOS -- be it R, Rcmd, R CMD, Rscript -- it is not recognized as a command. Path variables used: 1. %R_HOME% -- C:\Program Files\R\R 2.12.1\ 2. %R_HOME%\bin 3. %R_HOME%\bin\i386 4. Some batch scripts I found online that recognize the R.exe in \bin\i386, but only if I run the batch file; it's not natively recognized (if I were to type 'R' at the prompt in DOS, it's not recognized). I would appreciate any help in this matter. Or should I do something else so that I can try rpy2?
Python version 2.6.6 R 2.12.1 rpy2 2.0.8 -- Aaditya Nanduri aaditya.nand...@gmail.com -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
Re: [R] packagename:::functionname vs. importFrom
Thanks very much Luke for clarifying. Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/packagename-functionname-vs-importFrom-tp3172684p3175567.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Stop and call objects
Dear R-users, Let's consider the following snippet: f - function(x) tryCatch(sum(x),error=function(e) stop(e)) f('a') As expected, the last call returns an error message: Error in sum(x) : invalid 'type' (character) of argument My questions are the following: 1- can I easily ask the stop function to reference the f function in addition to sum(x) in the error message? 2- If not, I guess I would have to extract the call and message objects from e, coerce the call as a character object, build a custom string, and pass it to the stop function using call.=F. How can I coerce a call object to a character and maintain the aspect of the printed call (i.e. sum(x) instead of the character vector sum x returned by as.character(e$call))? Thank you Sebastien __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
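Both questions can be handled with conditionCall() and conditionMessage(): deparse() renders the stored call exactly as it prints (so "sum(x)", not the c("sum", "x") that as.character() gives). A sketch, not necessarily the only approach:

```r
# Reference the enclosing function and the failing call in one message
f <- function(x) {
  tryCatch(sum(x), error = function(e) {
    # deparse(conditionCall(e)) yields "sum(x)" as printed, not c("sum", "x")
    stop(sprintf("in f(): error in %s: %s",
                 deparse(conditionCall(e)), conditionMessage(e)),
         call. = FALSE)
  })
}

res <- tryCatch(f("a"), error = conditionMessage)
res
# -> "in f(): error in sum(x): invalid 'type' (character) of argument"
```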
Re: [R] Adding lines in ggplot2
I thank you all for the insightful answers. I'm quite a rookie in R and have built a code that didn't take data frames into account. But I suppose I'm now convinced that they're actually a practical structure for organizing the data... so I'll adhere to the Data Frame Club soon enough. Best regards, Eduardo Horta On Wed, Jan 5, 2011 at 11:01 AM, Dennis Murphy djmu...@gmail.com wrote: It was gently suggested to me in a private message that to achieve *complete* control over the inputs and outputs in R graphics one should be using grid graphics. I concur with that suggestion and wish to amend my previous statement accordingly. With kindest thanks, Dennis On Wed, Jan 5, 2011 at 3:02 AM, Dennis Murphy djmu...@gmail.com wrote: Hi Bert: On Tue, Jan 4, 2011 at 8:39 PM, Bert Gunter gunter.ber...@gene.com wrote: Dennis: Can't speak to ggplot2, but your comments regarding lattice are not quite correct. Many if not all of lattice's basic plot functions are generic, which means that one has essentially complete latitude to define plotting methods for arbitrary data structures. For example, there is an xyplot.ts method for time series -- class ts -- data. Of course, for most lattice methods, the data do naturally come in a data frame, and a standard lattice argument is to give a frame from which to pull the data. But this is not required. I'm aware of that, but thank you for clarifying matters. I didn't state explicitly whether lattice required data frame input or not (my lattice example indicated no and indeed it does not), but the message was evidently muddled further down the post. Your comments speak to some of the differences in the design and philosophy of lattice and ggplot2, and I have no disagreement with your remarks about lattice. The point I was trying to make was that by using data frames and the several packages/base functions that support their manipulation, one can simplify the coding of graphics within both ggplot2 and lattice. 
There are many things one can do with data frames that one cannot with vectors, as you well know - e.g., extensions with new data (rbind) or new variables (cbind/transform, etc.), or reshaping, among others. These features can be used to advantage in both ggplot2 and lattice. The OP's example is a simple one - had he used df - data.frame(x = sqrt(1:10), y = log(1:10)) # oops, forgot 1:10... qplot(as.numeric(rownames(df)), x, data = df, geom = 'line', colour = I('darkgreen')) # ...but it's OK # or xyplot(x ~ as.numeric(rownames(x)), data = df, type = 'l', col.line = 'darkgreen') there would have been no problem. A little inconvenient for a new user, maybe, but hardly 'very restrictive'. As for other types of R data objects that are not data frames, offhand I can't think of too many that are incapable of being converted to data frames somehow for the purposes of graphics, although I wouldn't be remotely surprised if some existed. [For example, one can extract fitted values, residuals and perhaps a model matrix from a model object and place the results in a data frame.] ggplot2 has a fortify() method to allow one to transform data objects for use in the package. There is some discussion in Chapter 9 of Hadley's book, but I'm not in a position to add insight as I haven't used it personally. I do think this is a fair statement, though, and it's been said before: if one wants *complete* control and flexibility of inputs and outputs, use base graphics. Both lattice and ggplot2, by virtue of being structured graphics systems, impose certain constraints (e.g., default actions) on the user which are system-dependent. Prof. Vardeman's quote still applies :) Dennis -- Bert Please explain to me how df - data.frame(x, y, index = 1:10) qplot(index, x, geom = 'line', ...) is 'very restrictive'. Lattice and ggplot2 are *structured* graphics systems - to get the gains that they provide, there are some costs. 
I don't perceive organization of data into a data frame as being restrictive - in fact, if you learn how to construct data for input into ggplot2 to simplify the code for labeling variables and legends, the data frame requirement is actually a benefit rather than a restriction. Moreover, one can use the plyr and reshape(2) packages to reshape or condense data frames to provide even more flexibility and freedom to produce ggplot2 and lattice graphics. In addition, the documentation for ggplot2 is quite explicit about requiring data frames for input, so it is behaving as documented. The complexity (and interaction) of the graphics code probably has something to do with that. Since Josh left you a quote, I'll supply another, from Prof. Steve Vardeman in a class I took with him a long time ago: There is no free lunch in statistics: in order to get
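The reshape/condense workflow Dennis mentions can be illustrated with a small sketch (made-up data; assumes the reshape2 package is installed):

```r
library(reshape2)
library(lattice)

# wide data: one column per series
df <- data.frame(index = 1:10, sqrt = sqrt(1:10), log = log(1:10))

# melt to long form so one grouping variable drives colours and the key
long <- melt(df, id.vars = "index",
             variable.name = "fn", value.name = "y")

xyplot(y ~ index, groups = fn, data = long, type = "l",
       auto.key = list(lines = TRUE, points = FALSE))
```

With the data in long form, adding a third series is a new set of rows rather than new plotting code, which is the benefit being argued for in the thread.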
Re: [R] R command execution from shell
Thank you for this alternative. Both seem to work on my systems. Sebastien Prof Brian Ripley wrote: On Tue, 4 Jan 2011, Duncan Murdoch wrote: On 04/01/2011 3:21 PM, Sebastien Bihorel wrote: Dear R-users, Is there a way I can ask R to execute the write("hello world", file="hello.txt") command directly from the UNIX shell, instead of having to save this command to a .R file and execute this file with R CMD BATCH? Yes. Some versions of R support the -e option on the command line to execute a particular command. It's not always easy to work out the escapes so your shell passes all the quotes through... An alternative is to echo the command into the shell, e.g. echo 'cat("hello")' | R --slave (where the outer ' ' are just for bash). It is marginally preferable to use Rscript in place of 'R --slave'. I think in all known shells Rscript -e "write('hello world', file = 'hello.txt')" will work. (If not, shQuote() will not work for that shell, but this does work in sh+clones, csh+clones, zsh and Windows' cmd.exe.) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] dotchart for matrix data
On Jan 5, 2011, at 8:11 AM, e-letter wrote: Readers, The following commands were applied, to create a dot chart with black dots and blue squares for data: library(lattice) testdot category values 1 b 44 2 c 51 3 d 65 4 a 10 5 b 64 6 c 71 7 d 49 8 a 27 dotplot(category ~ values, col = c("black", "black", "black", "black", "blue", "blue", "blue", "blue"), bg = c("black", "black", "black", "black", "blue", "blue", "blue", "blue"), pch = c(21, 21, 21, 21, 22, 22, 22, 22), xlab = NULL, data = testdot) The resultant graph shows correctly coloured points, but not filled, only the border is coloured. The documentation for the command 'pch' (?pch) indicates that the commands shown above should show appropriately coloured solid symbols. What is causing this error please? There is no pch command. It is a graphical parameter. If you are looking at the points help page then you are not looking at documentation that necessarily applies to a lattice function like dotplot. After first looking at ?dotplot, then ?panel.dotplot, and then because it says the points are done with panel.xyplot, my guess is that you need to add a fill =TRUE or a fill= color-vector option. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
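David's guess can be written out as a sketch (untested, as he says; it assumes panel.dotplot forwards a fill vector to panel.xyplot, and reconstructs the data from the printout above):

```r
library(lattice)

testdot <- data.frame(category = c("b", "c", "d", "a", "b", "c", "d", "a"),
                      values   = c(44, 51, 65, 10, 64, 71, 49, 27))

# pch 21-25 are the filled symbols: 'col' sets the border, 'fill' the interior
# (base graphics use 'bg' for the interior, which is the source of the confusion)
dotplot(category ~ values, data = testdot,
        pch  = c(21, 21, 21, 21, 22, 22, 22, 22),
        col  = rep(c("black", "blue"), each = 4),
        fill = rep(c("black", "blue"), each = 4),
        xlab = NULL)
```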
Re: [R] randomForest speed improvements
Note that that isn't exactly what I recommended. If you look at the example in the help page for combine(), you'll see that it is combining RF objects trained on the same data; i.e., instead of having one RF with 500 trees, you can combine five RFs trained on the same data with 100 trees each into one 500-tree RF. The way you are using combine() is basically using sample size to limit tree size, which you can do by playing with the nodesize argument in randomForest() as I suggested previously. Either way is fine as long as you don't see prediction performance degrading. Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of apresley Sent: Tuesday, January 04, 2011 6:30 PM To: r-help@r-project.org Subject: Re: [R] randomForest speed improvements Andy, Thanks for the reply. I had no idea I could combine them back ... that actually will work pretty well. We can have several worker threads load up the RF's on different machines and/or cores, and then re-assemble them. RMPI might be an option down the road, but would be a bit of overhead for us now. Using the method of combine() ... I was able to drastically reduce the amount of time to build randomForest objects. IE, using about 25,000 rows (6 columns), it takes maybe 5 minutes on my laptop. Using 5 randomForest objects (each with 5k rows), and then combining them, takes 1 minute. -- Anthony -- View this message in context: http://r.789695.n4.nabble.com/randomForest-speed-improvements- tp3172523p3174621.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
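Andy's two alternatives can be sketched on toy data (the combine() usage follows the pattern he points to on its help page; sizes are arbitrary):

```r
library(randomForest)
set.seed(1)
d <- data.frame(y  = factor(rbinom(1000, 1, 0.5)),
                x1 = rnorm(1000), x2 = rnorm(1000))

# Andy's recommendation: five 100-tree forests on the SAME data,
# merged into the equivalent of one 500-tree forest
rfs <- lapply(1:5, function(i) randomForest(y ~ ., data = d, ntree = 100))
rf  <- do.call(combine, rfs)   # rf$ntree is now 500

# the alternative he mentions: limit tree size directly via nodesize
# instead of shrinking the training sample
rf2 <- randomForest(y ~ ., data = d, ntree = 500, nodesize = 50)
```

The lapply() step is what Anthony's worker threads would do in parallel on separate machines or cores before the results are reassembled with combine().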
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] get() within a command, specifically lmer
Hello all. Why doesn't this work? d=data.frame(y=rpois(10,1),x=rnorm(10),z=rnorm(10),grp=rep(c('a','b'),each=5)) library(lme4) model=lmer(y~x+z+(1|grp),family=poisson,data=d) update(model,~.-z)###works, removes z var='z' update(model,~.-get(var))##doesn't remove z update(model,~. -get(var,pos=d))###doesn't remove z I am trying to remove z from the model in the update, but I can't do it using get(), which is what I would like to do for a more complicated program. There's something about environments and get() that I don't understand. Any suggestions? Thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R not recognized in command line
On 11-01-05 8:51 AM, Joshua Wiley wrote: Hi Aaditya, I assume you are running some variant of Windows and by the prompt in DOS you are using cmd.exe. Perhaps you are already, but from your examples it looks like either A) you are not in the same directory as R or B) are not adding the path to R in the command. For example, on Windows I always install R under C:\R\ so for me inside cmd.exe: C:\directory C:\R\R-devel\bin\x64\R [[[R starts here]]] alternately you could switch directories over and then just type R at the console: C:\directory cd C:\R\R-devel\bin\x64\ C:\R\R-devel\bin\x64 R [[[R starts here]]] or since you have set the environment variables: C:\directory %R_HOME%\bin\x64\R [[[R starts here]]] Alternately, edit the PATH environment variable in Windows and add the path to R (i.e., R_HOME\bin\i386\ or whatever it is for you), and you should be able to just enter R at the command prompt and have it start. Editing the PATH is probably the best approach, but a lot of people get it wrong because of misunderstanding how it works: - If you change PATH in one process the changes won't propagate anywhere else, and will be lost as soon as you close that process. That could be a cmd window, or an R session, or just about any other process that lets you change environment variables. - If you want to make global changes to the PATH, you need to do it in the control panel System|Advanced|Environment variables entries. - Often it is good enough to use a more Unix-like approach, and only make the change at startup of the cmd processor. You use the /k option when starting cmd if you want to run something on startup. Duncan Murdoch Cheers, Josh On Tue, Jan 4, 2011 at 9:39 PM, Aaditya Nanduri aaditya.nand...@gmail.com wrote: Hello all, I recently installed rpy2 so that I could use R through Python. However, R was not recognized in the command line. So I decided to add it to the PATH variables. 
But it just doesn't work. And what I mean by "it doesn't work" is: no matter what I type at the prompt in DOS - be it R, Rcmd, R CMD, Rscript - it is not recognized as a command. Path variables used : 1. %R_HOME% -- C:\Program Files\R\R 2.12.1\ 2. %R_HOME%\bin 3. %R_HOME%\bin\i386 4. Some batch scripts I found online that recognize the R.exe in \bin\i386, but only if I run the batch file... it's not natively recognized (if I were to type 'R' at the prompt in DOS, it's not recognized) I would appreciate any help in this matter. Or should I do something else so that I can try rpy2? Python version 2.6.6 R 2.12.1 rpy2 2.0.8 -- Aaditya Nanduri aaditya.nand...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
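For reference, a minimal sketch of the persistent-PATH route Duncan describes, done from cmd.exe rather than the control panel (the install directory below is an assumption based on the poster's version; setx is available on Windows 7):

```
:: append R's binary directory to the user-level PATH; setx writes it to the
:: registry, so it takes effect in NEW cmd.exe windows, not the current one
setx PATH "%PATH%;C:\Program Files\R\R-2.12.1\bin\i386"

:: then, from a freshly opened prompt:
R --version
```

One caveat with this sketch: %PATH% expands to the combined machine+user value, so setx stores that whole string under the user scope; the control-panel route Duncan recommends avoids that duplication.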
[R] Simulation - Natrual Selection
Hi, I've been modelling some data over the past few days from my work, repeatedly challenging microbes to a certain concentration of cleaner until the required concentration to inhibit or kill them increases, at which point they are challenged to a slightly higher concentration each day. I'm doing this for two different cleaners, and I'm collecting the required concentration to kill them as a percentage, the challenge number, the cleaner as a two-level variable, and the lineage they're in, because I have several different lineages. I'm expecting the values to rise for one cleaner but not the other, as they acquire resistance to one but not the other. Which has happened, but I have wide variation because one lineage acquired a very dramatic change which has made it immune to 50%, whereas the others have exhibited a much more gradual increase, and so I have very weak p-values for the cleaner variable, because it is secondary to the challenge vector, which has the most explanatory power, because without time and these challenges the selection would not happen. I was using two bacterial species, but one was keen on giving highly erratic results and insisted on becoming cross-contaminated, BUT if I include its data, it shoves cleaner over the p = 0.05 threshold, so I may just be having a problem with lack of data. So I've been asking about bootstrapping, which I plan to do to my cases, and then fit a model to see what the confidence is like then. I assume if I bootstrap then it will re-select whole cases, and not jumble everything up; otherwise a microbe (to take the most extreme value as an example) with 50% concentration tolerance at the beginning would make no sense at all. I'm also planning on doing models lineage by lineage, rather than putting them into one whole, just to have a look at what happens. 
But what I really wanted to know from this email was if there's a package or function for natural selection simulation I could make use of, to see if I can simulate the experiment. I want to start with a distribution of concentration tolerance values, taken from the inhibitory concentration values from my first lot of microbes, back when term began. Draw 3000 from this. Then values in that draw that fall below the exposure concentration I used in my experiment are removed, or have a high chance of being removed. Then, from what is left, a draw is made again - or perhaps a copy operation (rather than a random draw) - until I have 3000 again. Rather than have all exactly the same concentration, a value can then be added to some of them that increases their concentration tolerance slightly, but not by a great deal, except in a few individuals, where it may be increased dramatically (some sort of exponential distribution perhaps). Then when the distribution of this simulated population of microbes has reached the next concentration (possibly the mean or mode of the distribution) (I have a series of 1 in 2 dilutions, so 100%, 50%, 25% and so on), then they move on to the next concentration. I know it's probably quite a heavy thing, it was just a thought that came to me; if anybody has any experience in this area of R or knows of something that allows this to be done, please let me know. Thanks, Ben. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
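No canned package is identified in this thread, but the scheme described above - draw 3000, kill below the dose, regrow from survivors, mutate, escalate the dose - is easy to prototype directly. A rough sketch; the starting tolerances, mutation rate, and jump sizes are invented placeholders standing in for the measured values:

```r
set.seed(42)
# starting tolerances (% concentration) -- placeholders for the inhibitory
# concentrations measured from the first batch of microbes
init <- c(3.125, 6.25, 6.25, 12.5, 12.5, 25, 25, 50)
pop  <- init[sample.int(length(init), 3000, replace = TRUE)]

one_challenge <- function(pop, dose, n = 3000,
                          small_sd = 0.5, p_big = 0.01, big_mult = 2) {
  survivors <- pop[pop >= dose]          # killed if tolerance is below the dose
  if (length(survivors) == 0) return(NULL)
  # regrow to n by resampling survivors (sample.int avoids sample()'s
  # scalar trap when only one survivor remains)
  pop <- survivors[sample.int(length(survivors), n, replace = TRUE)]
  pop <- pop + abs(rnorm(n, 0, small_sd))  # small tolerance drift for everyone
  big <- runif(n) < p_big                  # rare large jumps
  pop[big] <- pop[big] * big_mult
  pop
}

dose <- 3.125
for (challenge in 1:20) {
  pop <- one_challenge(pop, dose)
  if (is.null(pop)) break                  # culture wiped out
  if (median(pop) >= 2 * dose) dose <- 2 * dose  # escalate up the 1-in-2 series
}
```

Tracking median(pop) and dose per challenge would give simulated trajectories to compare against the observed concentration-versus-challenge curves.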
[R] rShowMessage Fatal error: unable to open the base package
Hi All, As you may know I am trying connect R with java by RJava, now I run the examples, I got this error rShowMessage Fatal error: unable to open the base package I am using 64bits windows 7 and eclipse. Any suggestions? Many thanks Ying [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R not recognized in command line
On Wed, Jan 5, 2011 at 10:41 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 11-01-05 8:51 AM, Joshua Wiley wrote: Hi Aaditya, I assume you are running some variant of Windows and by the prompt in DOS you are using cmd.exe. Perhaps you are already, but from your examples it looks like either A) you are not in the same directory as R or B) are not adding the path to R in the command. For example, on Windows I always install R under C:\R\ so for me inside cmd.exe: C:\directory C:\R\R-devel\bin\x64\R [[[R starts here]]] alternately you could switch directories over and then just type R at the console: C:\directory cd C:\R\R-devel\bin\x64\ C:\R\R-devel\bin\x64 R [[[R starts here]]] or since you have set the environment variables: C:\directory %R_HOME%\bin\x64\R [[[R starts here]]] Alternately, edit the PATH environment variable in Windows and add the path to R (i.e., R_HOME\bin\i386\ or whatever it is for you), and you should be able to just enter R at the command prompt and have it start. Editing the PATH is probably the best approach, but a lot of people get it wrong because of misunderstanding how it works: - If you change PATH in one process the changes won't propagate anywhere else, and will be lost as soon as you close that process. That could be a cmd window, or an R session, or just about any other process that lets you change environment variables. - If you want to make global changes to the PATH, you need to do it in the control panel System|Advanced|Environment variables entries. - Often it is good enough to use a more Unix-like approach, and only make the change at startup of the cmd processor. You use the /k option when starting cmd if you want to run something on startup. You can also use Rcmd.bat, R.bat, Rgui.bat, etc. found at http://batchfiles.googlecode.com Just put any you wish to use anywhere on your path and it will work on all cmd instances and will also work when you install a new version of R since it looks up R's location in the registry. 
-- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R not recognized in command line
On Wed, Jan 5, 2011 at 7:41 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote: Editing the PATH is probably the best approach, but a lot of people get it wrong because of misunderstanding how it works: - If you change PATH in one process the changes won't propagate anywhere else, and will be lost as soon as you close that process. That could be a cmd window, or an R session, or just about any other process that lets you change environment variables. - If you want to make global changes to the PATH, you need to do it in the control panel System|Advanced|Environment variables entries. Note it is also possible to make global changes using PowerShell by setting the target scope to Machine. [Environment]::SetEnvironmentVariable("TestVariable", "Test value.", "Machine") Josh - Often it is good enough to use a more Unix-like approach, and only make the change at startup of the cmd processor. You use the /k option when starting cmd if you want to run something on startup. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] OT: Reprinting of Bertin's Semiology of Graphics
Aficionados of graphics may be interested to know that the English translation (1984) of Jacques Bertin's Semiology of Graphics has been reprinted by ESRI. http://www.amazon.com/Semiology-Graphics-Diagrams-Networks-Maps/dp/0299090604 new edition: http://www.amazon.com/Semiology-Graphics-Diagrams-Networks-Maps/dp/1589482611/ref=tmm_hrd_title_0 The long out-of-print 1984 edition sells for $380, but the new printing is a bargain at ~$49. It is all the more remarkable in that most of the diagrams and graphs were drawn by hand, yet show a palette of graphical techniques richer than our graphical software provides even today. best, -Michael -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele StreetWeb: http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R(D) Com under R1070
Can you please quote what you are referring to? The subject seems to refer to an R version R-1.7.0 which is for almost a decade outdated. Uwe Ligges On 05.01.2011 08:31, Henri Leblond wrote: I get the same trouble Please finally did you succeed fixing this trouble ? Henri __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Simulation - Natrual Selection
Date: Wed, 5 Jan 2011 15:48:46 + From: benjamin.w...@bathspa.org To: r-help@r-project.org Subject: [R] Simulation - Natrual Selection Hi, I've been modelling some data over the past few days, of my work, repeatedly challenging microbes to a certain concentration of cleaner, until the required concentration to inhibit or kill them increaces, at which point they are challenged to a slightly higher concentration each day. I'm doing ths for two different cleaners and I'm collecting the required concentration to kill them as a percentage, the challenge number, the cleaner as a two level variable, and the lineage theyre in, because I have several different lineages. I'm expecting the values to rise for one cleaner but not the other as they aqquire resistance for one but not the other. Which has happened, but I have wide variation because one linage aqquired a very dramatic change which has made it immune to 50%, whereas the others, have exhibited a much more gradual increace, and so I have very weak p values for the cleaner variable, because it is secondary to the challenge vector, which has the most explanatory power, because without time and these challenges, the selection would no happen. I was using two bacterium species, but one was keen on giving hight erratic results, and insisted on becoming cross contaminated, BUT if I include it's data, It shoves cleaner over the p0.05 threshold, so i may just be having a problem with lack of data. So I've been asking about bootstrapping, which I plan to do to my cases, and thenfit a model to see what the confidence is like then. I assume if I bootstrap then it will re-select whole cases, and not jumble everything up, otherwise a microbe (totake the most extreme value as an example) with 50% concentration tolerance at the beginning, would make no sense at all. I'm also planning on doing models lineage by lineage, rather than putting them into one whole, just to have a look at what happens. 
You can't really have a p-value without a specific hypothesis to test; if you have that, then all your other questions are probably easy to answer. Generally you want to sample from things that are iid or maybe you want to test the identical i. Generally you want to have done a lit search ahead of time and had some idea of likely evolution dynamics of your system given your design and things like your forcing functions etc. Most statisticians would not take seriously a posteriori designs and indeed it can be hard to avoid rationalization and selection bias ( problems that always and only affect people who disagree with me LOL) as being anything other than exploratory or hypothesis generating- you are looking for predictive value. While it is not always worthwhile doing blind tests, it may be something worth considering ( do you know which group gets what thing?) But what I really wanted to know from this email, was if there's a package or function for natrual selection simulation I could make use of, to see if I can simulate the experiment. I want to start with a http://www.google.com/#sclient=psy&hl=en&q=%22R+package%22+natural+selection but as implied above, R has lots of analysis stuff and maybe you would find something more useful that is not linked to the keywords you suggest. You may find, for whatever reason, you could write a differential equation to express your results but that isn't often used with natural selection. distribution of concentration tolerance values, taken from the inhibitory concentration values from my first lot of microbes, back when term began. Draw 3000 from this. Then values in that draw that fall below the exposure concentration I did in my experiment, are removed, or have a high chance of being removed. 
Then, from what is left, a draw is made again - or perhaps a copy operation (rather than a random draw) until I have 3000 again, rather than have all exactly the same concentration, then a value can be added to some of them, that increaces their concentration tolerance slightly, but not by a great deal, except in a few individuals, where it may be increaced dramatically(some sort of exponential dstribution perhaps). Then when the distribution of this simulated population of microbes has reached the next concentration (possibly the mean or mode of the distribution) (I have a series of 1 in 2 dilutions, so 100% 50%, 25% and so on), then they move on to the next concentration. I know it's probably quite a heavy thing, it was just a thought that came to me, if anybody has any experience in this area of R or knows of something that allows this to be done, please let me know. Thanks, Ben. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cost-benefit/value for money analysis
On Wed, Jan 5, 2011 at 12:29 PM, Graham Smith myotis...@gmail.com wrote: maximal choices would break the budget. This sounds like a homework problem and I don't see any student effort yet. Search terms include: decision analysis , cost-benefit analysis, or utility theory. Hopefully, my response to Ben will clarify my question, and why I am asking it. At the moment (and that may change) I'm not specifically interested in how you do it R, just as to whether there is a package aimed at this kind of Cost Benefit analysis. Try this: require(sos) findFn('cost benefit') found 12 matches Regards Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Simulation - Natrual Selection
On 05/01/2011 16:37, Mike Marchywka wrote: Date: Wed, 5 Jan 2011 15:48:46 + From: benjamin.w...@bathspa.org To: r-help@r-project.org Subject: [R] Simulation - Natrual Selection Hi, I've been modelling some data over the past few days, of my work, repeatedly challenging microbes to a certain concentration of cleaner, until the required concentration to inhibit or kill them increaces, at which point they are challenged to a slightly higher concentration each day. I'm doing ths for two different cleaners and I'm collecting the required concentration to kill them as a percentage, the challenge number, the cleaner as a two level variable, and the lineage theyre in, because I have several different lineages. I'm expecting the values to rise for one cleaner but not the other as they aqquire resistance for one but not the other. Which has happened, but I have wide variation because one linage aqquired a very dramatic change which has made it immune to 50%, whereas the others, have exhibited a much more gradual increace, and so I have very weak p values for the cleaner variable, because it is secondary to the challenge vector, which has the most explanatory power, because without time and these challenges, the selection would no happen. I was using two bacterium species, but one was keen on giving hight erratic results, and insisted on becoming cross contaminated, BUT if I include it's data, It shoves cleaner over the p0.05 threshold, so i may just be having a problem with lack of data. So I've been asking about bootstrapping, which I plan to do to my cases, and thenfit a model to see what the confidence is like then. I assume if I bootstrap then it will re-select whole cases, and not jumble everything up, otherwise a microbe (totake the most extreme value as an example) with 50% concentration tolerance at the beginning, would make no sense at all. 
I'm also planning on doing models lineage by lineage, rather than putting them into one whole, just to have a look at what happens. You can't really have a p-value without a specific hypothesis to test, if you have that then all your other questions are probably easy to answer. Generally you want to sample from things that are iid or maybe you want to test the identical i. My hypothesis is that Cleaner A (I don't really want to go into names or brands) will exhibit a rise in concentration tolerance values - or rather, the microbial culture I keep exposed to it will - reflecting acquisition of antimicrobial resistance. And this has largely happened. And that in cleaner B, this will not happen, or if it does, it will not be as dramatic and will take longer. So I'm expecting, in my model, the cleaner variable to have a p below 0.05, quite high explanatory power, and a satisfying coefficient. The notion behind the hypothesis being that one might have a more difficult complex chemical structure, requiring more mutations to develop some resistance. I can't really do anything with genes or chemical structure at my current institution and at my level because of no equipment for that sort of thing, and that they felt it would be too far for a 3rd year project. So I'm using the concentration required to kill them - or stop them from growing - as an indication. Generally you want to have done a lit search ahead of time and had some idea of likely evolution dynamics of your system given your design and things like your forcing functions etc. Most statisticians would not take seriously a posteriori designs and indeed it can be hard to avoid rationalization and selection bias ( problems that always and only effect people who disagree with me LOL) as being anything other than exploratory or hypothesis generating- you are looking for predictive value. While it is not always worthwhile doing blind tests, it may be something worth considering ( do you know which group gets what thing?) 
But what I really wanted to know from this email, was if there's a package or function for natrual selection simulation I could make use of, to see if I can simulate the experiment. I want to start with a http://www.google.com/#sclient=psyhl=enq=%22R+package%22+natural+selection but as implied above, R has lots of analysis stuff and maybe you would find something more useful that is not linked to the keywords you suggest. You may find, for whatever reason, you could write a differential equation to express your results but that isn't often used with natural selection. distribution of concentration tolerance values, taken from th e inhibitory concentration values from my first lot of microbes, back when term began. Draw 3000 from this. Then values in that draw that fall below the exposure concentration I did in my experiment, are removed, or have a high chance of being removed. Then, from what is left, a draw is made again - or perhaps a copy operation (rather than a random draw) until I have 3000 again, rather than have all exactly the same concentration, then a value
[R] integration Sweave and TexMakerX
Hi, Does anyone know how to integrate texmakerx and sweave on Windows? I mean, to run .rnw files directly from texmakerx and get a pdf or dvi file. Thank you in advance, -- Sebastián Daza sebastian.d...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Stop and call objects
Try this: f <- function(x) tryCatch(sum(x), error = function(e) sprintf("Error in %s: %s", deparse(sys.call(1)), e$message)) f('a') On Wed, Jan 5, 2011 at 12:23 PM, Sebastien Bihorel sebastien.biho...@cognigencorp.com wrote: Dear R-users, Let's consider the following snippet: f <- function(x) tryCatch(sum(x), error = function(e) stop(e)) f('a') As expected, the last call returns an error message: Error in sum(x) : invalid 'type' (character) of argument My questions are the following: 1- can I easily ask the stop function to reference the f function in addition to sum(x) in the error message? 2- If not, I guess I would have to extract the call and message objects from e, coerce the call as a character object, build a custom string, and pass it to the stop function using call.=F. How can I coerce a call object to a character and maintain the aspect of the printed call (i.e. sum(x) instead of the character vector c("sum", "x") returned by as.character(e$call))? Thank you Sebastien __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] get() within a command, specifically lmer
Formula syntax is different from regular syntax: a formula is quoted and is not evaluated in the same way as regular commands (otherwise operations like '+' and '-' would do very different things). For what you are trying to do, I would suggest creating the formula as a string using paste or sprintf, then using as.formula on that string. You can also use the substitute function, but that tends to be a bit more complicated. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Patrick McKann Sent: Wednesday, January 05, 2011 8:26 AM To: r-help@r-project.org Subject: [R] get() within a command, specifically lmer Hello all. Why doesn't this work?

d = data.frame(y = rpois(10, 1), x = rnorm(10), z = rnorm(10), grp = rep(c('a', 'b'), each = 5))
library(lme4)
model = lmer(y ~ x + z + (1|grp), family = poisson, data = d)
update(model, ~ . - z)                  ### works, removes z
var = 'z'
update(model, ~ . - get(var))           ## doesn't remove z
update(model, ~ . - get(var, pos = d))  ### doesn't remove z

I am trying to remove z from the model in the update, but I can't do it using get(), which is what I would like to do for a more complicated program. There's something about environments and get() that I don't understand. Any suggestions? Thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
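A minimal sketch of the paste()/as.formula() route Greg suggests; the variable names follow the poster's example, and the lmer model is assumed to be the one fitted above:

```r
var <- "z"
# Build the update formula as text, then convert it; 'var' is evaluated
# here, before the formula is quoted, which is why this works.
f <- as.formula(paste(". ~ . -", var))   # i.e. . ~ . - z
# update(model, f)                       # drops z from the fitted model
```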
[R] Fwd: Re: Simulation - Natural Selection
Original Message Subject: Re: [R] Simulation - Natural Selection Date: Wed, 05 Jan 2011 17:24:05 + From: Ben Ward benjamin.w...@bathspa.org To: Bert Gunter gunter.ber...@gene.com CC: Mike Marchywka marchy...@hotmail.com On 05/01/2011 17:08, Bert Gunter wrote: Couple of brief comments inline below. -- Bert On Wed, Jan 5, 2011 at 8:56 AM, Ben Ward benjamin.w...@bathspa.org wrote: On 05/01/2011 16:37, Mike Marchywka wrote: Date: Wed, 5 Jan 2011 15:48:46 + From: benjamin.w...@bathspa.org To: r-help@r-project.org Subject: [R] Simulation - Natural Selection Hi, I've been modelling some data over the past few days of my work, repeatedly challenging microbes with a certain concentration of cleaner until the concentration required to inhibit or kill them increases, at which point they are challenged with a slightly higher concentration each day. I'm doing this for two different cleaners, and I'm collecting the concentration required to kill them as a percentage, the challenge number, the cleaner as a two-level variable, and the lineage they're in, because I have several different lineages. I'm expecting the values to rise for one cleaner but not the other, as they acquire resistance to one but not the other. Which has happened, but I have wide variation, because one lineage acquired a very dramatic change which has made it immune to 50%, whereas the others have exhibited a much more gradual increase, and so I have very weak p-values for the cleaner variable, because it is secondary to the challenge vector, which has the most explanatory power, because without time and these challenges the selection would not happen. I was using two bacterium species, but one was keen on giving highly erratic results, and insisted on becoming cross-contaminated, BUT if I include its data, it shoves cleaner over the p = 0.05 threshold, so I may just be having a problem with lack of data. 
So I've been asking about bootstrapping, which I plan to apply to my cases, and then fit a model to see what the confidence is like. I assume if I bootstrap then it will re-select whole cases, and not jumble everything up; otherwise a microbe (to take the most extreme value as an example) with 50% concentration tolerance at the beginning would make no sense at all. I'm also planning on fitting models lineage by lineage, rather than putting them into one whole, just to have a look at what happens. You can't really have a p-value without a specific hypothesis to test, -- More precisely: A p-value loses its meaning unless the tested hypotheses are PRESPECIFIED -- i.e. determined BEFORE looking at the data. My hypothesis was specified before I did my experiment. Whilst far from perfect, I've tried to do the best I can to assess the rise in resistance without going into genetics, as it's not possible. (Although it may be at the next institution I've applied to for an MSc.) With my hypothesis (I mentioned it below), I was of the frame of mind that a nonsignificant p-value on the cleaner variable (for now - the experiment is far from over) indicated a lack of evidence for rejecting the null. And so at the minute it looks like the type of cleaner makes no difference. if you have that then all your other questions are probably easy to answer. Generally you want to sample from things that are iid or maybe you want to test the identical i. -- This is false. iid is not required. Example: weighted least squares. It is true that figuring out the sampling distribution under non-iid sampling can be (much) more difficult. For example, pivots may not exist; approximations must typically be used. -- Bert My hypothesis is that Cleaner A (I don't really want to go into names or brands), or rather the microbial culture I keep exposed to it, will exhibit a rise in concentration tolerance values, reflecting acquisition of antimicrobial resistance. And this has largely happened. 
And that in cleaner B this will not happen, or if it does, it will not be as dramatic and will take longer. So I'm expecting, in my model, the cleaner variable to have a p below 0.05, quite high explanatory power, and a satisfying coefficient. The notion behind the hypothesis is that one cleaner might have a more difficult, complex chemical structure, requiring more mutations to develop some resistance. I can't really do anything with genes or chemical structure at my current institution and at my level, because there is no equipment for that sort of thing, and they felt it would be too far for a 3rd-year project. So I'm using the concentration required to kill them - or stop them from growing - as an indication. Generally you want to have done a lit search ahead of time and had some idea of likely evolution dynamics of your system given your design and things like your forcing functions etc. Most statisticians would not take seriously
Re: [R] Navigating web pages using R
Hmm, RCurl may be able to help you. Not sure - I have not played with its query abilities. On Tue, Jan 4, 2011 at 10:54 AM, Erik Gregory egregory2...@yahoo.com wrote: R-Help, I'm trying to obtain some data from a webpage which masks the URL from the user, so an explicit URL will not work. For example, when one navigates to the web page the URL looks something like: http://137.113.141.205/rpt34s.php?flags=1 (changed for privacy, but I'm not sure you could access it anyway since it's internal to the agency I work for). The site has three drop-down menus for Site, Month, and Year. When a combination of these is selected, the resulting URL is always http://137.113.141.205/rpt34s (nothing changes, except that flags=1 is dropped), so what I need to be able to do is write something that will navigate to the original URL, then select some combination of Site, Month, and Year, and then submit the query to the site to navigate to the page with the data. Is this a capability that R has as a language? Unfortunately, I'm unfamiliar with html or php programming, so if this question belongs in a forum on that I apologize. I'm trying to centralize all of my code for my analysis in R! Thank you, -Erik Gregory Student Assistant, California EPA CSU Sacramento, Mathematics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
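A hedged sketch of what the form submission could look like with RCurl's postForm(); the field names Site, Month and Year are guesses based on the drop-down labels, and the real names would have to be read from the page's HTML form source:

```r
library(RCurl)
# postForm() sends an HTTP POST with the given form fields
# (field names and values here are assumptions for illustration).
result <- postForm("http://137.113.141.205/rpt34s.php",
                   Site = "SomeSite", Month = "01", Year = "2011")
```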
Re: [R] randomForest speed improvements
From: Liaw, Andy Note that that isn't exactly what I recommended. If you look at the example in the help page for combine(), you'll see that it is combining RF objects trained on the same data; i.e., instead of having one RF with 500 trees, you can combine five RFs trained on the same data with 100 trees each into one 500-tree RF. The way you are using combine() is basically using sample size to limit tree size, which you can do by playing with the nodesize argument in randomForest() as I suggested previously. Either way is fine as long as you don't see prediction performance degrading. I should also mention that another way you can do something similar is by making use of the sampsize argument in randomForest(). For example, if you call randomForest() with sampsize=500, it will randomly draw 500 data points to grow each tree. This way you don't even need to run the RFs separately and combine them. Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of apresley Sent: Tuesday, January 04, 2011 6:30 PM To: r-help@r-project.org Subject: Re: [R] randomForest speed improvements Andy, Thanks for the reply. I had no idea I could combine them back ... that actually will work pretty well. We can have several worker threads load up the RFs on different machines and/or cores, and then re-assemble them. Rmpi might be an option down the road, but would be a bit of overhead for us now. Using the combine() method, I was able to drastically reduce the amount of time to build randomForest objects. I.e., using about 25,000 rows (6 columns), it takes maybe 5 minutes on my laptop. Using 5 randomForest objects (each with 5k rows), and then combining them, takes 1 minute. -- Anthony -- View this message in context: http://r.789695.n4.nabble.com/randomForest-speed-improvements- tp3172523p3174621.html Sent from the R help mailing list archive at Nabble.com. 
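The two approaches Andy describes can be sketched as follows; the built-in iris data is used purely for illustration:

```r
library(randomForest)
data(iris)
# Approach 1: train smaller forests (possibly on separate cores or
# machines) and merge them into one forest with combine().
rf1 <- randomForest(Species ~ ., data = iris, ntree = 100)
rf2 <- randomForest(Species ~ ., data = iris, ntree = 100)
rf  <- combine(rf1, rf2)   # a single 200-tree forest
# Approach 2: subsample rows per tree instead, in a single call.
rf3 <- randomForest(Species ~ ., data = iris, ntree = 200, sampsize = 50)
```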
[R] Forecasting with STL
Dear list, We have been using STL for seasonal decomposition, and would like to use the trend and seasonal components to forecast n steps ahead. There is no function called predict.stl, and inside an stl object there is no loess model to be predicted either. Our solution is to apply loess or lm using the trend of stl as auto-regressors, after which we use predict.lm or predict.loess, and then apply seasonal modifiers to the predicted data. Is there a more straightforward way to do this? Some function we are missing? Thank you in advance. Felipe Araujo Researcher in finance and economics Rio de Janeiro, Brasil [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
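A sketch of the approach the poster describes (extend the trend with lm(), then add back the seasonal component); AirPassengers is only stand-in data, and an additive decomposition on the log scale is assumed:

```r
fit   <- stl(log(AirPassengers), s.window = "periodic")
trend <- fit$time.series[, "trend"]
seas  <- fit$time.series[, "seasonal"]
t     <- seq_along(trend)
h     <- 12   # forecast horizon, one year ahead
# Extrapolate the trend with a simple linear fit on time
trend_fc <- predict(lm(trend ~ t), newdata = data.frame(t = max(t) + 1:h))
# Re-apply the last full seasonal cycle to the extrapolated trend
seas_fc <- tail(as.numeric(seas), h)
fc <- exp(trend_fc + seas_fc)   # back-transform from the log scale
```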
Re: [R] t-test or ANOVA...who wins? Help please!
Dear Tal Galili, thanks a lot for your answer! I agree with you: the t-test is comparing 2 conditions at one level of stimulus, while the ANOVA table is testing the significance of the interaction between condition and stimulus... the two tests are testing two different things. But still I don't understand which is the right way to perform the analysis in order to solve my problem. Let's consider now only the table I posted before. The same stimuli in the table have been presented to subjects in two conditions: A and AH, where AH is the condition A plus something else (let's call it H). I want to know if AT A GLOBAL LEVEL adding H leads to better results in the participants' evaluations of the stimuli than the stimulus presented only in condition A. Data in the response column are evaluations of the realism of the stimulus on a 7-point scale. If I calculate the mean for each stimulus in each condition, the results show that for each stimulus the AH condition is always greater than the A condition. Anyway, doing a t-test to compare the stimuli in pairs (e.g. flat_550_W_realism in condition A vs. flat_550_W_realism in condition AH) I get that only sometimes the differences are statistically significant. I ask you if there is a way to say that condition AH is better than condition A at a global level. In attachment you find the table in .txt and also in .csv format. Is it possible for you to make an example in R, including also the R results, in order to tell me what to look at in the console to see if my problem is solved or not? For example, I was checking the stimulus:condition line in the anova results... but I don't know if my anova analysis was correct or not. I am not an expert of R, nor of statistics ;-( Anyway I am doing my best to study and understand. Please enlighten me. Thanks in advance Best regards From: Tal Galili tal.gal...@gmail.com To: Frodo Jedi frodo.j...@yahoo.com Cc: r-help@r-project.org Sent: Wed, January 5, 2011 10:15:41 AM Subject: Re: [R] t-test or ANOVA...who wins? 
Help please! Hello Frodo, It is not clear to me from your questions some of the basics of your analysis. If you only have two levels of a factor, and one response - why in the anova do you use more factors (and their interactions)? In that sense, it is obvious that your results would differ from the t-test. In either case, I am not sure if any of these methods are valid, since your data doesn't seem to be normal. Here is an example code of how to get the same results from aov and t.test. And also a nonparametric option (that might be more fitting):

flat_550_W_realism    <- c(3,3,5,3,3,3,3,5,3,3,5,7,5,2,3)
flat_550_W_realism_AH <- c(7,4,5,3,6,5,3,5,5,7,2,7,5,5)
x <- c(rep(1, length(flat_550_W_realism)), rep(2, length(flat_550_W_realism_AH)))
y <- c(flat_550_W_realism, flat_550_W_realism_AH)
# equal results between t test and anova
t.test(y ~ x, var.equal = TRUE)
summary(aov(y ~ x))
# plotting the data:
boxplot(y ~ x)     # group 1 is not at all symmetrical...
wilcox.test(y ~ x) # a more fitting test

Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- On Wed, Jan 5, 2011 at 12:37 AM, Frodo Jedi frodo.j...@yahoo.com wrote: I kindly ask you for help because I really don't know how to solve this problem.

number stimulus           condition response
1      flat_550_W_realism A         3
2      flat_550_W_realism A         3
3      flat_550_W_realism A         5
4      flat_550_W_realism A         3
5      flat_550_W_realism A         3
6      flat_550_W_realism A         3
7      flat_550_W_realism A         3
8      flat_550_W_realism A         5
9      flat_550_W_realism A         3
10     flat_550_W_realism A         3
11     flat_550_W_realism A         5
12     flat_550_W_realism A         7
13     flat_550_W_realism A         5
14     flat_550_W_realism A         2
15     flat_550_W_realism A         3
16     flat_550_W_realism AH        7
17     flat_550_W_realism AH        4
18     flat_550_W_realism AH        5
19     flat_550_W_realism AH        3
20     flat_550_W_realism AH        6
21     flat_550_W_realism AH        5
22     flat_550_W_realism AH        3
23     flat_550_W_realism AH        5
24     flat_550_W_realism AH        5
25     flat_550_W_realism AH        7
26     flat_550_W_realism
[R] Assumptions for ANOVA: the right way to check the normality
Dear all, I would like to know which is the right way to check the normality assumption for performing ANOVA. How do you check normality for the following example? I did an experiment where people had to evaluate, on a 7-point scale, the degree of realism of some stimuli presented in 2 conditions. The problem is that if I check normality with the Shapiro test I get that the data are not normally distributed. Someone suggested to me that I don't have to check the normality of the data, but the normality of the residuals I get after fitting the linear model. I really ask you to help me understand this point, as I can't find enough material online to resolve it. If the data are not normally distributed I have to use the Kruskal-Wallis test and not ANOVA... so please help me understand. I'll make a numerical example: could you please tell me if the data in this table are normally distributed or not? Help!

number stimulus              condition response
1      flat_550_W_realism    A         3
2      flat_550_W_realism    A         3
3      flat_550_W_realism    A         5
4      flat_550_W_realism    A         3
5      flat_550_W_realism    A         3
6      flat_550_W_realism    A         3
7      flat_550_W_realism    A         3
8      flat_550_W_realism    A         5
9      flat_550_W_realism    A         3
10     flat_550_W_realism    A         3
11     flat_550_W_realism    A         5
12     flat_550_W_realism    A         7
13     flat_550_W_realism    A         5
14     flat_550_W_realism    A         2
15     flat_550_W_realism    A         3
16     flat_550_W_realism    AH        7
17     flat_550_W_realism    AH        4
18     flat_550_W_realism    AH        5
19     flat_550_W_realism    AH        3
20     flat_550_W_realism    AH        6
21     flat_550_W_realism    AH        5
22     flat_550_W_realism    AH        3
23     flat_550_W_realism    AH        5
24     flat_550_W_realism    AH        5
25     flat_550_W_realism    AH        7
26     flat_550_W_realism    AH        2
27     flat_550_W_realism    AH        7
28     flat_550_W_realism    AH        5
29     flat_550_W_realism    AH        5
30     bump_2_step_W_realism A         1
31     bump_2_step_W_realism A         3
32     bump_2_step_W_realism A         5
33     bump_2_step_W_realism A         1
34     bump_2_step_W_realism A         3
35     bump_2_step_W_realism A         2
36     bump_2_step_W_realism A         5
37     bump_2_step_W_realism A         4
38     bump_2_step_W_realism A         4
39     bump_2_step_W_realism A         4
40     bump_2_step_W_realism A         4
41     bump_2_step_W_realism AH        3
42     bump_2_step_W_realism AH        5
43     bump_2_step_W_realism AH        1
44     bump_2_step_W_realism AH        5
45     bump_2_step_W_realism AH        4
46     bump_2_step_W_realism AH        4
47     bump_2_step_W_realism AH        5
48     bump_2_step_W_realism AH        4
49     bump_2_step_W_realism AH        3
50     bump_2_step_W_realism AH        4
51     bump_2_step_W_realism AH        5
52     bump_2_step_W_realism AH        4
53     hole_2_step_W_realism A         3
54     hole_2_step_W_realism A         3
55     hole_2_step_W_realism A         4
56     hole_2_step_W_realism A         1
57     hole_2_step_W_realism A         4
58     hole_2_step_W_realism A         3
59     hole_2_step_W_realism A         5
60     hole_2_step_W_realism A         4
61     hole_2_step_W_realism A         3
62     hole_2_step_W_realism A         4
63     hole_2_step_W_realism A         7
64     hole_2_step_W_realism A         5
65     hole_2_step_W_realism A         1
66     hole_2_step_W_realism A         4
67     hole_2_step_W_realism AH        7
68     hole_2_step_W_realism AH        5
69     hole_2_step_W_realism AH        5
70     hole_2_step_W_realism AH        1
71     hole_2_step_W_realism AH        5
72     hole_2_step_W_realism AH        5
73     hole_2_step_W_realism AH        5
74     hole_2_step_W_realism AH        2
75     hole_2_step_W_realism AH        6
76     hole_2_step_W_realism AH        5
77     hole_2_step_W_realism AH
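The suggestion the poster mentions - testing the residuals rather than the raw responses - can be sketched like this, assuming the table above has been read into a data frame called scrd:

```r
# Fit the model first, then check normality of its residuals
fit <- lm(response ~ stimulus * condition, data = scrd)
shapiro.test(residuals(fit))                    # formal test on residuals
qqnorm(residuals(fit)); qqline(residuals(fit))  # visual check
```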
[R] R Commander - how to disable the alphabetical sorting of variable names?
I am trying to disable the alphabetical sorting of the variable names, but I am failing: R Commander does not store any changes made in the Commander Options menu/window. I tried to insert options(sort.names = FALSE) in the Rprofile.site and .Rprofile config files, but without success. Does anyone know the solution? -- View this message in context: http://r.789695.n4.nabble.com/R-Commander-how-to-disable-the-alphabetical-sorting-of-variable-names-tp3175426p3175426.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] update.views(Spatial) does not seem to be able to find RPyGeo package
The package is stated only to run under Windows (see the SystemRequirements field on its CRAN page), and you are on Linux - does this explain anything? Maybe ask the package maintainer? Roger Linder, Eric wrote: I have this problem with loading RPyGeo package when using update.views. How can I fix this. I have tried to use other CRAN mirrors with the same result. Below is a copy of my session. -session--- R version 2.12.1 (2010-12-16) Copyright (C) 2010 The R Foundation for Statistical Computing ISBN 3-900051-07-0 Platform: i486-pc-linux-gnu (32-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. [Previously saved workspace restored] library(ctv) update.views('Spatial') --- Please select a CRAN mirror for use in this session --- Loading Tcl/Tk interface ... done Warning message: In update.views(Spatial) : The following packages are not available: RPyGeo -session--- The information contained in this communication may be C...{{dropped:11}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. - Roger Bivand Economic Geography Section Department of Economics Norwegian School of Economics and Business Administration Helleveien 30 N-5045 Bergen, Norway -- View this message in context: http://r.789695.n4.nabble.com/update-views-Spatial-does-not-seem-to-be-able-to-find-RPyGeo-package-tp3174870p3175299.html Sent from the R help mailing list archive at Nabble.com. 
[R] loop variable names as function arguments
Dear all, is there a way to loop the rp.doublebutton function in the rpanel package? The difficulty I'm having lies with the variable name argument.

library(rpanel)
if (interactive()) {
  draw <- function(panel) {
    plot(unlist(panel$V), ylim = 0:1)
    panel
  }
  panel <- rp.control(V = as.list(rep(.5, 3)))
  rp.doublebutton(panel, var = V[[1]], step = 0.05, action = draw, range = c(0, 1))
  rp.doublebutton(panel, var = V[[2]], step = 0.05, action = draw, range = c(0, 1))
  rp.doublebutton(panel, var = V[[3]], step = 0.05, action = draw, range = c(0, 1))
}

Regards, Philip __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] looking for the RMySQL package for R 2.12.0 under XP
Hello David, As I had no time to try to compile the RMySQL package, I finally followed your advice and moved to RODBC. The decision to modify my scripts was taken after I discovered the function odbcDriverConnect, which allows one to connect directly to a database (as RMySQL does) without declaring the data source through the ODBC window. With the following command,

ch <- odbcDriverConnect(connection = "SERVER=localhost;DRIVER=MySQL ODBC 5.1 Driver;DATABASE=my_db;UID=my_user;PWD=my_pwd", case = "tolower")

the connection runs nicely. Thanks again for the link. Happy New Year to all the R-users, Ptit Bleu. -- View this message in context: http://r.789695.n4.nabble.com/looking-for-the-RMySQL-package-for-R-2-12-0-under-XP-tp3057537p3175513.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] speed up in R apply
Hi, I am doing some simulations and found a bottleneck in my R script. I made an example:

a = matrix(rnorm(500), 100, 5)
tt = Sys.time(); sum(a[,1]*a[,2]*a[,3]*a[,4]*a[,5]); Sys.time() - tt
[1] -1291.026
Time difference of 0.2354031 secs
tt = Sys.time(); sum(apply(a, 1, prod)); Sys.time() - tt
[1] -1291.026
Time difference of 20.23150 secs

Is there a faster way of calculating sums of products (of columns, or of rows)? And is this expected behavior? Thanks for your advice in advance, Young [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
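One vectorized alternative that avoids the row-wise apply(): treat the matrix as a list of columns and multiply the columns elementwise, which stays in fast vector code:

```r
a <- matrix(rnorm(500), 100, 5)
slow <- sum(apply(a, 1, prod))
# Reduce(`*`, ...) over the columns gives the per-row products
fast <- sum(Reduce(`*`, as.data.frame(a)))
all.equal(slow, fast)
```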
[R] How to 'explode' a matrix
Hi everyone, I'm looking for a way to 'explode' a matrix like this:

matrix(1:4,2,2)
     [,1] [,2]
[1,]    1    3
[2,]    2    4

into a matrix like this:

matrix(c(1,1,2,2,1,1,2,2,3,3,4,4,3,3,4,4),4,4)
     [,1] [,2] [,3] [,4]
[1,]    1    1    3    3
[2,]    1    1    3    3
[3,]    2    2    4    4
[4,]    2    2    4    4

My current kludge is this:

v1 = rep(1:4, each = 2, times = 2)
v2 = v1[order(rep(1:2, each = 4, times = 2))]
matrix(v2, 4, 4)

But I'm hoping there's a more efficient solution that I'm not aware of. Many thanks, Kevin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] unique limited to 536870912
Hi, I am using the 64-bit version. To check that, I went into the bin folder and executed `file R`. It gave the following output: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), for GNU/Linux 2.6.9, not stripped. The error I got when reading in my edgelist was: length 927365385 is too large for hashing. This number was the number of entries in the edgelist, btw. I hope this helps. Thanks On Wed, Jan 5, 2011 at 5:15 AM, jim holtman jholt...@gmail.com wrote: Could it be that you are running on a 32-bit version of R? 536870912 * 4 = 2GB if those were integers, which would use up all of memory. You never did show what your error message was or what system you were using. On Wed, Jan 5, 2011 at 12:08 AM, Indrajeet Singh sin...@cs.ucr.edu wrote: Hi I am using R with igraph to analyze an edgelist that is greater than the said amount. Does anyone know a way around this? Thanks Inder __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] vector of character with unequal width
Dear R users, The best in this new year 2011. I am dealing with a character vector (xx) whose nchar values are not all the same. Ex.

nchar(xx)
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[38] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 4 4 4 4 4 4 4 4
[75] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 ... 9

I need xx to be nchar = 9. My best guess was to paste 0's. Then I need substring(xx, 6, 9). I came up with:

xx[1:61]   <- paste("00000000", xx[1:61], sep = "")
xx[62:66]  <- paste("000000", xx[62:66], sep = "")
xx[67:100] <- paste("00000", xx[67:100], sep = "")
..
nchar(xx)
 [1] 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9
[38] 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9
[75] 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9
xx <- substring(xx, 6, 9)

This is a solution for one data set and would be sufficient, but not if I will continuously deal with this same issue. Furthermore, I am trying to automate the process, but I have not been able to come up with an adequate solution. I was thinking of creating a character vector of 0's of length 9-nchar(xx), then pasting it to xx:

9 - nchar(xx)
 [1] 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
[38] 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 6 6 6 6 6 5 5 5 5 5 5 5 5
[75] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 ..1

Nevertheless, I have not been able to create this vector, nor do I know if this is the best option. Another way I thought of was to create an if statement, but this would be long and not efficient (I think). Any suggestion will be appreciated. Jose [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
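For an automated version of the padding, sprintf() can compute the zero-padding in one step regardless of the input width; this sketch assumes the strings hold integer values:

```r
xx  <- c("1", "123", "1234")           # example values of varying width
xx9 <- sprintf("%09d", as.integer(xx)) # left-pad with zeros to width 9
nchar(xx9)                             # all 9 now
substring(xx9, 6, 9)                   # the trailing 4 characters
```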
[R] real time R
Hi, We're using R in an application where asking for a probability of an event takes about 130ms. What could we do to take that down to 30ms-40ms? The query code uses randomforest, knn. -- M. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cost-benefit/value for money analysis
Liviu Try this:

require(sos)
findFn('cost benefit')
found 12 matches

Thanks, I wasn't aware of sos; however, following up the hits hasn't moved me any further forward, except to demonstrate that the function I want doesn't exist. But I will try some other search options. Graham [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Comparing fitting models
Dear all, I have 3 models (from simple to complex) and I want to compare them in order to see if they fit equally well or not. From the R prompt I am not able to see where I can get this information. Let's do an example:

fit1 <- lm(response ~ stimulus + condition + stimulus:condition, data = scrd) # equivalent to lm(response ~ stimulus*condition, data=scrd)
fit2 <- lm(response ~ stimulus + condition, data = scrd)
fit3 <- lm(response ~ condition, data = scrd)

anova(fit2, fit1) # compare models
Analysis of Variance Table
Model 1: response ~ stimulus + condition
Model 2: response ~ stimulus + condition + stimulus:condition
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1    165 364.13
2    159 362.67  6     1.465 0.1071 0.9955

anova(fit3, fit2, fit1) # compare models
Analysis of Variance Table
Model 1: response ~ condition
Model 2: response ~ stimulus + condition
Model 3: response ~ stimulus + condition + stimulus:condition
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1    171 382.78
2    165 364.13  6    18.650 1.3628 0.2328
3    159 362.67  6     1.465 0.1071 0.9955

How can I tell whether the simple model fits as well as the complex model (the one with the interaction)? Thanks in advance All the best [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
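As a complement to anova() for comparing nested fits, information criteria give a single number per model; a sketch reusing the same (assumed) scrd data frame:

```r
fit1 <- lm(response ~ stimulus * condition, data = scrd)
fit2 <- lm(response ~ stimulus + condition, data = scrd)
fit3 <- lm(response ~ condition, data = scrd)
# Lower AIC favors the better fit-vs-complexity trade-off
AIC(fit1, fit2, fit3)
```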
Re: [R] Stop and call objects
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Henrique Dallazuanna Sent: Wednesday, January 05, 2011 9:26 AM To: Sebastien Bihorel Cc: R-help Subject: Re: [R] Stop and call objects Try this: f <- function(x) tryCatch(sum(x), error = function(e) sprintf("Error in %s: %s", deparse(sys.call(1)), e$message)) f('a') The argument e to the error handler contains a call component, so you don't have to rely on the unreliable sys.call(1) to get the offending call. E.g.,

f2 <- function(x) {
  tryCatch(sum(x),
    error = function(e) {
      sprintf("Error in %s: %s", deparse(e$call)[1], e$message)
    })
}
f2('char')
[1] "Error in sum(x): invalid 'type' (character) of argument"

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com On Wed, Jan 5, 2011 at 12:23 PM, Sebastien Bihorel sebastien.biho...@cognigencorp.com wrote: Dear R-users, Let's consider the following snippet: f <- function(x) tryCatch(sum(x), error = function(e) stop(e)) f('a') As expected, the last call returns an error message: Error in sum(x) : invalid 'type' (character) of argument My questions are the following: 1- can I easily ask the stop function to reference the f function in addition to sum(x) in the error message? 2- If not, I guess I would have to extract the call and message objects from e, coerce the call as a character object, build a custom string, and pass it to the stop function using call. = FALSE. How can I coerce a call object to a character and maintain the aspect of the printed call (i.e. sum(x) instead of the character vector c("sum", "x") returned by as.character(e$call))? Thank you Sebastien __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O
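As a footnote to Bill's point: `conditionCall()` and `conditionMessage()` are the documented accessors for the pieces of a condition object, so a variant of `f2` that avoids poking at `e$call` directly might look like this (a sketch, not from the thread):

```r
# conditionCall(e) returns the call stored in the condition,
# conditionMessage(e) returns its message component.
f3 <- function(x) {
  tryCatch(sum(x),
           error = function(e)
             sprintf("Error in %s: %s",
                     deparse(conditionCall(e)), conditionMessage(e)))
}
f3("a")
# [1] "Error in sum(x): invalid 'type' (character) of argument"
```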
Re: [R] How to 'explode' a matrix
Try this:

apply(apply(m, 2, rep, each = 2), 1, rep, each = 2)

or

m[rep(seq(nrow(m)), each = 2), rep(seq(ncol(m)), each = 2)]

On Wed, Jan 5, 2011 at 10:03 AM, Kevin Ummel kevinum...@gmail.com wrote: Hi everyone, I'm looking for a way to 'explode' a matrix like this:

matrix(1:4,2,2)
     [,1] [,2]
[1,]    1    3
[2,]    2    4

into a matrix like this:

matrix(c(1,1,2,2,1,1,2,2,3,3,4,4,3,3,4,4),4,4)
     [,1] [,2] [,3] [,4]
[1,]    1    1    3    3
[2,]    1    1    3    3
[3,]    2    2    4    4
[4,]    2    2    4    4

My current kludge is this:

v1 = rep(1:4, each=2, times=2)
v2 = v1[order(rep(1:2, each=4, times=2))]
matrix(v2, 4, 4)

But I'm hoping there's a more efficient solution that I'm not aware of. Many thanks, Kevin
Re: [R] real time R
On 05.01.2011 17:10, Marcelo Barbudas wrote: Hi, We're using R in an application where asking for a probability of an event takes about 130ms. What could we do to take that down to 30ms-40ms? The query code uses randomforest, knn. Use a machine that is 4 times faster? Otherwise: Use another method or a more efficient implementation. Don't use R at all if you want _guaranteed_ real time processing. Uwe Ligges
Re: [R] How to 'explode' a matrix
Kevin Ummel kevinummel at gmail.com writes: I'm looking for a way to 'explode' a matrix like this:

matrix(1:4,2,2)
     [,1] [,2]
[1,]    1    3
[2,]    2    4

This is the Kronecker product of your matrix with the matrix (1 1 ; 1 1):

m <- matrix(1:4, 2, 2)
kronecker(m, matrix(1, 2, 2))

cheers Ben Bolker
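The answers in this thread agree with each other; a quick check (not in the original posts) that the indexing solution and the Kronecker product produce the same 'exploded' matrix:

```r
m  <- matrix(1:4, 2, 2)
# index each row and column twice
r1 <- m[rep(seq(nrow(m)), each = 2), rep(seq(ncol(m)), each = 2)]
# Kronecker product with a 2x2 block of ones (result is numeric)
r2 <- kronecker(m, matrix(1, 2, 2))
all(r1 == r2)  # TRUE
```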
Re: [R] multipanel plots
On 05.01.2011 06:16, smriti Sebastian wrote: hi, i have attached a doc file. Maybe, but it cannot make it through the list. Can this graph be plotted using R? Please help. We do not know. Make it available on some webserver and refer to it with a URL. Uwe Ligges regards, smriti
Re: [R] real time R
On Wed, Jan 5, 2011 at 4:10 PM, Marcelo Barbudas nos...@gmail.com wrote: Hi, We're using R in an application where asking for a probability of an event takes about 130ms. What could we do to take that down to 30ms-40ms? The query code uses randomforest, knn. That's a fairly vague question, so some vague answers: Firstly, profile your query to identify bottlenecks and then concentrate your effort on removing them. Anything else is a waste of time. Secondly, get a faster computer - whether that means faster CPU, faster hard disks, or faster RAM depends on where the bottleneck is in your process. Or go parallel and use multiple CPUs. Or rewrite in C. Or machine code. Or do it on a GPU. Thirdly, give us something more specific! Like examples, perhaps? Barry
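Barry's first suggestion (profile before optimising) takes only a few lines. The workload below is a stand-in, since the original query code was not posted:

```r
# Profile a stand-in workload; in practice, wrap your actual query call.
Rprof("prof.out")
for (i in 1:20) p <- apply(matrix(rnorm(1e4), 100), 1, prod)
Rprof(NULL)
# by.self lists functions by time spent in their own code,
# worst offenders first - those are the bottlenecks to attack.
head(summaryRprof("prof.out")$by.self)
```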
Re: [R] How to 'explode' a matrix
Hi Kevin, Take a look at ?kronecker HTH, Jorge

On Wed, Jan 5, 2011 at 7:03 AM, Kevin Ummel wrote: Hi everyone, I'm looking for a way to 'explode' a matrix like this:

matrix(1:4,2,2)
     [,1] [,2]
[1,]    1    3
[2,]    2    4

into a matrix like this:

matrix(c(1,1,2,2,1,1,2,2,3,3,4,4,3,3,4,4),4,4)
     [,1] [,2] [,3] [,4]
[1,]    1    1    3    3
[2,]    1    1    3    3
[3,]    2    2    4    4
[4,]    2    2    4    4

My current kludge is this:

v1 = rep(1:4, each=2, times=2)
v2 = v1[order(rep(1:2, each=4, times=2))]
matrix(v2, 4, 4)

But I'm hoping there's a more efficient solution that I'm not aware of. Many thanks, Kevin
Re: [R] Simulation - Natural Selection
On 05/01/2011 17:40, Bert Gunter wrote: My hypothesis was specified before I did my experiment. Whilst far from perfect, I've tried to do the best I can to assess rise in resistance, without going into genetics as it's not possible (although maybe at the next institution I've applied to for an MSc). With my hypothesis (I mentioned it below), I was of the frame of mind that a nonsignificant p-value on the cleaner variable (for now - the experiment is far from over) indicated a lack of evidence for rejecting the null. And so at the minute, it looks like the type of cleaner makes no difference. I have no fundamental objection, but be careful. I would simply qualify your last sentence by saying that it means that the experimental noise is too great to precisely determine the size of the cleaner effect. Scientific reality tells us that it is never exactly 0; what your results show is that your uncertainty about the value of the difference encompasses both positive and negative values. This does NOT mean that the difference might not be scientifically large enough to be of interest -- a confidence interval for the difference (MUCH better than a P value) would help you determine that. If the interval is narrow enough that the difference, positive or negative, is too small to be of scientific interest, then you're done. If the interval is large, then it tells you that you need more data, a better (less noisy) experiment, etc. -- Bert At the moment I wouldn't call the confidence interval small; it's definitely wide, and at the minute the confidence interval covers zero. My R-squared at the minute is also 0.5; this is mostly due to the few extreme cases of adaptation as I mentioned before, but I'm hesitant to remove them, as papers in my literature study which also evolve bacteria show that there is often (sometimes wide) variation in the paths populations take.
So whilst mathematically a bit undesirable, and it makes me and the model uncertain, it does fall into place with what is known, or has been previously shown, of the reality of selection. Again, if I include the data from the bacteria dropped from the study, all that improves and uncertainty is reduced. It may also be worth mentioning that I am also taking a more traditional approach (by that I mean a more Statistics 101 approach; indeed that is all the stats tuition covered in my course as a taught element), in case what I've described above did not work or was not ideal, because we (my supervisor and I) did foresee that a model report may contain a lot of uncertainty. Indeed we did expect some populations to adapt and some not to, etc. So I've also been collecting data on the width of the zones of inhibition shown by putting disks of cleaner on plates of growth, and measuring the dead zone that results. I can get lots of data from this with only a few plates, and doing this at the start of the study, a few times in the middle, and at the end will allow me to do more traditional analysis, for example a t.test on the dead zone widths at the end of the study between cleaners A and B, or a non-parametric equivalent, maybe even a permutation test. The modelling stuff is already beyond what my supervisor expects of me, but I felt it would add value and a lot more insight to the study, allowing more variables to be accounted for than a more short-sighted traditional test.
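Bert's recommendation of a confidence interval over a P value is one line in R; illustrated here on simulated data, since the poster's resistance measurements are not in the thread:

```r
# Invented data: resistance measured under two cleaners, A and B
set.seed(42)
d <- data.frame(cleaner = factor(rep(c("A", "B"), each = 20)))
d$resistance <- 10 + 0.3 * (d$cleaner == "B") + rnorm(40)
fit <- lm(resistance ~ cleaner, data = d)
# If this interval is narrow and contains only scientifically
# uninteresting effect sizes, the question is settled either way;
# if it is wide, more (or cleaner) data are needed.
confint(fit, "cleanerB")
```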
Re: [R] vector of character with unequal width
Try this:

formatC(c(1, 11, 111), flag = "0", width = 9)

Or:

sprintf("%09d", c(1, 11, 111))

On Wed, Jan 5, 2011 at 1:50 PM, jose Bartolomei surfpr...@hotmail.com wrote: Dear R users, The best in this new year 2011. I am dealing with a character vector (xx) whose nchar are not the same. Ex.

nchar(xx)
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 4 4 4 4 4 4 4 4 [75] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 ... 9

I need xx to be nchar = 9. My best guess was to paste 0's. Then I need substring(xx, 6, 9). I came up with:

xx[1:61] <- paste("00000000", xx[1:61], sep="")
xx[62:66] <- paste("000000", xx[62:66], sep="")
xx[67:100] <- paste("00000", xx[67:100], sep="")
..
nchar(xx)
[1] 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 [38] 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 [75] 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9
xx <- substring(xx, 6, 9)

This is a solution: for one data set it would be sufficient, but not if I will continuously deal with this same issue. Furthermore, I am trying to automate the process but I have not been able to come up with an adequate solution. I was thinking to create a character vector of 0's of length 9-nchar(xx), then paste it to xx.

9-nchar(xx)
[1] 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 [38] 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 6 6 6 6 6 5 5 5 5 5 5 5 5 [75] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 ..1

Nevertheless, I have not been able to create this vector, nor do I know if this is the best option. Another way I thought of was to create an if statement, but this would be long and not efficient (I think). Any suggestion will be appreciated.
Jose
Re: [R] vector of character with unequal width
On Wed, Jan 05, 2011 at 03:50:13PM +0000, jose Bartolomei wrote: [...] I was thinking to create a character vector of 0's of length 9-nchar(xx), then paste it to xx. [...] Nevertheless, I have not been able to create this vector, nor do I know if this is the best option. Did you consider something like the following?

xx <- c("abc", "abcd", "abcde")
z1 <- rep("000000000", times = length(xx))
z2 <- substr(z1, 1, 9 - nchar(xx))
yy <- paste(z2, xx, sep = "")
cbind(yy)
#      yy
# [1,] "000000abc"
# [2,] "00000abcd"
# [3,] "0000abcde"

Petr Savicky.
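Pulling the thread's suggestions together, either one-liner pads to a fixed width without the manual bookkeeping in the original post (numeric input assumed, as in Henrique's reply):

```r
x <- c(1, 11, 111)
formatC(x, width = 9, flag = "0")  # zero-padded to width 9
sprintf("%09d", x)                 # same result via C-style formatting
```

Both generalise to any vector length, so no per-group indexing like `xx[62:66]` is needed.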
[R] Plotting colour-coded points
Hi, I have a file of the following type:

id    a    b
1   0.5    5
2   0.7   15
3   1.6    7
4   0.5   25

I would like to plot the data in column a on the y-axis and the corresponding data in column id on the x-axis, so plot(a~id). However I would like to colour these points according to the data in column b. Column b data may be colour coded into the following bins: 0-9; 10-19; 20-29. Any idea on how to accomplish this? TIA, Anjan

-- === anjan purkayastha, phd. research associate fas center for systems biology, harvard university 52 oxford street cambridge ma 02138 phone-703.740.6939 ===
Re: [R] Plotting colour-coded points
Hi Anjan, Try something along the lines of

d$bb <- with(d, cut(b, c(0, 9, 19, 29)))
with(d, plot(a, id, col = bb, pch = 16, las = 1))
legend('topright', as.character(levels(d$bb)),
       col = 1:length(levels(d$bb)), ncol = 3, pch = 16)

where 'd' is your original data.frame. HTH, Jorge

On Wed, Jan 5, 2011 at 2:00 PM, ANJAN PURKAYASTHA wrote: Hi, I have a file of the following type:

id    a    b
1   0.5    5
2   0.7   15
3   1.6    7
4   0.5   25

I would like to plot the data in column a on the y-axis and the corresponding data in column id on the x-axis, so plot(a~id). However I would like to colour these points according to the data in column b. Column b data may be colour coded into the following bins: 0-9; 10-19; 20-29. Any idea on how to accomplish this? TIA, Anjan
Re: [R] speed up in R apply
On Jan 5, 2011, at 10:03 AM, Young Cho wrote: Hi, I am doing some simulations and found a bottleneck in my R script. I made an example:

a = matrix(rnorm(500), 100, 5)
tt = Sys.time(); sum(a[,1]*a[,2]*a[,3]*a[,4]*a[,5]); Sys.time() - tt
[1] -1291.026
Time difference of 0.2354031 secs
tt = Sys.time(); sum(apply(a, 1, prod)); Sys.time() - tt
[1] -1291.026
Time difference of 20.23150 secs

Is there a faster way of calculating sums of products (of columns, or of rows)? You should look at crossprod and tcrossprod. And is this expected behavior? Yes. For loops and *apply strategies are slower than the proper use of vectorized functions. Thanks for your advice in advance. -- David Winsemius, MD West Hartford, CT
Re: [R] OT: Reprinting of Bertin's Semiology of Graphics
This is a major publishing event for statistical graphics. I have long possessed Bertin's shorter book Graphics and Graphic Information Processing, but Semiology is the one I've been waiting for. Thanks for the good news Michael! Frank - Frank Harrell Department of Biostatistics, Vanderbilt University
Re: [R] Plotting colour-coded points
On Jan 5, 2011, at 2:00 PM, ANJAN PURKAYASTHA wrote: Hi, I have a file of the following type:

id    a    b
1   0.5    5
2   0.7   15
3   1.6    7
4   0.5   25

I would like to plot the data in column a on the y-axis and the corresponding data in column id on the x-axis, so plot(a~id). However I would like to colour these points according to the data in column b. Column b data may be colour coded into the following bins: 0-9; 10-19; 20-29. Any idea on how to accomplish this? Something along the lines of this code:

plot(a ~ id, data = dfrm,
     col = c("red", "green", "blue")[findInterval(dfrm$b, c(0, 10, 20, 30))])

-- David. TIA, Anjan

David Winsemius, MD West Hartford, CT
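Combining the two replies into one runnable sketch, using the question's four rows of data:

```r
d <- data.frame(id = 1:4,
                a  = c(0.5, 0.7, 1.6, 0.5),
                b  = c(5, 15, 7, 25))
pal <- c("red", "green3", "blue")
# findInterval maps each b value to its bin: here 1, 2, 1, 3
bin <- findInterval(d$b, c(0, 10, 20))
plot(a ~ id, data = d, col = pal[bin], pch = 16)
legend("topright", c("0-9", "10-19", "20-29"), col = pal, pch = 16)
```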
Re: [R] R Commander - how to disable the alphabetical sorting of variable names?
Dear Iurie Malai, How Rcmdr options are set is described in ?Commander, which is also accessible via the R Commander menus, Help -> Commander help. You need

options(Rcmdr = list(sort.names = FALSE))

which you can put in Rprofile.site. Best, John

John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada web: socserv.mcmaster.ca/jfox

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Iurie Malai Sent: January-05-11 7:39 AM To: r-help@r-project.org Subject: [R] R Commander - how to disable the alphabetical sorting of variable names? I try to disable alphabetical sorting of the variable names but I fail: R Commander does not store any changes made in the Commander Options menu/window. I tried to insert options(sort.names = FALSE) in the Rprofile.site and .Rprofile config files, but without success. Does anyone know the solution?
Re: [R] speed up in R apply
On Wed, Jan 5, 2011 at 1:22 PM, David Winsemius dwinsem...@comcast.net wrote: On Jan 5, 2011, at 10:03 AM, Young Cho wrote: Hi, I am doing some simulations and found a bottleneck in my R script. I made an example:

a = matrix(rnorm(500), 100, 5)
tt = Sys.time(); sum(a[,1]*a[,2]*a[,3]*a[,4]*a[,5]); Sys.time() - tt
[1] -1291.026
Time difference of 0.2354031 secs
tt = Sys.time(); sum(apply(a, 1, prod)); Sys.time() - tt
[1] -1291.026
Time difference of 20.23150 secs

Is there a faster way of calculating sums of products (of columns, or of rows)? You should look at crossprod and tcrossprod. Hmm. Not sure that would help, David. You could use a matrix multiplication a %*% rep(1, ncol(a)) if you wanted the row sums, but of course you could also use rowSums to get those. And is this expected behavior? Yes. For loops and *apply strategies are slower than the proper use of vectorized functions. To expand a bit on David's point, the apply function isn't magic. It essentially loops over the rows, in this case. By multiplying columns together you are performing the loop over the rows in compiled code, which is much, much faster. If you want to do this kind of operation effectively in R for a general matrix (i.e. not knowing in advance that it has exactly 5 columns) you could use Reduce:

a <- matrix(rnorm(500), 100, 5)
system.time(pr1 <- a[,1]*a[,2]*a[,3]*a[,4]*a[,5])
   user  system elapsed
   0.15    0.09    0.37
system.time(pr2 <- apply(a, 1, prod))
   user  system elapsed
 22.090   0.140  22.902
all.equal(pr1, pr2)
[1] TRUE
system.time(pr3 <- Reduce(get("*"), as.data.frame(a), rep(1, nrow(a))))
   user  system elapsed
  0.410   0.010   0.575
all.equal(pr3, pr2)
[1] TRUE
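A compact check (not in the thread) that the vectorized column product really matches `apply(a, 1, prod)` for an arbitrary number of columns:

```r
a  <- matrix(rnorm(500), 100, 5)
p1 <- apply(a, 1, prod)              # slow: loops over rows at the R level
p2 <- Reduce(`*`, as.data.frame(a))  # fast: 5 vectorized multiplications
all.equal(p1, p2)  # TRUE
```

`as.data.frame(a)` turns the matrix into a list of column vectors, so `Reduce` multiplies whole columns at a time regardless of how many there are.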
Re: [R] Assumptions for ANOVA: the right way to check the normality
Someone suggested to me that I don't have to check the normality of the data, but the normality of the residuals I get after fitting the linear model. I really ask you to help me understand this point, as I can't find enough material online to resolve it. Try the following:

# using your scrd data and your proposed models
fit1 <- lm(response ~ stimulus + condition + stimulus:condition, data=scrd)
fit2 <- lm(response ~ stimulus + condition, data=scrd)
fit3 <- lm(response ~ condition, data=scrd)

# Set up for 6 plots on 1 panel
op <- par(mfrow=c(2,3))

# The residuals() function extracts residuals.
# Visual inspection is a good start for checking normality:
# you get a much better feel than from some magic number statistic
hist(residuals(fit1))
hist(residuals(fit2))
hist(residuals(fit3))

# especially qqnorm() plots, which are linear for normal data
qqnorm(residuals(fit1))
qqnorm(residuals(fit2))
qqnorm(residuals(fit3))

# Restore plot parameters
par(op)

If the data are not normally distributed I have to use the Kruskal-Wallis test and not the ANOVA... so please help me understand. Indeed - Kruskal-Wallis is a good test to use for one-factor data that is ordinal, so it is a good alternative to your fit3. Your response seems to be a discrete variable rather than a continuous variable. You must decide if it is reasonable to approximate it with a normal distribution, which is by definition continuous. I make a numerical example; could you please tell me if the data in this table are normally distributed or not? Help!
number stimulus              condition response
1      flat_550_W_realism    A         3
2      flat_550_W_realism    A         3
3      flat_550_W_realism    A         5
4      flat_550_W_realism    A         3
5      flat_550_W_realism    A         3
6      flat_550_W_realism    A         3
7      flat_550_W_realism    A         3
8      flat_550_W_realism    A         5
9      flat_550_W_realism    A         3
10     flat_550_W_realism    A         3
11     flat_550_W_realism    A         5
12     flat_550_W_realism    A         7
13     flat_550_W_realism    A         5
14     flat_550_W_realism    A         2
15     flat_550_W_realism    A         3
16     flat_550_W_realism    AH        7
17     flat_550_W_realism    AH        4
18     flat_550_W_realism    AH        5
19     flat_550_W_realism    AH        3
20     flat_550_W_realism    AH        6
21     flat_550_W_realism    AH        5
22     flat_550_W_realism    AH        3
23     flat_550_W_realism    AH        5
24     flat_550_W_realism    AH        5
25     flat_550_W_realism    AH        7
26     flat_550_W_realism    AH        2
27     flat_550_W_realism    AH        7
28     flat_550_W_realism    AH        5
29     flat_550_W_realism    AH        5
30     bump_2_step_W_realism A         1
31     bump_2_step_W_realism A         3
32     bump_2_step_W_realism A         5
33     bump_2_step_W_realism A         1
34     bump_2_step_W_realism A         3
35     bump_2_step_W_realism A         2
36     bump_2_step_W_realism A         5
37     bump_2_step_W_realism A         4
38     bump_2_step_W_realism A         4
39     bump_2_step_W_realism A         4
40     bump_2_step_W_realism A         4
41     bump_2_step_W_realism AH        3
42     bump_2_step_W_realism AH        5
43     bump_2_step_W_realism AH        1
44     bump_2_step_W_realism AH        5
45     bump_2_step_W_realism AH        4
46     bump_2_step_W_realism AH        4
47     bump_2_step_W_realism AH        5
48     bump_2_step_W_realism AH        4
49     bump_2_step_W_realism AH        3
50     bump_2_step_W_realism AH        4
51     bump_2_step_W_realism AH        5
52     bump_2_step_W_realism AH        4
53     hole_2_step_W_realism A         3
54     hole_2_step_W_realism A         3
55     hole_2_step_W_realism A         4
56     hole_2_step_W_realism A         1
57     hole_2_step_W_realism A         4
58     hole_2_step_W_realism A         3
59     hole_2_step_W_realism A         5
60     hole_2_step_W_realism A         4
61     hole_2_step_W_realism A         3
62     hole_2_step_W_realism A         4
63     hole_2_step_W_realism A         7
64     hole_2_step_W_realism A         5
65     hole_2_step_W_realism A         1
66
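The visual checks recommended above can be complemented by a formal test on the residuals; a sketch on a simulated fit, since the `scrd` data frame is only partially reproduced here:

```r
# Invented two-group data in the same shape as the poster's table
set.seed(1)
y <- c(rnorm(30, mean = 4), rnorm(30, mean = 5))
g <- gl(2, 30, labels = c("A", "AH"))
fit <- lm(y ~ g)
# Shapiro-Wilk applied to the residuals, not to the raw response
shapiro.test(residuals(fit))
```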
[R] Nnet and AIC: selection of a parsimonious parameterisation
Hi All, I am trying to use a neural network for my work, but I am not sure about my approach to selecting a parsimonious model. In R with nnet, the AIC has not been defined for a feed-forward neural network with a single hidden layer. Is this because it does not make sense mathematically in this case? For example, is this pseudo code sensible? Thanks in advance for your help. I am sorry if this has been answered before, but I haven't found an answer for this in the archive. Below, I have added an implementation of this idea based on the MASS (Modern Applied Statistics with S) code of chapter 8. Cheers, Ben

Pseudo code: Define RSS as: RSS = (1-alpha)*RSS(identification set) + alpha*RSS(validation set) and AIC as: AIC = 2*np + N*log(RSS) where np corresponds to the non-null parameters of the neural network and N is the sample size (based on http://en.wikipedia.org/wiki/Akaike_information_criterion). Assuming a feed-forward neural network with a single hidden layer and a maximum number of neurons (maxSize): For size = 1 to maxSize: optimise the decay; select the neural network with the smallest AIC for a given size and decay using random starting parameterisation and random identification set. For the lowest to the largest diagonal element of the Hessian: equate the corresponding parameter to 0; if AIC(i) > AIC(i-1), break. The neural network selected is the one with the smallest AIC.
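The AIC in the pseudo code is easy to compute by hand for a fitted `nnet` object; a minimal sketch with invented data:

```r
library(nnet)
set.seed(1)
x <- matrix(runif(200), 100, 2)
y <- x[, 1] + x[, 2]^2 + rnorm(100, sd = 0.1)
nn  <- nnet(x, y, size = 3, linout = TRUE, trace = FALSE)
RSS <- sum((y - predict(nn, x))^2)   # residual sum of squares on the fit
np  <- sum(nn$wts != 0)              # count of non-null parameters
aic <- 2 * np + length(y) * log(RSS) # the poster's AIC definition
```

Note this uses the training RSS; the pseudo code's blended identification/validation RSS needs the extra `alpha` weighting shown above.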
an example based on cpus data in Chapter 8 of MASS

library(nnet)
library(MASS)
# From Chapter 6, for comparisons
set.seed(123)
cpus.samp <- c(3, 5, 6, 7, 8, 10, 11, 16, 20, 21, 22, 23, 24, 25, 29, 33,
  39, 41, 44, 45, 46, 49, 57, 58, 62, 63, 65, 66, 68, 69, 73, 74, 75, 76,
  78, 83, 86, 88, 98, 99, 100, 103, 107, 110, 112, 113, 115, 118, 119, 120,
  122, 124, 125, 126, 127, 132, 136, 141, 144, 146, 147, 148, 149, 150, 151,
  152, 154, 156, 157, 158, 159, 160, 161, 163, 166, 167, 169, 170, 173, 174,
  175, 176, 177, 183, 184, 187, 188, 189, 194, 195, 196, 197, 198, 199, 202,
  204, 205, 206, 208, 209)
cpus2 <- cpus[, 2:8]  # excludes names, authors' predictions
attach(cpus2)
cpus3 <- data.frame(syct = syct-2, mmin = mmin-3, mmax = mmax-4,
                    cach = cach/256, chmin = chmin/100, chmax = chmax/100, perf)
detach()

CVnn.cpus <- function(formula, data = cpus3[cpus.samp, ],
                      maxSize = 10, decayRange = c(0, 0.2),
                      nreps = 5, nifold = 10, alpha = 9/10,
                      linout = TRUE, skip = TRUE, maxit = 1000, ...) {
  # nreps = number of attempts to fit a nnet model with randomly chosen
  # initial parameters; the one with the smallest RSS on the training
  # data is then chosen
  nnWtsPrunning <- function(nn, data, alpha, i) {
    truth <- log10(data$perf)
    RSS <- (1-alpha)*sum((truth[ri != i] - predict(nn, data[ri != i,]))^2) +
           alpha*sum((truth[ri == i] - predict(nn, data[ri == i,]))^2)
    AIC <- 2*sum(nn$wts != 0) + length(data$perf)*log(RSS)
    nn.tmp <- nn
    for (j in (1:length(nn$wts))) {
      nn.tmp$wts[order(diag(nn.tmp$Hessian))[j]] <- 0
      RSS.tmp <- (1-alpha)*sum((truth[ri != i] - predict(nn.tmp, data[ri != i,]))^2) +
                 alpha*sum((truth[ri == i] - predict(nn.tmp, data[ri == i,]))^2)
      AIC.tmp <- 2*sum(nn.tmp$wts != 0) + length(data$perf)*log(RSS.tmp)
      if (is.nan(AIC.tmp) || AIC.tmp > AIC) {
        cat('\n j', j, 'AIC', AIC.tmp, 'AIC_1', AIC, '\n')
        break
      } else {
        nn <- nn.tmp; AIC <- AIC.tmp; RSS <- RSS.tmp
      }
    }
    list(choice = sqrt(RSS/100), nparam = sum(nn$wts != 0), AIC = AIC, nn = nn)
  }
  # Modified function for optimisation
  CVnn1 <- function(decay, formula, data, nreps = 1, ri, size, linout, skip, maxit,
                    optimFlag = FALSE, alpha) {
    truth <- log10(data$perf)
    nn <- nnet(formula, data[ri != 1,], trace = FALSE, size = size,
               linout = linout, skip = skip, maxit = maxit, Hess = TRUE)
    RSS <- (1-alpha)*sum((truth[ri != 1] - predict(nn, data[ri != 1,]))^2) +
           alpha*sum((truth[ri == 1] - predict(nn, data[ri == 1,]))^2)
    ii <- 1
    for (i in sort(unique(ri))) {
      for (rep in 1:nreps) {
        nn.tmp <- nnet(formula, data[ri != i,], trace = FALSE, size = size,
                       linout = linout, skip = skip, maxit = maxit, Hess = TRUE)
        RSS.tmp <- (1-alpha)*sum((truth[ri != i] - predict(nn.tmp, data[ri != i,]))^2) +
                   alpha*sum((truth[ri == i] - predict(nn.tmp, data[ri == i,]))^2)
        if (RSS.tmp < RSS) { RSS <- RSS.tmp; nn <- nn.tmp; ii <- i }
      }
    }
    if (optimFlag) {
      return(RSS)
    } else {
      prn <- nnWtsPrunning(nn, data, alpha, ii)
      list(choice = prn$choice, nparam = prn$nparam,
           nparaminit = length(nn$wts), AIC = prn$AIC, nn1 = prn$nn)
    }
  }
  maxSize <- maxSize + 1; j <- 1
  choice <- numeric(maxSize); nparam <- numeric(maxSize)
  lambdaj <- numeric(maxSize)
  AIC <- numeric(maxSize); nparamInit <-
[R] plot(aModel) vs. influence.measures()
A while back I asked about getting a list of points that R considers influential after fitting a linear model, and very quickly got a helpful pointer to influence.measures(). But it has happened again. The trouble I am having is that points marked on plots are not flagged in the output from influence.measures(), and I can't read them on the plots. I tried some successive deletion, but then other points (naturally) start to look troublesome. Is there a good way to get a list of suspicious entries at the beginning? In this case, I am trying to help identify possible data entry errors, and I am interested in knowing what R bothered to mark up front. Perhaps the defaults should be telling me that what I want to do is silly, but it sure _seems_ like it would be helpful. Is there a way to control the threshold used by influence.measures() to get it to flag more items at one time? I am learning the hard way, so feel free to tell me that I should be trying to do this some other way. Bill
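To get the up-front list Bill asks for, the `is.inf` logical matrix inside the result can be reduced to row numbers; shown on the built-in `cars` data as a stand-in for his unposted model:

```r
fit <- lm(dist ~ speed, data = cars)
im  <- influence.measures(fit)
which(apply(im$is.inf, 1, any))  # rows flagged by at least one measure
summary(im)                      # compact printout of only those rows
```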
[R] OT: Reducing pdf file size
Greetings, Does anyone have any suggestions for reducing pdf file size, particularly pdfs containing photos, without sacrificing quality? Thanks for any tips in advance. Cheers Kurt *** Kurt Lewis Helf, Ph.D. Ecologist EEO Counselor National Park Service Cumberland Piedmont Network P.O. Box 8 Mammoth Cave, KY 42259 Ph: 270-758-2163 Lab: 270-758-2151 Fax: 270-758-2609 Science, in constantly seeking real explanations, reveals the true majesty of our world in all its complexity. -Richard Dawkins The scientific tradition is distinguished from the pre-scientific tradition in having two layers. Like the latter it passes on its theories but it also passes on a critical attitude towards them. The theories are passed on not as dogmas but rather with the challenge to discuss them and improve upon them. -Karl Popper ...consider yourself a guest in the home of other creatures as significant as yourself. -Wayside at Wilderness Threshold in McKittrick Canyon, Guadalupe Mountains National Park, TX Cumberland Piedmont Network (CUPN) Homepage: http://tiny.cc/e7cdx CUPN Forest Pest Monitoring Website: http://bit.ly/9rhUZQ CUPN Cave Cricket Monitoring Website: http://tiny.cc/ntcql CUPN Cave Aquatic Biota Monitoring Website: http://tiny.cc/n2z1o
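One R-side option worth knowing about (it postdates this thread slightly, and it needs qpdf and/or Ghostscript installed): `tools::compactPDF()`. The directory name below is hypothetical.

```r
# Lossless re-compression of PDF streams via qpdf; supplying gs_quality
# additionally routes the files through Ghostscript, which downsamples
# embedded photos ("ebook" roughly targets 150 dpi).
tools::compactPDF("figures", gs_quality = "ebook")
```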
Re: [R] integration Sweave and TexMakerX
On 05/01/2011 12:04 PM, Sebastián Daza wrote:

Hi, does anyone know how to integrate TexMakerX and Sweave on Windows? I mean, to run .Rnw files directly from TexMakerX and get a PDF or DVI file.

I don't know TexMakerX, but the patchDVI package (on R-Forge, see https://r-forge.r-project.org/R/?group_id=233) contains some functions for hooking up Sweave with other LaTeX editors. If it's not flexible enough to handle yours, I'd like to hear what's missing, and I'd probably add it.

Duncan Murdoch
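If TexMakerX can be pointed at an arbitrary build command, one common setup (a sketch; it assumes R and pdflatex are on the PATH, and report.Rnw is a placeholder name) is simply to chain the two steps:

```shell
R CMD Sweave report.Rnw   # weaves the .Rnw source into report.tex
pdflatex report.tex       # typesets report.tex into report.pdf
```

Most LaTeX editors let you register this pair as a custom "quick build" command so the .Rnw file compiles with one keystroke.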
[R] Prediction error for Ordinary Kriging
Hi all,

Can you please help me on how to determine the prediction error for ordinary kriging? Below are all the commands I used to generate the OK plot:

rsa2 <- readShapeSpatial("residentialsa", CRS("+proj=tmerc +lat_0=39.66 +lon_0=-8.1319062 +k=1 +x_0=0 +y_0=0 +ellps=intl +units=m +no_defs"))
x2 <- readShapeSpatial("ptna2", CRS("+proj=tmerc +lat_0=39.66 +lon_0=-8.1319062 +k=1 +x_0=0 +y_0=0 +ellps=intl +units=m +no_defs"))
bb <- bbox(rsa2)
cs <- c(1, 1)
cc <- bb[, 1] + (cs/2)
cd <- ceiling(diff(t(bb))/cs)
rsa2_grd <- GridTopology(cellcentre.offset = cc, cellsize = cs, cells.dim = cd)
getClass("SpatialGrid")
p4s <- CRS(proj4string(rsa2))
x2_SG <- SpatialGrid(rsa2_grd, proj4string = p4s)
x2_SP <- SpatialPoints(cbind(x2$X, x2$Y))
v <- variogram(log1p(tsport_ace) ~ 1, x2, cutoff = 100, width = 9)
ve.fit <- fit.variogram(v, vgm(0.0437, "Exp", 26, 0))
y <- krige(tsport_ace ~ 1, x2, x2_SG, model = ve.fit)
spplot(y, 1, col.regions = bpy.colors(100), sp.layout = list("sp.lines", as(rsa2, "SpatialLines"), no.clip = TRUE))

I'm looking forward to your response. Thanks.

Best regards,
Pearl dela Cruz
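The question has a direct answer in gstat: krige() returns the kriging variance alongside the predictions, and its square root is the prediction standard error. A self-contained sketch using gstat's bundled meuse demo data rather than Pearl's layers:

```r
library(sp)
library(gstat)

# Standard gstat demo data, standing in for the poster's shapefiles
data(meuse);      coordinates(meuse) <- ~x+y
data(meuse.grid); coordinates(meuse.grid) <- ~x+y
gridded(meuse.grid) <- TRUE

v  <- variogram(log(zinc) ~ 1, meuse)
vf <- fit.variogram(v, vgm(1, "Sph", 900, 1))
ok <- krige(log(zinc) ~ 1, meuse, meuse.grid, model = vf)

# var1.pred holds the OK predictions; var1.var holds the kriging variance,
# whose square root is the prediction (kriging) standard error
ok$se <- sqrt(ok$var1.var)
spplot(ok, "se", col.regions = bpy.colors(100))
```

So for the code above, y$var1.var (and sqrt of it) is the prediction error surface, and it can be mapped with spplot just like the predictions.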
Re: [R] speed up in R apply
On Jan 5, 2011, at 2:40 PM, Douglas Bates wrote:

On Wed, Jan 5, 2011 at 1:22 PM, David Winsemius dwinsem...@comcast.net wrote:

On Jan 5, 2011, at 10:03 AM, Young Cho wrote:

Hi, I am doing some simulations and found a bottleneck in my R script. I made an example:

a <- matrix(rnorm(500), 100, 5)
tt <- Sys.time(); sum(a[,1]*a[,2]*a[,3]*a[,4]*a[,5]); Sys.time() - tt
[1] -1291.026
Time difference of 0.2354031 secs
tt <- Sys.time(); sum(apply(a, 1, prod)); Sys.time() - tt
[1] -1291.026
Time difference of 20.23150 secs

Is there a faster way of calculating the sum of products (of columns, or of rows)?

You should look at crossprod and tcrossprod.

Hmm. Not sure that would help, David. You could use a matrix multiplication, a %*% rep(1, ncol(a)), if you wanted the row sums, but of course you could also use rowSums to get those.

Thanks for pointing that out. I misread the OP's code.

And is this expected behavior?

Yes. For loops and *apply strategies are slower than the proper use of vectorized functions. To expand a bit on David's point, the apply function isn't magic. It essentially loops over the rows, in this case. By multiplying columns together you are performing the looping over the rows in compiled code, which is much, much faster. If you want to do this kind of operation efficiently in R for a general matrix (i.e. not knowing in advance that it has exactly 5 columns) you could use Reduce:

a <- matrix(rnorm(500), 100, 5)
system.time(pr1 <- a[,1]*a[,2]*a[,3]*a[,4]*a[,5])
   user  system elapsed
   0.15    0.09    0.37
system.time(pr2 <- apply(a, 1, prod))
   user  system elapsed
 22.090   0.140  22.902
all.equal(pr1, pr2)
[1] TRUE
system.time(pr3 <- Reduce(get("*"), as.data.frame(a), rep(1, nrow(a))))

Slightly faster would be:

system.time(pr3 <- Reduce("*", as.data.frame(a)))

And thanks for the nice example. Using a data.frame to feed Reduce materially enhances its value to me.
   user  system elapsed
  0.410   0.010   0.575
all.equal(pr3, pr2)
[1] TRUE

--
David Winsemius, MD
West Hartford, CT
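The equivalence the thread relies on is easy to check in base R alone; this is a correctness sketch, not a benchmark:

```r
set.seed(1)
a <- matrix(rnorm(500), 100, 5)

pr_cols   <- a[,1] * a[,2] * a[,3] * a[,4] * a[,5]  # vectorised column product
pr_apply  <- apply(a, 1, prod)                      # row-by-row looping
pr_reduce <- Reduce(`*`, as.data.frame(a))          # general: works for any ncol(a)

stopifnot(all.equal(pr_cols, pr_apply), all.equal(pr_cols, pr_reduce))
```

Reduce folds the list of data-frame columns left to right with `*`, so it performs the same multiplications as the explicit five-column expression while staying in vectorised code.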
[R] Advice on obscuring unique IDs in R
Dear colleagues,

This may be a question with a really obvious answer, but I can't find it. I have access to a large file with real medical record identifiers (mixed strings of characters and numbers) in it. These represent medical events for many thousands of people. It's important to be able to link events for the same people. It's much more important that the real record numbers are strongly obscured. I'm interested in some kind of strong one-way hash function to which I can feed the real numbers and get back unique codes for each record identifier fed in. I can do this on the health service system, and I have to do this before making further use of the data!

There is the digest() function, in the digest package, but it seems to hash the whole vector of IDs as a single object, producing, in my case, a vector with 60,000 identical entries:

H.Out$P_ID <- digest(H.In$MRNr, serialize = FALSE, algo = 'md5')

I could do this in Perl, but I'd have to do quite a bit of work to get it installed. Any quick suggestions?

Anthony Staines
--
Anthony Staines, Professor of Health Systems Research,
School of Nursing, Dublin City University, Dublin 9, Ireland.
Tel: +353 1 700 7807. Mobile: +353 86 606 9713
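A sketch of the per-record hashing being asked for, assuming the digest package. The salt value and IDs here are invented; a secret salt matters because unsalted hashes of record numbers can be reversed by simply hashing every plausible number:

```r
library(digest)

salt <- "keep-this-secret"               # hypothetical salt; store it securely
ids  <- c("MRN001", "MRN002", "MRN001")  # made-up identifiers

# digest() hashes one object at a time, so map it over the vector
hashes <- vapply(paste0(salt, ids), digest, character(1),
                 algo = "sha256", serialize = FALSE)

# identical inputs map to identical hashes, so record linkage is preserved
```

With H.In$MRNr in place of ids, this yields one obscured code per row instead of one hash of the whole column.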
Re: [R] Comparing fitting models
Just do

anova(fit3, fit1)

This compares those two models directly.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Frodo Jedi
Sent: Wednesday, January 05, 2011 10:10 AM
To: r-help@r-project.org
Subject: [R] Comparing fitting models

Dear all,

I have 3 models (from simple to complex) and I want to compare them in order to see whether they fit equally well or not. From the R prompt I am not able to see where I can get this information. Let's do an example:

fit1 <- lm(response ~ stimulus + condition + stimulus:condition, data=scrd)  # equivalent to lm(response ~ stimulus*condition, data=scrd)
fit2 <- lm(response ~ stimulus + condition, data=scrd)
fit3 <- lm(response ~ condition, data=scrd)

anova(fit2, fit1)  # compare models

Analysis of Variance Table

Model 1: response ~ stimulus + condition
Model 2: response ~ stimulus + condition + stimulus:condition
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1    165 364.13
2    159 362.67  6     1.465 0.1071 0.9955

anova(fit3, fit2, fit1)  # compare models

Analysis of Variance Table

Model 1: response ~ condition
Model 2: response ~ stimulus + condition
Model 3: response ~ stimulus + condition + stimulus:condition
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1    171 382.78
2    165 364.13  6    18.650 1.3628 0.2328
3    159 362.67  6     1.465 0.1071 0.9955

How can I tell whether the simple model fits as well as the complex model (the one with the interaction)?

Thanks in advance
All the best
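Greg's point can be seen end to end on simulated data (a sketch; the variable names are invented and only the shape of the comparison matters):

```r
set.seed(42)
d <- data.frame(x = rnorm(100), g = gl(2, 50))
d$y <- 1 + 2 * d$x + rnorm(100)      # data generated WITHOUT an x:g interaction

fit_small <- lm(y ~ x + g, data = d)
fit_big   <- lm(y ~ x * g, data = d)

# A large Pr(>F) here says the extra interaction term explains essentially
# nothing beyond what the simpler model already captures
anova(fit_small, fit_big)

AIC(fit_small, fit_big)              # information criteria give a second view
```

Note that anova() requires the models to be nested and fitted to the same data; AIC does not require nesting, which makes it a useful complement.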
[R] Reading large SAS dataset in R
Hi all,

I have a large (approx. 1 GB) SAS dataset (test.sas7bdat) located on the server (R:/ directory). I have SAS 9.1 installed on my PC and I can read the SAS dataset in SAS, under a Windows environment, after assigning a libname to the R:\ directory. Now I am trying to read the SAS dataset in R (R 2.12.0) using the read.ssd function of the foreign package, but I get an error message "SAS failed". I believe I have specified the paths correctly (after reading some previous posts I made sure that I did it right). Below is the small code:

sashome <- "C:/Program Files/SAS/SAS 9.1"
read.ssd(libname = "R:/", sectionnames = "test", sascmd = file.path(sashome, "sas.exe"))

Please let me know where I am making the mistake. Is it because of the size of the file, or the location of the file (on the server instead of the local hard drive)?

Thanks in advance,
Santanu
--
Santanu Pramanik
Survey Statistician
NORC at the University of Chicago
Bethesda, MD
Re: [R] How to 'explode' a matrix
Thanks, Henrique. The second option you suggested is about twice as fast as my original application. Much appreciated, Kevin

On Jan 5, 2011, at 6:30 PM, Henrique Dallazuanna wrote:

Try this:

apply(apply(m, 2, rep, each = 2), 1, rep, each = 2)

or

m[rep(seq(nrow(m)), each = 2), rep(seq(ncol(m)), each = 2)]

On Wed, Jan 5, 2011 at 10:03 AM, Kevin Ummel kevinum...@gmail.com wrote:

Hi everyone, I'm looking for a way to 'explode' a matrix like this:

matrix(1:4, 2, 2)
     [,1] [,2]
[1,]    1    3
[2,]    2    4

into a matrix like this:

matrix(c(1,1,2,2,1,1,2,2,3,3,4,4,3,3,4,4), 4, 4)
     [,1] [,2] [,3] [,4]
[1,]    1    1    3    3
[2,]    1    1    3    3
[3,]    2    2    4    4
[4,]    2    2    4    4

My current kludge is this:

v1 = rep(1:4, each = 2, times = 2)
v2 = v1[order(rep(1:2, each = 4, times = 2))]
matrix(v2, 4, 4)

But I'm hoping there's a more efficient solution that I'm not aware of.

Many thanks,
Kevin

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
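For what it's worth, the Kronecker product gives the same "explosion" in a single base-R call (a sketch, not from the thread):

```r
m <- matrix(1:4, 2, 2)

# %x% is kronecker(): each element of m multiplies a 2x2 block of ones,
# so every entry is replicated into a 2x2 block
exploded <- m %x% matrix(1, 2, 2)
exploded
```

Changing matrix(1, 2, 2) to matrix(1, k, k) generalises the block size, with no index bookkeeping.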
[R] Heat map in R
Hello,

I am trying to make a heatmap in R and am having some trouble. I am very new to the world of R, but have been told that what I am trying to do should be possible. I want to make a heat map that looks like a gene expression heatmap (see http://en.wikipedia.org/wiki/Heat_map). I have 43 samples and 900 genes (yes, I know this will be a huge map). I also have copy numbers associated with each gene/sample and need these to be represented as the colour intensities on the heat map. There are multiple genes per sample, with different copy numbers. I think my trouble may be how I am setting up my data frame. My data frame was created in Excel as a tab-delimited text file:

Gene  Copy Number  Sample ID
A     1935         01
B     2057         01
C     2184         02
D     1498         03
E     2294         03
F     2485         03
G     1560         04
H     3759         04
I     2792         05
J     7081         05
K     1922         06
...   ...          ...
ZZZ   1354         43

My code in R is something like this:

data <- read.table("/Users/jsmt/desktop/test.txt", header = TRUE)
data_matrix <- data.matrix(data)
data_heatmap <- heatmap(data_matrix, Rowv = NA, Colv = NA, col = cm.colors(256), scale = "column", margins = c(5, 10))

I end up getting a heat map split into 3 columns (sample, depth, gene) and the colours are just big blocks that don't mean anything. Can anyone help me with my data frame or my R code? Again, I am fairly new to R, so if you can help, please give me very detailed help :) Thanks in advance!
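The missing step is reshaping the long three-column table into a genes-by-samples matrix before calling heatmap(). A sketch with a tiny invented version of the data (column names guessed from the post):

```r
d <- data.frame(Gene   = c("A", "B", "C", "D"),
                Copy   = c(1935, 2057, 2184, 1498),
                Sample = c("01", "01", "02", "02"))

# xtabs() pivots long -> wide: one row per gene, one column per sample;
# gene/sample pairs absent from the data come out as 0
mat <- xtabs(Copy ~ Gene + Sample, data = d)

heatmap(as.matrix(mat), Rowv = NA, Colv = NA,
        col = cm.colors(256), scale = "column", margins = c(5, 10))
```

Passing the three raw columns straight to data.matrix() is what produces the meaningless three-column image; heatmap() wants one cell per gene/sample pair.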
[R] Match numeric vector against rows in a matrix?
Two posts in one day is not a good day... and this question seems like it should have an obvious answer: I have a matrix where rows are unique combinations of 1's and 0's:

combs = as.matrix(expand.grid(c(0,1), c(0,1)))
combs
     Var1 Var2
[1,]    0    0
[2,]    1    0
[3,]    0    1
[4,]    1    1

I want a single function that will give the row index containing an exact match with vector x:

x = c(0,1)

The solution needs to be applied many times, so I need something quick; I was hoping a base function would do it, but I'm drawing a blank. Thanks! Kevin
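One quick base-R approach (a sketch): transpose so that `==` recycles x against every row at once, then count complete matches:

```r
combs <- as.matrix(expand.grid(c(0, 1), c(0, 1)))
x <- c(0, 1)

# t(combs) is 2 x 4, so `== x` compares x element-wise with each row of combs;
# a column of the comparison summing to length(x) is an exact row match
hit <- which(colSums(t(combs) == x) == length(x))
unname(hit)
```

This stays fully vectorised, so it should be fast enough to call many times; an apply()-per-row alternative exists but loops in R.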
Re: [R] Advice on obscuring unique IDs in R
Dr. Anthony wrote on 01/05/2011 01:19:49 PM:

This may be a question with a really obvious answer, but I can't find it. I have access to a large file with real medical record identifiers (mixed strings of characters and numbers) in it. ...

It's not that trivial of a question, or more organizations would have gotten it right. I bet a method (or two) for obscuring PII is recommended by your university or department. When that method has been determined, the requisite R package will probably be easy to find, and down the road you'll dodge the bullet of "I thought it would work" by not guessing at a method.

cur
--
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
seeliger.c...@epa.gov
541/754-4638
Re: [R] Converting Fortran or C++ etc to R
Thanks Barry, and thanks to others who replied off-list. I can see that I should have given more details about my motives for wanting to replace a Fortran program with an R one. At this stage I want to get something working in pure R, because it is easier to fool around with and tweak than Fortran, and I have a few things that I want to try out that will involve perturbing the original code; I think I'd rather be doing that in R than in a 3GL.

Now that I have publicly asked the question, the answer occurs to me: the program that I want to port to R is an ML estimation by the EM algorithm. The iterative steps are fairly simple, except that they need to be repeated a large number of times. What I have noticed is that I can (maybe) replace the within-step loops by matrix multiplications. This means that, by using %*%, I will effectively be handing a lot of the work to external Fortran (or similar) routines without calling .Fortran().

OK, I know that you can see through me, and I accept that I am just rationalising my reluctance to get into package-writing. I will bite the bullet on that in due course, but for the meantime I'm just going to fool around with straight R. Barry came closest to answering my real question, so I will formulate a follow-up question as follows: does anyone know of a helpful set of examples of the vectorization of code?

Cheers, Murray

On 6/01/2011 12:32 a.m., Barry Rowlingson wrote:

On Wed, Jan 5, 2011 at 7:33 AM, lcn lcn...@gmail.com wrote:

As for your actual requirement to do the conversion, I guess there are no quick ways. You have to be familiar with both R and the other language to make the rewrite work.

To make the rewrite work _well_ is the bigger problem! The easiest way to big performance wins is going to be spotting vectorisation possibilities in the Fortran code. Any time you see a DO K=1,N loop, look to see if it's just a single vector operation in R.
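In the spirit of Murray's follow-up question, a toy example (invented, not his EM code) of a Fortran-style double loop collapsing into a single matrix product:

```r
set.seed(7)
A <- matrix(rnorm(30), 5, 6)
w <- rnorm(6)

# transliterated DO-loop style: accumulate each row's weighted sum by hand
out_loop <- numeric(nrow(A))
for (i in seq_len(nrow(A))) {
  s <- 0
  for (k in seq_len(ncol(A))) s <- s + A[i, k] * w[k]
  out_loop[i] <- s
}

# vectorised: both loops become one %*%, which runs in compiled BLAS code
out_vec <- drop(A %*% w)

stopifnot(all.equal(out_loop, out_vec))
```

The same pattern scales up: many EM-style inner loops are, once written out, nothing but matrix-vector or matrix-matrix products.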
Another way to big wins is to write test code, so you can check whether your R code gives the same results as the Fortran (C/C++) code at every stage of the rewrite. Don't just write it all in one go and then hope it works! Small steps.

Barry

--
Dr Murray Jorgensen  http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: m...@waikato.ac.nz  Fax +64 7 838 4155
Phone +64 7 838 4773 wk  Home +64 7 825 0441  Mobile 021 0200 8350