date:20111125




On 24.11.2011 16:49, Filoche wrote:

Thank you for your precious help. It works fine.

However, what about if I have two entries in the legend?

I tried:

legend('topright', inset = .05, title = 'light ratios', pch = c(21,22),
  legend = c(substitute('Green/Red' ~~ R^2 == r2,
list(r2=r2)),substitute('Green/Red' ~~ R^2 == r2, list(r2=r2))),
  horiz = FALSE, pt.bg = c('gray', 'black'), cex = 0.75,
  bg = 'white')

But it does not work.



There is an example in:

Ligges, U. (2002): R Help Desk: Automation of Mathematical Annotation in 
Plots. R News 2 (3), 32-34.


Applying that to your problem yields:

plot(1)
r2 <- 0.9
termA <- substitute('A: Green/Red' ~~ R^2 == r2, list(r2=r2))
termB <- substitute('B: Green/Red' ~~ R^2 == r2, list(r2=r2))
legend('topright', inset = .05, title = 'light ratios', pch = c(21,22),
   legend = do.call(expression, list(termA, termB)),
   horiz = FALSE, pt.bg = c('gray', 'black'), cex = 0.75,
   bg = 'white')
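
When each legend entry has its own fitted value, the same idea can be applied over a vector. This is only a sketch: the vector `r2_values` and its numbers are made up for illustration, and bquote() is used here in place of substitute().

```r
# Hypothetical R^2 values, one per legend entry (not from the original post)
r2_values <- c(A = 0.90, B = 0.75)

# Build one unevaluated plotmath call per entry; .() splices in the values
entries <- lapply(names(r2_values), function(nm) {
  bquote(.(paste0(nm, ": Green/Red")) ~~ R^2 == .(r2_values[[nm]]))
})

# Turn the list of calls into an expression vector that legend() accepts
legend_labels <- as.expression(entries)

plot(1)
legend("topright", inset = 0.05, title = "light ratios", pch = c(21, 22),
       legend = legend_labels, horiz = FALSE,
       pt.bg = c("gray", "black"), cex = 0.75, bg = "white")
```

The key point is the same as above: legend() needs an expression vector, not a plain c() of substitute() results.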

Best,
Uwe Ligges



Regards,
Phil

--
View this message in context: 
http://r.789695.n4.nabble.com/Legend-tp4103799p4104386.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Objects disappearing in my R work space




On 25.11.2011 05:07, Aldo wrote:

It works when I do not have multiple windows open,


Please read the posting guide! You failed to follow it in multiple ways;
the most important one so far: quote the thread! We know neither your
original question nor the answers you have received so far. This is the R
mailing list, which is unfortunately sometimes misused by Nabble users.


Please define multiple windows.  Do you mean graphics devices, R 
instances, ...?






but is not there when I do
have multiple windows open, so I don't think it has to do with the
function. The function is pretty complicated but I can share more if you
think it will help.


Actually, the posting guide asks you to provide commented, *minimal*, 
self-contained, reproducible code.


Uwe Ligges





--
View this message in context: 
http://r.789695.n4.nabble.com/Objects-disappearing-in-my-R-work-space-tp4104389p4106379.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Objects disappearing in my R work space




On 25.11.2011 05:12, Aldo wrote:

Is there a maximum memory allocation for all R windows open? because it is
like 1-3 million runs



So you mean you open a million windows at the same time? In that case we 
really need your definition of window.



 so... it may be reaching some sort of memory limit


I do not know if any OS / window manager has the capability to open that
many windows. But as I said, we need some definitions and
examples.


Uwe Ligges


--
View this message in context: 
http://r.789695.n4.nabble.com/Objects-disappearing-in-my-R-work-space-tp4104389p4106390.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Objects disappearing in my R work space

On 25.11.2011 16:08, Michael Clawson wrote:

Uwe,

by window I mean instances, by runs I mean runs of my Markov-Chain Monte
Carlo simulator

I open two instances of R, run a million cycle chain in each instance, and
when they finish, neither window has the object I defined to store the runs.

I tested this morning and when I open two R windows and run a 5k cycle
chain in each instance, neither window has the object I defined to store
the runs.

This does not happen when I only have one instance of R open

Please: provide commented, minimal, self-contained, reproducible code.

Uwe Ligges

2011/11/25 Uwe Ligges lig...@statistik.tu-dortmund.de

On 25.11.2011 05:12, Aldo wrote:

Is there a maximum memory allocation for all R windows open? because it is
like 1-3 million runs

So you mean you open a million windows at the same time? In that case we
really need your definition of window.

so... it may be reaching some sort of memory limit

I do not know if any OS / window manager has the capability to open that
many windows. But as I said, we need some definitions and
examples.

Uwe Ligges

View this message in context:
http://r.789695.n4.nabble.com/Objects-disappearing-in-my-R-work-space-tp4104389p4106390.html

Sent from the R help mailing list archive at Nabble.com.


Re: [R] counting values with some conditions in a simulation

How are you computing the sum?  Does FAQ 7.31 apply?  Showing at least a
sample of your code would help.
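
For illustration only, here is a sketch of both points. It assumes the rule is "count a sum of 4 or more, never count a sum below 3, and count a sum of exactly 3 with probability 0.5" (the comparison signs did not survive in the archived question), and it uses made-up binomial sums in place of the poster's simulation.

```r
# FAQ 7.31: exact comparison of floating-point sums can fail
0.1 + 0.2 == 0.3                    # FALSE on typical platforms
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE: comparison with a tolerance

set.seed(42)
n_sim <- 1000

# Hypothetical simulated sums: totals of 5 Bernoulli(0.6) draws (integers)
sums <- rbinom(n_sim, size = 5, prob = 0.6)

# One uniform draw per simulation; it only matters when the sum is exactly 3
u <- runif(n_sim)

# Count sums >= 4 always, and sums == 3 only when the uniform draw is < 0.5
counted <- (sums >= 4) | (sums == 3 & u < 0.5)
n_counted <- sum(counted)
n_counted
```

If the sums are not integers, replacing `sums == 3` with something like `abs(sums - 3) < 1e-8` avoids the problem described in FAQ 7.31.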

On Friday, November 25, 2011, Sl K s.ka...@gmail.com wrote:
 Dear R users,

 I am running simulations (1000), and in my simulation I am looking at
 specific sums. For example, if the sum is =4 then count this, if say 3,
 then don't count, if the sum=3, then generate a random number from uniform
 distribution, if this number is say less than 0.5, then count this sum, if
 greater than 0.5, then don't count. I am having trouble with introducing
 this uniform number and decide whether to count 3 or not.  Any help or
hint
 will be greatly appreciated. Thank you very much in advance



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] counting values with some conditions in a simulation

2011-11-25 Thread Jeff Newmiller

You need to read the posting guide. Provide a reproducible code sample, 
simplified, with self-contained data.
You might find the ave function useful if you are working with vectorized 
simulations.
---
Jeff Newmiller
DCN:jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
--- 
Sent from my phone. Please excuse my brevity.

Sl K s.ka...@gmail.com wrote:

Dear R users,

I am running simulations (1000), and in my simulation I am looking at
specific sums. For example, if the sum is =4 then count this, if say
3,
then don't count, if the sum=3, then generate a random number from
uniform
distribution, if this number is say less than 0.5, then count this sum,
if
greater than 0.5, then don't count. I am having trouble with
introducing
this uniform number and decide whether to count 3 or not.  Any help or
hint
will be greatly appreciated. Thank you very much in advance


Re: [R] On-demand importing of a package




On 23.11.2011 14:59, Gabor Grothendieck wrote:

2011/11/23 Uwe Ligges lig...@statistik.tu-dortmund.de:



On 23.11.2011 03:18, Gabor Grothendieck wrote:


On Tue, Nov 22, 2011 at 3:16 PM, Gábor Csárdi csa...@rmki.kfki.hu wrote:


Dear All,

in some functions of my package, I use the Matrix S4 class, as defined
in the Matrix package.

I don't want to depend on Matrix, however, because my package is
perfectly fine without Matrix; most of the functionality does not need
it. Matrix is therefore listed in the 'Suggests' line.

I load Matrix via require(), from the functions that really need it.
This mostly works fine, but I have an issue now that I cannot sort
out.

If I define a function like this in my package:

f <- function() {
  require(Matrix)
  res <- sparseMatrix(dims=c(5, 5), i=1:5, j=1:5, x=1:5)
  y <- rowSums(res)
  res / y
}

then calling it from the R prompt I get
Error in rowSums(res) : 'x' must be an array of at least two dimensions

which basically means that the rowSums() in the base package is
called, not the S4 generic in the Matrix package. Why is that?
Is there any way to work around this problem, without depending on
Matrix?

I am doing this on R 2.14.0, x86_64-apple-darwin9.8.0.



Try adding these three lines to the package:

rowSums <- function(x, na.rm = FALSE, dims = 1L) UseMethod("rowSums")
rowSums.dgCMatrix <- Matrix:::rowSums
rowSums.default <- base::rowSums




Folks, please don't; just import the relevant functionality from the
*recommended* package Matrix.
Messing around even more is certainly less helpful than importing the relevant
parts from a namespace/package that you will use anyway.
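
One way to make sure the Matrix versions are used, sketched below, is to refer to them explicitly with `::` inside the function. Code in a package namespace finds base::rowSums before anything that require() attaches to the search path, which is why the original f() ends up in the base function. The sketch assumes a version of R that provides requireNamespace() and that Matrix is installed.

```r
f <- function() {
  # Fail with a clear message if the optional package is missing
  if (!requireNamespace("Matrix", quietly = TRUE)) {
    stop("Package 'Matrix' is needed for f() but is not installed.")
  }
  # Explicit Matrix:: calls bypass the base versions of these functions
  res <- Matrix::sparseMatrix(i = 1:5, j = 1:5, x = 1:5, dims = c(5, 5))
  y <- Matrix::rowSums(res)
  res / y
}

out <- f()
out
```

With an importFrom() directive for these functions in the NAMESPACE file, the `Matrix::` prefixes would not be needed.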



The real problem is how to deal with conditional dependencies and
importing is just as much a kludge as anything else.  In the problem
under discussion it has the undesirable property that Matrix is always
imported even though it's almost never needed.

Additional conditional dependency features may be needed in R.  All
the scenarios in which conditional dependencies are involved need to be
thought about since there may be interaction among them.

Some features might be:

- dynamically import another package.
- uncouple package installation and loading.  Right now
install.packages has a dep= argument that causes the Suggests packages
to be installed too.  There should be some way for the package
developer to specify this rather than make the user specify it.  For
example, if Matrix were not a recommended package and most users
wanted to use it in the problem above but a few wanted to use a
package that conflicts with it then it would be nice if the package in
question could force dep=TRUE without having the user do it.  For
example, perhaps there would be an
   Installs: Matrix



Errr, if I understand this correctly, your arguments are now orthogonal 
to your original comments.


Before you told us it is important to be able to run stuff without 
having Matrix available or just load on demand since it may not be 
available to the users. Now you tell us you want to make it available 
without having any need to use it?


Uwe Ligges



line in the DESCRIPTION file to tell it to install Matrix at install
time but not load it automatically at package load time -- the package
would have to require it itself.  (sqldf has this problem since most,
but not all, users want RSQLite but to put it in Suggests would make
most users use install.packages("sqldf", dep = TRUE) which makes it
harder to install whereas putting it in Depends means it's always
loaded and could conflict with some other database backend.)




Re: [R] On-demand importing of a package

2011-11-25 Thread Gabor Grothendieck

2011/11/25 Uwe Ligges lig...@statistik.tu-dortmund.de:


 On 23.11.2011 14:59, Gabor Grothendieck wrote:

 2011/11/23 Uwe Ligges lig...@statistik.tu-dortmund.de:


 On 23.11.2011 03:18, Gabor Grothendieck wrote:

 On Tue, Nov 22, 2011 at 3:16 PM, Gábor Csárdi csa...@rmki.kfki.hu
  wrote:

 Dear All,

 in some functions of my package, I use the Matrix S4 class, as defined
 in the Matrix package.

 I don't want to depend on Matrix, however, because my package is
 perfectly fine without Matrix, most of the functionality does not need
 Matrix. Matrix is so included in the 'Suggests' line.

 I load Matrix via require(), from the functions that really need it.
 This mostly works fine, but I have an issue now that I cannot sort
 out.

 If I define a function like this in my package:

 f <- function() {
   require(Matrix)
   res <- sparseMatrix(dims=c(5, 5), i=1:5, j=1:5, x=1:5)
   y <- rowSums(res)
   res / y
 }

 then calling it from the R prompt I get
 Error in rowSums(res) : 'x' must be an array of at least two dimensions

 which basically means that the rowSums() in the base package is
 called, not the S4 generic in the Matrix package. Why is that?
 Is there any way to work around this problem, without depending on
 Matrix?

 I am doing this on R 2.14.0, x86_64-apple-darwin9.8.0.


 Try adding these three lines to the package:

 rowSums <- function(x, na.rm = FALSE, dims = 1L) UseMethod("rowSums")
 rowSums.dgCMatrix <- Matrix:::rowSums
 rowSums.default <- base::rowSums



 Folks, please don't; just import the relevant functionality from the
 *recommended* package Matrix.
 Messing around even more is certainly less helpful than importing the
 relevant parts from a namespace/package that you will use anyway.


 The real problem is how to deal with conditional dependencies and
 importing is just as much a kludge as anything else.  In the problem
 under discussion it has the undesirable property that Matrix is always
 imported even though it's almost never needed.

 Additional conditional dependency features may be needed in R.  All
 the scenarios in which conditional dependencies are involved need to be
 thought about since there may be interaction among them.

 Some features might be:

 - dynamically import another package.
 - uncouple package installation and loading.  Right now
 install.packages has a dep= argument that causes the Suggests packages
 to be installed too.  There should be some way for the package
 developer to specify this rather than make the user specify it.  For
 example, if Matrix were not a recommended package and most users
 wanted to use it in the problem above but a few wanted to use a
 package that conflicts with it then it would be nice if the package in
 question could force dep=TRUE without having the user do it.  For
 example, perhaps there would be an
   Installs: Matrix


 Errr, if I understand this correctly, your arguments are now orthogonal to
 your original comments.

 Before you told us it is important to be able to run stuff without having
 Matrix available or just load on demand since it may not be available to the
 users. Now you tell us you want to make it available without having any need
 to use it?


I was framing this in terms of the Matrix example, but perhaps it's
easier to understand with the actual example which motivated this for
me.  That is, the feature is that whenever sqldf is installed then
RSQLite is installed too without having RSQLite automatically load
when sqldf loads.

Currently the only way to arrange that is to put RSQLite into Suggests
and then instruct the user to use install.packages(..., dep = TRUE),
say.   The problem with that is that it burdens the user with this
installation detail.

sqldf nearly always uses RSQLite so it should be installed when sqldf
is without the user having to do anything special.  We don't know at
install time whether RSQLite will be used or not but are willing to
have it unnecessarily installed even if it's not needed in order to
make it easier for the majority who do use it.

However, just because RSQLite is installed does not mean that we want
RSQLite to be loaded automatically too.  sqldf can determine whether
the user wants to use the sqlite backend or one of several other
backends and require() RSQLite or not depending on whether it's
actually to be used in that session.

Currently, if RSQLite is in Depends then it's always loaded and if it's
in Suggests then we can't be sure it's been installed so neither of
these work the way we want.  The two things are tied together (i.e.
coupled) but here we want to separate them.  We always want RSQLite to
be installed without making the user specify it on the
install.packages() call yet we want the ability to dynamically
require() it rather than have it automatically loaded when sqldf is
loaded.

One way this might be implemented would be to have an Installs: line,
say, in the DESCRIPTION file which lists packages which are to be
installed at the same time but not automatically loaded.   It would be
the same

Re: [R] pairs(), expression in label and color in text.panel




On 24.11.2011 15:59, Johannes Radinger wrote:

Hello,

I'd like to add custom labels to my pair() plot. These
labels include math expression but they aren't correctly
displayed...


Looks fine for me in R-2.14.0 on the windows() device (alpha, text, 
beta). (Both version and device you used are unspecified)




Further, I want the boxes for the text.panel (diagonal)
to have another background color (grey80). Is that generally
possible? If yes, how do I have to set it?

What I've so far is:


panel.cor <- function(x, y, digits=2, prefix="", cex.cor)
{
usr <- par("usr"); on.exit(par(usr = usr))  # restore coordinates on exit
par(usr = c(0, 1, 0, 1))
r <- abs(cor(x, y))
txt <- format(c(r, 0.123456789), digits=digits)[1]
txt <- paste(prefix, txt, sep="")
if(missing(cex.cor)) cex <- 0.5/strwidth(txt)

test <- cor.test(x,y)
# borrowed from printCoefmat
Signif <- symnum(test$p.value, corr = FALSE, na = FALSE,
cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1),
symbols = c("***", "**", "*", ".", " "))

text(0.5, 0.5, paste(txt,Signif), cex = 2)
}

#correlation pair plot
pairs(df, labels=c(expression(alpha),"text",expression(beta)),
lower.panel=panel.smooth, upper.panel=panel.cor)



Not easily without changing the original code, I think, but you can cheat:

pairs(iris, labels = expression(alpha, text, beta),
lower.panel=panel.smooth, upper.panel=panel.cor,
diag.panel = function(...)
rect(par("usr")[1], par("usr")[3],
 par("usr")[2], par("usr")[4], col="grey80")
)
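
A variation on the same trick, given only as a sketch, is to do the filling in a custom text.panel. It assumes that pairs() calls text.panel with the label position, the label, and `cex`/`font`, after setting the panel's user coordinates to the unit square, as described in ?pairs. The function name `grey_label` is made up for this example.

```r
# Label panel: fill the box with grey, then draw the label on top.
# Assumes the panel's user coordinates are the unit square at this point.
grey_label <- function(x, y, labels, cex, font, ...) {
  rect(0, 0, 1, 1, col = "grey80")
  text(x, y, labels, cex = cex, font = font)
}

pairs(iris[1:3],
      labels = expression(alpha, text, beta),
      lower.panel = panel.smooth,
      text.panel = grey_label)
```

Only the first three columns of iris are used here so that there is exactly one label per variable.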


Best,
Uwe Ligges




Maybe someone knows how to do that and can give some hints...

/Johannes

--


Re: [R] On-demand importing of a package

2011-11-25 Thread Jakson Alves de Aquino

On Fri, Nov 25, 2011 at 1:21 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:

[...]

 I was framing this in terms of the Matrix example, but perhaps its
 easier to understand with the actual example which motivated this for
 me.  That is, the feature is that whenever sqldf is installed then
 RSQLite is installed too without having RSQLite automatically load
 when sqldf loads.

[...]

I think that the following procedure has the result that you want:

Put in the DESCRIPTION file:

Imports: RSQLite

And in the R code write something like:

RSQLite::AnRSQLiteFunction()

-- 
Jakson Alves de Aquino
Federal University of Ceará
Social Sciences Department
www.lepem.ufc.br/aquino.php


Re: [R] On-demand importing of a package

2011-11-25 Thread Gabor Grothendieck

On Fri, Nov 25, 2011 at 11:52 AM, Jakson Alves de Aquino
jalve...@gmail.com wrote:
 On Fri, Nov 25, 2011 at 1:21 PM, Gabor Grothendieck
 ggrothendi...@gmail.com wrote:

 [...]

 I was framing this in terms of the Matrix example, but perhaps its
 easier to understand with the actual example which motivated this for
 me.  That is, the feature is that whenever sqldf is installed then
 RSQLite is installed too without having RSQLite automatically load
 when sqldf loads.

 [...]

 I think that the following procedure has the result that you want:

 Put in the DESCRIPTION file:

 Imports: RSQLite

 And in the R code write something like:

 RSQLite::AnRSQLiteFunction()

I had been thinking of using Imports in DESCRIPTION but was concerned
that it would put RSQLite objects ahead of everything else on
sqldf's search path even when not wanted. I gather, though, that you
intend Imports to be used in the DESCRIPTION file but _not_ in the
NAMESPACE file.  I think that would likely work, and I will test it
to be sure.  I would probably also want to require() RSQLite in case
the user wants to mix sqldf and RSQLite calls, so I will check whether
the check procedure allows that if the package is named only in
Imports; if not, it might be sufficient to put RSQLite in both Imports
and Suggests.  Thanks.
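
The on-demand lookup discussed here can be sketched as follows. This is an illustration under assumptions, not sqldf's actual code: the helper name `get_backend_fun` is made up, requireNamespace() must be available in the R version in use, and the stats package stands in for a real database backend so that the example runs anywhere.

```r
# Hypothetical helper: fetch a function from an optional package only when needed
get_backend_fun <- function(pkg, fun) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is needed for this backend but is not installed.")
  }
  # Look up an exported object without attaching the package to the search path
  getExportedValue(pkg, fun)
}

# Illustration with a package that is always available
med <- get_backend_fun("stats", "median")
med(c(1, 3, 10))
```

Because nothing is attached, the backend's objects never sit ahead of anything else on the search path; the package is loaded only when a function from it is requested.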


-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [R] Objects disappearing in my R work space

2011-11-25 Thread Michael Clawson

Uwe,

by window I mean instances, by runs I mean runs of my Markov-Chain Monte
Carlo simulator

I open two instances of R, run a million cycle chain in each instance, and
when they finish, neither window has the object I defined to store the runs.

I tested this morning and when I open two R windows and run a 5k cycle
chain in each instance, neither window has the object I defined to store
the runs.

This does not happen when I only have one instance of R open

2011/11/25 Uwe Ligges lig...@statistik.tu-dortmund.de

On 25.11.2011 05:12, Aldo wrote:

Is there a maximum memory allocation for all R windows open? because it is
like 1-3 million runs

So you mean you open a million windows at the same time? In that case we
really need your definition of window.

so... it may be reaching some sort of memory limit

I do not know if any OS / window manager has the capability to open that
many windows. But as I said, we need some definitions and
examples.

Uwe Ligges

--
View this message in context:
http://r.789695.n4.nabble.com/Objects-disappearing-in-my-R-work-space-tp4104389p4106390.html

Sent from the R help mailing list archive at Nabble.com.


Re: [R] Unable to reproduce Stata Heckman sample selection estimates

2011-11-25 Thread Yuan Yuan

Hi Arne,

Thanks for the reply.

I am using R version 2.14.0 and sampleSelection version 0.6.12.

I estimate the model by the 1-step ML method. However, when I use 
the 2-step method, the standard errors are reported as NA.

I use the selection() function, very basic call, something to the 
effect of: selection(selectionFormula, outcomeFormula, data = 
aDataFrame), where the formulas are very straightforward and basic 
as well, y ~ x1 + x2 + ... + xp.

I have read the associated paper, which is where I got the idea to 
pass the coefficients from a selection object to the start argument.

I will work on creating a minimal reproducible example; the dataset 
is large and confidential, the models long-ish.

 - Clara

On Friday, November 25, 2011 04:04:52 am Arne Henningsen wrote:
 On 25 November 2011 04:37, Yuan Yuan y.y...@vt.edu wrote:
  Hello,
  
  I am working on reproducing someone's analysis which was done in
  Stata. The analysis is estimation of a standard Heckman sample
  selection model (Tobit-2), for which I am using the 
sampleSelection
  package and the selection() function. I have a few problems with 
the
  estimation:
  
  1) The reported standard error for all estimates is Inf ...
  vcov(selectionObject) yields Inf in every cell.
  
  2) While the selection equation coefficient estimates are almost
  exactly the same as the Stata results, the outcome equation
  coefficient estimates are quite different (different sign in one 
case,
  order of magnitude difference in some other cases).
  
  3) I can't seem to figure out how to specify the initial values 
for
  the MLE ... whatever argument I pass to start (even of the form
  coef(selectionObject)), I get the following error:
  Error in gr[, fixed] - NA : (subscript) logical subscript too 
long
  
  I have to admit I am pretty confused by #1, I feel like I must 
be
  doing something wrong, missing something obvious, but I have no 
idea
  what. I figure #2 might be because the algorithms (selection and
  Stata) are just finding different local maxima, but because of 
#3 I
  can't test that guess by using different initial values in 
selection.
  
  Let me know if I should provide any more information. Thanks in
  advance for any pointers in the right direction.
 
 Yes, please provide more information (see also the posting guide [1]),
 e.g. which version of R and which version of the sampleSelection
 package are you using? Do you estimate the model by the two-step
 approach or by the 1-step maximum likelihood method? Which commands
 did you use? Can you send us a reproducible example? Have you read the
 paper about using the sampleSelection package [2]?
 
 [1] http://www.r-project.org/posting-guide.html
 [2] http://www.jstatsoft.org/v27/i07
 
 Best wishes from Copenhagen,
 Arne


Re: [R] Error: invalid type(list) for variable when using lm()

2011-11-25 Thread Dhaynes

Ok let me clarify

I have a multidimensional array and I need to convert it to a
one-dimensional array.
The multidimensional array is 379 rows, 2 cols, 3 deep.
I need to run a regression model on mymatrix[1,1,1:3] and mymatrix[1,2,1:3].

This is my current error, which indicates I have the incorrect list type (I
have tried functions as.list, as.vector, as.vector)

lm(formula = mymatrix[1,1,1:3]~mymatrix[1,2,1:3] )
Error in model.frame.default(formula = mymatrix[1, 1, 1:3] ~ mymatrix[1,  :
  invalid type (list) for variable 'mymatrix[1, 1, 1:3]'


I was unsuccessful when attempting str(mymatrix[1,1,1:3]) --Argument not
valid model

The data.frame function did not create the objects
df <- data.frame(a=mymatrix[1,1,1:3], b=mymatrix[1,2,1:3])
lm(a~b, data=df)
Error in eval(expr, envir, enclos) : object 'a' not found

Here is my code
con <- dbConnect(PostgreSQL(), user="postgres",
password="antione", dbname="Education")
rs <- dbGetQuery(con, "SELECT (GRADE1[10]) As grade1_t1, (GRADE1[11]) As
grade1_t2, (GRADE1[12]) As grade1_t3, (GRADE2[11]) As grade2_t2,
(GRADE2[12]) As grade2_t3, (GRADE2[13]) As grade2_t4 FROM attending")
myval <- rs
attach(myval)
names(myval)
dim(myval)

mymatrix <- array(myval, c(379,2,3))

mymatrix[,1,1] <- grade1_t1
mymatrix[,1,2] <- grade1_t2
mymatrix[,1,3] <- grade1_t3
mymatrix[,2,1] <- grade2_t2
mymatrix[,2,2] <- grade2_t3
mymatrix[,2,3] <- grade2_t4

I can do this
plot(mymatrix[1,1,1:3],mymatrix[1,2,1:3])
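
A possible explanation, sketched here with made-up data in place of the database query: a data frame is a list, so an array built from it with array() is a list array, and lm() then reports "invalid type (list)". Starting from a numeric array avoids this. The object names `bad`, `good` and `fits` are only for this illustration.

```r
set.seed(1)
n <- 5  # small stand-in for the 379 rows in the post

# Stand-in for the query result: a data frame with six numeric columns
myval <- data.frame(grade1_t1 = rnorm(n), grade1_t2 = rnorm(n),
                    grade1_t3 = rnorm(n), grade2_t2 = rnorm(n),
                    grade2_t3 = rnorm(n), grade2_t4 = rnorm(n))

# Built from a data frame, the array has list elements rather than numbers
bad <- array(myval, c(n, 2, 3))
is.list(bad)

# Start from a numeric array instead and fill it slice by slice
good <- array(NA_real_, dim = c(n, 2, 3))
good[, 1, 1] <- myval$grade1_t1
good[, 1, 2] <- myval$grade1_t2
good[, 1, 3] <- myval$grade1_t3
good[, 2, 1] <- myval$grade2_t2
good[, 2, 2] <- myval$grade2_t3
good[, 2, 3] <- myval$grade2_t4

# One regression per row: grade1 values against grade2 values
fits <- lapply(seq_len(n), function(i) lm(good[i, 1, ] ~ good[i, 2, ]))
coef(fits[[1]])
```

Each regression here is fitted to only three points, as in the post, so the fits themselves carry very little information.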

On Fri, Nov 25, 2011 at 6:06 AM, Bert Gunter [via R] 
ml-node+s789695n4107159...@n4.nabble.com wrote:

 Inline below.

 -- Bert

 On Fri, Nov 25, 2011 at 2:31 AM, Milan Bouchet-Valat [hidden email]
 wrote:

  On Friday, 25 November 2011 at 00:02 -0800, Dhaynes wrote:
  Hello,
 
  I am new to R.
  I have multidimensional array (379,2,3) and I need to create a series
 of
  linear regressions (379 to be exact)
  I have the array stored properly I believe, but I can not use the
  lm(myarray[1,1,1:3]~myarray[1,2,1:3])
  I have checked to make sure they are exactly the same length.
  I have also tried endlessly to convert the subset of the array back
 into a
  vector.

 ?as.vector
 Actually an array **is** a vector -- but with an additional dim
 attribute. Try:
 str(x)


 
  any help would be appreciated.

 1) Read relevant portions of R docs, like ?array and perhaps An
 Introduction to R.

 2)  Read and follow the posting guide. In particular, give us a toy
 example with the code you used to construct your array. It's difficult
 to diagnose the source of engine failure without the car.

 3) See my comment below.

  The 'formula' argument of lm doesn't take actual values, but variable
  names. So you need to create vectors containing your data, or pass a

 --This is patently false. Please check before giving obviously wrong
 advice:

  x <- array(rnorm(150), dim = c(10,5,3))
  lm(x[,3,2] ~ x[,1,1])

 Call:
 lm(formula = x[, 3, 2] ~ x[, 1, 1])

 Coefficients:
 (Intercept)    x[, 1, 1]
     -0.1247       0.1171





  data frame with these vectors as columns. So, going the latter way:
  df <- data.frame(a=myarray[1,1,1:3], b=myarray[1,2,1:3])
  lm(a ~ b, data=df)
 
  or in one step
  lm(a ~ b, data=data.frame(a=myarray[1,1,1:3], b=myarray[1,2,1:3]))
 
 
  Regards
 
 



 --

 Bert Gunter
 Genentech Nonclinical Biostatistics

 Internal Contact Info:
 Phone: 467-7374
 Website:

 http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


Re: [R] Objects disappearing in my R work space



On Nov 25, 2011, at 10:08 AM, Michael Clawson wrote:


Uwe,

by window I mean instances; by runs I mean runs of my Markov-Chain Monte
Carlo simulator.


It would probably be better to adopt the terminology that the things  
you are calling windows are sessions.


I open two instances of R, run a million cycle chain in each instance, and
when they finish, neither window has the object I defined to store the runs.


I tested this morning and when I open two R windows and run a 5k cycle
chain in each instance, neither window has the object I defined to store
the runs.

This does not happen when I only have one instance of R open


The most common cause of that behavior is failing to assign the output
of a function to a name. There is an object named .Last.value that
holds the result of the last returned object even if it doesn't have
another name.


lapply(1:10,  I)
test <- .Last.value
test
[[1]]
[1] 1

[[2]]
[1] 2
snipped rest of output

But as Uwe said ... without the code, ... and your OS (to answer the  
question about memory)   and your sessionInfo() to make sure that  
this is not a GUI-related issue ... we cannot say very much.





2011/11/25 Uwe Ligges lig...@statistik.tu-dortmund.de




On 25.11.2011 05:12, Aldo wrote:

Is there a maximum memory allocation for all R windows open?  
because it is

like 1-3 million runs




So you mean you open a million windows at the same time? In that  
case we

really need your definition of window.


 so... it may be reaching some sort of memory limit




I do not know if any OS / window manager has the capability to open that
many windows. But as I said, we need some definitions and
examples.

Uwe Ligges

--


David Winsemius, MD
West Hartford, CT


Re: [R] Objects disappearing in my R work space

2011-11-25 Thread Michael Clawson

My problem with providing the code is that MCMC is a fairly integrated process,
so I don't know how I would pare it down to send...
Would it work to send the MCMC code and the three *.csv files to go along
with it?



Re: [R] Error: invalid type(list) for variable when using lm()



On Nov 25, 2011, at 11:41 AM, Dhaynes wrote:


Ok let me clarify

I have multidimensional array and I need to convert it to a singular
dimensional array.
The multidimensional array is 359 rows, 2 cols, 3 deep
I need to run a regression model mymatrix[1,1,1:3] and mymatrix  
[1,2,1:3]


This is my current error, which indicates I have the incorrect list  
type (I

have tried functions as.list, as.vector, as.vector)

lm(formula = mymatrix[1,1,1:3]~mymatrix[1,2,1:3] )
Error in model.frame.default(formula = mymatrix[1, 1, 1:3] ~  
mymatrix[1,  :

 invalid type (list) for variable 'mymatrix[1, 1, 1:3]'


I was unsuccessful at attempting the str(mymatrix[1,1,1:3] -- 
Argument not

valid model

The data.frame function did not create the objects
- data.frame(a=mymatrix[1,1,1:3], b=mymatrix[1,2,1:3])

lm(a~b, data=df)

Error in eval(expr, envir, enclos) : object 'a' not found

Here is my code
con <- dbConnect(PostgreSQL(), user="postgres",
password="antione", dbname="Education")
rs <- dbGetQuery(con, "SELECT (GRADE1[10]) As grade1_t1, (GRADE1[11]) As
grade1_t2, (GRADE1[12]) As grade1_t3, (GRADE2[11]) As grade2_t2,
(GRADE2[12]) As grade2_t3, (GRADE2[13]) As grade2_t4 FROM attending")
myval <- rs
attach(myval)


Generally a bad idea to attach objects. It's a sin that is committed  
by several authors but it generally gets in the way of safe code  
writing. Better to use with().
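
A minimal sketch of the with() alternative, assuming a small data frame in place of the query result; the values in `myval` here are made up for illustration.

```r
# Hypothetical stand-in for the query result
myval <- data.frame(grade1_t1 = c(2, 3, 4), grade2_t2 = c(3, 4, 6))

# with() evaluates the expression using the columns of myval,
# without placing them on the search path as attach() does
m <- with(myval, mean(grade2_t2 - grade1_t1))

# Modelling functions can take the data frame directly
fit <- lm(grade2_t2 ~ grade1_t1, data = myval)
```

Because nothing is attached, a later change to `myval` cannot be silently masked by stale copies of its columns.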



names(myval)
dim(myval)

mymatrix <- array(myval, c(379,2,3))

mymatrix[,1,1] <- grade1_t1
mymatrix[,1,2] <- grade1_t2
mymatrix[,1,3] <- grade1_t3
mymatrix[,2,1] <- grade2_t2
mymatrix[,2,2] <- grade2_t3
mymatrix[,2,3] <- grade2_t4


But what are these various grade-named objects? Are you sure you  
didn't coerce the matrix to character mode? What is str(mymatrix)  
after this?
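
One possible explanation, offered as a guess since the data are not available: an array filled from a list (a data frame is a list) keeps list mode, and slices of it are lists, which lm() rejects. The sketch below builds such a list array by hand; `bad` and `good` are illustrative names.

```r
# A list array: each cell holds a length-one vector inside a list
bad <- array(as.list(1:12), dim = c(2, 2, 3))
is.list(bad[1, 1, 1:3])     # a slice of a list array is itself a list

# Flatten to an atomic vector, then restore the dimensions
good <- array(unlist(bad), dim = dim(bad))
is.numeric(good[1, 1, 1:3])

# Slices of the atomic array can be used directly in a formula
fit <- lm(good[1, 1, 1:3] ~ good[1, 2, 1:3])
```

Building the array from atomic vectors in the first place, for example with c() over the individual columns, avoids the conversion step.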



--
David.


I can do this
plot(mymatrix[1,1,1:3],mymatrix[1,2,1:3])


Re: [R] counting values with some conditions in a simulation

2011-11-25 Thread Jeff Newmiller

A) you need to reply-all to keep the discussion on the mailing list.

B) you need to post in plain text.

C) this has the arbitrary smell of homework. This is not a homework help line.

D) You are overwriting your accumulation variable sumt after each test. Since 
you are not handling this calculation in a vectorized manner, I suggest you use 
the if ... else ... else syntax to accomplish this. See the help for if.
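
Point D can be sketched as follows, assuming the rule described in this thread: count a sum above 3, do not count a sum below 3, and settle a sum of exactly 3 with a uniform draw. The function name `count_rule` and its argument `u` are hypothetical.

```r
# Returns 1 if the sum should be counted and 0 otherwise
count_rule <- function(sumt, u = runif(1)) {
  if (sumt > 3) {
    1
  } else if (sumt < 3) {
    0
  } else if (u < 0.5) {
    1          # tie at 3, uniform draw below 0.5: count it
  } else {
    0          # tie at 3, uniform draw of 0.5 or more: do not count
  }
}

set.seed(123)
results <- sapply(c(1, 3, 5, 3, 4), count_rule)
sum(results)
```

Each branch returns a value instead of overwriting the same variable in successive tests, so an earlier result cannot be altered by a later condition.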
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Sl K s.ka...@gmail.com wrote:

Sorry, I forgot to include my code. Here is what I am trying to do.



rep=10
results <- numeric(rep)
x <- data.frame(matrix(runif(10*15),15))
y <- data.frame(matrix(runif(10*15),15))
for (i in c(1:rep)){
  st <- data.frame(y=c(x[,i],y[,i]),samp=factor(c(rep("X",15),rep("Y",15))))
  stt <- st[order(st[,1]),]
  dt <- stt[1:30,]
  r <- as.vector(dt$samp)
  tt <- rle(r)$lengths[rle(r)$values == "X"]
  sumt <- sum(tt[1:3])
  sumt[sumt <= 3] <- 0
  sumt[sumt > 3] <- 1
  sums <- as.numeric(sumt)
  results[i] <- sums
}
xx <- as.vector(results)
sum(xx)


This was the original code I had; before, I was just counting how many will
give me a sum more than 3. Now, I want to show that if sumt<3 then 0, if
sumt>3 then 1, if sumt=3, then generate a random number from uniform
distribution, if this number is say less than 0.5 then it's 1, if greater
than 0.5, then it's 0.

Thank you very much for your help.


On Fri, Nov 25, 2011 at 10:19 AM, Jeff Newmiller
jdnew...@dcn.davis.ca.us wrote:

 You need to read the posting guide. Provide a reproducible code sample,
 simplified, with self-contained data.
 You might find the ave function useful if you are working with
 vectorized simulations.

---
 Jeff NewmillerThe .   .  Go
Live...
 DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live
 Go...
  Live:   OO#.. Dead: OO#.. 
Playing
 Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
 /Software/Embedded Controllers)   .OO#.   .OO#. 
rocks...1k

---
 Sent from my phone. Please excuse my brevity.

 Sl K s.ka...@gmail.com wrote:

 Dear R users,
 
 I am running simulations (1000), and in my simulation I am looking at
 specific sums. For example, if the sum is >=4 then count this, if say <3,
 then don't count, if the sum=3, then generate a random number from uniform
 distribution, if this number is say less than 0.5, then count this sum, if
 greater than 0.5, then don't count. I am having trouble with introducing
 this uniform number and deciding whether to count 3 or not. Any help or hint
 will be greatly appreciated. Thank you very much in advance
 

Re: [R] Is there way to add a new row to a data frame in a specific location

2011-11-25 Thread Ian Strang


This looks really interesting, but I don't understand what is happening.
Please can someone explain the last line and what the bit in [] is doing?
Ian

df = data.frame( A=c('a','b','c'), B=c(1,2,3), C=c(10,20,30),
stringsAsFactors=FALSE)

newrow = c('X', 100, 200)

rbind(df,newrow)[c(1,4,2,3),]


Re: [R] Is there way to add a new row to a data frame in a specific location



On Nov 25, 2011, at 2:10 PM, Ian Strang wrote:


This look really interesting but I don't understand what is happening.
Please can someone explain the last line and what the bit in [] is  
doing.

Ian


You just stick the new line on the bottom and return the rows in the  
order specified in the i argument to [. It's just like vector  
indexing except with rows.


 (1:4)[c(4,2,3,1)]
[1] 4 2 3 1
 (4:1)[c(4,2,3,1)]
[1] 1 3 2 4
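
The same idea can be wrapped in a small helper, sketched here under the assumption that the new row is supplied as a one-row data frame; `insert_row` is a hypothetical name. A character vector such as c('X', 100, 200) would likely turn the numeric columns into character, which is why the sketch avoids it.

```r
df <- data.frame(A = c("a", "b", "c"), B = c(1, 2, 3), C = c(10, 20, 30),
                 stringsAsFactors = FALSE)
newrow <- data.frame(A = "X", B = 100, C = 200, stringsAsFactors = FALSE)

# Append the row at the bottom, then reorder so it lands at position 'pos'
insert_row <- function(d, row, pos) {
  n <- nrow(d)
  idx <- append(seq_len(n), n + 1L, after = pos - 1L)
  out <- rbind(d, row)[idx, , drop = FALSE]
  rownames(out) <- NULL
  out
}

res <- insert_row(df, newrow, 2)
```

For a three-row data frame and pos = 2, the index vector is c(1, 4, 2, 3), the same one used in the original post.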




df = data.frame( A=c('a','b','c'), B=c(1,2,3), C=c(10,20,30),
stringsAsFactors=FALSE)

newrow = c('X', 100, 200)

rbind(df,newrow)[c(1,4,2,3),]



David Winsemius, MD
West Hartford, CT


[R] Multiple selection, renaming and saving the results

2011-11-25 Thread Robin Corrià




Dear all,

I have a big data frame:

str(data1)
'data.frame':   18272 obs. of  11 variables:
 $ tag  : int  11 12 13 15 17 18 19 100011 100012 100014 ...
 $ sp   : Factor w/ 18 levels "acassp","acocar",..: 13 5 7 14 14 18 3 11 13 10 ...
 $ gx   : num  20 10 35 68 88 63 123 115 137 136 ...
 $ gy   : num  30 25 24 1 10 40 45 25 23 45 ...
 $ d0   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ d1   : int  395 395 395 395 395 395 395 395 395 395 ...
 $ d2   : int  751 751 751 751 751 751 751 751 751 751 ...
 $ d3   : int  1515 1515 1515 1515 1515 1515 1515 1515 1515 1515 ...
 $ d4   : int  2562 2562 2562 2562 2562 2562 2562 2562 2562 2562 ...
 $ block: int  1 1 1 1 1 1 1 1 1 1 ...
 $ treat: Factor w/ 4 levels "I","M","N","T": 1 1 1 1 1 1 1 1 1 1 ...


And I need to do multiple selections of gx and gy for all the levels of sp,
block and treat, and where d0 is not NA. Then calculate some spatial functions
with the selected gx and gy coordinates, and save the results with a name
according to the selection.

One single selection could be done and named like this:

acocar1I = subset(data1, (treat=="I" & data1$block==1 & data1$sp=="acocar" & !is.na(data1$d0)))

These are some of the functions I have to calculate:

acocar1I.spp <- spp(x=acocar1I$gx, y=acocar1I$gy, window=wA)
acocar1I.dp <- dval(acocar1I.spp,25,2.5,18,20)

And I want to create a 'results' object to access easily all the results:

acocar1I.res <- alist()
acocar1I.res$data1 <- acocar1I
acocar1I.res$spp <- acocar1I.spp
acocar1I.res$dp <- acocar1I.dp

Can I do everything in a single loop or a single function?

Many
thanks!
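
One possible approach, sketched under several assumptions: the data are simulated, the spatial functions are replaced by a simple centroid as a placeholder, and the names `ok`, `groups` and `results` are hypothetical. The missing-value test uses !is.na(), since a comparison with NA yields NA and not TRUE or FALSE.

```r
# Simulated stand-in for data1 with the columns used in the selection
set.seed(42)
data1 <- data.frame(
  sp    = factor(sample(c("acassp", "acocar"), 40, replace = TRUE)),
  gx    = runif(40, 0, 140),
  gy    = runif(40, 0, 50),
  d0    = sample(c(0, NA), 40, replace = TRUE, prob = c(0.8, 0.2)),
  block = sample(1:2, 40, replace = TRUE),
  treat = factor(sample(c("I", "M"), 40, replace = TRUE))
)

# Keep rows with a recorded d0, then split by every sp/block/treat combination
ok <- data1[!is.na(data1$d0), ]
groups <- split(ok, list(ok$sp, ok$block, ok$treat), drop = TRUE)

# One results list per group; the centroid stands in for the spatial functions
results <- lapply(groups, function(g) {
  list(data1    = g,
       n        = nrow(g),
       centroid = c(gx = mean(g$gx), gy = mean(g$gy)))
})

names(results)   # names combine the levels, e.g. "acocar.1.I"
```

Each element can then be reached by name, for example results[["acocar.1.I"]]$centroid, provided that combination occurs in the data.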











  

[R] perfectionism

2011-11-25 Thread Jack Tanner

I have a named vector:

 z <- c(1, 2, 3, 2)
 names(z) <- c("a","b","c","b")
 f <- c("b","c")

I want to know the index in z of the first occurrence of each of the values in
f.

One implementation is

 sapply(f, function(x) which(names(z)==x)[1])
b c
2 3

Is which() smart enough to stop when it finds in z the first occurrence of every
value from f, or does it search through all the values in z only to report the
first one?

Are there some more elegant ways of writing this code?

Just curious.


Re: [R] Legend

2011-11-25 Thread Filoche

Thank you, sir, for your help. Thanks also for sharing the reference; I'll take
a look at it.

Regards,
Phil

--
View this message in context: 
http://r.789695.n4.nabble.com/Legend-tp4103799p4108587.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] perfectionism

Try this:

 z <- c(1, 2, 3, 2)
 names(z) <- c("a","b","c","b")
 f <- c("b","c")
 match(f, names(z))
[1] 2 3
 # if you want the names
 x <- match(f, names(z))
 names(x) <- f
 x
b c
2 3
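
On the question of early stopping: in the which() version the comparison names(z) == x is evaluated over all of z before which() sees it, so there is no early exit. A short sketch comparing the two approaches follows; `idx_which` and `idx_match` are illustrative names.

```r
z <- c(1, 2, 3, 2)
names(z) <- c("a", "b", "c", "b")
f <- c("b", "c")

# Scans the full comparison vector once per element of f
idx_which <- sapply(f, function(x) which(names(z) == x)[1])

# Single vectorised call; reports the first matching position for each value
idx_match <- setNames(match(f, names(z)), f)

all(idx_which == idx_match)
```

A value of f that does not occur in names(z) gives NA in both versions.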







-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] perfectionism

2011-11-25 Thread Patrick Burns


Did you try:

z[f]




--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')


[R] Dataset fit for Wavelet regression

2011-11-25 Thread Gyanendra Pokharel

Hi all,
Can anybody suggest a data set whose length is a power of two and which is
suitable for wavelet smoothing in regression?
There are a lot of datasets, like ethanol in the package SemiPar, that can
be used for this smoothing, but we have to extend them to a power of two.
I have no idea how to do this. If anybody can give the technique for this,
it would be great.
Best
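
One way this is sometimes handled, sketched here as an assumption and not as the method of any particular wavelet package: interpolate the observations onto an equally spaced grid whose length is the next power of two. The data are simulated with 88 points, the size of the ethanol data; `next_pow2` is a hypothetical helper.

```r
# Smallest power of two that is at least n
next_pow2 <- function(n) 2^ceiling(log2(n))

# Simulated stand-in for a data set with 88 observations
set.seed(1)
x <- sort(runif(88))
y <- sin(4 * pi * x) + rnorm(88, sd = 0.2)

# Linear interpolation onto an equally spaced grid of dyadic length
n2 <- next_pow2(length(x))
grid <- approx(x, y, n = n2)

length(grid$y)
```

Padding the series at its ends, for example by reflection, is another option; which is preferable depends on the data and on the boundary handling of the wavelet routine used.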


Re: [R] On-demand importing of a package

2011-11-25 Thread Jakson Alves de Aquino

On Fri, Nov 25, 2011 at 2:40 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 I think that the following procedure has the result that you want:

 Put in the DESCRIPTION file:

 Imports: RSQLite

 And in the R code write something like:

 RSQLite::AnRSQLiteFunction()

 I had been thinking of using Imports in DESCRIPTION but was concerned
 that that would put RSQLite objects ahead of everything else on
 sqldf's search path even when not wanted but I gather you are
 intending that Imports be used in DESCRIPTION: but _not_ in the
 NAMESPACE file.  I think that that would likely work. I will test it
 out to be sure. What I would probably want to do is to require()
 RSQLite in case the user wants to mix sqldf and RSQLite calls and I
 will check whether the check procedure allows that if the package is
 only named in Imports but, if not, it might be sufficient to put
 RSQLite in both Imports and Suggests.  Thanks.

I have done this with the 'descr' package. It wasn't necessary to put
the imported packages in two places, only in the Imports field. This
was enough to make R install all dependencies but not load them along
with 'descr'.
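
A sketch of guarding such an on-demand call, assuming a current version of R where requireNamespace() is available; `use_optional` is a hypothetical helper, and the stats package stands in for RSQLite so the example runs anywhere.

```r
use_optional <- function(pkg) {
  # requireNamespace() loads the namespace without attaching the package
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("package '", pkg, "' is required for this feature", call. = FALSE)
  }
  invisible(TRUE)
}

use_optional("stats")

# Qualified call: the function is reached through '::' and nothing is attached
stats::median(c(1, 5, 9))
```

The check gives a clear error message when the package is missing, instead of a failure deep inside the qualified call.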

-- 
Jakson Alves de Aquino
Federal University of Ceará
Social Sciences Department
www.lepem.ufc.br/aquino.php


Re: [R] perfectionism

2011-11-25 Thread Jack Tanner

jim holtman jholtman at gmail.com writes:

  match(f, names(z))
 [1] 2 3

Jim, thanks so much, that's right on.

Patrick, thanks to you too, but yours is not the same as what I asked:

 z <- c(3,4,5,4)
 names(z) <- c("a","b","c","b")
 z[f]
b c 
4 5

Yours returns the actual values in z, not the indexes in z, i.e., not

[1] 2 3


Re: [R] perfectionism

2011-11-25 Thread Barth B. Riley

z[f] would return the elements of z selected by f. That is, if f = c(1,2), z[f]
would return the first and second elements of z. The original intent was to
obtain the indices in z corresponding to the values in f.

Barth




Re: [R] Error: invalid type(list) for variable when using lm()

2011-11-25 Thread Bert Gunter

Inline below. HOWEVER -- my comments are tentative and need
verification by someone more expert because:

1. This is not a reproducible example, so I have no idea what is really happening.

2. I don't know what your dbGetQuery command does. Do you?

But see below for my guesses.

-- Bert

On Fri, Nov 25, 2011 at 10:10 AM, David Winsemius
dwinsem...@comcast.net wrote:

 On Nov 25, 2011, at 11:41 AM, Dhaynes wrote:

 Ok let me clarify

 I have multidimensional array and I need to convert it to a singular
 dimensional array.
 The multidimensional array is 359 rows, 2 cols, 3 deep
 I need to run a regression model mymatrix[1,1,1:3] and mymatrix [1,2,1:3]

 This is my current error, which indicates I have the incorrect list type
 (I
 have tried functions as.list, as.vector, as.vector)

 lm(formula = mymatrix[1,1,1:3]~mymatrix[1,2,1:3] )
 Error in model.frame.default(formula = mymatrix[1, 1, 1:3] ~ mymatrix[1,
  :
  invalid type (list) for variable 'mymatrix[1, 1, 1:3]'


 I was unsuccessful at attempting the str(mymatrix[1,1,1:3] --Argument
 not
 valid model

It should be str(mymatrix)

 The data.frame function did not create the objects
 - data.frame(a=mymatrix[1,1,1:3], b=mymatrix[1,2,1:3])

LHS is missing, but presumably just a typo here. Note that a and b
would contain only 3 values each, presumably not what you want. And,
as I said in my earlier message, you don't need to do this anyway.


 lm(a~b, data=df)

 Error in eval(expr, envir, enclos) : object 'a' not found

 Here is my code
 con <- dbConnect(PostgreSQL(), user="postgres",
 password="antione", dbname="Education")
 rs <- dbGetQuery(con, "SELECT (GRADE1[10]) As grade1_t1, (GRADE1[11]) As
 grade1_t2, (GRADE1[12]) As grade1_t3, (GRADE2[11]) As grade2_t2,
 (GRADE2[12]) As grade2_t3, (GRADE2[13]) As grade2_t4 FROM attending")

I think the problem is the structure of rs. Is it a data.frame or a
list or what? What does str(rs) give you?

I think you need to **carefully** read ?dbGetQuery


 myval <- rs
 attach(myval)

 Generally a bad idea to attach objects. It's a sin that is committed by
 several authors but it generally gets in the way of safe code writing.
 Better to use with().

-- I second this.

 names(myval)
 dim(myval)

 mymatrix <- array(myval, c(379,2,3))

 mymatrix[,1,1] <- grade1_t1
 mymatrix[,1,2] <- grade1_t2
 mymatrix[,1,3] <- grade1_t3
 mymatrix[,2,1] <- grade2_t2
 mymatrix[,2,2] <- grade2_t3
 mymatrix[,2,3] <- grade2_t4

 But what are these various grade-named objects? Are you sure you didn't
 coerce the matrix to character mode? What is str(mymatrix) after this?


 --
 David.

 I can do this
 plot(mymatrix[1,1,1:3],mymatrix[1,2,1:3])


Re: [R] Objects disappearing in my R work space

2011-11-25 Thread Ben Bolker

Michael Clawson michael.v.clawson at gmail.com writes:

 
 My problem with providing the code, is MCMC is a fairly integrated process,
 so I dont know how I would pare it down to send...
 Would it work to send the MCMC code, and the three *.csv files to go along
 with it?
 

  Reproducible is essential, minimal is strongly recommended.  The more you
can pare down your code, the more likely it is that someone will actually
take the time to take a look at it and see what's going on.
  Are you using an external program (JAGS, WinBUGS, etc.) or writing to 
external files that might be getting clobbered by other instances when
you have more than one instance running at once?
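
If shared output files turn out to be the cause, one way to rule it out is to give each session its own file name. The sketch below is a guess at the situation, not a diagnosis: `run_chain` is a trivial stand-in for the MCMC code, and the file name pattern is hypothetical.

```r
# Trivial stand-in for the sampler; returns the whole chain
run_chain <- function(n) cumsum(rnorm(n))

# Assign the result to a name so it stays in the workspace
chain <- run_chain(1000)

# Include the process ID so concurrent sessions write to different files
out_file <- file.path(tempdir(), sprintf("chain_pid%d.rds", Sys.getpid()))
saveRDS(chain, out_file)

# Read the chain back later, in this or another session
restored <- readRDS(out_file)
```

In real use the directory would be a permanent one, since tempdir() is removed when the session ends.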


[R] regression fit

2011-11-25 Thread magushi

Hi everyone.
I want to analyze my data. The y (density) is very skewed to the right.
This is my tentative fit <- density ~ factor(Site) + factor(Species) + depth.
Which model should I apply, glm or lm? For sure I cannot use lm (this is
non-linear data).

--
View this message in context: 
http://r.789695.n4.nabble.com/regression-fit-tp4108686p4108686.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] regression fit



On Nov 25, 2011, at 2:56 PM, magushi wrote:


Hi everyone.
I want to analyze my data. The response y (density) is strongly skewed to
the right. This is my tentative fit: density ~ factor(Site) + factor(Species) + depth.
Which model should I apply, glm or lm? Surely I cannot use lm (this is
non-linear data).


Until you have looked more carefully at your tentative fit you are
jumping to unsupported assumptions. It is a common misconception that
a skewed distribution of the dependent variable represents a problem
for linear methods. It may be true, but you will not know until you
look at the distribution of the residuals of a linear fit. So far we
see no evidence that you have done so. If you showed us plots of
density ~ depth within categories of Site and Species then there might
be a basis for discussion of alternative links or transformations.
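
The point about residuals can be illustrated with a small sketch. The data below are simulated (only the variable names Site, Species, depth and density come from the post), and skew() is a helper defined here for illustration. The response is right-skewed overall because one site is rare and has a much higher mean, yet the residuals of the linear fit are roughly symmetric:

```r
set.seed(1)
# Simulated stand-in data; values are made up
n <- 200
dat <- data.frame(
  Site    = factor(sample(c("A", "B", "C"), n, replace = TRUE,
                          prob = c(0.45, 0.40, 0.15))),
  Species = factor(sample(c("sp1", "sp2"), n, replace = TRUE)),
  depth   = runif(n, 0, 50)
)
# Group means differ strongly, but the errors themselves are normal
dat$density <- 5 + 20 * (dat$Site == "C") + 0.1 * dat$depth + rnorm(n)

fit <- lm(density ~ Site + Species + depth, data = dat)
res <- residuals(fit)

# Simple moment-based skewness, defined here for illustration
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3
c(response = skew(dat$density), residuals = skew(res))

# In an interactive session one would also look at:
# hist(res); qqnorm(res); qqline(res); plot(fit)
```

With real data the residuals may of course turn out skewed as well, and that would be the time to discuss transformations or a glm() with another family.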


This wouldn't be homework by any chance, would it?






David Winsemius, MD
West Hartford, CT


Re: [R] On-demand importing of a package

2011-11-25 Thread Gabor Grothendieck

On Fri, Nov 25, 2011 at 2:47 PM, Jakson Alves de Aquino
jalve...@gmail.com wrote:
 On Fri, Nov 25, 2011 at 2:40 PM, Gabor Grothendieck
 ggrothendi...@gmail.com wrote:
 I think that the following procedure has the result that you want:

 Put in the DESCRIPTION file:

 Imports: RSQLite

 And in the R code write something like:

 RSQLite::AnRSQLiteFunction()

 I had been thinking of using Imports in DESCRIPTION but was concerned
 that that would put RSQLite objects ahead of everything else on
 sqldf's search path even when not wanted but I gather you are
 intending that Imports be used in DESCRIPTION: but _not_ in the
 NAMESPACE file.  I think that that would likely work. I will test it
 out to be sure. What I would probably want to do is to require()
 RSQLite in case the user wants to mix sqldf and RSQLite calls and I
 will check whether the check procedure allows that if the package is
 only named in Imports but, if not, it might be sufficient to put
 RSQLite in both Imports and Suggests.  Thanks.

 I have done this with the 'descr' package. It wasn't necessary to put
 the imported packages in two places, only in the Imports field. This
 was enough to make R install all dependencies but not load them along
 with 'descr'.

I just tried it, but I wanted to require() RSQLite so that the user can
access its facilities as well. Putting it just in Imports does work, but
the check complains about requiring a package that has not been declared
unless I put it in Suggests as well.  If I don't call require() then it
is not necessary to put it in Suggests, so there seems to be a slight
difference between descr and sqldf.
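
For readers on current versions of R, the load-on-demand pattern discussed above is commonly written with requireNamespace() plus pkg::name access. The sketch below is an illustration under that assumption, not the code used in sqldf or descr; the helper name use_if_available() is made up, and 'stats' stands in for an optional package such as RSQLite so that the example runs anywhere:

```r
# Hypothetical helper: load a package's namespace only when it is first needed
use_if_available <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is needed for this feature; please install it.")
  }
  invisible(TRUE)
}

# 'stats' ships with R, so this succeeds without attaching anything new
use_if_available("stats")
sd_fun <- stats::sd            # pkg::name access, no library() call needed
sd_fun(c(1, 2, 3, 4))
```

Unlike require(), this loads the namespace without attaching the package to the user's search path, so a user who wants to mix calls would still attach it themselves.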


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [R] Unable to reproduce Stata Heckman sample selection estimates

2011-11-25 Thread Yuan Yuan

Hi Arne,

I believe I figured out why the Stata coefficient estimates differed 
from R's: in my case, the outcome response variable is binary, so 
the outcome equation is a probit model. From my reading of the 
sampleSelection paper, it seems that the Tobit-2 model has a 
continuous outcome response variable. The Stata command used was 
heckprob, which assumes both the outcome and the selection equations 
are probit models. When I compared the Stata heckman command with 
the R results, I found the estimates were the same.

Sorry for not picking up on that difference earlier.

So it seems that selection() is perhaps not what I'm looking for, 
unless there is a way to specify a probit selection model. Is there 
a package out there that estimates probit models with Heckman sample 
selection? It looks like SemiParBIVProbit might work for me.

 - Clara

On Friday, November 25, 2011 11:05:31 am Yuan Yuan wrote:
 Hi Arne,
 
 Thanks for the reply.
 
 I am using R version 2.14.0 and sampleSelection version 0.6.12.
 
 I estimate the model by the 1-step ML method. However, when I use
 the 2-step method, the standard errors are reported as NA.
 
 I use the selection() function, very basic call, something to the
 effect of: selection(selectionFormula, outcomeFormula, data =
 aDataFrame), where the formulas are very straightforward and basic
 as well, y ~ x1 + x2 + ... + xp.
 
 I have read the associated paper, which is where I got the idea to
 pass the coefficients from a selection object to the start argument.
 
 I will work on creating a minimal reproducible example; the dataset
 is large and confidential, the models long-ish.
 
  - Clara
 
 On Friday, November 25, 2011 04:04:52 am Arne Henningsen wrote:
  On 25 November 2011 04:37, Yuan Yuan y.y...@vt.edu wrote:
   Hello,
   
   I am working on reproducing someone's analysis which was done in
   Stata. The analysis is estimation of a standard Heckman sample
   selection model (Tobit-2), for which I am using the sampleSelection
   package and the selection() function. I have a few problems with the
   estimation:
   
   1) The reported standard error for all estimates is Inf ...
   vcov(selectionObject) yields Inf in every cell.
   
   2) While the selection equation coefficient estimates are almost
   exactly the same as the Stata results, the outcome equation
   coefficient estimates are quite different (different sign in one case,
   order of magnitude difference in some other cases).
   
   3) I can't seem to figure out how to specify the initial values for
   the MLE ... whatever argument I pass to start (even of the form
   coef(selectionObject)), I get the following error:
   Error in gr[, fixed] <- NA : (subscript) logical subscript too long
   
   I have to admit I am pretty confused by #1, I feel like I must be
   doing something wrong, missing something obvious, but I have no idea
   what. I figure #2 might be because the algorithms (selection and
   Stata) are just finding different local maxima, but because of #3 I
   can't test that guess by using different initial values in selection.
   
   Let me know if I should provide any more information. Thanks in
   advance for any pointers in the right direction.
  
  Yes, please provide more information (see also the posting guide [1]),
  e.g. which version of R and which version of the sampleSelection
  package are you using? Do you estimate the model by the two-step
  approach or by the 1-step maximum likelihood method? Which commands
  did you use? Can you send us a reproducible example? Have you read the
  paper about using the sampleSelection package [2]?
  
  [1] http://www.r-project.org/posting-guide.html
  [2] http://www.jstatsoft.org/v27/i07
  
  Best wishes from Copenhagen,
  Arne


[R] variable types - logistic regression

2011-11-25 Thread Ben quant

Hello,

Is there an example out there that shows how to treat each of the predictor
variable types when doing logistic regression in R? Something like this:

glm(y~x1+x2+x3+x4, data=mydata, family=binomial(link=logit),
na.action=na.pass)

I'm drawing mostly from:
http://www.ats.ucla.edu/stat/r/dae/logit.htm

...but there are only two types of variable in the example given. I'm
wondering if the answer is that easy or if I have to consider more with
different types of variables. It seems like as.factor() is doing a lot of
the organization for me.

I will need to understand how to perform logistic regression in R on all
data types all in the same model (potentially).

As it stands, I think I can solve all of my data type issues with:

as.factor(x,ordered=T) ...for all discrete ordinal variables
as.factor(x, ordered=F) ...for all discrete nominal variables
...and do nothing for everything else.

I'm pretty sure it's not that simple because of some other posts I've seen,
but I haven't seen a post that discusses ALL data types in logistic
regression.

Here is what I think will work at this point:

glm(y ~ **all_other_vars + as.factor(disc_ord_var,ordered=T) +
as.factor(disc_nom_var,ordered=F), data=mydata,
family=binomial(link=logit), na.action=na.pass)

I'm also looking for any best-practices help as well. I'm new'ish to
R...and oddly enough I haven't had the pleasure of doing much regression
in R yet.

Regards,

Ben


Re: [R] variable types - logistic regression

2011-11-25 Thread Joshua Wiley

Hi Ben,

The following is oversimplified but hopefully helpful.  Regression
only works with numbers.  The trick then becomes how to convert
non-numeric data into meaningful numbers.  For so-called continuous
data (the type you get from running: rnorm(100) ), nothing needs to be
done.  For others (e.g., what you get from sample(1:5, 100, replace =
TRUE) ), the data may not be truly continuous, but it is often treated
as such (this type is particularly common in the social sciences, where
questionnaires and surveys are administered and participants are asked
to rate things on 1 to 5 or 1 to 7, or ... scales).

When you move on to data that is not really continuous and you do not
want to treat as such (say first, second, third place), some schema
has to be used to convert them.  Most commonly, contrasts are
used---thus certain levels are contrasted with others.  In R, for
ordered factors, the default contrasts are orthogonal polynomials.
For example the contrasts for the first second, third example might
be:

contrasts(factor(1:3, ordered = TRUE))

.L and .Q stand for linear and quadratic, respectively.  For k
levels, there will be k - 1 contrast columns.  This relaxes the
linearity assumption applied to continuous data by testing the effects
of first, second, etc. order polynomials.  If the data have no
meaningful order, say explaining levels of Red Bull consumption by
college major, the default contrasts applied by R are dummy codes.
This picks one group (the lowest) as the referent, and compares the
effect of all the other groups, relative to the referent.  For
example, suppose we had a small sample of only three college majors:

contrasts(factor(1:3))

1 is the reference group, the first contrast tests the effect of being
in group 2 versus group 1, the second group 3 versus group 1.
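
To make the two coding schemes concrete, here is a small sketch that prints both default contrast matrices and then fits one logistic regression with a numeric, an ordered and an unordered predictor. The data are simulated and all variable names (x_num, x_ord, x_nom) are made up. Note that `ordered` is an argument of factor(), not of as.factor(), so factor(x, ordered = TRUE) is the form used here:

```r
# Default contrasts for an ordered factor: orthogonal polynomials (.L, .Q)
ord <- factor(1:3, ordered = TRUE)
contrasts(ord)

# Default contrasts for an unordered factor: treatment (dummy) codes
nom <- factor(1:3)
contrasts(nom)

# Simulated data with one predictor of each kind
set.seed(42)
n <- 300
mydata <- data.frame(
  x_num = rnorm(n),
  x_ord = factor(sample(c("low", "mid", "high"), n, replace = TRUE),
                 levels = c("low", "mid", "high"), ordered = TRUE),
  x_nom = factor(sample(c("bio", "chem", "math"), n, replace = TRUE))
)
mydata$y <- rbinom(n, 1, plogis(0.5 * mydata$x_num))

fit <- glm(y ~ x_num + x_ord + x_nom, data = mydata,
           family = binomial(link = "logit"))
names(coef(fit))   # one slope, two polynomial terms, two dummy terms
```

The coefficient names show how each factor was expanded into k - 1 columns under R's default contrast options.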

All of these work with logistic regression, or any flavour of generalized
linear model (via the glm() and other functions).  In many regards,
the treatment of predictors in logistic regression is not any
different from basic linear regression (ordinary least squares [OLS]).
The logistic function works on the outcome, not the predictors.
That said, some special considerations do come into play.  You need
some variability on all of your predictors.  In OLS with truly
continuous data, if you have a two level nominal predictor with some
people in each level, it is unlikely that any given cell would have
all the same values.  However, with a 0/1 outcome and a 0/1 predictor,
it may be that in one particular cell, everyone has either a 0 or 1
for the outcome, which can be problematic for estimation purposes.
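
A quick way to spot such a cell before fitting is to cross-tabulate the outcome against each categorical predictor. A minimal sketch with made-up data in which everyone with x = 1 has y = 1:

```r
# Made-up 0/1 outcome and 0/1 predictor in which one cell is empty
y <- c(0, 0, 0, 1, 1, 1, 1, 1)
x <- c(0, 0, 0, 0, 0, 1, 1, 1)

tab <- table(x = x, y = y)
tab

# Any zero cell means some level of x has outcomes of one kind only
any(tab == 0)
```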

What sorts of data are you dealing with?  Is just entering the
variables or using factor() not doing what you expect with some?  I
have not looked at the web page you referenced much but if you have an
example type of data you feel is not covered or would like more fully
covered, feel free to email me off list and I can add an example to
the page.

Cheers,

Josh




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles

Re: [R] Objects disappearing in my R work space

2011-11-25 Thread Duncan Murdoch


On 11-11-25 1:03 PM, Michael Clawson wrote:

My problem with providing the code, is MCMC is a fairly integrated process,
so I don't know how I would pare it down to send...
Would it work to send the MCMC code, and the three *.csv files to go along
with it?


I don't understand.  If you can't make your code simple enough to post, 
how do you think we can possibly imagine what you're doing?


Duncan Murdoch



On Fri, Nov 25, 2011 at 9:54 AM, David Winsemius dwinsem...@comcast.net wrote:



On Nov 25, 2011, at 10:08 AM, Michael Clawson wrote:

  Uwe,


By window I mean instances; by runs I mean runs of my Markov-Chain Monte
Carlo simulator.



It would probably be better to adopt the terminology that the things you
are calling windows are sessions.



I open two instances of R, run a million cycle chain in each instance, and
when they finish, neither window has the object I defined to store the
runs.

I tested this morning and when I open two R windows and run a 5k cycle
chain in each instance, neither window has the object I defined to store
the runs.

This does not happen when I only have one instance of R open



The most common cause of that behavior is failing to assign the output of
a function to a name. There is an object named .Last.value that holds the
results of the last returned object even if it doesn't have another name.

lapply(1:10,  I)
test <- .Last.value
test
[[1]]
[1] 1

[[2]]
[1] 2
snipped rest of output

But as Uwe said ... without the code, ... and your OS (to answer the
question about memory)   and your sessionInfo() to make sure that this
is not a GUI-related issue ... we cannot say very much.
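
The assignment point can be shown without .Last.value, which is only maintained at the interactive prompt. In this sketch run_chain() is a made-up stand-in for a long-running sampler; the object Samples only exists once the result has been bound to that name:

```r
# Hypothetical stand-in for a long-running sampler
run_chain <- function(n) {
  matrix(rnorm(n * 2), ncol = 2)
}

run_chain(5)                 # result is printed, then discarded
exists("Samples")            # no object of that name has been created yet

Samples <- run_chain(5)      # result is bound to a name
exists("Samples")
dim(Samples)
```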





2011/11/25 Uwe Ligges lig...@statistik.tu-dortmund.de







On 25.11.2011 05:12, Aldo wrote:

  Is there a maximum memory allocation for all R windows open? Because it is
like 1-3 million runs




So you mean you open a million windows at the same time? In that case we
really need your definition of window.


 so... it may be reaching some sort of memory limit





I do not know if any OS / window manager has the capability to open that
many windows. But as I said, we need some definitions and
examples.

Uwe Ligges

--




David Winsemius, MD
West Hartford, CT





Re: [R] The contrast and Design libraries

2011-11-25 Thread Frank Harrell

Note that you can do what you specified using only the rms package:
require(rms)
f <- Glm(propalive ~ exptime + infstat*status, data=dat)
contrast(f,
   a = list(status = levels(dat$status), infstat="control", exptime=8230),
   b = list(status = levels(dat$status), infstat="infected", exptime=8230))
Frank
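
For readers without rms installed, the point estimates of the same comparison can be sketched in base R by differencing predictions. This is an illustration only, not a replacement for contrast(): it gives no standard errors or tests. The data are simulated (only the variable names come from the post) and the status level names are made up:

```r
set.seed(7)
# Simulated stand-in for 'dat'; values are made up
dat <- expand.grid(
  status  = factor(c("juvenile", "adult")),
  infstat = factor(c("control", "infected")),
  rep     = 1:20
)
dat$exptime   <- runif(nrow(dat), 5000, 10000)
dat$propalive <- 0.8 - 0.2 * (dat$infstat == "infected") -
  0.00002 * dat$exptime + rnorm(nrow(dat), sd = 0.05)

mod <- glm(propalive ~ exptime + infstat * status, data = dat)

# Predict at exptime = 8230 for every status level, once per infection state
st    <- factor(levels(dat$status), levels = levels(dat$status))
new_a <- data.frame(status = st, exptime = 8230,
                    infstat = factor("control",  levels = levels(dat$infstat)))
new_b <- data.frame(status = st, exptime = 8230,
                    infstat = factor("infected", levels = levels(dat$infstat)))

diffs <- predict(mod, newdata = new_a) - predict(mod, newdata = new_b)
setNames(diffs, levels(dat$status))
```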

Joanne Lello wrote
 
 Dear all,
 
 I have been using the contrast library in my teaching for the last couple 
 of years and am right in the middle of this year's round. In the last week 
 R has been updated to version 2.14.0 on our computers. This has had the 
 unfortunate effect of meaning the contrasts library no longer works, as 
 the Design library is no longer available. I wonder if anyone has a fix 
 for this...or alternatively can tell me another package that is as simple 
 to use (the students won't cope with anything more complicated - we're all 
 biologists not statisticians).
 
 I hope someone can help.
 
 Here is a typical bit of code I'm currently running so you can see what 
 I'm trying to do:
 
 exptime is a covariate and both infstat and status are factors
 
 mod <- glm(propalive ~ exptime + infstat + status + infstat:status,
            data=dat)
 
 library(contrast)
 
 contrast(mod3,
    a = list(status = levels(dat$status), infstat="control", exptime=8230),
    b = list(status = levels(dat$status), infstat="infected", exptime=8230))
 
 any help gratefully received,
 
 Jo
 
 Dr Joanne Lello
 Cardiff University
 School of Biosciences
 Organism and Environment Group
 Biomedical Sciences Building
 Museum Avenue
 Cardiff
 CF10 3AX
 Tel: 02920 875885
 E-mail: lelloj@.ac
 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/The-contrast-and-Design-libraries-tp4106321p4109210.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Objects disappearing in my R work space

2011-11-25 Thread Rolf Turner


On 26/11/11 12:19, Duncan Murdoch wrote:

On 11-11-25 1:03 PM, Michael Clawson wrote:
My problem with providing the code, is MCMC is a fairly integrated process,
so I don't know how I would pare it down to send...
Would it work to send the MCMC code, and the three *.csv files to go along
with it?




This *has* to be a fortune!!!:

I don't understand.  If you can't make your code simple enough to 
post, how do you think we can possibly imagine what you're doing?


cheers,

Rolf Turner


Re: [R] The contrast and Design libraries

2011-11-25 Thread Joshua Wiley

On Thu, Nov 24, 2011 at 9:23 AM, Joanne Lello lel...@cardiff.ac.uk wrote:
 Dear all,

 I have been using the contrast library

They're packages!

[snip]

For the sake of the good Martin Maechler,

Josh

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


Re: [R] Objects disappearing in my R work space

One thing is to define "missing" a little better.  For example, as
mentioned previously, are you returning the values from a function
call?  If you are, print out an indication that they exist at that point.
If there is further processing happening, put in some checks of their
existence as the code continues.  Can you localize where this is
happening?  If they are "disappearing", then most likely something you
are doing in your code is making it happen.  Until there is
something that people can reproduce, there are all types of theories
we can expound on.

If I was looking at the code, I could probably put checks in at
various points to see in what section these things disappeared.  Do
you have some 'try' functions around parts of the code that might not
be reporting some error conditions?  So probably until you can provide
something that we can at least look at, and exactly how you determined
that something disappeared, there is probably not much more we can do
at this point.
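
As an illustration of how a try() call can make an object seem to vanish: if the assignment sits inside try(..., silent = TRUE) and the function fails, no error is shown and the object is never created. DoMCMC_demo() and the error text below are made up for this sketch:

```r
# Hypothetical sampler that fails part-way through
DoMCMC_demo <- function(fail = TRUE) {
  if (fail) stop("proposal covariance is not positive definite")
  matrix(0, nrow = 2, ncol = 2)
}

# The assignment never happens, and silent = TRUE hides the reason
try(demo_samples <- DoMCMC_demo(), silent = TRUE)
exists("demo_samples")

# Safer: keep the value returned by try() and test it explicitly
result <- try(DoMCMC_demo(), silent = TRUE)
if (inherits(result, "try-error")) {
  message("DoMCMC_demo failed: ", conditionMessage(attr(result, "condition")))
} else {
  str(result)
}
```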





-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] Objects disappearing in my R work space

2011-11-25 Thread Michael Clawson

I am returning a matrix from a function:

Samples <- DoMCMC(InitialVactor, CovarianceMatrix, ObservationData,
NumCycles, Burnin, Thin)

When I have one R session open, the program works: Samples is a matrix of
samples from the MCMC.

When I have two R sessions open, it runs to completion, but when I go to
view Samples, it is not there.

By "disappear" I mean that I define Samples as above, and when it finishes
cycling through, I type Samples and it says it doesn't exist. I type the
command ls() and it is not in the list of objects defined in the session.

How would something I am doing in the code be affected by how many R
sessions are open? I have tried this on two different machines, one running
Windows XP and one Windows 7, both running R 2.13.1.

Thank you all for all of the responses, sorry if my inexperience in R is
hindering this process


Re: [R] Objects disappearing in my R work space

At least put

print(str(Samples))

right after the function call to make sure it is there, and if you
have more code after that, sprinkle that print statement in the
following code.  You might at least send the code and indicate where
you have tested to see if the object is still there.

So put some of those print statements to help isolate what section of
code the problem is happening in  because just saying it disappears is
not sufficient.  This is elementary debugging of a program.  If your
program is as big and complicated as you say, then you have to start
learning some debugging techniques to help you find where the problem
is.
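
The "sprinkle print statements" advice could be packaged as a small helper along these lines. checkpoint() is a made-up name, and the matrix is a stand-in for the DoMCMC() result. Since str() already prints its summary, the helper calls it directly rather than wrapping it in print():

```r
# Hypothetical helper: report whether an object exists at a given point in a script
checkpoint <- function(name, label, envir = parent.frame()) {
  found <- exists(name, envir = envir, inherits = FALSE)
  cat(sprintf("[%s] %s exists: %s\n", label, name, found))
  if (found) str(get(name, envir = envir))
  invisible(found)
}

Samples <- matrix(rnorm(6), nrow = 3)     # stand-in for the DoMCMC() result
checkpoint("Samples", "after sampling")

rm(Samples)                               # simulate the code that loses the object
checkpoint("Samples", "after cleanup")
```

Placing such calls after each major step narrows down the section of code in which the object goes away.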

On Fri, Nov 25, 2011 at 7:48 PM, Michael Clawson
michael.v.claw...@gmail.com wrote:
 I am returning a matrix from a function:
 Samples <- DoMCMC(InitialVactor, CovarianceMatrix, ObservationData, NumCycles,
 Burnin, Thin)
 When I have one R session open, the program works: Samples is a matrix of
 samples from the MCMC.
 When I have two R sessions open, it runs to completion, but when I go to
 view Samples, it is not there.
 By "disappear" I mean that I define Samples as above, and when it finishes
 cycling through, I type Samples and it says it doesn't exist. I type the
 command ls() and it is not in the list of objects defined in the session.
 How would something I am doing in the code be affected by how many R
 sessions are open? I have tried this on two different machines, one running
 Windows XP and one Windows 7, both running R 2.13.1.
 Thank you all for all of the responses, sorry if my inexperience in R is
 hindering this process

 On Fri, Nov 25, 2011 at 4:06 PM, jim holtman jholt...@gmail.com wrote:

 One thing is to define missing a little better.  For example, as
 mentioned previously, are you returning the values from a function
 call?  If you are, print out an indication that they exist at that point.
 If there is further processing happening, put in some checks as to their
 existence as the code continues.  Can you localize where this is
 happening?  If they are disappearing, then there is something you
 are doing in your code to most likely make it happen.  Until there is
 something that people can reproduce, there are all types of theories
 we can expound on.

 If I was looking at the code, I could probably put checks in at
 various points to see in what section these things disappeared.  Do
 you have some 'try' functions around parts of the code that might not
 be reporting some error conditions?  So probably until you can provide
 something that we can at least look at, and exactly how you determined
 that something disappeared, there is probably not much more we can do
 at this point.
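
To illustrate the point about 'try' hiding error conditions, here is a small
sketch. It assumes a hypothetical run_chain() that always fails; nothing in
the thread says the real code is wrapped this way, so treat it only as one
way an object could silently fail to be created.

```r
# Hypothetical stand-in for a long-running simulation that hits an error.
run_chain <- function(n) stop("chain failed at cycle ", n)

# Start from a clean slate.
if (exists("Samples", inherits = FALSE)) rm(Samples)

# With silent = TRUE no error message is shown, and because the error
# happens before the assignment completes, Samples is never created.
res <- try(Samples <- run_chain(10), silent = TRUE)

failed <- inherits(res, "try-error")                   # TRUE
samples_present <- exists("Samples", inherits = FALSE) # FALSE

# Reporting the captured condition makes the failure visible again.
if (failed) cat("try() caught:", conditionMessage(attr(res, "condition")), "\n")
```

Checking inherits(res, "try-error") right after each try() call is a cheap
way to find out whether an error was swallowed.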

 On Fri, Nov 25, 2011 at 6:19 PM, Duncan Murdoch
 murdoch.dun...@gmail.com wrote:
  On 11-11-25 1:03 PM, Michael Clawson wrote:
 
  My problem with providing the code is that MCMC is a fairly integrated
  process, so I don't know how I would pare it down to send...
  Would it work to send the MCMC code, and the three *.csv files to go along
  with it?
 
  I don't understand.  If you can't make your code simple enough to post, how
  do you think we can possibly imagine what you're doing?
 
  Duncan Murdoch
 
 
  On Fri, Nov 25, 2011 at 9:54 AM, David Winsemius dwinsem...@comcast.net wrote:
 
 
  On Nov 25, 2011, at 10:08 AM, Michael Clawson wrote:
 
   Uwe,
 
  by window I mean instances, by runs I mean runs of my Markov-Chain Monte
  Carlo simulator
 
 
  It would probably be better to adopt the terminology that the things you
  are calling windows are sessions.
 
 
  I open two instances of R, run a million cycle chain in each instance, and
  when they finish, neither window has the object I defined to store the
  runs.
 
  I tested this morning and when I open two R windows and run a 5k cycle
  chain in each instance, neither window has the object I defined to store
  the runs.
 
  This does not happen when I only have one instance of R open
 
 
  The most common cause of that behavior is failing to assign the output of
  a function to a name. There is an object named .Last.value that holds the
  last returned object even if it doesn't have another name.
 
  lapply(1:10,  I)
  test <- .Last.value
  test
  [[1]]
  [1] 1
 
  [[2]]
  [1] 2
  snipped rest of output
 
  But as Uwe said ... without the code, ... and your OS (to answer the
  question about memory) and your sessionInfo() to make sure that this
  is not a GUI-related issue ... we cannot say very much.
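
The unassigned-result case can be sketched as follows. toy_sampler() is a
hypothetical function made up for this example. Note that .Last.value is
only maintained for top-level expressions and is overwritten by the next
one, so the sketch checks the workspace with exists() instead of relying
on it.

```r
# Hypothetical sampler standing in for the poster's function.
toy_sampler <- function(n) cumsum(rnorm(n))

# Start from a clean slate.
if (exists("Samples", inherits = FALSE)) rm(Samples)

# Called without assignment: the value is printed, but no object is created.
toy_sampler(5)
unassigned_present <- exists("Samples", inherits = FALSE)  # FALSE

# Called with assignment: the result stays in the workspace under a name.
Samples <- toy_sampler(5)
assigned_present <- exists("Samples", inherits = FALSE)    # TRUE
```

After the second call, ls() lists Samples; after the first call alone it
does not.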
 
 
 
 
  2011/11/25 Uwe Ligges lig...@statistik.tu-dortmund.de
 
 
 
 
  On 25.11.2011 05:12, Aldo wrote:
 
   Is there a maximum memory allocation for all R windows open? Because it is
  like 1-3 million runs
 
 
  
  So you mean you open a million windows at the same time? In that case we
  really need your definition of window.
 
 
   so... it may be reaching some sort of memory limit
 
 
 
  I do not know if any OS / window

Re: [R] Multiple selection, renaming and saving the results