Re: [R] best way to apply a list of functions to a dataset ?

2010-07-21 Thread Dennis Murphy
Hi:

On Tue, Jul 20, 2010 at 5:37 PM, Glen Barnett glnbr...@gmail.com wrote:

 Hi Dennis,

 Thanks for the reply.

 Yes, that's easier, but the conversion to a matrix with rbind has
 converted the output of that final function to a numeric.


If you look at the output of lapply(attitude, f), you'll see that the
conversion from logical to numeric has already taken place. Different
components of lists can have different types, but within a component, all of
the elements must have the same class.
You can patch up the result as follows:

x - data.frame(do.call(rbind, lapply(attitude, f)))
x[, 5] - as.logical(x[, 5])
 x
   meansdskewness median mean.gt.med
rating 64.6 12.172562 -0.35792491   65.5   FALSE
complaints 66.6 13.314757 -0.21541749   65.0TRUE
privileges 53.1 12.235430  0.37912287   51.5TRUE
learning   56.36667 11.737013 -0.05403354   56.5   FALSE
raises 64.6 10.397226  0.19754317   63.5TRUE
critical   74.76667  9.894908 -0.86577893   77.5   FALSE
advance42.9 10.288706  0.85039799   41.0TRUE

but if you're doing this sort of thing over a large data frame such fixes
may be impractical.


I included that last function in the example secifically to preclude
 people assuming that functions would always return the same type.



There is a plyr solution, although it's a little more long-winded than I'd
prefer in the end:

library(ggplot2)
# melt the data frame so that the variables become factor levels
 ma - melt(attitude)
Using  as id variables
 dim(ma)
[1] 210   2
# Use ddply to get the set of summaries by variable:

ddply(ma, .(variable), summarise, mean = mean(value), sd = sd(value),
   skewness = skewness(value), median = median(value),
   mean.gt.med = mean.gt.med(value))
variable meansdskewness median mean.gt.med
1 rating 64.6 12.172562 -0.35792491   65.5   FALSE
2 complaints 66.6 13.314757 -0.21541749   65.0TRUE
3 privileges 53.1 12.235430  0.37912287   51.5TRUE
4   learning 56.36667 11.737013 -0.05403354   56.5   FALSE
5 raises 64.6 10.397226  0.19754317   63.5TRUE
6   critical 74.76667  9.894908 -0.86577893   77.5   FALSE
7advance 42.9 10.288706  0.85039799   41.0TRUE

Notice that now the logical class of mean.gt.med is preserved.  The trick
with ddply() in package plyr is that, in this case, it is convenient to melt
the data frame first before doing the summarizations. This is because
ddply() requires a grouping variable - in this example, the groups are the
variables themselves.

HTH,
Dennis

I guess this doesn't matter too much for a logical, but what if
 instead the function returned a character (say mean, median, or
 equal - indicating which one was larger, or equal which could
 easily happen with discrete data). This precludes using rbind (which I
 also used at first, before I noticed that sometimes I could have
 functions that don't return numerics).

 Glen


 On Tue, Jul 20, 2010 at 6:55 PM, Dennis Murphy djmu...@gmail.com wrote:
  Hi:
 
  This might be a little easier (?):
 
  library(datasets)
  skewness - function(x) mean(scale(x)^3)
  mean.gt.med - function(x) mean(x)median(x)
 
  # --
  # construct the function to apply to each variable in the data frame
  f - function(x) c(mean = mean(x), sd = sd(x), skewness = skewness(x),
   median = median(x), mean.gt.med = mean.gt.med(x))
 
  # map function to each variable with lapply and combine with do.call():
  do.call(rbind, lapply(attitude, f))
 meansdskewness median mean.gt.med
  rating 64.6 12.172562 -0.35792491   65.5   0
  complaints 66.6 13.314757 -0.21541749   65.0   1
  privileges 53.1 12.235430  0.37912287   51.5   1
  learning   56.36667 11.737013 -0.05403354   56.5   0
  raises 64.6 10.397226  0.19754317   63.5   1
  critical   74.76667  9.894908 -0.86577893   77.5   0
  advance42.9 10.288706  0.85039799   41.0   1
 
  HTH,
  Dennis
 
 
  On Mon, Jul 19, 2010 at 10:51 PM, Glen Barnett glnbr...@gmail.com
 wrote:
 
  Assuming I have a matrix of data (or under some restrictions that will
  become obvious, possibly a data frame), I want to be able to apply a
  list of functions (initially producing a single number from a vector)
  to the data and produce a data frame (for compact output) with column
  1 being the function results for the first function, column 2 being
  the results for the second function and so on - with each row being
  the columns of the original data.
 
  The obvious application of this is to produce summaries of data sets
  (a bit like summary() does on numeric matrices), but with user
  supplied functions. I am content for the moment to leave it to the
  user to supply functions that work with the data they supply so as to
  produce results that will actually be data-frame-able, though I'd 

[R] xtable with ifelse statement

2010-07-21 Thread vivi84

Hi there,
I'm very new on R and I hope someone can help me to solve the problem in
using the ifelse statement with the xtable function(library xtable). 
I'm trying to get the printing of the elements of two lists in a sorted way.
These two list have in common the their names.
I will try to give an example:
the first list looks like this:

$code1

Code  code1
Nation Usa

$code2

Code  code2
Nation Spain


the second one is like this

$code 1
var187 
var2   73

$code2  
var134
var223


As you can see the two list have the same names.I tried to implement the
following ifelse statement but it doesn't work
orde2=function(x,y){
ifelse(names(x)==names(y),list(print(xtable(x)),print(xtable(y))),NULL)

What I want is the xtable of this first:
$code1

Code  code1
Nation Usa


and after this:

$code 1
var187 
var2   73


And as the first case, the xtable of this;
$code2

Code  code2
Nation Spain


and after this:

$code2  
var134
var223

Someone Knows if it possible to do this and how??
Thanks a lot!

-- 
View this message in context: 
http://r.789695.n4.nabble.com/xtable-with-ifelse-statement-tp2296694p2296694.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can RMySQL run on Windows using a remote MySQL server?

2010-07-21 Thread asbro

Change of user, was neatgadgets but that seems to not work very well.

I managed to get it working by copying the 2 dll files into the R bin
folder. So I am up and running now. Thanks for your help. 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Can-RMySQL-run-on-Windows-using-a-remote-MySQL-server-tp2293533p2296597.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Lattice Panel Print Order

2010-07-21 Thread Mark Ebbert
Hi,

I have done a bit of searching and have not found a good answer for my 
question. Although I have not difficulty ordering the panels, Lattice prints 
them from bottom to top and left to right for each page. Is it possible to make 
it print from top to bottom for each page?

I've tinkered with index.cond and tried reordering the panels, but I don't see 
a way to do it on a page by page basis.

Example:
tmp-data.frame(vals=sample(1:100,150,replace=T),concentration=sample(0:15,15),group=letters[1:10])
xyplot(vals~concentration|group,data=tmp,layout=c(3,3))

As you can see my layout is a 3x3 matrix for each page. It would be logical for 
Lattice to print the panels as:
abc
def
ghi
new page
j

However, it prints the following order:
ghi
def
abc
new page
blank row
blank row
j

Your help is much appreciated.

ME
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nls with some coefficients fixed

2010-07-21 Thread Keith Jewell

Gabor Grothendieck ggrothendi...@gmail.com wrote in message 
news:aanlktilszaicycu3lz2f5d_bxq1g8m8f7jsjsbj2l...@mail.gmail.com...
 On Tue, Jul 20, 2010 at 9:58 AM,  nas...@uottawa.ca wrote:
 For nls, the fixing (or masking) of parameters is not, to my knowledge,
 possible.

 This is something I've been trying to get in such routines for over 2
 decades. Masks are
 available, but not yet well documented, in Rcgmin and Rvmmin packages.
 However, these use
 an optim() style approach, which is quite different from nls(). If 
 there's
 sufficient
 interest and some collaboration, I'd be willing to have a go at providing
 such
 functionality, but it would take quite a bit of work to provide the full
 capability of nls().


 You can optimize over b while fixing m like this:

 m - 1
 nls(demand ~ m + b * Time, BOD, start = c(b = 1))

Thanks Gabor,

I'd recognised that approach in my original post starting this thread (19 
July) where I'd used the example:
#
# fix parameter and use explicit start values works fine
nls(density ~ SSgompertz(log(conc), Asym, b2, b3=0.8), data = DNase.1, 
start=list(Asym=3, b2=2))
#---

I'm fitting many different models so I need a more generic approach.
I'd appreciate any comments on the suggestion in my 20 July post? (repeated 
here with some typo's corrected)
-
nls - function(formula, data=parent.frame(), start, ...){
  if (missing(start)) start - getInitial(formula, data)
  stats:::nls(formula, data, start=start[names(start) %in% 
all.vars(formula)], ...)
}
--

I see it breaks nls for non-selfStart functions; e.g. omitting the explicit 
start from your example
  nls(demand ~ m + b * Time, BOD)
works fine with vanilla nls (albeit with a warning)
#   No starting values specified for some parameters.
# Intializing 'b' to '1.'.
# Consider specifying 'start' or using a selfStart model
but my wrapper breaks it altogether.
 # Error in getInitial.default(func, data, mCall = as.list(match.call(func, 
:
 #  no 'getInitial' method found for function objects

I think that's OK in my application - I either use selfStart functions or 
explicit start lists - but I don't really want to remove existing 
functionality.

I guess I could (!?!) dive into nls and implement the same kind of approach 
internally, but that's very deep water for me :-{

Best regards,

Keith J

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lattice Panel Print Order

2010-07-21 Thread Deepayan Sarkar
On Wed, Jul 21, 2010 at 12:13 AM, Mark Ebbert mark.ebb...@hci.utah.edu wrote:
 Hi,

 I have done a bit of searching and have not found a good answer for my 
 question.
 Although I have not difficulty ordering the panels, Lattice prints them from 
 bottom
 to top and left to right for each page. Is it possible to make it print from 
 top to bottom
 for each page?

Search for as.table in ?xyplot. Is this what you were looking for?

-Deepayan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nls with some coefficients fixed

2010-07-21 Thread Keith Jewell

Gabor Grothendieck ggrothendi...@gmail.com wrote in message 
news:aanlktilszaicycu3lz2f5d_bxq1g8m8f7jsjsbj2l...@mail.gmail.com...
 On Tue, Jul 20, 2010 at 9:58 AM,  nas...@uottawa.ca wrote:
 For nls, the fixing (or masking) of parameters is not, to my knowledge,
 possible.

 This is something I've been trying to get in such routines for over 2
 decades. Masks are
 available, but not yet well documented, in Rcgmin and Rvmmin packages.
 However, these use
 an optim() style approach, which is quite different from nls(). If 
 there's
 sufficient
 interest and some collaboration, I'd be willing to have a go at providing
 such
 functionality, but it would take quite a bit of work to provide the full
 capability of nls().


 You can optimize over b while fixing m like this:

 m - 1
 nls(demand ~ m + b * Time, BOD, start = c(b = 1))

Thanks Gabor,

I'd recognised that approach in my original post starting this thread (19 
July) where I'd used the example:
#
# fix parameter and use explicit start values works fine
nls(density ~ SSgompertz(log(conc), Asym, b2, b3=0.8), data = DNase.1, 
start=list(Asym=3, b2=2))
#---

I'm fitting many different models so I need a more generic approach.
I'd appreciate any comments on the suggestion in my 20 July post? (repeated 
here with some typo's corrected)
-
nls - function(formula, data=parent.frame(), start, ...){
  if (missing(start)) start - getInitial(formula, data)
  stats:::nls(formula, data, start=start[names(start) %in% 
all.vars(formula)], ...)
}
--

I see it breaks nls for non-selfStart functions; e.g. omitting the explicit 
start from your example
  nls(demand ~ m + b * Time, BOD)
works fine with vanilla nls (albeit with a warning)
#   No starting values specified for some parameters.
# Intializing 'b' to '1.'.
# Consider specifying 'start' or using a selfStart model
but my wrapper breaks it altogether.
 # Error in getInitial.default(func, data, mCall = as.list(match.call(func, 
:
 #  no 'getInitial' method found for function objects

I think that's OK in my application - I either use selfStart functions or 
explicit start lists - but I don't really want to remove existing 
functionality.

I guess I could (!?!) dive into nls and implement the same kind of approach 
internally, but that's very deep water for me :-{

Best regards,

Keith J

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] square brackets in expression in plots

2010-07-21 Thread Jannis
Dear Duncan, dear David,

thanks for the reply! I knew about Duncans solution, but could not figure out 
how to combine it that the 'm*s^-1' can be read from a string. Is there a way 
to tweak Davids suggestion in a way that this is possible?

E.g.:

a='m*s^-1'
b='speed~bgroup([,a, ])'
plot(1:10,main=parse(text=b)) 


Cheers
Jannis

--- David Winsemius dwinsem...@comcast.net schrieb am Di, 20.7.2010:

 Von: David Winsemius dwinsem...@comcast.net
 Betreff: Re: [R] square brackets in expression in plots
 An: Jannis bt_jan...@yahoo.de
 CC: r-h...@stat.math.ethz.ch
 Datum: Dienstag, 20. Juli, 2010 18:06 Uhr
 
 On Jul 20, 2010, at 12:42 PM, Jannis wrote:
 
  Dears,
  
  
  do you know whether it is possible to include any
 square parantheses (brackets) in an expression to use it as
 an axis label?
  
  e.g. I would like to have a label like (with the sub
 and superscripts correct):
  
  speed [m s^-1]
 
 Not exactly clear what you mean by correct.
  
  I know how to combine an expression with text via
 paste, but as I run soimething like:
  
  a='m*s^{-1}'
  plot(1:10,main=parse(text=a))
 
 ?plotmath
 
 This seemed to work fine with main, sub and xlab:
 
 a='speed [m*s^-1]'
  plot(1:10,main=parse(text=a))
 
 But maybe you wanted the square-brackets?
 
 a='speed~bgroup([, m*s^-1, ])'
 plot(1:10,main=parse(text=a))
 
  
  I found now way of doing this. I use the parse thing
 as I have all these units stored as strings that represent
 expressions.
  
  
  Cheers for any help!
  
  Jannis
  
  
  
  __
  R-help@r-project.org
 mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained,
 reproducible code.
 
 David Winsemius, MD
 West Hartford, CT
 
 



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Odp: If help

2010-07-21 Thread Petr PIKAL
Hi

I would make an interaction of those two vectors.

interaction(sample(letters[1:2], 20, replace=T),sample(letters[1:2], 20, 
replace=T))
 [1] a.a a.b b.b b.a b.b b.b a.a a.a b.b b.a a.a a.a b.b b.a a.a a.b b.a 
a.a b.b
[20] a.b
Levels: a.a b.a a.b b.b

You will get factor and you can change easily labels and change them to 
numeric vector.

See, ?levels, ?factor

Regards
Petr
 

r-help-boun...@r-project.org napsal dne 20.07.2010 19:14:28:

 Hi Y'all,
 
 I have some data in a table with 2 columns. There are two values: 
Reduction 
 and No Reduction.   I am trying to make a new variable change which 
recode 
 the combinations from column 1 and 2 into a single number.  Here is a 
snippet 
 from the table:
 
 [1,] NoReduction  NoReduction
  [2,] ReductionReduction
  [3,] NoReduction  NoReduction
  [4,] NoReduction  NoReduction
  [5,] ReductionReduction
  [6,] ReductionReduction
  [7,] ReductionReduction
  [8,] NoReduction  NoReduction
  [9,] NoReduction  NoReduction
 [10,] NoReduction  NoReduction
 
 This is the code that I have written so far..
 
 for (i in 1:nrow(change20082009))
 if(change20082009[i,1]=='No Reduction'  change20082009[i,2]=='No 
Reduction') 
 ){change20082009[i,3] - 1} else
 if(change20082009[i,1]=='No Reduction'  
change20082009[i,2]=='Reduction')
 {change20082009[i,3] - -1} else
 if(change20082009[i,1]=='Reduction' change20082009[i,2]=='No 
Reduction') 
 {change20082009[i,3] - 2} else
 if(change20082009[i,1]=='Reduction' 
change20082009[i,2]=='Reduction') 
 {change20082009[i,3] - 0}
 )
 
 I can't seem to get the code above to work..Any suggestions (I am sure 
it is 
 really basic)? Is there a better way to do this?
 
 Sincerely,
 
 tom
 
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to unsubscribe

2010-07-21 Thread barbara horta e costa
Hello,
I need to unsubcribe this email to the R help mailing list because I need to
create an email only for this mailing list. My email account doesn't have
that much space anymore and I don't have access to this every day.
How can I do that?
Thanks a lot

-- 
Bárbara H. Costa
Marine Biologist Researcher
SCIAENA - Marine Sciences and Cooperation
www.sciaena.org

ISPA | http://www.ispa.pt/ui/uie/index.asp
BIOMARES | http://www.ccmar.ualg.pt/biomares/
MSI-UCSB | http://www.msi.ucsb.edu/

bco...@ispa.pt
barbarahco...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ACCTGMX to 1223400 in R?

2010-07-21 Thread John1983

Dear all,

Thank you for the help. Yes I should have posted in the Bioconductor's
forum. Next time I sure will. The chartr function helped and now the code is
significantly faster. I have also used the Bstrings as suggested by Martin.
Thank you again.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/ACCTGMX-to-1223400-in-R-tp2294636p2296884.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] KALMAN

2010-07-21 Thread Research

Hello,

I am looking for some very simple, step by step, hands on 
application/examples/notes etc. on setting up a multivariate time series 
Kalman filter model in R.


Any help/pointers much appreciated.

Best regards,
Costas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Limited output

2010-07-21 Thread confusedcius

The details of my problem are as follows:
I have an sql that returns 2192 rows in sqlite.
In R, I typed the following:
 library(RSQLite)
 con - dbConnect(dbDriver(SQLite), dbname = C:\\sqlite\\... .sqlite)
 dbListTables(con)
#[1] tbl_n...  tbl_s...#
 cur - dbSendQuery(con,select ... from tbl_n... where... )
 bru - fetch(cur)
 bru

This returns the first 500 of the 2192 rows of sqlite.

If I then type 
 options()$max.print
I get
#[1] 9#
And from
 str(bru)
I get
#'data.frame':   500 obs. of  2 variables:
 $ vbl1 : chr...
 $ vbl2 : num...#

By the way, I appreciate your help!

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Limited-output-tp2295882p2296795.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Limited output

2010-07-21 Thread confusedcius

I just found the solution...
I used dbGetQuery instead of dbSendQuery. That's a more direct approach and
it works. Thanks again for your help.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Limited-output-tp2295882p2296822.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Objects within environment

2010-07-21 Thread Megh Dal
Hi all, I have following environments loaded with my current R session:

 search()
[1] .GlobalEnvpackage:stats package:graphics  
package:grDevices
[5] package:utils package:datasets  package:methods   Autoloads 
   
[9] package:base

How can I find the objects under a specific environment? Here I tries following:

 objects(envir=package:base)
Error in objects(envir = package:base) : invalid 'envir' argument

It would be great if somebody would point me the correct arguments for object() 
function to find the onjects associated with it. In help file it is written 
that:
 envir: an alternative argument to ‘name’ for specifying the
  environment evaluation environment. Mostly there for back
  compatibility
What is the wrong in my code?

Thanks,




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Obtaining the unmerged cases from one of the two data set

2010-07-21 Thread Mangalani Peter Makananisa
Dear R Gurus,

 

I am having two dummy csv data sets A and B containing 19 and 15
cases/observations respectively. From the two data set 13 cases are
intersection. From one of the two (any) data set, How do I then retrieve
the unmerged data ? let's take A for example, six cases must appear in
our results. See the R codes below.

 

Please assist.

 

Looking forward to hearing from the group sooner and thanking you in
advance

 

Kind regards

 

Mangalani Peter Makananisa

Statistical Analyst

South African Revenue Service (SARS)

Segmentation and Research : Data Modelling

Tel: +27 12 422 7357, Cell: +27 82 456 4669, Fax:+27 12 422 6579

E-Mail : pmakanan...@sars.gov.za mailto:pmakanan...@sars.gov.za   

 

 

 A = read.csv(C:/Documents and Settings/S1033067/Desktop/A.csv,
header = TRUE, dec =,, sep = ,)

 names(A)

[1] NAME   SALARY

 dim(A)[1]

[1] 19

 B = read.csv(C:/Documents and Settings/S1033067/Desktop/B.csv,
header = TRUE, dec =,, sep = ,)

 names(B)

[1] NAME B.SALARY

 dim(B)[1]

[1] 15

 common = merge(A,B)

 names(common)

[1] NAME SALARY   B.SALARY

 dim(common)[1]

[1] 13

 

 


Please Note: This email and its contents are subject to our email legal notice 
which can be viewed at http://www.sars.gov.za/Email_Disclaimer.pdf 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Brute-force clustering

2010-07-21 Thread syrvn

Hi all,

I use heatmap.2 for clustering and as you know you can choose between
several distance functions and clustering algorithms. Trying all
combinations isn't too bad but I have two further parameters and whenever I
change one of these I need do redo all cluster combinations. It probably
takes weeks to test all possible
combinations that's why I would like to use a kind of brute-force solution
to test all combinations.

One major issue is that I have 20 different cases c1-c20 and I need c1-c5 to
be in one cluster and c6-c8 in a different one and so on. So after trying a
certain combination I somehow need to check whether the cases fall into the
same clusters.

Cheers
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Brute-force-clustering-tp2296897p2296897.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Forest - Strata

2010-07-21 Thread Tim Howard
Coll,

An alternative approach is to do that subsetting yourself before sending it to 
RF and treat each group as an external validation group, as follows:
- extract Site A, build a RF model (Model 1) on sites B and C
- validate this model by running a predict on site A using the model, use ROCR 
or other evaluation metrics to look at the effectiveness of Model 1. 
- extract Site B, build a RF model (Model 2) on sites A and C.
- validate this model by trying to predict presence in site B using model 2.
- continue through all your sites.

This is called 'leave-one-out' and is used in some fields for model validation. 
 You final accuracy estimates of your model could be based on the averages of 
values obtained for each model. 

Hope that Helps. 
Tim



--

Message: 44
Date: Tue, 20 Jul 2010 08:48:04 -0700 (PDT)
From: Coll gbco...@gmail.com
To: r-help@r-project.org 
Subject: [R] Random Forest - Strata
Message-ID: 1279640884553-2295731.p...@n4.nabble.com
Content-Type: text/plain; charset=us-ascii


Hi all,

Had struggled in getting Strata in randomForest to work on this. 

Can I get randomForest for each of its TREE, to get ALL sample from some
strata to build tree, while leaving some strata TOTALLY untouched as oob?

e.g. in below, how I can tell RF to, 
- for tree 1 in the forest, to use only Site A and B to build the tree,
while using the WHOLE Site C data for the oob error rate,
- for tree 2, use only site A and C to build tree, while using whole site B
data for oob
- for tree 3, use Site B and C, A as oob...?

My command does not work as it would use some sample in all of the sites:
rforest.obj - randomForest(Presence.f ~., data=dataset.subset, strata =
site.factor)

while 
the setting the corresponding sampsize argument seems would only screen
out the Site in all tree building...

SitePresence  Length  Sulphur
A   Yes3.50 19.42
A   No  3.9051.09
A   No  3.6026.75
B   Yes2.60 9.71
B   No  2.209.77
B   No  2.608.60
B   No  3.0035.59
C   Yes3.50 16.07
C   No  3.4049.96
C   No  3.1035.35

Any idea / comments are welcomed.

Thanks in advance.

Coll
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Random-Forest-Strata-tp2295731p2295731.html 
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Objects within environment

2010-07-21 Thread Duncan Murdoch

On 21/07/2010 5:57 AM, Megh Dal wrote:

Hi all, I have following environments loaded with my current R session:


search()

[1] .GlobalEnvpackage:stats package:graphics  
package:grDevices
[5] package:utils package:datasets  package:methods   Autoloads
[9] package:base


How can I find the objects under a specific environment? Here I tries following:


objects(envir=package:base)

Error in objects(envir = package:base) : invalid 'envir' argument

It would be great if somebody would point me the correct arguments for object() 
function to find the onjects associated with it. In help file it is written 
that:
 envir: an alternative argument to ‘name’ for specifying the
  environment evaluation environment. Mostly there for back
  compatibility
What is the wrong in my code?


The easiest way to pick an item from the search list is by number:

objects(3)
 or
ls(3)

will give you the objects in the graphics package, with the search list 
as above.  You can also specify the name as the name argument, e.g.


objects(package:base)

If you want to use the envir argument (why?), you need to give an 
environment, not the name of one.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Chi-square distribution probability density function:

2010-07-21 Thread Knut Krueger
Hi to all I found
an formular of an **
***p-Value Calculator for the Chi-Square test*
*http://www.danielsoper.com/statcalc/calc11.aspx*
*with the formula*
*http://www.danielsoper.com/statkb/topic11.aspx*
*what's the gamma function of this formula in r?*
*df=5*
*ch2=25.50878*
*the following code does not give the result  0.001 for the values above *
*p= ((0.5^(df/2))/gamma(df))*(ch2^((df/2)-1))* (2.718281828459^(-ch2/2))  
or is there any other error?

Regards Knut
***

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Obtaining the unmerged cases from one of the two data set

2010-07-21 Thread David Winsemius


On Jul 21, 2010, at 5:33 AM, Mangalani Peter Makananisa wrote:


Dear R Gurus,


I saw no reason to copy Rob Hyndman. I did not see that this involves  
any of the packages he maintains.


I am having two dummy csv data sets A and B containing 19 and 15
cases/observations respectively. From the two data set 13 cases are
intersection. From one of the two (any) data set, How do I then  
retrieve

the unmerged data ? let's take A for example, six cases must appear in
our results. See the R codes below.


?setdiff

Perhaps:

setdiff( (NAME(A), NAME(B) )

You can also do a merge that is an outer join that includes all the  
NAME information and then extract the records with SALARY  
and .B.SALARY data. Untested in absence of working example:


?merge
mer - merge(A,B, all=TRUE)
mer[ mer$NAME %in% setdiff(NAME(A), NAME(B) ),  ]

--
David.





A = read.csv(C:/Documents and Settings/S1033067/Desktop/A.csv,

header = TRUE, dec =,, sep = ,)


names(A)


[1] NAME   SALARY


dim(A)[1]


[1] 19


B = read.csv(C:/Documents and Settings/S1033067/Desktop/B.csv,

header = TRUE, dec =,, sep = ,)


names(B)


[1] NAME B.SALARY


dim(B)[1]


[1] 15


common = merge(A,B)



names(common)


[1] NAME SALARY   B.SALARY


dim(common)[1]


[1] 13





David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Validation in R for GAMM

2010-07-21 Thread Karen
My GAMM model is to find drivers of species richness in forests is

gamm1- gamm(Total Species Richness ~ fROT + s(PH) + s(LOI) + ASP +
s(SQRT_ELEV) + CANCOV + s(SQRT_TOTCWD) + s(WELLF) + s(WELLN) +
s(OLDWDLD) + s(DISTWOOD) + s(Annprec), random=list (fSITE =~1), family
= poisson, data = BIOFOR3)

My issues are that the validation graphs are using methods I'm
unfamiliar with e.g. square rooting values and Pearson's residuals.
I'm following as best I can recommendations in Wood 2006 and Zuur 2009
but have some (I think) basic questions on graphical validation:

1. plot sqrt transformed fitted values vs square root transformed
observed values - should be straight line.
I presume the observed values mentioned above refer only to observed
RESPONSE variable

2. Pearson residuals vs. sqrt fitted values (should form band with no
patterns)
I'm totally unfamilar with using Pearson residuals.  Until I can see
example of a band with no patterns I can't really say if my output is
OK! Could this band also be described as a cloud of points with no
pattern?

I have 2 factors and 2 unsmoothed explanatory variables in my model –
what residual graphs are suitable for these? Is it the traditional
ones of normalized residuals vs raw explanatory variables?

Also for a lot of other model checking residuals are plotted against
explanatory variables..what’s appropriate residuals to plot
vs.smoothed explanatory variables for a GAMM?

3.   Raw residuals (observed vs fitted) vs. sqrt fitted values (should
show clear cone)
 But my output doesn't show clear cone and I don't know what action to
take to rectify this issue
 Infact graphs 2. and 3. are the exact same.

Some help with this would be much appreciated.

Thanks,

Karen

The only examples I have for graph output is Wood 2006 Fig 6.11

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Chi-square distribution probability density function:

2010-07-21 Thread David Winsemius


On Jul 21, 2010, at 6:52 AM, Knut Krueger wrote:


Hi to all I found
an formular of an **
***p-Value Calculator for the Chi-Square test*
*http://www.danielsoper.com/statcalc/calc11.aspx*
*with the formula*
*http://www.danielsoper.com/statkb/topic11.aspx*
*what's the gamma function of this formula in r?*
*df=5*
*ch2=25.50878*
*the following code does not give the result  0.001 for the values  
above *
*p= ((0.5^(df/2))/gamma(df))*(ch2^((df/2)-1))* (2.718281828459^(- 
ch2/2))

or is there any other error?


Check your implementation of that formula. You made an error in the  
gamma argument.


See also:

?Chisquare
?gamma
--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] xtable with ifelse statement

2010-07-21 Thread Jannis
What exactly do you want to do? To me it looks more like a problem with the 
arrangement of your lists as with xtable (or ifelse) (so you are better next 
time of with a more precise subject line).

Please read the posting guide and provide some reproducable code so we can help 
you (and include your lists etc. via  dump('x',file=stdout())).

As first suggestion I would rearrange your lists in a way that the data you 
want to have together ends up in one list, but there are several alternatives, 
depending on what you want to achieve.


Jannis

--- vivi84 cipullu...@hotmail.com schrieb am Mi, 21.7.2010:

 Von: vivi84 cipullu...@hotmail.com
 Betreff: [R] xtable with ifelse statement
 An: r-help@r-project.org
 Datum: Mittwoch, 21. Juli, 2010 07:23 Uhr
 
 Hi there,
 I'm very new on R and I hope someone can help me to solve
 the problem in
 using the ifelse statement with the xtable function(library
 xtable). 
 I'm trying to get the printing of the elements of two lists
 in a sorted way.
 These two list have in common the their names.
 I will try to give an example:
 the first list looks like this:
 
 $code1
 
 Code              code1
 Nation         
    Usa
 
 $code2
 
 Code              code2
 Nation         
    Spain
 
 
 the second one is like this
 
 $code 1
 var1               
     87 
 var2               
    73
 
 $code2  
 var1               
     34
 var2               
     23
 
 
 As you can see the two list have the same names.I tried to
 implement the
 following ifelse statement but it doesn't work
 orde2=function(x,y){
 ifelse(names(x)==names(y),list(print(xtable(x)),print(xtable(y))),NULL)
 
 What I want is the xtable of this first:
 $code1
 
 Code              code1
 Nation         
    Usa
 
 
 and after this:
 
 $code 1
 var1               
     87 
 var2               
    73
 
 
 And as the first case, the xtable of this;
 $code2
 
 Code              code2
 Nation         
    Spain
 
 
 and after this:
 
 $code2  
 var1               
     34
 var2               
     23
 
 Someone Knows if it possible to do this and how??
 Thanks a lot!
 
 -- 
 View this message in context: 
 http://r.789695.n4.nabble.com/xtable-with-ifelse-statement-tp2296694p2296694.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org
 mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.
 



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Contingency Table Analysis - specific cell to specific cell comparisons

2010-07-21 Thread Tsunhin John Wong
Dear R users,

I have a question of how to do some specific cell to cell comparisons
on a R x C contingency table.
The table is a 3 x 5 table with frequency / count data.

 langcons.table - table(lang, cons)
 langcons.table[cbind(lang,cons)] - freq
 langcons.table

  Adj Int Oth Pas Tra
C  69 221  17   3 198
E  56 214  33  31 174
J  36 291   8   9 164

I know how to do an independent model test using Poisson in glm
 glm.out1 - glm(freq~lang+cons, family=poisson, data=langcons.data)
 summary(glm.out1)

And then fit the saturated model
 glm.out2 - glm(freq~lang*cons, family=poisson, data=langcons.data)
 summary(glm.out2)

However, the results are difficult to interpret:
C and Adj are used to as a baseline.
And I can only see main effects and interactions and *always according
to the baseline*.
Coefficients:
 Estimate Std. Error z value Pr(|z|)
(Intercept)
lang1
lang2
cons1
cons2
cons3
cons4
lang1:cons1
lang2:cons1
lang1:cons2
lang2:cons2
lang1:cons3
lang2:cons3
lang1:cons4
lang2:cons4

If anyone know, please suggest me some way to do specific cell to cell
comparison on such a contingency table.
Say, to compare pairs of cells:
along a column: 3 vs 31, 9 vs 31, 3 vs 9
along a row: 36 vs 9
or even across column and row: 36 vs 31, and 36 vs 3

Thanks for your help in advance.

Best,

John

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Contingency Table Analysis - specific cell to specific cell comparisons

2010-07-21 Thread David Winsemius


On Jul 21, 2010, at 8:07 AM, Tsunhin John Wong wrote:


Dear R users,

I have a question of how to do some specific cell to cell comparisons
on a R x C contingency table.
The table is a 3 x 5 table with frequency / count data.


langcons.table - table(lang, cons)
langcons.table[cbind(lang,cons)] - freq
langcons.table


 Adj Int Oth Pas Tra
C  69 221  17   3 198
E  56 214  33  31 174
J  36 291   8   9 164

I know how to do an independent model test using Poisson in glm

glm.out1 - glm(freq~lang+cons, family=poisson, data=langcons.data)
summary(glm.out1)


And then fit the saturated model

glm.out2 - glm(freq~lang*cons, family=poisson, data=langcons.data)
summary(glm.out2)


However, the results are difficult to interpret:
C and Adj are used to as a baseline.
And I can only see main effects and interactions and *always according
to the baseline*.
Coefficients:
Estimate Std. Error z value Pr(|z|)
(Intercept)
lang1
lang2
cons1
cons2
cons3
cons4
lang1:cons1
lang2:cons1
lang1:cons2
lang2:cons2
lang1:cons3
lang2:cons3
lang1:cons4
lang2:cons4

If anyone know, please suggest me some way to do specific cell to cell
comparison on such a contingency table.


Even if you are daunted by the task of plugging the covariates into  
the formula, exp(intercept+sum(beta_N*var_n)), you can always use the  
predict function to create an estimate for all (or a specific set) of  
the covariates. They come out on the log(rate) scale so would need to  
be exponentiated. Consult your instructor for further help.




Say, to compare pairs of cells:
along a column: 3 vs 31, 9 vs 31, 3 vs 9
along a row: 36 vs 9
or even across column and row: 36 vs 31, and 36 vs 3



David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to unsubscribe

2010-07-21 Thread Jannis
As in nearly every e-mail list, you just need to click on one of the links at 
the end of each mail (its the first link in this case) and follow the 
instructions.


Jannis

--- barbara horta e costa barbarahco...@gmail.com schrieb am Mi, 21.7.2010:

 Von: barbara horta e costa barbarahco...@gmail.com
 Betreff: [R] how to unsubscribe
 An: r-help@r-project.org
 Datum: Mittwoch, 21. Juli, 2010 09:50 Uhr
 Hello,
 I need to unsubcribe this email to the R help mailing list
 because I need to
 create an email only for this mailing list. My email
 account doesn't have
 that much space anymore and I don't have access to this
 every day.
 How can I do that?
 Thanks a lot
 
 -- 
 Bárbara H. Costa
 Marine Biologist Researcher
 SCIAENA - Marine Sciences and Cooperation
 www.sciaena.org
 
 ISPA | http://www.ispa.pt/ui/uie/index.asp
 BIOMARES | http://www.ccmar.ualg.pt/biomares/
 MSI-UCSB | http://www.msi.ucsb.edu/
 
 bco...@ispa.pt
 barbarahco...@gmail.com
 
     [[alternative HTML version deleted]]
 
 
 -Integrierter Anhang folgt-
 
 __
 R-help@r-project.org
 mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.
 



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Forest - Strata

2010-07-21 Thread mxkuhn
If you use the index argument of the trainControl() function in the caret 
package, the train() function can be used for this type of resampling (and 
you'll get some decent summaries and visualizations to boot)

Max

On Jul 21, 2010, at 7:11 AM, Tim Howard tghow...@gw.dec.state.ny.us wrote:

 Coll,
 
 An alternative approach is to do that subsetting yourself before sending it 
 to RF and treat each group as an external validation group, as follows:
 - extract Site A, build a RF model (Model 1) on sites B and C
 - validate this model by running a predict on site A using the model, use 
 ROCR or other evaluation metrics to look at the effectiveness of Model 1. 
 - extract Site B, build a RF model (Model 2) on sites A and C.
 - validate this model by trying to predict presence in site B using model 2.
 - continue through all your sites.
 
 This is called 'leave-one-out' and is used in some fields for model 
 validation.  You final accuracy estimates of your model could be based on the 
 averages of values obtained for each model. 
 
 Hope that Helps. 
 Tim
 
 
 
 --
 
 Message: 44
 Date: Tue, 20 Jul 2010 08:48:04 -0700 (PDT)
 From: Coll gbco...@gmail.com
 To: r-help@r-project.org 
 Subject: [R] Random Forest - Strata
 Message-ID: 1279640884553-2295731.p...@n4.nabble.com
 Content-Type: text/plain; charset=us-ascii
 
 
 Hi all,
 
 Had struggled in getting Strata in randomForest to work on this. 
 
 Can I get randomForest for each of its TREE, to get ALL sample from some
 strata to build tree, while leaving some strata TOTALLY untouched as oob?
 
 e.g. in below, how I can tell RF to, 
 - for tree 1 in the forest, to use only Site A and B to build the tree,
 while using the WHOLE Site C data for the oob error rate,
 - for tree 2, use only site A and C to build tree, while using whole site B
 data for oob
 - for tree 3, use Site B and C, A as oob...?
 
 My command does not work as it would use some sample in all of the sites:
 rforest.obj - randomForest(Presence.f ~., data=dataset.subset, strata =
 site.factor)
 
 while 
 the setting the corresponding sampsize argument seems would only screen
 out the Site in all tree building...
 
 SitePresence  Length  Sulphur
 AYes   3.5019.42
 ANo3.9051.09
 ANo3.6026.75
 BYes   2.609.71
 BNo2.209.77
 BNo2.608.60
 BNo3.0035.59
 CYes   3.5016.07
 CNo3.4049.96
 CNo3.1035.35
 
 Any idea / comments are welcomed.
 
 Thanks in advance.
 
 Coll
 -- 
 View this message in context: 
 http://r.789695.n4.nabble.com/Random-Forest-Strata-tp2295731p2295731.html 
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Saving R plots as image in oracle database tables

2010-07-21 Thread vikrant

Dear All,
I want to know how to save R plots as image/object in a oracle database??
I am using RODBC package for connecting R to oracle
Please explain by giving example..
Thanks in advance


Cheers,
Vikrant
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Saving-R-plots-as-image-in-oracle-database-tables-tp2297009p2297009.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] fix()ing an S4 method

2010-07-21 Thread Albert-Jan Roskam
Hi R experts,
 
The fix() function canbe used to edit normal functions. I would like to know 
whether it's also possible to use something similar to edit a method of an S4 
class. In other words, is there a fix-like function that allows me to edit 
method definitions without having to go back to the source code?
 
setGeneric (name=doStuff,def = function(object){standardGeneric(doStuff)})
setClass(SomeClass, representation(text = character))
setMethod(f = doStuff, signature (SomeClass), definition = function(object){
    return(print(paste(***, obj...@text)))  })
fix(doStuff) # Does NOT give the intended result. How can I make R show the 
method definition?

Cheers!!
Albert-Jan

~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to unsubscribe

2010-07-21 Thread barbara horta e costa
I already changed my email account.
Thanks a lot

On 21 July 2010 12:50, Jannis bt_jan...@yahoo.de wrote:

 As in nearly every e-mail list, you just need to click on one of the links
 at the end of each mail (its the first link in this case) and follow the
 instructions.


 Jannis

 --- barbara horta e costa barbarahco...@gmail.com schrieb am Mi,
 21.7.2010:

  Von: barbara horta e costa barbarahco...@gmail.com
  Betreff: [R] how to unsubscribe
  An: r-help@r-project.org
  Datum: Mittwoch, 21. Juli, 2010 09:50 Uhr
   Hello,
  I need to unsubcribe this email to the R help mailing list
  because I need to
  create an email only for this mailing list. My email
  account doesn't have
  that much space anymore and I don't have access to this
  every day.
  How can I do that?
  Thanks a lot
 
  --
  Bárbara H. Costa
  Marine Biologist Researcher
  SCIAENA - Marine Sciences and Cooperation
  www.sciaena.org
 
  ISPA | http://www.ispa.pt/ui/uie/index.asp
  BIOMARES | http://www.ccmar.ualg.pt/biomares/
  MSI-UCSB | http://www.msi.ucsb.edu/
 
  bco...@ispa.pt
  barbarahco...@gmail.com
 
  [[alternative HTML version deleted]]
 
 
  -Integrierter Anhang folgt-
 
  __
  R-help@r-project.org
  mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
  and provide commented, minimal, self-contained,
  reproducible code.
 






-- 
Bárbara H. Costa
Marine Biologist Researcher
SCIAENA - Marine Sciences and Cooperation
www.sciaena.org

ISPA | http://www.ispa.pt/ui/uie/index.asp
BIOMARES | http://www.ccmar.ualg.pt/biomares/
MSI-UCSB | http://www.msi.ucsb.edu/

bco...@ispa.pt
barbarahco...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] DIF Analysis starting from a gpcm class object

2010-07-21 Thread Dr. Federico Andreis

Dear useRs,

does any of you have suggestions on how to conduct a proper DIF analysis
starting from a model of
class gpcm (from the wonderful package ltm by prof. Rizopoulos)? 

difR will handle only dichotomous items, and I have a mix of dicho- and
polytomous ones (that's why I chose the partial credit model). 

I also found the package lordif, but I'm not really sure if that's what I
need.

I basically want to regress item difficulties on some other covariates (sex,
age class,...)..

Is there a package you know of that handle such models for a DIF analysis or
should I code such a thing by myself? and if so, how can I extract what I
need from the model object?

thanks to everybody

/federico

-
Dr. Federico Andreis
Università degli Studi di Milano-Bicocca, PhD Student
MEB Department, Karolinska Institutet, Stockholm, Visiting PhD Student
-
Problem: To Catch a Lion in the Sahara Desert - The Dirac Method 

We observe that wild lions are, ipso facto, not observable in the Sahara
Desert. Consequently, if there are any lions in the Sahara, they are tame.
The capture of a tame lion may be left as an exercise for the reader.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/DIF-Analysis-starting-from-a-gpcm-class-object-tp2297051p2297051.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fractional Polynomials - Hazard Ratios and Relative Hazard Plots

2010-07-21 Thread Laura Bonnett
Dear All,

I am using Windows and R version 2.9.2 with libraries cmprsk, mfp and
Design.

I have a dataset with approximately 1700 patients (1 row per patient) and I
have 12 outcomes, three of which are continuous.  I have performed
univariate analyses to see if any factors are associated with a higher
likelihood of the event of interest (achieving 12 month remission from
epileptic seizures) and also an analysis adjusted for multiple variables.

I have tried log and fractional polynomial (FP) transformations of each
continuous variable.  In the case of age, used for the example below, the FP
transformation led to the best model fit according to the AIC.  I have
therefore applied this transformation for all analyses.

To begin with I have fitted a Cox model stratified by randomisation period,
rpa, (either before or after a certain date).

fit1 - mfp(Surv(Remtime,Rcens) ~ fp(age) + strata(rpa), family=cox,
data=nearma, select=0.05, verbose=TRUE)

I would like two things from this model, hazard ratios and and an associated
hazard ratio plot.  I am aware that the hazard ratios produced from a
fractional polynomial transformation are not to be used directly (i.e. those
obtained from summary(coxfitf1)).  Instead the derived functional form of
the variable should be used to estimate hazard ratios post hoc.  I have
attached a word document explaining how hazard ratios and confidence
interval can be derived and given a worked example for the variable, age.
The univariate results are:

Age (years)

≤10

(10 to 25]

(25 to 37]

(37 to 50]

(50 to 71]

71

1.00

1.00 (1.00 to 1.00)

0.99 (0.99 to 1.00)

0.99 (0.99 to 0.99)

0.99 (0.98 to 0.99)

0.98 (0.97 to 0.99)

To create a plot of the relative hazard I have used the code:
plot(nearma$age,predict(fit1,type=risk),xlab=Age,ylab=Relative Hazard)
The produced plot is attached.
As you can clearly see, the hazard ratios above and the relative hazard plot
do not agree.
This is also the case for the other two continuous variables that have been
transformed via the FP approach.

The hazard ratios for age using the model adjusted for multiple variables
are as follows, which do coincide with the plot:

Age (years)

≤10

(10 to 25]

(25 to 37]

(37 to 50]

(50 to 71]

71

1.00

0.86 (0.77 to 0.95)

0.78 (0.65 to 0.94)

0.80 (0.64 to 1.00)

1.01 (0.81 to 1.25)

1.61 (1.27 to 2.05)

Can anyone therefore explain why the univariate hazard ratios do not
coincide with the relative hazard plot and yet the hazard ratios from the
multivariable model do?  I know that the calculations are correct for both
sets of hazard ratios.

Thank you for any help you can provide as I am at a loss to explain the
difference in the plot fo the calculations - they should, after all, be
saying the same thing!

Thank you,
Laura
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R Cheat Sheets and Reference Cards

2010-07-21 Thread Tim Church
FYI - There is a new collection of free cheat sheets and reference cards for
R on DevCheatSheet - http://devcheatsheet.com/tag/r/

If anyone knows of any other R cheat sheets that are not listed, please use
the 'Suggest a Missing Cheat Sheet' link on the site.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Piecewise regression using lme()

2010-07-21 Thread goulvenS

Hi everyone,

I'm trying to fit a of piecewise regression model on a time series. The idea
is to divide the series into segments and then to apply linear regression
models on each segment but in a global way and considering
heteroskedasticity between the segments. For example, I build a time series
y with 3 segments: 

segment1=1:20+rnorm(20,0,2)
segment2=20-2*1:30+rnorm(30,0,5)
segment3=-40+0.5*1:15+rnorm(15,0,1)

group=c(rep(1,20),rep(2,30),rep(3,15))
y=c(segment1,segment2,segment3)

Data=data.frame(y,t=1:65,group=as.factor(group))

the model I'd like to fit is:

y_t=
(beta_01+beta_11*t+error_1)*(group==1)+(beta_02+beta_12*t+error_2)*(group==2)+(beta_03+beta_13*t+error_3)*(group==3)

It looks like a mixed effects model were the fixed effect are the piecewise
linear regression parts (beta_0i+beta_1i*t) and the random effects are the
variance components error_i. 

The problem is that I can't find the way the write this model correctly
using the lme() function of the package nlme. I can't remove the global
intercept and the global variance component. Here's what I tried:

#1 Using a prior piecewise linear regression

lm.list=lmList(y~t|group,Data)

model.lme=lme(lm.list,weights=varIdent(form=~1|group))

#2 Trying to estimate the whole model directly and considering the different
lines as random effects

model.lme=lme(y~1,random=~t|group,data=Data)

but the intercept remains... 

I read a lot of R-help messages before posting this one (my first!) and I'm
not getting any closer. It looks like no one tried to estimate the exact
same model. I'll be very grateful if someone could help me on this. 

Thanks

Goulven




-- 
View this message in context: 
http://r.789695.n4.nabble.com/Piecewise-regression-using-lme-tp2297118p2297118.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error message: attempt to set rownames on object with no dimensions

2010-07-21 Thread Igor Blanco
Hi there,

I am trying to analyze some data using the lme package.
In my case there is a variable called Newmarker that can only take 2 values
(number 1 or number 2).
I have used the as.factor to remark this fact but an error message appears
stating: attempt to set rownames on object with no dimensions.

Here is the code I have used:

*Dataset -   read.table(Data2.txt, header=TRUE, sep=, na.strings=NA,
dec=., strip.white=TRUE)

Dataset$Newmarker - as.factor(Dataset$Newmarker)

attach(Dataset)

lme.1 - lme(Newmarker ~ Age, data=Dataset,  random= ~1|ID)

summary(lme.1)*

If I don't use the as.factor I have no problems but I don't think this is
the correct way to proceed... The other variables I use can have any value.

Can anybody help me please?

Thanking you in advance,

Regards,

Igor Blanco

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error message: attempt to set rownames on object with no dimensions

2010-07-21 Thread Prof Brian Ripley
?lme (at least the one in package nlme) seems not to mention that the 
response in 'fixed' must be a numeric variable, but it must (and a 
factor is not numeric).


If you want to use a binary response, you need to use other methods, 
and it would be better to ask on R-sig-mixed-models making clear what 
your aims actually are.


On Wed, 21 Jul 2010, Igor Blanco wrote:


Hi there,

I am trying to analyze some data using the lme package.


Perhaps you mean the nlme package?


In my case there is a variable called Newmarker that can only take 2 values
(number 1 or number 2).
I have used the as.factor to remark this fact but an error message appears
stating: attempt to set rownames on object with no dimensions.

Here is the code I have used:

*Dataset -   read.table(Data2.txt, header=TRUE, sep=, na.strings=NA,
dec=., strip.white=TRUE)

Dataset$Newmarker - as.factor(Dataset$Newmarker)

attach(Dataset)

lme.1 - lme(Newmarker ~ Age, data=Dataset,  random= ~1|ID)

summary(lme.1)*

If I don't use the as.factor I have no problems but I don't think this is
the correct way to proceed... The other variables I use can have any value.

Can anybody help me please?

Thanking you in advance,

Regards,

Igor Blanco

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error message: attempt to set rownames on object with no dimensions

2010-07-21 Thread David Winsemius


On Jul 21, 2010, at 9:39 AM, Igor Blanco wrote:


Hi there,

I am trying to analyze some data using the lme package.
In my case there is a variable called Newmarker that can only take 2  
values

(number 1 or number 2).
I have used the as.factor to remark this fact but an error message  
appears

stating: attempt to set rownames on object with no dimensions.

Here is the code I have used:

*Dataset -   read.table(Data2.txt, header=TRUE, sep=,  
na.strings=NA,

dec=., strip.white=TRUE)

Dataset$Newmarker - as.factor(Dataset$Newmarker)


Why do you use as.factor? It's probably already a factor. What does  
str(Dataset) show?




attach(Dataset)


You do not need to do this, and it creates potential for confusion.  
The data= argument will allow the column names to be accessed by the  
lme() function.




lme.1 - lme(Newmarker ~ Age, data=Dataset,  random= ~1|ID)

summary(lme.1)*

If I don't use the as.factor I have no problems but I don't think  
this is
the correct way to proceed... The other variables I use can have any  
value.



--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] The opposite of lag

2010-07-21 Thread Dimitri Liakhovitski
Hello!

I have a data frame A (below) with a grouping factor (group). I take
my DV and create the new, lagged DV by applying the function lag.it
(below). It works fine.

A - data.frame(year=rep(c(1980:1984),3), group=
factor(sort(rep(1:3,5))), DV=c(rnorm(15)))
lag.it - function(x) {
  DV - ts(x$DV, start = x$year[1])
  idx - seq(length = length(DV))
  DVs - cbind(DV, lag(DV, -1))[idx,]
  out-cbind(x, DVs[,2])  # wages[,2]
   names(out)[length(out)]-DV.lag
   return(out)
}
A
A.lagged - do.call(rbind, by(A, A$group, lag.it))
A.lagged


Now, I am trying to create the oppostive of lag for DV (should I call
it lead?)
I tried exactly the same as above, but with a different number under
lag function (below), but it's not working. I am clearly doing
something wrong. Any advice?
Thanks a lot!


lead.it - function(x) {
  DV - ts(x$DV, start = x$year[1])
  idx - seq(length = length(DV))
  DVs - cbind(DV, lag(DV, 2))[idx,]
 out-cbind(x, DVs[,2])
   names(out)[length(out)]-DV.lead
   return(out)
}
A
A.lead - do.call(rbind, by(A, A$group, lead.it))
A.lead

-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How do you calculate DIC from coda files (R2WinBUGS)?

2010-07-21 Thread Patrick McKann
 I tried calculating pD as 1/2 variance of the deviance, but I got hugely
inflated numbers.

Does anybody know how to calculate pD from the coda files output from
R2WinBUGS?

By the way, 'set DIC' is greyed out for some reason within WinBUGS, so I
can't monitor DIC.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] best way to apply a list of functions to a dataset ?

2010-07-21 Thread Peter Ehlers

Dennis' ddply solution would be my choice. Here is
a small variation that makes it easy to modify what
list of functions is applied:

#
ma- melt(attitude)

f - function(x,v) summarise(x,
 mean = mean(v),
 sd = sd(v),
 skewness = skewness(v),
 mean.gt.med = mean.gt.med(v)
)

ddply(ma, .(variable), function(x) f(x, v = x[[value]]))
#

Another option is to use data.frame in place of summarise:

#
f - function(x,v) data.frame(
 mean = mean(v),
 sd = sd(v),
 skewness = skewness(v),
 mean.gt.med = mean.gt.med(v)
)
#

  -Peter Ehlers


On 2010-07-21 0:41, Dennis Murphy wrote:

Hi:

On Tue, Jul 20, 2010 at 5:37 PM, Glen Barnettglnbr...@gmail.com  wrote:


Hi Dennis,

Thanks for the reply.

Yes, that's easier, but the conversion to a matrix with rbind has
converted the output of that final function to a numeric.



If you look at the output of lapply(attitude, f), you'll see that the
conversion from logical to numeric has already taken place. Different
components of lists can have different types, but within a component, all of
the elements must have the same class.
You can patch up the result as follows:

x- data.frame(do.call(rbind, lapply(attitude, f)))
x[, 5]- as.logical(x[, 5])

x

meansdskewness median mean.gt.med
rating 64.6 12.172562 -0.35792491   65.5   FALSE
complaints 66.6 13.314757 -0.21541749   65.0TRUE
privileges 53.1 12.235430  0.37912287   51.5TRUE
learning   56.36667 11.737013 -0.05403354   56.5   FALSE
raises 64.6 10.397226  0.19754317   63.5TRUE
critical   74.76667  9.894908 -0.86577893   77.5   FALSE
advance42.9 10.288706  0.85039799   41.0TRUE

but if you're doing this sort of thing over a large data frame such fixes
may be impractical.


I included that last function in the example secifically to preclude

people assuming that functions would always return the same type.




There is a plyr solution, although it's a little more long-winded than I'd
prefer in the end:

library(ggplot2)
# melt the data frame so that the variables become factor levels
  ma- melt(attitude)
Using  as id variables

dim(ma)

[1] 210   2
# Use ddply to get the set of summaries by variable:

ddply(ma, .(variable), summarise, mean = mean(value), sd = sd(value),
skewness = skewness(value), median = median(value),
mean.gt.med = mean.gt.med(value))
 variable meansdskewness median mean.gt.med
1 rating 64.6 12.172562 -0.35792491   65.5   FALSE
2 complaints 66.6 13.314757 -0.21541749   65.0TRUE
3 privileges 53.1 12.235430  0.37912287   51.5TRUE
4   learning 56.36667 11.737013 -0.05403354   56.5   FALSE
5 raises 64.6 10.397226  0.19754317   63.5TRUE
6   critical 74.76667  9.894908 -0.86577893   77.5   FALSE
7advance 42.9 10.288706  0.85039799   41.0TRUE

Notice that now the logical class of mean.gt.med is preserved.  The trick
with ddply() in package plyr is that, in this case, it is convenient to melt
the data frame first before doing the summarizations. This is because
ddply() requires a grouping variable - in this example, the groups are the
variables themselves.

HTH,
Dennis

I guess this doesn't matter too much for a logical, but what if

instead the function returned a character (say mean, median, or
equal - indicating which one was larger, or equal which could
easily happen with discrete data). This precludes using rbind (which I
also used at first, before I noticed that sometimes I could have
functions that don't return numerics).

Glen


On Tue, Jul 20, 2010 at 6:55 PM, Dennis Murphydjmu...@gmail.com  wrote:

Hi:

This might be a little easier (?):

library(datasets)
skewness- function(x) mean(scale(x)^3)
mean.gt.med- function(x) mean(x)median(x)

# --
# construct the function to apply to each variable in the data frame
f- function(x) c(mean = mean(x), sd = sd(x), skewness = skewness(x),
  median = median(x), mean.gt.med = mean.gt.med(x))

# map function to each variable with lapply and combine with do.call():
do.call(rbind, lapply(attitude, f))
meansdskewness median mean.gt.med
rating 64.6 12.172562 -0.35792491   65.5   0
complaints 66.6 13.314757 -0.21541749   65.0   1
privileges 53.1 12.235430  0.37912287   51.5   1
learning   56.36667 11.737013 -0.05403354   56.5   0
raises 64.6 10.397226  0.19754317   63.5   1
critical   74.76667  9.894908 -0.86577893   77.5   0
advance42.9 10.288706  0.85039799   41.0   1

HTH,
Dennis


On Mon, Jul 19, 2010 at 10:51 PM, Glen Barnettglnbr...@gmail.com

wrote:


Assuming I have a matrix of data (or under some restrictions that will
become obvious, possibly a data frame), I want to be able to apply a
list of functions (initially producing a single number from a vector)
to the data and produce a data 

Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday

2010-07-21 Thread Ben Bolker
Harlan Harris harlan at harris.name writes:

 
 Hi,
 
 I'm using RGoogleDocs/RCurl to update a Google Spreadsheet. Everything
 worked OK until this morning, when my ability to write into spreadsheet
 cells went away. I get the following weird error:
 
 Error in els[[type + 1]] : subscript out of bounds
 
 Looking at the Google Docs API changelog, I see the following:
 
 http://code.google.com/apis/spreadsheets/changelog.html
 Release 2010-01 (July 14, 2010)
 
 This is an advanced notice about an upcoming change.
 
- Starting July 19, 2010, all links returned by all Spreadsheets API
feeds will use HTTPS. This is being done in the interests of increased
security. If you require the use of HTTP, we recommend that you remove the
replace https with http in these links. Another announcement will be made
on July 19, 2010, when this change goes to production.
 
 I suspect this is the problem. Fixing it is above my head, I'm afraid. Could
 anyone help? This is urgent. Thank you,


   This is an Omegahat package (took me a little while to find it). Perhaps
you should write to the package maintainer?

library(RGoogleDocs)
help(package=RGoogleDocs)

or, more obscurely:

help(package=RGoogleDocs)$info[[1]][9]

 (there may be a better way to deal with objects of type packageInfo
but I can't figure it out right at the moment).
  It looks as though one might be able to fix this by hacking the 
hard-coded URLs in the code, but as you suggest that might be above
your head.

  good luck ...
Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday

2010-07-21 Thread Henrique Dallazuanna
On Wed, Jul 21, 2010 at 11:24 AM, Ben Bolker bbol...@gmail.com wrote:

 Harlan Harris harlan at harris.name writes:

 
  Hi,
 
  I'm using RGoogleDocs/RCurl to update a Google Spreadsheet. Everything
  worked OK until this morning, when my ability to write into spreadsheet
  cells went away. I get the following weird error:
 
  Error in els[[type + 1]] : subscript out of bounds
 
  Looking at the Google Docs API changelog, I see the following:
 
  http://code.google.com/apis/spreadsheets/changelog.html
  Release 2010-01 (July 14, 2010)
 
  This is an advanced notice about an upcoming change.
 
 - Starting July 19, 2010, all links returned by all Spreadsheets API
 feeds will use HTTPS. This is being done in the interests of increased
 security. If you require the use of HTTP, we recommend that you remove
 the
 replace https with http in these links. Another announcement will be
 made
 on July 19, 2010, when this change goes to production.
 
  I suspect this is the problem. Fixing it is above my head, I'm afraid.
 Could
  anyone help? This is urgent. Thank you,


This is an Omegahat package (took me a little while to find it). Perhaps
 you should write to the package maintainer?

 library(RGoogleDocs)
 help(package=RGoogleDocs)

 or, more obscurely:

 help(package=RGoogleDocs)$info[[1]][9]

  (there may be a better way to deal with objects of type packageInfo
 but I can't figure it out right at the moment).


Maybe:
packageDescription('RGoogleDocs', fields = 'Author')


  It looks as though one might be able to fix this by hacking the
 hard-coded URLs in the code, but as you suggest that might be above
 your head.

  good luck ...
 Ben Bolker

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Clustering groups

2010-07-21 Thread syrvn

Hi,

is there a way in R to identify those cluster methods / distance measures
which best reflect predefined cluster groups.

Given 10 observations O1...O10. Optimally, these 10 observations cluster as
follows:
cluster1: O1, O2, O3, O4
cluster2: O5, O6
cluster3: O7, O8, O9, O10.

What I want is a method which identifies that cluster method / distance
measure which best reflect my predefined groups.

Is that somehow possible to do?
Cheers
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Clustering-groups-tp2297210p2297210.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] The opposite of lag

2010-07-21 Thread Gabor Grothendieck
On Wed, Jul 21, 2010 at 10:14 AM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Hello!

 I have a data frame A (below) with a grouping factor (group). I take
 my DV and create the new, lagged DV by applying the function lag.it
 (below). It works fine.

 A - data.frame(year=rep(c(1980:1984),3), group=
 factor(sort(rep(1:3,5))), DV=c(rnorm(15)))
 lag.it - function(x) {
  DV - ts(x$DV, start = x$year[1])
  idx - seq(length = length(DV))
  DVs - cbind(DV, lag(DV, -1))[idx,]
  out-cbind(x, DVs[,2])  # wages[,2]
   names(out)[length(out)]-DV.lag
   return(out)
 }
 A
 A.lagged - do.call(rbind, by(A, A$group, lag.it))
 A.lagged


 Now, I am trying to create the oppostive of lag for DV (should I call
 it lead?)
 I tried exactly the same as above, but with a different number under
 lag function (below), but it's not working. I am clearly doing
 something wrong. Any advice?
 Thanks a lot!


 lead.it - function(x) {
  DV - ts(x$DV, start = x$year[1])
  idx - seq(length = length(DV))
  DVs - cbind(DV, lag(DV, 2))[idx,]
  out-cbind(x, DVs[,2])
   names(out)[length(out)]-DV.lead
   return(out)
 }
 A
 A.lead - do.call(rbind, by(A, A$group, lead.it))
 A.lead


Try this:

library(zoo)
z - read.zoo(A, index = 1, split = group, frequency = 1)
lag(z, c(-1, 0, 1))

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] best way to apply a list of functions to a dataset ?

2010-07-21 Thread Hadley Wickham
 ddply(ma, .(variable), summarise, mean = mean(value), sd = sd(value),
       skewness = skewness(value), median = median(value),
       mean.gt.med = mean.gt.med(value))

In principle, you should be able to do:

ddply(ma, .(variable), colwise(each(mean, sd, skewness, median, mean.gt.med)))

but currently colwise and each don't work together that well.

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with time in R

2010-07-21 Thread Aaditya Nanduri
Ms. Chisholm,

If you could tell us how you plan to use the variables, we will have a
better understanding of what you are looking for and will be able to help
you.
Are you looking for the time in seconds? In that case, do as Mr. Holfman
says. He just skipped the part about converting the factors to characters.
You can do that by:
y - as.character(x) where x is the vector of factors.

Are you looking to have a list of hours, minutes and seconds? That can be
done too...Although it would be much easier to just have hours and min.sec

On Tue, Jul 20, 2010 at 7:33 AM, Sarah Chisholm sarah.chisholm...@ucl.ac.uk
 wrote:

 Hi,

 I have a problem with the time formatting in R. I have entered time in the
 format MM:SS.xyz and R has automatically classified this as a factor, but
 I need it numerically. However when I use as.numeric() it gives me totally
 different numbers. Is there any way I can tell R to read thes input as a
 number?

 Thank you very much

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Aaditya Nanduri
aaditya.nand...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Interactions in GAMMs

2010-07-21 Thread Karen Moore
Hi,

I've an issue adding an interaction to a GAMM:

My model was of form:

gamm1 - gamm(TOTSR ~ fROT + s(PH) + s(LOI) + s(ASP) + s(SQRT_ELEV) + CANCOV
+ s(SQRT_TOTCWD) + s(WELLF) + s(WELLN) + s(OLDWDLD) + s(DISTWOOD) +
s(Annprec) + s(OLDWDLD:DISTWOOD) + (1|fSITE),  family = poisson, data =
BIOFOR2)

with interaction of s(OLDWDLD:DISTWOOD).

However I got this error message below but don't know what it means? (my
data is composed of info for 150 plots)

#I Warning messages:
#2: In OLDWDLD:DISTWOOD :
 # numerical expression has 150 elements: only the first used
#3: In OLDWDLD:DISTWOOD :
#  numerical expression has 150 elements: only the first used

Can anyone offer advice on how to include this interaction in GAMM model?
Thanks

Karen Moore

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday - CAN PAY FOR FIX

2010-07-21 Thread Harlan Harris
I unfortunately haven't received any responses about this problem. We (the
company I work for) are willing to discuss payment to someone who is willing
to quickly contribute a fix to the RGoogleDocs/RCurl toolchain that will
restore write access. Please contact me directly if you're interested. Thank
you,

 -Harlan Harris

On Tue, Jul 20, 2010 at 10:19 AM, Harlan Harris har...@harris.name wrote:

 Hi,

 I'm using RGoogleDocs/RCurl to update a Google Spreadsheet. Everything
 worked OK until this morning, when my ability to write into spreadsheet
 cells went away. I get the following weird error:

 Error in els[[type + 1]] : subscript out of bounds

 Looking at the Google Docs API changelog, I see the following:

 http://code.google.com/apis/spreadsheets/changelog.html
 Release 2010-01 (July 14, 2010)

 This is an advanced notice about an upcoming change.

- Starting July 19, 2010, all links returned by all Spreadsheets API
feeds will use HTTPS. This is being done in the interests of increased
security. If you require the use of HTTP, we recommend that you remove the
replace https with http in these links. Another announcement will be
made on July 19, 2010, when this change goes to production.


 I suspect this is the problem. Fixing it is above my head, I'm afraid.
 Could anyone help? This is urgent. Thank you,

  -Harlan Harris



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interactions in GAMMs

2010-07-21 Thread Vito Muggeo (UniPa)

Hi Karen,
I think you should decide what you mean for interaction. s(x:y) is 
meaningless


If you want to fit a surface you should use s(x,y).

If you want to fit a varying coefficient model (interaction between a 
linear and a smooth term) you should use the argument by in s().


The help files of the mgcv package by S. Wood  are quite clear and they 
also include a lot of examples.


Hope this helps you

vito



Karen Moore ha scritto:

Hi,

I've an issue adding an interaction to a GAMM:

My model was of form:

gamm1 - gamm(TOTSR ~ fROT + s(PH) + s(LOI) + s(ASP) + s(SQRT_ELEV) + CANCOV
+ s(SQRT_TOTCWD) + s(WELLF) + s(WELLN) + s(OLDWDLD) + s(DISTWOOD) +
s(Annprec) + s(OLDWDLD:DISTWOOD) + (1|fSITE),  family = poisson, data =
BIOFOR2)

with interaction of s(OLDWDLD:DISTWOOD).

However I got this error message below but don't know what it means? (my
data is composed of info for 150 plots)

#I Warning messages:
#2: In OLDWDLD:DISTWOOD :
 # numerical expression has 150 elements: only the first used
#3: In OLDWDLD:DISTWOOD :
#  numerical expression has 150 elements: only the first used

Can anyone offer advice on how to include this interaction in GAMM model?
Thanks

Karen Moore

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lm: order of dropped columns

2010-07-21 Thread Anirban Mukherjee
Hi all,

If presented with a singular design matrix, lm drops columns to make the
design matrix non-singular. What algorithm is used to select which (and how
many) column(s) to drop? Particularly, given a factor, how does lm choose
levels of the factor to discard?

Thanks for the help.

Best,
Anirban

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interactions in GAMMs

2010-07-21 Thread David Winsemius


On Jul 21, 2010, at 11:17 AM, Karen Moore wrote:


Hi,

I've an issue adding an interaction to a GAMM:

My model was of form:


# Package? Probably:
require(mgcv)



gamm1 - gamm(TOTSR ~ fROT + s(PH) + s(LOI) + s(ASP) + s(SQRT_ELEV)  
+ CANCOV

+ s(SQRT_TOTCWD) + s(WELLF) + s(WELLN) + s(OLDWDLD) + s(DISTWOOD) +
s(Annprec) + s(OLDWDLD:DISTWOOD) + (1|fSITE),  family = poisson,  
data =

BIOFOR2)

with interaction of s(OLDWDLD:DISTWOOD).

However I got this error message below but don't know what it means?


The documentation for s() says the unnamed arguments must be a list of  
covariates and I do not think OLDWDLD:DISTWOOD constitutes a list. (On  
the other hand it appears that the term list is being used a bit  
loosely here, since the examples do not explicitly enclose the  
covariates in the list() function and when I did so with the example  
on the te help page I get an error.)



(my
data is composed of info for 150 plots)

#I Warning messages:
#2: In OLDWDLD:DISTWOOD :
# numerical expression has 150 elements: only the first used
#3: In OLDWDLD:DISTWOOD :
#  numerical expression has 150 elements: only the first used

Can anyone offer advice on how to include this interaction in GAMM  
model?


Do you have Wood's book? Ch/sect 6.7 has worked examples using the  
te() function which I believe is the GAM(M) counterpart to linear  
interactions. He also makes the point that random arguments to gamm()  
need to be in the list form. (The documentation suggests this has not  
changes since the book was published.)


--
David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Default For list.len Argument (list output truncated)

2010-07-21 Thread Soeren . Vogel
Hello

How can I set a default for the 'list.len' argument in 'str'?

x - as.data.frame(matrix(1:1000, ncol=1000))
str(x)
formals(str) - alist(object=, ...=, list.len=1000)
args(str); formals(str)
str(x)

Does not display errors, but does not work either.

Thanks for help

Sören

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] The opposite of lag

2010-07-21 Thread Gabor Grothendieck
On Wed, Jul 21, 2010 at 10:50 AM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 On Wed, Jul 21, 2010 at 10:14 AM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Hello!

 I have a data frame A (below) with a grouping factor (group). I take
 my DV and create the new, lagged DV by applying the function lag.it
 (below). It works fine.

 A - data.frame(year=rep(c(1980:1984),3), group=
 factor(sort(rep(1:3,5))), DV=c(rnorm(15)))
 lag.it - function(x) {
  DV - ts(x$DV, start = x$year[1])
  idx - seq(length = length(DV))
  DVs - cbind(DV, lag(DV, -1))[idx,]
  out-cbind(x, DVs[,2])  # wages[,2]
   names(out)[length(out)]-DV.lag
   return(out)
 }
 A
 A.lagged - do.call(rbind, by(A, A$group, lag.it))
 A.lagged


 Now, I am trying to create the oppostive of lag for DV (should I call
 it lead?)
 I tried exactly the same as above, but with a different number under
 lag function (below), but it's not working. I am clearly doing
 something wrong. Any advice?
 Thanks a lot!


 lead.it - function(x) {
  DV - ts(x$DV, start = x$year[1])
  idx - seq(length = length(DV))
  DVs - cbind(DV, lag(DV, 2))[idx,]
  out-cbind(x, DVs[,2])
   names(out)[length(out)]-DV.lead
   return(out)
 }
 A
 A.lead - do.call(rbind, by(A, A$group, lead.it))
 A.lead


 Try this:

 library(zoo)
 z - read.zoo(A, index = 1, split = group, frequency = 1)
 lag(z, c(-1, 0, 1))


The ### line was missing:

library(zoo)
z - read.zoo(A, index = 1, split = group, frequency = 1)
z - as.zooreg(z) ###
lag(z, c(-1, 0, 1))

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] The opposite of lag

2010-07-21 Thread Dimitri Liakhovitski
Never mind- I figured it out:

A
library(zoo)
z - read.zoo(A, index = 1, split = group, frequency = 1)
z - as.zooreg(z) ###
result-lag(z, c(-1, 0, 1))
A
result

Thank you very much!

On Wed, Jul 21, 2010 at 12:08 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Thank you, Gabor, but sorry - what is the exact order of those rows again?
 Thank you!
 Dimitri

 On Wed, Jul 21, 2010 at 11:59 AM, Gabor Grothendieck
 ggrothendi...@gmail.com wrote:
 On Wed, Jul 21, 2010 at 10:50 AM, Gabor Grothendieck
 ggrothendi...@gmail.com wrote:
 On Wed, Jul 21, 2010 at 10:14 AM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Hello!

 I have a data frame A (below) with a grouping factor (group). I take
 my DV and create the new, lagged DV by applying the function lag.it
 (below). It works fine.

 A - data.frame(year=rep(c(1980:1984),3), group=
 factor(sort(rep(1:3,5))), DV=c(rnorm(15)))
 lag.it - function(x) {
  DV - ts(x$DV, start = x$year[1])
  idx - seq(length = length(DV))
  DVs - cbind(DV, lag(DV, -1))[idx,]
  out-cbind(x, DVs[,2])  # wages[,2]
   names(out)[length(out)]-DV.lag
   return(out)
 }
 A
 A.lagged - do.call(rbind, by(A, A$group, lag.it))
 A.lagged


 Now, I am trying to create the oppostive of lag for DV (should I call
 it lead?)
 I tried exactly the same as above, but with a different number under
 lag function (below), but it's not working. I am clearly doing
 something wrong. Any advice?
 Thanks a lot!


 lead.it - function(x) {
  DV - ts(x$DV, start = x$year[1])
  idx - seq(length = length(DV))
  DVs - cbind(DV, lag(DV, 2))[idx,]
  out-cbind(x, DVs[,2])
   names(out)[length(out)]-DV.lead
   return(out)
 }
 A
 A.lead - do.call(rbind, by(A, A$group, lead.it))
 A.lead


 Try this:

 library(zoo)
 z - read.zoo(A, index = 1, split = group, frequency = 1)
 lag(z, c(-1, 0, 1))


 The ### line was missing:

 library(zoo)
 z - read.zoo(A, index = 1, split = group, frequency = 1)
 z - as.zooreg(z) ###
 lag(z, c(-1, 0, 1))




 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com




-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] The opposite of lag

2010-07-21 Thread Dimitri Liakhovitski
Sorry, I don't think it's working.
the last 3 columns (on the right) of result contain the original data
of each group.
But there is no shift at all. I am trying to reach the following
result for each group: The first number disappears and at the bottom
an NA appears.
Is it possible?

On Wed, Jul 21, 2010 at 12:12 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Never mind- I figured it out:

 A
 library(zoo)
 z - read.zoo(A, index = 1, split = group, frequency = 1)
 z - as.zooreg(z) ###
 result-lag(z, c(-1, 0, 1))
 A
 result

 Thank you very much!

 On Wed, Jul 21, 2010 at 12:08 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Thank you, Gabor, but sorry - what is the exact order of those rows again?
 Thank you!
 Dimitri

 On Wed, Jul 21, 2010 at 11:59 AM, Gabor Grothendieck
 ggrothendi...@gmail.com wrote:
 On Wed, Jul 21, 2010 at 10:50 AM, Gabor Grothendieck
 ggrothendi...@gmail.com wrote:
 On Wed, Jul 21, 2010 at 10:14 AM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Hello!

 I have a data frame A (below) with a grouping factor (group). I take
 my DV and create the new, lagged DV by applying the function lag.it
 (below). It works fine.

 A - data.frame(year=rep(c(1980:1984),3), group=
 factor(sort(rep(1:3,5))), DV=c(rnorm(15)))
 lag.it - function(x) {
  DV - ts(x$DV, start = x$year[1])
  idx - seq(length = length(DV))
  DVs - cbind(DV, lag(DV, -1))[idx,]
  out-cbind(x, DVs[,2])  # wages[,2]
   names(out)[length(out)]-DV.lag
   return(out)
 }
 A
 A.lagged - do.call(rbind, by(A, A$group, lag.it))
 A.lagged


 Now, I am trying to create the oppostive of lag for DV (should I call
 it lead?)
 I tried exactly the same as above, but with a different number under
 lag function (below), but it's not working. I am clearly doing
 something wrong. Any advice?
 Thanks a lot!


 lead.it - function(x) {
  DV - ts(x$DV, start = x$year[1])
  idx - seq(length = length(DV))
  DVs - cbind(DV, lag(DV, 2))[idx,]
  out-cbind(x, DVs[,2])
   names(out)[length(out)]-DV.lead
   return(out)
 }
 A
 A.lead - do.call(rbind, by(A, A$group, lead.it))
 A.lead


 Try this:

 library(zoo)
 z - read.zoo(A, index = 1, split = group, frequency = 1)
 lag(z, c(-1, 0, 1))


 The ### line was missing:

 library(zoo)
 z - read.zoo(A, index = 1, split = group, frequency = 1)
 z - as.zooreg(z) ###
 lag(z, c(-1, 0, 1))




 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com




 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com




-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday

2010-07-21 Thread Ben Bolker
On Wed, Jul 21, 2010 at 10:38 AM, Henrique Dallazuanna www...@gmail.com wrote:


 Maybe:
 packageDescription('RGoogleDocs', fields = 'Author')



or rather

packageDescription('RgoogleDocs', fields='Maintainer')

 (usually but not always the same person)

  thanks
Ben

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] The opposite of lag

2010-07-21 Thread Dimitri Liakhovitski
Basically, if you look at the A lagged below, then you see how in each
group the values of A move downwards: and NA is added on top and
the last value in each group disappears.
I am trying to get the opposite: the first value in each group
disappears and NA is added at the bottom.
D.

set.seed(1234)
A - data.frame(year=rep(c(1980:1984),3), group=
factor(sort(rep(1:3,5))), DV=c(rnorm(15)))
lag.it - function(x) {
  DV - ts(x$DV, start = x$year[1])
idx - seq(length = length(DV))
DVs - cbind(DV, lag(DV, -1))[idx,]
out-cbind(x, DVs[,2])  # wages[,2]
   names(out)[length(out)]-DV.lag
   return(out)
}
A
A.lagged - do.call(rbind, by(A, A$group, lag.it))
A.lagged





On Wed, Jul 21, 2010 at 12:18 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Sorry, I don't think it's working.
 the last 3 columns (on the right) of result contain the original data
 of each group.
 But there is no shift at all. I am trying to reach the following
 result for each group: The first number disappears and at the bottom
 an NA appears.
 Is it possible?

 On Wed, Jul 21, 2010 at 12:12 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Never mind- I figured it out:

 A
 library(zoo)
 z - read.zoo(A, index = 1, split = group, frequency = 1)
 z - as.zooreg(z) ###
 result-lag(z, c(-1, 0, 1))
 A
 result

 Thank you very much!

 On Wed, Jul 21, 2010 at 12:08 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Thank you, Gabor, but sorry - what is the exact order of those rows again?
 Thank you!
 Dimitri

 On Wed, Jul 21, 2010 at 11:59 AM, Gabor Grothendieck
 ggrothendi...@gmail.com wrote:
 On Wed, Jul 21, 2010 at 10:50 AM, Gabor Grothendieck
 ggrothendi...@gmail.com wrote:
 On Wed, Jul 21, 2010 at 10:14 AM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Hello!

 I have a data frame A (below) with a grouping factor (group). I take
 my DV and create the new, lagged DV by applying the function lag.it
 (below). It works fine.

 A - data.frame(year=rep(c(1980:1984),3), group=
 factor(sort(rep(1:3,5))), DV=c(rnorm(15)))
 lag.it - function(x) {
  DV - ts(x$DV, start = x$year[1])
  idx - seq(length = length(DV))
  DVs - cbind(DV, lag(DV, -1))[idx,]
  out-cbind(x, DVs[,2])  # wages[,2]
   names(out)[length(out)]-DV.lag
   return(out)
 }
 A
 A.lagged - do.call(rbind, by(A, A$group, lag.it))
 A.lagged


 Now, I am trying to create the oppostive of lag for DV (should I call
 it lead?)
 I tried exactly the same as above, but with a different number under
 lag function (below), but it's not working. I am clearly doing
 something wrong. Any advice?
 Thanks a lot!


 lead.it - function(x) {
  DV - ts(x$DV, start = x$year[1])
  idx - seq(length = length(DV))
  DVs - cbind(DV, lag(DV, 2))[idx,]
  out-cbind(x, DVs[,2])
   names(out)[length(out)]-DV.lead
   return(out)
 }
 A
 A.lead - do.call(rbind, by(A, A$group, lead.it))
 A.lead


 Try this:

 library(zoo)
 z - read.zoo(A, index = 1, split = group, frequency = 1)
 lag(z, c(-1, 0, 1))


 The ### line was missing:

 library(zoo)
 z - read.zoo(A, index = 1, split = group, frequency = 1)
 z - as.zooreg(z) ###
 lag(z, c(-1, 0, 1))




 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com




 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com




 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com




-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constrain density to 0 at 0?

2010-07-21 Thread Ravi Varadhan
This is the boundary problem in density estimation.  One simple trick is to 
use the idea of reflection.  If your data is (y1, y2, ...,yn), you create a 
reflected data by appending `n' negative values to your data, call this y*.  
Estimate the kernel density for this as fhat(y*).  Redefine this estimate so 
that it is zero for all y  0, and is equal to 2 * fhat(y*).

Here is an example:

y - rexp(100, rate=10)

yref - c(-y, y)

fref - density(yref)

sel - fref$x = 0

fref$y[sel] - 2 * fref$y[sel]

par(mfrow=c(2,1))

plot(density(y))

plot(fref$x[sel], fref$y[sel], type=l)

This approach assumes that the density has 0 gradient at the boundary (gradient 
defined from the right hand side).

There are other (better) approaches as well.  Look at a book on density 
estimation.

Ravi.  


Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


- Original Message -
From: Farley, Robert farl...@metro.net
Date: Tuesday, July 20, 2010 2:47 pm
Subject: Re: [R] Constrain density to 0 at 0?
To: r-help@r-project.org r-help@r-project.org


 Read the help page for density more carefully. Especially the bw and  
 
  from arguments.
  
  
  Yes,  it was my inability to make sense of the help page that 
 motivated my email.
  
  My distances range from 0.4 to 7.6 but the density plot ranges from 
 -2 to 10.  
  
  from=0 seems to hide the negative portion of the density without 
 setting it to 0.
  
  
  I've tried adjust, but it requires a value of 0.2, which results in a 
 very, very spiky plot.
  
  
  I was hoping for a process that could forbid impossible (negative) 
 values, yet allow the density plot to blend the individual 
 measurements. Is there a variable bw that could be set small at the 
 extrema, and larger in the range of the data?
  
  
  
  
   
  Robert Farley
  Metro
  www.Metro.net 
   
  -Original Message-
  From: David Winsemius [ 
  Sent: Monday, July 19, 2010 19:31
  To: Farley, Robert
  Cc: r-help@r-project.org
  Subject: Re: [R] Constrain density to 0 at 0?
  
  
  On Jul 19, 2010, at 9:57 PM, Farley, Robert wrote:
  
   I'm plotting some trip length frequencies using the following code:
  
   plot( density(zTestData$Distance, weights=zTestData$Actual),
  xlim=c(0,10),
  main=Test TLFD,
  xlab=Distance,
 col=6  )
   lines(density(zTestData$Distance, weights=zTestData$FlatWeight),  
   col=2)
   lines(density(zTestData$Distance, weights=zTestData$BrdWeight ),  
   col=3)
  
   which works fine except the distances are all positive, but the  
   densities don't drop to 0 until around -2 or -3.
  
   Is there a way for me to force the density plot to 0 at 0?
  
  Yes. (Assuming it can be zero, given the data.)
  
  Read the help page for density more carefully. Especially the bw and  
 
  from arguments.
  
  -- 
  David.
  
  __
  R-help@r-project.org mailing list
  
  PLEASE do read the posting guide 
  and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Objects within environment

2010-07-21 Thread Duncan Murdoch

On 21/07/2010 12:27 PM, Megh Dal wrote:

Hi Duncan, thanks for your clarification. However I do not think I could really 
understand the envir argument in objects() function.

It is saying an alternative argument for name Is the alternative means the 
alternative of, let say, package:graphics (which is the name of an environment?). Can you give me 
an example of an alternative argument of that particular environment?
  


as.environment(package:graphics) is an environment.  
package:graphics is its name.

This is specifying the environment evaluation environment. What does the phase 
environment evaluation environment mean? Can you give me an example?
  


That looks like a typo to me.  Environment evaluation environment is 
meaningless.

Mostly there for back compatibility: again totally in dark, what does it mean for 
back compatibility? An example would definitely be great.
  


Presumably some older release of R used envir, and we still have it so 
that old code will still work.  But that's a signal that new code should 
never need to use it.

Toy said you need to give an environment, not the name of one. If I call 
someone I need to call with his name, right? Then if I need to give an environment then, 
without its name how can I do so?
  


This is a computer language, not a conversation.  Words have technical 
meanings that aren't always perfect matches to English meanings of the 
same words.  Here the name of an environment is what you get when you 
ask for the name attribute of it.  There are lots of different ways to 
refer to objects other than by the name in that technical sense.  For 
example, you could say


x - as.environment(package:graphics) # uses the environment's name
ls(envir=x) # refers to it by a variable holding it.

Duncan Murdoch

Can you please explain me in simple english? I think R help file should use 
more non-technical simple english language so that student like me can 
understand R in more easier way!

--- On Wed, 7/21/10, Duncan Murdoch murdoch.dun...@gmail.com wrote:

 From: Duncan Murdoch murdoch.dun...@gmail.com
 Subject: Re: [R] Objects within environment
 To: Megh Dal megh700...@yahoo.com
 Cc: r-h...@stat.math.ethz.ch
 Date: Wednesday, July 21, 2010, 4:48 PM
 On 21/07/2010 5:57 AM, Megh Dal
 wrote:
  Hi all, I have following environments loaded with my
 current R session:
  
  search()
  [1] .GlobalEnv   
 package:stats 
package:graphics 
 package:grDevices
  [5] package:utils 
package:datasets 
 package:methods   Autoloads   
 [9] package:base
  How can I find the objects under a specific

 environment? Here I tries following:
  
  objects(envir=package:base)

  Error in objects(envir = package:base) : invalid
 'envir' argument
  
  It would be great if somebody would point me the

 correct arguments for object() function to find the onjects
 associated with it. In help file it is written that:
   envir: an alternative argument to ‘name’ for
 specifying the

environment evaluation environment. Mostly

 there for back

compatibility

  What is the wrong in my code?
 
 The easiest way to pick an item from the search list is by

 number:
 
 objects(3)

  or
 ls(3)
 
 will give you the objects in the graphics package, with the

 search list as above.  You can also specify the name as
 the name argument, e.g.
 
 objects(package:base)
 
 If you want to use the envir argument (why?), you need to

 give an environment, not the name of one.
 
 Duncan Murdoch
 
 



  



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] The opposite of lag

2010-07-21 Thread Gabor Grothendieck
On Wed, Jul 21, 2010 at 12:18 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Sorry, I don't think it's working.
 the last 3 columns (on the right) of result contain the original data
 of each group.
 But there is no shift at all. I am trying to reach the following
 result for each group: The first number disappears and at the bottom
 an NA appears.
 Is it possible?


It works for me.  Here is the result.  As we see

- the first 3 columns are lagged ahead so that they start at 1981
  which is one year later than the original columns start,
- the second set of 3 columns are the originals so they start at 1980
- the last set of 3 columns are lagged so they start at 1979 which is
  one year before the original columns start.

 set.seed(123)
 A - data.frame(year=rep(c(1980:1984),3), group=
+ factor(sort(rep(1:3,5))), DV=c(rnorm(15)))
 library(zoo)
 z - read.zoo(A, index = 1, split = group, frequency = 1)
 z - as.zooreg(z) ###
 lag(z, c(-1, 0, 1))
X1.lag.1   X2.lag.1   X3.lag.1 X1.lag0X2.lag0X3.lag0
1979  NA NA NA  NA NA NA
1980  NA NA NA -0.56047565  1.7150650  1.2240818
1981 -0.56047565  1.7150650  1.2240818 -0.23017749  0.4609162  0.3598138
1982 -0.23017749  0.4609162  0.3598138  1.55870831 -1.2650612  0.4007715
1983  1.55870831 -1.2650612  0.4007715  0.07050839 -0.6868529  0.1106827
1984  0.07050839 -0.6868529  0.1106827  0.12928774 -0.4456620 -0.5558411
1985  0.12928774 -0.4456620 -0.5558411  NA NA NA
 X1.lag1X2.lag1X3.lag1
1979 -0.56047565  1.7150650  1.2240818
1980 -0.23017749  0.4609162  0.3598138
1981  1.55870831 -1.2650612  0.4007715
1982  0.07050839 -0.6868529  0.1106827
1983  0.12928774 -0.4456620 -0.5558411
1984  NA NA NA
1985  NA NA NA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday - CAN PAY FOR FIX

2010-07-21 Thread Duncan Temple Lang
Hi Harlan

  Can you send some code so that we can reproduce the problem.
That will enable me to fix the problem quicker.

  D.

On 7/21/10 8:26 AM, Harlan Harris wrote:
 I unfortunately haven't received any responses about this problem. We
 (the company I work for) are willing to discuss payment to someone who
 is willing to quickly contribute a fix to the RGoogleDocs/RCurl
 toolchain that will restore write access. Please contact me directly if
 you're interested. Thank you,
 
  -Harlan Harris
 
 On Tue, Jul 20, 2010 at 10:19 AM, Harlan Harris har...@harris.name
 mailto:har...@harris.name wrote:
 
 Hi,
 
 I'm using RGoogleDocs/RCurl to update a Google Spreadsheet.
 Everything worked OK until this morning, when my ability to write
 into spreadsheet cells went away. I get the following weird error:
 
 Error in els[[type + 1]] : subscript out of bounds
 
 Looking at the Google Docs API changelog, I see the following:
 
 http://code.google.com/apis/spreadsheets/changelog.html
 
 
 Release 2010-01 (July 14, 2010)
 
 This is an advanced notice about an upcoming change.
 
 * Starting July 19, 2010, all links returned by all Spreadsheets
   API feeds will use HTTPS. This is being done in the interests
   of increased security. If you require the use of HTTP, we
   recommend that you remove the replace |https| with |http| in
   these links. Another announcement will be made on July 19,
   2010, when this change goes to production.
 
 
 I suspect this is the problem. Fixing it is above my head, I'm
 afraid. Could anyone help? This is urgent. Thank you,
 
  -Harlan Harris
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with invoking R from the command line (Windows Vista)

2010-07-21 Thread Camster

Dear all

I am running in a problem when trying to run R from Windows command line.
I am runnning Windows Vista, R-2.10.1. 

I have a script I would like to run remotely from another program. As it did
not work, I thought I would test the script from the Windows command line
which highlighted a problem. 
When I open the command line and type R at the prompt (or any other variant
(R CMD BATCH, Rgui.exe or Rterm.exe), I get the error message: 
'Rscript' is not recognized as an internal or external command,
operable program or batch file.

if I move to the R directory (cd usr\R\R-2.10.1\bin), and try running the
script ( R  myscript.R --vanilla) 
 it works fine. 
I thought the problem may come from the environment variables (tried
specifying explicit location of R.exe), but this did not solve the problem.
I also re-install R but that did not help either.

I cannot think of anything else. 
I have read the R installation and rwfaq notes, and look through the posts
but I cannot find any suggestion on how to fix the bug.

If anyone can suggest anything, that will be much appreciated.

Many thanks

Camille

PS: I am also running R on another machine (Windows XP) and I don't have a
similar problem!



-- 
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-invoking-R-from-the-command-line-Windows-Vista-tp2297429p2297429.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [SPAM](Aktien) Re: Chi-square distribution probability density function:

2010-07-21 Thread David Winsemius


On Jul 21, 2010, at 12:36 PM, Knut Krueger wrote:


David Winsemius schrieb:



On Jul 21, 2010, at 6:52 AM, Knut Krueger wrote:


Hi to all I found
an formular of an **
***p-Value Calculator for the Chi-Square test*
*http://www.danielsoper.com/statcalc/calc11.aspx*
*with the formula*
http://www.danielsoper.com/statkb/topic11.aspx
*what's the gamma function of this formula in r?*
*df=5*
*ch2=25.50878*
*the following code does not give the result  0.001 for the  
values above *
*p= ((0.5^(df/2))/gamma(df))*(ch2^((df/2)-1))* (2.718281828459^(- 
ch2/2))

or is there any other error?


Check your implementation of that formula. You made an error in the  
gamma argument.


Sorry a copy and paste error but gamma(df/2) does not solve the  
problem

I am not able to get the implementation of the function working :-(


See also:

?Chisquare
?gamma


there is also a different result using this function from the helpf  
files and the function

/f_n(x) = 1 / (2^(n/2) Gamma(n/2)) x^(n/2-1) e^(-x/2)


That is (perhaps) because you are confusing a density function with a  
cumulative probability function. The expression above should give the  
same result as a (proper) R transcription of the formula on the page  
you offered above and should be what dchisq referenced on the ? 
Chisquare page returns as well.


Sure enough...
 dchisq(25.50878, 5)
[1] 4.951e-05
 p= ((0.5^(df/2))/gamma(df/2))*(ch2^((df/2)-1))* (2.718281828459^(- 
ch2/2))

 p
[1] 4.951e-05

 Those formulae only differ in how they use grouping of the constant  
terms. (In R the function is gamma, not Gamma.) You should be able to  
get the P(X^2|x25.508, df=5) from an integration of that function.




Knut

/


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] String processing - is there a better way

2010-07-21 Thread Davis, Brian
I have a two part question

Part 1) 
I am trying to remove characters in a string based on the position of a key 
character in another string.  I have a solution that works but it requires a 
for-loop.  A vectorized way of doing this has alluded me.  

CleanRead-function(x,y) {

  if (!is.character(x)) 
    x - as.character(x)
  if (!is.character(y)) 
    y - as.character(y)

  idx-grep(\\*, x, value=FALSE)
  starpos-gregexpr(\\*, x[idx])
  
  ysplit-strsplit(y[idx], '')
  n-length(idx)
  for(i in 1:n) {
    ysplit[[i]][starpos[[i]]] = 
  }

  y[idx]-unlist(lapply(ysplit, paste, sep='', collapse=''))
  return(y)
}

x-c(AA*.*A **a.a*,,,A, C*c.., **aA) 
y-c(abcdefghi, abcdefghij, abcde, abcd)

CleanRead(x,y)
[1] abdfghi cdeghij acde    cd


Is there a better way to do this?

Part 2) 
My next step in the string processing is to take the characters in the output 
of CleanRead and subtract 33 from the ascii value of the character to obtain an 
integer. Again I have a solution that works, involving splitting the string 
into characters then converting them to factors (starting at ascii 34) and 
using unclass to get the integer value. (kindof a atoi(x)-33 all in one step)

I looked for the C equivalent of atoi, but the only help I could find (R-help 
2003) suggested using as.numeric.  However, the help file (and testing) shows 
you get 'NA'.   

Am I missing an easier way to do this?



Thanks in advance,

Brian

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with invoking R from the command line (Windows Vista)

2010-07-21 Thread Duncan Murdoch

On 21/07/2010 12:55 PM, Camster wrote:

Dear all

I am running in a problem when trying to run R from Windows command line.
I am runnning Windows Vista, R-2.10.1. 


I have a script I would like to run remotely from another program. As it did
not work, I thought I would test the script from the Windows command line
which highlighted a problem. 
When I open the command line and type R at the prompt (or any other variant
(R CMD BATCH, Rgui.exe or Rterm.exe), I get the error message: 
'Rscript' is not recognized as an internal or external command,

operable program or batch file.
  


This is a sign that your PATH environment variable doesn't include the 
directory RHOME\bin. 




if I move to the R directory (cd usr\R\R-2.10.1\bin), and try running the
script ( R  myscript.R --vanilla) 
 it works fine. 
I thought the problem may come from the environment variables (tried

specifying explicit location of R.exe), but this did not solve the problem.
I also re-install R but that did not help either.
  


You don't say exactly what you tried, but if you tried changing your 
PATH, you should try again, and verify that the change is in place in 
the session where you're trying to run R.


Duncan Murdoch

I cannot think of anything else. 
I have read the R installation and rwfaq notes, and look through the posts

but I cannot find any suggestion on how to fix the bug.

If anyone can suggest anything, that will be much appreciated.

Many thanks

Camille

PS: I am also running R on another machine (Windows XP) and I don't have a
similar problem!






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] The opposite of lag

2010-07-21 Thread Dimitri Liakhovitski
I reinstalled zoo and now I can see the years on left (raw names).
Before - I could not see them.
Thank you!

On Wed, Jul 21, 2010 at 12:51 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 On Wed, Jul 21, 2010 at 12:18 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Sorry, I don't think it's working.
 the last 3 columns (on the right) of result contain the original data
 of each group.
 But there is no shift at all. I am trying to reach the following
 result for each group: The first number disappears and at the bottom
 an NA appears.
 Is it possible?


 It works for me.  Here is the result.  As we see

 - the first 3 columns are lagged ahead so that they start at 1981
  which is one year later than the original columns start,
 - the second set of 3 columns are the originals so they start at 1980
 - the last set of 3 columns are lagged so they start at 1979 which is
  one year before the original columns start.

 set.seed(123)
 A - data.frame(year=rep(c(1980:1984),3), group=
 + factor(sort(rep(1:3,5))), DV=c(rnorm(15)))
 library(zoo)
 z - read.zoo(A, index = 1, split = group, frequency = 1)
 z - as.zooreg(z) ###
 lag(z, c(-1, 0, 1))
        X1.lag.1   X2.lag.1   X3.lag.1     X1.lag0    X2.lag0    X3.lag0
 1979          NA         NA         NA          NA         NA         NA
 1980          NA         NA         NA -0.56047565  1.7150650  1.2240818
 1981 -0.56047565  1.7150650  1.2240818 -0.23017749  0.4609162  0.3598138
 1982 -0.23017749  0.4609162  0.3598138  1.55870831 -1.2650612  0.4007715
 1983  1.55870831 -1.2650612  0.4007715  0.07050839 -0.6868529  0.1106827
 1984  0.07050839 -0.6868529  0.1106827  0.12928774 -0.4456620 -0.5558411
 1985  0.12928774 -0.4456620 -0.5558411          NA         NA         NA
         X1.lag1    X2.lag1    X3.lag1
 1979 -0.56047565  1.7150650  1.2240818
 1980 -0.23017749  0.4609162  0.3598138
 1981  1.55870831 -1.2650612  0.4007715
 1982  0.07050839 -0.6868529  0.1106827
 1983  0.12928774 -0.4456620 -0.5558411
 1984          NA         NA         NA
 1985          NA         NA         NA




-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with invoking R from the command line (Windows Vista)

2010-07-21 Thread Gabor Grothendieck
On Wed, Jul 21, 2010 at 12:55 PM, Camster c.szmar...@googlemail.com wrote:

 Dear all

 I am running in a problem when trying to run R from Windows command line.
 I am runnning Windows Vista, R-2.10.1.

 I have a script I would like to run remotely from another program. As it did
 not work, I thought I would test the script from the Windows command line
 which highlighted a problem.
 When I open the command line and type R at the prompt (or any other variant
 (R CMD BATCH, Rgui.exe or Rterm.exe), I get the error message:
 'Rscript' is not recognized as an internal or external command,
 operable program or batch file.

 if I move to the R directory (cd usr\R\R-2.10.1\bin), and try running the
 script ( R  myscript.R --vanilla)
  it works fine.
 I thought the problem may come from the environment variables (tried
 specifying explicit location of R.exe), but this did not solve the problem.
 I also re-install R but that did not help either.

 I cannot think of anything else.
 I have read the R installation and rwfaq notes, and look through the posts
 but I cannot find any suggestion on how to fix the bug.

 If anyone can suggest anything, that will be much appreciated.

 Many thanks

 Camille

 PS: I am also running R on another machine (Windows XP) and I don't have a
 similar problem!


If you don't want to change your path you can use Rgui.bat, Rterm.bat,
etc. in the batchfiles distribution at
   http://batchfiles.googlecode.com
Just put any of them that you wish to use on your path (they are self
contained Windows batch files, i.e. with no dependencies) and then
when you invoke any of them they will look up R in the registry and
run the corresponding command.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] The opposite of lag

2010-07-21 Thread Gabor Grothendieck
On Wed, Jul 21, 2010 at 1:16 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 I reinstalled zoo and now I can see the years on left (raw names).
 Before - I could not see them.
 Thank you!

Note that the result is a zoo object.  A zoo object consists of data
(which is everything except
the years, here) and the time index which is shown along the left hand side.

If lg is the zoo object then coredata(lg) and index(lg) give the two
components while
as.data.frame(lg) does give a data frame with the years as rownames.

lg - lag(z, c(-1, 0, 1))
coredata(lg)
index(lg)
as.data.frame(lg)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Chi-square distribution probability density function:

2010-07-21 Thread David Winsemius


On Jul 21, 2010, at 1:18 PM, Knut Krueger wrote:


David Winsemius schrieb:



And exactly why did you think I offered

?Chisquare
I was completely on the wrong way, and tried to find a solution with  
the formula instead to substitute the formula.


So I tried to implement pchisq into the formula - and of course I  
got wrong values ...


 p - function(x) ((0.5^(5/2))/gamma(5/2))*(x^((5/2)-1))*  
(2.718281828459^(-x/2))

 integrate(p, 25.508, Inf)
0.000 with absolute error  2.7e-05

Check result

 pchisq(25.508,5)
[1] 0.
 1-pchisq(25.508,5)
[1] 0.000



Thank's Knut


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] The opposite of lag

2010-07-21 Thread Dimitri Liakhovitski
Thanks a lot, it's very helpful!

On Wed, Jul 21, 2010 at 1:21 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 On Wed, Jul 21, 2010 at 1:16 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 I reinstalled zoo and now I can see the years on left (raw names).
 Before - I could not see them.
 Thank you!

 Note that the result is a zoo object.  A zoo object consists of data
 (which is everything except
 the years, here) and the time index which is shown along the left hand side.

 If lg is the zoo object then coredata(lg) and index(lg) give the two
 components while
 as.data.frame(lg) does give a data frame with the years as rownames.

 lg - lag(z, c(-1, 0, 1))
 coredata(lg)
 index(lg)
 as.data.frame(lg)




-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] String processing - is there a better way

2010-07-21 Thread Gabor Grothendieck
On Wed, Jul 21, 2010 at 1:02 PM, Davis, Brian brian.da...@uth.tmc.edu wrote:
 I have a two part question

 Part 1)
 I am trying to remove characters in a string based on the position of a key 
 character in another string.  I have a solution that works but it requires a 
 for-loop.  A vectorized way of doing this has alluded me.

 CleanRead-function(x,y) {

   if (!is.character(x))
     x - as.character(x)
   if (!is.character(y))
     y - as.character(y)

   idx-grep(\\*, x, value=FALSE)
   starpos-gregexpr(\\*, x[idx])

   ysplit-strsplit(y[idx], '')
   n-length(idx)
   for(i in 1:n) {
     ysplit[[i]][starpos[[i]]] = 
   }

   y[idx]-unlist(lapply(ysplit, paste, sep='', collapse=''))
   return(y)
 }

 x-c(AA*.*A **a.a*,,,A, C*c.., **aA)
 y-c(abcdefghi, abcdefghij, abcde, abcd)

 CleanRead(x,y)
 [1] abdfghi cdeghij acde    cd


 Is there a better way to do this?

 Part 2)
 My next step in the string processing is to take the characters in the output 
 of CleanRead and subtract 33 from the ascii value of the character to obtain 
 an integer. Again I have a solution that works, involving splitting the 
 string into characters then converting them to factors (starting at ascii 34) 
 and using unclass to get the integer value. (kindof a atoi(x)-33 all in one 
 step)

 I looked for the C equivalent of atoi, but the only help I could find (R-help 
 2003) suggested using as.numeric.  However, the help file (and testing) shows 
 you get 'NA'.


This splits x and y into vectors of single characters, extracts those
from y for which x is not * and then matches the result to letters to
return a number.

f - function(x, y) match(y[x != *], letters)
mapply(f, strsplit(x, ), strsplit(y, ))

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday - CAN PAY FOR FIX

2010-07-21 Thread Duncan Temple Lang

Hi Harlan

 If you install the latest version of RCurl from source via

  install.packages(RCurl, repos = http://www.omegahat.org/R;)

 and that should solve the problem, assuming I have been reproducing the same
 problem you mentioned.


 You haven't mentioned what operating system your are on. If you are on Windows,
 that will pick up the binary version. If you are on the mac, you will have to 
build
 it from source.


D.

On 7/21/10 8:26 AM, Harlan Harris wrote:
 I unfortunately haven't received any responses about this problem. We
 (the company I work for) are willing to discuss payment to someone who
 is willing to quickly contribute a fix to the RGoogleDocs/RCurl
 toolchain that will restore write access. Please contact me directly if
 you're interested. Thank you,
 
  -Harlan Harris
 
 On Tue, Jul 20, 2010 at 10:19 AM, Harlan Harris har...@harris.name
 mailto:har...@harris.name wrote:
 
 Hi,
 
 I'm using RGoogleDocs/RCurl to update a Google Spreadsheet.
 Everything worked OK until this morning, when my ability to write
 into spreadsheet cells went away. I get the following weird error:
 
 Error in els[[type + 1]] : subscript out of bounds
 
 Looking at the Google Docs API changelog, I see the following:
 
 http://code.google.com/apis/spreadsheets/changelog.html
 
 
 Release 2010-01 (July 14, 2010)
 
 This is an advanced notice about an upcoming change.
 
 * Starting July 19, 2010, all links returned by all Spreadsheets
   API feeds will use HTTPS. This is being done in the interests
   of increased security. If you require the use of HTTP, we
   recommend that you remove the replace |https| with |http| in
   these links. Another announcement will be made on July 19,
   2010, when this change goes to production.
 
 
 I suspect this is the problem. Fixing it is above my head, I'm
 afraid. Could anyone help? This is urgent. Thank you,
 
  -Harlan Harris
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Objects within environment

2010-07-21 Thread Bert Gunter
 Can you please explain me in simple english? I think R help file should
use more non-technical simple english language so that student like me can
understand R in more easier way!

-- Megh (and others):

Without judging this particular issue, writing good technical documentation
-- whether it be plumbing, refrigerator repair, or computer code -- is
difficult. I think relatively few people do it well. IMHO, for example,
Brian Ripley excels at this, because I find him complete, accurate, and
terse; I, however, do not. But others may disagree. It's hard to hit a
sweet spot that is at the same time, complete, correct, as concise as
possible, but still comprehensible to a broad range of users.

So while I do not wish to censure -- after all, if you had trouble
understanding, your complaint is justified -- I do want to urge a bit more
understanding and effort on your part. Before you complain, try writing some
documentation yourself. It's damn hard! ;)

Cheers,
Bert


Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interactions in GAMMs

2010-07-21 Thread Gavin Simpson
On Wed, 2010-07-21 at 16:17 +0100, Karen Moore wrote:
 Hi,
 
 I've an issue adding an interaction to a GAMM:
 
 My model was of form:
 
 gamm1 - gamm(TOTSR ~ fROT + s(PH) + s(LOI) + s(ASP) + s(SQRT_ELEV) + CANCOV
 + s(SQRT_TOTCWD) + s(WELLF) + s(WELLN) + s(OLDWDLD) + s(DISTWOOD) +
 s(Annprec) + s(OLDWDLD:DISTWOOD) + (1|fSITE),  family = poisson, data =
 BIOFOR2)
 
 with interaction of s(OLDWDLD:DISTWOOD).
 
 However I got this error message below but don't know what it means? (my
 data is composed of info for 150 plots)

An interaction is formed in a GAMM using a smoother in two or more
dimensions, such as s(OLDWDLD, DISTWOOD) or te(OLDWDLD, DISTWOOD), the
latter allows for different basis function within the two smoothers and
for different inherent scales for the two variables.

IIRC the latter allows you to produce strictly nested models for
likelihood ratio testing:

te(OLDWDLD) + te(DISTWOOD)

is nested within

te(OLDWDLD, DISTWOOD)

HTH

G

 
 #I Warning messages:
 #2: In OLDWDLD:DISTWOOD :
  # numerical expression has 150 elements: only the first used
 #3: In OLDWDLD:DISTWOOD :
 #  numerical expression has 150 elements: only the first used
 
 Can anyone offer advice on how to include this interaction in GAMM model?
 Thanks
 
 Karen Moore
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with invoking R from the command line (Windows Vista) - - SORTED!

2010-07-21 Thread Camster

Thanks for suggestion.

It was indeed a path problem. 
I move R into a directory without spaces and added the directory RHOME\bin
in the Path environment variable.

All sorted.

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-invoking-R-from-the-command-line-Windows-Vista-tp2297429p2297552.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday - CAN PAY FOR FIX

2010-07-21 Thread Harlan Harris
*Thank you*! Yes, version 1.4-3 of RCurl solves this problem! If you put a tip
jar https://www.paypal.com/cgi-bin/webscr?cmd=_donate-intro-outside on
OmegaHat, Duncan, I will seriously kick a few bucks your way. Getting my
code back running will save me a lot of time and anxiety! Thanks for all
your efforts,

 -Harlan

On Wed, Jul 21, 2010 at 1:45 PM, Duncan Temple Lang dun...@wald.ucdavis.edu
 wrote:


 Hi Harlan

  If you install the latest version of RCurl from source via

  install.packages(RCurl, repos = http://www.omegahat.org/R;)

  and that should solve the problem, assuming I have been reproducing the
 same
  problem you mentioned.


  You haven't mentioned what operating system your are on. If you are on
 Windows,
  that will pick up the binary version. If you are on the mac, you will have
 to build
  it from source.


D.

 On 7/21/10 8:26 AM, Harlan Harris wrote:
  I unfortunately haven't received any responses about this problem. We
  (the company I work for) are willing to discuss payment to someone who
  is willing to quickly contribute a fix to the RGoogleDocs/RCurl
  toolchain that will restore write access. Please contact me directly if
  you're interested. Thank you,
 
   -Harlan Harris
 
  On Tue, Jul 20, 2010 at 10:19 AM, Harlan Harris har...@harris.name
  mailto:har...@harris.name wrote:
 
  Hi,
 
  I'm using RGoogleDocs/RCurl to update a Google Spreadsheet.
  Everything worked OK until this morning, when my ability to write
  into spreadsheet cells went away. I get the following weird error:
 
  Error in els[[type + 1]] : subscript out of bounds
 
  Looking at the Google Docs API changelog, I see the following:
 
  http://code.google.com/apis/spreadsheets/changelog.html
 
 
  Release 2010-01 (July 14, 2010)
 
  This is an advanced notice about an upcoming change.
 
  * Starting July 19, 2010, all links returned by all Spreadsheets
API feeds will use HTTPS. This is being done in the interests
of increased security. If you require the use of HTTP, we
recommend that you remove the replace |https| with |http| in
these links. Another announcement will be made on July 19,
2010, when this change goes to production.
 
 
  I suspect this is the problem. Fixing it is above my head, I'm
  afraid. Could anyone help? This is urgent. Thank you,
 
   -Harlan Harris
 
 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] prediction from a logistic mixed effects model

2010-07-21 Thread kdionisio

Hi,
Is there any similar command to predict which can be used with a logistic
random effects model?

I have run a random effects model using lme(), and then use
predict.lme() with no problems.

However, I would also like to run a logistic random effects model, and then
also run a predict command on the logistic random effects model.

If I use lme(), I can not include a family option to indicate I am
running a logistic model.  I have instead successfully used lmer(), with
the family=binomial(link=logit) option. However, I then can not find an
equivalent command to predict to use with lmer. Any suggestions/options?
Thanks very much for the help.
-K
-- 
View this message in context: 
http://r.789695.n4.nabble.com/prediction-from-a-logistic-mixed-effects-model-tp2297668p2297668.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Converting Between Character and Numeric Objects

2010-07-21 Thread Erin
Hello,

I'm trying to convert a vector of string objects to a numeric object by
initializing the variables with values.  I use the function below to scan
through a matrix and create mass action flux relationships:

makeMassActionFluxes = function(sMatrix) {
 #Allocate a matrix with identical dimensions as the inputted stoichiometric
matrix
 temp = matrix(nrow = dim(sMatrix)[1], ncol = dim(sMatrix)[2])
velocities = vector(length = dim(sMatrix)[2])
 ## This series of 'for' loops scans the stoichiometrix matrix looking for
negative integers.
 ## Once it identifies a negative integer it pulls the row name and
exponentiates it to the
## appropriate stoichiometry, creating the flux expression.
 for (i in 1:dim(sMatrix)[1]) {
for (j in 1:dim(sMatrix)[2]) {
 if (sMatrix[i, j] = 0) (temp[i, j] = NA)
else temp[i, j] = paste(rownames(sMatrix)[i], '^', -sMatrix[i, j], sep = '')
 }
 }
 ## Scan temp by column.  Ignore strings that begin with '1.'  Use paste()
to multiply strings.
 for (k in 1:dim(temp)[2]) {#Loop through columns
 velocities[k] = paste(c(temp[ , k]), sep = '', collapse = '*')
 velocities[k] = gsub('*NA', '', velocities[k], fixed = TRUE)
velocities[k] = gsub('NA*', '', velocities[k], fixed = TRUE)
 }
 return(velocities)
}

If I pass in the following example matrix, I get a list of flux expressions
('v') just like I want:

testS = matrix(c(-1, 0, 2, -1, -1, 0, 0, -1, -1, 1, 0, 0, 0, 1, 0),
nrow = 5,
ncol = 3,
 byrow = TRUE)
rownames(testS) = c(met1, met2, met3, met4, met5)
colnames(testS) = c('rxn1', 'rxn2', 'rxn3')

v = makeMassActionFluxes(testS)
v
[1] met1^1*met2^1 met2^1*met3^1 met3^1

My next step is to use an ODE generating function to initialize values for
the met variables.

state = c(met1 = 10, met2 = 10, met3 = 10, met4 = 10, met5 = 10)
parameters = 1
ODE = function(t, state, parameters, S, velocity) {
 with(as.list(c(state, parameters, S, velocity)), {
 des = S %*% v
 #The products should look like this:
#dmet1 = (-1)*met1^1*met2^1 + (0)*met2^1*met3^1 + (2)*met3^1
 #Return
list(des)
 })
 }

When I call 'makeODE' I get the following error (which I expected):

Error in S %*% v : requires numeric/complex matrix/vector arguments

I'm wondering if there is an easy way to remove the quotes on the elements
on 'v' or make 'v' into a numeric vector so that I can initialize them with
values?  Or, is there an easier way to generate the flux expressions in my
'makeMassActionFluxes' function so that eventually passing in initial values
for the met variables will be easier?

Thanks in advance!

Erin

-- 
Erin Rachael Shellman
The University of Michigan
Bioinformatics PhD Candidate
http://www.erinshellman.com
shell...@umich.edu
(937) 321.1129

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lm: order of dropped columns

2010-07-21 Thread Dennis Murphy
Hi:

(1) lm() drops columns in a rank-deficient model matrix X to make X'X
nonsingular - this is called a full-rank reparameterization of the linear
model.

(2) How many columns of X are dropped depends on its rank, which in turn
depends on the number of constraints in the model matrix. This is related to
the degrees of freedom associated with each term in the corresponding linear
model.. The default contrasts are
 options()$contrasts
unordered   ordered
contr.treatment  contr.poly

Other choices include contr.helmert() and contr.sum(). See the help page
?contrasts for further details. See also section 6.2 of Venables and
Ripley's _Modern Applied Statistics with S, 4th ed._ for further information
on the connection between the columns of the model matrix in ANOVA and the
defined sets of contrasts.

Under the default contrasts, the first column is dropped for the main effect
terms. Here's a simple example of a balanced 2-way ANOVA with interaction:

d - data.frame(a = factor(rep(letters[1:3], each = 4)),
b = factor(rep(rep(1:2, each = 2), 3)),
c = rnorm(12))
 d
   a b   c
1  a 1 -0.77367688
2  a 1 -0.79069791
3  a 2  0.69257133
4  a 2  2.46788204
5  b 1  0.38892289
6  b 1 -0.03521033
7  b 2 -0.01071611
8  b 2 -0.74209425
9  c 1  1.36974281
10 c 1 -1.22775441
11 c 2  0.29621976
12 c 2  0.28208192
m - aov(c ~ a * b, data = d)
model.matrix(m)
   (Intercept) ab ac b2 ab:b2 ac:b2
11  0  0  0 0 0
21  0  0  0 0 0
31  0  0  1 0 0
41  0  0  1 0 0
51  1  0  0 0 0
61  1  0  0 0 0
71  1  0  1 1 0
81  1  0  1 1 0
91  0  1  0 0 0
10   1  0  1  0 0 0
11   1  0  1  1 0 1
12   1  0  1  1 0 1
attr(,assign)
[1] 0 1 1 2 3 3
attr(,contrasts)
attr(,contrasts)$a
[1] contr.treatment

attr(,contrasts)$b
[1] contr.treatment

Notice that the first column of each main effect is dropped, and that the
interaction columns are the products of the retained a columns with the
retained b columns. The attr() components verify that the contrast method
used for this matrix is the default contr.treatment.

 anova(m)
Analysis of Variance Table

Response: c
  Df Sum Sq Mean Sq F value Pr(F)
a  2 0.5001 0.25003  0.2827 0.7633
b  1 1.3700 1.36999  1.5489 0.2597
a:b2 4.5647 2.28235  2.5804 0.1554
Residuals  6 5.3070 0.88450

Examination of the degrees of freedom tells us that there are two
independent contrasts for a, one independent contrast for b and two
independent contrasts for the interaction of a and b, which are shown in
model.matrix(m) above.

If you want to reset the baselline factor level, see ?relevel. Also look
into the contrast package on CRAN.

HTH,
Dennis

On Wed, Jul 21, 2010 at 8:40 AM, Anirban Mukherjee am...@cornell.eduwrote:

 Hi all,

 If presented with a singular design matrix, lm drops columns to make the
 design matrix non-singular. What algorithm is used to select which (and how
 many) column(s) to drop? Particularly, given a factor, how does lm choose
 levels of the factor to discard?

 Thanks for the help.

 Best,
 Anirban

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Get distribution of positive/negative examples for each cluster

2010-07-21 Thread Boya Sun
Dear R experts,

I have a labeled data set. Each data is assigned a binary label 0 or 1.
Assume that I use some clustering algorithm to group the data by clusters
(using some features of the data). Now I want to know how many data are
labeled as 0/1 in each cluster.

For example, assume that I have 9 labeled data grouped into three clusters.
The ids of the clusters are 1, 2, and 3.  The dataset is represented by the
following matrix:

membershipLabel
d110
d210
d311
d420
d521
d621
d731
d831
d931

Now I want to get the following output, telling me how many data are labeled
as 0 and 1 in each cluster

cluster_id0-data1-data
121
212
303

The output does not have to be a matrix, it could be a summary of the
statistics.

How should I approach this problem? What R functions should I use to get
such information?

Thanks so much!

Boya

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help me with prediction in linear model

2010-07-21 Thread vijaysheegi

Hi R-community,
I have the code as follows,i Fitted model as follows
lbeer-log(beer_monthly)
t-seq(1956,1995.2,length=length(beer_monthly)) #beer_monthly contains 400+
entries
t2=t^2
beer_fit_parabola=lm(lbeer~t+t2)

Below is not working for me.
Please help me in preparing the new data set for the below prediction



predict(beer_fit_parabola,newdata=data.frame(t=seq(1995,1998,length=20),t2=seq(1995,1998,length=20))

#it is listing all 400+ entries ,but not 20 ahead prediction.

Thanks In advance for your help
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Help-me-with-prediction-in-linear-model-tp2297313p2297313.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lmomRFA package: error bounds/confidence intervals

2010-07-21 Thread Tonja Krueger

   Dear List

   I’m using the ”lmomRFA” package to fit different distributions to my data
   sample. To calculate the error bounds I used:
regsimq(…)
and
sitequantbounds(…)
 
So my questions are:
Are error bounds and confidence intervals the same thing?
And: Does 
 
regsimq(…  boundprob = c(0.05, 0.95)) 
 
calculate the 90 or the 95% confidence interval?
If error bounds and confidence intervals are not equal: Is there a way to calcu
late confidence intervals for my fitted distributions (gev, weibull, gumbel…)?
 
Thank you in advance,
Tonja


   Neu: WEB.DE De-Mail - Einfach wie E-Mail, sicher wie ein Brief!
   Jetzt De-Mail-Adresse reservieren: https://produkte.web.de/go/demail02
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sem by variable x

2010-07-21 Thread Daniel Caro
Hi Jarret,

Thank you for your answer. I get the following message:

Error in cov(a.df[, 2:7], nrow(a.df)) : incompatible dimensions

The function seems to run for some countries but then appears to stop
when a country has incomplete data (1 var is missing, for example).
How to force the function to continue or skip those countries?

Thank you,
Daniel



On Tue, Jul 20, 2010 at 7:24 PM, Jarrett Byrnes byr...@msi.ucsb.edu wrote:
 You may want to take a look at the lavaan package and use the multigroup 
 analysis there (and see if you even need to group by country as well).

 Otherwise, you could do something like

 library(sem)
 library(plyr)

 cfa_func-function(a.df){

        cfa-sem(ses.model, cov(a.df[,2:7], nrow(a.df)))
        print(summary(cfa))
 }
 d_ply(data, idcntry, cfa_func)

 -Jarrett

 On Jul 20, 2010, at 6:22 AM, Daniel Caro wrote:

 Hi R users,

 I am new in R. I would like to perform confirmatory factor analysis
 for a data set of countries. My data are:

 data - read.csv(ses.raw, header = TRUE)
 attach(data)
 names(data)

 [1] idcntry momed   daded   dadocu  momocu  hompos  finan


 The country id is idcntry, my model is ses.model, and variables to
 be included in the analysis are momed   daded   dadocu  momocu
 hompos  finan . How can I run

 cfa-sem(ses.model, cov(data[,2:7], nrow(data)))
 summary(cfa)

 by country? I am able to perform sem on all data by not by country. I tried

 by(data[,2:7], idcntry, function(x) sem(ses.model, cov(data[,2:7]), 
 nrow(data)))

 but the output is the same for all countries.

 Thank you,
 Daniel

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lattice Panel Print Order

2010-07-21 Thread Mark Ebbert
For pete's sake! That's just embarrassing. I don't know how I missed that. 
Thanks for your help Deepayan.

I would also like to say thank you for supporting Lattice so well.

On Jul 21, 2010, at 2:39 AM, Deepayan Sarkar wrote:

 On Wed, Jul 21, 2010 at 12:13 AM, Mark Ebbert mark.ebb...@hci.utah.edu 
 wrote:
 Hi,
 
 I have done a bit of searching and have not found a good answer for my 
 question.
 Although I have not difficulty ordering the panels, Lattice prints them from 
 bottom
 to top and left to right for each page. Is it possible to make it print from 
 top to bottom
 for each page?
 
 Search for as.table in ?xyplot. Is this what you were looking for?
 
 -Deepayan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Objects within environment

2010-07-21 Thread Megh Dal
Hi Duncan, thanks for your clarification. However I do not think I could really 
understand the envir argument in objects() function.

It is saying an alternative argument for name Is the alternative 
means the alternative of, let say, package:graphics (which is the name of an 
environment?). Can you give me an example of an alternative argument of that 
particular environment?

This is specifying the environment evaluation environment. What does the 
phase environment evaluation environment mean? Can you give me an example?

Mostly there for back compatibility: again totally in dark, what does it mean 
for back compatibility? An example would definitely be great.

Toy said you need to give an environment, not the name of one. If I call 
someone I need to call with his name, right? Then if I need to give an 
environment then, without its name how can I do so?

Can you please explain me in simple english? I think R help file should use 
more non-technical simple english language so that student like me can 
understand R in more easier way!

--- On Wed, 7/21/10, Duncan Murdoch murdoch.dun...@gmail.com wrote:

 From: Duncan Murdoch murdoch.dun...@gmail.com
 Subject: Re: [R] Objects within environment
 To: Megh Dal megh700...@yahoo.com
 Cc: r-h...@stat.math.ethz.ch
 Date: Wednesday, July 21, 2010, 4:48 PM
 On 21/07/2010 5:57 AM, Megh Dal
 wrote:
  Hi all, I have following environments loaded with my
 current R session:
  
  search()
  [1] .GlobalEnv       
 package:stats 
    package:graphics 
 package:grDevices
  [5] package:utils 
    package:datasets 
 package:methods   Autoloads   
     [9] package:base    
  How can I find the objects under a specific
 environment? Here I tries following:
  
  objects(envir=package:base)
  Error in objects(envir = package:base) : invalid
 'envir' argument
  
  It would be great if somebody would point me the
 correct arguments for object() function to find the onjects
 associated with it. In help file it is written that:
   envir: an alternative argument to ‘name’ for
 specifying the
        
    environment evaluation environment. Mostly
 there for back
        
    compatibility
  What is the wrong in my code?
 
 The easiest way to pick an item from the search list is by
 number:
 
 objects(3)
  or
 ls(3)
 
 will give you the objects in the graphics package, with the
 search list as above.  You can also specify the name as
 the name argument, e.g.
 
 objects(package:base)
 
 If you want to use the envir argument (why?), you need to
 give an environment, not the name of one.
 
 Duncan Murdoch
 
 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Objects within environment

2010-07-21 Thread Megh Dal
Thanks Duncan, I understood. Your explanation is really great. Thank you so 
much for your time.

--- On Wed, 7/21/10, Duncan Murdoch murdoch.dun...@gmail.com wrote:

 From: Duncan Murdoch murdoch.dun...@gmail.com
 Subject: Re: [R] Objects within environment
 To: Megh Dal megh700...@yahoo.com
 Cc: r-h...@stat.math.ethz.ch
 Date: Wednesday, July 21, 2010, 10:15 PM
 On 21/07/2010 12:27 PM, Megh Dal
 wrote:
  Hi Duncan, thanks for your clarification. However I do
 not think I could really understand the envir argument in
 objects() function.
  
  It is saying an alternative argument for
 name Is the alternative means the alternative of,
 let say, package:graphics (which is the name of an
 environment?). Can you give me an example of an alternative
 argument of that particular environment?
    
 
 as.environment(package:graphics) is an environment. 
 package:graphics is its name.
  This is specifying the environment evaluation
 environment. What does the phase environment evaluation
 environment mean? Can you give me an example?
    
 
 That looks like a typo to me.  Environment evaluation
 environment is meaningless.
  Mostly there for back compatibility: again totally
 in dark, what does it mean for back compatibility? An
 example would definitely be great.
    
 
 Presumably some older release of R used envir, and we still
 have it so that old code will still work.  But that's a
 signal that new code should never need to use it.
  Toy said you need to give an environment, not the
 name of one. If I call someone I need to call with his
 name, right? Then if I need to give an environment then,
 without its name how can I do so?
    
 
 This is a computer language, not a conversation. 
 Words have technical meanings that aren't always perfect
 matches to English meanings of the same words.  Here
 the name of an environment is what you get when you ask
 for the name attribute of it.  There are lots of
 different ways to refer to objects other than by the name
 in that technical sense.  For example, you could say
 
 x - as.environment(package:graphics) # uses the
 environment's name
 ls(envir=x) # refers to it by a variable holding it.
 
 Duncan Murdoch
  Can you please explain me in simple english? I think R
 help file should use more non-technical simple english
 language so that student like me can understand R in more
 easier way!
  
  --- On Wed, 7/21/10, Duncan Murdoch murdoch.dun...@gmail.com
 wrote:
  
   From: Duncan Murdoch murdoch.dun...@gmail.com
   Subject: Re: [R] Objects within environment
   To: Megh Dal megh700...@yahoo.com
   Cc: r-h...@stat.math.ethz.ch
   Date: Wednesday, July 21, 2010, 4:48 PM
   On 21/07/2010 5:57 AM, Megh Dal
   wrote:
Hi all, I have following environments loaded
 with my
   current R session:
  search()
[1] .GlobalEnv   
     package:stats    
 package:graphics  package:grDevices
[5] package:utils    
 package:datasets 
 package:methods   Autoloads
    [9] package:base     
 How can I find the objects under a specific
   environment? Here I tries following:
  objects(envir=package:base)
Error in objects(envir = package:base) :
 invalid
   'envir' argument
  It would be great if somebody
 would point me the
   correct arguments for object() function to find
 the onjects
   associated with it. In help file it is written
 that:
 envir: an alternative argument to
 ‘name’ for
   specifying the
       
   environment evaluation environment. Mostly
   there for back
       
   compatibility
What is the wrong in my code?
The easiest way to pick an item from the
 search list is by
   number:
objects(3)
    or
   ls(3)
will give you the objects in the graphics
 package, with the
   search list as above.  You can also specify
 the name as
   the name argument, e.g.
objects(package:base)
If you want to use the envir argument
 (why?), you need to
   give an environment, not the name of one.
Duncan Murdoch

  
          
 
 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Chi-square distribution probability density function:

2010-07-21 Thread Knut Krueger

David Winsemius schrieb:



And exactly why did you think I offered

?Chisquare
I was completely on the wrong way, and tried to find a solution with the 
formula instead to substitute the formula.


So I tried to implement pchisq into the formula - and of course I got 
wrong values ...


Thank's Knut

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] .jinit and .jnew

2010-07-21 Thread behrooz

i have a jar under z:/downloads/AA, i use .jinit(classpath=z:/downloads/AA)
and then
.jnew(Myclass) but .jnew throes an error:

Error in .jnew(MyClass) : 
  java.lang.NoClassDefFoundError: MyClass

tried 

.jinit(classpath=z:/downloads/AA/myClass.jar) and got the same error.

any idea?



-- 
View this message in context: 
http://r.789695.n4.nabble.com/jinit-and-jnew-tp2297486p2297486.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday

2010-07-21 Thread David Scott

Henrique Dallazuanna wrote:

On Wed, Jul 21, 2010 at 11:24 AM, Ben Bolker bbol...@gmail.com wrote:


Harlan Harris harlan at harris.name writes:


Hi,

I'm using RGoogleDocs/RCurl to update a Google Spreadsheet. Everything
worked OK until this morning, when my ability to write into spreadsheet
cells went away. I get the following weird error:

Error in els[[type + 1]] : subscript out of bounds

Looking at the Google Docs API changelog, I see the following:

http://code.google.com/apis/spreadsheets/changelog.html
Release 2010-01 (July 14, 2010)

This is an advanced notice about an upcoming change.

   - Starting July 19, 2010, all links returned by all Spreadsheets API
   feeds will use HTTPS. This is being done in the interests of increased
   security. If you require the use of HTTP, we recommend that you remove

the

   replace https with http in these links. Another announcement will be

made

   on July 19, 2010, when this change goes to production.

I suspect this is the problem. Fixing it is above my head, I'm afraid.

Could

anyone help? This is urgent. Thank you,


   This is an Omegahat package (took me a little while to find it). Perhaps
you should write to the package maintainer?

library(RGoogleDocs)
help(package=RGoogleDocs)

or, more obscurely:

help(package=RGoogleDocs)$info[[1]][9]

 (there may be a better way to deal with objects of type packageInfo
but I can't figure it out right at the moment).



Maybe:
packageDescription('RGoogleDocs', fields = 'Author')



From the News file for 2.11.0:

 o  maintainer() has been added, to give convenient access to the
name of the maintainer of a package





 It looks as though one might be able to fix this by hacking the
hard-coded URLs in the code, but as you suggest that might be above
your head.

 good luck ...
Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.








--
_
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Question about allocMatrix error message

2010-07-21 Thread Moohwan Kim
Dear R family,

I faced a technical problem in r coding.
 #s=t(dev)%*%dev/(nr-1) # dev (100,000 by 2) stands for
deviation from the mean
 #sinv=solve(s)
 #t2=diag(dev%*%sinv%*%t(dev))

I got an error message at t2 statement:
Error in diag(dev %*% si %*% t(dev)) : allocMatrix: too many elements specified

Please let me know if there is a way to overcome this problem.

best
moohwan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] calling R from within Java, using jri

2010-07-21 Thread Hawcandy

I've gotten the same problems when running the examples, the programs just
exit after creating Rengine. After trying all the possibilities, I found
that I set the environment variable R_HOME wrong. After changing it from
C:\Program Files\R\R-2.11.1\bin to C:\Program Files\R\R-2.11.1, the programs
worked just fine!!!

Hope this helps!
-- 
View this message in context: 
http://r.789695.n4.nabble.com/R-calling-R-from-within-Java-using-jri-tp808810p2297558.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] xtable

2010-07-21 Thread Silvano

Hi,

How do I build a table from a regression model adjusted 
using xtable?


Commands are:

modelo1 = lm(Y~X1 + X2)
influencia = influence.measures(modelo1)

require(xtable)
xtable(influencia)

but it isn't work.

Thanks,

--
Silvano Cesar da Costa
Departamento de Estatística
Universidade Estadual de Londrina
Fone: 3371-4346

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Get distribution of positive/negative examples for each cluster

2010-07-21 Thread Phil Spector

Boya-
   table() is the function that does what you want:


cdat = data.frame(membership=rep(1:3,rep(3,3)),

+   label=as.character(c(0,0,1,0,1,1,1,1,1)))

table(cdat)

  label
membership 0 1
 1 2 1
 2 1 2
 3 0 3


From there, you can rearrange it in a variety of ways:



as.data.frame(table(cdat))

  membership label Freq
1  1 02
2  2 01
3  3 00
4  1 11
5  2 12
6  3 13

Or, to conform with your request


reshape(as.data.frame(table(cdat)),idvar='membership',

+ v.names='Freq',timevar='label',direction='wide')
  membership Freq.0 Freq.1
1  1  2  1
2  2  1  2
3  3  0  3


- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Wed, 21 Jul 2010, Boya Sun wrote:


Dear R experts,

I have a labeled data set. Each data is assigned a binary label 0 or 1.
Assume that I use some clustering algorithm to group the data by clusters
(using some features of the data). Now I want to know how many data are
labeled as 0/1 in each cluster.

For example, assume that I have 9 labeled data grouped into three clusters.
The ids of the clusters are 1, 2, and 3.  The dataset is represented by the
following matrix:

   membershipLabel
d110
d210
d311
d420
d521
d621
d731
d831
d931

Now I want to get the following output, telling me how many data are labeled
as 0 and 1 in each cluster

cluster_id0-data1-data
121
212
303

The output does not have to be a matrix, it could be a summary of the
statistics.

How should I approach this problem? What R functions should I use to get
such information?

Thanks so much!

Boya

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Variance of the prediction in the linear regression model (Theory and programming)

2010-07-21 Thread Yi
Hi, folks,

Here are the codes:

##
y=1:10
x=c(1:9,1)
lin=lm(log(y)~x)  ### log(y) is following Normal distribution
x=5:14
prediction=predict(lin,newdata=x)  ##prediction=predict(lin)
###

1. The codes do not work, and give the error message: Error in
eval(predvars, data, env) :
  numeric 'envir' arg not of length one.  But if I use the code after the
pound sign, it works. I mean the name of the newdata is x, why it does not
work though?


2. Because the prediction is conducted for log(y). I need to get the
expected value of y, which is LN distribution, for the new data sets. I need
to know the expectation of log(y) and variance of log(y).

#
mean=mean(prediction)
sd=sd(prediction)
mean_y=exp(mean+0.5*sd^2) ### formula from Normal to LN
##

Is sd(prediction) the correct why to calculate the sigma of the prediction?
Or should I just use the value of Residual standard error from summary(lin)?

Answer to either question will be appreciated!

Thanks

Yi

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question about allocMatrix error message

2010-07-21 Thread Phil Spector

Moohwan -
   It appears that you are trying to calculate a 10
by 10 matrix when all you want are the diagonal 
elements.  Here's one approach that might work:


s = t(dev)%*%dev/(nr-1)
sinv = solve(s)
part = sinv%*%t(dev)
t2 = dev[,1]*part[1,] + dev[,2]*part[2,]

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Wed, 21 Jul 2010, Moohwan Kim wrote:


Dear R family,

I faced a technical problem in r coding.
#s=t(dev)%*%dev/(nr-1) # dev (100,000 by 2) stands for
deviation from the mean
#sinv=solve(s)
#t2=diag(dev%*%sinv%*%t(dev))

I got an error message at t2 statement:
Error in diag(dev %*% si %*% t(dev)) : allocMatrix: too many elements specified

Please let me know if there is a way to overcome this problem.

best
moohwan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Variance of the prediction in the linear regression model (Theory and programming)

2010-07-21 Thread Yi
Sorry, for the second question. I stated in a wrong way. My aim is the mean
and sd of each new observation.

#
mean=fitted(prediction)
##

But I do not know how to get sd for each new observation.

Any tips?

Thanks

Yi



On Wed, Jul 21, 2010 at 2:29 PM, Yi liuyi.fe...@gmail.com wrote:

 Hi, folks,

 Here are the codes:

 ##
 y=1:10
 x=c(1:9,1)
 lin=lm(log(y)~x)  ### log(y) is following Normal distribution
 x=5:14
 prediction=predict(lin,newdata=x)  ##prediction=predict(lin)
 ###

 1. The codes do not work, and give the error message: Error in
 eval(predvars, data, env) :
   numeric 'envir' arg not of length one.  But if I use the code after the
 pound sign, it works. I mean the name of the newdata is x, why it does not
 work though?


 2. Because the prediction is conducted for log(y). I need to get the
 expected value of y, which is LN distribution, for the new data sets. I need
 to know the expectation of log(y) and variance of log(y).

 #
 mean=mean(prediction)
 sd=sd(prediction)
 mean_y=exp(mean+0.5*sd^2) ### formula from Normal to LN
 ##

 Is sd(prediction) the correct why to calculate the sigma of the prediction?
 Or should I just use the value of Residual standard error from summary(lin)?

 Answer to either question will be appreciated!

 Thanks

 Yi



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Issues reshaping data

2010-07-21 Thread JGull8502

Hello All,

I'm having some trouble reshaping my data from wide to long format. I have
tried using both the reshape function and package. Although I haven't worked
much with the reshape function, I have found the reshape package useful and
intuitive for reshaping data from long to wide format. However, going the
other way has me stumped with this type of data. My data is set up, roughly,
as follows:

SUBJECT  TRIALWORD1 WORD1.RT WORD2 WORD2.RT ...
WORD25.RT
1   1My   100  friend200
 
... ...
1   2John's  250  dog   320 
   
... ...
1   3His  120  waiter400

... ...
2   1My   100  friend200

... ...
2   2John's  250  dog   320 
   
... ...
2   3His  120  waiter400

... ...

What I would like to do is transform it roughly to:

Subject  TRIAL  WORDRT
1 1  My 100
1 1  friend 200
   .   
2 3  His120  
2 3  Waiter 400

The furthest I've gotten with the package is melting the data such that RT
is in its own column by using each WORD#.RT as a measure.var. However, I
can't get the words to go with them (either in the melted frame or trying to
cast the melted data).

I tried the reshape function as well, but kept getting an error about
duplicate row names. I lost the exact code I was using after closing R, but
roughly, I would use Subject as an ID variable, varying =
list(wordRTs,words) [where wordRTs are the set of WORD#.RT and words are the
set of WORD# columns] , v.names=c(RT,WORD).

If anyone has any help, it would be much appreciated!

Thanks,
Jason



-- 
View this message in context: 
http://r.789695.n4.nabble.com/Issues-reshaping-data-tp2297853p2297853.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help me with prediction in linear model

2010-07-21 Thread Dennis Murphy
Hi:

On Wed, Jul 21, 2010 at 8:46 AM, vijaysheegi vijay.she...@gmail.com wrote:


 Hi R-community,
 I have the code as follows,i Fitted model as follows
 lbeer-log(beer_monthly)
 t-seq(1956,1995.2,length=length(beer_monthly)) #beer_monthly contains 400+
 entries


This is unnecessary:

 t2=t^2
 beer_fit_parabola=lm(lbeer~t+t2)


Try

beer_fit_parabola - lm(lbeer ~ poly(t, 2))

instead. poly() is a safer function to use for polynomial regression. See
?poly for the reasons why.


 Below is not working for me.
 Please help me in preparing the new data set for the below prediction


 tpred - data.frame(t = seq(1995, 1998, length = 20))
predict(beer_fit_parabola, newdata = tpred)

If that doesn't work, try putting lbeer and t in a data frame and use it as
the data = argument of lm() [same model]. Then try predict() again with the
same code as above. This shouldn't be necessary, though.

Here's a quick made up example:

x - 1:10
y - 3 + 2 * x - x^2 + rnorm(10)
plot(x, y)
mm - lm(y ~ poly(x, 2))
predict(mm, newdata = data.frame(x = c(11, 12)))
 1  2
 -96.10648 -117.04534

If you look at summary(mm), you'll see why polynomial models are not very
useful for prediction, particularly when extrapolating outside the range of
the observed data. (Compare the parameter estimates to the function that
produced y as a crude check.) A better choice might be a natural or
restricted cubic spline to reduce the standard error of prediction.

HTH,
Dennis

predict(beer_fit_parabola,newdata=data.frame(t=seq(1995,1998,length=20),t2=seq(1995,1998,length=20))

 #it is listing all 400+ entries ,but not 20 ahead prediction.

 Thanks In advance for your help
 --
 View this message in context:
 http://r.789695.n4.nabble.com/Help-me-with-prediction-in-linear-model-tp2297313p2297313.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Command that is conditional upon file retrieval: is it possible?

2010-07-21 Thread AndrewPage

Hi all,

I'm currently working on an R program where I have to access an FTP server
to download some of the data I need.  However, the people who post up the
files I access are at times inconsistent with regards to time posted, if
they post at all, etc  Here's some of the code I use:

library(RCurl)

url1 = paste(ftp://user:passw...@a.great.website.com/;, file, num1,
.csv, sep = )

data1 = getURL(url1)

write(data1, file = paste(inMyFolder, num1, .csv, sep = ))


Sometimes this process works perfectly, and sometimes I get an error message
like this attached to data1 = getURL(url1):

Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) : 
  RETR response: 550

That's because that particular file isn't on the FTP server yet.  Now...
let's just imagine that there's another way for me to access the file
elsewhere, and I can drag it into the working directory with the same name
as the file I'm telling R to write immediately after finding it on the FTP
server.

So here's my question: is it possible to write a command that will write the
file if there isn't an error message going along with data1 =
getURL(url1), but won't write the file if it can't find it  As of right
now, if I got the error message, dragged the file into the working directory
and ran the program again, R will overwrite my good file with an empty one
because in all cases, I'm telling it to write a file with that name that
includes the information in data1.

Thanks in advance,
Andrew
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Command-that-is-conditional-upon-file-retrieval-is-it-possible-tp2297811p2297811.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >