[R] latin hypercube sampling

2013-02-19 Thread Aimee Kopolow
Hi all,

I am attempting to use latin hypercube sampling to sample different
variable functions in a series of simultaneous differential equations.
There is very little code online about lhs or clhs, so from different
other help threads I have seen, it seems I need to create a
probability density function for each variable function, and then use
latin hypercube sampling on this pdf.

So far, I have created a data frame consisting of the y output of
density(functionX) for each of the different functions I wish to
sample. [examples of functions include T1(t), WL1(T1,t),
BE1(WL1,T1,t)] The dataframe consists of 512 rows/vectors for each
function.



I tried running
res <- clhs(df, size = 500, iter = 2000, progress = FALSE, simple = TRUE)


and it returned a single series of 500 samples, rather than a series
of 500 samples per function.

I ultimately need a sample of each variable function that I can run
through my model, putting each individual variable function as a
constant instead, and then performing PRCC. Is there anyone who can
advise on how to do this, or failing that, where I should look for
sample code?

Thank you for any help you are able to give,
Aimee.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cumulative sum by group and under some criteria

2013-02-19 Thread arun
Hi,



Thanks. I tried the code below. Why does the expanded dataset 'res1' have rows
with m1=3 and n1=3 when dataset 'd3' doesn't contain the combination m1=3,
n1=3?
 

d3 <- structure(list(m1 = c(2, 3, 2), n1 = c(2, 2, 3), cterm1_P0L = c(0.9025,
0.857375, 0.9025), cterm1_P1L = c(0.64, 0.512, 0.64), cterm1_P0H = c(0.9025,
0.9025, 0.857375), cterm1_P1H = c(0.64, 0.64, 0.512)), .Names = c("m1",
"n1", "cterm1_P0L", "cterm1_P1L", "cterm1_P0H", "cterm1_P1H"), row.names =
c(NA, 3L), class = "data.frame")

d3
#  m1 n1 cterm1_P0L cterm1_P1L cterm1_P0H cterm1_P1H
#1  2  2   0.902500  0.640   0.902500  0.640
#2  3  2   0.857375  0.512   0.902500  0.640
#3  2  3   0.902500  0.640   0.857375  0.512

Here, there is no row with m1=3 and n1=3.  





d2 <- data.frame()
# this was part of your original code; in it, m1 expands over each value of n1
for (m1 in 2:3) {
    for (n1 in 2:3) {
    for (x1 in 0:(m1-1)) {
    for (y1 in 0:(n1-1)) {
        for (m in (m1+2):(7-n1)) {
           for (n in (n1+2):(9-m)) {
           for (x in x1:(x1+m-m1)) {
         for (y in y1:(y1+n-n1)) {
 d2 <- rbind(d2, c(m1, n1, x1, y1, m, n, x, y))
}}}}}}}}

colnames(d2) <- c("m1", "n1", "x1", "y1", "m", "n", "x", "y")   # here the combination is
there

 expand.grid(2:3,2:3)
#  Var1 Var2
#1    2    2
#2    3    2
#3    2    3
#4    3    3


res1 <- do.call(rbind,lapply(unique(d3$m1),function(m1)
 do.call(rbind,lapply(unique(d3$n1),function(n1)
 do.call(rbind,lapply(0:(m1-1),function(x1)
 do.call(rbind,lapply(0:(n1-1),function(y1)
 do.call(rbind,lapply((m1+2):(7-n1),function(m)
 do.call(rbind,lapply((n1+2):(9-m),function(n)
 do.call(rbind,lapply(x1:(x1+m-m1), function(x)
 do.call(rbind,lapply(y1:(y1+n-n1), function(y)
 expand.grid(m1,n1,x1,y1,m,n,x,y)))))))))))))))))
 names(res1) <- c("m1", "n1", "x1", "y1", "m", "n", "x", "y")
 attr(res1, "out.attrs") <- NULL
 res1[] <- sapply(d2, as.integer)
 res1   # here too
identical(d2, res1)
#[1] TRUE
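As an aside (my addition, not part of the original exchange), the
do.call(rbind, lapply(...)) idiom used above can be illustrated in miniature:
lapply() builds a list of data frames and do.call(rbind, ...) stacks them into
one.

```r
# Each list element is a one-row data frame; do.call(rbind, ...) binds them.
rows <- lapply(1:3, function(i) data.frame(i = i, sq = i^2))
stacked <- do.call(rbind, rows)
stacked
#   i sq
# 1 1  1
# 2 2  4
# 3 3  9
```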


library(plyr)
res2 <- join(res1, d3, by = c("m1", "n1"), type = "full")  # the combination m1=3, n1=3
# didn't exist in d3, so those rows are left as missing (NA)

 tail(res2,3)
 #   m1 n1 x1 y1 m n x y cterm1_P0L cterm1_P1L cterm1_P0H cterm1_P1H
#427  3  3  2  2 4 5 3 2 NA NA NA NA
#428  3  3  2  2 4 5 3 3 NA NA NA NA
#429  3  3  2  2 4 5 3 4 NA NA NA NA
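For readers without plyr, base R's merge() reproduces both join types (a
sketch I am adding, not from the original reply; note merge() may reorder
rows, unlike plyr::join):

```r
# merge(all.x = TRUE) keeps unmatched key combinations as NA rows, like
# plyr's type = "full" here; the default all = FALSE drops them, like
# type = "inner".
a <- data.frame(m1 = c(2, 3, 3), n1 = c(2, 2, 3), x = 1:3)
b <- data.frame(m1 = c(2, 3), n1 = c(2, 2), val = c(10, 20))
merge(a, b, by = c("m1", "n1"), all.x = TRUE)  # m1=3,n1=3 row has val = NA
merge(a, b, by = c("m1", "n1"))                # m1=3,n1=3 row is dropped
```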


I hope it helps.


A.K.

__
From: Joanna Zhang zjoanna2...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Saturday, February 16, 2013 8:46 PM
Subject: Re: [R] cumulative sum by group and under some criteria


Hi,
What I need is to expand each row by adding several columns. Let me
restate the question: I have a dataset d, and I want to expand it to the d2
shown below.
d <- data.frame()
for (m1 in 2:3) {
    for (n1 in 2:2) {
    for (x1 in 0:(m1-1)) {
    for (y1 in 0:(n1-1)) {
d <- rbind(d, c(m1, n1, x1, y1))
}
}
}}
colnames(d) <- c("m1", "n1", "x1", "y1")
d

   m1 n1 x1 y1
1   2  2  0  0
2   2  2  0  1
3   2  2  1  0
4   2  2  1  1
5   3  2  0  0
6   3  2  0  1
7   3  2  1  0
8   3  2  1  1
9   3  2  2  0
10  3  2  2  1
I want to expand it as follows:
for (m in (m1+2):(7-n1)) {
    for (n in (n1+2):(9-m)) {
    for (x in x1:(x1+m-m1)) {
}}}
so for the first row, 
   m1 n1 x1 y1
1   2  2  0  0
it should be expanded as
   m1 n1 x1 y1  m  n  x 
    2  2  0  0  4  4  0
    2  2  0  0  4  4  1 
    2  2  0  0  4  4  2
    2  2  0  0  4  5  0
    2  2  0  0  4  5  1
    2  2  0  0  4  5  2

On Tue, Feb 12, 2013 at 8:19 PM, arun smartpink...@yahoo.com wrote:

Hi,

Saw your reply again in Nabble.  I thought I sent you the solution previously.


res3new <- aggregate(. ~ m1 + n1, data = res2[, c(1:2, 12:13)], max)

d2 <- res3new[res3new[, 3] > 0.01 & res3new[, 4] > 0.01, ]


m1 <- 3 # from d2
maxN <- 9
n1 <- 2 # from d2

In the example that you provided:
 (m1+2):(maxN-(n1+2))
#[1] 5 
 (n1+2):(maxN-5)
#[1] 4 
#Suppose
 x1 <- 4
 y1 <- 2
 x1:(x1+5-m1)
#[1] 4 5 6
 y1:(y1+4-n1)
#[1] 2 3 4

 datnew <- expand.grid(5, 4, 4:6, 2:4)
 colnames(datnew) <- c("m", "n", "x", "y")
datnew <- within(datnew, {p1 <- x/m; p2 <- y/n})
res <- cbind(datnew, d2[rep(1:nrow(d2), nrow(datnew)), ])
 row.names(res) <- 1:nrow(res)
 res
 res
#  m n x y   p2  p1 m1 n1 cterm1_P1L cterm1_P0H
#1 5 4 4 2 0.50 0.8  3  2    0.00032     0.0025
#2 5 4 5 2 0.50 1.0  3  2    0.00032     0.0025
#3 5 4 6 2 0.50 1.2  3  2    0.00032     0.0025
#4 5 4 4 3 0.75 0.8  3  2    0.00032     0.0025
#5 5 4 5 3 0.75 1.0  3  2    0.00032     0.0025
#6 5 4 6 3 0.75 1.2  3  2    0.00032     0.0025
#7 5 4 4 4 1.00 0.8  3  2    0.00032     0.0025
#8 5 4 5 4 1.00 1.0  3  2    0.00032     0.0025
#9 5 4 6 4 1.00 1.2  3  2    0.00032     0.0025

A.K.





- Original Message -
From: Zjoanna zjoanna2...@gmail.com
To: r-help@r-project.org
Cc:

Sent: Sunday, February 10, 2013 6:04 PM
Subject: Re: [R] cumulative sum by group and under some criteria


Hi,
How to expand or loop for one variable n based on another variable? for
example, I want to add m (from m1 to maxN- n1-2) and for each m, I want to
add n (n1+2 to maxN-m), and similarly 

Re: [R] cumulative sum by group and under some criteria

2013-02-19 Thread arun
Hi,

If you don't want the m1=3, n1=3 combination in the final dataset:
library(plyr)
res2 <- join(res1, d3, by = c("m1", "n1"), type = "inner")
tail(res2)
#    m1 n1 x1 y1 m n x y cterm1_P0L cterm1_P1L cterm1_P0H cterm1_P1H
#235  3  2  2  1 5 4 3 1   0.857375  0.512 0.9025   0.64
#236  3  2  2  1 5 4 3 2   0.857375  0.512 0.9025   0.64
#237  3  2  2  1 5 4 3 3   0.857375  0.512 0.9025   0.64
#238  3  2  2  1 5 4 4 1   0.857375  0.512 0.9025   0.64
#239  3  2  2  1 5 4 4 2   0.857375  0.512 0.9025   0.64
#240  3  2  2  1 5 4 4 3   0.857375  0.512 0.9025   0.64
A.K.




- Original Message -
From: Zjoanna zjoanna2...@gmail.com
To: r-help@r-project.org
Cc: 
Sent: Monday, February 18, 2013 10:36 PM
Subject: Re: [R] cumulative sum by group and under some criteria

Thanks. I tried the code below. Why does the expanded dataset 'res1' have rows
with m1=3 and n1=3 when dataset 'd3' doesn't contain the combination m1=3,
n1=3?


d3 <- structure(list(m1 = c(2, 3, 2), n1 = c(2, 2, 3), cterm1_P0L = c(0.9025,
0.857375, 0.9025), cterm1_P1L = c(0.64, 0.512, 0.64), cterm1_P0H = c(0.9025,
0.9025, 0.857375), cterm1_P1H = c(0.64, 0.64, 0.512)), .Names = c("m1",
"n1", "cterm1_P0L", "cterm1_P1L", "cterm1_P0H", "cterm1_P1H"), row.names =
c(NA, 3L), class = "data.frame")
d3

res1 <- do.call(rbind,lapply(unique(d3$m1),function(m1)
do.call(rbind,lapply(unique(d3$n1),function(n1)
do.call(rbind,lapply(0:(m1-1),function(x1)
do.call(rbind,lapply(0:(n1-1),function(y1)
do.call(rbind,lapply((m1+2):(7-n1),function(m)
do.call(rbind,lapply((n1+2):(9-m),function(n)
do.call(rbind,lapply(x1:(x1+m-m1), function(x)
do.call(rbind,lapply(y1:(y1+n-n1), function(y)
expand.grid(m1,n1,x1,y1,m,n,x,y)))))))))))))))))
names(res1) <- c("m1", "n1", "x1", "y1", "m", "n", "x", "y")
attr(res1, "out.attrs") <- NULL
res1

library(plyr)
res2 <- join(res1, d3, by = c("m1", "n1"), type = "full")
res2

On Sun, Feb 17, 2013 at 11:10 PM, arun kirshna [via R] 
ml-node+s789695n4658895...@n4.nabble.com wrote:

 Hi,


 Yes, I wanted to expand directly from d. If there are other variables,
 says A, B, C that I need to keep  in the final expanded data, how
 to modify the code?

 d <- data.frame()
 for (m1 in 2:3) {
     for (n1 in 2:2) {
         for (x1 in 0:(m1-1)) {
             for (y1 in 0:(n1-1)) {
 d <- rbind(d, c(m1, n1, x1, y1))
 }
 }
 }}
 colnames(d) <- c("m1", "n1", "x1", "y1")
 d
 res1 <- do.call(rbind,lapply(unique(d$m1),function(m1)
 do.call(rbind,lapply(unique(d$n1),function(n1)
 do.call(rbind,lapply(0:(m1-1),function(x1)
 do.call(rbind,lapply(0:(n1-1),function(y1)
 do.call(rbind,lapply((m1+2):(7-n1),function(m)
 do.call(rbind,lapply((n1+2):(9-m),function(n)
 do.call(rbind,lapply(x1:(x1+m-m1), function(x)
 expand.grid(m1,n1,x1,y1,m,n,x)))))))))))))))
  names(res1) <- c("m1", "n1", "x1", "y1", "m", "n", "x")

 set.seed(235)
  d$A <- sample(1:50, 10, replace = TRUE)
  set.seed(23)
  d$B <- sample(1:50, 10, replace = TRUE)

  d
  #  m1 n1 x1 y1  A  B
 #1   2  2  0  0 50 29
 #2   2  2  0  1 40 12
 #3   2  2  1  0 31 17
 #4   2  2  1  1  7 36
 #5   3  2  0  0 13 41
 #6   3  2  0  1 27 22
 #7   3  2  1  0 49 49
 #8   3  2  1  1 47 49
 #9   3  2  2  0 23 43
 #10  3  2  2  1  4 50
 library(plyr)
 res2 <- join(res1, d, by = c("m1", "n1", "x1", "y1"), type = "full")
 res2
  #  m1 n1 x1 y1 m n x  A  B
 #1   2  2  0  0 4 4 0 50 29
 #2   2  2  0  0 4 4 1 50 29
 #3   2  2  0  0 4 4 2 50 29
 #4   2  2  0  0 4 5 0 50 29
 #5   2  2  0  0 4 5 1 50 29
 #6   2  2  0  0 4 5 2 50 29
 #7   2  2  0  0 5 4 0 50 29
 #8   2  2  0  0 5 4 1 50 29
 #9   2  2  0  0 5 4 2 50 29
 #10  2  2  0  0 5 4 3 50 29
 #11  2  2  0  1 4 4 0 40 12
 #12  2  2  0  1 4 4 1 40 12
 #13  2  2  0  1 4 4 2 40 12
 #14  2  2  0  1 4 5 0 40 12
 #15  2  2  0  1 4 5 1 40 12
 #16  2  2  0  1 4 5 2 40 12
 #17  2  2  0  1 5 4 0 40 12
 #18  2  2  0  1 5 4 1 40 12
 #19  2  2  0  1 5 4 2 40 12
 #20  2  2  0  1 5 4 3 40 12
 #21  2  2  1  0 4 4 1 31 17
 #22  2  2  1  0 4 4 2 31 17
 #23  2  2  1  0 4 4 3 31 17
 #24  2  2  1  0 4 5 1 31 17
 #25  2  2  1  0 4 5 2 31 17
 #26  2  2  1  0 4 5 3 31 17
 #27  2  2  1  0 5 4 1 31 17
 #28  2  2  1  0 5 4 2 31 17
 #29  2  2  1  0 5 4 3 31 17
 #30  2  2  1  0 5 4 4 31 17
 #31  2  2  1  1 4 4 1  7 36
 #32  2  2  1  1 4 4 2  7 36
 #33  2  2  1  1 4 4 3  7 36
 #34  2  2  1  1 4 5 1  7 36
 #35  2  2  1  1 4 5 2  7 36
 #36  2  2  1  1 4 5 3  7 36
 #37  2  2  1  1 5 4 1  7 36
 #38  2  2  1  1 5 4 2  7 36
 #39  2  2  1  1 5 4 3  7 36
 #40  2  2  1  1 5 4 4  7 36
 #41  3  2  0  0 5 4 0 13 41
 #42  3  2  0  0 5 4 1 13 41
 #43  3  2  0  0 5 4 2 13 41
 #44  3  2  0  1 5 4 0 27 22
 #45  3  2  0  1 5 4 1 27 22
 #46  3  2  0  1 5 4 2 27 22
 #47  3  2  1  0 5 4 1 49 49
 #48  3  2  1  0 5 4 2 49 49
 #49  3  2  1  0 5 4 3 49 49
 #50  3  2  1  1 5 4 1 47 49
 #51  3  2  1  1 5 4 2 47 49
 #52  3  2  1  1 5 4 3 47 49
 #53  3  2  2  0 5 4 2 23 43
 #54  3  2  2  0 5 4 3 23 43
 #55  3  2  2  0 5 4 4 23 43
 #56  3  2  2  1 5 4 2  4 50
 #57  3  2  2  1 5 4 3  4 50
 #58  3  2  2  1 5 4 4  4 50
 A.K.








 - Original Message -
 From: arun [hidden email]

 To: Joanna 

Re: [R] How to do a backward calculation for each record in a dataset

2013-02-19 Thread Prakasit Singkateera
Hi everyone,

From your helps of giving me several ideas, eventually I can solve the
posted problem. Here is the R code. It can be done by applying the
uniroot.all to the data frame together with the proper form of equation
(slightly modification of the original equation).

#Generate the sample data frame
customer.name = c("John", "Mike", "Peter")
product = c("Toothpaste", "Toothpaste", "Toothpaste")
cost = c(30, 45, 40)
mydata = data.frame(customer.name, product, cost)

#Original cost function - not used
#fcost = function(orders) 3.40 + (1.20 * orders^2)

#Slight modification of the cost function into a proper form for root
#finding: this sets up 3.40 + (1.20 * orders^2) - fcost = 0
f.to.findroot = function(orders, fcost) 3.40 + (1.20 * orders^2) - fcost

#Using rootSolve package which contains uniroot.all function
library(rootSolve)

#Using plyr package which contains adply function
library(plyr)

#Use uniroot function to find the 'orders' variable (from the f.to.findroot
function) for each customer and put it into no.of.orders column in
mysolution data frame
#Replace 'fcost' with 'cost' column from mydata
#Interval of 0 to 1,000 is to make the f.to.findroot function have both
negative and positive sign, otherwise uniroot.all will give an error
mysolution = data.frame(adply(mydata, 1, summarize, no.of.orders =
uniroot.all(f.to.findroot,interval = c(0,1000),fcost=cost)))
mysolution

#Remove the now-redundant mydata, as mysolution is an extended version of
#mydata
rm(mydata)

#Note uniroot.all can be used for both linear (e.g.orders^1) and non-linear
(e.g.orders^2) equations.
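As a base-R cross-check (my addition, not part of the original post): this
particular cost function can also be inverted analytically, orders =
sqrt((cost - 3.40)/1.20), and stats::uniroot() finds the same root when given
an interval whose endpoints bracket it.

```r
# Compare the closed-form inverse against stats::uniroot() for each cost.
cost <- c(30, 45, 40)
analytic <- sqrt((cost - 3.40) / 1.20)
numeric.root <- sapply(cost, function(fc)
  uniroot(function(orders) 3.40 + 1.20 * orders^2 - fc,
          interval = c(0, 1000), tol = 1e-9)$root)
round(analytic, 4)
round(numeric.root, 4)   # matches the analytic values
```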


Thank you,
Prakasit Singkateera

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] recode data according to quantile breaks

2013-02-19 Thread D. Alain
Dear R-List, 

I would like to recode my data according to quantile breaks, i.e. all data 
within the range of 0%-25% should get a 1, 25%-50% a 2 etc.
Is there a nice way to do this with all columns in a dataframe.

e.g.

df <-
data.frame(id = c("x01","x02","x03","x04","x05","x06"), a = c(1,2,3,4,5,6),
           b = c(2,4,6,8,10,12), c = c(1,3,9,12,15,18))

df
   id a  b  c
1 x01 1  2  1
2 x02 2  4  3
3 x03 3  6  9
4 x04 4  8 12
5 x05 5 10 15
6 x06 6 12 18

#I can do it in very complicated way


apply(df[-1],2,quantile)
   a    b    c
0%   1.0  2.0  1.0
25%  2.2  4.5  4.5
50%  3.5  7.0 10.5
75%  4.8  9.5 14.2
100% 6.0 12.0 18.0

#then 

df$a[df$a <= 2.2] <- 1
...

#result should be


df.breaks

   id a b c
  x01 1 1 1
  x02 1 1 1
  x03 2 2 2
  x04 3 3 3
  x05 4 4 4
  x06 4 4 4



But there must be a way to do it more elegantly, something like


df.breaks <- apply(df[-1], 2, recode.by.quantile)

Can anyone help me with this?


Best wishes 


Alain      
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating a new variable.

2013-02-19 Thread Rui Barradas

Hello,

Try the following.

levels <- c("democrat", "republican", "other")
dem <- c(1,1,1,1,0,0,0,0)
rep <- c(1,1,1,0,0,0,0,0)
other <- c(1,0,0,0,0,0,0,0)

party <- factor(rep(levels, c(sum(dem), sum(rep), sum(other))))
party
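A quick sanity check of the idea (my addition): rep() repeats each level name
by its count, and table() confirms the composition of the resulting factor.

```r
# Build the factor from level names and their counts, then tabulate it.
levels <- c("democrat", "republican", "other")
counts <- c(4, 3, 1)    # i.e. sum(dem), sum(rep), sum(other) from above
party <- factor(rep(levels, counts), levels = levels)
table(party)
# gives counts 4, 3, 1 for democrat, republican, other
```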


Hope this helps,

Rui Barradas

Em 19-02-2013 00:01, Nicole Ford escreveu:

hello, all.

in my previous research, i have always used existing data.  i am trying
something new as an exploratory exercise and have never created my own variable
from scratch.

essentially, i am creating a variable for party affiliation.

here is an example.

var =party.

levels= democrat, republican, other.

respondents will indicate which category they fall under.

for the sake of ease, i will use small data as an example.

i was thinking the levels would need to be created first-

dem <- c(1,1,1,1,0,0,0,0)

rep <- c(1,1,1,0,0,0,0,0)

other <- c(1,0,0,0,0,0,0,0)

then, i could do:

party <- cbind(dem, rep, other)

par1 <- factor(party)

this is where i am getting stuck...  any thoughts would be appreciated.

i promise this isn't homework.  i am trying to understand how i would go about 
creating variables if i choose to go in this direction in the future...  and 
work out the kinks now.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] recode data according to quantile breaks

2013-02-19 Thread Jorge I Velez
Hi Alain,

The following should get you started:

apply(df[,-1], 2, function(x) cut(x, breaks = quantile(x), include.lowest =
TRUE, labels = 1:4))

Check ?cut and ?apply for more information.
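Applied to the df from the question (a sketch I am adding; as.integer() on the
cut() factor gives plain 1-4 quartile codes, equivalent to labels = 1:4):

```r
# Recode every numeric column into its quartile number (1-4).
df <- data.frame(id = c("x01","x02","x03","x04","x05","x06"),
                 a = c(1,2,3,4,5,6), b = c(2,4,6,8,10,12), c = c(1,3,9,12,15,18))
df.breaks <- data.frame(id = df$id,
  apply(df[, -1], 2, function(x)
    as.integer(cut(x, breaks = quantile(x), include.lowest = TRUE))))
df.breaks
#    id a b c
# 1 x01 1 1 1
# 2 x02 1 1 1
# 3 x03 2 2 2
# 4 x04 3 3 3
# 5 x05 4 4 4
# 6 x06 4 4 4
```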

HTH,
Jorge.-



On Tue, Feb 19, 2013 at 9:01 PM, D. Alain  wrote:

 Dear R-List,

 I would like to recode my data according to quantile breaks, i.e. all data
 within the range of 0%-25% should get a 1, 25%-50% a 2 etc.
 Is there a nice way to do this with all columns in a dataframe.

 e.g.

 df <-
 data.frame(id = c("x01","x02","x03","x04","x05","x06"), a = c(1,2,3,4,5,6),
            b = c(2,4,6,8,10,12), c = c(1,3,9,12,15,18))

 df
    id a  b  c
 1 x01 1  2  1
 2 x02 2  4  3
 3 x03 3  6  9
 4 x04 4  8 12
 5 x05 5 10 15
 6 x06 6 12 18

 #I can do it in very complicated way


 apply(df[-1], 2, quantile)
        a    b    c
 0%   1.0  2.0  1.0
 25%  2.2  4.5  4.5
 50%  3.5  7.0 10.5
 75%  4.8  9.5 14.2
 100% 6.0 12.0 18.0

 #then

 df$a[df$a <= 2.2] <- 1
 ...

 #result should be


 df.breaks

    id a b c
 x01  1 1 1
 x02  1 1 1
 x03  2 2 2
 x04  3 3 3
 x05  4 4 4
 x06  4 4 4



 But there must be a way to do it more elegantly, something like


 df.breaks <- apply(df[-1], 2, recode.by.quantile)

 Can anyone help me with this?


 Best wishes


 Alain
 [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] foreach loop, stata equivalent

2013-02-19 Thread Matthieu Stigler
Hi Nelissa

I hope the answers from Joshua and Milan helped you figure out a good
solution.

So basically in R you have two big parallel packages, the parallel/snow
described by Joshua, or the foreach one. If you would be to use foreach,
you would need probably a nested loop, which is described in detail in a
vignette:
http://stat.ethz.ch/CRAN/web/packages/foreach/vignettes/nested.pdf


I guess in any case you will want to be more specific in your loop
regarding what is returned, and select precisely the parameter you want
(instead of returning the whole object). In your case, function getTh()
will probably do the trick.

Now regarding the specificities of the threshold VECM. Unfortunately the lm
methods discussed by Joshua won't work here, and you cannot even use here
Joshua's trick to run multiple responses (a simple reason being that a VECM
is already a multiple response model). Threshold VECMs are really heavy
models, so do expect a very long time to run each model, so enormous time
to run 3000 models!

Do not hesitate to ask also on the dedicated mailing list:
ts...@googlegroups.com

best

Message: 36
Date: Mon, 18 Feb 2013 08:50:04 -0800
From: Joshua Wiley jwiley.ps...@gmail.com
To: Milan Bouchet-Valat nalimi...@club.fr
Cc: r-help@r-project.org r-help@r-project.org, Jamora, Nelissa
nelissa.jam...@agr.uni-goettingen.de
Subject: Re: [R] foreach loop, stata equivalent
Message-ID:
canz9z_ld_subcacw1xabbubqtds2kwu-9e2mn02f4g40h2u...@mail.gmail.com
Content-Type: text/plain; charset=UTF-8

If you have the same set of predictors for multiple outcomes, you can
speed up the process considerably by taking advantage of the fact that
the design matrix is the same for multiple outcomes.  As an example:

set.seed(10)
y <- matrix(rnorm(1 * 14), ncol = 14)
x <- matrix(rnorm(1 * 2), ncol = 2)
system.time(res <- lapply(1:14, function(i) lm(y[, i] ~ x)))
##   user  system elapsed
##   0.34    0.00    0.34
system.time(res2 <- lm(y ~ x))
##   user  system elapsed
##   0.05    0.02    0.06

lm can accept a matrix as the dependent variable.  So if various
combinations of variables predict all 14 outcomes, do not fit 14 *
number of combinations of predictors separately, do them in chunks for
substantial speedups.

Finally, as long as you are not using factors or any other fancy
things and can work with raw data matrices, use lm.fit() instead of
lm(), which is a high-level function.  lm.fit() is not especially
clever; it just expects the design matrix and the response matrix.
It will not add an intercept by default, so column-bind a vector of
1s onto your design matrix.

system.time(res3 <- lm.fit(cbind(1, x), y))
##   user  system elapsed
##   0.02    0.00    0.01

These three methods produce identical results:

res[[1]]
## Call:
## lm(formula = y[, i] ~ x)
## Coefficients:
## (Intercept)   x1   x2
##  0.0014401   -0.0198232   -0.0005721
res2[[1]][, 1]
##   (Intercept)x1x2
##  0.0014401149 -0.0198232209 -0.0005720764
res3[[1]][, 1]
##x1x2x3
##  0.0014401149 -0.0198232209 -0.0005720764

In other words, fitting all responses at once (so the design matrix can
be reused) and using lower-level functions for a simple, specific,
repetitive task takes the estimation time from .34 down to .01,
compared with fitting each response one at a time.  You would need 34
cores to achieve that speedup by simply throwing more hardware at the
problem, as opposed to using more optimized code.  Of course you can
do both, and probably get the results pretty quickly.

Cheers,

Josh

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] introducing jitter in overlapping graphs using ggplots (plotmeans). Also sciplot.

2013-02-19 Thread Anna Zakrisson
Hi,

I want to plot means with standard deviations of Total Nitrogen (TN) across 
4 stations (S1-S4) and over 3 years (2007-2009). I want this plot in one 
panel.
I do not want medians (bwplot, boxplot).

I have tried a few different packages and it seems that gplots with
plotmeans was the fastest (I am not extremely skilled in writing my own
scripts). Unfortunately, there is no grouping argument in plotmeans().
Therefore, I have overlaid three plots (plotting all stations within year).
However, as the graphs are exactly overlaid, it is not possible to
distinguish the confidence intervals. How do I introduce jitter?
Data2007, Data2008 and Data2009 are the dataframes with the data from that 
particular year including all the stations.

I have also tried lineplot.CI from the package sciplot and also there I am 
not able to introduce jitter. I have looked at the helpfiles.


PACKAGE GPLOTS:

library(gplots)
par(family = "serif", font = 1)
plotmeans(TN ~ STATION,
  data = Data2007,
  bars = TRUE, p = 0.95,
  pch = 1,
  cex = 0.8,
  n.label = FALSE,
  mean.labels = FALSE,
  barwidth = 1.5, barcol = "black",
  connect = TRUE,
  xlab = "Station",
  ylab = expression(paste("TN (", mu, "g ", L^{-1}, ")")))
par(new=T)
plotmeans(TN ~ STATION,
  data = Data2008,
  bars = TRUE, p = 0.95,
  pch = 2,
  cex = 0.8,
  lty = 2,
  n.label = FALSE,
  mean.labels = FALSE,
  barwidth = 1.5, barcol = "black",
  connect = TRUE,
  na.action = na.exclude,
  yaxt = 'n', ann = FALSE,
  xaxt = "n")
par(new=T)
plotmeans(TN ~ STATION,
  data = Data2009,
  bars = TRUE, p = 0.95,
  pch = 3,
  cex = 0.8,
  lty = 3,
  n.label = FALSE,
  mean.labels = FALSE,
  barwidth = 1.5, barcol = "black",
  connect = TRUE,
  na.action = na.exclude,
  yaxt = 'n', ann = FALSE,
  xaxt = "n")

PACKAGE SCIPLOT:
library(sciplot)
lineplot.CI(response = TN, x.factor = STATION, group = YEAR,
    ci.fun = function(x) c(mean(x) - sd(x), mean(x) + sd(x)),
    data = Mydata,
    xlab = "Station",
    ylab = expression(paste("TN (", mu, "g ", L^{-1}, ")")))

#here I have calculated the standard deviations instead of the default 
standard errors.
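For what it's worth, here is one base-graphics way to get the offset the
poster asks for (my sketch with made-up data, not from the thread; plotmeans
itself has no grouping or offset argument): compute mean and sd per
station/year yourself and draw each year's error bars at a slightly shifted x
position so they no longer overlap.

```r
# Summarize TN by station and year, then plot with a per-year x offset.
set.seed(1)
Mydata <- data.frame(TN = rnorm(120, 300, 40),
                     STATION = rep(paste0("S", 1:4), each = 10, times = 3),
                     YEAR = rep(2007:2009, each = 40))
agg <- aggregate(TN ~ STATION + YEAR, data = Mydata,
                 FUN = function(x) c(m = mean(x), s = sd(x)))
agg <- cbind(agg[, 1:2], agg$TN)                    # flatten the matrix column
off  <- (as.numeric(factor(agg$YEAR)) - 2) * 0.15   # -0.15, 0, +0.15 per year
xpos <- as.numeric(factor(agg$STATION)) + off
plot(xpos, agg$m, pch = as.numeric(factor(agg$YEAR)), xaxt = "n",
     xlab = "Station", ylab = "TN",
     ylim = range(agg$m - agg$s, agg$m + agg$s))
arrows(xpos, agg$m - agg$s, xpos, agg$m + agg$s,    # mean +/- sd bars
       angle = 90, code = 3, length = 0.04)
axis(1, at = 1:4, labels = levels(factor(agg$STATION)))
```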

Thank you for taking your time. I have spent a long time trying to solve 
this and the frustration slowly approaches anger (in myself)  :-)

Yours sincerely

Anna Zakrisson Braeunlich
PhD Student

Department of Systems Ecology
Stockholm University
Svante Arrheniusv. 21A
SE-106 91 Stockholm

Lives in Berlin.
For fast paper-mail responses, please use the below address:
Katzbachstr. 21
D-10965, Berlin - Kreuzberg
Germany/Deutschland

E-mail: a...@ecology.su.se
Tel work: +49-(0)3091541281
Mobile: +49-(0)15777374888
Web site: http://www.ecology.su.se/staff/personal.asp?id=163

º`•. . • `•. .• `•. . º`•. . • `•. .• 
`•. .º

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mtext unicode failure

2013-02-19 Thread e-letter
On 19/02/2013, r-help-requ...@r-project.org
r-help-requ...@r-project.org wrote:
 --

 Message: 22
 Subject: Re: [R] mtext unicode failure

 On 18/02/2013 14:15, e-letter wrote:
 Readers,

 How to solve this unicode input error please?

 On what system, in what locale?  And where is Unicode mentioned?


GNU/Linux; obtained locale settings like so:

Sys.localeconv()
    decimal_point     thousands_sep          grouping   int_curr_symbol
              "."                ""                ""             "GBP"
  currency_symbol mon_decimal_point mon_thousands_sep      mon_grouping
              "£"               "."               ","        "\003\003"
    positive_sign     negative_sign   int_frac_digits       frac_digits
               ""               "-"               "2"               "2"
    p_cs_precedes    p_sep_by_space     n_cs_precedes    n_sep_by_space
              "1"               "0"               "1"               "0"
      p_sign_posn       n_sign_posn
              "1"               "1"

The unicode was in the 'mtext...' command, but it seems that the
mailing list server does not accept UTF-8. Tried to set this encoding:

postscript([filename], encoding = 'UTF-8', ...)
Error in postscript([filename], encoding = "UTF-8",  :
  failed to load encoding file in postscript()
In addition: Warning message:
In postscript([filename], encoding = "UTF-8",  :
  failed to load encoding file 'UTF-8'

 If you intended a subscript 2, use plotmath. That is not a character in
 the standard Postscript fonts, nor is in the encoding you selected.  We
 don't know what that is since it depends on your locale: see ?postscript.

 Or use the cairo_ps device.

Thanks, the result is adequate, although the position of the subscript
doesn't look pleasing personally.
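For readers hitting the same issue, a minimal plotmath sketch (my addition,
not part of the original exchange): the subscript is typeset by the graphics
device rather than taken from a Unicode glyph, so it works even with plain
postscript() and its default encoding.

```r
# CO[2] in plotmath produces a subscript-2 without needing a Unicode
# subscript character in the PostScript font.
postscript("co2-demo.ps")
plot(1:10, 1:10)
mtext(expression(CO[2] ~ concentration), side = 3)
dev.off()
```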

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] foreach loop, stata equivalent

2013-02-19 Thread Uwe Ligges



On 18.02.2013 13:39, Jamora, Nelissa wrote:

Hi! I'm a recent convert from Stata, so forgive my ignorance.



In Stata, I can write foreach loops (example below)



foreach var of varlist p1-p14 {

foreach y of varlist p15-p269 {

   reg `var' `y'

}

}



It's looping p1-p15, p1-p16, p1-p269, p2-p15, p2-p16,... p2-p269,...
variable pairs.



How can I write something similar in R?



help("for")

Uwe Ligges
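To spell the base-R translation out (my sketch, not Uwe's reply; it assumes
the p* variables sit as columns in a data frame dat, and uses only a few
columns to keep the example small):

```r
# One lm() per (var, y) pair, collected in a named list -- the direct
# analogue of the nested Stata foreach loops.
set.seed(1)
dat <- data.frame(matrix(rnorm(50 * 6), ncol = 6,
                         dimnames = list(NULL, paste0("p", 1:6))))
fits <- list()
for (v in paste0("p", 1:2)) {       # stands in for p1-p14
  for (y in paste0("p", 3:6)) {     # stands in for p15-p269
    fits[[paste(y, v, sep = "~")]] <-
      lm(reformulate(v, response = y), data = dat)
  }
}
length(fits)   # 8 fits, one per (v, y) pair
```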




I 'tried' understanding the package.foreach but can't get it to work.



Thanks for any help

Nelissa


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using lm to estimate a parameter?

2013-02-19 Thread hellen
Hi,
I have a dataset with three variables (X, Y, Z) and an equation
Z = X/(1 + L*X/Y), where L is a constant that needs to be estimated from the
data. How should I write the formula in lm, or is it even possible to fit a
linear model in this case?

Thanks!
Hallen
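One possible answer (my sketch, not a reply from the thread): the model
linearizes. Z = X/(1 + L*X/Y) implies X/Z - 1 = L*(X/Y), a regression through
the origin, so lm() can estimate L; with measurement noise on Z, nls() on the
original form is the more defensible choice.

```r
# Simulate data with a known L, then recover it from the linearized form.
set.seed(1)
X <- runif(50, 1, 10)
Y <- runif(50, 1, 10)
L.true <- 2
Z <- X / (1 + L.true * X / Y)
fit <- lm(I(X/Z - 1) ~ 0 + I(X/Y))   # no intercept: the line passes through 0
coef(fit)   # recovers L.true (exactly here, since Z is noise-free)
# With noisy Z, fit the untransformed model directly:
# nls(Z ~ X/(1 + L*X/Y), start = list(L = 1))
```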

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Uplift modeling with R ?

2013-02-19 Thread Franck . BERTHUIT
Thank you Bert, but I already stumbled upon that kind of article (especially
those of Radcliffe, the precursor). I'm looking for something more practical; if
not a package, then a piece of code I could use in R.
But maybe it's a good opportunity for me to work on my developer's skills ;-).

Bye.

Franck Berthuit
France



-Bert Gunter gunter.ber...@gene.com wrote: -
To: franck.berth...@maif.fr
From: Bert Gunter gunter.ber...@gene.com
Date: 18/02/2013 19:12
Cc: r-help@r-project.org
Subject: Re: [R] Uplift modeling with R ?

Search! Google on "uplift modeling in R" or similar. I got:

http://blog.data-miners.com/2009/12/differential-response-or-uplift.html

There is undoubtedly more.

-- Bert

On Mon, Feb 18, 2013 at 9:38 AM,  franck.berth...@maif.fr wrote:
 Hello R'users,

 I've tried to find a package or some code about Uplift modeling within R (or 
 Netlift modeling, Incremental or Differential response modeling) but failed.

 If you have any clue of source about Uplift modeling with R on the web, i 
 would appreciate to share it with you.

 Thanks beforehand.


 Franck Berthuit
 France


 =
         [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

=
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random number generator used in 'runif'

2013-02-19 Thread Mauricio Zambrano-Bigiarini
2013/2/18 Daniel Nordlund djnordl...@frontier.com:
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Mauricio Zambrano-Bigiarini
 Sent: Monday, February 18, 2013 1:33 AM
 To: r-help@r-project.org
 Subject: [R] Random number generator used in 'runif'

 Dear list,

 For the implementation of a particular optimization algorithm it is
 very important the random number generator.

 I would like to know if somebody could tell me what is the random
 number generator used by default in the 'runif' function.

 From the help page of 'runif' and '.Random.seed' I guess that the
 default algorithm is 'Mersenne-Twister', but I would be really
 grateful if somebody with experience in random number generators could
 confirm that or tell me what is the method actually used.

 No guessing necessary, as the R help page is quite explicit

Thank you very much for all the replies.

I was asking because I did not see any 'rng.kind' argument within the
'runif' function, and I wanted to be sure that what is mentioned in
the help page of '.Random.seed' is valid for 'runif' and all the
related functions.
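For completeness (an added sketch, not part of Mauricio's message): RNGkind()
reports, and set.seed(kind = ...) or RNGkind() select, the generator that
runif() and the other r* functions all share.

```r
# Query the uniform generator in force; a fresh R session reports
# "Mersenne-Twister" as the first element.
RNGkind()
# Select it explicitly and confirm that draws are reproducible:
set.seed(123, kind = "Mersenne-Twister")
u1 <- runif(3)
set.seed(123, kind = "Mersenne-Twister")
u2 <- runif(3)
identical(u1, u2)
# [1] TRUE
```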

Thanks again,

Mauricio


-- 
==
Linux user #454569 -- Ubuntu user #17469
==
Where ambition ends happiness begins
(Source Unknown)
==
http://www.r-project.org/posting-guide.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] latin hypercube sampling

2013-02-19 Thread Rob Carnell
Aimee Kopolow alj27 at georgetown.edu writes:
 
 Hi all,
 
 I am attempting to use latin hypercube sampling to sample different
 variable functions in a series of simultaneous differential equations.
 There is very little code online about lhs or clhs, so from different
 other help threads I have seen, it seems I need to create a
 probability density function for each variable function, and then use
 latin hypercube sampling on this pdf.
 
 So far, I have created a data frame consisting of the y output of
 density(functionX) for each of the different functions I wish to
 sample. [examples of functions include T1(t), WL1(T1,t),
 BE1(WL1,T1,t)] The dataframe consists of 512 rows/vectors for each
 function.
 
 I tried running
 res <- clhs(df, size = 500, iter = 2000, progress = FALSE, simple = TRUE)
 
 and it returned a single series of 500 samples, rather than a series
 of 500 samples per function.
 
 I ultimately need a sample of each variable function that I can run
 through my model, putting each individual variable function as a
 constant instead, and then performing PRCC. Is there anyone who can
 advise on how to do this, or failing that, where I should look for
 sample code?
 
 Thank you for any help you are able to give,
 Aimee.
 
 

Aimee,

I'm the package maintainer for the lhs package.  Unfortunately, I'm not familiar
with the functions you mentioned (reproducible code would help us answer your
post).  I will try to show something parallel to what you described.

require(lhs)

# functions you described
T1 <- function(t) t*t
WL1 <- function(T1, t) T1*t
BE1 <- function(WL1, T1, t) WL1*T1*t

# t is distributed according to some pdf (e.g. normal)
require(lhs)
# draw a lhs with 512 rows and 3 columns (one for each function)
y - randomLHS(512, 3)
# transform the three columns to a normal distribution (these could be any
# distribution)
t <- apply(y, 2, function(columny) qnorm(columny, 2, 1))
# transform t using the functions provided
result <- cbind(
  T1(t[,1]),
  WL1(T1(t[,2]), t[,2]),
  BE1(WL1(T1(t[,3]), t[,3]), T1(t[,3]), t[,3])
)
# check the results
# these should be approximately uniform
windows()
par(mfrow=c(2,2))
apply(y, 2, hist, breaks=50)
# these should be approximately normal
windows()
par(mfrow=c(2,2))
apply(t, 2, hist, breaks=50)
# these should be the results of the functions
windows()
par(mfrow=c(2,2))
apply(result, 2, hist, breaks=50)

Please feel free to contact me as the package maintainer if you need additional
help with lhs.

Rob



Re: [R] R nls results different from those of Excel ??

2013-02-19 Thread Prof J C Nash (U30A)

This thread unfortunately pushes a number of buttons:

- Excel computing a model by linearization which fits to
residual = log(data) - log(model)

rather than
wanted_residual = data - model

The COBB.RES example in my (freely available but rather dated) book
at http://macnash.telfer.uottawa.ca/nlpe/  shows an example where
comparing the results shows how extreme the differences can be.

- nls not doing well when the fit is near perfect. Package nlmrt is 
happy to compute such models, which have a role in approximation. The 
builders of nls() are rather (too?) insistent that nls() is a 
statistical function rather than simply nonlinear least squares. I can 
agree with their view in its context, but not for a general scientific 
computing package that R has become. It is one of the gotchas of R.


- Rolf's suggestion to inform Microsoft is, I'm sure, made with the sure 
knowledge that M$ will ignore such suggestions. They did, for example, 
fix one financial function temporarily (I don't know which). However, 
one of Excel's maintainers told me he would disavow admitting that 
"Bill called" to tell them to put the bug back in, because the president 
of a large American bank called to complain that his 1998 profit and loss 
spreadsheet had changed in the new version of Excel. Appearances are 
more important than getting things right. At the same conference where 
this "I won't admit I told you" conversation took place, a presentation 
was made estimating that 95% of major investment decisions were made 
based on Excel spreadsheets. The conference took place before the 2008 
crash. One is tempted to make non-statistical inferences.



JN



On 13-02-19 06:00 AM, r-help-requ...@r-project.org wrote:

Message: 79
Date: Mon, 18 Feb 2013 22:40:25 -0800
From: Jeff Newmillerjdnew...@dcn.davis.ca.us
To: Greg Snow538...@gmail.com, David Gwenzidgwe...@gmail.com
Cc: r-helpr-help@r-project.org
Subject: Re: [R] R nls results different from those of Excel ??
Message-ID:50c09528-7917-4a20-ad0e-5f4ebf9d0...@email.android.com
Content-Type: text/plain; charset=UTF-8

Excel definitely does not use nonlinear least squares fitting for power curve 
fitting. It uses linear LS fitting of the logs of x and y. There should be no 
surprise in the OP's observation.
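The difference is easy to demonstrate in R. A hedged sketch on simulated data (the constants 44 and 2 echo the OP's trendline, but the data here are made up):

```r
# Power-law data with multiplicative (log-normal) noise
set.seed(1)
x <- runif(100, 1, 10)
y <- 44 * x^2 * exp(rnorm(100, sd = 0.5))

# What Excel's power trendline does: ordinary LS on the logs
fit.log <- lm(log(y) ~ log(x))
a.excel <- unname(exp(coef(fit.log)[1]))
b.excel <- unname(coef(fit.log)[2])

# True nonlinear least squares on the original scale
fit.nls <- nls(y ~ a * x^b, start = list(a = a.excel, b = b.excel))

# The two sets of coefficients generally differ, because they minimize
# different residuals: log(y) - log(model) versus y - model.
c(a.excel = a.excel, b.excel = b.excel)
coef(fit.nls)
```

Neither answer is "wrong"; they answer different least-squares questions, which is exactly why the OP sees different coefficients.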
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.us Basics: ##.#.   ##.#.  Live Go...
   Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
---
Sent from my phone. Please excuse my brevity.

Greg Snow538...@gmail.com  wrote:


Have you plotted the data and the lines to see how they compare?  (see
fortune(193)).

Is there error around the line in the data?  The nls function is known to
not work well when there is no error around the line.  Also check and make
sure that the 2 methods are fitting the same model.

You might consider taking the log of both sides of the function to turn it
into a linear function and using lm to fit the logs.

On Mon, Feb 18, 2013 at 9:49 PM, David Gwenzidgwe...@gmail.com
wrote:


Hi all

I have a set of data whose scatter plot shows a very nice power
relationship. My problem is when I fit a Power Trend Line in an Excel
spreadsheet, I get the model y= 44.23x^2.06  with an R square value of

0.72.

Now, if I input the same data into R and use

model <- nls(y ~ a*x^b, trace=TRUE, data = my_data, start = c(a=40, b=2))

I get a solution with a = 246.29 and b = 1.51. I have tried several
starting values and this is what I always get. I was expecting to get a
value of a close to 44 and that of b close to 2. Why are these values of
a and b so different from those Excel gave me? Also the R square value
for the nls model is as low as 0.41. What have I done wrong here? Please help.

Thanks

in advance

David






Re: [R] Using lm to estimate a parameter?

2013-02-19 Thread Uwe Ligges



On 19.02.2013 11:23, hellen wrote:

Hi,
I have a dataset with three variables (X,Y,Z) and I have an equation as
Z=X/(1+L*X/Y) where L is a constant which need to be estimated from data.
How should I write the formula in lm or is it possible to fit a linear
model in this case?


Neither, it is nonlinear in the parameters. See ?nls or ?optim, for example.
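A minimal sketch of the nls route (simulated data; the "true" value L = 0.7 is an arbitrary choice for illustration):

```r
# Z = X/(1 + L*X/Y) with a single unknown parameter L
set.seed(42)
X <- runif(50, 1, 10)
Y <- runif(50, 1, 10)
Z <- X / (1 + 0.7 * X / Y) + rnorm(50, sd = 0.01)   # small additive noise

# nls minimizes sum((Z - X/(1 + L*X/Y))^2) over L
fit <- nls(Z ~ X / (1 + L * X / Y), start = list(L = 0.5))
coef(fit)    # estimate of L, close to the true 0.7 here
```

With real data the starting value may matter more; summary(fit) gives a standard error for the estimate.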

Uwe Ligges





Thanks!
Hallen






Re: [R] compare posterior samples from R2OpenBugs and R function bugs{R2WinBUGS}

2013-02-19 Thread Uwe Ligges



On 18.02.2013 05:24, Jia Liu wrote:

Hi all,

I used both OpenBugs and the R function bugs{R2WinBUGS} to run a linear mixed
effects model based on the same data set and initial values. I got the same
summary statistics but different posterior samples. However, if I order
these two sets of samples, one generated from OpenBugs and the other
generated from R, they turn out to be the same. And the samples from R do not
have any autocorrelation. I do not know why and how this R function destroys
the order of the posterior samples. Has anyone ever met this situation
before? Any idea is appreciated.


Not sure what you are looking at, since there is no reproducible example 
nor any code in your message.


However, I guess you came across a specific design decision by Andrew 
Gelman, who wrote some code of R2WinBUGS before it was turned into an R 
package. That feature is documented on the ?bugs help page:


"for convenience, the n.keep*n.chains simulations in sims.matrix and 
sims.list (but NOT sims.array) have been randomly permuted."


Best,
Uwe Ligges








Thanks,

Jia



sessionInfo()R version 2.15.1 (2012-06-22)

Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] R2WinBUGS_2.1-18 BRugs_0.8-0  coda_0.15-2  lattice_0.20-6

loaded via a namespace (and not attached):
[1] grid_2.15.1  tools_2.15.1






Re: [R] Getting WinBUGS Leuk example to work from R using R2winBUGS

2013-02-19 Thread Uwe Ligges



On 17.02.2013 08:47, Andy Cox wrote:

I am trying to learn to use winBUGS from R, I have experience with R.
I have managed to successfully run a simple example from R with no
problems. I have been trying to run the Leuk: Survival from winBUGS
examples Volume 1. I have managed to run this from winBUGS GUI with no
problems. My problem is, try as I might (and I have been trying and
searching for days), I cannot get it to run using R2winBUGS. I am sure
it is something simple.

The error message I get if I try and set inits in the script is

Error in bugs(data = L, inits = inits,
   parameters.to.save = params, model.file = "model.txt", :
   Number of initialized chains (length(inits)) != n.chains

I know this means I have not initialised some of the chains, but I am
pasting the inits code from winbugs examples manual and all other
settings seem to me to be the same as when run on the winBUGS GUI.



If I try inits=NULL I get another error message

 display(log)
check(C:/BUGS/model.txt)
model is syntactically correct
data(C:/BUGS/data.txt)
data loaded
compile(1)
model compiled
gen.inits()
shape parameter (r) of gamma dL0[1] too small -- cannot sample
thin.updater(1)
update(500)
command #Bugs:update cannot be executed (is greyed out)
set(beta)

Which indicates to me I will still have problems after solving the
first one!! I am about to give up on using winBUGS, please can someone
save me? I know I am probably going to look stupid, but everyone has
to learn:-)

I have also tried changing nc <- 2 (on advice, which doesn't work and
gives an uninitialised chain error)

I am using winBUGS 1.4.3 on Windows XP 2002 SP3

My R code is below, many thanks for at least reading this far.

rm(list = ls())

L <- list(N = 42, T = 17, eps = 1.0E-10,
 obs.t = c(1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12,
12, 15, 17, 22, 23, 6,
 6, 6, 6, 7, 9, 10, 10, 11, 13, 16, 17, 19, 20, 22, 23, 25, 32,
32, 34, 35),
 fail = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0,
0),
 Z = c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
-0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
-0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
-0.5),
 t = c(1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 15, 16, 17, 22, 23, 35))

### 5.4. Analysis using WinBUGS
library(R2WinBUGS)  # Load the R2WinBUGS library CHOOSE to use WinBUGS
 #library(R2OpenBUGS)# Load the R2OpenBUGS library CHOOSE
to use OpenBUGS
setwd("C://BUGS")

# Save BUGS description of the model to working directory
sink("model.txt")
cat("

model
{
# Set up data
for(i in 1:N) {
for(j in 1:T) {

# risk set = 1 if obs.t >= t
Y[i,j] <- step(obs.t[i] - t[j] + eps)
# counting process jump = 1 if obs.t in [ t[j], t[j+1] )
# i.e. if t[j] <= obs.t < t[j+1]
dN[i, j] <- Y[i, j] * step(t[j + 1] - obs.t[i] - eps) * fail[i]
}
}
# Model
for(j in 1:T) {
for(i in 1:N) {
dN[i, j] ~ dpois(Idt[i, j]) # Likelihood
Idt[i, j] <- Y[i, j] * exp(beta * Z[i]) * dL0[j] # Intensity
}
dL0[j] ~ dgamma(mu[j], c)
mu[j] <- dL0.star[j] * c # prior mean hazard
# Survivor function = exp(-Integral{l0(u)du})^exp(beta*z)
S.treat[j] <- pow(exp(-sum(dL0[1 : j])), exp(beta * -0.5));
S.placebo[j] <- pow(exp(-sum(dL0[1 : j])), exp(beta * 0.5));
}
c <- 0.001
r <- 0.1
for (j in 1 : T) {
dL0.star[j] <- r * (t[j + 1] - t[j])
}
beta ~ dnorm(0.0,0.01)
}


", fill=TRUE)
sink()

params <- c("beta", "S.placebo", "S.treat")

inits <- list( beta = 0.0,
  dL0 = c(1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,
  1.0,1.0,1.0,1.0,1.0,1.0, 1.0,1.0))



You need a list containing one list of initial values for each chain. 
For you nc=1 number of chains, this means:


 inits <- list(list( beta = 0.0,
   dL0 = c(1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,
   1.0,1.0,1.0,1.0,1.0,1.0, 1.0,1.0)))


Best,
Uwe Ligges



# MCMC settings
nc <- 1 # Number of chains
ni <- 1000  # Number of draws from posterior (for each chain)
ns <- 1000 # Number of sims (n.sims)
nb <- floor(ni/2)   # Number of draws to discard as burn-in
nt <- max(1, floor(nc * (ni - nb) / ns))# Thinning rate
Lout <- list()



# Start Gibbs sampler: run model in WinBUGS and save results in object called "out"
out <- bugs(data = L, inits = inits, parameters.to.save = params,
model.file = "model.txt",
n.thin = nt, n.chains = nc, n.burnin = nb, n.iter = ni, debug = T,
DIC = TRUE, digits = 5,
codaPkg = FALSE, working.directory = getwd())





Re: [R] Using lm to estimate a parameter?

2013-02-19 Thread Hans W Borchers
Uwe Ligges ligges at statistik.tu-dortmund.de writes:
 
 On 19.02.2013 11:23, hellen wrote:
  Hi,
  I have a data with three variables (X,Y,Z) and I have an equation as
  Z=X/(1+L*X/Y) where L is a constant which need to be estimated from data.
  How should I write the formula in lm or is it possible to fit a linear
  model in this case?
 
 Neither, it is nonlinear in the parameters. See ?nls or ?optim, for example.

Well, if the Z values are not too small, you can linearize it as

U = (X Y - Y Z) / Z = L X

and solve it with lm(U ~ X - 1), that is, without an intercept term.
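A short check of this linearization on simulated data (the true value L = 0.7 is an arbitrary choice for illustration):

```r
# If Z = X/(1 + L*X/Y), then 1/Z = 1/X + L/Y, and multiplying through
# by X*Y gives U = (X*Y - Y*Z)/Z = L*X, so a no-intercept fit recovers L.
set.seed(42)
X <- runif(50, 1, 10)
Y <- runif(50, 1, 10)
Z <- X / (1 + 0.7 * X / Y)      # noise-free for clarity

U <- (X * Y - Y * Z) / Z
coef(lm(U ~ X - 1))             # slope equals 0.7 up to floating point
```

With noisy Z the transformation also rescales the errors, so the lm estimate is best treated as a starting value for nls rather than a final answer.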

 Uwe Ligges
 
 
  Thanks!
  Hallen
 




Re: [R] R nls results different from those of Excel ??

2013-02-19 Thread Hans W Borchers
Jeff Newmiller jdnewmil at dcn.davis.ca.us writes:
 
 Excel definitely does not use nonlinear least squares fitting for power
 curve fitting. It uses linear LS fitting of the logs of x and y. There
 should be no surprise in the OP's observation.

May I be allowed to say that the general comments on MS Excel may be alright,
in this special case they are not.  The Excel Solver -- which is made by an
external company, not MS -- has a good reputation for being fast and accurate.
And it indeed solves least-squares and nonlinear problems better than some of
the solvers available in R.
There is a professional version of this solver, not available from Microsoft,
that could be called excellent. We, and this includes me, should not be too 
arrogant towards the outside, non-R world, the 'barbarians' as the ancient 
Greeks called them.

Hans Werner




[R] Cramer von Mises test for a discrete distribution

2013-02-19 Thread Santiago Guallar
Hi,
 
I'm trying to carry out Cramer von Mises tests between pairs of vectors 
belonging to a discrete distribution (concretely frequencies from 0 to 200). 
However, the program crashes in the attempt. The problem seems to be that these 
vectors only have positive integer numbers (+ zero). When I add a random very 
small positive decimal to the non-decimal part everything works fine (files 
prm1 & prpmr1). I attach two of these vectors so you can run the following 
code. I've also thought to divide both vectors by a real constant such as pi. 
Do you think these two approaches are acceptable?
 
setwd()
require(CvM2SL2Test)
prm = scan('prm.txt')
prpmr = scan('prpmr.txt')
ct1 = cvmts.test(prm, prpmr) # here R crashes
ct1
cvmts.pval( ct1, length(prm), length(prpmr) )
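The tie-breaking workaround described above could be written as follows (a sketch with stand-in data; whether cvmts.test from CvM2SL2Test accepts the result still has to be verified on the real vectors):

```r
# Integer-valued stand-ins for the attached prm/prpmr vectors
set.seed(7)
prm   <- sample(0:200, 300, replace = TRUE)
prpmr <- sample(0:200, 300, replace = TRUE)

# Add a tiny uniform decimal to break the ties that appear to cause the
# crash; offsets this small should not change the statistic meaningfully
prm.j   <- prm   + runif(length(prm),   0, 1e-6)
prpmr.j <- prpmr + runif(length(prpmr), 0, 1e-6)

# library(CvM2SL2Test)                       # not run here
# ct1 <- cvmts.test(prm.j, prpmr.j)
# cvmts.pval(ct1, length(prm.j), length(prpmr.j))
```

Dividing by an irrational constant such as pi, by contrast, rescales both samples identically and so does not break ties between them.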
 
 
Thank you for your help,
 
Santi

[attachment omitted: two numeric vectors ('prm.txt' / 'prpmr.txt', roughly 250 values in the range 0-202) that spilled verbatim into the archive here]

Re: [R] introducing jitter in overlapping graphs using ggplots (plotmeans). Also sciplot.

2013-02-19 Thread John Kane
Hi Anna,
A small point -- there is no package called ggplots.  There is a package called 
gplots and one called ggplot2, which in earlier form was called ggplot.

To see what is happening I believe we need some sample data from  the three 
data  files or some mock-up data that matches your actual data in form.

 The easiest way to supply data  is to use the dput() function.  Example with 
your file named testfile: 
dput(testfile) 
Then copy the output and paste into your email.  For large data sets, you can 
just supply a representative sample.  Usually, 
dput(head(testfile, 100)) will be sufficient.   

In this case it looks like we would want three dput results, one for each year.
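For example, with a hypothetical toy frame standing in for one year's data:

```r
# A made-up stand-in for Data2007
Data2007 <- data.frame(STATION = c("S1", "S2", "S3", "S4"),
                       TN      = c(310, 287, 295, 302))

# dput() prints a structure() call that list readers can paste
# straight into R to rebuild the object exactly
dput(Data2007)
```

Pasting the printed structure() call into a reply reproduces the data frame, including column types, on anyone's machine.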

John Kane
Kingston ON Canada


 -Original Message-
 From: a...@ecology.su.se
 Sent: Tue, 19 Feb 2013 12:38:25 +0100
 To: r-help@r-project.org
 Subject: [R] introducing jitter in overlapping graphs using ggplots
 (plotmeans). Also sciplot.
 
 Hi,
 
 I want to plot means with standard deviations of Total Nitrogen (TN)
 across
 4 stations (S1-S4) and over 3 years (2007-2009). I want this plot in one
 panel.
 I do not want medians (bwplot, boxplot).
 
 I have tried a few different packages and it seems that ggplots with
 plotmeans was the fastest (I am not extremely skilled in writing my own
 scripts). Unfortunately, there is no grouping argument in plotmeans().
 Therefore, I have overlaid three plots (plotting all stations witin
 year).
 However, as the graphs are exctly overlaid, it is not possible to
 distinguish the confidence intervals. How do I introduce jitter?
 Data2007, Data2008 and Data2009 are the dataframes with the data from
 that
 particular year including all the stations.
 
 I have also tried lineplot.CI from the package sciplot and also there I
 am
 not able to introduce jitter. I have looked at the helpfiles.
 
 
 PACKAGE GGPLOT:
 
 library(gplots)
 par(family="serif", font=1)
 plotmeans(TN ~ STATION,
   data = Data2007,
   bars=TRUE, p=0.95,
   pch=1,
   cex=0.8,
   n.label=FALSE,
   mean.labels=FALSE,
   barwidth=1.5, barcol="black",
   connect=TRUE,
   xlab="Station",
   ylab = expression(paste("TN (", mu, "g ", L^{-1}, ")")))
 par(new=T)
 plotmeans(TN ~ STATION,
   data = Data2008,
   bars=TRUE, p=0.95,
   pch=2,
   cex=0.8,
   lty=2,
   n.label=FALSE,
   mean.labels=FALSE,
   barwidth=1.5, barcol="black",
   connect=TRUE,
   na.action=na.exclude,
   yaxt='n', ann=FALSE,
   xaxt="n")
 par(new=T)
 plotmeans(TN ~ STATION,
   data = Data2009,
   bars=TRUE, p=0.95,
   pch=3,
   cex=0.8,
   lty=3,
   n.label=FALSE,
   mean.labels=FALSE,
   barwidth=1.5, barcol="black",
   connect=TRUE,
   na.action=na.exclude,
   yaxt='n', ann=FALSE,
   xaxt="n")
 
 PACKAGE SCIPLOT:
 library(sciplot)
 lineplot.CI(response=TN, x.factor=STATION, group=YEAR,
   ci.fun = function(x) c(mean(x)-sd(x), mean(x)+sd(x)),
   data = Mydata,
   xlab="Station",
   ylab = expression(paste("TN (", mu, "g ", L^{-1}, ")")))
 
 #here I have calculated the standard deviations instead of the default
 #standard errors.
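One way to get the horizontal offsets Anna asks about, using only base graphics (a sketch on simulated data; the data frame and column names are hypothetical stand-ins for her files):

```r
# Means and SDs of TN per YEAR x STATION, plotted with manual x offsets
set.seed(1)
Mydata <- data.frame(STATION = rep(paste0("S", 1:4), each = 30),
                     YEAR    = rep(2007:2009, 40),
                     TN      = rnorm(120, mean = 300, sd = 40))

m <- with(Mydata, tapply(TN, list(YEAR, STATION), mean))  # 3 x 4 matrix
s <- with(Mydata, tapply(TN, list(YEAR, STATION), sd))

# Shift each year's points by -0.1, 0, +0.1 around the station position
x0 <- col(m) + (row(m) - 2) * 0.1

matplot(t(x0), t(m), pch = 1:3, col = 1, xlim = c(0.5, 4.5),
        xaxt = "n", xlab = "Station", ylab = "TN")
axis(1, at = 1:4, labels = colnames(m))
arrows(t(x0), t(m - s), t(x0), t(m + s),
       angle = 90, code = 3, length = 0.03)    # +/- 1 SD error bars
legend("topright", legend = rownames(m), pch = 1:3, bty = "n")
```

The same offset idea carries over to plotmeans: shifting each year's x positions slightly is what keeps the three sets of error bars from landing exactly on top of each other.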
 
 Thank you for taking your time. I have spent a long time trying to solve
 this and the frustration slowly approaches anger (in myself)  :-)
 
 Yours sincerely
 
 Anna Zakrisson Braeunlich
 PhD Student
 
 Department of Systems Ecology
 Stockholm University
 Svante Arrheniusv. 21A
 SE-106 91 Stockholm
 
 Lives in Berlin.
 For fast paper-mail responses, please use the below address:
 Katzbachstr. 21
 D-10965, Berlin - Kreuzberg
 Germany/Deutschland
 
 E-mail: a...@ecology.su.se
 Tel work: +49-(0)3091541281
 Mobile: +49-(0)15777374888
 Web site: http://www.ecology.su.se/staff/personal.asp?id=163
 
 





Re: [R] R nls results different from those of Excel ??

2013-02-19 Thread Jeff Newmiller
I use Excel regularly, and do not consider this a slam... it is a fact. I am 
aware of Solver, but the query was about trend lines generated by Excel. In 
general it is possible to do arbitrarily complex computations with a four 
function calculator, but we don't describe that as something the calculator 
does. Setting up a Solver sheet to obtain trend coefficients is not a typical 
way to obtain them in Excel... that would be the user's doing, not Excel's. 
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Hans W Borchers hwborch...@googlemail.com wrote:

Jeff Newmiller jdnewmil at dcn.davis.ca.us writes:
 
 Excel definitely does not use nonlinear least squares fitting for power
 curve fitting. It uses linear LS fitting of the logs of x and y. There
 should be no surprise in the OP's observation.

May I be allowed to say that the general comments on MS Excel may be alright,
in this special case they are not.  The Excel Solver -- which is made by an
external company, not MS -- has a good reputation for being fast and accurate.
And it indeed solves least-squares and nonlinear problems better than some of
the solvers available in R.
There is a professional version of this solver, not available from Microsoft,
that could be called excellent. We, and this includes me, should not be too
arrogant towards the outside, non-R world, the 'barbarians' as the ancient
Greeks called it.

Hans Werner






Re: [R] How to do a backward calculation for each record in a dataset

2013-02-19 Thread Berend Hasselman

On 19-02-2013, at 09:55, Prakasit Singkateera asltjoey.rs...@gmail.com wrote:

 Hi everyone,
 
 From your helps of giving me several ideas, eventually I can solve the posted 
 problem. Here is the R code. It can be done by applying the uniroot.all to 
 the data frame together with the proper form of equation (slightly 
 modification of the original equation).
 
 #Generate the sample data frame
 customer.name = c("John", "Mike", "Peter")
 product = c("Toothpaste", "Toothpaste", "Toothpaste")
 cost = c(30,45,40)
 mydata = data.frame(customer.name,product,cost)
 
 #Original cost function - not used
 #fcost = function(orders) 3.40 + (1.20 * orders^2)
 
 #Slightly modification of the cost function to be a proper form for root 
 finding
 #This is basically to set 3.40 + (1.20 * orders^2) - fcost = 0
 f.to.findroot = function(orders,fcost) 3.40 + (1.20 * orders^2) - fcost
 
 #Using rootSolve package which contains uniroot.all function
 library(rootSolve)
 
 #Using plyr package which contains adply function
 library(plyr)
 
 #Use uniroot function to find the 'orders' variable (from the f.to.findroot 
 function) for each customer and put it into no.of.orders column in mysolution 
 data frame
 #Replace 'fcost' with 'cost' column from mydata
 #Interval of 0 to 1,000 is to make the f.to.findroot function have both 
 negative and positive sign, otherwise uniroot.all will give an error
 mysolution = data.frame(adply(mydata, 1, summarize, no.of.orders = 
 uniroot.all(f.to.findroot,interval = c(0,1000),fcost=cost)))
 mysolution
 
 #Remove the redundant mydata as mysolution it is an extended version of mydata
 rm(mydata)
 
 #Note uniroot.all can be used for both linear (e.g.orders^1) and non-linear 
 (e.g.orders^2) equations.


1. You don't need rootSolve. uniroot is sufficient in your case. You don't have 
multiple roots for each element of cost.

2. You are now storing more information than you require into the resulting 
dataframe. Use uniroot(…)$root to store only the root of the equation.

3. you don't need plyr. You can do it like this

mysolution <- within(mydata,
  no.of.orders <- sapply(seq_len(length(cost)), function(k)
    uniroot(f.to.findroot, interval = c(0,1000), fcost = cost[k])$root)
)
# for printing the dataframe

mysolution

Berend



Re: [R] Quantiles of a subset of data

2013-02-19 Thread Pete Brecknock
bradleyd wrote
 Excuse the request from an R novice! I have a data frame (DATA) that has
 two numeric columns (YEAR and DAY) and 4000 rows. For each YEAR I need to
 determine the 10% and 90% quantiles of DAY. I'm sure this is easy enough,
 but I am a new to this. 
 
 quantile(DATA$DAY,c(0.1,0.9)) 
 10% 90% 
  12  29 
 
 But this is for the entire 4000 rows, when I need it to be for each YEAR.
 Is there no way to use a by argument in the quantile function? 
 
 Thanks for any help you can provide. 
 David

check out

?aggregate or ?by should be of help
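For instance (a sketch with simulated stand-in data):

```r
# Per-YEAR 10% and 90% quantiles of DAY
set.seed(1)
DATA <- data.frame(YEAR = rep(2001:2004, each = 1000),
                   DAY  = sample(1:60, 4000, replace = TRUE))

# aggregate() applies quantile() within each YEAR
aggregate(DAY ~ YEAR, data = DATA, FUN = quantile, probs = c(0.1, 0.9))

# tapply() returns the same numbers as a named list
tapply(DATA$DAY, DATA$YEAR, quantile, probs = c(0.1, 0.9))
```

The aggregate() version returns a data frame with one row per YEAR, which is usually the more convenient shape for further work.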

HTH

Pete









Re: [R] data format

2013-02-19 Thread arun
Hi,
Try this:
el <- read.csv("el.csv", header=TRUE, sep="\t", stringsAsFactors=FALSE)
elsplit <- split(el, el$st)
datetrial <- data.frame(date1=seq.Date(as.Date("1930.1.1",format="%Y.%m.%d"), as.Date("2010.12.31",format="%Y.%m.%d"), by="day"))
elsplit1 <- lapply(elsplit, function(x) data.frame(date1=as.Date(paste(x[,2],x[,3],x[,4],sep="-"),format="%Y-%m-%d"), discharge=x[,5]))
elsplit2 <- lapply(elsplit1, function(x) x[order(x[,1]),])
library(plyr)
elsplit3 <- lapply(elsplit2, function(x) join(datetrial, x, by="date1", type="full"))
elsplit4 <- lapply(elsplit3, function(x) {x[,2][is.na(x[,2])] <- "-.000"; x})
elsplit5 <- lapply(elsplit4, function(x) {x[,1] <- format(x[,1],"%Y.%m.%d"); x})
elsplit6 <- lapply(elsplit5, function(x){substr(x[,1],6,6) <- ifelse(substr(x[,1],6,6)=="0", " ", substr(x[,1],6,6)); substr(x[,1],9,9) <- ifelse(substr(x[,1],9,9)=="0", " ", substr(x[,1],9,9)); x})
 elsplit6[[1]][1:4,]
 elsplit6[[1]][1:4,]
#   date1 discharge
#1 1930. 1. 1 -.000
#2 1930. 1. 2 -.000
#3 1930. 1. 3 -.000
#4 1930. 1. 4 -.000

 length(elsplit6)
#[1] 124
 tail(elsplit6[[124]],25)
#   date1 discharge
#29561 2010.12. 7 -.000
#29562 2010.12. 8 -.000
#29563 2010.12. 9 -.000
#29564 2010.12.10 -.000
#29565 2010.12.11 -.000
#29566 2010.12.12 -.000
#29567 2010.12.13 -.000
#29568 2010.12.14 -.000
#29569 2010.12.15 -.000
#29570 2010.12.16 -.000
#29571 2010.12.17 -.000
#29572 2010.12.18 -.000
#29573 2010.12.19 -.000
#29574 2010.12.20 -.000
#29575 2010.12.21 -.000
#29576 2010.12.22 -.000
#29577 2010.12.23 -.000
#29578 2010.12.24 -.000
#29579 2010.12.25 -.000
#29580 2010.12.26 -.000
#29581 2010.12.27 -.000
#29582 2010.12.28 -.000
#29583 2010.12.29 -.000
#29584 2010.12.30 -.000
#29585 2010.12.31 -.000

 str(head(elsplit6,3))
#List of 3
# $ AGOMO:'data.frame':    29585 obs. of  2 variables:
 # ..$ date1    : chr [1:29585] 1930. 1. 1 1930. 1. 2 1930. 1. 3 1930. 
1. 4 ...
  #..$ discharge: chr [1:29585] -.000 -.000 -.000 -.000 
...
 #$ AGONO:'data.frame':    29585 obs. of  2 variables:
  #..$ date1    : chr [1:29585] 1930. 1. 1 1930. 1. 2 1930. 1. 3 1930. 
1. 4 ...
  #..$ discharge: chr [1:29585] -.000 -.000 -.000 -.000 
...
 #$ ANZMA:'data.frame':    29585 obs. of  2 variables:
  #..$ date1    : chr [1:29585] 1930. 1. 1 1930. 1. 2 1930. 1. 3 1930. 
1. 4 ...
  #..$ discharge: chr [1:29585] -.000 -.000 -.000 -.000 
...


Regarding the space between date1 and discharge, I haven't checked it as you 
didn't mention whether it is needed in data.frame or not.

A.K.







From: eliza botto eliza_bo...@hotmail.com
To: smartpink...@yahoo.com smartpink...@yahoo.com 
Sent: Tuesday, February 19, 2013 10:01 AM
Subject: RE:



THANKS ARUN..
ITS A CHARACTER
SORRY FOR NOT TELLING YOU IN ADVANCE

ELISA


 Date: Tue, 19 Feb 2013 07:00:03 -0800
 From: smartpink...@yahoo.com
 Subject: Re: 
 To: eliza_bo...@hotmail.com
 
 
 
 Hi,
 One more doubt.
 You mentioned about -.000.  Is it going to be a number or character like 
 -.000?  If it is a number, the final product will be -.
 Arun
 
 
 
 
 
 From: eliza botto eliza_bo...@hotmail.com
 To: smartpink...@yahoo.com smartpink...@yahoo.com 
 Sent: Tuesday, February 19, 2013 9:16 AM
 Subject: RE:
 
 
 
 How can u be wrong arun?? you are right.
 
 elisa
 
 
  Date: Tue, 19 Feb 2013 06:15:31 -0800
  From: smartpink...@yahoo.com
  Subject: Re: 
  To: eliza_bo...@hotmail.com
  
  Hi Elisa,
  
  Just a doubt regarding the format of the date.  Is it the same format as 
  the previous one?  0 replaced by one space if either month or day is less 
  than 10.  Also, if I am correct, the list elements are for the different 
  stationname, right?
  Arun
  
  
  
  
  
  
  
  
  
  From: eliza botto eliza_bo...@hotmail.com
  To: smartpink...@yahoo.com smartpink...@yahoo.com 
  Sent: Tuesday, February 19, 2013 8:35 AM
  Subject: 
  
  
  
  
  
  Dear Arun,
[A text file is also attached in case the format is changed; el is the data file.]
Attached with this email is the Excel file which contains the data. The data is in the 
following form:
  
col1.       col2. col3. col4. col5.
stationname year  month day   discharge
A           2004  1     1     232
A           2004  1     2     334
.

B           2009  1     1     323
B           2009  1     2     332
  
  
There are stations where the data start and end in different years, but I want 
each series to start in 1930 and end in 2010, with -.000 for those days when 
data are missing. I want to make a list which should appear like the 
following
  
  [[A]]
  1930. 1. 1 -.000
  1930. 1. 2 -.000
  1930. 1. 3 -.000
  1930. 1. 4 -.000
  1930. 1. 5 -.000
  1930. 1. 6 -.000
  1930. 1. 7 -.000
  1930. 1. 8 -.000
  1930. 1. 9 -.000
  1930. 1.10 -.000
  1930. 1.11 -.000
  1930. 1.12 -.000
  1930. 1.13 -.000
  

[R] calcMin

2013-02-19 Thread Fowler, Mark
I tried to use calcMin with a function that uses a number of ...
arguments (all args from resid on) besides the vector of parameters
being fit. Same idea as optim, nlm, nlminb for which this form of ...
syntax works. But with calcMin I get an error regarding unused
arguments. No partial matches to previous arguments that I can see.
Anybody know the reason or fix for this?

calcMin(pvec=data.frame(val=par, min=rep(.01,length(par)),
  max=rep(100,length(par)), active=rep(TRUE,length(par))), func=optimwrap2,
  resid=resid, caa=caa, na=na, vcode=vcode, maa=maa, ny=ny, nind=nind,
  qage=qage, selmod=selmod, oldagei=oldagei, vpaflag=vpaflag)
Error in f(x, ...) : 
  unused argument(s) (resid = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), caa = c(0.344919912565012,
1.36925481463155, 4.94519930946541, 5.12996667120799, 11.0171415280463,
9.4432641668968, 4.7216320834484, 3.14775472229893, 1.57387736114947,
3.14775472229893, 


And an example using optim to address concerns about the user function
itself. [I'm trying to concoct a solver that handles far-off initial
parameter values without getting stuck in local minima. I'm searching
for something faster and more intelligent than this example.]

 for(i in 1:9) {
+ fitpars=optim(par=parset1,fn=optimwrap,control=list(maxit=100),resid=resid,
+ caa=caa,na=na,vcode=vcode,maa=maa,ny=ny,nind=nind,qage=qage,selmod=selmod,
+ oldagei=oldagei,vpaflag=vpaflag)
+ parset2=fitpars$par
+ parval2=fitpars$value
+ print(parval2)
+ print(parset2)
+ if((parval1-parval2) > 1.0) {
+ parval1=parval2
+ parset1=parset2
+ }
+ if((parval1-parval2) < 1.0 & parval2 > 0.1) {
+ parset1=par*pardown[i]
+ }
+ }
[1] 193.1500
[1] 43.84105 43.47576 42.91166 41.81009 42.57930
[1] 15.69411
[1] 13.35845 12.52650 12.28231 11.78171 11.29699
[1] 15.65625
[1] 11.91989 11.10244 10.84155 10.34886  9.85170
[1] 15.45980
[1] 10.374731  9.569294  9.302000  8.779963  8.325952
[1] 14.85332
[1] 9.105026 8.293079 8.004561 7.521170 7.080387
[1] 8.18217
[1] 6.209737 5.876993 5.229392 4.887182 4.102208
[1] 6.720143
[1] 6.180562 5.276061 5.057946 4.561191 4.317552
[1] 0.04003793
[1] 3.754422 2.871289 2.876904 2.091451 1.806243
[1] 0.04003793
[1] 3.754422 2.871289 2.876904 2.091451 1.806243

Mark Fowler
Population Ecology Division
Bedford Inst of Oceanography
Dept Fisheries & Oceans
Dartmouth NS Canada
B2Y 4A2
Tel. (902) 426-3529
Fax (902) 426-9710
Email mark.fow...@dfo-mpo.gc.ca
Home Tel. (902) 461-0708
Home Email mark.fow...@ns.sympatico.ca



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cumulative sum by group and under some criteria

2013-02-19 Thread arun
Hello,
The expansion was based on the unique values of m1 and n1 in dataset d3.  I 
guess that is the way it works for expansion.

I am not sure what kind of results you are expecting.  
Even the code that you provided will also give the combination of m1=3 and 
n1=3.  
As I mentioned in the earlier reply, if you don't want the combination of m1=3 
and n1=3 in the expanded dataset, 
use type="inner" in ?join().
library(plyr)
res2 <- join(res1, d3, by=c("m1","n1"), type="inner")

A.K.










From: Joanna Zhang zjoanna2...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Tuesday, February 19, 2013 10:42 AM
Subject: Re: [R] cumulative sum by group and under some criteria


Thanks. But I thought the expanded dataset 'res1' should not have the combination 
of m1=3 and n1=3, because it is based on dataset 'd3', which doesn't have m1=3 
and n1=3, right?
In the example that you provided:
 (m1+2):(maxN-(n1+2))
#[1] 5 
 (n1+2):(maxN-5)
#[1] 4 
#Suppose
 x1- 4 
 y1- 2 
 x1:(x1+5-m1)
#[1] 4 5 6
 y1:(y1+4-n1)
#[1] 2 3 4

 datnew <- expand.grid(5, 4, 4:6, 2:4)
 colnames(datnew) <- c("m","n","x","y")
datnew <- within(datnew, {p1 <- x/m; p2 <- y/n})
res <- cbind(datnew, d2[rep(1:nrow(d2), nrow(datnew)),])
 row.names(res) <- 1:nrow(res)
 res
#  m n x y   p2  p1 m1 n1 cterm1_P1L cterm1_P0H
#1 5 4 4 2 0.50 0.8  3  2    0.00032     0.0025
#2 5 4 5 2 0.50 1.0  3  2    0.00032     0.0025
#3 5 4 6 2 0.50 1.2  3  2    0.00032     0.0025
#4 5 4 4 3 0.75 0.8  3  2    0.00032     0.0025
#5 5 4 5 3 0.75 1.0  3  2    0.00032     0.0025
#6 5 4 6 3 0.75 1.2  3  2    0.00032     0.0025
#7 5 4 4 4 1.00 0.8  3  2    0.00032     0.0025
#8 5 4 5 4 1.00 1.0  3  2    0.00032     0.0025
#9 5 4 6 4 1.00 1.2  3  2    0.00032     0.0025

A.K.





- Original Message -
From: Zjoanna zjoanna2...@gmail.com
To: r-help@r-project.org
Cc:

Sent: Sunday, February 10, 2013 6:04 PM
Subject: Re: [R] cumulative sum by group and under some criteria


Hi,
How to expand or loop for one variable n based on another variable? for
example, I want to add m (from m1 to maxN- n1-2) and for each m, I want to
add n (n1+2 to maxN-m), and similarly add x and y, then I need to do some
calculations.

d3 <- data.frame(d2)
    for (m in (m1+2):(maxN-(n1+2))) {
       for (n in (n1+2):(maxN-m)) {
             for (x in x1:(x1+m-m1)) {
                  for (y in y1:(y1+n-n1)) {
                       p1 <- x/m
                       p2 <- y/n


On Thu, Feb 7, 2013 at 12:16 AM, arun kirshna [via R] 
ml-node+s789695n4657773...@n4.nabble.com wrote:

 Hi,

 Anyway, just using some random combinations:
 dnew <- expand.grid(4:10, 5:10, 6:10, 3:7, 4:5, 6:8)
 names(dnew) <- c("m","n","x1","y1","x","y")
 resF <- cbind(dnew, d2[rep(1:nrow(d2), nrow(dnew)),])

  row.names(resF) <- 1:nrow(resF)
  head(resF)
 #  m n x1 y1 x y m1 n1 cterm1_P1L cterm1_P0H
 #1 4 5  6  3 4 6  3  2    0.00032     0.0025
 #2 5 5  6  3 4 6  3  2    0.00032     0.0025
 #3 6 5  6  3 4 6  3  2    0.00032     0.0025
 #4 7 5  6  3 4 6  3  2    0.00032     0.0025
 #5 8 5  6  3 4 6  3  2    0.00032     0.0025
 #6 9 5  6  3 4 6  3  2    0.00032     0.0025

  nrow(resF)
 #[1] 6300
 I am not sure what you want to do with this.
 A.K.
 
 From: Joanna Zhang [hidden 
 email]http://user/SendEmail.jtp?type=nodenode=4657773i=0

 To: arun [hidden 
 email]http://user/SendEmail.jtp?type=nodenode=4657773i=1


 Sent: Wednesday, February 6, 2013 10:29 AM
 Subject: Re: cumulative sum by group and under some criteria


 Hi,

 Thanks! I need to do some calculations in the expanded data; the expanded
 data would be very large. What is an efficient way: doing the calculations
 while expanding the data (something similar to the following), or expanding
 the data using the code in your message and then adding the calculations to
 the expanded data?

 d3 <- data.frame(d2)
    for (...) {
          for (...) {
               for (...) {
                   for (...) {
                        p1 <- x/m
                        p2 <- y/n
                        ..
 }}
 }}

 I also modified your code for expanding the data:
 dnew <- expand.grid((m1+2):(maxN-(n1+2)), (n1+2):(maxN-m), 0:m1, 0:n1,
   x1:(x1+m-m1), y1:(y1+n-n1))
 names(dnew) <- c("m","n","x1","y1","x","y")
 dnew
 resF <- cbind(dnew[,c(2,1)], d2[rep(1:nrow(d2), nrow(dnew)),])    # this is
 not correct, how to modify it?
 resF
 row.names(resF) <- 1:nrow(resF)
 resF




 On Tue, Feb 5, 2013 at 2:46 PM, arun [hidden 
 email]http://user/SendEmail.jtp?type=nodenode=4657773i=2

 wrote:

 Hi,

 
 You can reduce the steps to reach d2:
 res3 <- with(res2, aggregate(cbind(cterm1_P1L, cterm1_P0H),
   by=list(m1,n1), max))
 
 #Change it to:
 res3new <- aggregate(. ~ m1 + n1, data=res2[,c(1:2,12:13)], max)
 res3new
  m1 n1 cterm1_P1L cterm1_P0H
 1  2  2    0.01440 0.00273750
 2  3  2    0.00032 0.0025
 3  2  3    0.01952 0.00048125
 d2 <- res3new[res3new[,3] < 0.01 & res3new[,4] < 0.01,]
 
  dnew <- expand.grid(4:10, 5:10)
  names(dnew) <- c("n","m")
 resF <- cbind(dnew[,c(2,1)], d2[rep(1:nrow(d2), nrow(dnew)),])
 
 row.names(resF) <- 1:nrow(resF)
  head(resF)
 #  m n m1 n1 cterm1_P1L cterm1_P0H
 #1 5 4  3 

[R] e1071::svm train model based on balanced accuracy

2013-02-19 Thread sabmue
Hey
For classification I use the svm from package e1071.
Since I have unbalanced data sets I would like to train the model based on
balanced accuracy not on just accuracy. I couldn't find an option in the svm
function to do so. 

Does anyone know if it is possible to tell the svm which kind of scoring
function to use for model training?

Would be glad for any help! Thanks!
Best, 
Sabine
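[A possible direction, sketched below and untested: e1071's svm() has no
built-in balanced-accuracy objective, but tune() accepts a custom loss via
tune.control(error.fun=...), and class.weights can compensate for class
imbalance. The data frame and weights here are made-up illustrations.]

```r
library(e1071)

# toy imbalanced two-class data (hypothetical example)
set.seed(1)
d <- data.frame(x1 = rnorm(120), x2 = rnorm(120),
                y  = factor(rep(c("rare", "common"), c(20, 100))))

# balanced error rate = 1 - mean of the per-class recalls
balerr <- function(true, pred) {
  tab <- table(true, pred)
  1 - mean(diag(tab) / rowSums(tab))
}

# tune() scores candidate models with the custom loss via error.fun;
# class.weights (passed through to svm) upweights the minority class
tuned <- tune(svm, y ~ ., data = d,
              ranges = list(cost = c(0.1, 1, 10)),
              class.weights = c(rare = 5, common = 1),
              tunecontrol = tune.control(error.fun = balerr))
summary(tuned)
```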



--
View this message in context: 
http://r.789695.n4.nabble.com/e1071-svm-train-model-based-on-balanced-accuracy-tp4659066.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Quantiles of a subset of data

2013-02-19 Thread Pete Brecknock
bradleyd wrote
 Thanks for your help Pete. I can almost get it to work with;
 
 by(day,year,quantile)
 
 but this only gives me  0%  25%  50%  75% 100%, not the ones I'm looking
 for, 10% and 90%.
 
 I have tried;
 
 by(day,year,quantile(c(0.1, 0.9))) but this is rejected by
 Error in FUN(X[[1L]], ...) : could not find function FUN

Need to add the quantiles of interest:

# Dummy Data
d - data.frame(year=c(rep(2010,10),rep(2011,10),rep(2012,10)), quantity =
c(1:30)) 

# Quantiles by Year
by(d$quantity,d$year,quantile,c(0.1,0.9))

HTH

Pete



--
View this message in context: 
http://r.789695.n4.nabble.com/Quantiles-of-a-subset-of-data-tp4659063p4659072.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Any R package to do the harmonic analysis

2013-02-19 Thread Janesh Devkota
Hi R Users, 

 

I was wondering if there is any R package available to do the harmonic
analysis of tide. Any suggestion is highly appreciated. 

 

Janesh


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cramer von Mises test for a discrete distribution

2013-02-19 Thread Barry Rowlingson
On Tue, Feb 19, 2013 at 2:49 PM, Santiago Guallar sgual...@yahoo.com wrote:
 Hi,

 I'm trying to carry out Cramer von Mises tests between pairs of vectors 
 belonging to a discrete distribution (concretely frequencies from 0 to 200). 
 However, the program crashes in the attempt. The problem seems to be that 
 these vectors only have positive integer numbers (+ zero). When I add a 
 random very small positive decimal to the non-decimal part everything works 
 fine (files prm1 & prpmr1). I attach two of these vectors so you can run the 
 following code. I've also thought to divide both vectors by a real constant 
 such as pi. Do you think these two approaches are acceptable?
 
  setwd()
  require(CvM2SL2Test)
  prm = scan('prm.txt')
  prpmr = scan('prpmr.txt')
  ct1 = cvmts.test(prm, prpmr) # here R crashes

 For you maybe. For me, works fine, and:

  ct1

[1] 30.20509

  cvmts.pval( ct1, length(prm), length(prpmr) )

 - this is taking a bit longer. I gave up and killed it. Maybe it
would have eventually crashed R, but you said the other function
call crashed R.

Your two mistakes are:

 1. Saying R crashes without showing us any kind of crash report or
error message.
 2. Not listing your system and package versions.

Ah, your three mistakes are...

 3. Not reading http://www.r-project.org/posting-guide.html


Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Converting the data in year month day hour and minutes to date

2013-02-19 Thread Janesh Devkota
Hi , 

 

I am trying to convert the date from a factor to a date using the as.Date function
in R. I have the date in the following format:

2008-01-01 02:30

 

I tried to use the following command : 

as.Date(mydata$Date, format="%y-%m-%d")

 

Can somebody help me with this ? I was able to convert the format with no
hour but getting difficulty with hour included. 

 

Thank you. 

 

Janesh


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] calculating seconds

2013-02-19 Thread Erin Hodgess
Dear R People:

I'm looking at some data which has the time in seconds since 1992/10/8,
15:15:42.5

Are there any functions currently available to translate this or should I
just do my own?

I figured that I'd check first.

Thanks,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cumulative sum by group and under some criteria

2013-02-19 Thread arun
Hi,Another thing you could do will be to use ?paste()
#for example.
#d3 dataset

 paste(d3$m1, d3$n1)
#[1] "2 2" "3 2" "2 3"

#then you use that instead of unique(d3$m1), unique(d3$n1) in the loop.
I didn't try it.  But, that is one possibility.
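For instance, with the d3 columns m1 and n1 from above, the paste()/strsplit()
round trip looks like this:

```r
d3 <- data.frame(m1 = c(2, 3, 2), n1 = c(2, 2, 3))

# one key string per row
keys <- paste(d3$m1, d3$n1)
keys
# [1] "2 2" "3 2" "2 3"

# recover the numeric (m1, n1) pair from one key inside a loop
as.numeric(strsplit(keys[2], " ")[[1]])
# [1] 3 2
```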

You still didn't show me the results you expected from the expansion, so I am 
not sure how it will look.
A.K.



From: Joanna Zhang zjoanna2...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Tuesday, February 19, 2013 11:43 AM
Subject: Re: [R] cumulative sum by group and under some criteria


Thanks. I can get the data I expected (get rid of the m1=3, n1=3) using the 
join and 'inner' code, but just curious about the way to expand the data. There 
should be a way to expand the data based on each row (combination of the 
variables), unique(d3$m1  d3$n1) ?.

or is there a way to use 'data.frame' and a 'for' loop to expand directly from 
the data? Like res1 <- data.frame(d3); for (...) {


On Tue, Feb 19, 2013 at 9:55 AM, arun smartpink...@yahoo.com wrote:

If you can provide me the output that you expect with all the rows of the 
combination in the res2, I can take a look.
 







From: Joanna Zhang zjoanna2...@gmail.com
To: arun smartpink...@yahoo.com

Sent: Tuesday, February 19, 2013 10:42 AM

Subject: Re: [R] cumulative sum by group and under some criteria


Thanks. But I thought the expanded dataset 'res1' should not have the combination 
of m1=3 and n1=3, because it is based on dataset 'd3', which doesn't have m1=3 
and n1=3, right?
In the example that you provided:
 (m1+2):(maxN-(n1+2))
#[1] 5 
 (n1+2):(maxN-5)
#[1] 4 
#Suppose
 x1- 4 
 y1- 2 
 x1:(x1+5-m1)
#[1] 4 5 6
 y1:(y1+4-n1)
#[1] 2 3 4

 datnew <- expand.grid(5, 4, 4:6, 2:4)
 colnames(datnew) <- c("m","n","x","y")
datnew <- within(datnew, {p1 <- x/m; p2 <- y/n})
res <- cbind(datnew, d2[rep(1:nrow(d2), nrow(datnew)),])
 row.names(res) <- 1:nrow(res)
 res
#  m n x y   p2  p1 m1 n1 cterm1_P1L cterm1_P0H
#1 5 4 4 2 0.50 0.8  3  2    0.00032     0.0025
#2 5 4 5 2 0.50 1.0  3  2    0.00032     0.0025
#3 5 4 6 2 0.50 1.2  3  2    0.00032     0.0025
#4 5 4 4 3 0.75 0.8  3  2    0.00032     0.0025
#5 5 4 5 3 0.75 1.0  3  2    0.00032     0.0025
#6 5 4 6 3 0.75 1.2  3  2    0.00032     0.0025
#7 5 4 4 4 1.00 0.8  3  2    0.00032     0.0025
#8 5 4 5 4 1.00 1.0  3  2    0.00032     0.0025
#9 5 4 6 4 1.00 1.2  3  2    0.00032     0.0025

A.K.





- Original Message -
From: Zjoanna zjoanna2...@gmail.com
To: r-help@r-project.org
Cc:

Sent: Sunday, February 10, 2013 6:04 PM
Subject: Re: [R] cumulative sum by group and under some criteria


Hi,
How to expand or loop for one variable n based on another variable? for
example, I want to add m (from m1 to maxN- n1-2) and for each m, I want to
add n (n1+2 to maxN-m), and similarly add x and y, then I need to do some
calculations.

d3 <- data.frame(d2)
    for (m in (m1+2):(maxN-(n1+2))) {
       for (n in (n1+2):(maxN-m)) {
             for (x in x1:(x1+m-m1)) {
                  for (y in y1:(y1+n-n1)) {
                       p1 <- x/m
                       p2 <- y/n


On Thu, Feb 7, 2013 at 12:16 AM, arun kirshna [via R] 
ml-node+s789695n4657773...@n4.nabble.com wrote:

 Hi,

 Anyway, just using some random combinations:
 dnew <- expand.grid(4:10, 5:10, 6:10, 3:7, 4:5, 6:8)
 names(dnew) <- c("m","n","x1","y1","x","y")
 resF <- cbind(dnew, d2[rep(1:nrow(d2), nrow(dnew)),])

  row.names(resF) <- 1:nrow(resF)
  head(resF)
 #  m n x1 y1 x y m1 n1 cterm1_P1L cterm1_P0H
 #1 4 5  6  3 4 6  3  2    0.00032     0.0025
 #2 5 5  6  3 4 6  3  2    0.00032     0.0025
 #3 6 5  6  3 4 6  3  2    0.00032     0.0025
 #4 7 5  6  3 4 6  3  2    0.00032     0.0025
 #5 8 5  6  3 4 6  3  2    0.00032     0.0025
 #6 9 5  6  3 4 6  3  2    0.00032     0.0025

  nrow(resF)
 #[1] 6300
 I am not sure what you want to do with this.
 A.K.
 
 From: Joanna Zhang [hidden 
 email]http://user/SendEmail.jtp?type=nodenode=4657773i=0

 To: arun [hidden 
 email]http://user/SendEmail.jtp?type=nodenode=4657773i=1


 Sent: Wednesday, February 6, 2013 10:29 AM
 Subject: Re: cumulative sum by group and under some criteria


 Hi,

  Thanks! I need to do some calculations in the expanded data; the expanded
  data would be very large. What is an efficient way: doing the calculations
  while expanding the data (something similar to the following), or expanding
  the data using the code in your message and then adding the calculations to
  the expanded data?

  d3 <- data.frame(d2)
     for (...) {
           for (...) {
                for (...) {
                    for (...) {
                         p1 <- x/m
                         p2 <- y/n
                        ..
  }}
  }}

  I also modified your code for expanding the data:
  dnew <- expand.grid((m1+2):(maxN-(n1+2)), (n1+2):(maxN-m), 0:m1, 0:n1,
    x1:(x1+m-m1), y1:(y1+n-n1))
  names(dnew) <- c("m","n","x1","y1","x","y")
  dnew
  resF <- cbind(dnew[,c(2,1)], d2[rep(1:nrow(d2), nrow(dnew)),])    # this is
  not correct, how to modify it?
  resF
  row.names(resF) <- 1:nrow(resF)
  resF




 On Tue, 

Re: [R] cumulative sum by group and under some criteria

2013-02-19 Thread arun


Hi,

Try this:
res1 <- do.call(rbind,lapply(paste(d3$m1,d3$n1),function(m1)
 do.call(rbind,lapply(0:(as.numeric(substr(m1,1,1))-1),function(x1)
 do.call(rbind,lapply(0:(as.numeric(substr(m1,3,3))-1),function(y1)
 do.call(rbind,lapply((as.numeric(substr(m1,1,1))+2):(7-as.numeric(substr(m1,3,3))),function(m)
 do.call(rbind,lapply((as.numeric(substr(m1,3,3))+2):(9-m),function(n)
 do.call(rbind,lapply(x1:(x1+m-as.numeric(substr(m1,1,1))), function(x)
 do.call(rbind,lapply(y1:(y1+n-as.numeric(substr(m1,3,3))), function(y)
 expand.grid(m1,x1,y1,m,n,x,y))))))))))))))))
names(res1) <- c("m1n1","x1","y1","m","n","x","y")
 res1$m1 <- NA; res1$n1 <- NA
res1[,8:9] <- do.call(rbind, lapply(strsplit(as.character(res1$m1n1), " "),
  as.numeric))
res2 <- res1[,c(8:9,3:7)]
library(plyr)
res2 <- join(res1, d3, by=c("m1","n1"), type="full") #Instead of this step, you can 
paste() the whole row of d3 and make suitable changes to the code above 

 tail(res2)
 #   m1n1 x1 y1 m n x y m1 n1 cterm1_P0L cterm1_P1L cterm1_P0H cterm1_P1H
#235  2 3  1  2 4 5 2 2  2  3 0.9025   0.64   0.857375  0.512
#236  2 3  1  2 4 5 2 3  2  3 0.9025   0.64   0.857375  0.512
#237  2 3  1  2 4 5 2 4  2  3 0.9025   0.64   0.857375  0.512
#238  2 3  1  2 4 5 3 2  2  3 0.9025   0.64   0.857375  0.512
#239  2 3  1  2 4 5 3 3  2  3 0.9025   0.64   0.857375  0.512
#240  2 3  1  2 4 5 3 4  2  3 0.9025   0.64   0.857375  0.512

A.K.

From: Joanna Zhang zjoanna2...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Tuesday, February 19, 2013 11:43 AM
Subject: Re: [R] cumulative sum by group and under some criteria


Thanks. I can get the data I expected (get rid of the m1=3, n1=3) using the 
join and 'inner' code, but just curious about the way to expand the data. There 
should be a way to expand the data based on each row (combination of the 
variables), unique(d3$m1  d3$n1) ?.

or is there a way to use 'data.frame' and a 'for' loop to expand directly from 
the data? Like res1 <- data.frame(d3); for (...) {


On Tue, Feb 19, 2013 at 9:55 AM, arun smartpink...@yahoo.com wrote:

If you can provide me the output that you expect with all the rows of the 
combination in the res2, I can take a look.
 







From: Joanna Zhang zjoanna2...@gmail.com
To: arun smartpink...@yahoo.com

Sent: Tuesday, February 19, 2013 10:42 AM

Subject: Re: [R] cumulative sum by group and under some criteria


Thanks. But I thought the expanded dataset 'res1' should not have the combination 
of m1=3 and n1=3, because it is based on dataset 'd3', which doesn't have m1=3 
and n1=3, right?
In the example that you provided:
 (m1+2):(maxN-(n1+2))
#[1] 5 
 (n1+2):(maxN-5)
#[1] 4 
#Suppose
 x1- 4 
 y1- 2 
 x1:(x1+5-m1)
#[1] 4 5 6
 y1:(y1+4-n1)
#[1] 2 3 4

 datnew <- expand.grid(5, 4, 4:6, 2:4)
 colnames(datnew) <- c("m","n","x","y")
datnew <- within(datnew, {p1 <- x/m; p2 <- y/n})
res <- cbind(datnew, d2[rep(1:nrow(d2), nrow(datnew)),])
 row.names(res) <- 1:nrow(res)
 res
#  m n x y   p2  p1 m1 n1 cterm1_P1L cterm1_P0H
#1 5 4 4 2 0.50 0.8  3  2    0.00032     0.0025
#2 5 4 5 2 0.50 1.0  3  2    0.00032     0.0025
#3 5 4 6 2 0.50 1.2  3  2    0.00032     0.0025
#4 5 4 4 3 0.75 0.8  3  2    0.00032     0.0025
#5 5 4 5 3 0.75 1.0  3  2    0.00032     0.0025
#6 5 4 6 3 0.75 1.2  3  2    0.00032     0.0025
#7 5 4 4 4 1.00 0.8  3  2    0.00032     0.0025
#8 5 4 5 4 1.00 1.0  3  2    0.00032     0.0025
#9 5 4 6 4 1.00 1.2  3  2    0.00032     0.0025

A.K.





- Original Message -
From: Zjoanna zjoanna2...@gmail.com
To: r-help@r-project.org
Cc:

Sent: Sunday, February 10, 2013 6:04 PM
Subject: Re: [R] cumulative sum by group and under some criteria


Hi,
How to expand or loop for one variable n based on another variable? for
example, I want to add m (from m1 to maxN- n1-2) and for each m, I want to
add n (n1+2 to maxN-m), and similarly add x and y, then I need to do some
calculations.

d3 <- data.frame(d2)
    for (m in (m1+2):(maxN-(n1+2))) {
       for (n in (n1+2):(maxN-m)) {
             for (x in x1:(x1+m-m1)) {
                  for (y in y1:(y1+n-n1)) {
                       p1 <- x/m
                       p2 <- y/n


On Thu, Feb 7, 2013 at 12:16 AM, arun kirshna [via R] 
ml-node+s789695n4657773...@n4.nabble.com wrote:

 Hi,

 Anyway, just using some random combinations:
  dnew <- expand.grid(4:10, 5:10, 6:10, 3:7, 4:5, 6:8)
 names(dnew) <- c("m","n","x1","y1","x","y")
 resF <- cbind(dnew, d2[rep(1:nrow(d2), nrow(dnew)),])

  row.names(resF) <- 1:nrow(resF)
  head(resF)
 #  m n x1 y1 x y m1 n1 cterm1_P1L cterm1_P0H
 #1 4 5  6  3 4 6  3  2    0.00032     0.0025
 #2 5 5  6  3 4 6  3  2    0.00032     0.0025
 #3 6 5  6  3 4 6  3  2    0.00032     0.0025
 #4 7 5  6  3 4 6  3  2    0.00032     0.0025
 #5 8 5  6  3 4 6  3  2    0.00032     0.0025
 #6 9 5  6  3 4 6  3  2    0.00032     0.0025

  nrow(resF)
 #[1] 6300
 I am not sure what you want to do with this.
 A.K.
 
 From: Joanna Zhang [hidden 
 

Re: [R] calculating seconds

2013-02-19 Thread Uwe Ligges



On 19.02.2013 18:52, Erin Hodgess wrote:

Dear R People:

I'm looking at some data which has the time in seconds since 1992/10/8,
15:15:42.5


Just ask R to

strptime("1992/10/8,15:15:42.5", "%Y/%m/%d,%H:%M:%OS") + NumberOfSeconds

and you get the actual date after the given amount of NumberOfSeconds. 
Including correct timezone / daylight saving stuff / leap seconds etc.
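For example (the offset of 86400 seconds, i.e. one day, is made up here for
illustration; NumberOfSeconds would come from your data):

```r
# parse the origin, then add the elapsed seconds
origin <- strptime("1992/10/8,15:15:42.5", "%Y/%m/%d,%H:%M:%OS")
origin + 86400   # one day after the origin, in the local timezone
```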


Best,
Uwe Ligges







Are there any functions currently available to translate this or should I
just do my own?

I figured that I'd check first.

Thanks,
Erin




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Converting the data in year month day hour and minutes to date

2013-02-19 Thread Uwe Ligges



On 19.02.2013 18:47, Janesh Devkota wrote:

Hi ,



I am trying to convert the date as factor to date using as.date function
in R. I have the date in the following format

2008-01-01 02:30



I tried to use the following command :

as.Date(mydata$Date, format="%y-%m-%d")



Can somebody help me with this ? I was able to convert the format with no
hour but getting difficulty with hour included.


as.Date is about dates, but not hours. See ?strptime for a way to 
convert to POSIXlt / POSIXct representations.


Best,
Uwe Ligges






Thank you.



Janesh


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Converting the data in year month day hour and minutes to date

2013-02-19 Thread Pete Brecknock
jdbaba wrote
 Hi , 
 
  
 
 I am trying to convert the date as factor to date using as.date function
 in R. I have the date in the following format
 
 2008-01-01 02:30
 
  
 
 I tried to use the following command : 
 
 as.Date(mydata$Date, format="%y-%m-%d")
 
  
 
 Can somebody help me with this ? I was able to convert the format with no
 hour but getting difficulty with hour included. 
 
  
 
 Thank you. 
 
  
 
 Janesh
 
 
   [[alternative HTML version deleted]]
 
 __

 R-help@

  mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

how about 

as.POSIXlt("2008-01-01 02:30", format="%Y-%m-%d %H:%M") 

Pete



--
View this message in context: 
http://r.789695.n4.nabble.com/Converting-the-data-in-year-month-day-hour-and-minutes-to-date-tp4659075p4659080.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] calculating seconds

2013-02-19 Thread Jeff Newmiller
Your subject line says you want to calculate seconds... the body of your 
message says you want to translate seconds (to something unspecified).

I am not sure how we are supposed to respond. Can you give us a short example 
of what you have and what you want using R syntax?
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Erin Hodgess erinm.hodg...@gmail.com wrote:

Dear R People:

I'm looking at some data which has the time in seconds since 1992/10/8,
15:15:42.5

Are there any functions currently available to translate this or
should I
just do my own?

I figured that I'd check first.

Thanks,
Erin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Converting the data in year month day hour and minutes to date

2013-02-19 Thread Rui Barradas

Hello,

Try the following.


x <- "2008-01-01 02:30"
as.POSIXct(x, format = "%Y-%m-%d %H:%M")


Hope this helps,

Rui Barradas

Em 19-02-2013 17:47, Janesh Devkota escreveu:

Hi ,



I am trying to convert the date as factor to date using as.date function
in R. I have the date in the following format

2008-01-01 02:30



I tried to use the following command :

as.Date(mydata$Date, format="%y-%m-%d")



Can somebody help me with this ? I was able to convert the format with no
hour but getting difficulty with hour included.



Thank you.



Janesh


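The `%H:%M` part is exactly what `as.Date` throws away and `as.POSIXct` keeps. A small self-contained check of the distinction, using a factor as `read.table` would have produced here:

```r
x <- factor("2008-01-01 02:30")   # as read with stringsAsFactors = TRUE
d <- as.POSIXct(as.character(x), format = "%Y-%m-%d %H:%M", tz = "UTC")
format(d, "%H:%M")                # the hour/minute survive
as.Date(as.character(x))          # a Date drops the time part
```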


Re: [R] Converting the data in year month day hour and minutes to date

2013-02-19 Thread Jeff Newmiller
It is possible to squeeze a square peg into a round hole, but often you will 
not be satisfied with the result. Date is for... surprise, dates. You may want 
to use the chron package or the POSIXct type. The R Journal of June 2004 
(Volume 4/1) R Help Desk column is recommended reading.

Janesh Devkota janesh.devk...@gmail.com wrote:

Hi , 

 

I am trying to convert the date as factor to date using as.date
function
in R. I have the date in the following format

2008-01-01 02:30

 

I tried to use the following command : 

as.Date(mydata$Date, format="%y-%m-%d")

 

Can somebody help me with this ? I was able to convert the format with
no
hour but getting difficulty with hour included. 

 

Thank you. 

 

Janesh




Re: [R] Quantiles of a subset of data

2013-02-19 Thread Pete Brecknock
bradleyd wrote
 That does it, thanks. Do you think you could help me a little bit further?
 
 I actually have 4 columns, YEAR, DAY, TEMP , and IBI. They are all
 numeric. I need to calculate the average TEMP and IBI values between the
 10% and 90% quantiles  for each YEAR.
 
 The code 
*
 by(data$day, data$year, quantile, c(0.1, 0.9))
*
 was correct in that it calculated the quantile values as intended, but I
 don't know how to then calculate the mean TEMP and IBI values encompassed
 within those quantiles.
 
 Thanks again,
 David

have a look at trim argument of the mean function

?mean

Pete



--
View this message in context: 
http://r.789695.n4.nabble.com/Quantiles-of-a-subset-of-data-tp4659063p4659086.html
Sent from the R help mailing list archive at Nabble.com.

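What `trim` actually does is drop a fraction of observations from each end of the sorted data before averaging; it has nothing to do with decimal points. A small illustration with made-up data:

```r
set.seed(1)
x <- c(rnorm(98), 100, 200)      # 98 well-behaved values plus two gross outliers
mean(x)                          # dragged upward by the extremes
mean(x, trim = 0.1)              # drops the lowest and highest 10% first
```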


Re: [R] recode data according to quantile breaks

2013-02-19 Thread arun
HI Alain,

Try this:
df.breaks <- data.frame(id=df[,1], sapply(df[,-1], function(x)
findInterval(x, quantile(x), rightmost.closed=TRUE)), stringsAsFactors=FALSE)
df.breaks
#   id a b c
#1 x01 1 1 1
#2 x02 1 1 1
#3 x03 2 2 2
#4 x04 3 3 3
#5 x05 4 4 4
#6 x06 4 4 4
A.K.



- Original Message -
From: D. Alain dialva...@yahoo.de
To: Mailinglist R-Project r-help@r-project.org
Cc: 
Sent: Tuesday, February 19, 2013 5:01 AM
Subject: [R] recode data according to quantile breaks

Dear R-List, 

I would like to recode my data according to quantile breaks, i.e. all data 
within the range of 0%-25% should get a 1, 25%-50% a 2 etc.
Is there a nice way to do this with all columns in a dataframe.

e.g.

df <- data.frame(id=c("x01","x02","x03","x04","x05","x06"),
                 a=c(1,2,3,4,5,6), b=c(2,4,6,8,10,12), c=c(1,3,9,12,15,18))

df
   id a  b  c
1 x01 1  2  1
2 x02 2  4  3
3 x03 3  6  9
4 x04 4  8 12
5 x05 5 10 15
6 x06 6 12 18

#I can do it in very complicated way


apply(df[-1],2,quantile)
   a    b    c
0%   1.0  2.0  1.0
25%  2.2  4.5  4.5
50%  3.5  7.0 10.5
75%  4.8  9.5 14.2
100% 6.0 12.0 18.0

#then 

df$a[df$a <= 2.2] <- 1
...

#result should be


df.breaks

id     a    b    c
x01    1    1    1
x02    1    1    1
x03    2    2    2
x04    3    3    3
x05    4    4    4
x06    4    4    4



But there must be a way to do it more elegantly, something like


df.breaks <- apply(df[-1], 2, recode.by.quantile)

Can anyone help me with this?


Best wishes 


Alain      
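The `findInterval` call in the answer above can be wrapped into exactly the `recode.by.quantile` helper the question asks for. A sketch, reusing the question's data:

```r
recode.by.quantile <- function(x)
  findInterval(x, quantile(x), rightmost.closed = TRUE)

df <- data.frame(id = c("x01","x02","x03","x04","x05","x06"),
                 a = c(1,2,3,4,5,6), b = c(2,4,6,8,10,12), c = c(1,3,9,12,15,18))
# apply the recoder column-wise, keeping the id column alongside
df.breaks <- data.frame(id = df$id, apply(df[-1], 2, recode.by.quantile))
df.breaks
```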


Re: [R] Converting the data in year month day hour and minutes to date

2013-02-19 Thread John Kane

You need to convert the factor to character.

as.Date(as.character(mydata$Date), "%Y-%m-%d")

should do it.
John Kane
Kingston ON Canada


 -Original Message-
 From: janesh.devk...@gmail.com
 Sent: Tue, 19 Feb 2013 11:47:04 -0600
 To: r-help@r-project.org
 Subject: [R] Converting the data in year month day hour and minutes to
 date
 
 Hi ,
 
 
 
 I am trying to convert the date as factor to date using as.date
 function
 in R. I have the date in the following format
 
 2008-01-01 02:30
 
 
 
 I tried to use the following command :
 
 as.Date(mydata$Date, format="%y-%m-%d")
 
 
 
 Can somebody help me with this ? I was able to convert the format with no
 hour but getting difficulty with hour included.
 
 
 
 Thank you.
 
 
 
 Janesh
 
 





[R] make a list with names with s/lapply

2013-02-19 Thread Hermann Norpois
Hello,

I open some files in a directory and get a list.


open.list <- sapply(namen, function(x) {
  file <- list.files(ddir, pattern=x, full.names=TRUE) # namen is a vector and each element detects a special file to open
  file <- read.table(file)
})



This list has no names. I would like to get a list with key/value pairs and
namen is the vector delivering the names. So how does sapply or lapply get
the information to generate key/value pairs?

Thanks
Hermann



Re: [R] introducing jitter in overlapping graphs using ggplots (plotmeans). Also sciplot.

2013-02-19 Thread Jim Lemon

On 02/19/2013 10:38 PM, Anna Zakrisson wrote:

Hi,

I want to plot means with standard deviations of Total Nitrogen (TN) across
4 stations (S1-S4) and over 3 years (2007-2009). I want this plot in one
panel.
I do not want medians (bwplot, boxplot).
...


Hi Anna,
From your description, the brkdn.plot (plotrix) function might do what 
you want. In common with other functions in plotrix, brkdn.plot uses 
offsets rather than jittering to prevent overplotting.


Jim



Re: [R] make a list with names with s/lapply

2013-02-19 Thread Rui Barradas

Hello,

I'm not sure I understabd, but if the names are in 'namen' then the 
following might do what you want.


names(open.list) <- namen


Hope this helps,

Rui Barradas

Em 19-02-2013 18:09, Hermann Norpois escreveu:

Hello,

I open some files in a directory and get a list.


open.list <- sapply(namen, function(x) {
  file <- list.files(ddir, pattern=x, full.names=TRUE) # namen is a vector and each element detects a special file to open
  file <- read.table(file)
})



This list has no names. I would like to get a list with key/value pairs and
namen is the vector delivering the names. So how does sapply or lapply get
the information to generate key/value pairs?

Thanks
Hermann

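In fact, `sapply` sets names automatically when its first argument is a character vector (`USE.NAMES = TRUE` is the default), and `simplify = FALSE` guarantees a list comes back. A self-contained sketch with a stand-in for the `read.table` call:

```r
namen <- c("alpha", "beta")
fake.read <- function(x) data.frame(file = x)            # stand-in for read.table()
open.list <- sapply(namen, fake.read, simplify = FALSE)  # named list, one entry per name
names(open.list)
```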


Re: [R] Quantiles of a subset of data

2013-02-19 Thread Pete Brecknock
bradleyd wrote
 Thanks Pete. The TRIM argument in the MEAN function tells me how to trim
 off decimal points, but I am lost as to how to append the mean values of
 TEMP and IBI between the 10% and 90% quantiles of DAY in each YEAR. 
 
 DAY is the julian date that an event occurred in certain years. The events
 occurred numerous times in each year, and I want to be able to say what
 the mean day was in each year excluding those days greater than the 90%
 and less than the 10% quantile in that year.

Would suggest that you forward a small, reproducible example with what you
expect the results to look like. 

Pete



--
View this message in context: 
http://r.789695.n4.nabble.com/Quantiles-of-a-subset-of-data-tp4659063p4659099.html
Sent from the R help mailing list archive at Nabble.com.

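Along the lines Pete asks for, here is a small reproducible sketch of the computation David describes (all data invented): per YEAR, keep only rows whose DAY lies between that year's 10% and 90% quantiles, then average the remaining columns.

```r
set.seed(42)
dat <- data.frame(year = rep(2000:2001, each = 50),
                  day  = sample(1:365, 100, replace = TRUE),
                  temp = runif(100, 0, 30),
                  ibi  = runif(100, 0, 10))
mid.means <- function(d) {
  q <- quantile(d$day, c(0.1, 0.9))        # per-year 10%/90% cutoffs on DAY
  keep <- d$day >= q[1] & d$day <= q[2]
  colMeans(d[keep, c("day", "temp", "ibi")])
}
do.call(rbind, lapply(split(dat, dat$year), mid.means))
```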


[R] CARET. Relationship between data splitting trainControl

2013-02-19 Thread James Jong
I have carefully read the CARET documentation at:
http://caret.r-forge.r-project.org/training.html, the vignettes, and
everything is quite clear (the examples on the website help a lot!), but I
am still confused about the relationship between two arguments to
trainControl:

method
index

and the interplay between trainControl and the data splitting functions in
caret (e.g. createDataPartition, createResample, createFolds and
createMultiFolds)


To better frame my questions, let me use the following example from  the
documentation:
*
data(BloodBrain)
set.seed(1)
tmp <- createDataPartition(logBBB, p = .8, times = 100)
trControl <- trainControl(method = "LGOCV", index = tmp)
ctreeFit <- train(bbbDescr, logBBB, "ctree", trControl = trControl)
*

My questions are:

1) If I use createDataPartition (which I assume does stratified
bootstrapping), as in the above example, and I pass the result as index to
trainControl, do I need to use "LGOCV" as the method in my call to trainControl?
If I use another one (e.g. "cv"), what difference would it make? In my head,
once you fix index, you are fixing the type of cross-validation, so I am
not sure what role method plays if you use index.

2) What is the difference between createDataPartition and createResample?
Is it that createDataPartition does stratified bootstrapping, while
createResample doesn't?

3) How can I do **stratified** k-fold (e.g. 10 fold) cross validation using
caret? Would the following do it?

tmp <- createFolds(logBBB, k=10, list=TRUE,  times = 100)
trControl <- trainControl(method = "cv", index = tmp)
ctreeFit <- train(bbbDescr, logBBB, "ctree", trControl = trControl)

Thanks so much in advance. CARET is a fantastic package and I am  eager to
learn how to use it properly.

~James

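On question 3: for a numeric outcome, stratified fold creation amounts to binning the outcome into quantile groups and dealing fold labels out within each bin (roughly what caret does internally). A base-R sketch of that idea; the function name and bin count are my own, not caret's API:

```r
stratified_folds <- function(y, k = 10, bins = 5) {
  # bin y into quantile groups, then assign folds evenly within each bin
  grp <- cut(y, unique(quantile(y, seq(0, 1, length.out = bins + 1))),
             include.lowest = TRUE)
  folds <- integer(length(y))
  for (b in levels(grp)) {
    idx <- which(grp == b)
    folds[idx] <- sample(rep(1:k, length.out = length(idx)))
  }
  folds
}
set.seed(1)
f <- stratified_folds(rnorm(200), k = 10)
table(f)   # balanced folds, each drawing evenly from every quantile bin
```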


Re: [R] cumulative sum by group and under some criteria

2013-02-19 Thread Zjoanna
this is the data I expected

 suppose that I have a dataset 'd'
   m1  n1         A      B       C     D
1   2   2  0.902500  0.640  0.9025  0.64
2   3   2  0.857375  0.512  0.9025  0.64
I want to add x1 (from 0 to m1), y1 (from 0 to n1), m (ranging from m1+2 to
7-n1), n (from n1+2 to 9-m), x (x1 to x1+m-m1), y (y1 to y1+n-n1), expanding
to another dataset 'd2' based on each row (combination of m1 and n1)
so for the first row,
   m1  n1         A      B       C     D
1   2   2  0.902500  0.640  0.9025  0.64
it should be expanded as
   m1 n1 x1 y1  m  n  x  y         A      B       C     D
    2  2  0  0  4  4  0  0  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  4  0  1  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  4  0  2  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  4  1  0  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  4  1  1  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  4  1  2  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  4  2  0  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  4  2  1  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  4  2  2  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  5  0  0  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  5  0  1  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  5  0  2  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  5  1  0  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  5  1  1  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  5  1  2  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  5  2  0  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  5  2  1  0.902500  0.640  0.9025  0.64
    2  2  0  0  4  5  2  2  0.902500  0.640  0.9025  0.64

    2  2  0  1  4  4  0  0  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  4  0  1  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  4  0  2  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  4  1  0  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  4  1  1  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  4  1  2  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  4  2  0  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  4  2  1  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  4  2  2  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  5  0  0  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  5  0  1  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  5  0  2  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  5  1  0  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  5  1  1  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  5  1  2  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  5  2  0  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  5  2  1  0.902500  0.640  0.9025  0.64
    2  2  0  1  4  5  2  2  0.902500  0.640  0.9025  0.64
    2  2  0  2  4  4  0  0  0.902500  0.640  0.9025  0.64
    2  2  0  2  4  4  0  1  0.902500  0.640  0.9025  0.64
    2  2  0  2  4  4  0  2  0.902500  0.640  0.9025  0.64
.
.
.
.
    2  2  1  0  4  4  0  0  0.902500  0.640  0.9025  0.64
    2  2  1  1  4  4  0  1  0.902500  0.640  0.9025  0.64
    2  2  1  2  4  4  0  2  0.902500  0.640  0.9025  0.64
.
.
.

On Tue, Feb 19, 2013 at 11:59 AM, arun kirshna [via R] 
ml-node+s789695n4659078...@n4.nabble.com wrote:



 Hi,

 Try this:
 res1 <- do.call(rbind,lapply(paste(d3$m1,d3$n1),function(m1)
 do.call(rbind,lapply(0:(as.numeric(substr(m1,1,1))-1),function(x1)
 do.call(rbind,lapply(0:(as.numeric(substr(m1,3,3))-1),function(y1)
 do.call(rbind,lapply((as.numeric(substr(m1,1,1))+2):(7-as.numeric(substr(m1,3,3))),function(m)
 do.call(rbind,lapply((as.numeric(substr(m1,3,3))+2):(9-m),function(n)
  do.call(rbind,lapply(x1:(x1+m-as.numeric(substr(m1,1,1))), function(x)
  do.call(rbind,lapply(y1:(y1+n-as.numeric(substr(m1,3,3))), function(y)
  expand.grid(m1,x1,y1,m,n,x,y)))))))))))))))
 names(res1) <- c("m1n1","x1","y1","m","n","x","y")
 res1$m1 <- NA; res1$n1 <- NA
 res1[,8:9] <- do.call(rbind,lapply(strsplit(as.character(res1$m1n1)," "),as.numeric))
 res2 <- res1[,c(8:9,3:7)]
 library(plyr)
 res2 <- join(res1,d3,by=c("m1","n1"),type="full") # Instead of this step, you can paste() the whole row of d3 and make suitable changes to the code above

  tail(res2)
  #   m1n1 x1 y1 m n x y m1 n1 cterm1_P0L cterm1_P1L cterm1_P0H cterm1_P1H
 #235  2 3  1  2 4 5 2 2  2  3 0.9025   0.64   0.857375  0.512
 #236  2 3  1  2 4 5 2 3  2  3 0.9025   0.64   0.857375  0.512
 #237  2 3  1  2 4 5 2 4  2  3 0.9025   0.64   0.857375  0.512
 #238  2 3  1  2 4 5 3 2  2  3 0.9025   0.64   0.857375  0.512
 #239  2 3  1  2 4 5 3 3  2  3 0.9025   0.64   0.857375  0.512
 #240  2 3  1  2 4 5 3 4  2  3 0.9025   0.64   0.857375  0.512

 A.K.
 
 From: Joanna Zhang [hidden 
 

Re: [R] cumulative sum by group and under some criteria

2013-02-19 Thread arun
Hi,
suppose that I have a dataset 'd' 
   m1  n1    A B C D
1  2  2   0.902500    0.640   0.9025    0.64
2  3  2   0.857375    0.512   0.9025    0.64
I want to add  x1 (from 0 to m1), y1(from 0 to n1), m (range from 
m1+2 to 7-n1), n(from n1+2 to 9-m), x (x1 to x1+m-m1), y(y1 to y1+n-n1), 
expanding to another dataset 'd2' based on each row (combination of m1 
and n1)


Try:


 d <- read.table(text="
m1  n1    A B C D
1  2  2   0.902500    0.640   0.9025    0.64
2  3  2   0.857375    0.512   0.9025    0.64
", sep="", header=TRUE)

vec1 <- paste(d[,1],d[,2],d[,3],d[,4],d[,5],d[,6])
res1 <- do.call(rbind,lapply(vec1,function(m1)
do.call(rbind,lapply(0:(as.numeric(substr(m1,1,1))),function(x1)
do.call(rbind,lapply(0:(as.numeric(substr(m1,3,3))),function(y1)
do.call(rbind,lapply((as.numeric(substr(m1,1,1))+2):(7-as.numeric(substr(m1,3,3))),function(m)
 do.call(rbind,lapply((as.numeric(substr(m1,3,3))+2):(9-m),function(n)
 do.call(rbind,lapply(x1:(x1+m-as.numeric(substr(m1,1,1))), function(x)
 do.call(rbind,lapply(y1:(y1+n-as.numeric(substr(m1,3,3))), function(y)
 expand.grid(m1,x1,y1,m,n,x,y)))))))))))))))
names(res1) <- c("group","x1","y1","m","n","x","y")
res1$m1 <- NA; res1$n1 <- NA; res1$A <- NA; res1$B <- NA; res1$C <- NA; res1$D <- NA
res1[,8:13] <- do.call(rbind,lapply(strsplit(as.character(res1$group)," "),as.numeric))
res2 <- res1[,c(8:9,2:7,10:13)]


 head(res2)
#  m1 n1 x1 y1 m n x y  A    B  C    D
#1  2  2  0  0 4 4 0 0 0.9025 0.64 0.9025 0.64
#2  2  2  0  0 4 4 0 1 0.9025 0.64 0.9025 0.64
#3  2  2  0  0 4 4 0 2 0.9025 0.64 0.9025 0.64
#4  2  2  0  0 4 4 1 0 0.9025 0.64 0.9025 0.64
#5  2  2  0  0 4 4 1 1 0.9025 0.64 0.9025 0.64
#6  2  2  0  0 4 4 1 2 0.9025 0.64 0.9025 0.64
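The nested-`lapply` expansion above can also be written with one helper applied per row via `Map`, which keeps the per-row index ranges readable. A trimmed, self-contained sketch covering only the `x1`/`y1` part, to keep it short:

```r
d <- data.frame(m1 = c(2, 3, 2), n1 = c(2, 2, 3))   # the index columns of d3
expand.row <- function(m1, n1)
  cbind(m1 = m1, n1 = n1, expand.grid(x1 = 0:m1, y1 = 0:n1))
# one expanded block per row of d, stacked back together
res <- do.call(rbind, Map(expand.row, d$m1, d$n1))
head(res)
nrow(res)   # 3*3 + 4*3 + 3*4 rows
```

The remaining indices (`m`, `n`, `x`, `y`) can be added the same way inside `expand.row`, since each row's ranges depend only on that row's values.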






From: Joanna Zhang zjoanna2...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Tuesday, February 19, 2013 11:43 AM
Subject: Re: [R] cumulative sum by group and under some criteria


Thanks. I can get the data I expected (get rid of the m1=3, n1=3) using the 
join and 'inner' code, but just curious about the way to expand the data. There 
should be a way to expand the data based on each row (combination of the 
variables), unique(d3$m1  d3$n1) ?.

or is there a way to use 'data.frame' and a 'for' loop to expand directly from
the data? Something like res1 <- data.frame(d3); for (...) {


On Tue, Feb 19, 2013 at 9:55 AM, arun smartpink...@yahoo.com wrote:

If you can provide me the output that you expect with all the rows of the 
combination in the res2, I can take a look.
 







From: Joanna Zhang zjoanna2...@gmail.com
To: arun smartpink...@yahoo.com

Sent: Tuesday, February 19, 2013 10:42 AM

Subject: Re: [R] cumulative sum by group and under some criteria


Thanks. But I thought the expanded dataset 'res1' should not have the combination 
of m1=3 and n1=3, because it is based on dataset 'd3' which doesn't have m1=3 
and n1=3, right?
In the example that you provided:
 (m1+2):(maxN-(n1+2))
#[1] 5 
 (n1+2):(maxN-5)
#[1] 4 
#Suppose
 x1 <- 4
 y1 <- 2
 x1:(x1+5-m1)
#[1] 4 5 6
 y1:(y1+4-n1)
#[1] 2 3 4

 datnew <- expand.grid(5,4,4:6,2:4)
 colnames(datnew) <- c("m","n","x","y")
datnew <- within(datnew, {p1 <- x/m; p2 <- y/n})
res <- cbind(datnew, d2[rep(1:nrow(d2), nrow(datnew)),])
 row.names(res) <- 1:nrow(res)
 res
#  m n x y   p2  p1 m1 n1 cterm1_P1L cterm1_P0H
#1 5 4 4 2 0.50 0.8  3  2    0.00032     0.0025
#2 5 4 5 2 0.50 1.0  3  2    0.00032     0.0025
#3 5 4 6 2 0.50 1.2  3  2    0.00032     0.0025
#4 5 4 4 3 0.75 0.8  3  2    0.00032     0.0025
#5 5 4 5 3 0.75 1.0  3  2    0.00032     0.0025
#6 5 4 6 3 0.75 1.2  3  2    0.00032     0.0025
#7 5 4 4 4 1.00 0.8  3  2    0.00032     0.0025
#8 5 4 5 4 1.00 1.0  3  2    0.00032     0.0025
#9 5 4 6 4 1.00 1.2  3  2    0.00032     0.0025

A.K.





- Original Message -
From: Zjoanna zjoanna2...@gmail.com
To: r-help@r-project.org
Cc:

Sent: Sunday, February 10, 2013 6:04 PM
Subject: Re: [R] cumulative sum by group and under some criteria


Hi,
How to expand or loop for one variable n based on another variable? for
example, I want to add m (from m1 to maxN- n1-2) and for each m, I want to
add n (n1+2 to maxN-m), and similarly add x and y, then I need to do some
calculations.

d3 <- data.frame(d2)
    for (m in (m1+2):(maxN-(n1+2))) {
       for (n in (n1+2):(maxN-m)) {
             for (x in x1:(x1+m-m1)) {
                  for (y in y1:(y1+n-n1)) {
                       p1 <- x/m
                       p2 <- y/n


On Thu, Feb 7, 2013 at 12:16 AM, arun kirshna [via R] 
ml-node+s789695n4657773...@n4.nabble.com wrote:

 Hi,

 Anyway, just using some random combinations:
 dnew <- expand.grid(4:10,5:10,6:10,3:7,4:5,6:8)
 names(dnew) <- c("m","n","x1","y1","x","y")
 resF <- cbind(dnew, d2[rep(1:nrow(d2), nrow(dnew)),])

  row.names(resF) <- 1:nrow(resF)
  head(resF)
 #  m n x1 y1 x y m1 n1 cterm1_P1L cterm1_P0H
 #1 4 5  6  3 4 6  3  2    0.00032     0.0025
 #2 5 5  6  3 4 6  3  2    0.00032     0.0025
 #3 6 5  6  3 4 6  3  2    0.00032     0.0025
 #4 7 5  6  3 4 6  3  2    

[R] Help reshaping a dataset with multiple tuples per row

2013-02-19 Thread Jim Underwood
I have a dataset which contains several multi-column measurement sets per row. 
Col 1 identifies patient:

Col1      Col2       Col3     Col4     Col5       Col6     Col7  ...
Patient   Treatment  Outcome  Advice   Treatment  Outcome  Advice
P1        T1         O1       A1       T2         O2       A2

Please advise a reshape strategy to generate an output frame with a single 
measurement per row, e.g.,

Col1     Col2       Col3     Col4
Patient  Treatment  Outcome  Advice
P1       T1         O1       A1
P1       T2         O2       A2


Re: [R] Help reshaping a dataset with multiple tuples per row

2013-02-19 Thread arun


Hi,
Try this:
dat1 <- read.table(text="
Patient  Treatment Outcome  Advice Treatment  Outcome  Advice
P1    T1  O1  A1    T2    O2  A2
", sep="", header=TRUE, stringsAsFactors=FALSE)

names(dat1)[-1] <- paste(gsub("\\..*", "", names(dat1)[-1]), "_", rep(1:2, each=3), sep="")
res <- reshape(dat1, direction="long", varying=2:7, sep="_")
row.names(res) <- 1:nrow(res)
res <- res[,-c(2,6)]

res
#  Patient Treatment Outcome Advice
#1      P1        T1      O1     A1
#2      P1        T2      O2     A2


A.K.
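The same recipe works on a frame built in-line, which makes it easy to test; a self-contained version of the example above:

```r
wide <- data.frame(Patient = "P1",
                   Treatment_1 = "T1", Outcome_1 = "O1", Advice_1 = "A1",
                   Treatment_2 = "T2", Outcome_2 = "O2", Advice_2 = "A2",
                   stringsAsFactors = FALSE)
# sep="_" tells reshape how to split the column names into variable + time
long <- reshape(wide, direction = "long", varying = 2:7, sep = "_")
long[, c("Patient", "Treatment", "Outcome", "Advice")]
```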



[R] Seeing Global Workspace and dealing with attach and detach

2013-02-19 Thread David Arnold
Hi,

When writing a script, I often start with:

library(ISwR)
attach(thuesen)

Then write some more code, testing it as I write to see if it runs.

Then, at the end, a final:

detach(thuesen)

However, what happens is that I often get errors, which cause an attach, but
no subsequent detach. Now, I can fix the problem at the console window with
several:

detach(thuesen)
detach(thuesen)
detach(thuesen)
detach(thuesen)
detach(thuesen)

Until they are finally all closed.  Two questions:

1. What is a more efficient method?

2. What is the command again for viewing the global workspace?

D.



--
View this message in context: 
http://r.789695.n4.nabble.com/Seeing-Global-Workspace-and-dealing-with-attach-and-detach-tp4659105.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Any R package to do the harmonic analysis

2013-02-19 Thread Steve Taylor
Have a look thru the Time Series task view...

http://cran.r-project.org/web/views/TimeSeries.html


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Janesh Devkota
Sent: Wednesday, 20 February 2013 5:52a
To: r-help@r-project.org
Subject: [R] Any R package to do the harmonic analysis

Hi R Users, 

 

I was wondering if there is any R package available to do the harmonic
analysis of tide. Any suggestion is highly appreciated. 

 

Janesh


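Beyond the task view, classical tidal harmonic analysis is just least squares on sine/cosine pairs at known constituent frequencies, which base `lm()` handles directly. A toy sketch; the frequency, amplitude, and phase here are invented for illustration:

```r
set.seed(7)
t    <- seq(0, 96, by = 0.5)               # hours of "observations"
freq <- 1/12.42                            # ~M2 tidal constituent, cycles/hour
h    <- 1.5*cos(2*pi*freq*t - 0.7) + rnorm(length(t), sd = 0.1)  # synthetic tide
fit  <- lm(h ~ cos(2*pi*freq*t) + sin(2*pi*freq*t))
amp   <- sqrt(sum(coef(fit)[-1]^2))        # amplitude from the two coefficients
phase <- atan2(coef(fit)[3], coef(fit)[2]) # phase lag
c(amplitude = amp, phase = unname(phase))
```

Dedicated packages add the full constituent tables and nodal corrections, but the core fit is this regression repeated over many frequencies.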


Re: [R] Seeing Global Workspace and dealing with attach and detach

2013-02-19 Thread Jeff Newmiller
In general, creating variables while attached leads to problems such as you 
describe. Normally the recommendation is to avoid the use of attach and detach 
entirely in favor of explicit reference to the data frame using [[]], [,], $, 
and the data= argument supported by many functions. 
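Concretely, a cleanup loop answers question 1, two inspection commands answer question 2, and `with()`/`data=` replace `attach()` going forward (the data frame name here is a placeholder):

```r
# 1. Clear out every stray attached copy, however many there are:
while ("mydata" %in% search()) detach("mydata")   # "mydata" is a placeholder name

# 2. Inspect the session:
search()   # attached packages and data frames (the search path)
ls()       # objects in the global workspace

# Preferred style: no attach() at all
mydata <- data.frame(x = 1:5, y = (1:5)^2)
with(mydata, mean(y / x))          # instead of attach(mydata); mean(y/x)
lm(y ~ x, data = mydata)           # data= argument instead of attach()
```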

David Arnold dwarnol...@suddenlink.net wrote:

Hi,

When writing a script, I often start with:

library(ISwR)
attach(thuesen)

Then write some more code, testing it as I write to see if it runs.

Then, at the end, a final:

detach(thuesen)

However, what happens is that I often get errors, which cause an
attach, but
no subsequent detach. Now, I can fix the problem at the console window
with
several:

detach(thuesen)
detach(thuesen)
detach(thuesen)
detach(thuesen)
detach(thuesen)

Until they are finally all closed.  Two questions:

1. What is a more efficient method?

2. What is the command again for viewing the global workspace?

D.



--
View this message in context:
http://r.789695.n4.nabble.com/Seeing-Global-Workspace-and-dealing-with-attach-and-detach-tp4659105.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reading data

2013-02-19 Thread arun
Hi,
Try this:


files <- paste("MSMS_", 23, "PepInfo.txt", sep="")
read.data <- function(x) {names(x) <- gsub("^(.*)\\/.*", "\\1", x)
lapply(x, function(y) read.table(y, header=TRUE, sep =
"\t", stringsAsFactors=FALSE, fill=TRUE))}
lista <- do.call(c, lapply(list.files(recursive=T)[grep(files, list.files(recursive=T))], read.data))
names(lista) <- paste("group_", gsub("\\d+", "", names(lista)), sep="")
res2 <- split(lista, names(lista))
res3 <- lapply(res2, function(x)
{names(x) <- paste(gsub(".*_", "", names(x)), 1:length(x), sep=""); x})
# Freq whole data
res4 <- lapply(seq_along(res3), function(i)
do.call(rbind, lapply(res3[[i]], function(x)
as.data.frame(table(factor(x$z, levels=1:3))))))
names(res4) <- names(res2)
library(reshape2)
freq.i1 <- do.call(rbind, lapply(res4, function(x)
dcast(melt(data.frame(id=gsub("\\..*", "", row.names(x)), x), id.var=c("id","Var1")), id~Var1, value.var="value")))
freq.i1
#  id 1  2 3
#group_a   a1 1 12 6
#group_c.1 c1 0 10 3
#group_c.2 c2 0 12 3
#group_c.3 c3 0 13 4
#group_t.1 t1 0 10 4
#group_t.2 t2 1 12 6

freq.rel.i1 <- as.matrix(freq.i1[,-1]/rowSums(freq.i1[,-1]))
 freq.rel.i1
 #                   1         2          3
#group_a   0.05263158 0.6315789 0.3157895
#group_c.1 0.00000000 0.7692308 0.2307692
#group_c.2 0.00000000 0.8000000 0.2000000
#group_c.3 0.00000000 0.7647059 0.2352941
#group_t.1 0.00000000 0.7142857 0.2857143
#group_t.2 0.05263158 0.6315789 0.3157895



# Freq with FDR < 0.01
res5 <- lapply(seq_along(res3), function(i)
do.call(rbind, lapply(res3[[i]], function(x)
as.data.frame(table(factor(x$z[x[["FDR"]] < 0.01], levels=1:3))))))
names(res5) <- names(res2)

freq.f1 <- do.call(rbind, lapply(res5, function(x)
dcast(melt(data.frame(id=gsub("\\..*", "", row.names(x)), x), id.var=c("id","Var1")), id~Var1, value.var="value")))

 freq.f1
 #     id 1  2 3
#group_a   a1 1 10 5
#group_c.1 c1 0  7 2
#group_c.2 c2 0  8 2
#group_c.3 c3 0  6 4
#group_t.1 t1 0  7 4
#group_t.2 t2 1 10 5


freq.rel.f1 <- as.matrix(freq.f1[,-1]/rowSums(freq.f1[,-1]))

colour <- sample(rainbow(nrow(freq.rel.i1)))
par(mfrow=c(1,2))
barplot(freq.rel.i1, beside=TRUE, main="Sample", xlab="Charge",
 ylab="Relative Frequencies", col=colour, legend.text = rownames(freq.rel.i1))
barplot(freq.rel.f1, beside=TRUE, main="Sample with FDR<0.01", xlab="Charge",
 ylab="Relative Frequencies", col=colour, legend.text = rownames(freq.rel.f1))
# change the legend position
Also, didn't check the rest of the code from chisquare test.
A.K.
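The core trick in the code above is worth isolating: fixing the factor levels so every sample reports counts for the same charge states, then row-normalising; `prop.table` does the division in one step. A sketch with invented charge vectors chosen to reproduce the c1 and t2 rows from the thread:

```r
z.c1 <- c(rep(2, 10), rep(3, 3))       # made-up charge states, sample c1
z.t2 <- c(1, rep(2, 12), rep(3, 6))    # made-up charge states, sample t2
# factor(..., levels = 1:3) forces a count (possibly 0) for every charge state
counts <- rbind(c1 = table(factor(z.c1, levels = 1:3)),
                t2 = table(factor(z.t2, levels = 1:3)))
counts
prop.table(counts, margin = 1)         # relative frequencies per sample (rows sum to 1)
```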

From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Tuesday, February 19, 2013 4:19 PM
Subject: Re: reading data


Here is the code and some outputs.

z.plot - function(directory,number) {
 #reading data
  setwd(directory)
 direct-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), 
full.names = FALSE, recursive = TRUE)
 directT - direct[grepl(^t, direct)]
 directC - direct[grepl(^c, direct)]

 lista-lapply(direct, function(x) read.table(x,header=TRUE, sep = \t))
 listaC-lapply(directC, function(x) read.table(x,header=TRUE, sep = \t))
 listaT-lapply(directT, function(x) read.table(x,header=TRUE, sep = \t))

 #count different z values
 cab <- vector()
 for (i in 1:length(lista)) {
   dc <- lista[[i]][ifelse(lista[[i]]$FDR < 0.01, TRUE, FALSE),]
   dc <- table(dc$z)
   cab <- c(cab, names(dc))
 }

 #Relative freqs to construct the graph
    cab <- unique(cab)
 print(cab)

###[1] 2 3 1



    d <- matrix(ncol=length(cab))
 dci <- d[-1,]
 dcf <- d[-1,]
 dti <- d[-1,]
 dtf <- d[-1,]

    for (i in 1:length(listaC)) {

  #Relative freq of all data
  dcc <- listaC[[i]]
  dcc <- table(factor(dcc$z, levels=cab))
  dci <- rbind(dci, dcc)
  rownames(dci) <- rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = "c")


  #Relative freq of data with FDR < 0.01
  dcc1 <- listaC[[i]][ifelse(listaC[[i]]$FDR < 0.01, TRUE, FALSE),]
  dcc1 <- table(factor(dcc1$z, levels=cab))
  dcf <- rbind(dcf, dcc1)
  rownames(dcf) <- rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = "c")
 }


 for (i in 1:length(listaT)) {

  #Relative freq of all data
  dct <- listaT[[i]]
  dct <- table(factor(dct$z, levels=cab))
  dti <- rbind(dti, dct)
  rownames(dti) <- rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = "t")


  #Relative freq of data with FDR < 0.01
  dct1 <- listaT[[i]][ifelse(listaT[[i]]$FDR < 0.01, TRUE, FALSE),]
  dct1 <- table(factor(dct1$z, levels=cab))
  dtf <- rbind(dtf, dct1)
  rownames(dtf) <- rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = "t")
 }

  freq.i <- rbind(dci, dti)
  freq.f <- rbind(dcf, dtf)
  freq.rel.i <- freq.i/apply(freq.i, 1, sum)
  freq.rel.f <- freq.f/apply(freq.f, 1, sum) 

 print(freq.i)
##  2 3 1
#c1 10 3 0
#c2 12 3 0
#c3 13 4 0
#t1 10 4 0
#t2 12 6 1

 print(freq.f)
  ### 2 3 1
#c1  7 2 0
#c2  8 2 0
#c3  6 4 0
#t1  7 4 0
#t2 10 5 1

 print(freq.rel.i)
###         2         3          1
#c1 0.7692308 0.2307692 0.00000000
#c2 0.8000000 0.2000000 0.00000000
#c3 0.7647059 0.2352941 0.00000000
#t1 0.7142857 0.2857143 0.00000000
#t2 0.6315789 0.3157895 0.05263158
 print(freq.rel.f)

###         2         3          1
#c1 0.7777778 0.2222222 0.00000000
#c2 0.8000000 0.2000000 0.00000000
#c3 0.6000000 

Re: [R] Seeing Global Workspace and dealing with attach and detach

2013-02-19 Thread Ben Bolker
Jeff Newmiller jdnewmil at dcn.davis.ca.us writes:

 In general, creating variables while attached leads to problems such
 as you describe. Normally the recommendation is to avoid the use of
 attach and detach entirely in favor of explicit reference to the
 data frame using [[]], [,], $, and the data= argument supported by
 many functions.

And with() and within() [as well as transform(), mutate(), subset(), etc.:
see http://r4stats.com/2013/01/22/comparing-tranformation-styles/
[sic tranformation]
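
A hedged sketch of those explicit alternatives to attach()/detach(); the data frame `dat` and its columns are made up for illustration:

```r
dat <- data.frame(x = 1:5, y = c(2, 4, 6, 8, 10))

dat$z <- dat$x + dat$y           # explicit $ reference when creating a column
m  <- lm(y ~ x, data = dat)      # data= argument instead of attach()
s  <- with(dat, mean(x + y))     # evaluate an expression inside dat
dat2 <- within(dat, w <- x * y)  # modify a copy of dat and return it
```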



[R] Why R simulation gives same random results?

2013-02-19 Thread C W
Hi, list
I am doing 100,000 iterations of Bayesian simulations.

What I did is I split it into 4 different R sessions, each one running 25,000
iterations.  But two of the sessions gave the same simulation results.

I did not use any set.seed().  What is going on here?
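
[Without set.seed(), R seeds itself from session start-up information, so sessions launched at nearly the same moment can end up with identical or overlapping streams. A hedged sketch of the usual fix, giving each session its own explicit seed; `session.id` is an illustrative value you would set to 1, 2, 3, or 4 in each session:]

```r
session.id <- 2
set.seed(1000 + session.id)         # distinct, recorded seed per session
draws <- rpois(25000, lambda = 5)   # this session's 25,000 simulated values
```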

Thanks,
Mike

[[alternative HTML version deleted]]



[R] 'gmm' package: How to pass controls to a numerical solver used in the gmm() function?

2013-02-19 Thread Malikov, Emir
Hello --

The question I have is about the gmm() function from the 'gmm' package
(v. 1.4-5).

The manual accompanying the package says that the gmm() function is
programmed to use either of four numerical solvers -- optim, optimize,
constrOptim, or nlminb -- for the minimization of the GMM objective
function.

I wonder whether there is a way to pass controls to a solver used
while calling the gmm() function?

In particular, the problem that I have been having is that gmm()
fails to converge within the default number of iterations for the
'optim' solver that it uses. Ideally, I would like to figure out a way
to choose controls, including the number of iterations, for
the solver that I tell gmm() to use.

Currently, the way I call the function is as follows:

model.name <- gmm(g=g.fn, x=data, gradv=g.gr, t0=c(start),
type=c("twostep"), optfct=c("optim") )

I also would want the gmm() function to know that I want it to pass
the following control -- maxit=1500 -- to the optim solver.
Unfortunately, the 'gmm' manual does not tell whether this is doable.

Thanks for your help.



Re: [R] lattice 3x3 plot: force common y-limits accross rows and align x-axes

2013-02-19 Thread Boris Vasiliev
Duncan and Bert,

Thank you very much for your help with my question.  It's very much 
appreciated.


I used your suggestions to get the plot I needed: 

* 3x3 lattice of dotplots, 

* x-limits are the same for all panels, 

* y-limits and y-ticks in each row are the same, 

* y-limits and y-ticks are different between rows,  

* within each row subjects are ordered by sum(count) in the row.  


My code is below just in case somebody in the r-help list finds it useful.

Regards,
Boris.


# raw data

df - data.frame(
        subject=c('A','A','A',
                  'BB','BB',
                  'CCC','CCC','CCC',
                  'DD','DD',
                  'A','A','A',
                  '','',
                  'A','A',
                  'B','B'),
        risk=c('high','high','high',
               'high','high',
               'high','high','high',
               'med','med',
               'med','med','med',
               'med','med',
               'low','low',
               'low','low'),
        treatment=c('none','optX','optZ',
                    'none','optZ',
                    'none','optX','optZ',
                    'none','optZ',
                    'none','optX','optZ',
                    'none','optZ',
                    'none','optX',
                    'none','optZ'),
        count=c(5,10,2,
                3,5,
                8,1,2,
                3,7,
                10,2,5,
                15,2,
                7,7,

                10,8))


# re-level factors
df$risk <- factor(df$risk, levels=c('low','med','high'))
df$treatment <- factor(df$treatment, levels=c('none','optX','optZ'))


# create unique subjects ordered by sum(count) and risk
df$sbj.byrisk <- paste(df$risk, df$subject, sep="_")
df$sbj <- reorder(reorder(df$sbj.byrisk, df$count, sum), as.numeric(df$risk))
df$sbj.ix <- as.numeric(df$sbj)

# for each row (i.e. risk), find limits, ticks, labels, and
# maximum number of points per panel
df.byrisk <- split(df, df['risk'])
ylim <- lapply(df.byrisk, function(idf) return(range(idf$sbj.ix)))
ytck <- lapply(df.byrisk, function(idf) return(sort(unique(idf$sbj.ix))))
ylbl <- lapply(df.byrisk, function(idf) {
                 ytck <- sort(unique(idf$sbj.ix))
                 jdnx <- pmatch(ytck, idf$sbj.ix)
                 ylbl <- sub(".*_", "", idf$sbj[jdnx])
                 return(ylbl)
               })
yhei <- unlist(lapply(ytck, length))

# set up lists for limits, ticks, labels, panel heights
ylims <- rep(ylim, c(3,3,3))
ytcks <- rep(list(NULL,NULL,NULL), c(3,3,3))
ytcks[seq(1,7,by=3)] <- ytck
ylbls <- rep(list(NULL,NULL,NULL), c(3,3,3))
ylbls[seq(1,7,by=3)] <- ylbl

# set up plot layout
laywid <- list(axis.panel=c(1,0.1,0.1))
layhei <- list(panel=yhei/sum(yhei))

# plot
oltc <- dotplot(sbj~count|treatment+risk, data=df,
                type=c("p","h"), origin=0,
                scales=list(x=list(limits=c(-1,16),
                                   alternating=FALSE),
                            y=list(relation="free",
                                   alternating=FALSE,
                                   limits=ylims,
                                   at=ytcks,
                                   labels=ylbls)),
                par.settings=list(layout.widths=laywid,
                                  layout.heights=layhei),
                between=list(y=0.2))
useOuterStrips(oltc)



 From: Duncan Mackay mac...@northnet.com.au

To: r-help@r-project.org 
Sent: Monday, February 18, 2013 10:29:23 PM
Subject: Re: [R] lattice 3x3 plot: force common y-limits accross rows  and 
align x-axes

Hi Boris

Just a different take on it, quick glimpse of it

library(lattice)
library(latticeExtra)

z <- unique(df$subject)
df$nsub <- sapply(df$subject, pmatch, z)

useOuterStrips(xyplot(count ~ nsub|treatment*risk, df, type = "h", lwd = 2, 
scales = list(x = list(at = 1:6, labels = z))))

The order can be changed by changing the order in z by making subject a factor 
in the correct order

see

http://finzi.psych.upenn.edu/R/Rhelp02/archive/43626.html

on how to change the yaxis 

Re: [R] 'gmm' package: How to pass controls to a numerical solver used in the gmm() function?

2013-02-19 Thread David Winsemius

On Feb 19, 2013, at 5:25 PM, Malikov, Emir wrote:

 Hello --
 
 The question I have is about the gmm() function from the 'gmm' package
 (v. 1.4-5).
 
 The manual accompanying the package says that the gmm() function is
 programmed to use either of four numerical solvers -- optim, optimize,
 constrOptim, or nlminb -- for the minimization of the GMM objective
 function.
 
 I wonder whether there is a way to pass controls to a solver used
 while calling the gmm() function?
 
 In particular, the problem that I have been having is that the gmm()
 fails to converge withing the default number of iteration for the
 'optim' solver that it uses. Ideally, I would wish to figure out a way
 to be able to choose controls, including the number of iterations, for
 the solver that I tell gmm() to use.
 
 Currently, the way I call the function is as follows:
 
  model.name <- gmm(g=g.fn, x=data, gradv=g.gr, t0=c(start),
  type=c("twostep"), optfct=c("optim") )
 
 I also would want the gmm() function to know that I want it to pass
 the following control -- maxit=1500 -- to the optim solver.

The argument name in the manual is `itermax`. I cannot tell from looking at the 
code whether that would get passed to 'optim'.

 Unfortunately, the 'gmm' manual does not tell whether this is doable.

 There is also a ... argument which is stated in the help page to be passed 
to optim. Looking at ?optim one sees that controls generally need to be in a 
list named 'control'. That this is the intent is supported by the sentence on 
page 11 of the gmm vignette:

We could try different starting values, increase the number of iterations in 
the control option of
optim or use nlminb which allows to put restrictions on the parameter space.
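
Putting that together, a hedged sketch (assuming, as the help page suggests, that gmm() forwards `...` to optim(), so optim-style controls travel in a list named `control`; g.fn, data, g.gr, and start are the original poster's objects):

```r
library(gmm)
model.name <- gmm(g = g.fn, x = data, gradv = g.gr, t0 = c(start),
                  type = "twostep", optfct = "optim",
                  control = list(maxit = 1500))  # raise optim's iteration cap
```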

-- 

David Winsemius
Alameda, CA, USA



Re: [R] Problems with line types in plots saved as PDF files

2013-02-19 Thread Ista Zahn
Hi,

On Tue, Feb 19, 2013 at 7:09 PM, Ian Renner ian_ren...@yahoo.com wrote:
 Hi,

 I am trying to save a plot as a PDF with different line types. In the example 
 below, R displays the plot correctly with 2 curves in solid lines, 2 curves 
 in dashed lines and 1 curve as a dotted line. However, when I save the image 
 as PDF (as attached), the dashed lines become solid.

I see two solid lines, two dashed lines, and one dotted, using okular
version 0.16.0. I suspect your pdf viewer is buggy. What are you using
to view the pdf?

Best,
Ista

This also happens if I use the pdf command directly (by removing the # symbols).

 Is there any way around this? Saving the image as a JPEG or PNG file works 
 fine but the image quality is not desirable.

 b.hat = 6
 a.1 = -12
 a.2 = 0
 a.3 = 200

 b = seq(-10, 10, 0.0002)
 l = a.1*(b - b.hat)^2 + a.2*(b - b.hat) + a.3

 lambda = 20

 p = -lambda*abs(b)

 pen.like = l + p

 y.min = 3*min(p)
 y.max = max(c(l, p, pen.like))

 #pdf(file = "TestPlot.pdf", 6, 6)
 #{
 plot(b, l, type = "l", ylim = c(y.min, y.max), lwd = 2, xlab = 
 expression(beta), ylab = "", col = "green", yaxt = "n", xaxt = "n")
 points(b, p, type = "l", lty = "dotted", lwd = 2, col = "red")
 points(b, pen.like, type = "l", lwd = 2, lty = "dashed", col = "green")

 axis(1, at = c(0))
 axis(2, at = c(0))

 lambda.hat = which.max(pen.like)
 lambda.glm = which(b == b.hat)

 points(b[lambda.glm], l[lambda.glm], pch = 16, cex = 1.5)
 points(b[lambda.hat], l[lambda.hat], pch = 17, cex = 1.5)

 b.hat = -3
 a.1 = -1.5
 a.2 = 0
 a.3 = 120

 l = a.1*(b - b.hat)^2 + a.2*(b - b.hat) + a.3

 pen.like = l + p

 points(b, l, type = "l", lwd = 2, col = "blue")
 points(b, pen.like, type = "l", lwd = 2, lty = "dashed", col = "blue")

 lambda.hat = which.max(pen.like)
 lambda.glm = which(b == b.hat)

 points(b[lambda.glm], l[lambda.glm], pch = 16, cex = 1.5)
 points(b[lambda.hat], l[lambda.hat], pch = 17, cex = 1.5)

 abline(h = 0)
 abline(v = 0)

 #}

 #dev.off()


 Thanks,

 Ian





Re: [R] Problems with line types in plots saved as PDF files

2013-02-19 Thread Ian Renner
Hi Ista,

I'm using Adobe Reader XI. It's good to hear that the plot was produced 
correctly and that it is Adobe that is failing to represent it properly. Thanks!

Ian




 From: Ista Zahn istaz...@gmail.com

Cc: r-help@r-project.org r-help@r-project.org 
Sent: Wednesday, 20 February 2013 2:14 PM
Subject: Re: [R] Problems with line types in plots saved as PDF files

Hi,


 Hi,

 I am trying to save a plot as a PDF with different line types. In the example 
 below, R displays the plot correctly with 2 curves in solid lines, 2 curves 
 in dashed lines and 1 curve as a dotted line. However, when I save the image 
 as PDF (as attached), the dashed lines become solid.

I see two solid lines, two dashed lines, and one dotted, using okular
version 0.16.0. I suspect your pdf viewer is buggy. What are you using
to view the pdf?

Best,
Ista

This also happens if I use the pdf command directly (by removing the # symbols).

 Is there any way around this? Saving the image as a JPEG or PNG file works 
 fine but the image quality is not desirable.



 Thanks,

 Ian





Re: [R] How to do a backward calculation for each record in a dataset

2013-02-19 Thread Prakasit Singkateera
Hi Berend,

Your method is really much better. Thank you very much. (Yes I also forgot
to add the $root at the end.)

Best,
Prakasit


On Tue, Feb 19, 2013 at 10:51 PM, Berend Hasselman b...@xs4all.nl wrote:


 On 19-02-2013, at 09:55, Prakasit Singkateera asltjoey.rs...@gmail.com
 wrote:

  Hi everyone,
 
  From your helps of giving me several ideas, eventually I can solve the
 posted problem. Here is the R code. It can be done by applying the
 uniroot.all to the data frame together with the proper form of equation
 (slightly modification of the original equation).
 
  #Generate the sample data frame
  customer.name = c("John","Mike","Peter")
  product = c("Toothpaste","Toothpaste","Toothpaste")
  cost = c(30,45,40)
  mydata = data.frame(customer.name,product,cost)
 
  #Original cost function - not used
  #fcost = function(orders) 3.40 + (1.20 * orders^2)
 
  #Slightly modification of the cost function to be a proper form for root
 finding
  #This is basically to set == 3.40 + (1.20 * orders^2) - fcost = 0
  f.to.findroot = function(orders,fcost) 3.40 + (1.20 * orders^2) - fcost
 
  #Using rootSolve package which contains uniroot.all function
  library(rootSolve)
 
  #Using plyr package which contains adply function
  library(plyr)
 
  #Use uniroot function to find the 'orders' variable (from the
 f.to.findroot function) for each customer and put it into no.of.orders
 column in mysolution data frame
  #Replace 'fcost' with 'cost' column from mydata
  #Interval of 0 to 1,000 is to make the f.to.findroot function have both
 negative and positive sign, otherwise uniroot.all will give an error
  mysolution = data.frame(adply(mydata, 1, summarize, no.of.orders =
 uniroot.all(f.to.findroot,interval = c(0,1000),fcost=cost)))
  mysolution
 
  #Remove the redundant mydata as mysolution it is an extended version of
 mydata
  rm(mydata)
 
  #Note uniroot.all can be used for both linear (e.g.orders^1) and
 non-linear (e.g.orders^2) equations.


 1. You don't need rootSolve. uniroot is sufficient in your case. You don't
 have multiple roots for each element of cost.

 2. You are now storing more information than you require into the
 resulting dataframe. Use uniroot(…)$root to store only the root of the
 equation.

 3. you don't need plyr. You can do it like this

  mysolution <- within(mydata,
    no.of.orders <- sapply(seq_len(length(cost)), function(k)
      uniroot(f.to.findroot, interval = c(0,1000), fcost = cost[k])$root)
  )
  # for printing the dataframe
 # for printing the dataframe

 mysolution

 Berend






Re: [R] Problems with line types in plots saved as PDF files

2013-02-19 Thread Ista Zahn
On Tue, Feb 19, 2013 at 10:20 PM, Ian Renner ian_ren...@yahoo.com wrote:
 Hi Ista,

 I'm using Adobe Reader XI. It's good to hear that the plot was produced
 correctly and that it is Adobe that is failing to represent it properly.

Right, well I just installed acroread (adobe reader for linux) and I
do see the problem you originally described. So all is not well,
unless you convince people not to use Adobe reader to view the
document. Sorry I can't be of more help at the moment, as I'm headed
to bed. If no one else beats me to it I'll take another look in the
morning.

Best,
Ista

 Thanks!

 Ian

 
 From: Ista Zahn istaz...@gmail.com
 To: Ian Renner ian_ren...@yahoo.com
 Cc: r-help@r-project.org r-help@r-project.org
 Sent: Wednesday, 20 February 2013 2:14 PM
 Subject: Re: [R] Problems with line types in plots saved as PDF files

 Hi,

 On Tue, Feb 19, 2013 at 7:09 PM, Ian Renner ian_ren...@yahoo.com wrote:
 Hi,

 I am trying to save a plot as a PDF with different line types. In the
 example below, R displays the plot correctly with 2 curves in solid lines, 2
 curves in dashed lines and 1 curve as a dotted line. However, when I save
 the image as PDF (as attached), the dashed lines become solid.

 I see two solid lines, two dashed lines, and one dotted, using okular
 version 0.16.0. I suspect your pdf viewer is buggy. What are you using
 to view the pdf?

 Best,
 Ista

 This also happens if I use the pdf command directly (by removing the #
 symbols).

 Is there any way around this? Saving the image as a JPEG or PNG file works
 fine but the image quality is not desirable.



 Thanks,

 Ian







Re: [R] Problems with line types in plots saved as PDF files

2013-02-19 Thread Pascal Oettli

Hi,

I also see the expected lines on your pdf file with Evince (3.2.1) using 
poppler/cairo (0.18.0).


Regards,
Pascal


On 20/02/2013 09:09, Ian Renner wrote:

Hi,

I am trying to save a plot as a PDF with different line types. In the example 
below, R displays the plot correctly with 2 curves in solid lines, 2 curves in 
dashed lines and 1 curve as a dotted line. However, when I save the image as 
PDF (as attached), the dashed lines become solid. This also happens if I use 
the pdf command directly (by removing the # symbols).

Is there any way around this? Saving the image as a JPEG or PNG file works fine 
but the image quality is not desirable.



Thanks,

Ian








Re: [R] Problems with line types in plots saved as PDF files

2013-02-19 Thread David Winsemius

On Feb 19, 2013, at 7:40 PM, Ista Zahn wrote:

 On Tue, Feb 19, 2013 at 10:20 PM, Ian Renner ian_ren...@yahoo.com wrote:
 Hi Ista,
 
 I'm using Adobe Reader XI. It's good to hear that the plot was produced
 correctly and that it is Adobe that is failing to represent it properly.
 
 Right, well I just installed acroread (adobe reader for linux) and I
 do see the problem you originally described. So all is not well,
 unless you convince people not to use Adobe reader to view the
 document. Sorry I can't be of more help at the moment, as I'm headed
 to bed. If no one else beats me to it I'll take another look in the
 morning.

Just for the record the Apple pdf viewer, Preview.app, displays it as 
described: two solid lines, two dashed lines and one dotted line.

-- 
David.
 
 Best,
 Ista
 
 Thanks!
 
 Ian
 
 
 From: Ista Zahn istaz...@gmail.com
 To: Ian Renner ian_ren...@yahoo.com
 Cc: r-help@r-project.org r-help@r-project.org
 Sent: Wednesday, 20 February 2013 2:14 PM
 Subject: Re: [R] Problems with line types in plots saved as PDF files
 
 Hi,
 
 On Tue, Feb 19, 2013 at 7:09 PM, Ian Renner ian_ren...@yahoo.com wrote:
 Hi,
 
 I am trying to save a plot as a PDF with different line types. In the
 example below, R displays the plot correctly with 2 curves in solid lines, 2
 curves in dashed lines and 1 curve as a dotted line. However, when I save
 the image as PDF (as attached), the dashed lines become solid.
 
 I see two solid lines, two dashed lines, and one dotted, using okular
 version 0.16.0. I suspect your pdf viewer is buggy. What are you using
 to view the pdf?
 
 Best,
 Ista
 
 This also happens if I use the pdf command directly (by removing the #
 symbols).
 
 Is there any way around this? Saving the image as a JPEG or PNG file works
 fine but the image quality is not desirable.
 
 
 
 Thanks,
 
 Ian
 
 
 
 
 

David Winsemius
Alameda, CA, USA



Re: [R] Problems with line types in plots saved as PDF files

2013-02-19 Thread Duncan Mackay
Ian

No differences with Adobe X  with the following

windows(6,6)
#pdf(file = "TestPlot.pdf", 6, 6)
#{
plot(b, l, type = "l", ylim = c(y.min, y.max), lwd = 2, xlab = 
expression(beta), ylab = "", col = "green", yaxt = "n", xaxt = "n")
points(b, p, type = "l", lty = "dotted", lwd = 2, col = "red")
points(b, pen.like, type = "l", lwd = 2, lty = "dashed", col = "green")
axis(1, at = c(0))
axis(2,  at = c(0))
lambda.hat = which.max(pen.like)
lambda.glm = which(b == b.hat)
points(b[lambda.glm], l[lambda.glm], pch = 16, cex = 1.5)
points(b[lambda.hat], l[lambda.hat], pch = 17, cex = 1.5)
b.hat = -3
a.1 = -1.5
a.2 = 0
a.3 = 120
l = a.1*(b - b.hat)^2 + a.2*(b - b.hat) + a.3
pen.like = l + p
points(b, l, type = "l", lwd = 2, col = "blue")
points(b, pen.like, type = "l", lwd = 2, lty = "dashed", col = "blue")
lambda.hat = which.max(pen.like)
lambda.glm = which(b == b.hat)
points(b[lambda.glm], l[lambda.glm], pch = 16, cex = 1.5)
points(b[lambda.hat], l[lambda.hat], pch = 17, cex = 1.5)
abline(h = 0)
abline(v = 0)
#}
#dev.off()
dev.copy2pdf(file = "TestPlot_devcopy2pdf.pdf")

as well as the pdf()  and manual Save As

sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] 
LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252 
LC_MONETARY=English_Australia.1252 
LC_NUMERIC=C   LC_TIME=English_Australia.1252

attached base packages:
[1] datasets  utils stats graphics  grDevices 
grid  methods   base

other attached packages:
[1] 
lhs_0.10R.oo_1.10.1 R.methodsS3_1.4.2 
foreign_0.8-51  chron_2.3-43MASS_7.3-22 
latticeExtra_0.6-24 RColorBrewer_1.0-5
[9] lattice_0.20-10

loaded via a namespace (and not attached):
[1] tools_2.15.2

Regards

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mac...@northnet.com.au


At 10:09 20/02/2013, you wrote:
Hi, I am trying to save a plot as a PDF with different line types. 
In the example below, R displays the plot correctly with 2 curves in 
solid lines, 2 curves in dashed lines and 1 curve as a dotted line.
However, when I save the image as PDF (as attached), the dashed 
lines become solid.
This also happens if I use the pdf command directly (by removing the 
# symbols). Is there any way around this?
  Saving the image as a JPEG or PNG file works fine but the image 
 quality is not desirable.

b.hat = 6
a.1 = -12
a.2 = 0
a.3 = 200
b = seq(-10, 10, 0.0002)
l = a.1*(b - b.hat)^2 + a.2*(b - b.hat) + a.3
lambda = 20
p = -lambda*abs(b)
pen.like = l + p
y.min = 3*min(p)
y.max = max(c(l, p, pen.like))

# pdf(file = "TestPlot.pdf", width = 6, height = 6)
# {
plot(b, l, type = "l", ylim = c(y.min, y.max), lwd = 2,
     xlab = expression(beta), ylab = "", col = "green",
     yaxt = "n", xaxt = "n")
points(b, p, type = "l", lty = "dotted", lwd = 2, col = "red")
points(b, pen.like, type = "l", lwd = 2, lty = "dashed", col = "green")
axis(1, at = c(0))
axis(2, at = c(0))
lambda.hat = which.max(pen.like)
lambda.glm = which(b == b.hat)
points(b[lambda.glm], l[lambda.glm], pch = 16, cex = 1.5)
points(b[lambda.hat], l[lambda.hat], pch = 17, cex = 1.5)
b.hat = -3
a.1 = -1.5
a.2 = 0
a.3 = 120
l = a.1*(b - b.hat)^2 + a.2*(b - b.hat) + a.3
pen.like = l + p
points(b, l, type = "l", lwd = 2, col = "blue")
points(b, pen.like, type = "l", lwd = 2, lty = "dashed", col = "blue")
lambda.hat = which.max(pen.like)
lambda.glm = which(b == b.hat)
points(b[lambda.glm], l[lambda.glm], pch = 16, cex = 1.5)
points(b[lambda.hat], l[lambda.hat], pch = 17, cex = 1.5)
abline(h = 0)
abline(v = 0)
# }
# dev.off()
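As a side note, a minimal way to check whether the pdf() device itself renders line types correctly is to draw the three styles directly to a file. This sketch is not part of Ian's original code; the output file name is arbitrary:

```r
# Minimal device check (hypothetical file name): draw one line of each type.
# lty can be given by name ("solid", "dashed", "dotted") or numerically (1, 2, 3).
pdf(file = "lty-check.pdf", width = 6, height = 6)
plot(1:10, type = "l", lty = "solid", lwd = 2, col = "green")
lines(10:1, lty = "dashed", lwd = 2, col = "blue")     # should stay dashed in the PDF
lines(rep(5, 10), lty = "dotted", lwd = 2, col = "red") # should stay dotted in the PDF
dev.off()
```

If the dashed line renders correctly here but not in the full plot, the problem is more likely in the viewer or in how the larger figure is exported than in the lty specification.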

Thanks, Ian

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[R] generate variable y to produce excess zero in ZIP analysis

2013-02-19 Thread lili puspita rahayu


Dear Sir or Madam,

I am Lili Puspita Rahayu, a master's student in Statistics at Bogor
Agricultural University.

I am currently analyzing the zero-inflated Poisson (ZIP) model, which is a
solution for Poisson regression when the response variable (Y) has an
excess of zeros. I am not yet using real data; instead I simulate data in R
by generating variables x1, x2, x3, each of size n = 100, and then
generating the response variable Y. However, the simulated Y does not have
an excess of zeros; only by coincidence is the proportion of zeros in Y
sometimes around 23% to 25%. This happens because I generate x1, x2, x3
from distributions with small parameter values.

I have consulted my lecturer, who suggested generating Y in a way that
controls the proportion of zeros for the ZIP analysis. I have tried to
write such syntax but have not succeeded. Could anyone help me with R
syntax that generates a simulated Y whose proportion of zeros can be
controlled, after generating x1, x2, x3? That way I can examine more deeply
how large a proportion of zeros in the response variable (Y) can be handled
by Poisson regression, parametric ZIP, and semiparametric ZIP.

The syntax I wrote previously, which generates y without controlling the
excess of zeros:

b0 = 1.5
b1 = -log(2)
b2 = log(3)
b3 = log(4)
n = 100
x1 <- rnorm(n, mean = 5, sd = 2)
x2 <- runif(n, min = 1, max = 2)
x3 <- rnorm(n, mean = 10, sd = 15)

y <- seq(1, n)
for (i in 1:n) {
  m <- exp(b0 + b1*x1[i] + b2*x2[i] + b3*x3[i])
  yp <- rpois(1, m)
  y[i] <- yp
}
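One common way to control the proportion of structural zeros is to mix a Bernoulli zero-indicator with the Poisson draw. The sketch below is not from the original thread; `pi0` and the regression coefficients are assumed values chosen only so the Poisson mean stays moderate:

```r
# Sketch: simulate a ZIP response with a controllable proportion of
# structural zeros. pi0 is an assumed mixing parameter (hypothetical).
set.seed(1)
n   <- 100
pi0 <- 0.30                              # target proportion of structural zeros
x1  <- rnorm(n, mean = 5, sd = 2)
x2  <- runif(n, min = 1, max = 2)
mu  <- exp(0.5 - 0.2 * x1 + 0.4 * x2)    # Poisson mean (small coefficients)
z   <- rbinom(n, size = 1, prob = pi0)   # 1 = structural zero
y   <- ifelse(z == 1, 0, rpois(n, mu))   # zero-inflated Poisson draw
mean(y == 0)                             # roughly pi0 plus sampling zeros
```

The observed zero proportion will exceed `pi0`, because the Poisson component also produces some zeros; if an exact total proportion is needed, `pi0` can be solved from the Poisson zero probability exp(-mu).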


I am very grateful for any assistance with R.
I am looking forward to hearing from you. Thank you very much.

Sincerely yours
Lili Puspita Rahayu 


Re: [R] compare posterior samples from R2OpenBugs and R function bugs{R2WinBUGS}

2013-02-19 Thread Jia Liu
On Tue, Feb 19, 2013 at 7:18 AM, Uwe Ligges lig...@statistik.tu-dortmund.de
 wrote:



 On 18.02.2013 05:24, Jia Liu wrote:

 Hi all,

 I used both OpenBUGS and the R function bugs{R2WinBUGS} to run a linear
 mixed-effects model based on the same data set and initial values. I got
 the same summary statistics but different posterior samples. However, if
 I sort these two sets of samples, one generated from OpenBUGS and the
 other generated from R, they turn out to be the same. And the samples
 from R do not show any autocorrelation. I do not know why and how this R
 function destroys the order of the posterior samples. Has anyone met
 this situation before? Any idea is appreciated.


 Not sure what you are looking at, since there is no reproducible example
 nor any code in your message.


Sorry for the inconvenience; I should have posted sample code with it.


 However, I guess you came across a specific design decision by Andrew
 Gelman, who wrote some code of R2WinBUGS before it was turned into an R
 package. That feature is documented on the ?bugs help page:

 for convenience, the n.keep*n.chains simulations in sims.matrix and
 sims.list (but NOT sims.array) have been randomly permuted.

That is exactly what I was asking about. I noticed that the posterior
samples had been randomly permuted, but I am curious about the reason for
doing that. So if the autocorrelation is of interest to me, I should use
sims.array, not sims.list.
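The effect of the permutation can be illustrated without BUGS at all: randomly permuting an autocorrelated series destroys its autocorrelation while leaving the marginal summary statistics unchanged, which is exactly what sims.list does relative to sims.array. This sketch is an illustration, not code from the thread:

```r
# Illustration: permuting an autocorrelated chain removes autocorrelation
# but not the marginal distribution (so summaries match, acf does not).
set.seed(42)
x      <- as.numeric(arima.sim(model = list(ar = 0.9), n = 1000))  # AR(1) chain
x_perm <- sample(x)                                # random permutation
ac_orig <- acf(x,      lag.max = 1, plot = FALSE)$acf[2]
ac_perm <- acf(x_perm, lag.max = 1, plot = FALSE)$acf[2]
round(c(original = ac_orig, permuted = ac_perm), 2)
# original lag-1 autocorrelation is near 0.9; the permuted one is near 0
```

Sorting both series recovers identical values, just as Jia observed with the OpenBUGS and R2WinBUGS samples.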

Thanks,

Jia



 Best,
 Uwe Ligges







 Thanks,

 Jia


 sessionInfo()
 R version 2.15.1 (2012-06-22)
 Platform: x86_64-pc-mingw32/x64 (64-bit)

 locale:
 [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
 States.1252
 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
 [5] LC_TIME=English_United States.1252

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 other attached packages:
 [1] R2WinBUGS_2.1-18 BRugs_0.8-0  coda_0.15-2  lattice_0.20-6

 loaded via a namespace (and not attached):
 [1] grid_2.15.1  tools_2.15.1
