date:20131113

Re: [R] r package to solve for Nash equilibrium

2013-11-13 Thread Rolf Turner


On 11/12/13 02:49, Dereje Fentie wrote:

Is there an r package out there that solves for pure strategy* Nash
equilibrium of a two-person game*? A search for Nash equilibrium in r
provides a link to the *GNE* package which solves for the Generalized Nash
equilibrium. But what I would like to solve is a pure strategy Nash
equilibrium.


Please be aware that R is spelt with an upper case R!!!

I can't answer your question, but you should note two things (of which 
you may

not be aware).

(1) There may not *be* a pure strategy Nash equilibrium.

(2) There may be more than one Nash equilibrium.

If the GNE package finds *all* Nash equilibria for a given game
(I don't know whether it does, or if it stops with the first equilibrium
it comes to) and *if* there is a pure strategy equilibrium, then this
will be amongst the solutions found, and you just have to pick it out.

cheers,

Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Bar Graph

2013-11-13 Thread Keniajin Wambui

The serialno represents each individual in the data set.The total
count of the serialno will represent the whole sample

 I want to do a graph to compare the total data (serialno) vs each of
the remaining five variables yearly. i.e to show the total data
(serialno) vs available data for one of the variables in the above
case temp axilla.

The above code plot count of temp_axilla yearly,how can I include
serialno to be part of the plot?

I have used
library(ggplot2)
library(foreign)
utils package

On Tue, Nov 12, 2013 at 11:53 AM, jwd j...@surewest.net wrote:
 On Mon, 11 Nov 2013 18:02:19 +0300
 Keniajin Wambui kiang...@gmail.com wrote:

 I am using R 3.0.2 on a 64 bit machine

 I have a data set from 1989-2002. The data has four variables
 serialno, date, admission ward, temperature and bcg scar.

 serialno admin_ward date_admn bcg_scar temp_axilla yr
 70162Ward2 11-Oct-89   y   38.9 1989
 70163 Ward111-Oct-91   y 37.2 1991
 70164 Ward2   11-Oct-92n   37.3 1992
 70165 Ward111-Oct-93y38.9 1993
 70166 Ward1   11-Oct-94  y  37.7 1994
 70167  Ward1   11-Oct-95  y  40 1995


 I want to do a bar graph of total data (serialno) vs *(data of one of
 the variables) to show the available data vs total data over the years

 i am using

 gplot(dta, aes(temp_axilla, fill=admin_ward)) + geom_bar() +
   facet_grid(. ~ yr, scales = free,margins=F) +
 geom_histogram(binwidth=300)

 But can include the serialno which shows the data. how can I achieve
 this


 You really need to pay better attention somewhere.  There are six
 variables in your example table for instance, not four.  You state you
 want to do something with the serialno variable, but it is not used
 in your bit of code.  Also, given that serialno appears to be
 unique in your example, a bar graph would be singularly uninformative.

 You need to provide a better example of the data, and an actual example
 of the code you are trying to use, including the libraries you've
 loaded.

 jwdougherty

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Mega Six Solutions
Web Designer and Research Consultant
Kennedy Mwai
25475211786

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Setting x-axis of a plot on each loop

2013-11-13 Thread Baro

I have a block of code:

window-5
start-3
n-1

seq1 - seq(1:40)
mat-matrix(seq1,40)


while(1+window=length(mat[,1]))
{
  kd-matrix(as.integer(mat[n:(n+window-1),1]))
  Sys.sleep(0.2)

plot(kd,col=blue,xlab=Rohdaten,ylab=values,xlim=c(start+n,start+n+window-1)
)

  n-n+1
}

I have this expectation, that on each loop two x-axis and y-axis are
changed and see the values on the plot. but I cant see the value.What
should I do to have values too?. If I am changing this my code to

plot(kd,col=blue,xlab=Rohdaten,ylab=values)

I can see the values but on the x-axis I have no the correct values

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Different output from lm() and lmPerm lmp() if categorical variables are included in the analysis

2013-11-13 Thread Rolf Turner



RTFM!!! :-)

The help explicitly says The default contrasts are set internally to 
(contr.sum, contr.poly) .


Set

options(contrasts=c(contr.sum,contr.poly))

before your call to lm() and atest will agree with aptest all down 
the line.


cheers,

Rolf Turner

On 11/08/13 21:35, Agustin Lobo wrote:

I've found a problem when using
categorical variables in lmp() from package lmPerm

According to help(lmp): This function will behave identically to lm()
if the following parameters are set: perm=, seq=TRUE,
center=FALSE.)
But not in the case of including categorical variables:

require(lmPerm)
set.seed(42)
testx1 - rnorm(100,10,5)
testx2 - c(rep(a,50),rep(b,50))
testy - 5*testx1 + 3 + runif(100,-20,20)
test - data.frame(x1=testx1,x2=
testx2,y=testy)
atest - lm(y ~ x1*x2,data=test)
aptest - lmp(y ~ x1*x2,data=test,perm = , seqs = TRUE, center = FALSE)
summary(atest)

Call:
lm(formula = y ~ x1 * x2, data = test)
Residuals:
 Min   1Q   Median   3Q  Max
-17.1777  -9.5306  -0.9733   7.6840  22.2728

Coefficients:
 Estimate Std. Error t value Pr(|t|)
(Intercept)  -2.0036 3.2488  -0.6170.539
x15.3346 0.2861  18.646   2e-16 ***
x2b   2.4952 5.2160   0.4780.633
x1:x2b   -0.3833 0.4568  -0.8390.404

summary(aptest)

Call:
lmp(formula = y ~ x1 * x2, data = test, perm = , seqs = TRUE,
center = FALSE)

Residuals:
 Min   1Q   Median   3Q  Max
-17.1777  -9.5306  -0.9733   7.6840  22.2728

Coefficients:
Estimate Std. Error t value Pr(|t|)
x1   5.1429 0.2284  22.516   2e-16 ***
x21 -1.2476 2.6080  -0.4780.633
x1:x21   0.1917 0.2284   0.8390.404

It looks like lmp() is internally coding dummy variables in a different way, so
lmp results are for a (named 1 by lmp) while lm results are for
b ?


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Bar Graph

2013-11-13 Thread Jim Lemon


On 11/13/2013 07:20 PM, Keniajin Wambui wrote:

The serialno represents each individual in the data set.The total
count of the serialno will represent the whole sample

  I want to do a graph to compare the total data (serialno) vs each of
the remaining five variables yearly. i.e to show the total data
(serialno) vs available data for one of the variables in the above
case temp axilla.

The above code plot count of temp_axilla yearly,how can I include
serialno to be part of the plot?

I have used
library(ggplot2)
library(foreign)
utils package


Hi Keniajim,
It is not easy to work out what you want. If each serial number is an 
individual and you want the total number of individuals (dataset is 
mydata):


length(unique(mydata$serialno))

If you want the total number of individuals per year:

nind-function(x) return(length(unique(x)))
year-as.numeric(format(as.Date(mydata$date_admn,%d-%b-%y),%Y))
by(mydata$serialno,mydata$year,nind)

Perhaps this will give you a start.

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Generalized Additive Models - gamma against overfitting

2013-11-13 Thread Mike.lang

Dear listeners, 

in order to analyze a dataset, which includes more than 30k records, I'm
using a general GAMM like y~s(x)+s(y)+z+e with a underlying normal
distribution. Unfortunately, I have some problems with over- and
underfittings in smooth-functions at the same time. 
In general there is the possibility to set up a gamma like gamma=1.4 in
order to avoid overfitting in the smooth-functions. In my case I would need
different gammas for different smooth-functions, because s(x) is overfitted
and s(y) underfitted currently. 
So, is there any possibility to set up a gamma for each smooth-function
separately? 

I really would appreciate your responses! 

Thanks a lot!
Best 
Mike



--
View this message in context: 
http://r.789695.n4.nabble.com/Generalized-Additive-Models-gamma-against-overfitting-tp4680364.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Negative binomial parameterisation in mgcv

2013-11-13 Thread Mark Payne

Dear R-help,

The negative binomial distribution has several different
parameterisations, but I can't seem to figure out what the exact one
used in mgcv's negbin family is? negbin() requires a theta argument,
but its not clear anywhere in the documentation (that I can find), how
this parameter should be interpreted - for example, can it be less
than 1?

Best wishes,

MArk

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Negative binomial parameterisation in mgcv

2013-11-13 Thread Prof Brian Ripley


On 13/11/2013 09:38, Mark Payne wrote:

Dear R-help,

The negative binomial distribution has several different
parameterisations, but I can't seem to figure out what the exact one
used in mgcv's negbin family is? negbin() requires a theta argument,
but its not clear anywhere in the documentation (that I can find), how
this parameter should be interpreted - for example, can it be less
than 1?


I believe it is the same as MASS (the package) explained in MASS (the 
book).  As its help says 




Best wishes,

MArk




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Grid type of sampling in geodata

2013-11-13 Thread Alaios

Hi all,
I have spatial field created with grf and I was wondering if I can sample in 
lines, something that can resemble sampling outdoors at the streets.
Random sampling looks to far of what I want to have.

there is in geodata package the sample.geodata but this looks like to be random 
samples.

I would like to thank you in advance for your help

Regards
A.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] making a barplot with table of experimental conditions underneath (preferably ggplot2)

2013-11-13 Thread N.Hubner

Dear all,



my data looks the following:



df - data.frame (experiment=c(E1,E2,E3,E4), mean = c(3,4,5,6), 
stdev=c(0.1,0.1,0.05,0.2), method = c(STD,STD, FP, FP), enzyme =c 
(T,T/L,T,T/L), denaturation=c(U,U,0.05%RG, 0.1%RG))



I would like to make a bar plot with standard deviation which I solved the 
following way:



x - df$experiment
y - df$mean
sd - df$stdev



df.plot - qplot(x,y,xlab=, ylab=# peptides identified)+
  geom_bar(colour=black, fill=darkgrey)+
  geom_errorbar(aes(x=x, ymin=y-sd, ymax=y+sd), width=0.25)

df.plot



However, as the labels for the x-axis (the bars) I do not want the experiment 
number, as now, but instead a table containing the other columns of my 
data.frame (method, enzyme, denaturation) with the description in the front and 
the certain 'value' below the bars.



I am looking forward to your suggestions!



With best wishes,

Nina
__

Dr. Nina C. Hubner
scientist quantitative proteomics

Dep. of Molecular Biology, Radboud University Nijmegen, The Netherlands
e-mail: n.hub...@ncmls.ru.nl
tel: +31-24-3613655

Visiting address:
NCMLS, Dept of Molecular Biology (route 274)
Geert Grooteplein 26/28
6525 GA Nijmegen
The Netherlands




The Radboud University Medical Centre is listed in the Commercial Register of 
the Chamber of Commerce under file number 41055629.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] issues with calling predict.coxph.penal (survival) inside a function - subset-vector not found. Because of NextMethod?

2013-11-13 Thread julian.bothe

Hello everyone, 

 

I got an issue with calling predict.coxph.penal inside a function. 

 

Regarding the context: My original problem is that I wrote a function that
uses predict.coxph and survfit(model) to predict

a lot of survival-curves using only the basis-curves for the strata (as
delivered by survfit(model) ) and then adapts them with 

the predicted risk-scores. Because there are cases where my new data has
strata which didn't exist in the original model I exclude 

them, using a Boolean vector inside the function.

I end up with a call like this: predict (coxph_model,
newdata[subscript_vector,] ) 

 

This works fine for coxph.model, but when I fit a model with a spline
(class coxph.penal), I get an error: 

Error in `[.data.frame`(newdata, [subscript_vector, ) : object
'[subscript_vector ' not found

 

I suppose this is because of NextMethod, but I am not sure how to work
around it. I also read a little bit about all those
matching-and-frame-issues, 

But must confess I am not really into it. 

 

I attach a reproducible example. 

Any help or suggestions of work-arounds will be appreciated. 

 

Thanks 

Julian

 

 version

   _   

platform   x86_64-w64-mingw32  

arch   x86_64  

os mingw32 

system x86_64, mingw32 

status 

major  3   

minor  0.1 

year   2013

month  05  

day16  

svn rev62743   

language   R   

version.string R version 3.0.1 (2013-05-16)

nickname   Good Sport

 

 

##TEST-DATA

 

# Create the simplest test data set 

test1 - data.frame(time=c(4,3,1,1,2,2,3), 

  status=c(1,1,1,0,1,1,0), 

  x=c(0,2,1,1,1,0,0), 

  sex=c(0,0,0,0,1,1,1)) 

 

# Fit a stratified model 

fit1 - coxph(Surv(time, status) ~ x + strata(sex), test1) 

summary(fit1)

 

#fit stratified wih spline

fit2 - coxph(Surv(time, status) ~ pspline(x, df=2) + strata(sex), test1) 

summary(fit2)

 

#function to predict within

 

predicting_function - function(model, newdata){

  subs -vector(mode='logical', length=nrow(newdata))

  subs[1:length(subs)]- TRUE

  

  ret - predict (model, newdata=newdata[subs,])

  return(ret)

}

 

predicting_function(fit1, test1) # works

 

predicting_function(fit2,test1) #doesnt work - Error in
`[.data.frame`(newdata, subs, ) : object 'subs' not found

# probably because of NextMethod

 

#

 traceback()

#12: `[.data.frame`(newdata, subs, )

#11: newdata[subs, ]

#10: is.data.frame(data)

#9: model.frame.default(data = newdata[subs, ], formula = ~pspline(x, 

#   df = 2) + strata(sex), na.action = function (object, ...) 

#   object)

#8: model.frame(data = newdata[subs, ], formula = ~pspline(x, df = 2) + 

#   strata(sex), na.action = function (object, ...) 

#   object)

#7: eval(expr, envir, enclos)

#6: eval(tcall, parent.frame())

#5: predict.coxph(model, newdata = newdata[subs, ])

#4: NextMethod(predict, object, ...)

#3: predict.coxph.penal(model, newdata = newdata[subs, ])

#2: predict(model, newdata = newdata[subs, ]) at #5

#1: predicting_function(fit2, test1)

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] issues with calling predict.coxph.penal (survival) inside a function - subset-vector not found. Because of NextMethod?

2013-11-13 Thread Simon Zehnder

Works for me:

predicting_function(fit2,test1)
 1  2  3  4  5  6  7
-1.0481141  0.1495946  0.4492597  0.4492597  0.9982492 -0.4991246 -0.4991246

Best

Simon

On 13 Nov 2013, at 15:46, julian.bo...@elitepartner.de wrote:

 Hello everyone, 
 
 
 
 I got an issue with calling predict.coxph.penal inside a function. 
 
 
 
 Regarding the context: My original problem is that I wrote a function that
 uses predict.coxph and survfit(model) to predict
 
 a lot of survival-curves using only the basis-curves for the strata (as
 delivered by survfit(model) ) and then adapts them with 
 
 the predicted risk-scores. Because there are cases where my new data has
 strata which didn't exist in the original model I exclude 
 
 them, using a Boolean vector inside the function.
 
 I end up with a call like this: predict (coxph_model,
 newdata[subscript_vector,] ) 
 
 
 
 This works fine for coxph.model, but when I fit a model with a spline
 (class coxph.penal), I get an error: 
 
 Error in `[.data.frame`(newdata, [subscript_vector, ) : object
 '[subscript_vector ' not found
 
 
 
 I suppose this is because of NextMethod, but I am not sure how to work
 around it. I also read a little bit about all those
 matching-and-frame-issues, 
 
 But must confess I am not really into it. 
 
 
 
 I attach a reproducible example. 
 
 Any help or suggestions of work-arounds will be appreciated. 
 
 
 
 Thanks 
 
 Julian
 
 
 
 version
 
   _   
 
 platform   x86_64-w64-mingw32  
 
 arch   x86_64  
 
 os mingw32 
 
 system x86_64, mingw32 
 
 status 
 
 major  3   
 
 minor  0.1 
 
 year   2013
 
 month  05  
 
 day16  
 
 svn rev62743   
 
 language   R   
 
 version.string R version 3.0.1 (2013-05-16)
 
 nickname   Good Sport
 
 
 
 
 
 ##TEST-DATA
 
 
 
 # Create the simplest test data set 
 
 test1 - data.frame(time=c(4,3,1,1,2,2,3), 
 
  status=c(1,1,1,0,1,1,0), 
 
  x=c(0,2,1,1,1,0,0), 
 
  sex=c(0,0,0,0,1,1,1)) 
 
 
 
 # Fit a stratified model 
 
 fit1 - coxph(Surv(time, status) ~ x + strata(sex), test1) 
 
 summary(fit1)
 
 
 
 #fit stratified wih spline
 
 fit2 - coxph(Surv(time, status) ~ pspline(x, df=2) + strata(sex), test1) 
 
 summary(fit2)
 
 
 
 #function to predict within
 
 
 
 predicting_function - function(model, newdata){
 
  subs -vector(mode='logical', length=nrow(newdata))
 
  subs[1:length(subs)]- TRUE
 
 
 
  ret - predict (model, newdata=newdata[subs,])
 
  return(ret)
 
 }
 
 
 
 predicting_function(fit1, test1) # works
 
 
 
 predicting_function(fit2,test1) #doesnt work - Error in
 `[.data.frame`(newdata, subs, ) : object 'subs' not found
 
# probably because of NextMethod
 
 
 
 #
 
 traceback()
 
 #12: `[.data.frame`(newdata, subs, )
 
 #11: newdata[subs, ]
 
 #10: is.data.frame(data)
 
 #9: model.frame.default(data = newdata[subs, ], formula = ~pspline(x, 
 
 #   df = 2) + strata(sex), na.action = function (object, ...) 
 
 #   object)
 
 #8: model.frame(data = newdata[subs, ], formula = ~pspline(x, df = 2) + 
 
 #   strata(sex), na.action = function (object, ...) 
 
 #   object)
 
 #7: eval(expr, envir, enclos)
 
 #6: eval(tcall, parent.frame())
 
 #5: predict.coxph(model, newdata = newdata[subs, ])
 
 #4: NextMethod(predict, object, ...)
 
 #3: predict.coxph.penal(model, newdata = newdata[subs, ])
 
 #2: predict(model, newdata = newdata[subs, ]) at #5
 
 #1: predicting_function(fit2, test1)
 
 
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Data transformation to list for event occurence

2013-11-13 Thread William Dunlap

Or,

f3 - function (dat1)  {
i - dat1$Event_Occurence == 1
split(dat1$Week[i], dat1$ID[i])
}

in addition to the previously mentioned
f1 - function(dat1) {
with(dat1,tapply(as.logical(Event_Occurence),ID,FUN=which ))
}
f2 - function(dat1){
 lapply(split(dat1,dat1$ID),function(x) which(!!x[,3]))
}

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of arun
 Sent: Tuesday, November 12, 2013 2:13 PM
 To: R help
 Subject: Re: [R] Data transformation to list for event occurence
 
 
 
 Hi Anindya,
 
 You may try:
 dat1 - read.table(text=ID   Week    Event_Occurence
 A 1 0
 A 2 0
 A 3 1
 A 4 0
 B 1 1
 B 2 0
 B 3 0
 B 4 1,sep=,header=TRUE,stringsAsFactors=FALSE)
 
  with(dat1,tapply(as.logical(Event_Occurence),ID,FUN=which ))
 #or
 lapply(split(dat1,dat1$ID),function(x) which(!!x[,3]))
 A.K.
 
 
 
 
 
 On Tuesday, November 12, 2013 4:58 PM, Anindya Sankar Dey 
 anindy...@gmail.com
 wrote:
 Hi,
 
 Say I have a following data
 
 ID   Week    Event_Occurence
 A 1 0
 A 2 0
 A 3 1
 A 4 0
 B 1 1
 B 2 0
 B 3 0
 B 4 1
 
 that whether an individual experienced an event in a particular week.
 
 I wish to create list such as the first element of the list will be a
 vector listing the week number when the event has occurred for A, followed
 by that of B.
 
 Can you help creating this?
 
 --
 Anindya Sankar Dey
 
     [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Negative binomial parameterisation in mgcv

2013-11-13 Thread Simon Wood


var(y) = mu + mu^2/theta where E(y) = mu.

best,
Simon

On 13/11/13 10:38, Mark Payne wrote:

Dear R-help,

The negative binomial distribution has several different
parameterisations, but I can't seem to figure out what the exact one
used in mgcv's negbin family is? negbin() requires a theta argument,
but its not clear anywhere in the documentation (that I can find), how
this parameter should be interpreted - for example, can it be less
than 1?

Best wishes,

MArk

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603   http://people.bath.ac.uk/sw283

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Double Pareto Log Normal Distribution DPLN

2013-11-13 Thread b. alzahrani


Hi
 
I found this paper http://cs.stanford.edu/people/jure/pubs/dpln-kdd08.pdf that 
models the DPLN distribution as in equation 8. I implemented this in R but 
cannot get the same curve as in Figure 4. can you please check if my code below 
is correct: e.g. is the use of pnorm() correct here?
 
ddlpn - function(x){
  a=2.8
  b=0.01
  v=0.45
  t=6.5
  j - (a * b /(a+b))
  
  norm1-pnorm((log(x)-v-(a*t^2))/t)
  expo1- a*v+(a^2*t^2/2)
  
  z-exp(expo1)*(x^(-a-1))*(norm1)
  
  norm2-pnorm((log(x)-v+(b*t^2))/t)
 expo2- -b*t+(b^2*t^2/2)

  y- x^(b-1)*exp(expo2)*(1-norm2)  # 1-norm is the complementary CDF of N(0,1)
  j*(z+y)
}
**
Bander Alzahrani, Teacher Assistant
Information Systems Department
Faculty of Computing  Information Technology
King Abdulaziz University

 *



 From: d...@vims.edu
 To: cs_2...@hotmail.com
 CC: r-h...@stat.math.ethz.ch
 Subject: Re: [R] Double Pareto Log Normal Distribution
 Date: Tue, 12 Nov 2013 16:51:22 +
 
 
 http://www.math.uvic.ca/faculty/reed/dPlN.3.pdf is the original ref and has 
 the equations.
 
 library(VGAM) for *pareto() and library(stats) for *lnorm() should get you 
 most of the way there.
 
 On Nov 12, 2013, at 10:47 AM, b. alzahrani cs_2...@hotmail.com
  wrote:
 
  Hi guys
  I would like to generate random number Double Pareto Log Normal 
  Distribution (DPLN). does anyone know how to do this in R or if there is 
  any built-in function.
  
  Thanks
  
  **
  Bander 
  *
  
  

  [[alternative HTML version deleted]]
  
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 --
 Dr. David Forrest
 d...@vims.edu
 
 
 
 
 
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Setting x-axis of a plot on each loop

2013-11-13 Thread David Winsemius


On Nov 13, 2013, at 12:24 AM, Baro wrote:

 I have a block of code:
 
 window-5
 start-3
 n-1
 
 seq1 - seq(1:40)
 mat-matrix(seq1,40)
 
 
 while(1+window=length(mat[,1]))
 {
  kd-matrix(as.integer(mat[n:(n+window-1),1]))
  Sys.sleep(0.2)
 
 plot(kd,col=blue,xlab=Rohdaten,ylab=values,xlim=c(start+n,start+n+window-1)
 )
 
  n-n+1
 }
 
 I have this expectation, that on each loop two x-axis and y-axis are
 changed and see the values on the plot. but I cant see the value.What
 should I do to have values too?. If I am changing this my code to
 
 plot(kd,col=blue,xlab=Rohdaten,ylab=values)
 
 I can see the values but on the x-axis I have no the correct values

You see nothing (except perhaps in the first few plots where I suspect points 
are being plotted within the plot area) because by offdering only one unnamed 
numeric  argument to plot you implicitly use 1:5 as your x value in all plots. 
You say you have qn expectation that the and y values are changing but you 
don't change the x-values in your code. You have not described what you are 
trying to do so giving further advice is not possible. I take that back; 
further advice:  It's always a good idea to name your arguments.

 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] What is the difference between Mean Decrease Accuracy produced by importance(foo) vs foo$importance in a Random Forest Model?

2013-11-13 Thread Lopez, Dan

Hi R Expert Community,

My question: What is the difference between Mean Decrease Accuracy produced by 
importance(foo) vs foo$importance in a Random Forest Model?

I ran a Random Forest classification model where the classifier is binary. I 
stored the model in object FOREST_model. I than ran importance(FOREST_model) 
and FOREST_model$importance. I usually use the prior but decided to learn more 
about what is in summary(randomForest ) so I ran the latter. I expected both to 
produce identical output. Mean Decrease Gini is the only thing that is 
identical in both.

I looked at ? Random Forest and Package 'randomForest' documentation and didn't 
find any info explaining this difference.

I am not including a reproducible example because this is most likely 
something, perhaps simple, such as one  is divided by something (if so, what?), 
that I am just not aware of.


importance(FOREST_model)

 HC  TER MeanDecreaseAccuracy MeanDecreaseGini
APPT_TYP_CD_LL0.16025157 -0.521041660   0.1567029712.793624
ORG_NAM_LL0.20886631 -0.952057325   0.20208393   107.137049
NEW_DISCIPLINE0.20685079 -0.960719435   0.2007676286.495063


FOREST_model$importance


  HC   TER MeanDecreaseAccuracy MeanDecreaseGini

APPT_TYP_CD_LL0.0049473962 -3.727629e-03 0.0045949805
12.793624

ORG_NAM_LL0.0090715845 -2.401016e-02 0.0077298067   
107.137049

NEW_DISCIPLINE0.0130672572 -2.656671e-02 0.0114583178
86.495063

Dan Lopez
LLNL, HRIM, Workforce Analytics  Metrics


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] codenls

2013-11-13 Thread Prof J C Nash (U30A)

The expression has b[1] and b[2] while start has b[2] and b[3].

The expression needs a different form, for example:

#   fit-nlrob(y ~ x1 / (1+ b[1]*x2^b[2]),data = xx, start =
#   list(b[2],b[3]))
   fit-nlrob(y ~ x1 / (1+ b1*x2^b2),data = xx, start =
   list(b1=b[2],b2=b[3]))

This works, though I have no idea if the results make sense.

JN

On 13-11-13 06:00 AM, r-help-requ...@r-project.org wrote:
 Message: 6
 Date: Tue, 12 Nov 2013 06:03:02 -0800 (PST)
 From: IZHAK shabsogh ishaqb...@yahoo.com
 To: r-help@r-project.org r-help@r-project.org
 Subject: [R] codenls
 Message-ID:
   1384264982.50617.yahoomail...@web142502.mail.bf1.yahoo.com
 Content-Type: text/plain
 
  kindly help correct this program as given below i run and run is given me 
 some error
 
 rm(list=ls())
  require(stats)
  require(robustbase) 
  x1-as.matrix(c(5.548,4.896,1.964,3.586,3.824,3.111,3.607,3.557,2.989))
  y-as.matrix(c(2.590,3.770,1.270,1.445,3.290,0.930,1.600,1.250,3.450))
  x2-as.matrix(c(0.137,2.499,0.419,1.699,0.605,0.677,0.159,1.699,0.340))
  k-rep(1,9)
  x-data.frame(k,x1,x2)
  xx-data.frame(y,x1,x2)
  
  
  freg-function(y,x1,x2){
reg- lm(y ~ x1 + x2 , data=x)
   
  return(reg)
  }
  
  fit-freg(y,x1,x2)
  b-as.matrix((coef(fit)))
  
 
  f-function(b,x1,x2){
fit-nlrob(y ~ x1 / (1+ b[1]*x2^b[2]),data = xx, start =
list(b[2],b[3]))
return(fit)
  }
  fit1-f(b,x1,x2)
 Error in nlrob(y ~ x1/(1 + b[1] * x2^b[2]), data = xx, start = list(b[2],  : 
   'start' must be fully named (list or numeric vector)
   [[alternative HTML version deleted]]
 
 
 
 --
 
 Message: 7
 Date: Tue, 12 Nov 2013 06:34:44 -0800 (PST)
 From: Alaios ala...@yahoo.com
 To: R-help@r-project.org R-help@r-project.org
 Subject: [R] Handle Gps coordinates
 Message-ID:
   1384266884.5102.yahoomail...@web125305.mail.ne1.yahoo.com
 Content-Type: text/plain
 
 Dear all,
 I would like to ask you if there are any gps libraries.
 
 
 I would like to be able to handle them,
 -like calculate distances in meters between gps locations,
 -or find which gps location is closer to a list of gps locations.
 
 Is there something like that in R?
 
 I would like to tthank you in advance for your reply
 
 Regards
 Alex
   [[alternative HTML version deleted]]
 
 
 
 --
 
 Message: 8
 Date: Tue, 12 Nov 2013 06:59:53 -0800
 From: Baro babak...@gmail.com
 To: R help r-help@r-project.org
 Subject: [R] Having a relative x-axis in a plot
 Message-ID:
   CAF-JZQupPPoyGfd-1k0ztn7bHmFTxdXUjtMGHub9D3TYLh=2...@mail.gmail.com
 Content-Type: text/plain
 
 I would like to have a relative x-axis in r. I am reading time seris from
 an excel file and I want to show in a plot and also I want to have a window
 which moves over the values.
 
 My aim ist to see which point belongs to which time(row number in excel
 file). i.e I am reading from 401th row in 1100th column in excel file and
 my window length is 256. here is my code:
 
 #which column in excel filespalte-1100
 #start row and end row
 start-401end-600
 
 window-256
 n-1
 
 wb - loadWorkbook(D:\\MA\\excel_mix_1100.xls)
 dat -readWorksheet(wb, sheet=getSheets(wb)[1], startRow=start,
 endRow=end, startCol=spalte, endCol=spalte,header=FALSE)
 datalist-dat[seq(1, length(dat[,1]), by = 2),]
 datalist-matrix(datalist)while(1+window=length(datalist)){
   kd-matrix(as.integer(datalist[n:(n+window-1),1]))
   Sys.sleep(0.2)
   
 plot(kd,type=l,col=blue,xlab=Rohdaten,ylab=values,xlim=c(start+n,start+n+window-1))
   n-n+1}
 
   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] ggplot2: geom_boxplot. Mapping aes factor but with different color scale and hatching

2013-11-13 Thread Anna Zakrisson Braeunlich

Dear all,

I have a geom_boxplot where I have the boxes colored according to a factor. The 
color scale given is very lovely and colorful, but I need it in grayscale. how 
do I change this?
I also am worried that the grayscale may be too similar (black/white is also 
not an option as in my real data I have much more than the two dummy factors I 
have put in the dummy data). Therefore, I would like to hatch the boxes also 
mapped according to the factors (see script and data below). How can I 
accomplish this?
Also, I have a problem that I though would be easy to solve over Stack 
Overflow, but turn out to be more complex. i would like to have one of my 
legend titles in italics. Sounds basic, but can't solve it somehow. I have 
marked in the script below where this is.

Many thank's for your help!

Dummydata:
mydata - data.frame(
  D15N = c(runif(100, min = -2), runif(100), runif(100, max = 2), runif(100)),
  Year = rep(c('2007', '2008'), each = 100),
  SampleType = rep(c('cyano', 'seston'), each = 200),
  Station = sample(c('B1', 'H2', 'H3', 'H4'), 400, replace = TRUE),
  Week = sample(c('19', '21', '23', '25'), 400, replace = TRUE))

library(ggplot2)

Summdata - ddply(mydata, .(Week, SampleType, Year, Station), summarise, median 
= median(D15N))

ggplot(Summdata, aes(x = Station, y = median)) +
  (mapping=aes(group=interaction(Station, SampleType)))+
  theme_bw() +
  geom_boxplot(mapping=aes(fill=SampleType), #I would like to have grayscale 
and hatched - how?
   outlier.shape=1,
   outlier.size=3)+
  theme(strip.background = element_blank())+
  ylab(expression(paste(,delta^{15}, N)))+
  xlab(Station) +
  scale_shape(solid = FALSE)+
  geom_hline(yintercept=0, linetype=3)+
  scale_fill_discrete(guide = guide_legend(my stuff),
  labels=c(cyanobacteria, seston)) #if I wanted to have 
seston in italics - how would I do that?


with kind regards

Anna Zakrisson
PhD student

Department of Ecology, Environment and Plant Sciences
Stockholm University
Svante Arrheniusv. 21A
SE-106 91 Stockholm
Sweden/Sverige

Lives in Berlin.
For paper mail:
Katzbachstr. 21
D-10965, Berlin - Kreuzberg
Germany/Deutschland

E-mail: anna.zakris...@su.se
Tel work: +49-(0)3091541281
Mobile: +49-(0)15777374888
LinkedIn: http://se.linkedin.com/pub/anna-zakrisson-braeunlich/33/5a2/51b

º`. .  `. . `. . º`. .  `. . `. .º`. .  `. . 
`. .º

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] .First in R.app (on Mac) vs R in Terminal

2013-11-13 Thread Ricardo Saporta

Hello,
I am noticing different behavior with respect to .First when working in
Terminal or working with the R gui on Mac OSX, and I am wondering if
someone might clarify for me the intention behind this.  In both cases, I
am running 3.0.2  on the same machine.

In my ~/.Rprofile I have a function .First( )
I then create a new function .First in my workspace, and save the image on
exit.

If I do this in Terminal, the .First from my saved image will execute,
while the .First from my .Rprofile will not.

If I do this in the R gui, BOTH .First functions execute. (obviously, if
the two functions are identical, the same function executes twice.)

It appears that there is a function .First in tools:RGUI (see below). The
one saved here is the .First loaded from .Rprofile.  Inspecting the two
reveals that the one in .GlobalEnv is the function manually created.

My questions are:
(1)  Is the intended behavior that a .First in a restored image will take
precedence over a .First in .Rprofile.  (I assume so, since according to
?Startup the image loads after the .Rprofile is sourced)
(2)  Why is the .First function from .Rprofile saved to tools:RGUI?
(3)  If a package contains a function .First, will both that function and
the user's .First (assuming it exists) be executed at startup?

Below are sessionInfo and some additional details:

Thank you,
Rick

Mac OSX 10.9
I created a new user.  In that users ~/.Rprofile, I have only the following
function
 ## .Rprofile
.First - function() {
cat(\n\nThis 'First' is from .Rprofile\n\n)
 }

In my workspace, I then create a different function .First
## .Rprofile
.First - function() {
 cat(\n\nThis 'First' is manually created in the workspace\n\n)
}
quit(yes)

Then re open R. Between execution in R.app  R in Terminal, I rm ~/.Rdata

# ---
THIS IS FROM R.app
 for (i in seq(search()))
 +  cat(sprintf(%20s : %17s, search()[[i]], paste(ls(pattern=\\.First,
all=TRUE, pos=i), collapse=, )), \n)

  .GlobalEnv :.First
  tools:RGUI :  .__RGUI__..First
   package:stats :
package:graphics :
   package:grDevices :
   package:utils :
package:datasets :
 package:methods :
   Autoloads :
package:base :.First.sys

THIS IS FROM TERMINAL
 for (i in seq(search()))
 +  cat(sprintf(%20s : %17s, search()[[i]], paste(ls(pattern=\\.First,
all=TRUE, pos=i), collapse=, )), \n)

  .GlobalEnv :.First
   package:stats :
package:graphics :
   package:grDevices :
   package:utils :
package:datasets :
 package:methods :
   Autoloads :
package:base :.First.sys



# -- SESSION INFO --- #
## TERMINAL
 sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

## R.app
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

# --  END --- #

Ricardo Saporta
Graduate Student, Data Analytics
Rutgers University, New Jersey
e: sapo...@rutgers.edu

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Fitting arbitrary curve to 1D data with error bars

2013-11-13 Thread Erkcan Özcan

Dear R experts,

I was wondering how I could do a fit (say minimum chi2) to a 1D data with error 
bars. I searched quite a lot on the web, found out about the fitdistr() 
function in MASS, etc., but none of the things I found gives what I am really 
looking for. Perhaps I do not exactly know the proper terminology and this is 
why I cannot find the solution. We could also probably describe it as 
parameter estimation using the minimum chi2 method using data with error bars 
and an arbitrary fitting function.

Anyway, in order not to confuse people with possibly incorrect terminology, let 
me show you what I am really looking for. Here is a page that shows what I want 
in Matlab (see the section on weighted fit):

http://www.mathworks.com/matlabcentral/fileexchange/10176-ezyfit-2-41/content/ezyfit/demo/html/efdemo.html#19

Could you please help?

Thanks in advance,
e.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Linux Java and R not working together

2013-11-13 Thread Uwe Bohne


   Dear community,

   I almost tried for 3 days now to install rJava and package FSelector on my
   Linux machine but couldn't do so.
   I  searched all the forums and tried some tricks ... but unfortunately
   couldn't find any solution.
   First of all I did try to install rJava in my rkward without any changes ,
   that gave me the following error :

   configure: error: One or more JNI types differ from the corresponding native
   type.  You  may need to use non-standard compiler flags or a different
   compiler in order to fix this.

   ERROR: configuration failed for package ‘rJava’

   * removing ‘/home/uwe/.rkward/library/rJava’

   Warnmeldung:

   In install.packages(pkgs = c(rJava), lib = /home/uwe/.rkward/library, :

   Installation des Pakets ‘rJava’ hatte Exit-Status ungleich 0


   So then I started searching and found a tip. I did run the following;



   uwe@linux-k2a8:~ sudo R CMD javareconf
   Java interpreter : /usr/bin/java
   Java version : 1.7.0_45
   Java home path   : /usr/lib/jdk1.7.0_45/jre
   Java compiler: /usr/bin/javac
   Java headers gen.: /usr/bin/javah
   Java archive tool: /usr/bin/jar
   NOTE: Your JVM has a bogus java.library.path system property!
 Trying a heuristic via sun.boot.library.path to find jvm library...
   Java library path: $(JAVA_HOME)/lib/i386/client
   JNI linker flags : -L$(JAVA_HOME)/lib/i386/client -ljvm
   JNI cpp flags: -I$(JAVA_HOME)/../include -I$(JAVA_HOME)/../include/linux
   Updating Java configuration in /usr/lib/R
   Done.



   And also did check 


   uwe@linux-k2a8:~ java -version
   java version 1.7.0_45
   Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
   Java HotSpot(TM) Server VM (build 24.45-b08, mixed mode)
   uwe@linux-k2a8:~ sudo /usr/sbin/update-alternatives --config java
   There are 4 choices for the alternative java (providing /usr/bin/java).
 SelectionPath Priority   Status
   
 0/usr/lib/jvm/jre-1.7.0-openjdk/bin/java   17147 auto mode
 1/usr/java/latest/bin/java 1 manual
   mode
   * 2/usr/lib/jdk_Oracle/bin/java  3 manual
   mode
 3/usr/lib/jvm/jre-1.5.0-gcj/bin/java   1500  manual
   mode
 4/usr/lib/jvm/jre-1.7.0-openjdk/bin/java   17147 manual
   mode
   Press enter to keep the current choice[*], or type selection number:



   I absolutely don't know what is wrong with that setup since my Java seems to
   have JDK and all the other things needed for R.

   For your Information here my system:
   uwe@linux-k2a8:~ uname -rm
   3.4.63-2.44-desktop i686

   uwe@linux-k2a8:~ cat /proc/version
   Linux version 3.4.63-2.44-desktop (geeko@buildhost) (gcc version 4.7.1
   20120723 [gcc-4_7-branch revision 189773] (S Linux) ) #1 SMP PREEMPT Wed Oct
   2 11:18:32 UTC 2013 (d91a619)



   I would be really thankful for any advise to get rJava (and FSelector)
   running soon.

   Best wishes

   Uwe
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Survival analysis with truncated data.

2013-11-13 Thread Nicolas Palix

Hi,

I would like to know how to handle truncated data.
My intend is to have the survival curve of a software fault in order
to have some information
about fault lifespan.

I have some observations of a software system between 2004 and 2010.
The system was first released in 1994.
The event considered is the disappearance of a software fault. The
faults can have been
introduced at any time, between 1994 and 2010. But for fault
introduced before 2004,
there is not mean to know their age.

I used the Surv and survfit functions with type interval2.
For the faults that are first observed in 2004, I set the lower bound
to the lifespan
observed between 2004 and 2010.

How could I set the upper bound ? Using 1994 as a starting point to not seems
to be meaningful. Neither is using only the lower bound.

Should I consider another survival estimator ?

Thanks in advance.
-- 
Nicolas Palix
Tel: +33 4 76 51 46 27
http://membres-liglab.imag.fr/palix/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] simplex

2013-11-13 Thread arun



Hi Sven,
May be this helps:

If you look into ?simplex.object 



solved This indicates whether the problem was solved.  A value of
  ‘-1’ indicates that no feasible solution could be found.  A
  value of ‘0’ that the maximum number of iterations was
  reached without termination of the second stage.  This may
  indicate an unbounded function or simply that more iterations
  are needed. A value of ‘1’ indicates that an optimal solution
  has been found.



 s1 - simplex(a=price * -1, A1=A, b1=b, A2=A, b2=b2, maxi=T) # This cant find 
a solution
s1$solved
#[1] -1

A.K.




Dear All, 

I do have a problem with a linear optimisation I am trying to achieve: 

library(boot) 
price - c(26.93, 26.23, 25.64, 25.97, 26.12, 26.18, 26.49, 27, 27.32, 27.93, 
27.72, 27.23) 
A - matrix(0, nrow=length(price)+1, ncol=length(price)) 
A[1,] - 1 
for(i in 1:length(price)) A[i+1,i] - 1 
b - c(864000.01, 288000, 288000, 288000, 288000, 288000, 288000, 288000, 
288000, 288000, 288000, 288000, 288000) 
b2 - c(864000, 216000, 216000, 216000, 216000, 216000, 216000, 216000, 216000, 
216000, 216000, 216000, 216000) 


simplex(a=price, A1=A, b1=b) # This works 
simplex(a=price, A1=A, b1=b, maxi=T) # This is the maxing function and works as 
well 
simplex(a=price * -1, A1=A, b1=b, A2=A, b2=b2, maxi=T) # This cant find a 
solution 


The result I would expect is that it picks the 4 
highest(note I multiplied price with -1) numbers and multiplies them 
with constraint in b2. 

For example this works: 


library(boot) 
price - c(1, 2, 3 , 4) 
A - matrix(0, nrow=length(price)+1, ncol=length(price)) 
A[1,] - 1 
for(i in 1:length(price)) A[i+1,i] - 1 
b - c(8.01, 2.5, 2.5, 2.5, 2.5) 
b2 - c(8, 2, 2, 2, 2) 

simplex(a=price, A1=A, b1=b) 
simplex(a=price, A1=A, b1=b, maxi=T) 
simplex(a=price * -1, A1=A, b1=b, A2=A, b2=b2, maxi=T) 


I cant see a logical difference between the two, why would it not find a 
solution for the first problem? 

Thank you. 


Sven

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Double Pareto Log Normal Distribution DPLN

2013-11-13 Thread David R Forrest


Looks like there are typos in equation 8 of 
http://cs.stanford.edu/people/jure/pubs/dpln-kdd08.pdf  (expo2 doesn't depend 
on 'v') or equation 9 of http://www.math.uvic.ca/faculty/reed/dPlN.3.pdf ('a' 
is not specified).

Dave

On Nov 13, 2013, at 11:43 AM, b. alzahrani cs_2...@hotmail.com
 wrote:

 
 Hi
 
 I found this paper http://cs.stanford.edu/people/jure/pubs/dpln-kdd08.pdf 
 that models the DPLN distribution as in equation 8. I implemented this in R 
 but cannot get the same curve as in Figure 4. can you please check if my code 
 below is correct: e.g. is the use of pnorm() correct here?
 
 ddlpn - function(x){
  a=2.8
  b=0.01
  v=0.45
  t=6.5
  j - (a * b /(a+b))
 
  norm1-pnorm((log(x)-v-(a*t^2))/t)
  expo1- a*v+(a^2*t^2/2)
 
  z-exp(expo1)*(x^(-a-1))*(norm1)
 
  norm2-pnorm((log(x)-v+(b*t^2))/t)
 expo2- -b*t+(b^2*t^2/2)
 
  y- x^(b-1)*exp(expo2)*(1-norm2)  # 1-norm is the complementary CDF of N(0,1)
  j*(z+y)
 }
 **
 Bander Alzahrani, Teacher Assistant
 Information Systems Department
 Faculty of Computing  Information Technology
 King Abdulaziz University
 
 *
 
 
 
 From: d...@vims.edu
 To: cs_2...@hotmail.com
 CC: r-h...@stat.math.ethz.ch
 Subject: Re: [R] Double Pareto Log Normal Distribution
 Date: Tue, 12 Nov 2013 16:51:22 +
 
 
 http://www.math.uvic.ca/faculty/reed/dPlN.3.pdf is the original ref and has 
 the equations.
 
 library(VGAM) for *pareto() and library(stats) for *lnorm() should get you 
 most of the way there.
 
 On Nov 12, 2013, at 10:47 AM, b. alzahrani cs_2...@hotmail.com
 wrote:
 
 Hi guys
 I would like to generate random number Double Pareto Log Normal 
 Distribution (DPLN). does anyone know how to do this in R or if there is 
 any built-in function.
 
 Thanks
 
 **
 Bander 
 *
 
 
   
 [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 --
 Dr. David Forrest
 d...@vims.edu
 
 
 
 
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

--
Dr. David Forrest
d...@vims.edu




signature.asc
Description: Message signed with OpenPGP using GPGMail
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R Beginner - Need Perhaps 5 - 10 Minutes of R User Time to Learn Few Basics

2013-11-13 Thread Zach Feinstein

I have finally decided that I will learn R and learn it very well. For now I am 
using a program that a friend of mine developed to do some advanced statistical 
analyses. I downloaded RStudio to my machine. [Perhaps RStudio is not the best 
platform to work from -  I have heard that Rattle is sort of the new standard.] 
I have so far been able to highlight the rows of the code that I wish to run, 
but then I somehow turned off seeing the output. I also cannot find where I 
would locate the output window. Yes, frustrated.

Would any kind soul be interested in helping kickstart my R learning? I have 
JoinMe installed on my machine so I figure we can do it interactively. It 
should not take more than a few minutes. I am already very experienced with 
both C and VBA languages as well as SPSS syntax so there is not much need to 
worry about me being too much of a novice.

Thank you very much in advance.

Zach Feinstein
zfeinst...@isgmn.commailto:zfeinst...@isgmn.com
(952) 277-0162
(612) 590-4813 (mobile)


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] kernlab multicore usage

2013-11-13 Thread Liam Roche

I am hoping there are other users of the kernlab package in R who will be able 
to solve a puzzle. In the past, I've used the relevance vector machine engine 
(rvm) in kernlab and was pleased to see it use all four cores on my PC (running 
Windows 8).
But now it only runs on one core and I can't figure out what changed. I've 
tried switching R versions from 3.0.2 to 2.15.3 and back and installing 
packages relating to multicore use, but haven't managed to get more than one 
core running any more.
Anyone know what is necessary?
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Readjusting frequencies

2013-11-13 Thread arun

Hi Katherine,

Check if this works:
fraud_data = data.frame(no_of_frauds = c(1, 2, 4, 6, 7, 9, 10), frequency = 
c(3, 1, 7, 11, 13, 1, 4))
fraud_data1 - rbind(fraud_data,c(11,10))

fun1 - function(dat) {
 sum1 - 0
 for(i in 1:nrow(dat)){
 sum1 - sum1 + dat[i, frequency]
 dat[i,frequency] - sum1
 if(sum1 5){
 sum1 - 0
 }
 }
indx - which(dat$frequency 5)
if(max(indx)==nrow(dat))
{res - dat[dat$frequency 5,]
}
else{
dat[max(indx),frequency] - sum(dat[c(max(indx),nrow(dat)),frequency])
res - dat[dat$frequency  5,]
}
res
}
fun1(fraud_data)
#  no_of_frauds frequency
#3    4    11
#4    6    11
#5    7    18
 fun1(fraud_data1)
#  no_of_frauds frequency
#3    4    11
#4    6    11
#5    7    13
#8   11    15


A.K.

On Tuesday, November 12, 2013 5:12 AM, Katherine Gobin 
katherine_go...@yahoo.com wrote:

Dear Mr Arun,

Hi!
 Sorry to bother you. Yesterday I had raised one query to the forum but 
haven't received any response. If the time permits, will it be possible 
for you to at least have a look at it?


I have
following data.frame as

fraud_data
= data.frame(no_of_frauds = c(1, 2, 4, 6, 7, 9, 10), frequency = c(3, 1, 7, 11,
13, 1, 4))


fraud_data

no_of_frauds frequency
1
           1         3
2
           2         1
3
           4         7
4
           6        11
5
           7        13
6
           9         1
7
          10         4

I need
to regroup the data in such a way that if the frequency is less than 5, the
corresponding class data gets merged to next class (or at times with previous 
class too) i.e. the frequencies get
added till the added frequencies exceed 5. 

Thus, in above data.frame
since frequencies pertaining to no_of_frauds 1 and 2 are 3 and 1 respectively,
these get added to class 4 and the frequency of this class now becomes 3+1+7 =
11. Likewise, frequency of classes 9 and 10 are 1 and 4 and when these are
added still it is 5 i.e. doesn't exceed 5. Thus, these should get added to the
previous class i.e. 7.

Thus I
need to have

no_of_frauds
      frequency

      4                
  11            #  ( 3 + 1 + 7)

      6                
  11           

      7                
  18            #  (13 + 1 + 4)


If possible, pl go through it. Sorry to bother you.

Regards

Katherine


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R Beginner - Need Perhaps 5 - 10 Minutes of R User Time to Learn Few Basics

2013-11-13 Thread Bert Gunter

Type the following into your console window:

install.packages(fortunes)
library(fortunes)
fortune(brain surgery)

(It's not quite apposite, but close enough).

Cheers,
Bert


On Wed, Nov 13, 2013 at 5:57 AM, Zach Feinstein zfeinst...@isgmn.com wrote:
 I have finally decided that I will learn R and learn it very well. For now I 
 am using a program that a friend of mine developed to do some advanced 
 statistical analyses. I downloaded RStudio to my machine. [Perhaps RStudio is 
 not the best platform to work from -  I have heard that Rattle is sort of the 
 new standard.] I have so far been able to highlight the rows of the code that 
 I wish to run, but then I somehow turned off seeing the output. I also cannot 
 find where I would locate the output window. Yes, frustrated.

 Would any kind soul be interested in helping kickstart my R learning? I have 
 JoinMe installed on my machine so I figure we can do it interactively. It 
 should not take more than a few minutes. I am already very experienced with 
 both C and VBA languages as well as SPSS syntax so there is not much need to 
 worry about me being too much of a novice.

 Thank you very much in advance.

 Zach Feinstein
 zfeinst...@isgmn.commailto:zfeinst...@isgmn.com
 (952) 277-0162
 (612) 590-4813 (mobile)


 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R Beginner - Need Perhaps 5 - 10 Minutes of R User Time to Learn Few Basics

2013-11-13 Thread Mitchell Maltenfort

Bert:

 Yet another reason I'm a fan of Frank Harrell. Does anyone know when
I get to buy the next edition of Regression Modeling Strategies?

Zach: Check www.coursera.org. They have some nice R-centric classes.
I signed up myself since my own R skills are self-taught.  Also
consider investing in some of the excellent R texts available such as
The R Book or The R Primer





Ersatzistician and Chutzpahthologist
I can answer any question.  I don't know is an answer. I don't know
yet is a better answer.


On Wed, Nov 13, 2013 at 2:08 PM, Bert Gunter gunter.ber...@gene.com wrote:
 Type the following into your console window:

 install.packages(fortunes)
 library(fortunes)
 fortune(brain surgery)

 (It's not quite apposite, but close enough).

 Cheers,
 Bert


 On Wed, Nov 13, 2013 at 5:57 AM, Zach Feinstein zfeinst...@isgmn.com wrote:
 I have finally decided that I will learn R and learn it very well. For now I 
 am using a program that a friend of mine developed to do some advanced 
 statistical analyses. I downloaded RStudio to my machine. [Perhaps RStudio 
 is not the best platform to work from -  I have heard that Rattle is sort of 
 the new standard.] I have so far been able to highlight the rows of the code 
 that I wish to run, but then I somehow turned off seeing the output. I also 
 cannot find where I would locate the output window. Yes, frustrated.

 Would any kind soul be interested in helping kickstart my R learning? I have 
 JoinMe installed on my machine so I figure we can do it interactively. It 
 should not take more than a few minutes. I am already very experienced with 
 both C and VBA languages as well as SPSS syntax so there is not much need to 
 worry about me being too much of a novice.

 Thank you very much in advance.

 Zach Feinstein
 zfeinst...@isgmn.commailto:zfeinst...@isgmn.com
 (952) 277-0162
 (612) 590-4813 (mobile)


 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --

 Bert Gunter
 Genentech Nonclinical Biostatistics

 (650) 467-7374

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Double Pareto Log Normal Distribution DPLN

2013-11-13 Thread David R Forrest

...Additionally...your set of parameters match none of the curves in figure 4.

I think the ordering of the parameters as listed on the graphs is different 
than in the text of the article.

The 'v' parameter controls the location of the 'elbow' and should be near 
log(x) in each graph, while the 't' parameter is the sharpness of the elbow.  
So eyeballing the elbows:

   log(c(80,150,300))= 4.382027 5.010635 5.703782

These appear to match positional parameter #4 in the legends, which you apply 
as parameter t in your function.  


# editing your function a bit:

ddpln - function(x, a=2.5, b=0.01, v=log(100), t=v/10){
  # Density of Double Pareto LogNormal distribution
  # from b. alzahrani cs_2...@hotmail.com email of 2013-11-13
  # per formula 8 from http://cs.stanford.edu/people/jure/pubs/dpln-kdd08.pdf

   c - (a * b /(a+b))

   norm1-pnorm((log(x)-v-(a*t^2))/t)
   norm2-pnorm((log(x)-v+(b*t^2))/t)
   expo1-  a*v+(a^2*t^2/2)
   expo2- -b*v+(b^2*t^2/2)  # edited from the paper's eqn 8  with: s/t/v/

   z- (x^(-a-1)) * exp(expo1)*(norm1)
   y- (x^(b-1))  * exp(expo2)*(1-norm2)  # 1-norm is the complementary CDF of 
N(0,1)

   c*(z+y)
}

x-10^seq(0,5,by=0.1) # 0-10k

plot(x,ddpln(x,a=2.8,b=.01,v=3.8,t=0.35),log='xy',type='l')  # Similar to fig 4 
left.

plot(x,ddpln(x,a=2.5,b=.01,v=log(2)),log='xy',type='l')
plot(x,ddpln(x,a=2.5,b=.01,v=log(10)),log='xy',type='l')
plot(x,ddpln(x,a=2.5,b=.01,v=log(100)),log='xy',type='l')
plot(x,ddpln(x,a=2.5,b=.01,v=log(1000)),log='xy',type='l')
plot(x,ddpln(x,a=2.5,b=.01,v=log(1)),log='xy',type='l')

Dave

On Nov 13, 2013, at 11:43 AM, b. alzahrani cs_2...@hotmail.com
 wrote:

 
 Hi
 
 I found this paper http://cs.stanford.edu/people/jure/pubs/dpln-kdd08.pdf 
 that models the DPLN distribution as in equation 8. I implemented this in R 
 but cannot get the same curve as in Figure 4. can you please check if my code 
 below is correct: e.g. is the use of pnorm() correct here?
 
 ddlpn - function(x){
  a=2.8
  b=0.01
  v=0.45
  t=6.5
  j - (a * b /(a+b))
 
  norm1-pnorm((log(x)-v-(a*t^2))/t)
  expo1- a*v+(a^2*t^2/2)
 
  z-exp(expo1)*(x^(-a-1))*(norm1)
 
  norm2-pnorm((log(x)-v+(b*t^2))/t)
 expo2- -b*t+(b^2*t^2/2)
 
  y- x^(b-1)*exp(expo2)*(1-norm2)  # 1-norm is the complementary CDF of N(0,1)
  j*(z+y)
 }
 **
 Bander Alzahrani, Teacher Assistant
 Information Systems Department
 Faculty of Computing  Information Technology
 King Abdulaziz University
 
 *
 
 
 
 From: d...@vims.edu
 To: cs_2...@hotmail.com
 CC: r-h...@stat.math.ethz.ch
 Subject: Re: [R] Double Pareto Log Normal Distribution
 Date: Tue, 12 Nov 2013 16:51:22 +
 
 
 http://www.math.uvic.ca/faculty/reed/dPlN.3.pdf is the original ref and has 
 the equations.
 
 library(VGAM) for *pareto() and library(stats) for *lnorm() should get you 
 most of the way there.
 
 On Nov 12, 2013, at 10:47 AM, b. alzahrani cs_2...@hotmail.com
 wrote:
 
 Hi guys
 I would like to generate random number Double Pareto Log Normal 
 Distribution (DPLN). does anyone know how to do this in R or if there is 
 any built-in function.
 
 Thanks
 
 **
 Bander 
 *
 
 
   
 [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 --
 Dr. David Forrest
 d...@vims.edu
 
 
 
 
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

--
Dr. David Forrest
d...@vims.edu






signature.asc
Description: Message signed with OpenPGP using GPGMail
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to sum a function over a specific range in R?

2013-11-13 Thread Adams, Jean

On Tue, Nov 12, 2013 at 11:45 AM, umair durrani umairdurr...@outlook.comwrote:

 I am new to R and have already posted this question on stack overflow. The
 problem is that I did not understand the answers as the R documentation
 about the discussed functions (e.g. 'convolve') is quite complicated for a
 newbie like me. Here's the question:
 I have a big text file with more than 3 million rows. The following is the
 example of the three columns I want to use:
 indxvehID   LocalY
 1   2   35.381
 2   2   39.381
 3   2   43.381
 4   2   47.38
 5   2   51.381
 6   2   55.381
 7   2   59.381
 8   2   63.379
 9   2   67.383
 10  2   71.398
 where,indx = IndexvehID = Vehicle ID (Here only '2' is shown but infact
 there are 2169 vehicle IDs and each one repeats several times because the
 data was collected at every 0.1 seconds)LocalY = The y coordinate of the
 vehicle at a particular time (The time column is not shown here)
 What I want to do is to create a new column of 'SmoothedY' using the
 following formula:
 SmoothedY = 1/Z * Summation from (i-15) to (i+15) (LocalY *
 exp(-abs(i-k))/5))
 where,i = indxZ = Summation from (k =i-15) to (k = i+15) (
 exp(-abs(i-k))/5))
 How can I apply this formula to create the new column 'SmoothedY'? This is
 actually a data smoothing problem but default smoothing algorithms in R are
 not suitable for my data and I have to use this custom formula.
 Thanks in advance.

 Umair Durrani


I have never tried this myself, but it appears as if you can define your
own smoothing function using Simon Wood's mgcv package.  Check out
http://www.maths.bath.ac.uk/~sw283/talks/snw-R-talk.pdf for more
information.

Jean

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Update a variable in a dataframe based on variables in another dataframe of a different size

2013-11-13 Thread Lopez, Dan

This is a great solution! Love the conciseness of your solution. And easy to 
understand.

Thanks again.
Dan


-Original Message-
From: arun [mailto:smartpink...@yahoo.com] 
Sent: Monday, November 11, 2013 6:31 PM
To: Lopez, Dan
Subject: Re: [R] Update a variable in a dataframe based on variables in another 
dataframe of a different size

Hi,
You could use:
H_DF[match(with(T_DF,paste(FY,ID,sep=_)), 
with(H_DF,paste(FY,ID,sep=_))),3]- TER
A.K.




On Monday, November 11, 2013 8:51 PM, Lopez, Dan lopez...@llnl.gov wrote:
Below is how I am currently doing this. Is there a more efficient way to do 
this?
The scenario is that I have two dataframes of different sizes. I need to update 
one binary factor variable in one of those dataframes by matching on two 
variables. If there is no match keep as is otherwise update. Also the variable 
being update, TT in this case should remain a binary factor variable 
(levels='HC','TER')

HTDF2-merge(H_DF,T_DF,by=c(FY,ID),all.x=T)
HTDF2$TT-factor(ifelse(is.na(HTDF2$TT.y),HTDF2$TT.x,HTDF2$TT.y),labels=c(HC,TER))
HTDF2-HTDF2[,-(3:4)]


# REPRODUCIBLE EXAMPLE DATA FOR ABOVE..
 dput(H_DF)
structure(list(FY = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 5L), .Label = 
c(FY09, FY10, FY11, FY12, FY13), class = factor),
    ID = c(1, 1, 1, 1, 2, 2, 2, 2, 2), TT = structure(c(1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c(HC, TER), class = factor)), 
.Names = c(FY, ID, TT), class = data.frame, row.names = c(1L, 2L, 3L, 
4L, 6L, 7L, 9L, 10L, 11L))
 dput(T_DF)
structure(list(FY = structure(c(4L, 2L, 5L), .Label = c(FY09, FY10, FY11, 
FY12, FY13), class = factor), ID = c(1, 2, 2), TT = structure(c(2L, 2L, 
2L), .Label = c(HC, TER), class = factor)), .Names = c(FY, ID, TT), 
row.names = c(5L, 8L, 12L), class = data.frame)

Dan Lopez
LLNL, HRIM - Workforce Analytics  Metrics

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Real frequencies in a 'set' problem

2013-11-13 Thread Stefan Petersson

I have three media channels where I can push public health information (tv,
radio and newspaper). Any given citizen can be touched by the information
from one, two or three channels (let's ignore for the moment that citizens
might miss the information all together). Hence these sets:

 library(sets)

 tv - set(1, 4, 7, 6)
 radio - set(6, 7, 5, 2)
 newspaper - set(4, 7, 5, 3)

Now I want to know how many citizens that newspapers bring to the complete
body of informed citizens. E.g. how many people gets the information ONLY
from a newspaper:

 (tv | radio | t) - (tv | radio)

But how do I solve this if I know that

220 people is reached via tv
349 people is reached vi radio
384 people is reached via newspaper

97 people is reached via tv and radio
173 people is reached via radio and newspaper
134 people is reached via newspaper and tv

It's the grey area of the (dummy) plot below that I would like to put a
frequency in

 library(VennDiagram)

 draw.triple.venn(area1 = 10,
   area2 = 10,
   area3 = 10,
   n12 = 3,
   n13 = 3,
   n23 = 3,
   n123 = 1,
   category = c('TV', 'Radio', 'Newspaper'),
   fill = c('white', 'white', 'grey'),
   alpha = c(1, 1, 0)
 )

I don't understand how the frequencies that I know is supposed to be used
in the 'sets' notation. Or am I completely misunderstanding something here?
Wouldn't be the first time if I did...

TIA

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] making a barplot with table of experimental conditions underneath (preferably ggplot2)

2013-11-13 Thread Jim Lemon


On 11/14/2013 01:24 AM, n.hub...@ncmls.ru.nl wrote:

Dear all,



my data looks the following:



df- data.frame (experiment=c(E1,E2,E3,E4), mean = c(3,4,5,6), stdev=c(0.1,0.1,0.05,0.2), method = c(STD,STD, FP, FP), enzyme =c 
(T,T/L,T,T/L), denaturation=c(U,U,0.05%RG, 0.1%RG))



I would like to make a bar plot with standard deviation which I solved the 
following way:



x- df$experiment
y- df$mean
sd- df$stdev



df.plot- qplot(x,y,xlab=, ylab=# peptides identified)+
   geom_bar(colour=black, fill=darkgrey)+
   geom_errorbar(aes(x=x, ymin=y-sd, ymax=y+sd), width=0.25)

df.plot



However, as the labels for the x-axis (the bars) I do not want the experiment 
number, as now, but instead a table containing the other columns of my 
data.frame (method, enzyme, denaturation) with the description in the front and 
the certain 'value' below the bars.



I am looking forward to your suggestions!




Hi Nina,
This isn't in ggplot2, but it might help:

library(plotrix)
plotbg-rect(0,0,5,7,col=\lightgray\);grid(col=\white\,lty=1)
exp_con-paste(df$method,df$enzyme,df$denaturation,sep=\n)
barp(df$mean,names.arg=rep(,4),col=darkgray,
 ylab=# peptides identified,do.first=plotbg)
mtext(exp_con,side=1,at=1:4,line=3)
dispersion(1:4,df$mean,df$stdev)

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Real frequencies in a 'set' problem

2013-11-13 Thread David Winsemius


On Nov 13, 2013, at 1:35 PM, Stefan Petersson wrote:

 I have three media channels where I can push public health information (tv,
 radio and newspaper). Any given citizen can be touched by the information
 from one, two or three channels (let's ignore for the moment that citizens
 might miss the information all together). Hence these sets:
 
 library(sets)
 
 tv - set(1, 4, 7, 6)
 radio - set(6, 7, 5, 2)
 newspaper - set(4, 7, 5, 3)
 
 Now I want to know how many citizens that newspapers bring to the complete
 body of informed citizens. E.g. how many people gets the information ONLY
 from a newspaper:
 
 (tv | radio | t) - (tv | radio)
 
 But how do I solve this if I know that
 
 220 people is reached via tv
 349 people is reached vi radio
 384 people is reached via newspaper
 
 97 people is reached via tv and radio
 173 people is reached via radio and newspaper
 134 people is reached via newspaper and tv
 
 It's the grey area of the (dummy) plot below that I would like to put a
 frequency in
 
 library(VennDiagram)
 
 draw.triple.venn(area1 = 10,
   area2 = 10,
   area3 = 10,
   n12 = 3,
   n13 = 3,
   n23 = 3,
   n123 = 1,
   category = c('TV', 'Radio', 'Newspaper'),
   fill = c('white', 'white', 'grey'),
   alpha = c(1, 1, 0)
 )
 
 I don't understand how the frequencies that I know is supposed to be used
 in the 'sets' notation. Or am I completely misunderstanding something here?
 Wouldn't be the first time if I did...
 

Is there any reason for us to think this isn't homework? Have you read the 
Posting Guide?


 TIA
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to sum a function over a specific range in R?

2013-11-13 Thread Bert Gunter

?filter

perhaps.

-- Bert

On Wed, Nov 13, 2013 at 1:10 PM, Adams, Jean jvad...@usgs.gov wrote:
 On Tue, Nov 12, 2013 at 11:45 AM, umair durrani 
 umairdurr...@outlook.comwrote:

 I am new to R and have already posted this question on stack overflow. The
 problem is that I did not understand the answers as the R documentation
 about the discussed functions (e.g. 'convolve') is quite complicated for a
 newbie like me. Here's the question:
 I have a big text file with more than 3 million rows. The following is the
 example of the three columns I want to use:
 indxvehID   LocalY
 1   2   35.381
 2   2   39.381
 3   2   43.381
 4   2   47.38
 5   2   51.381
 6   2   55.381
 7   2   59.381
 8   2   63.379
 9   2   67.383
 10  2   71.398
 where,indx = IndexvehID = Vehicle ID (Here only '2' is shown but infact
 there are 2169 vehicle IDs and each one repeats several times because the
 data was collected at every 0.1 seconds)LocalY = The y coordinate of the
 vehicle at a particular time (The time column is not shown here)
 What I want to do is to create a new column of 'SmoothedY' using the
 following formula:
 SmoothedY = 1/Z * Summation from (i-15) to (i+15) (LocalY *
 exp(-abs(i-k))/5))
 where,i = indxZ = Summation from (k =i-15) to (k = i+15) (
 exp(-abs(i-k))/5))
 How can I apply this formula to create the new column 'SmoothedY'? This is
 actually a data smoothing problem but default smoothing algorithms in R are
 not suitable for my data and I have to use this custom formula.
 Thanks in advance.

 Umair Durrani


 I have never tried this myself, but it appears as if you can define your
 own smoothing function using Simon Wood's mgcv package.  Check out
 http://www.maths.bath.ac.uk/~sw283/talks/snw-R-talk.pdf for more
 information.

 Jean

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] On ^ returning a matrix when operated on a data.frame

2013-11-13 Thread Arunkumar Srinivasan

Dear R-users, 

I am wondering why ^ operator alone returns a matrix, when operated on a 
data.frame (as opposed to all other arithmetic operators). Here's an example:

DF - data.frame(x=1:5, y=6:10)
class(DF*DF) # [1] data.frame
class(DF^2) # [1] matrix

I posted here on SO: 
http://stackoverflow.com/questions/19964897/why-does-on-a-data-frame-return-a-matrix-instead-of-a-data-frame-like-do
 and got a very nice answer - it happens because a matrix is returned (obvious 
by looking at `Ops.data.frame`). However, what I'd like to understand is, *why* 
a matrix is returned for ^ alone? Here's an excerpt from Ops.data.frame 
(Thanks to Neal Fultz):

if (.Generic %in% c(+, -, *, /, %%, %/%)) {
names(value) - cn
data.frame(value, row.names = rn, check.names = FALSE, 
check.rows = FALSE)
}
else matrix(unlist(value, recursive = FALSE, use.names = FALSE), 
nrow = nr, dimnames = list(rn, cn))


It's clear that a matrix will be returned unless `.Generic` is one of those 
arithmetic operators. My question therefore is, is there any particular reason 
why ^ operator is being missed in the if-statement here? I can't think of a 
reason where this would break. Also ?`^` doesn't seem to mention anything about 
this coercion.

Please let me know if I should be posting this to R-devel list instead.

Thank you very much,
Arun


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] (sin asunto)

2013-11-13 Thread Elisa Frutos Bernal

Hi!

I need to print a graph that I have in a window. Previously I used:

try(win.print(), silent = TRUE)
if (geterrmessage() != Error in win.print() : unable to start
device devWindows\n) {
plotFunctiond(screen = FALSE)
dev.off()

but now it is not possible.
Can you help me?


-- 
Elisa.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] volume of ellipsoid

2013-11-13 Thread Yuanzhi Li


Hi, everyone!

I have a matrix X(n*p) which is n samples from a p-dimensional normal  
distribution. Now I want to make the ellipsoid containing (1-α)% of  
the probability in the distribution based on Mahalanobis distance:


μ:(x-μ)'Σ^(-1)(x-μ)≤χ2p(α)

where x and Σ is the mean and variance-covariance matrix of X. Are  
there some functions in R which can plot the ellipsoid and calculate  
the volume (area for p=2, hypervolume for p3) of the ellipsoid? I  
only find the ellipsoidhull function in cluster package, but it  
deal with ellipsoid hull containing X, which obvious not the one I want.


After that, a more difficult is the following. If we have a series of  
matrix X1, X2, X3,… Xm which follow different p-dimensional normal  
distributions. And we make m ellipsoids (E1, E2, … Em) for each matrix  
like the before. How can we calculate the volume of union of the m  
ellipsoids? One possible solution for this problem is the  
inclusion-exclusion principle:


V(⋃Ei)(1≤i≤m)=
V(Ei)(1≤i≤m)-∑V(Ei⋂Ej)(1≤ij≤m)+∑V(Ei⋂Ej⋂Ek)(1≤ijk≤m)+(-1)^(m-1)V(E1⋂…⋂Em)

But the problem is how to calculate the volume of intersection between  
2, 3 or more ellipsoids. Are there some functions which can calculate  
the volume of intersection between two region or functions which  
directly calculate the volume of a union of two region(the region here  
is ellipsoid). OR yo you have any good ideas solving this problem in  
R? Thank you all in advance!


Yuanzhi

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] On ^ returning a matrix when operated on a data.frame

2013-11-13 Thread Duncan Murdoch


On 13-11-13 6:00 PM, Arunkumar Srinivasan wrote:

Dear R-users,

I am wondering why ^ operator alone returns a matrix, when operated on a 
data.frame (as opposed to all other arithmetic operators). Here's an example:

DF - data.frame(x=1:5, y=6:10)
class(DF*DF) # [1] data.frame
class(DF^2) # [1] matrix

I posted here on SO: 
http://stackoverflow.com/questions/19964897/why-does-on-a-data-frame-return-a-matrix-instead-of-a-data-frame-like-do
 and got a very nice answer - it happens because a matrix is returned (obvious by looking 
at `Ops.data.frame`). However, what I'd like to understand is, *why* a matrix is returned 
for ^ alone? Here's an excerpt from Ops.data.frame (Thanks to Neal Fultz):

if (.Generic %in% c(+, -, *, /, %%, %/%)) {
 names(value) - cn
 data.frame(value, row.names = rn, check.names = FALSE,
 check.rows = FALSE)
}
else matrix(unlist(value, recursive = FALSE, use.names = FALSE),
 nrow = nr, dimnames = list(rn, cn))


It's clear that a matrix will be returned unless `.Generic` is one of those arithmetic 
operators. My question therefore is, is there any particular reason why ^ 
operator is being missed in the if-statement here? I can't think of a reason where this 
would break. Also ?`^` doesn't seem to mention anything about this coercion.


It's not just ^ that is missing, the logical relations like , ==, etc 
also return matrices.  This is very old code (I think from 1999), but I 
would guess that the reason is that the ^ and  operators always return 
values of a single type (numeric and logical respectively), whereas the 
other operators can take mixed type inputs and return mixed type outputs.


Duncan Murdoch


Please let me know if I should be posting this to R-devel list instead.

Thank you very much,
Arun


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] On ^ returning a matrix when operated on a data.frame

2013-11-13 Thread Arunkumar Srinivasan

Duncan, 
Thank you. What I meant was that ^ is the only *arithmetic operator* to 
result in a matrix on operating in a data.frame. I understand it's quite old 
code. Also, your explanation makes sense, with the exception of / operator, I 
suppose (I could be wrong here). 

Arun


On Thursday, November 14, 2013 at 12:32 AM, Duncan Murdoch wrote:

 
 It's not just ^ that is missing, the logical relations like , ==, etc 
 also return matrices. This is very old code (I think from 1999), but I 
 would guess that the reason is that the ^ and  operators always return 
 values of a single type (numeric and logical respectively), whereas the 
 other operators can take mixed type inputs and return mixed type outputs.
 
 Duncan Murdoch
 
  Please let me know if I should be posting this to R-devel list instead.
  
  Thank you very much,
  Arun
  
  
  [[alternative HTML version deleted]]
  
  __
  R-help@r-project.org (mailto:R-help@r-project.org) mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
  
 
 
 



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] On ^ returning a matrix when operated on a data.frame

2013-11-13 Thread Duncan Murdoch


On 13-11-13 6:52 PM, Arunkumar Srinivasan wrote:

Duncan,
Thank you. What I meant was that ^ is the only *arithmetic operator*
to result in a matrix on operating in a data.frame. I understand it's
quite old code. Also, your explanation makes sense, with the exception
of / operator, I suppose (I could be wrong here).


You're right, /, %/% and %% also return consistent types.  So my 
explanation is wrong.  The NEWS entry for this change appears to be in 
the 0.63 release,


o   Ops.data.frame :  things like  d.fr  anow return a matrix

That doesn't give much of a hint for why ^ is handled differently than 
/.


Duncan Murdoch


Arun

On Thursday, November 14, 2013 at 12:32 AM, Duncan Murdoch wrote:



It's not just ^ that is missing, the logical relations like , ==, etc
also return matrices. This is very old code (I think from 1999), but I
would guess that the reason is that the ^ and  operators always return
values of a single type (numeric and logical respectively), whereas the
other operators can take mixed type inputs and return mixed type outputs.

Duncan Murdoch


Please let me know if I should be posting this to R-devel list instead.

Thank you very much,
Arun


[[alternative HTML version deleted]]

__
R-help@r-project.org mailto:R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Update a variable in a dataframe based on variables in another dataframe of a different size

2013-11-13 Thread MacQueen, Don

Dan,

Gabor's solution is of course good, but here's a solution that uses only
base R capabilities, and doesn't sort as merge() does. Essentially the
same as A.K.'s, but slightly more general.

tmp1 - match( paste(T_DF$FY,T_DF$ID) , paste(H_DF$FY,H_DF$ID) )
H_DF$TT[tmp1] - T_DF$TT

gg - sqldf(select FY, ID, coalesce(t.TT, h.TT) TT from H_DF h left join
T_DF t using(FY, ID))

 for (nm in names(H_DF)) print(all.equal(H_DF[[nm]], gg[[nm]]))
[1] TRUE
[1] TRUE
[1] TRUE


It could be made into a one-liner.It would probably break if TT doesn't
have the same factor levels in both H_DF and T_DF.

As an aside, I suspect that nowadays match() is generally
under-appreciated among R users as a whole.

-Don


-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 11/11/13 5:20 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:

On Mon, Nov 11, 2013 at 8:04 PM, Lopez, Dan lopez...@llnl.gov wrote:
 Below is how I am currently doing this. Is there a more efficient way
to do this?
 The scenario is that I have two dataframes of different sizes. I need
to update one binary factor variable in one of those dataframes by
matching on two variables. If there is no match keep as is otherwise
update. Also the variable being update, TT in this case should remain a
binary factor variable (levels='HC','TER')

 HTDF2-merge(H_DF,T_DF,by=c(FY,ID),all.x=T)
 
HTDF2$TT-factor(ifelse(is.na(HTDF2$TT.y),HTDF2$TT.x,HTDF2$TT.y),labels=c
(HC,TER))
 HTDF2-HTDF2[,-(3:4)]


 # REPRODUCIBLE EXAMPLE DATA FOR ABOVE..
 dput(H_DF)
 structure(list(FY = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L,
 5L), .Label = c(FY09, FY10, FY11, FY12, FY13), class =
factor),
 ID = c(1, 1, 1, 1, 2, 2, 2, 2, 2), TT = structure(c(1L, 1L,
 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c(HC, TER), class =
factor)), .Names = c(FY,
 ID, TT), class = data.frame, row.names = c(1L, 2L, 3L,
 4L, 6L, 7L, 9L, 10L, 11L))
 dput(T_DF)
 structure(list(FY = structure(c(4L, 2L, 5L), .Label = c(FY09,
 FY10, FY11, FY12, FY13), class = factor), ID = c(1,
 2, 2), TT = structure(c(2L, 2L, 2L), .Label = c(HC, TER), class =
factor)), .Names = c(FY,
 ID, TT), row.names = c(5L, 8L, 12L), class = data.frame)


Here is an sqldf solution:

 library(sqldf)
 sqldf(select FY, ID, coalesce(t.TT, h.TT) TT from H_DF h left join
T_DF t using(FY, ID))
FY ID  TT
1 FY09  1  HC
2 FY10  1  HC
3 FY11  1  HC
4 FY12  1 TER
5 FY09  2  HC
6 FY10  2 TER
7 FY11  2  HC
8 FY12  2  HC
9 FY13  2 TER


-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Analysis

2013-11-13 Thread Jim Silverton

Hi,



I have 187 urine cultures which were subjected to culture and microscopy
methods. Video were used to 'verify' the findings. Culture is considered
the gold method. But Microscopy is another method which may be cheaper. I
checked the videos to determine whether bacteria was growing on both the
culture and microscopy readings to ' verify' what was really happening.

My question is this...How can I show that Microscopy is superior?
Is there a special test available?



 *Microscopy**Positive* *Negative**Culture Positive* *Video
Positive* *47* *5* *52*  Video Negative 0 *2* *2*  *Culture Negative* Video
Positive  18 *0* *18*  *Video Negative* *5* *110* *115*  *70* *117*
*187*









-- 
Thanks,
Jim.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] oblique.scores argument for fa function in psych package

2013-11-13 Thread Mark Seeto

Dear R help,

I'm using version 1.3.10.12 of the psych package. The help page for fa
says that when calculating factor scores, oblique.scores=FALSE causes
the pattern matrix to be used, and oblique.scores=TRUE causes the
structure matrix to be used. However, when I try it, it seems to be
the other way around.

Example:

###
library(psych)

set.seed(1)

n - 100

Y - data.frame(y1 = rnorm(n),
y2 = rnorm(n),
y3 = rnorm(n),
y4 = rnorm(n),
y5 = rnorm(n),
y6 = rnorm(n))

fa1 - fa(Y, nfactors=2, rotate=oblimin, fm=minres, oblique.scores=FALSE)
fa2 - fa(Y, nfactors=2, rotate=oblimin, fm=minres, oblique.scores=TRUE)

fa1$scores[1:3, ]
fa2$scores[1:3, ]

(scale(Y) %*% solve(cor(Y)) %*% fa1$loadings)[1:3, ]   # not the same
as fa1$scores[1:3, ]
(scale(Y) %*% solve(cor(Y)) %*% fa1$Structure)[1:3, ]  # same as
fa1$scores[1:3, ]

(scale(Y) %*% solve(cor(Y)) %*% fa2$loadings)[1:3, ]   # same as
fa2$scores[1:3, ]
(scale(Y) %*% solve(cor(Y)) %*% fa2$Structure)[1:3, ]  # not the same
as fa2$scores[1:3, ]

###

Have I misunderstood something?

Thanks,
Mark Seeto

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fitting arbitrary curve to 1D data with error bars

2013-11-13 Thread Suzen, Mehmet

If you are after adding error bars in a scatter plot; one example is
given below :

#some example data
set.seed(42)
df - data.frame(x = rep(1:10,each=5), y = rnorm(50))

#calculate mean, min and max for each x-value
library(plyr)
df2 - ddply(df,.(x),function(df)
c(mean=mean(df$y),min=min(df$y),max=max(df$y)))

#plot error bars
library(Hmisc)
with(df2,errbar(x,mean,max,min))
grid(nx=NA,ny=NULL)

(From: http://stackoverflow.com/questions/13032777/scatter-plot-with-error-bars)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Analysis

2013-11-13 Thread Ranjan Maitra

Don't know if the data will support the result that you want but the
only way to get some sort of answer is to hire a statistician.

ranjan

On Wed, 13 Nov 2013 20:58:30 -0400 Jim Silverton
jim.silver...@gmail.com wrote:

 Hi,
 
 
 
 I have 187 urine cultures which were subjected to culture and microscopy
 methods. Video were used to 'verify' the findings. Culture is considered
 the gold method. But Microscopy is another method which may be cheaper. I
 checked the videos to determine whether bacteria was growing on both the
 culture and microscopy readings to ' verify' what was really happening.
 
 My question is this...How can I show that Microscopy is superior?
 Is there a special test available?
 
 
 
  *Microscopy**Positive* *Negative**Culture Positive* *Video
 Positive* *47* *5* *52*  Video Negative 0 *2* *2*  *Culture Negative* Video
 Positive  18 *0* *18*  *Video Negative* *5* *110* *115*  *70* *117*
 *187*
 
 
 
 
 
 
 
 
 
 -- 
 Thanks,
 Jim.
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


-- 
Important Notice: This mailbox is ignored: e-mails are set to be
deleted on receipt. Please respond to the mailing list if appropriate.
For those needing to send personal or professional e-mail, please use
appropriate addresses.


GET FREE SMILEYS FOR YOUR IM  EMAIL - Learn more at 
http://www.inbox.com/smileys
Works with AIM®, MSN® Messenger, Yahoo!® Messenger, ICQ®, Google Talk™ and most 
webmails

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Update a variable in a dataframe based on variables in another dataframe of a different size

2013-11-13 Thread arun

Hi Don,

In cases like:
H_DF - H_DF[-4,]
tmp1 - match( paste(T_DF$FY,T_DF$ID) , paste(H_DF$FY,H_DF$ID) )

 H_DF$TT[tmp1] - T_DF$TT 
#Error in `[-.factor`(`*tmp*`, tmp1, value = c(2L, 2L, 2L)) : 

Probably, this works:
H_DF$TT[tmp1[!is.na(tmp1)]] - unique(T_DF$TT) 
A.K.




On Wednesday, November 13, 2013 7:59 PM, MacQueen, Don macque...@llnl.gov 
wrote:
Dan,

Gabor's solution is of course good, but here's a solution that uses only
base R capabilities, and doesn't sort as merge() does. Essentially the
same as A.K.'s, but slightly more general.

tmp1 - match( paste(T_DF$FY,T_DF$ID) , paste(H_DF$FY,H_DF$ID) )
H_DF$TT[tmp1] - T_DF$TT

gg - sqldf(select FY, ID, coalesce(t.TT, h.TT) TT from H_DF h left join
T_DF t using(FY, ID))

 for (nm in names(H_DF)) print(all.equal(H_DF[[nm]], gg[[nm]]))
[1] TRUE
[1] TRUE
[1] TRUE


It could be made into a one-liner.It would probably break if TT doesn't
have the same factor levels in both H_DF and T_DF.

As an aside, I suspect that nowadays match() is generally
under-appreciated among R users as a whole.

-Don


-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 11/11/13 5:20 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:

On Mon, Nov 11, 2013 at 8:04 PM, Lopez, Dan lopez...@llnl.gov wrote:
 Below is how I am currently doing this. Is there a more efficient way
to do this?
 The scenario is that I have two dataframes of different sizes. I need
to update one binary factor variable in one of those dataframes by
matching on two variables. If there is no match keep as is otherwise
update. Also the variable being update, TT in this case should remain a
binary factor variable (levels='HC','TER')

 HTDF2-merge(H_DF,T_DF,by=c(FY,ID),all.x=T)
 
HTDF2$TT-factor(ifelse(is.na(HTDF2$TT.y),HTDF2$TT.x,HTDF2$TT.y),labels=c
(HC,TER))
 HTDF2-HTDF2[,-(3:4)]


 # REPRODUCIBLE EXAMPLE DATA FOR ABOVE..
 dput(H_DF)
 structure(list(FY = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L,
 5L), .Label = c(FY09, FY10, FY11, FY12, FY13), class =
factor),
     ID = c(1, 1, 1, 1, 2, 2, 2, 2, 2), TT = structure(c(1L, 1L,
     1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c(HC, TER), class =
factor)), .Names = c(FY,
 ID, TT), class = data.frame, row.names = c(1L, 2L, 3L,
 4L, 6L, 7L, 9L, 10L, 11L))
 dput(T_DF)
 structure(list(FY = structure(c(4L, 2L, 5L), .Label = c(FY09,
 FY10, FY11, FY12, FY13), class = factor), ID = c(1,
 2, 2), TT = structure(c(2L, 2L, 2L), .Label = c(HC, TER), class =
factor)), .Names = c(FY,
 ID, TT), row.names = c(5L, 8L, 12L), class = data.frame)


Here is an sqldf solution:

 library(sqldf)
 sqldf(select FY, ID, coalesce(t.TT, h.TT) TT from H_DF h left join
T_DF t using(FY, ID))
    FY ID  TT
1 FY09  1  HC
2 FY10  1  HC
3 FY11  1  HC
4 FY12  1 TER
5 FY09  2  HC
6 FY10  2 TER
7 FY11  2  HC
8 FY12  2  HC
9 FY13  2 TER


-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] MM estmator

2013-11-13 Thread IZHAK shabsogh

hi

I have a nonlinear regression model with two parameter, i have estimated using 
nls in R
and i want to also find the estimate using MM, someone refer me to this 
function nlrob
but this is main for only M estimate. can you please help me findout  a 
function in R for MM nonlinear regression

thanks

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Replace NA's with value in the next row

2013-11-13 Thread dila radi

Hi all,

I have a data set which treat missing value as NA and now I need to replace
all these NA's by using number in the same row but different column.

Here is the part of my data:
 V1 V2 V3 V4 V5 V6 V7  0 0 0 1.2 0 0 0.259  0 0 12.8 0 23.7 0 8.495  6 0
81.7 0.2 0 20 19.937  0 1.5 60.9 0 0 15.5 13.900  1 13 56.8 17.5 32.8 6.4
27.654  4 3 66.4 2 0.3 NA 17.145


I want to replace (V6, 6) with (V7, 6). I have about 1000 NA's in V6 which
I want to replace  with the same row in V7. The other values in V6, I want
to keep remain the same.

How to achieve this? Assuming my data is called Targetstation,  I have
tried this:

Targetstation - within(Targetstation, v6 - replace(v6, is.na(v6), v7[is.na
(v6)]))

But R gives me this:

Warning messages:

1: In is.na(v6) : is.na() applied to non-(list or vector) of type 'NULL'

2: In is.na(v6) : is.na() applied to non-(list or vector) of type 'NULL'


How to solve this?

Thank you in advance.

Regards,

Dila.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] MM estmator

2013-11-13 Thread Simon Zehnder

Check the gmm package with a weighting matrix equal to the identity.

Best

Simon
 
On 14 Nov 2013, at 04:37, IZHAK shabsogh ishaqb...@yahoo.com wrote:

 hi
 
 I have a nonlinear regression model with two parameter, i have estimated 
 using nls in R
 and i want to also find the estimate using MM, someone refer me to this 
 function nlrob
 but this is main for only M estimate. can you please help me findout  a 
 function in R for MM nonlinear regression
 
 thanks
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

52 matches

Mail list logo