Re: [R] trouble with Rcmdr

2009-08-19 Thread Liviu Andronic
Hello,

On 8/18/09, Erin Hodgess erinm.hodg...@gmail.com wrote:
  I used it on Saturday, and it worked fine.

  When I used it today, the window did not appear.

  This is version 2.9.1 on Ubuntu Jaunty Jackalope.

Did you make any system updates between Sat and today?
Liviu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plot(x,y)

2009-08-19 Thread milton ruser
Hi malcolm,

Sorry, but your post is still of poor quality, because it is not
reproducible, is it?
By the way,

gavin.simp...@ucl.ac.uk wrote:
You really shouldn't call your data 'data', as it will get very
confusing when you start using modelling functions and the data
argument, or using inbuilt data sets and thus the data() function.
For a funny aside on this, do the following:
> install.packages("fortunes")
> require(fortunes)
Loading required package: fortunes
> fortune("dog")
Firstly, don't call your matrix 'matrix'. Would you call your dog 'dog'?
Anyway, it might clash with the function 'matrix'.
  -- Barry Rowlingson
 R-help (October 2004)


So do not name your dataset object "data" :-)

cheers

milton

On Sun, Aug 16, 2009 at 3:19 PM, malcolm Crouch
malcolm.crouc...@gmail.com wrote:

 My apologies for the poor quality of the last post.


 Query


  plot(V6, V5, col="red")
 or
  plot(V6, V5)

 the graph is attached.

 Data

 V5   V6
 2063  0.009179148 -8117667.885
 2064   0.03854349 -8117963.045
 2065 -0.018345998 -8117980.935
 2066  0.023662906 -8118013.245
 2067  0.011029293 -8118253.705
 2068   0.01772285 -8118287.025
 2069  0.013767122 -8118297.385
 2070  0.005579858 -8118832.465
 2071 -0.011432977 -8118905.025
 2072  0.017675178 -8118990.045
 2073  0.037850056 -8118993.905
 2074 -0.002187153 -8119290.345
 2075   0.03605708 -8119671.645
 2076  0.029300877 -8119714.665
 2077 -0.008533494 -8119716.585
 2078  0.016006132 -8119989.885
 2079  0.005745295 -8120185.745
 2080   0.00444089 -8120193.425
 2081 -0.006769723 -8120293.745
 2082 -0.058917022 -8121208.925
 2083  0.016650977 -8121585.625
 2084  0.013880279 -8123869.985
 2085  0.036072047 -8124184.185
 2086  0.030211163 -8124396.485
 2087  0.036530434 -8124771.525
 2088  0.039814047 -8125542.285
 2089  0.039217756 -8126208.525
 2090  0.044641797 -8126532.865
 2091 -0.007088835 -8126578.105
 2092  0.043141658 -8126678.505
 2093  0.053877516 -8131283.565
 2094  0.055036843 -8143173.245
 2095  0.128109562 -8160652.125
 V1  V2  V3  V4
 2074  SHIELD MARKETING         C AND C INDEPENDANTS  REDISTRIBUTION
 2075  ACKERMANS                NON FOOD RETAIL       CONVENIENCE
 2076  PEP                      NON FOOD RETAIL       CONVENIENCE
 2077  ACKERMANS                NON FOOD RETAIL       CONVENIENCE
 2078  PEP                      NON FOOD RETAIL       CONVENIENCE
 2079  ACKERMANS                NON FOOD RETAIL       CONVENIENCE
 2080  ACKERMANS                NON FOOD RETAIL       CONVENIENCE
 2081  SHIELD MARKETING         C AND C INDEPENDANTS  REDISTRIBUTION
 2082  ACKERMANS                NON FOOD RETAIL       CONVENIENCE
 2083  ACKERMANS                NON FOOD RETAIL       CONVENIENCE
 2084  ACKERMANS                NON FOOD RETAIL       CONVENIENCE
 2085  ACKERMANS                NON FOOD RETAIL       CONVENIENCE
 2086  INDEPENDENT CAFES        IMPULSE               CONVENIENCE
 2087  ACKERMANS                NON FOOD RETAIL       CONVENIENCE
 2088  SHIELD MARKETING         C AND C INDEPENDANTS  REDISTRIBUTION
 2089  SHIELD MARKETING         C AND C INDEPENDANTS  REDISTRIBUTION
 2090  ACKERMANS                NON FOOD RETAIL       CONVENIENCE
 2091  ACKERMANS                NON FOOD RETAIL       CONVENIENCE
 2092  ACKERMANS                NON FOOD RETAIL       CONVENIENCE
 2093  INDEPENDENT WHOLESALERS  C AND C INDEPENDANTS  REDISTRIBUTION
 2094  SHOPRITE                 GROCERY               RETAIL
 2095  OK FRANCHISE DIVISION    GROCERY TOP UP        RETAIL


 Regards

 Malcolm

 On Sun, Aug 16, 2009 at 8:47 PM, malcolm Crouch
 malcolm.crouc...@gmail.com wrote:

  Hi,
 
  I am using the plot function for some data, and the plot is coming back
  pure black, with scales on the side.
 
  Regards
 
  Malcolm
 
 
 


 --
 Malcolm A.B Croucher







Re: [R] ggplot 2 semi-transparency error

2009-08-19 Thread Chris Friedl

I really don't know anything much about this, but I tried device=pdf
(no quotes) and I got a pdf file with transparent plots.

?dev.print gives help, with some links to other devices

hth


rajesh j wrote:
 
 Hi,
 I used the command ggplot as follows...
 p <- ggplot(a, aes(x=a$V1, colour=a$V2, fill=a$V2))
 p + geom_density(alpha=0.2, xlim=c(-10,10), ylim=c(0,0.5))
 
 when I say,
 dev.print(device=postscript, file="/alpha/dct.pdf")
 I get
 
 Warning message:
 In grid.Call.graphics(L_polygon, x$x, x$y,
 list(as.integer(1L:length(x$x :
   semi-transparency is not supported on this device: reported only once
 per
 page
 
 the pdf has an error.
 Any help?
 -- 
 Rajesh.J
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/ggplot-2-semi-transparency-error-tp25038236p25038580.html
Sent from the R help mailing list archive at Nabble.com.



[R] histogram scaling

2009-08-19 Thread rajesh j
Hi,

I'm drawing two histograms in the same plot. However, my point of comparison
is the difference in their x coordinates. But my problem is that one histogram
is much taller than the other. How can I get them both to the same height?

-- 
Rajesh.J

[[alternative HTML version deleted]]



Re: [R] histogram scaling

2009-08-19 Thread Chris Friedl

Hey, Rajesh

A bit light on detail here; being a mind reader is not an R-help
prerequisite. However, since I have been working on histograms today and
you've just posted a question using ggplot, let me guess that it's ggplot you
are referring to. Here is an example, which you can find in my post a
few posts previous to yours. I've added scale_ commands to restrict the x
and y scales. Actually, in this example both histograms get plotted on the
same y-scale automatically, so that command can be removed.

Check out the ggplot online reference http://had.co.nz/ggplot2/ and the
book http://cran.r-project.org/web/packages/ggplot2/ggplot2.pdf

hth


require(ggplot2)
x <- data.frame(value=rnorm(5000, mean=0), case="A")
y <- data.frame(value=rnorm(5000, mean=3), case="B")
xy <- rbind(x, y)
ggplot(xy, aes(x=value, fill=case, group=case)) + geom_histogram(alpha=0.5,
binwidth=0.1, position="identity") + scale_x_continuous(limits=c(-2,3)) +
scale_y_continuous(limits=c(0,250))




rajesh j wrote:
 
 Hi,
 
 I'm drawing two histograms in the same plot. However, my point of comparison
 is the difference in their x coordinates. But my problem is one histogram is
 much taller than the other. How can I get them both to the same height?
 
 -- 
 Rajesh.J
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/histogram-scaling-tp25038602p25038798.html
Sent from the R help mailing list archive at Nabble.com.



[R] Why are there small circles in my plot

2009-08-19 Thread Mao Jianfeng
Dear R-listers,

Here are my data and the code I used to create a plot. I want to know why
there are two small circles in the upper right and lower left of the
plot, respectively. Could you please share your experience or advice
with me?

# dummy data
factor <- rep(c("Alice", "Jone", "Mike"), each=100)
factor <- factor(factor)
traits <- c(rnorm(100, mean=1, sd=1), rnorm(100, mean=3, sd=3),
rnorm(100, mean=6, sd=6))
myda <- data.frame(factor, traits)

# my plot
plot(c(min(myda$traits), max(myda$traits)), c(-0.03, 0.5), xlab='State',
ylab='ylab')
lines(density(myda$traits[factor==c("Alice")]), lwd=2, col=2)
lines(density(myda$traits[factor==c("Jone")]), lwd=2, col=3)
lines(density(myda$traits[factor==c("Mike")]), lwd=2, col=4)
points(myda$traits[factor==c("Alice")], rep(-0.01,
length(myda$traits[factor==c("Alice")])), pch="|", col=2)
points(myda$traits[factor==c("Jone")], rep(-0.02,
length(myda$traits[factor==c("Jone")])), pch="|", col=3)
points(myda$traits[factor==c("Mike")], rep(-0.03,
length(myda$traits[factor==c("Mike")])), pch="|", col=4)


Best Regards,

Yours,
Mao J-F



[R] Odp: Remove columns

2009-08-19 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org napsal dne 18.08.2009 10:14:26:

 Hi everybody,
 
 Could somebody help me?
 
 I need to remove the columns where the sum of its components is equal to
 zero.
 
 For example
 
  a <- matrix(c(0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,1,1,0,0,1,0), ncol=4)
  a
       [,1] [,2] [,3] [,4]
  [1,]    0    0    0    1
  [2,]    0    1    0    1
  [3,]    0    0    0    0
  [4,]    0    1    0    0
  [5,]    0    0    0    1
  [6,]    0    0    0    0
 
 Columns 1 and 3 should be removed
 
 the result should be the following matrix
 
       [,2] [,4]
  [1,]    0    1
  [2,]    1    1
  [3,]    0    0
  [4,]    1    0
  [5,]    0    1
  [6,]    0    0


a[,!colSums(a)==0]

Beware of == and finite precision of floating point numbers (see FAQ)
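The same idea with the tolerance the FAQ warns about might be sketched like this; the tolerance 1e-8 is an arbitrary illustration, and drop=FALSE keeps the result a matrix even if only one column survives:

```r
a <- matrix(c(0,0,0,0,0,0, 0,1,0,1,0,0, 0,0,0,0,0,0, 1,1,0,0,1,0), ncol = 4)

# Exact comparison is fine for integer 0/1 data:
a[, colSums(a) != 0, drop = FALSE]

# With floating point data, compare against a small tolerance instead
# of testing equality with zero:
b <- a * 0.1
b[, abs(colSums(b)) > 1e-8, drop = FALSE]
```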

Regards
Petr


 
 Thanks again
 
 
 -- 
 Alberto Lora Michiels
 Rue du Progrès, 6B
 7860 Lessines
 GSM 32(0)496659457
 
 



Re: [R] moving color key in heatmap

2009-08-19 Thread Dieter Menne



Booboo Qu wrote:
 
 
 I have a question on moving color keys when side color bars are added to a
 heatmap.
 The R code below produces the color key in the upper left corner. Notice I
 have added side bars to the heatmap, but how could I move the color key
 below the image? 
 

Thanks for the self-contained example. heatmap internally uses layout (see
the docs) to arrange the plot, and its parameters can be controlled by lwid
and lhei. So give the following a try; it makes the histogram and the key the
same size. However, this does not help in putting the key below the image,
because everything is plotted in an array arrangement, as you can see from
the example. You would have to change the code to put the key in a third
row; see the region around layout in the heatmap function, but it is not
easy.

res <- heatmap.3(x, RowSideColors=rc, ColSideColors=cc, lwid=c(1.5,1.5),
  lhei=c(1.5,1.5))



-- 
View this message in context: 
http://www.nabble.com/moving-color-key-in-heatmap-tp25036103p25038995.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] ggplot 2 semi-transparency error

2009-08-19 Thread Dieter Menne



rajesh j wrote:
 
 I used the command ggplot as follows...
 p <- ggplot(a, aes(x=a$V1, colour=a$V2, fill=a$V2))
 p + geom_density(alpha=0.2, xlim=c(-10,10), ylim=c(0,0.5))
 
 when I say,
 dev.print(device=postscript, file="/alpha/dct.pdf")
 I get
 
 Warning message:
 In grid.Call.graphics(L_polygon, x$x, x$y,
 list(as.integer(1L:length(x$x :
   semi-transparency is not supported on this device: reported only once
 per
 page
 
 

It is a bit unconventional to expect a PDF when using the postscript
device; that might explain the error when you open it, because you created
a PostScript file.

My personal preference is not to use dev.print at all, but to open the
output device before doing the plot.

A self-contained example would be nice.

Dieter
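A minimal sketch of that workflow (open the device first, draw, then close it), with made-up data standing in for the original poster's data frame `a`; the file name "dct.pdf" is taken from the thread:

```r
library(ggplot2)

# Hypothetical stand-in for the poster's data
a <- data.frame(V1 = rnorm(100), V2 = factor(rep(c("x", "y"), 50)))
p <- ggplot(a, aes(x = V1, colour = V2, fill = V2)) +
  geom_density(alpha = 0.2)

pdf("dct.pdf")  # open the device before plotting; pdf() supports alpha
print(p)        # grid/ggplot objects must be print()ed explicitly in scripts
dev.off()       # close the device to finish writing the file
```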


-- 
View this message in context: 
http://www.nabble.com/ggplot-2-semi-transparency-error-tp25038236p25039037.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Why are there small circles in my plot

2009-08-19 Thread Stephan Kolassa

Hi,

Mao Jianfeng schrieb:

plot(c(min(myda$traits),max(myda$traits)),c(-0.03,0.5), xlab='State',
ylab='ylab')


Here, you are plotting two data points: (min(myda$traits),-0.03) and 
(max(myda$traits),0.5). Try this:


 plot(c(min(myda$traits),max(myda$traits)),c(-0.03,0.5), xlab='State',
 ylab='ylab',type='n')

HTH,
Stephan



Re: [R] Polygon function

2009-08-19 Thread Jim Lemon

Cetinyürek Aysun wrote:

Dear all,
I would like to plot a credible interval for a function estimate in R. I
would like to plot the credible intervals as a shaded region using the
polygon function. Has anyone ever used it? I tried several times but I
could not obtain the right figure.

xis <- sort(xi, decreasing=TRUE)

plot(xi, fm)
polygon(c(xi, xis), c(f05m, f95m))

The above piece of code produces something else.

  

Hi Aysun,
Have a look at the dispersion function in the plotrix package.

Jim
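The usual base-graphics polygon() idiom for a shaded interval band might look like the sketch below, with made-up data for xi, fm, f05m and f95m; the key is to traverse the x values forward along the lower bound and backward along the upper bound, so the polygon closes cleanly:

```r
xi   <- sort(runif(50, 0, 10))   # sorted x values (ascending)
fm   <- sin(xi)                  # hypothetical point estimate
f05m <- fm - 0.3                 # hypothetical lower credible bound
f95m <- fm + 0.3                 # hypothetical upper credible bound

plot(xi, fm, type = "n", ylim = range(f05m, f95m))
# Outline: x forward with the lower bound, then x reversed with the upper
polygon(c(xi, rev(xi)), c(f05m, rev(f95m)), col = "grey80", border = NA)
lines(xi, fm, lwd = 2)           # draw the estimate on top of the band
```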



[R] Performance measure for probabilistic predictions

2009-08-19 Thread Noah Silverman

Hello,

I'm using an SVM to fit a predictive model, but I'm most interested in the 
probability output.  This is easy enough to calculate.


My challenge is how to measure the relative performance of the SVM for 
different settings/parameters/etc.


An AUC curve comes to mind, but I'm NOT interested in predicting true vs 
false.  I am interested in finding the most accurate probability 
predictions possible.


I've seen some literature where the probability range is cut into 
segments and then the predicted probability is compared to the actual.  
This looks nice, but I need a more tangible numeric measure.  One 
thought was a measure of probability accuracy for each range, but how 
to calculate this.


Any thoughts?

-N
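[One numeric summary that fits the description above is the Brier score (the mean squared difference between predicted probabilities and 0/1 outcomes), alongside the binned calibration comparison mentioned in the question. A sketch with entirely made-up predictions, not from the thread:]

```r
set.seed(1)
outcome <- rbinom(100, 1, 0.5)  # hypothetical true 0/1 labels
pred    <- runif(100)           # hypothetical predicted probabilities

# Brier score: lower means more accurate probability predictions
brier <- mean((pred - outcome)^2)

# Binned calibration: cut the probability range into segments and compare
# the mean prediction with the observed event rate in each bin
bins <- cut(pred, breaks = seq(0, 1, by = 0.2), include.lowest = TRUE)
data.frame(mean_pred = tapply(pred, bins, mean),
           obs_rate  = tapply(outcome, bins, mean))
```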



[R] how to fill the area under the density line with semitransparent colors

2009-08-19 Thread Mao Jianfeng
Dear R-listers,

I have created a plot to display the density lines for the same
variable for different entities. Now, I want to fill the area under the
density lines with semi-transparent colours.
Though I have searched the web and read books, I still have not managed
to do this.

Could anyone please give me any helps or advice? Thank you in advance.

The data and code I used listed below:

# dummy data
 factor <- rep(c("Alice", "Jone", "Mike"), each=100)
 factor <- factor(factor)
 traits <- c(rnorm(100, mean=1, sd=1), rnorm(100, mean=3, sd=3),
 rnorm(100, mean=6, sd=6))
 myda <- data.frame(factor, traits)
 
 # my plot
 plot(c(min(myda$traits), max(myda$traits)), c(-0.03, 0.5), xlab='State',
 ylab='ylab')
 lines(density(myda$traits[factor==c("Alice")]), lwd=2, col=2)
 lines(density(myda$traits[factor==c("Jone")]), lwd=2, col=3)
 lines(density(myda$traits[factor==c("Mike")]), lwd=2, col=4)
 points(myda$traits[factor==c("Alice")], rep(-0.01,
 length(myda$traits[factor==c("Alice")])), pch="|", col=2)
 points(myda$traits[factor==c("Jone")], rep(-0.02,
 length(myda$traits[factor==c("Jone")])), pch="|", col=3)
 points(myda$traits[factor==c("Mike")], rep(-0.03,
 length(myda$traits[factor==c("Mike")])), pch="|", col=4)

Regards,

Yours
Mao J-F



[R] Move legend text to the left legend box border

2009-08-19 Thread Sigbert Klinke
Hi,

in legend() I am colouring my text rather than using symbols or lines:

legend("bottomleft", txt, text.col=col, cex=0.7)

However, between the left legend box border and the text in txt is a
large empty space. Can I somehow move the text more to the left and get
also a smaller legend box?

Thanx in advance

Sigbert



Re: [R] ggplot 2 semi-transparency error

2009-08-19 Thread ONKELINX, Thierry
ggplot2 has its own built-in wrapper function for saving plots: ggsave.

p <- ggplot(a, aes(x=V1, colour=V2, fill=V2)) + geom_density(alpha =
0.2, xlim=c(-10,10), ylim=c(0,0.5))
ggsave(p, filename = "/alpha/dtc.pdf")

HTH,

Thierry



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
Namens rajesh j
Verzonden: woensdag 19 augustus 2009 7:47
Aan: r-help@r-project.org
Onderwerp: [R] ggplot 2 semi-transparency error

Hi,
I used the command ggplot as follows...
p <- ggplot(a, aes(x=a$V1, colour=a$V2, fill=a$V2))
p + geom_density(alpha=0.2, xlim=c(-10,10), ylim=c(0,0.5))

when I say,
dev.print(device=postscript, file="/alpha/dct.pdf")
I get

Warning message:
In grid.Call.graphics(L_polygon, x$x, x$y,
list(as.integer(1L:length(x$x :
  semi-transparency is not supported on this device: reported only once
per page

the pdf has an error.
Any help?
--
Rajesh.J



Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer 
en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is
door een geldig ondertekend document. The views expressed in  this message 
and any annex are purely those of the writer and may not be regarded as stating 
an official position of INBO, as long as the message is not confirmed by a duly 
signed document.



Re: [R] Move legend text to the left legend box border

2009-08-19 Thread Stefan Grosse
On Wed, 19 Aug 2009 10:20:03 +0200 Sigbert Klinke
sigb...@wiwi.hu-berlin.de wrote:

SK in legend() I am colouring my text rather than using symbols or lines:
SK 
SK legend("bottomleft", txt, text.col=col, cex=0.7)
SK 
SK However, between the left legend box border and the text in txt is a
SK large empty space. Can I somehow move the text more to the left and
SK get also a smaller legend box?

try
legend("bottomleft", txt, text.col=col, cex=0.7, inset=-0.05)
and play around with the inset value.

hth
Stefan



[R] ggplot2 transparent pdf

2009-08-19 Thread rajesh j
Hi,
I plotted a histogram using ggplot2 and saved it as a pdf. However, the
portions outside the histogram don't appear transparent when I use a
non-white background colour in my LaTeX document. What can I do to make
them transparent?

-- 
Rajesh.J




Re: [R] ggplot2 transparent pdf

2009-08-19 Thread baptiste auguie
Try this,

print(p+ opts(plot.background= theme_rect(fill=NA)))

HTH,

baptiste

2009/8/19 rajesh j akshay.raj...@gmail.com:
 Hi,
 I plotted a histogram using ggplot2 and saved it as a pdf. However, the
 portions outside the histogram don't appear transparent when I use a
 non-white background colour in my latex document. What can I do to make
 them transparent?

 --
 Rajesh.J






-- 
_

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

http://newton.ex.ac.uk/research/emag



Re: [R] lmer with random slopes for 2 or more first-level factors?

2009-08-19 Thread ONKELINX, Thierry
Dear Jason,

Both models have a correct syntax. Although I would write the last model
rather as lmer(DV ~ IV1 + IV2 + (1|Subject) + (IV1 - 1| Subject) + (IV2
- 1| Subject))

The only difference is indeed the correlation between the random
effects. The random effects in the model I wrote are all independent
(not correlated). In your first model all random effects are correlated.
In your second model the first random intercept is correlated with the
random slope of IV1; the second random intercept is correlated with the
random slope of IV2.

Depending on whether you want your interaction to be independent or not,
you can use either

lmer(DV ~ IV1 + IV2 + (IV1 * IV2 | Subject))
lmer(DV ~ IV1 + IV2 + (1 | Subject) + (IV1 -1 | Subject) + (IV2 - 1|
Subject) + (IV1:IV2 - 1| Subject))


Note that IV1 * IV2 is equivalent to 1 + IV1 + IV2 + IV1:IV2

HTH,

Thierry

PS R-Sig-mixed-models is a better list for this kind of question. 





-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
Namens Jason R. Finley
Verzonden: woensdag 19 augustus 2009 3:47
Aan: r-help@r-project.org
Onderwerp: [R] lmer with random slopes for 2 or more first-level
factors?

I have data from a design in which items are completely nested within
subjects.  Subject is the only second-level factor, but I have multiple
first-level factors (IVs).  Say there are 2 such independent variables
that I am interested in.  What is the proper syntax to fit a
mixed-effects model with a by-subject random intercept, and by-subject
random slopes for both IVs?

I can think of at least two possibilities:

lmer(DV ~ IV1 + IV2 + (1 + IV1 + IV2 | Subject))

lmer(DV ~ IV1 + IV2 + (1 + IV1 | Subject) + (1 + IV2 | Subject))

Or maybe there is some other way to do it?  Maybe the correct syntax
depends on whether the random effect of subjects on the intercept and
slopes are correlated or not?  (If so, how do I proceed?)

Finally, what would be the syntax if I wanted to include a random
subject effect for the INTERACTION of IV1 and IV2?

Thanks very much,
~jason

PS: additional search terms: multi-level linear model, MLM,
hierarchical, repeated measures

~~~
Jason R. Finley
Department of Psychology
University of Illinois, Urbana-Champaign
~~~





[R] Package read large file

2009-08-19 Thread Mohamed Lajnef

Dear R-Users,

I am looking for packages that can read large files into R. Any
suggestions are welcome.
Regards,
ML
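[Independent of any add-on package, base read.table() reads large files much faster when you pre-specify the column types and a row-count hint. A sketch with a hypothetical file "big.csv" containing three columns:]

```r
# Hypothetical file "big.csv" with columns: id, value, group
cols <- c("integer", "numeric", "character")
dat <- read.table("big.csv", header = TRUE, sep = ",",
                  colClasses = cols,   # skip type-guessing on every column
                  nrows = 1e6,         # upper bound lets R pre-allocate
                  comment.char = "")   # disable comment scanning
```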



Re: [R] how to fill the area under the density line with semitransparent colors

2009-08-19 Thread baptiste auguie
Hi,

One way using ggplot2,

library(ggplot2)

ggplot(data=myda, mapping=aes(x=traits, y=..density..)) +
stat_density(aes(fill=factor), alpha=0.5, col=NA, position = 'identity') +
stat_density(aes(colour = factor), geom="path", position = 'identity', size=2)
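[For completeness, a base-graphics sketch of the same effect using polygon() with semi-transparent rgb() colours, on made-up data; this assumes a device that supports alpha, such as pdf() or most on-screen devices:]

```r
set.seed(42)
traits <- c(rnorm(100, 1, 1), rnorm(100, 3, 3))
group  <- rep(c("Alice", "Jone"), each = 100)
cols   <- c(rgb(1, 0, 0, 0.3), rgb(0, 0, 1, 0.3))  # alpha = 0.3 fills

plot(range(traits), c(0, 0.5), type = "n", xlab = "State", ylab = "Density")
for (i in 1:2) {
  d <- density(traits[group == unique(group)[i]])
  polygon(d, col = cols[i], border = NA)  # fill the area under each curve
  lines(d, col = i + 1, lwd = 2)          # redraw the density line on top
}
```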

HTH,

baptiste


2009/8/19 Mao Jianfeng jianfeng@gmail.com:
 Dear R-listers,

 I have created a plot to display the density lines for the same
 variable by different entities. Now, I want to fill the area under the
 density lines with semitransparent colors.
 Though I have searched the web and read books, I still have not managed
 to do this.

 Could anyone please give me any helps or advice? Thank you in advance.

 The data and code I used listed below:

 # dummy data
 factor <- rep(c("Alice", "Jone", "Mike"), each=100)
 factor <- factor(factor)
 traits <- c(rnorm(100, mean=1, sd=1), rnorm(100, mean=3, sd=3),
 rnorm(100, mean=6, sd=6))
 myda <- data.frame(factor, traits)
 
 # my plot
 plot(c(min(myda$traits), max(myda$traits)), c(-0.03, 0.5), xlab='State',
 ylab='ylab')
 lines(density(myda$traits[factor==c("Alice")]), lwd=2, col=2)
 lines(density(myda$traits[factor==c("Jone")]), lwd=2, col=3)
 lines(density(myda$traits[factor==c("Mike")]), lwd=2, col=4)
 points(myda$traits[factor==c("Alice")], rep(-0.01,
 length(myda$traits[factor==c("Alice")])), pch="|", col=2)
 points(myda$traits[factor==c("Jone")], rep(-0.02,
 length(myda$traits[factor==c("Jone")])), pch="|", col=3)
 points(myda$traits[factor==c("Mike")], rep(-0.03,
 length(myda$traits[factor==c("Mike")])), pch="|", col=4)

 Regards,

 Yours
 Mao J-F








Re: [R] how to Transpose a dataset

2009-08-19 Thread Ottorino-Luca Pantani

rajclinasia wrote:

Hi everyone,
 
How do I transpose an R dataset by a specified variable? If possible send

the code; it will be very helpful for us.

Thanks in Advance.
  

Code demonstrating your problem would make it easier for us to help you.

If you have a dataframe you may want to take a look also at
?stack
?reshape
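[Since "transpose" can mean two different things, a tiny sketch of both on a made-up data frame; the column names are purely illustrative:]

```r
df <- data.frame(id = c("a", "b"), x = 1:2, y = 3:4)

# Pure matrix transpose: rows become columns, everything coerced to character
t(df)

# Long/wide restructuring keyed on a variable, via ?reshape
reshape(df, direction = "long", varying = c("x", "y"),
        v.names = "value", timevar = "variable", times = c("x", "y"))
```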

--
Ottorino-Luca Pantani, Università di Firenze
Dip. Scienza del Suolo e Nutrizione della Pianta
P.zle Cascine 28 50144 Firenze Italia
Tel 39 055 3288 202 (348 lab) Fax 39 055 333 273 
olpant...@unifi.it  http://www4.unifi.it/dssnp/




Re: [R] Move legend text to the left legend box border

2009-08-19 Thread baptiste auguie
I believe you want x.intersp,

txt <- c("Setosa Petals", "Versicolor Sepals")

plot(1, 1, type="n")

legend("top", txt, text.col=1:2, cex=0.7,
inset=c(0,1/3))

legend("center", txt, text.col=1:2, cex=0.7,
x.intersp = -0.5)

HTH,

baptiste

2009/8/19 Stefan Grosse singularit...@gmx.net:
 On Wed, 19 Aug 2009 10:20:03 +0200 Sigbert Klinke
 sigb...@wiwi.hu-berlin.de wrote:

 SK in legend() I am colouring my text rather than using symbols or lines:
 SK
 SK legend("bottomleft", txt, text.col=col, cex=0.7)
 SK
 SK However, between the left legend box border and the text in txt is a
 SK large empty space. Can I somehow move the text more to the left and
 SK get also a smaller legend box?

 try
  legend("bottomleft", txt, text.col=col, cex=0.7, inset=-0.05)
 and play around with the inset value.

 hth
 Stefan






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] radial.plot() in plotrix module

2009-08-19 Thread Jim Lemon

Jon Case wrote:

Hello,
Is it possible to plot differently colored points with the radial.plot()
function in the plotrix package? For example, the code:

radial.plot(c(.5, .6, .7), c(1, 2, 3), rp.type = "s",
point.col = c("red", "green", "blue"))

produces a plot whose three points are all red, instead of one red, one
green, and one blue. Any suggestions?

  

Hi Jon,
When I try your code, I get the points in the requested colors. Of 
course, I am running the very latest version of plotrix.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why are there small circles in my plot

2009-08-19 Thread S Ellison


 Mao Jianfeng jianfeng@gmail.com 19/08/2009 08:10:54 
Here are my data and my code to create a plot. I want to know why
there are two small circles in the upper right and lower left of the
plot, respectively. 
Because your first plot() command put them there.

Add type = "n" to the first plot command?



***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Embedding lists in matrices and matrices in lists

2009-08-19 Thread Michael Kogan
Unfortunately the  matrix(list(),x,y) way seems to have some 
limitations. I want to continue working with the matrices which are 
saved in the database matrix. But for example the following doesn't work:


tetrahedron=matrix(c(
0,1,1,1,
1,0,1,1,
1,1,0,1,
1,1,1,0
),nrow=4, byrow=TRUE) # an example matrix

database=matrix(list(),5,5) # create database matrix
database[[4,4]]=list(tetrahedron) # save example matrix in database matrix

## try to access the first row of the saved example matrix ##
database[[4,4]][1] # prints the whole example matrix
database[[4,4]][1][1,] # should have printed the first row of the 
example matrix, but gives:


Error in database[[4, 4]][1][1,] : incorrect number of dimensions
Execution halted

The same happens with database[[4,4]][1,]... Is there any way to access 
the saved matrices like normal ones?


Thanks,
Michael

jim holtman schrieb:

Is this what you want:

  

x <- matrix(list(),3,3)  # create a matrix of lists
# create matrices for testing
for(i in 1:3){


+ for (j in 1:3){
+ x[[i,j]] <- matrix(runif((i+1) * (j+1)), i+1)
+ }
+ }
  

x


 [,1]  [,2]   [,3]
[1,] Numeric,4 Numeric,6  Numeric,8
[2,] Numeric,6 Numeric,9  Numeric,12
[3,] Numeric,8 Numeric,12 Numeric,16
  

x[[2,2]]  # extract one of them


   [,1]  [,2]  [,3]
[1,] 0.26722067 0.3823880 0.4820801
[2,] 0.38611409 0.8696908 0.5995658
[3,] 0.01339033 0.3403490 0.4935413
  

str(x)  # structure of the data


List of 9
 $ : num [1:2, 1:2] 0.266 0.372 0.573 0.908
 $ : num [1:3, 1:2] 0.38 0.777 0.935 0.212 0.652 ...
 $ : num [1:4, 1:2] 0.7894 0.0233 0.4772 0.7323 0.6927 ...
 $ : num [1:2, 1:3] 0.202 0.898 0.945 0.661 0.629 ...
 $ : num [1:3, 1:3] 0.2672 0.3861 0.0134 0.3824 0.8697 ...
 $ : num [1:4, 1:3] 0.2448 0.0707 0.0995 0.3163 0.5186 ...
 $ : num [1:2, 1:4] 0.206 0.177 0.687 0.384 0.77 ...
 $ : num [1:3, 1:4] 0.186 0.827 0.668 0.794 0.108 ...
 $ : num [1:4, 1:4] 0.258 0.4785 0.7663 0.0842 0.8753 ...
 - attr(*, dim)= int [1:2] 3 3
  



On Tue, Aug 18, 2009 at 4:48 AM, Michael Koganmichael.ko...@gmx.net wrote:
  

Hi,

I'm new to programming, new to R and even new to mailing lists so please be
patient with me. I need to manage many matrices generated by an R program.
These matrices have different dimensions and I'd like to group them somehow.
The best way would be to have a big matrix (let's call it database) where
every element database[x,y] consists of a list of matrices that all have the
dimensions ncol(matrix1)=x and nrow(matrix1)=y. So those matrices have to be
embedded into lists and the lists have to be embedded in the big database
matrix. If I simply try

  database=matrix(0,10,10)
  database[4,4]=c(matrix1,matrix2)

I get

  Error in database[4, 4] = c(matrix1, matrix2) :
number of items to replace is not a multiple of replacement length
  Execution halted

which makes sense of course... Is there any possibility to make this work?
Or maybe there is a better way to organize those matrices?

Regards,
Michael

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.









__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fitting a logistic regression

2009-08-19 Thread Dani Valverde
Hello,
I have this data:
  Time  AMP
 0 0.200
10 0.1958350
20 0.2914560
40 0.6763628
60 0.8494534
90 0.9874526
   120 1.0477692

where AMP is the concentration of this metabolite with time. If you plot
the data, you can see that it could be fitted using a logistic
regression. For this purpose, I used this code:

AMP.nls <- nls(AMP ~ SSlogis(Time, Asym, xmid, scal), data =
concentrations, model = TRUE)

When plotting the fitted function, it seems that it fits quite well at
the end of the time. However, at the beginning it seems that the fit is
not so good. How can I achieve a better fit? Forgive me if it is a
stupid question, but I am just starting with non linear regression.
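A self-contained sketch of the fit in question, with the data copied from the post so the curve can be overlaid on the points (variable names follow the original message):

```r
# Data copied from the post; refit the logistic and overlay the fitted
# curve so the lack of fit at early times is visible.
concentrations <- data.frame(
  Time = c(0, 10, 20, 40, 60, 90, 120),
  AMP  = c(0.2000000, 0.1958350, 0.2914560, 0.6763628,
           0.8494534, 0.9874526, 1.0477692))
AMP.nls <- nls(AMP ~ SSlogis(Time, Asym, xmid, scal), data = concentrations)
plot(AMP ~ Time, data = concentrations)
tt <- seq(0, 120, length.out = 200)
lines(tt, predict(AMP.nls, newdata = data.frame(Time = tt)))
```

If the early points still fit poorly, possible alternatives include a four-parameter logistic (SSfpl) or a Gompertz curve (SSgompertz), both also self-starting.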
Thank you,

Dani
-- 
[?]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] font size on graphics

2009-08-19 Thread Tom Willems
Dear R users,

My question is about finding the proper font size for graphics.

For this I wrote some code that creates four different graphics and saves 
them as PNG files.

From these PNG graphics, I select the one of the proper size and paste it 
into a Word document.
I have experimented with lots of settings and have lost track a bit:
there are cex, cex.axis, cex.lab and so on, and I lost track of which cex 
does what exactly.

Furthermore, these graphs work well in an R window, yet when I open the 
PNG file the labels of the bottom x axis are not printed.

One other problem I have is that often the '10' on the bottom x axis is not 
printed.
I tried to patch it up this way (below), yet it is unsatisfactory; sometimes 
it works and sometimes it doesn't.
 axis(1, at = c(28:30), c('', '10', ''), padj = -1.5,
cex.axis = ifelse((rol == 1), setting[rol, 8] - 0.15, setting[rol, 8] - 0.1))

And finally, when creating the graphs, I prefer the size of graf_1_small: 
it pastes easily into a Word document (two on one line),
yet the labels are not always clear to read, so I copy the one without a 
suffix to Word and then resize it to 75%, which means it loses some quality.
The question here is: can anyone help me with the right cex sizes?

I guess that is the question in general, since it is the cex settings that 
are either printed or not on all these graphs.

Thanks in advance,
T

here is my example:

# saving Directory

WinDir <- "C:/Documents and Settings/towil/Projecten"



# data

trials <- rep(1:10, 3)
test <- c(rep("Low", 10), rep("Normal", 10), rep("Randomised", 10))
means <- rbeta(1:30, 11, 3) + rbeta(1:30, 8, 3)
ci <- rbeta(1:30, 1, 2)

meanstable <- data.frame(Trial = trials, Test = test, Means = means,
Upper = means + ci, Lower = means - ci)

graf_1 <- "Means 1"

graf_2 <- "Means 2"

# settings for different graph's

setting <- 
data.frame('adname' = c(paste(graf_1, '_small'), graf_1, paste(graf_1, '_big')),
'width' = c(300, 400, 500), 'height' = c(300, 400, 500),
'pointsetting' = c(10, 12, 14), 'Directory' = rep(WinDir, 3),
'cex.lab' = c(0.85, 0.9, 1), 'cexsize' = c(0.8, 1, 1.2),
'cex.axis' = c(0.6, 0.9, 1))

# loop for different graph's

for(rol in 1:3){
save_at <- WinDir
setwd(save_at)
x11()
png(filename = paste(graf_1, setting[rol, 1], ".png", sep = ""), width = 
setting[rol, 2], height = setting[rol, 3], pointsize = setting[rol, 4], bg = 
"white", res = NA)

 plot(1:nrow(meanstable),meanstable$Mean, xlim=c(0,nrow(meanstable)),ylim= 
ylimits, xaxt='n', pch = 19, 
frame.plot=F,xlab='',ylab=lg10_label,cex.lab=setting[rol,6],cex.axis=setting[rol,8])
  arrows(1:nrow(meanstable) , meanstable$Upper , 1:nrow(meanstable) , 
meanstable$Lower, lty=3, code = 3,  angle = 45, length = .1)
 
  axis(3,at=1:2,unique(meanstable$Test)[c(1,2)],las=1 ,hadj=0.5, 
padj=c(1,0),cex.axis=setting[rol,8])   # upper X axis
  axis(3,at=2:3,unique(meanstable$Test)[c(2,2)],las=1 ,hadj=0.5, 
padj=c(0,-1),cex.axis=setting[rol,8])
  axis(3,at=3,unique(meanstable$Test)[3],las=1, 
hadj=0,padj=-1,cex.axis=setting[rol,8])
 
  axis(1,at=c(1:3),c('','1','') , padj=-1 , cex.axis=setting[rol,8])   # 
lower x axis; this does not show in the PNG files, yet it works in an R 
graphic
  axis(1,at=c(4:6),c('','2','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(7:9),c('','3','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(10:12),c('','4','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(13:15),c('','5','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(16:18),c('','6','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(19:21),c('','7','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(22:24),c('','8','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(25:27),c('','9','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(28:30),c('','10','') , padj=-1 , cex.axis=setting[rol,8])


dev.off()
}

#

graphics.off()

End of example.
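As a quick reference for which cex does what, here is a minimal base-graphics sketch (the values are arbitrary):

```r
# cex       - size of point symbols and text drawn by points()/text()
# cex.axis  - size of the axis tick labels
# cex.lab   - size of the axis titles (xlab/ylab)
# cex.main  - size of the main title
plot(1:10, 1:10,
     cex = 1.5, cex.axis = 0.8, cex.lab = 1.2, cex.main = 1.4,
     main = "cex demo", xlab = "x", ylab = "y")
```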




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to fill the area under the density line with semitransparent colors

2009-08-19 Thread David Winsemius


On Aug 19, 2009, at 4:05 AM, Mao Jianfeng wrote:


Dear R-listers,

I have created a plot to display the density lines for the same
variable by different entities. Now, I want to fill the area under the
density lines with semitransparent colors.
Though I have searched the web and read books for this, I still
have not managed to do it.

Could anyone please give me any helps or advice? Thank you in advance.

The data and code I used listed below:

# dummy data
factor <- rep(c("Alice", "Jone", "Mike"), each = 100)
factor <- factor(factor)
traits <- c(rnorm(100, mean=1, sd=1), rnorm(100, mean=3, sd=3),
rnorm(100, mean=6, sd=6))
myda <- data.frame(factor, traits)

# my plot
plot(c(min(myda$traits), max(myda$traits)), c(-0.03, 0.5), xlab='State',
ylab='ylab')
lines(density(myda$traits[factor == "Alice"]), lwd=2, col=2)
lines(density(myda$traits[factor == "Jone"]), lwd=2, col=3)
lines(density(myda$traits[factor == "Mike"]), lwd=2, col=4)
points(myda$traits[factor == "Alice"], rep(-0.01,
length(myda$traits[factor == "Alice"])), pch="|", col=2)
points(myda$traits[factor == "Jone"], rep(-0.02,
length(myda$traits[factor == "Jone"])), pch="|", col=3)
points(myda$traits[factor == "Mike"], rep(-0.03,
length(myda$traits[factor == "Mike"])), pch="|", col=4)


Searching with the terms "polygon density(" produces a worked example  
of filling the area under a density plot at this help page:


http://finzi.psych.upenn.edu/R/library/mclust02/html/density.html
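For a base-graphics route, a hedged sketch using polygon() with adjustcolor() for the semitransparent fill (dummy data in the spirit of the post; the transparency needs a device that supports alpha, e.g. png or pdf):

```r
set.seed(1)
grp    <- factor(rep(c("Alice", "Jone", "Mike"), each = 100))
traits <- c(rnorm(100, 1, 1), rnorm(100, 3, 3), rnorm(100, 6, 6))
plot(range(traits), c(0, 0.5), type = "n", xlab = "State", ylab = "Density")
for (i in seq_along(levels(grp))) {
  d <- density(traits[grp == levels(grp)[i]])
  # fill under the density line with a semitransparent version of colour i+1
  polygon(d, col = adjustcolor(i + 1, alpha.f = 0.3), border = i + 1, lwd = 2)
}
```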

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Embedding lists in matrices and matrices in lists

2009-08-19 Thread Michael Kogan

Strange, it doesn't work for me:

Error in database[4, 4][[1]][1, ] : incorrect number of dimensions
Execution halted

R version 2.9.0 (2009-04-17) on Arch Linux, no additional packages 
installed.


David Winsemius schrieb:


On Aug 19, 2009, at 6:02 AM, Michael Kogan wrote:

Unfortunately the  matrix(list(),x,y) way seems to have some 
limitations. I want to continue working with the matrices which are 
saved in the database matrix. But for example the following doesn't 
work:


tetrahedron=matrix(c(
0,1,1,1,
1,0,1,1,
1,1,0,1,
1,1,1,0
),nrow=4, byrow=TRUE) # an example matrix

database=matrix(list(),5,5) # create database matrix
database[[4,4]]=list(tetrahedron) # save example matrix in database 
matrix


## try to access the first row of the saved example matrix ##
database[[4,4]][1] # prints the whole example matrix


I was surprised that this worked. I would have expected that you would 
have used:
database[4,4][[1]]   since database is a matrix and generally one 
accesses matrix elements
with [n,m] addressing, while the element is a list and would 
need [[n]] addressing.


database[[4,4]][1][1,] # should have printed the first row of the 
example matrix, but gives:


So I tried:

database[4,4][[1]][1,]
# [1] 0 1 1 1 Success




Error in database[[4, 4]][1][1,] : incorrect number of dimensions
Execution halted

The same happens with database[[4,4]][1,]... Is there any way to 
access the saved matrices like normal ones?


Thanks,
Michael



David Winsemius, MD
Heritage Laboratories
West Hartford, CT



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R function for Probabilities for the (standard) normal distribution

2009-08-19 Thread Rene
Dear All,

I need to write an R function which computes values of probabilities for
the (standard) normal distribution, Φ(z), for z > 0, by summing this power
series. (We should accumulate terms until the sum doesn't change.) I also
need to make the function work for a vector of values z.

The initial formula is

Φ(z) = (1/sqrt(2π)) * ∫ e^(-t²/2) dt   (∫ from -∞ to z)

     = 1/2 + (1/sqrt(2π)) * ∫ e^(-t²/2) dt   (∫ from 0 to z)

 

I can substitute x = -t²/2 into the series expansion for the exponential
function

e^x = ∑ x^n/n!   (∑ from n=0 to ∞)

 

I can obtain the series

e^(-t²/2) = ∑ (-1)^k t^(2k) / (2^k k!)   (∑ from k=0 to ∞)

 

 

 

This series can be integrated term by term to obtain the formula

Φ(z) = 1/2 + (1/sqrt(2π)) * ∑ (-1)^k z^(2k+1) / (2^k k! (2k+1))   (∑ from k=0 to ∞)

 

 

I know how to write the R function for the exponential function e^x:

 

expf <- function(x)
{
  x <- ifelse((neg <- (x < 0)), -x, x)
  n <- 0; term <- 1
  oldsum <- 0; newsum <- 1
  while (any(newsum != oldsum)) {
    oldsum <- newsum
    n <- n + 1
    term <- term * x / n
    newsum <- newsum + term
  }
  ifelse(neg, 1/newsum, newsum)
}

 

 

I know it will be similar to the above coding, but I don't know exactly how
we should modify the above coding in order to get probabilities for the
(standard) normal distribution, Φ(z), for z > 0.

 

Can anybody advise me on this??
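For what it is worth, one sketch of the requested modification, reusing the accumulate-until-the-sum-stops-changing pattern from expf above (the function name Phi is made up; the alternating series behaves well for moderate z but loses accuracy for large z):

```r
# Sketch: Phi(z) = 1/2 + (1/sqrt(2*pi)) * sum_{k>=0} (-1)^k z^(2k+1) / (2^k k! (2k+1))
# Vectorized over z; each term is built from the previous one via the ratio
# t_k / t_(k-1) = -z^2 (2k - 1) / (2k (2k + 1)).
Phi <- function(z) {
  k <- 0
  term <- z          # the k = 0 term is z itself
  oldsum <- 0
  newsum <- term
  while (any(newsum != oldsum)) {
    oldsum <- newsum
    k <- k + 1
    term <- term * (-z^2) * (2 * k - 1) / (2 * k * (2 * k + 1))
    newsum <- newsum + term
  }
  0.5 + newsum / sqrt(2 * pi)
}
Phi(c(0, 1))   # compare with pnorm(c(0, 1))
```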

 

 

Thanks a lot.

 

Rene.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] font size on graphics question (correction in example,sorry)

2009-08-19 Thread Tom Willems
Dear R users,

My question is about finding the proper font size for graphics.

For this I wrote some code that creates four different graphics and saves 
them as PNG files.

From these PNG graphics, I select the one of the proper size and paste it 
into a Word document.
I have experimented with lots of settings and have lost track a bit:
there are cex, cex.axis, cex.lab and so on, and I lost track of which cex 
does what exactly.

Furthermore, these graphs work well in an R window, yet when I open the 
PNG file the labels of the bottom x axis are not printed.

One other problem I have is that often the '10' on the bottom x axis is not 
printed.
I tried to patch it up this way (below), yet it is unsatisfactory; sometimes 
it works and sometimes it doesn't.
 axis(1, at = c(28:30), c('', '10', ''), padj = -1.5,
cex.axis = ifelse((rol == 1), setting[rol, 8] - 0.15, setting[rol, 8] - 0.1))

And finally, when creating the graphs, I prefer the size of graf_1_small: 
it pastes easily into a Word document (two on one line),
yet the labels are not always clear to read, so I copy the one without a 
suffix to Word and then resize it to 75%, which means it loses some quality.
The question here is: can anyone help me with the right cex sizes?

I guess that is the question in general, since it is the cex settings that 
are either printed or not on all these graphs.

Thanks in advance,
T

here is my example:


# saving Directory

WinDir <- "C:/Documents and Settings/Project"
dir.create(file.path("C:/Documents and Settings", "Project"))



# data

trials <- rep(1:10, 3)
test <- c(rep("Low", 10), rep("Normal", 10), rep("Randomised", 10))
means <- rbeta(1:30, 11, 3) + rbeta(1:30, 8, 3)
ci <- rbeta(1:30, 1, 2)

meanstable <- data.frame(Trial = trials, Test = test, Means = means,
Upper = means + ci, Lower = means - ci)

graf_1 <- "Means 1"

graf_2 <- "Means 2"

# settings for different graph's

setting <- 
data.frame('adname' = c(paste(graf_1, '_small'), graf_1, paste(graf_1, '_big')),
'width' = c(300, 400, 500), 'height' = c(300, 400, 500),
'pointsetting' = c(10, 12, 14), 'Directory' = rep(WinDir, 3),
'cex.lab' = c(0.85, 0.9, 1), 'cexsize' = c(0.8, 1, 1.2),
'cex.axis' = c(0.6, 0.9, 1))

# loop for different graph's

for(rol in 1:3){
save_at <- WinDir
setwd(save_at)
x11()
png(filename = paste(graf_1, setting[rol, 1], ".png", sep = ""), width = 
setting[rol, 2], height = setting[rol, 3], pointsize = setting[rol, 4], bg = 
"white", res = NA)

 plot(1:nrow(meanstable),meanstable$Mean, xlim=c(0,nrow(meanstable)),ylim= 
ylimits, xaxt='n', pch = 19, 
frame.plot=F,xlab='',ylab=lg10_label,cex.lab=setting[rol,6],cex.axis=setting[rol,8])
  arrows(1:nrow(meanstable) , meanstable$Upper , 1:nrow(meanstable) , 
meanstable$Lower, lty=3, code = 3,  angle = 45, length = .1)

  axis(3,at=1:2,unique(meanstable$Test)[c(1,2)],las=1 ,hadj=0.5, 
padj=c(1,0),cex.axis=setting[rol,8])   # upper X axis
  axis(3,at=2:3,unique(meanstable$Test)[c(2,2)],las=1 ,hadj=0.5, 
padj=c(0,-1),cex.axis=setting[rol,8])
  axis(3,at=3,unique(meanstable$Test)[3],las=1, 
hadj=0,padj=-1,cex.axis=setting[rol,8])

  axis(1,at=c(1:3),c('','1','') , padj=-1 , cex.axis=setting[rol,8])   # 
lower x axis; this does not show in the PNG files, yet it works in an R 
graphic
  axis(1,at=c(4:6),c('','2','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(7:9),c('','3','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(10:12),c('','4','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(13:15),c('','5','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(16:18),c('','6','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(19:21),c('','7','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(22:24),c('','8','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(25:27),c('','9','') , padj=-1 , cex.axis=setting[rol,8])
  axis(1,at=c(28:30),c('','10','') , padj=-1 , cex.axis=setting[rol,8])


dev.off()
}

#

graphics.off()


End of example.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R function for Probabilities for the (standard) normal distribution

2009-08-19 Thread David Winsemius


On Aug 19, 2009, at 7:04 AM, Rene wrote:


Dear All,

I need to write an R function which computes values of probabilities
for the (standard) normal distribution, Φ(z), for z > 0, by summing
this power series. (We should accumulate terms until the sum doesn't
change.) I also need to make the function work for a vector of values z.


Two problems:
One, this appears to be a homework exercise.


The initial formula is

Φ(z) = (1/sqrt(2π)) * ∫ e^(-t²/2) dt   (∫ from -∞ to z)

     = 1/2 + (1/sqrt(2π)) * ∫ e^(-t²/2) dt   (∫ from 0 to z)


Two, you are posting in HTML mail, and the characters that you may  
think we should be seeing as Greek letters (including, I presume, a  
capital sigma for summation) are not displayed properly.




I can substitute x = -t²/2 into the series expansion for the exponential
function

e^x = ∑ x^n/n!   (∑ from n=0 to ∞)

I can obtain the series

e^(-t²/2) = ∑ (-1)^k t^(2k) / (2^k k!)   (∑ from k=0 to ∞)

This series can be integrated term by term to obtain the formula

Φ(z) = 1/2 + (1/sqrt(2π)) * ∑ (-1)^k z^(2k+1) / (2^k k! (2k+1))   (∑ from k=0 to ∞)


I know how to write the R function for the exponential function e^x:

expf <- function(x)
{
  x <- ifelse((neg <- (x < 0)), -x, x)
  n <- 0; term <- 1
  oldsum <- 0; newsum <- 1
  while (any(newsum != oldsum)) {
    oldsum <- newsum
    n <- n + 1
    term <- term * x / n
    newsum <- newsum + term
  }
  ifelse(neg, 1/newsum, newsum)
}

I know it will be similar to the above coding, but I don't know  
exactly how we should modify it in order to get probabilities for  
the (standard) normal distribution, Φ(z), for z > 0.

Can anybody advise me on this??

Thanks a lot.

Rene.
[[alternative HTML version deleted]]

__


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] three dimensions barchart

2009-08-19 Thread Yichih Hsieh
Thanks David and Berton,

But I am sorry for my mistake:
I have just found that what I want is a three-dimensional *histogram*.
So the correct problem is:

I drew the relative distribution graph;
it looks like a histogram (X,Y plot),
but I have ten (yearly) relative distribution graphs.
Is there a command that can combine ten histograms (X,Y plots) into a
three-dimensional histogram (X,Y,Z plot)?


I checked the website David provided,
http://lmdvr.r-forge.r-project.org/figures/figures.html#%20Figure%206.15
but could not find a three-dimensional histogram there.

 Have any suggestions?

All help highly appreciated.


Best,
Yichih


2009/8/19 David Winsemius dwinsem...@comcast.net


 On Aug 18, 2009, at 7:18 AM, Yichih Hsieh wrote:

 Dear R community,

 I have one problem with figures.

 I drew the Relative Distribution graph;
 it looks like a barchart (X,Y plot),
 but I have ten (yearly) Relative Distribution graphs.
 Is there a command that can combine ten barcharts (X,Y plots) into a three
 dimensional barchart (X,Y,Z plot)?


 It has been my experience that the R-graphics experts generally deprecate
 3-D bar graphs, but there are some worked examples in the supporting
 webpages for Sarkar's Lattice book:

 Fig 6.4 (image followed by code)
 http://dsarkar.fhcrc.org/lattice/book/images/Figure_06_04_stdBW.png

 cloud(density ~ long + lat, state.info,
   subset = !(name %in% c("Alaska", "Hawaii")), type = "h", lwd = 2,
   zlim = c(0, max(state.info$density)),
   scales = list(arrows = FALSE))

 library(maps)
 state.map <- map("state", plot = FALSE, fill = FALSE)
 panel.3dmap <- function(..., rot.mat, distance, xlim, ylim, zlim,
   xlim.scaled, ylim.scaled, zlim.scaled) {
   scaled.val <- function(x, original, scaled) {
     scaled[1] + (x - original[1]) * diff(scaled) / diff(original) }

   m <- ltransform3dto3d(rbind(scaled.val(state.map$x, xlim, xlim.scaled),
     scaled.val(state.map$y, ylim, ylim.scaled), zlim.scaled[1]),
     rot.mat, distance)
   panel.lines(m[1, ], m[2, ], col = "grey76") }


 Fig 6.15 (image )
 http://dsarkar.fhcrc.org/lattice/book/images/Figure_06_15_stdBW.png

 code at:

 http://lmdvr.r-forge.r-project.org/figures/figures.html#%20Figure%206.15

  All help highly appreciated


 Best,
 Yichih


 One Relative Distribution graph:

 fig2b <- reldist(y = mu96$adjwage, yo = mu68$adjwage, ci = FALSE, smooth = 0.4,
 bar = TRUE, yolabs = seq(-1, 3, by = 0.5),
 ylim = c(0, 7), cex = 0.8, ylab = "Relative Density",
 xlab = "Proportion of the Original Cohort")
 title(main = "Fig2(b):2007", cex = 0.6)



 --
 Yichih Hsieh

 e-mail : yichih.hs...@gmail.com

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT




-- 
Yichih Hsieh

e-mail : yichih.hs...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Gam (mgcv) function

2009-08-19 Thread Cetinyürek Aysun
Dear all,

I have obtained a model for my data set using gam function in mgcv library.
The model is as follows:

Family: Binomial
Link function : Logit
RU ~ s(time, by = Status, bs = "ps") + Region*Status + Gender*Status

I have obtained the results.

While presenting the results for Region and Gender, do I treat the
model as a usual logistic model and estimate the ORs by exponentiating the
coefficient estimates (and likewise for the 95% CIs of the ORs)? Or is there
any change in the calculation of the OR?
I have been searching for that and the thing I have found was related to
model prediction and plotting of smooth terms.
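For the parametric terms the usual logistic interpretation carries over; a hedged sketch with made-up data of exponentiating a parametric coefficient from a fitted mgcv::gam object to get an OR and a Wald 95% CI:

```r
library(mgcv)
set.seed(1)
# Hypothetical data: a smooth effect of x plus a two-level factor g
d <- data.frame(x = runif(200), g = gl(2, 100))
d$y <- rbinom(200, 1, plogis(-0.5 + sin(2 * d$x) + (d$g == "2")))
fit <- gam(y ~ s(x) + g, family = binomial, data = d)
est <- coef(fit)["g2"]                 # parametric coefficient for level 2 of g
se  <- sqrt(vcov(fit)["g2", "g2"])
OR  <- exp(est)                        # odds ratio, level 2 vs level 1
CI  <- exp(est + c(-1.96, 1.96) * se)  # Wald 95% CI on the OR scale
```

Smooth terms are not summarized this way; their effect is usually shown with plot(fit).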

Thank you in advance,

Kind Regards,

Aysun

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Embedding lists in matrices and matrices in lists

2009-08-19 Thread Gabor Grothendieck
Try this:

 database[[4,4]] - tetrahedron
 database[[4,4]][1,]
[1] 0 1 1 1


On Wed, Aug 19, 2009 at 6:02 AM, Michael Koganmichael.ko...@gmx.net wrote:
 Unfortunately the  matrix(list(),x,y) way seems to have some limitations. I
 want to continue working with the matrices which are saved in the database
 matrix. But for example the following doesn't work:

 tetrahedron=matrix(c(
 0,1,1,1,
 1,0,1,1,
 1,1,0,1,
 1,1,1,0
 ),nrow=4, byrow=TRUE) # an example matrix

 database=matrix(list(),5,5) # create database matrix
 database[[4,4]]=list(tetrahedron) # save example matrix in database matrix

 ## try to access the first row of the saved example matrix ##
 database[[4,4]][1] # prints the whole example matrix
 database[[4,4]][1][1,] # should have printed the first row of the example
 matrix, but gives:

 Error in database[[4, 4]][1][1,] : incorrect number of dimensions
 Execution halted

 The same happens with database[[4,4]][1,]... Is there any way to access the
 saved matrices like normal ones?

 Thanks,
 Michael

 jim holtman schrieb:

 Is this what you want:



  x <- matrix(list(),3,3)  # create a matrix of lists
 # create matrices for testing
 for(i in 1:3){


 +     for (j in 1:3){
  +         x[[i,j]] <- matrix(runif((i+1) * (j+1)), i+1)
 +     }
 + }


 x


     [,1]      [,2]       [,3]
 [1,] Numeric,4 Numeric,6  Numeric,8
 [2,] Numeric,6 Numeric,9  Numeric,12
 [3,] Numeric,8 Numeric,12 Numeric,16


 x[[2,2]]  # extract one of them


           [,1]      [,2]      [,3]
 [1,] 0.26722067 0.3823880 0.4820801
 [2,] 0.38611409 0.8696908 0.5995658
 [3,] 0.01339033 0.3403490 0.4935413


 str(x)  # structure of the data


 List of 9
  $ : num [1:2, 1:2] 0.266 0.372 0.573 0.908
  $ : num [1:3, 1:2] 0.38 0.777 0.935 0.212 0.652 ...
  $ : num [1:4, 1:2] 0.7894 0.0233 0.4772 0.7323 0.6927 ...
  $ : num [1:2, 1:3] 0.202 0.898 0.945 0.661 0.629 ...
  $ : num [1:3, 1:3] 0.2672 0.3861 0.0134 0.3824 0.8697 ...
  $ : num [1:4, 1:3] 0.2448 0.0707 0.0995 0.3163 0.5186 ...
  $ : num [1:2, 1:4] 0.206 0.177 0.687 0.384 0.77 ...
  $ : num [1:3, 1:4] 0.186 0.827 0.668 0.794 0.108 ...
  $ : num [1:4, 1:4] 0.258 0.4785 0.7663 0.0842 0.8753 ...
  - attr(*, dim)= int [1:2] 3 3


 On Tue, Aug 18, 2009 at 4:48 AM, Michael Koganmichael.ko...@gmx.net
 wrote:


 Hi,

 I'm new to programming, new to R and even new to mailing lists so please
 be
 patient with me. I need to manage many matrices generated by an R
 program.
 These matrices have different dimensions and I'd like to group them
 somehow.
 The best way would be to have a big matrix (let's call it database) where
 every element database[x,y] consists of a list of matrices that all have
 the
 dimensions ncol(matrix1)=x and nrow(matrix1)=y. So those matrices have to
 be
 embedded into lists and the lists have to be embedded in the big database
 matrix. If I simply try

  database=matrix(0,10,10)
  database[4,4]=c(matrix1,matrix2)

 I get

  Error in database[4, 4] = c(matrix1, matrix2) :
    number of items to replace is not a multiple of replacement length
  Execution halted

 which makes sense of course... Is there any possibility to make this
 work?
 Or maybe there is a better way to organize those matrices?

 Regards,
 Michael

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.







 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Canonical link for the GLM gamma model

2009-08-19 Thread Serguei Kaniovski
Hello!

When I estimate a glm gamma model using the canonical link (inverse), I 
get the opposite signs on almost all coefficients compared to the same 
model (i.e. with the same linear predictor) estimated using other suitable 
links (log, logit).

What confuses me is that most sources quote the canonical link for the 
gamma model as 1/mu, but some quote -1/mu!

For example: 
http://support.sas.com/rnd/app/da/new/802ce/stat/chap5/sect19.htm

Which is the correct link, and can this explain the sign-flips I get in my 
models?
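The sign flip is expected with the inverse link, since the linear predictor then models 1/mu rather than log(mu); a small illustration with simulated data (all names and values made up):

```r
set.seed(42)
x <- runif(500)
mu <- exp(1 + x)                            # true mean increases with x
y <- rgamma(500, shape = 5, rate = 5 / mu)  # Gamma draws with mean mu
f_inv <- glm(y ~ x, family = Gamma(link = "inverse"))
f_log <- glm(y ~ x, family = Gamma(link = "log"))
coef(f_inv)["x"]  # typically negative: x raises mu, hence lowers 1/mu
coef(f_log)["x"]  # typically positive: x raises log(mu)
```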

Thank you very much for your help / advice!
Serguei

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Odd results with Chi-square test. (Not an R problem, but general statistics, I think.)

2009-08-19 Thread mik07


Anybody any ideas?
Any help would be appreciated!

Cheers,
Mika
-- 
View this message in context: 
http://www.nabble.com/Odd-results-with-Chi-square-test.-%28Not-an-R-problem%2C-but-general-statistics%2C-I-think.%29-tp25026167p25041900.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] mild and extreme outliers in boxplot

2009-08-19 Thread Rnewbie

dear all,

could somebody tell me how I can plot mild outliers as a circle(°) and
extreme outliers as an asterisk(*) in a box-whisker plot?

Thanks very much in advance
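
One base-graphics approach (a sketch; the usual 1.5*IQR and 3*IQR fence definitions are assumed and the data are simulated) is to suppress the default outlier points and redraw them with different plotting symbols:

```r
set.seed(42)
x <- c(rnorm(100), 4.5, 9)                 # made-up data with outliers

q   <- quantile(x, c(0.25, 0.75))
iqr <- diff(q)
out     <- x[x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr]   # all outliers
extreme <- out[out < q[1] - 3 * iqr | out > q[2] + 3 * iqr] # beyond 3*IQR
mild    <- setdiff(out, extreme)

boxplot(x, outline = FALSE, ylim = range(x))        # box without default points
points(rep(1, length(mild)),    mild,    pch = 1)   # circle for mild outliers
points(rep(1, length(extreme)), extreme, pch = 8)   # asterisk for extreme ones
```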



Re: [R] ggplot2 legend problem

2009-08-19 Thread hadley wickham
On Tue, Aug 18, 2009 at 11:10 PM, Chris Friedlcfrieda...@gmail.com wrote:

 Still struggling with this. A further example using a slightly different
 organisation of the data. The factors A and B are included in the
 dataframe in an attempt to get ggplot to generate a legend automatically.

 x <- data.frame(value=rnorm(5000, mean=0), case="A")
 y <- data.frame(value=rnorm(5000, mean=3), case="B")
 xy <- rbind(x, y)
 ggplot(xy, aes(x=value, fill=case, group=case)) +
 geom_histogram(binwidth=0.1)
 ggplot(xy, aes(x=value, fill=case, group=case)) + geom_density(alpha=0.2)

 Whilst the legend is generated as expected the histogram and density plots
 are different. The density plots overlap each other whereas the histogram
 plots stack. I'm trying the get the histogram plots to overlap, and retain
 the legend. Is the histogram stacking by design? Can stacking be changed to
 overlapping?

I'm skeptical that this will create a useful plot, but

geom_histogram(binwidth=0.1, position = "identity")

will do what you want.  You might also want to look at geom_freqpoly.

Alternatively, to use your previous approach, you just need to make a
couple of small changes:

g + geom_histogram(aes(x=X, fill = "A"), colour="black", binwidth = 0.1) +
   geom_histogram(aes(x=Y, fill = "B"), colour="black", binwidth = 0.1) +
  scale_fill_manual("Case", c(A = alpha("red", 0.5), B = alpha("blue", 0.5)))

Previously you weren't supplying the fill aesthetic so the scale had
nothing to work with.

Hadley

-- 
http://had.co.nz/
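
Hadley's position = "identity" suggestion, assembled into a self-contained sketch with simulated data:

```r
library(ggplot2)
set.seed(5)
xy <- rbind(data.frame(value = rnorm(5000, mean = 0), case = "A"),
            data.frame(value = rnorm(5000, mean = 3), case = "B"))

# position = "identity" overlays the bars instead of stacking them;
# alpha keeps both groups visible where they overlap, and the fill
# aesthetic drives the legend automatically.
ggplot(xy, aes(x = value, fill = case)) +
  geom_histogram(binwidth = 0.1, position = "identity", alpha = 0.5)
```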



Re: [R] Performance measure for probabilistic predictions

2009-08-19 Thread Frank E Harrell Jr

Noah Silverman wrote:

Hello,

I'm using an SVM to fit a model, but I'm most interested in the 
probability output.  This is easy enough to calculate.


My challenge is how to measure the relative performance of the SVM for 
different settings/parameters/etc.


An AUC curve comes to mind, but I'm NOT interested in predicting true vs 
false.  I am interested in finding the most accurate probability 
predictions possible.


I've seen some literature where the probability range is cut into 
segments and then the predicted probability is compared to the actual.  
This looks nice, but I need a more tangible numeric measure.  One 
thought was a measure of probability accuracy for each range, but how 
would I calculate this?


Any thoughts?

-N


Noah,

This is a big area but I'm glad you are interested in probability 
accuracy rather than the more frequently (mis)-used classification 
accuracy.  There are many measures available.  For independent test 
samples the val.prob function in the Design package provides many.


When making a calibration plot to demonstrate absolute prediction 
accuracy, it is not a good idea to bin the predicted probabilities. 
val.prob uses loess to produce a smooth calibration curve.


Frank
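
A sketch of what Frank describes, on simulated predictions (assuming the rms package, the successor to Design, which also provides val.prob):

```r
library(rms)                       # provides val.prob()
set.seed(2)
lp   <- rnorm(500)                 # true linear predictor
y    <- rbinom(500, 1, plogis(lp)) # binary outcomes
phat <- plogis(1.3 * lp)           # deliberately over-confident predictions

# Draws a smooth (loess-based) calibration curve and returns numeric
# indices such as the Brier score, calibration intercept/slope, Dxy and c.
val.prob(phat, y)
```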







--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University



[R] scale or not to scale that is the question - prcomp

2009-08-19 Thread Petr PIKAL
Dear all

here is my data called rglp

structure(list(vzorek = structure(1:17, .Label = c("179/1/1", 
"179/2/1", "180/1", "181/1", "182/1", "183/1", "184/1", "185/1", 
"186/1", "187/1", "188/1", "189/1", "190/1", "191/1", "192/1", 
"R310", "R610L"), class = "factor"), iep = c(7.51, 7.79, 5.14, 
6.35, 5.82, 7.13, 5.95, 7.27, 6.29, 7.5, 7.3, 7.27, 6.46, 6.95, 
6.32, 6.32, 6.34), skupina = c(7.34, 7.34, 5.14, 6.23, 6.23, 
7.34, 6.23, 7.34, 6.23, 7.34, 7.34, 7.34, 6.23, 7.34, 6.23, 6.23, 
6.23), sio2 = c(0.023, 0.011, 0.88, 0.028, 0.031, 0.029, 0.863, 
0.898, 0.95, 0.913, 0.933, 0.888, 0.922, 0.882, 0.923, 1, 1), 
p2o5 = c(0.78, 0.784, 1.834, 1.906, 1.915, 0.806, 1.863, 
0.775, 0.817, 0.742, 0.783, 0.759, 0.787, 0.758, 0.783, 3, 
2), al2o3 = c(5.812, 5.819, 3.938, 5.621, 3.928, 3.901, 5.621, 
5.828, 4.038, 5.657, 3.993, 5.735, 4.002, 5.728, 4.042, 6, 
5), dus = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L), .Label = c("ano", "ne"), class = 
"factor")), .Names = c("vzorek", 
"iep", "skupina", "sio2", "p2o5", "al2o3", "dus"), class = "data.frame", 
row.names = c(NA, 
-17L))

and I try to do principal component analysis. Here is one without scaling

fit <- prcomp(~iep+sio2+al2o3+p2o5+as.numeric(dus), data=rglp, factors=2)
biplot(fit, choices=2:3,xlabs=rglp$vzorek, cex=.8)

you can see that data make 3 groups according to variables sio2 and dus 
which seems to be reasonable as lowest group has different value of dus = 
ano while highest group has low value of sio2.

But when I do the same with scale=T

fit <- prcomp(~iep+sio2+al2o3+p2o5+as.numeric(dus), data=rglp, factors=2, 
scale=T)
biplot(fit, choices=2:3,xlabs=rglp$vzorek, cex=.8)

I get a completely different picture which is not possible to interpret in 
such an easy way.

So can anybody advise me whether I should follow the recommendation from the 
help page (which says
"The default is FALSE for consistency with S, but in general scaling is 
advisable.")
or whether I should stay with scale = FALSE and the simply interpretable 
result?
 
Thank you.

Petr Pikal
petr.pi...@precheza.cz



Re: [R] Embedding lists in matrices and matrices in lists

2009-08-19 Thread Michael Kogan
This works, but then I can only save a single matrix in each 
database[x,y] while I need to save a list of matrices.


Gabor Grothendieck schrieb:

Try this:

  

database[[4,4]] <- tetrahedron
database[[4,4]][1,]


[1] 0 1 1 1


On Wed, Aug 19, 2009 at 6:02 AM, Michael Koganmichael.ko...@gmx.net wrote:
  

Unfortunately the  matrix(list(),x,y) way seems to have some limitations. I
want to continue working with the matrices which are saved in the database
matrix. But for example the following doesn't work:

tetrahedron=matrix(c(
0,1,1,1,
1,0,1,1,
1,1,0,1,
1,1,1,0
),nrow=4, byrow=TRUE) # an example matrix

database=matrix(list(),5,5) # create database matrix
database[[4,4]]=list(tetrahedron) # save example matrix in database matrix

## try to access the first row of the saved example matrix ##
database[[4,4]][1] # prints the whole example matrix
database[[4,4]][1][1,] # should have printed the first row of the example
matrix, but gives:

Error in database[[4, 4]][1][1,] : incorrect number of dimensions
Execution halted

The same happens with database[[4,4]][1,]... Is there any way to access the
saved matrices like normal ones?

Thanks,
Michael










Re: [R] Embedding lists in matrices and matrices in lists

2009-08-19 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org wrote on 19.08.2009 13:04:39:

 Strange, it doesn't work for me:
 
 Error in database[4, 4][[1]][1, ] : incorrect number of dimensions
 Execution halted
 
 R version 2.9.0 (2009-04-17) on Arch Linux, no additional packages 
 installed.

database[4,4][[1]][,1]

does not work for me either, but that is a result of your complicated structure 
of nested lists.

This works

 database[4,4][[1]][[1]][,1]
[1] 0 1 1 1

See the structure

Matrix of lists

 database
 [,1] [,2] [,3] [,4]   [,5]
[1,] NULL NULL NULL NULL   NULL
[2,] NULL NULL NULL NULL   NULL
[3,] NULL NULL NULL NULL   NULL
[4,] NULL NULL NULL List,1 NULL
[5,] NULL NULL NULL NULL   NULL

List of lists
 database[4,4]
[[1]]
[[1]][[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    1
[2,]    1    0    1    1
[3,]    1    1    0    1
[4,]    1    1    1    0

List which contains your matrix

 database[4,4][[1]]
[[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    1
[2,]    1    0    1    1
[3,]    1    1    0    1
[4,]    1    1    1    0

Here is your matrix

 database[4,4][[1]][[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    1
[2,]    1    0    1    1
[3,]    1    1    0    1
[4,]    1    1    1    0


Regards
Petr
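
The double indexing disappears if the matrix itself, rather than list(tetrahedron), is stored in the cell; a minimal sketch of that approach, following Gabor's earlier suggestion:

```r
tetrahedron <- matrix(c(0,1,1,1,
                        1,0,1,1,
                        1,1,0,1,
                        1,1,1,0), nrow = 4, byrow = TRUE)

database <- matrix(list(), 5, 5)   # 5x5 matrix whose cells hold list elements
database[[4, 4]] <- tetrahedron    # store the matrix directly, no list() wrapper

database[[4, 4]][1, ]              # first row of the stored matrix: 0 1 1 1
```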


 
 David Winsemius schrieb:
 
  On Aug 19, 2009, at 6:02 AM, Michael Kogan wrote:
 
  Unfortunately the  matrix(list(),x,y) way seems to have some 
  limitations. I want to continue working with the matrices which are 
  saved in the database matrix. But for example the following doesn't 
  work:
 
  tetrahedron=matrix(c(
  0,1,1,1,
  1,0,1,1,
  1,1,0,1,
  1,1,1,0
  ),nrow=4, byrow=TRUE) # an example matrix
 
  database=matrix(list(),5,5) # create database matrix
  database[[4,4]]=list(tetrahedron) # save example matrix in database 
  matrix
 
  ## try to access the first row of the saved example matrix ##
  database[[4,4]][1] # prints the whole example matrix
 
  I was surprised that this worked. I would have expected that you would 

  have used:
  database[4,4][[1]]   since database is a matrix and generally one 
  accesses matrix elements
  with [n,m] addressing, while the element is a list and would 
  need [[n]] addressing.
 
  database[[4,4]][1][1,] # should have printed the first row of the 
  example matrix, but gives:
 
  So I tried:
 
  database[4,4][[1]][1,]
  # [1] 0 1 1 1 Success
 
 
 
  Error in database[[4, 4]][1][1,] : incorrect number of dimensions
  Execution halted
 
  The same happens with database[[4,4]][1,]... Is there any way to 
  access the saved matrices like normal ones?
 
  Thanks,
  Michael
 
 
  David Winsemius, MD
  Heritage Laboratories
  West Hartford, CT
 
 



Re: [R] scale or not to scale that is the question - prcomp

2009-08-19 Thread Duncan Murdoch

On 19/08/2009 8:31 AM, Petr PIKAL wrote:

Dear all

here is my data called rglp

structure(list(vzorek = structure(1:17, .Label = c("179/1/1", 
"179/2/1", "180/1", "181/1", "182/1", "183/1", "184/1", "185/1", 
"186/1", "187/1", "188/1", "189/1", "190/1", "191/1", "192/1", 
"R310", "R610L"), class = "factor"), iep = c(7.51, 7.79, 5.14, 
6.35, 5.82, 7.13, 5.95, 7.27, 6.29, 7.5, 7.3, 7.27, 6.46, 6.95, 
6.32, 6.32, 6.34), skupina = c(7.34, 7.34, 5.14, 6.23, 6.23, 
7.34, 6.23, 7.34, 6.23, 7.34, 7.34, 7.34, 6.23, 7.34, 6.23, 6.23, 
6.23), sio2 = c(0.023, 0.011, 0.88, 0.028, 0.031, 0.029, 0.863, 
0.898, 0.95, 0.913, 0.933, 0.888, 0.922, 0.882, 0.923, 1, 1), 
p2o5 = c(0.78, 0.784, 1.834, 1.906, 1.915, 0.806, 1.863, 
0.775, 0.817, 0.742, 0.783, 0.759, 0.787, 0.758, 0.783, 3, 
2), al2o3 = c(5.812, 5.819, 3.938, 5.621, 3.928, 3.901, 5.621, 
5.828, 4.038, 5.657, 3.993, 5.735, 4.002, 5.728, 4.042, 6, 
5), dus = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L), .Label = c("ano", "ne"), class = 
"factor")), .Names = c("vzorek", 
"iep", "skupina", "sio2", "p2o5", "al2o3", "dus"), class = "data.frame", 
row.names = c(NA, 
-17L))


and I try to do principal component analysis. Here is one without scaling

fit <- prcomp(~iep+sio2+al2o3+p2o5+as.numeric(dus), data=rglp, factors=2)
biplot(fit, choices=2:3,xlabs=rglp$vzorek, cex=.8)

you can see that data make 3 groups according to variables sio2 and dus 
which seems to be reasonable as lowest group has different value of dus = 
ano while highest group has low value of sio2.


But when I do the same with scale=T

fit <- prcomp(~iep+sio2+al2o3+p2o5+as.numeric(dus), data=rglp, factors=2, 
scale=T)

biplot(fit, choices=2:3,xlabs=rglp$vzorek, cex=.8)

I get completely different picture which is not possible to interpret in 
such an easy way.


So if anybody can advice me if I shall follow recommendation from help 
page (which says
The default is FALSE for consistency with S, but in general scaling is 
advisable.
or if I shall stay with scale = FALSE and with simply interpretable 
result?


I would say the answer depends on the meaning of the variables.  In the 
unusual case that they are measured in dimensionless units, it might 
make sense not to scale.  But if you are using arbitrary units of 
measurement, do you want your answer to depend on them?  For example, if 
you change from Kg to mg, the numbers will become much larger, the 
variable will contribute much more variance, and it will become a more 
important part of the largest principal component.  Is that sensible?


Duncan Murdoch
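
The units effect Duncan describes is easy to demonstrate (a sketch with artificial data; note that prcomp's scaling argument is spelled scale., with a trailing dot):

```r
set.seed(3)
X <- cbind(a = rnorm(50),          # think: a quantity in Kg
           b = rnorm(50) * 1000)   # the same kind of quantity recorded in mg

# Unscaled: the first component is essentially just 'b', because its
# arbitrary units give it almost all the variance.
round(prcomp(X)$sdev, 2)

# Scaled: both variables contribute comparably.
round(prcomp(X, scale. = TRUE)$sdev, 2)
```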



Re: [R] Package read large file

2009-08-19 Thread jim holtman
A little more detail would be appropriate.  How large is large?
What is the format of the file?  What do you want to do with the data?
 Do you have to have it all in memory at once?  Does it exist in a
database already?  What type of system are you running on?  How much
memory do you have?

On Wed, Aug 19, 2009 at 4:51 AM, Mohamed Lajnefmohamed.laj...@inserm.fr wrote:
 Dear R-Users,

 I am looking for packages that can read large files in R.
 Any suggestions are welcome.
 Regards,
 ML





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
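
Until those details are known, one generic pattern is to read the file in chunks over a connection, so only part of it is ever in memory; a sketch (the file name and chunk size are placeholders):

```r
con <- file("big.csv", open = "r")      # hypothetical file name
header <- readLines(con, n = 1)         # keep the header line

repeat {
  chunk <- readLines(con, n = 100000)   # read 100k lines at a time
  if (length(chunk) == 0) break         # end of file reached
  # ... parse and process 'chunk' here, e.g. with strsplit(),
  # accumulating only the summaries you actually need ...
}
close(con)
```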



Re: [R] scale or not to scale that is the question - prcomp

2009-08-19 Thread Petr PIKAL
Thank you

Duncan Murdoch murd...@stats.uwo.ca wrote on 19.08.2009 14:49:52:

 On 19/08/2009 8:31 AM, Petr PIKAL wrote:
  Dear all
  

snip

 
 I would say the answer depends on the meaning of the variables.  In the 
 unusual case that they are measured in dimensionless units, it might 
 make sense not to scale.  But if you are using arbitrary units of 
 measurement, do you want your answer to depend on them?  For example, if 

 you change from Kg to mg, the numbers will become much larger, the 
 variable will contribute much more variance, and it will become a more 
 important part of the largest principal component.  Is that sensible?

Basically the variables are percentages (all between 0 and 6%) except dus, 
which is present or not present (transformed to 0/1 by as.numeric for the 
purpose of prcomp). The only variable outside that range is iep, which is 
roughly in the range 5-8. So the ranges of all the variables are quite similar. 

What surprises me is that I can interpret the unscaled biplot in terms of the 
variables used, while the scaled biplot is totally different, and the two 
pictures do not match at all. I would have expected only a small difference 
between the two settings, since all the numbers are quite comparable and do 
not differ much.

Best regards
Petr

 
 Duncan Murdoch



Re: [R] ggplot2 legend problem

2009-08-19 Thread ONKELINX, Thierry
Dear Chris,

First of all I would go for the density plot. The 'extra' info from the
histogram is just noise, so I guess you are not interested in it.

ggplot(xy, aes(x=value, colour=case, group=case)) + geom_density()

But if you want to stick with a histogram, then I would use one of the
two below:

ggplot(xy, aes(x=value, fill=case, group=case)) +
geom_histogram(binwidth=0.1, position = "identity", alpha = 0.2)
ggplot(xy, aes(x=value, fill=case, group=case)) +
geom_histogram(binwidth=0.1, position = "dodge")

HTH,

Thierry



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
Namens hadley wickham
Verzonden: woensdag 19 augustus 2009 14:18
Aan: Chris Friedl
CC: r-help@r-project.org
Onderwerp: Re: [R] ggplot2 legend problem

On Tue, Aug 18, 2009 at 11:10 PM, Chris Friedlcfrieda...@gmail.com
wrote:

 Still struggling with this. A further example using a slightly 
 different organisation of the data. The factors A and B are 
 included in the dataframe in an attempt to get ggplot to generate a
legend automatically.

 x <- data.frame(value=rnorm(5000, mean=0), case="A")
 y <- data.frame(value=rnorm(5000, mean=3), case="B")
 xy <- rbind(x, y)
 ggplot(xy, aes(x=value, fill=case, group=case)) +
 geom_histogram(binwidth=0.1)
 ggplot(xy, aes(x=value, fill=case, group=case)) +
 geom_density(alpha=0.2)

 Whilst the legend is generated as expected the histogram and density 
 plots are different. The density plots overlap each other whereas the 
 histogram plots stack. I'm trying the get the histogram plots to 
 overlap, and retain the legend. Is the histogram stacking by design? 
 Can stacking be changed to overlapping?

I'm skeptical that this will create a useful plot, but

geom_histogram(binwidth=0.1, position = "identity")

will do what you want.  You might also want to look at geom_freqpoly.

Alternatively, to use your previous approach, you just need to make a
couple of small changes:

g + geom_histogram(aes(x=X, fill = "A"), colour="black", binwidth = 0.1) +
   geom_histogram(aes(x=Y, fill = "B"), colour="black", binwidth = 0.1) +
  scale_fill_manual("Case", c(A = alpha("red", 0.5), B = alpha("blue", 0.5)))

Previously you weren't supplying the fill aesthetic so the scale had
nothing to work with.

Hadley

--
http://had.co.nz/


Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer 
en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is
door een geldig ondertekend document. The views expressed in  this message 
and any annex are purely those of the writer and may not be regarded as stating 
an official position of INBO, as long as the message is not confirmed by a duly 
signed document.



Re: [R] three dimensions barchart

2009-08-19 Thread David Winsemius

On Aug 19, 2009, at 7:44 AM, Yichih Hsieh wrote:


 Thanks David and Berton,

 But I am so sorry for my mistake.
 I have just found what I want is a three dimension histogram.
 So correct problem is:

 I draw the Relative Distribution graph,
 it looks like histogram(X,Y plot),
 but I have ten(year) Relative Distribution graphs,
 have any command can combine ten histograms(X,Y plot) to become a  
 three dimensions histogram(X,Y,Z plot)?


 I check David provide' website
 http://lmdvr.r-forge.r-project.org/figures/figures.html#%20Figure%206.15
 but can't find three dimension histogram.

I suggested looking at Figures 6.4 and 6.15.  (I did not say they  
would be labeled "3D histogram".) Did the displayed webpage not offer  
those possibilities? I am using Firefox 3.5.2 and having no problems  
with displaying the figure or viewing the code when using the above URL.

-- 
DW

 2009/8/19 David Winsemius dwinsem...@comcast.net

 On Aug 18, 2009, at 7:18 AM, Yichih Hsieh wrote:
 snipped

David Winsemius, MD
Heritage Laboratories
West Hartford, CT





Re: [R] graph label greek symbol failure

2009-08-19 Thread e-letter
On 18/08/2009, Gavin Simpson gavin.simp...@ucl.ac.uk wrote:
 On Tue, 2009-08-18 at 13:06 +0100, e-letter wrote:
 On 17/08/2009, Michael Knudsen micknud...@gmail.com wrote:
  On Mon, Aug 17, 2009 at 12:51 PM, e-letter inp...@gmail.com wrote:
 
  I have tried to add the delta (δ) symbol to the y axis label and the
  result is D, using the command:
 
  ...ylab="δt"...
 
  Try ylab = expression(delta*t) instead.
 
 This does not work, the result is
 expression(delta*t)

 It works for the rest of us who suggested this.

 plot(1:10, ylab = expression(delta*t))

True, but the following commands fail:
plot(1:10,ylab="temperature" expression(delta*t))
plot(1:10,ylab="temperature" expression(delta*t))
Error: syntax error, unexpected SYMBOL, expecting ',' in
"plot(1:10,ylab="temperature" expression"

So I want to be able to have 'δt' and 'temperature δt' as a y-axis label.



Re: [R] scale or not to scale that is the question - prcomp

2009-08-19 Thread Duncan Murdoch

On 19/08/2009 9:02 AM, Petr PIKAL wrote:

Thank you

Duncan Murdoch murd...@stats.uwo.ca wrote on 19.08.2009 14:49:52:


On 19/08/2009 8:31 AM, Petr PIKAL wrote:

Dear all



snip

I would say the answer depends on the meaning of the variables.  In the 
unusual case that they are measured in dimensionless units, it might 
make sense not to scale.  But if you are using arbitrary units of 
measurement, do you want your answer to depend on them?  For example, if 


you change from Kg to mg, the numbers will become much larger, the 
variable will contribute much more variance, and it will become a more 
important part of the largest principal component.  Is that sensible?


Basically variables are in percentages (all between 0 and 6%) except dus 
which is present or not present (for the purpose of prcomp transformed to 
0/1 by as.numeric:). The only variable which is not such is iep which is 
basically in range 5-8. So ranges of all variables are quite similar. 

What surprises me is that biplot without scaling I can interpret by used 
variables while biplot with scaling is totally different and those two 
pictures does not match at all. This is what surprised me as I would 
expected just a small difference between results from those two settings 
as all numbers are quite comparable and does not differ much.



If you look at the standard deviations in the two cases, I think you can 
see why this happens:


Scaled:

Standard deviations:
[1] 1.3335175 1.2311551 1.0583667 0.7258295 0.2429397

Not Scaled:

Standard deviations:
[1] 1.0030048 0.8400923 0.5679976 0.3845088 0.1531582


The first two sds are close, so small changes to the data will affect 
their direction a lot.  Your biplots look at the 2nd and 3rd components.


Duncan Murdoch



Re: [R] graph label greek symbol failure

2009-08-19 Thread baptiste auguie
plot(1:10,ylab=expression("temperature "*delta*t))

2009/8/19 e-letter inp...@gmail.com:
 On 18/08/2009, Gavin Simpson gavin.simp...@ucl.ac.uk wrote:
 On Tue, 2009-08-18 at 13:06 +0100, e-letter wrote:
 On 17/08/2009, Michael Knudsen micknud...@gmail.com wrote:
  On Mon, Aug 17, 2009 at 12:51 PM, e-letter inp...@gmail.com wrote:
 
  I have tried to add the delta (δ) symbol to the y axis label and the
  result is D, using the command:
 
   ...ylab="δt"...
 
  Try ylab = expression(delta*t) instead.
 
 This does not work, the result is
 expression(delta*t)

 It works for the rest of us who suggested this.

 plot(1:10, ylab = expression(delta*t))

 True, but the following commands fails:
 plot(1:10,ylab="temperature" expression(delta*t))
 plot(1:10,ylab="temperature" expression(delta*t))
 Error: syntax error, unexpected SYMBOL, expecting ',' in
 "plot(1:10,ylab="temperature" expression"

 So I want to be able to have 'δt' and 'temperature δt' as a y-axis label.





-- 
_

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

http://newton.ex.ac.uk/research/emag



[R] how to specify two variance effects in gls

2009-08-19 Thread Bond, Stephen
Hello everybody,

I have a dataset where each row has a number of subjects, and that gives me 
natural weights for the variance function. Additionally I see that variance 
increases with Age, which is a regressor.
So using gls I have



weights=varFixed(~1/n)

but don't know how to include the extra effect of the regressor.
Fitted values show a quadratic curve vs. age, not sure if that helps.

Thanks everybody.
Stephen
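
One possibility (a sketch, not tested on the poster's data) is nlme's varComb(), which combines several variance functions, e.g. the fixed 1/n weights with a power-of-Age component whose exponent is estimated from the data:

```r
library(nlme)

# Variance proportional to 1/n AND to |Age|^(2*delta), with delta estimated.
# 'n' and 'Age' are assumed to be columns of the (hypothetical) data frame.
vf <- varComb(varFixed(~ 1/n), varPower(form = ~ Age))

# fit <- gls(y ~ Age + other_terms, data = dat, weights = vf)   # sketch
```

If the residual spread versus Age looks quadratic, varConstPower() is another candidate for the second component.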




[R] Confidence interval on parameters from optim function

2009-08-19 Thread Devred, Emmanuel
Hi everyone,

I have two questions:

I would like to get confidence intervals on the coefficients derived
from the optim() function.
I apply optim() to a given function f
 res <-
optim(c(0.08,0.04,1.), f, NULL, method="L-BFGS-B", lower=c(0.,0.,0.))
And I would like to get the p-value and confidence intervals associated
with 
 res$par

My second question deals with error messages. I am running optim() inside a
loop; when I get an error message like the one below, the loop stops. I would
like to change my initial values to avoid this error and keep the loop going,
so that if it crashes while I am away the program can still run. Any idea?

Error in optim(c(0.08, 0.04, 1), f, NULL, method = "L-BFGS-B", lower =
c(0,  : 
  L-BFGS-B needs finite values of 'fn'

Thank you for any information on these two problems.

  Emmanuel

---
Dr. Emmanuel Devred
Bedford Institute of Oceanography,
1 Challenger Drive,
Dartmouth, Nova Scotia, B2Y 4A2,
Canada

Ph:  (1) 902 426-4681
Fax: (1) 902 426-9388

devr...@mar.dfo-mpo.gc.ca

http://myweb.dal.ca/edevred/
---
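
For the first question, one common approach is to request the Hessian from optim() and form asymptotic Wald intervals, which is valid when f is a negative log-likelihood; for the second, tryCatch() keeps a loop alive past failures. A sketch, with a toy exponential likelihood standing in for the poster's f:

```r
# Toy negative log-likelihood (placeholder for the poster's f):
# exponential data with true rate 0.5.
set.seed(6)
dat <- rexp(100, rate = 0.5)
f <- function(p) -sum(dexp(dat, rate = p[1], log = TRUE))

res <- optim(0.1, f, method = "L-BFGS-B", lower = 1e-6, hessian = TRUE)
se <- sqrt(diag(solve(res$hessian)))          # SEs from the inverse Hessian
ci <- cbind(lower = res$par - 1.96 * se,      # approximate 95% Wald interval
            upper = res$par + 1.96 * se)

# Inside a loop, trap failures instead of stopping:
fit <- tryCatch(optim(0.1, f, method = "L-BFGS-B", lower = 1e-6),
                error = function(e) NULL)
if (is.null(fit)) {
  # perturb the starting values and retry on the next iteration
}
```

Approximate p-values follow from z = res$par / se; these are large-sample approximations, so profile-likelihood or bootstrap intervals are safer for small samples.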






Re: [R] Help understanding lrm function of Design library

2009-08-19 Thread Michael Denslow
On Tue, Aug 18, 2009 at 8:13 AM, Frank E Harrell
Jrf.harr...@vanderbilt.edu wrote:
 Noah Silverman wrote:

 Hi,

 I'm developing an experiment with logistic regression.

 I've come across the lrm function in the Design  library.

 While I understand and can use the basic functionality, there are a ton of
 options that go beyond my knowledge.

 I've carefully read the help page for lrm, but don't understand many of
 the arguments and optional return values.
 (penalty, penalty.matrix, strata.penalty, var.penalty, weights)

 Additionally, there are several items returned in the model that I don't
 understand.
  (Max Deriv, Model L.R., C, Dxy, Gamma, Tau-a, R2, Brier, Wald Z)

 1) Could someone point me to an online resource where I could learn more?
  (I'm big on trying to teach myself.)

 2) Or, alternately, does anybody want to explain a few things for me?

 Thanks!

 --
 Noah

 My book Regression Modeling Strategies covers much of that.  Its case study
 on penalized ordinal logistic regression is also largely in a paper in
 Statistics in Medicine if you have access to that.  Once you do some
 background reading I'll be glad to answer some focused questions.
 Frank

 --
 Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University


Noah,

I have found the Regression Modeling Strategies book to be incredibly
valuable. I also took a Regression course with Frank Harrell this past
February 
(http://biostat.mc.vanderbilt.edu/wiki/Main/RmS#2009_Upcoming_short_courses_in_R).
I can not say enough about this course. I am an ecologist but had no
problem with the mostly medical examples. Dr. Harrell is a great
teacher! There was so much great information presented that I hope to
repeat the course if at all possible!

Good luck,
Michael


-- 
Michael Denslow

Graduate Student
I.W. Carpenter Jr. Herbarium [BOON]
Department of Biology
Appalachian State University
Boone, North Carolina U.S.A.

-- AND --

Communications Manager
Southeast Regional Network of Expertise and Collections
sernec.org



Re: [R] RGoogleDocs/RCurl through proxy

2009-08-19 Thread Duncan Temple Lang

Hi Remko

 There is a new version (1.1-0) of the RCurl package
(on which RGoogleDocs depends) (no binary for Windows at this point).

This version allows one to specify default curl options that are used 
each time a new curl handle/object is created.


You set these defaults in R's own options() list, e.g.

  options( RCurlOptions = list(verbose = TRUE,
   proxy = "host:port",
   noproxy = "www.nytimes.com"))


The proxy option is most likely what you need to set.
You can specify the host, port, user and password in this string.
Or you can specify the user:password string via the 'proxyuserpwd'
option.  You can even specify the mechanism to use for authenticating
the user via the 'proxyauth' option.

libcurl (and hence RCurl) has several different options
related to proxies
http://curl.haxx.se/libcurl/c/curl_easy_setopt.html
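As a sketch of the per-call form (the host, port, and credentials below are placeholders, not real settings), the same proxy options can also be passed directly to an individual request instead of the global default:

```r
library(RCurl)

# Hypothetical proxy details -- substitute your own host:port and
# user:password; verbose = TRUE shows the proxy negotiation.
txt <- getURL("http://www.r-project.org",
              proxy        = "proxy.example.com:8080",
              proxyuserpwd = "user:password",
              verbose      = TRUE)
```

Per-call options override the `RCurlOptions` defaults for that one request only.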

  D.

Remko Duursma wrote:

Dear list,

I am trying to use RGoogleDocs, but I am connecting through a proxy server.

I know RCurl is used for the connection, which should be able to deal
with proxies and such.
How do I set this up for RCurl? And can I use those settings with
RGoogleDocs as well?

I have the name of the proxy server and the port number.

(Windows XP).

thanks,
Remko



-
Remko Duursma
Post-Doctoral Fellow

Centre for Plants and the Environment
University of Western Sydney
Hawkesbury Campus
Richmond NSW 2753

Dept of Biological Science
Macquarie University
North Ryde NSW 2109
Australia

Mobile: +61 (0)422 096908
www.remkoduursma.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] scale or not to scale that is the question - prcomp

2009-08-19 Thread Petr PIKAL
Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009 15:25:00:

 On 19/08/2009 9:02 AM, Petr PIKAL wrote:
  Thank you
  
  Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009 14:49:52:
  
  On 19/08/2009 8:31 AM, Petr PIKAL wrote:
  Dear all
 
  
  snip
  
  I would say the answer depends on the meaning of the variables.  In 
the 
  unusual case that they are measured in dimensionless units, it might 
  make sense not to scale.  But if you are using arbitrary units of 
  measurement, do you want your answer to depend on them?  For example, 
if 
  
  you change from Kg to mg, the numbers will become much larger, the 
  variable will contribute much more variance, and it will become a 
more 
  important part of the largest principal component.  Is that sensible?
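That unit-dependence is easy to see in a small sketch (the data here are made up): rescaling one column reshuffles the unscaled components, while scale. = TRUE is unaffected.

```r
set.seed(42)
d  <- data.frame(a = rnorm(50), b = rnorm(50), c = rnorm(50))
d2 <- transform(d, a = a * 1000)      # same column, e.g. g instead of Kg

# unscaled PCA reacts strongly to the unit change ...
prcomp(d)$sdev
prcomp(d2)$sdev                       # first component now dominated by 'a'

# ... while the scaled analysis is unchanged
all.equal(prcomp(d, scale. = TRUE)$sdev, prcomp(d2, scale. = TRUE)$sdev)
```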
  
  Basically variables are in percentages (all between 0 and 6%) except 
dus 
  which is present or not present (for the purpose of prcomp transformed 
to 
  0/1 by as.numeric:). The only variable which is not such is iep which 
is 
  basically in range 5-8. So ranges of all variables are quite similar. 
  
  What surprises me is that biplot without scaling I can interpret by 
used 
  variables while biplot with scaling is totally different and those two 

  pictures does not match at all. This is what surprised me as I would 
  expected just a small difference between results from those two 
settings 
  as all numbers are quite comparable and does not differ much.
 
 
 If you look at the standard deviations in the two cases, I think you can 

 see why this happens:
 
 Scaled:
 
 Standard deviations:
 [1] 1.3335175 1.2311551 1.0583667 0.7258295 0.2429397
 
 Not Scaled:
 
 Standard deviations:
 [1] 1.0030048 0.8400923 0.5679976 0.3845088 0.1531582
 
 
 The first two sds are close, so small changes to the data will affect 

I see. But I would expect that changes to data made by scaling would not 
change it in such a way that unscaled and scaled results are completely 
different.

 their direction a lot.  Your biplots look at the 2nd and 3rd components.

Yes because grouping in 2nd and 3rd component biplot can be easily 
explained by values of some variables (without scaling). 

I must admit that I do not use prcomp very often, and usually scaling 
gives me an explainable result, especially if I use it for variable 
reduction. Therefore I am reluctant to use it in this case.

when I try more standard way

> fit <- lm(iep~sio2+al2o3+p2o5+as.numeric(dus), data=rglp)
> summary(fit)

Call:
lm(formula = iep ~ sio2 + al2o3 + p2o5 + as.numeric(dus), data = rglp)

Residuals:
 Min   1Q   Median   3Q  Max 
-0.41751 -0.15568 -0.03613  0.20124  0.43046 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)      7.12085    0.62257  11.438 8.24e-08 ***
sio2            -0.67250    0.20953  -3.210 0.007498 **
al2o3            0.40534    0.08641   4.691 0.000522 ***
p2o5            -0.76909    0.11103  -6.927 1.59e-05 ***
as.numeric(dus) -0.64020    0.18101  -3.537 0.004094 **

I get quite plausible result which can be interpreted without problems.

My data is the result of a designed experiment (more or less :) and therefore 
all variables are significant. Is that the reason why scaling may be 
inappropriate in this case?

Regards
Petr Pikal

 
 Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lm.fit algo

2009-08-19 Thread Pavlo Kononenko
Thank you, Steve!

2009/8/18 Steve Lianoglou mailinglist.honey...@gmail.com:
 Hi,

 On Aug 17, 2009, at 5:09 PM, Pavlo Kononenko wrote:

 Hi, everyone,

 This is a little silly, but I cant figure out the algorithm behind
 lm.fit function used in the context of promax rotation algorithm:

 The promax function is:

 promax <- function(x, m = 4)
 {
   if(ncol(x) < 2) return(x)
   dn <- dimnames(x)
   xx <- varimax(x)
   x <- xx$loadings
   Q <- x * abs(x)^(m-1)
   U <- lm.fit(x, Q)$coefficients
   d <- diag(solve(t(U) %*% U))
   U <- U %*% diag(sqrt(d))
   dimnames(U) <- NULL
   z <- x %*% U
   U <- xx$rotmat %*% U
   dimnames(z) <- dn
   class(z) <- "loadings"
   list(loadings = z, rotmat = U, crap = x, coeff = Q)
 }

 And the line I'm having trouble with is:

 U <- lm.fit(x, Q)$coefficients

 Isn't this doing a least squares regression using the predictor variables in
 x and the (I guess) real valued numbers in vector Q?

 x is a matrix of n (observations) by p (predictors)

 The $coefficients is just taking the vector of coefficients/weights over the
 predictors  -- this would be a vector of length p -- such that

 x %*% t(t(U)) ~ Q

  * t(t(U)) is ugly, but I just want to say "get U to be a column vector"
  * ~ is used as "almost equals"
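As a sketch with made-up numbers, that line is just a column-by-column least-squares solve, which can be reproduced from the normal equations:

```r
# Made-up loadings matrix; promax() builds Q from x with m = 4.
set.seed(1)
x <- matrix(rnorm(20), 5, 4)
Q <- x * abs(x)^3

U  <- lm.fit(x, Q)$coefficients        # least-squares solution of x U ~ Q
U2 <- solve(t(x) %*% x, t(x) %*% Q)    # same thing via the normal equations
all.equal(unname(U), unname(U2))       # TRUE (up to rounding)
```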

 You'll need some numerical/scientific/matrix library in java, perhaps this
 could be a place to start:

 http://commons.apache.org/math/userguide/stat.html#a1.5_Multiple_linear_regression

 Hope that helps,
 -steve

 --
 Steve Lianoglou
 Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
 Contact Info: http://cbio.mskcc.org/~lianos/contact



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] need help for t.test using by

2009-08-19 Thread 1Rnwb

any one

1Rnwb wrote:
 
 I am trying to do a t-test for each plate, which has an equal number of
 disease samples and controls. By searching this forum I found one posting
 suggesting the OP use by(eo,PlateID, function(.sub)
 t.test(IL1Ra~Group,data=.sub)). When I modified this for my use I used to
 get the p-values for each plate. I recently upgraded to R 2.9.x and now I
 am getting the following error when I use this 
  by(eo,PlateID, function(.sub) t.test(IL1Ra~Group,data=.sub))
 Error in t.test.formula(IL1Ra ~ Group, data = .sub) : 
   grouping factor must have exactly 2 levels
 
 I checked my data file and the plate ID has P16-P47 and the disease
 column is Y/N. I have searched all over but could not locate the answer. Can
 anyone suggest the direction where I am going wrong? 
 For more detail I am posting a part of my data.
 
 SampleID  PlateID Sex Disease DurationAge prt1prt2
 prt3prt4
 1 P16 F   N   25.33282811 836.08979   
 20.04582692 295.74  11731.43
 2 P16 F   N   32.74912883 243.0652116 
 53.16487056 383.63  4451.36
 3 P16 M   N   3.49961999  181.5587298 
 23.24604762 522.52  325.52
 4 P16 M   N   5.249771666 261.0978097 
 19.69024684 833.61  2229.04
 5 P16 M   N   39.16612385 237.334794  
 83.97284692 694.03  1204.36
 6 P16 F   N   33.49929145 540.486536  
 29.69569346 895.36  72105.75
 7 P16 F   N   3.915997361 5215.378446 
 35.44324704 1023.63 101680.69
 8 P16 F   N   33.49929145 466.5188732 
 12.93422535 814.26  7169.61
 9 P16 F   N   38.1661348  706.5117791 
 19.69024684 331.62  7296.72
 10P16 M   N   0.916030215 242.3011559 
 36.40117263 90.135962.08
 11P16 F   Y   14975.2542.08281552 133.8052501 
 29.0570764  260.55  1280.29
 12P16 M   Y   1095.75 40.41593932 247.6495456 
 19.69024684 161.64  6685.6
 13P16 F   Y   292225.08300169 231.6043765 
 9.022844487 902.53  37571.39
 14P16 F   Y   12053.2537.49924765 158.6370595 
 24.90606549 1.63471.15
 15P16 F   Y   14610   76.58209602 278.3837309 
 41.51010914 152.13  16285.49
 16P16 F   Y   3287.25 49.58239171 128.4568603 
 17.91234645 274.99  41823.21
 17P17 M   N   2.333079994 397.4440508 
 55.20987654 366.37  90011.27
 18P17 F   N   4.749435461 222.0839813 
 73.62489675 271.65  903.89
 19P17 F   N   1.749468315 676.8904636 
 47.7037037  721.19  663.15
 20P17 M   N   33.24946503 413.712486  
 27.42386831 611.64  4195.86
 21P17 F   N   15.4162131  346.9913086 
 30.71604938 302.29  2151.47
 22P17 F   N   33.3327405  2598.071161 
 107.8088094 1800.44 102005.12
 23P17 M   N   9.832656179 3296.535581 
 54.1563786  2642.73 676.78
 24P17 F   N   3.749446413 488.2589867 
 15.29151952 562.88  11619.62
 25P17 M   N   9.749380705 159.1536586 
 23.41690259 132.64  931.14
 26P17 M   Y   12053.2550.91548265 173.7764722 
 18.90280088 52.25   15962.05
 27P17 M   Y   3652.5  25.83248096 264.8254141 
 35.32510288 497.33  15504.95
 28P17 F   Y   28.41607073 415.3599225 
 35.0617284  1600.67 14224.39
 29P17 F   Y   6209.25 60.74882219 213.7280879 
 44.41152263 111.36  9018.48
 30P17 M   Y   4748.25 47.33258719 264.0016959 
 31.24279835 366.81  3809.83
 31P17 M   Y   16071   54.49906148 112.4128794 
 34.40329218 114.56  1727.02
 32P17 M   Y   7670.25 52.33253243 145.5753317 
 32.2962963  44.92   15264.84
 33P17 M   Y   876631.74913978 220.2561296 
 34.40329218 130.31  2377.36
 34P17 F   Y   8400.75 40.83231669 194.6662059 
 48.09876543 407.83  19587.61
 35P17 M   Y   12053.2536.1661567  212.1613578 
 96.51775947 621.42  3269.9
 36P17 M   Y   8035.5  57.91540599 115.2852178 
 24.79012346 282.88  8960.27
 37P17 M   Y   5478.75 36.33270765 160.4592669 
 40.72427984 102.32  5006.77
 38P17 F   Y   9496.5  34.74910692 329.8991551 
 40.32921811 417.32  248857.58
 39P18 F   N   14.2496731 
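A guarded sketch for the "exactly 2 levels" error above (column names taken from the post, toy data made up here): re-factor Group inside each plate's subset and skip plates where only one level is actually present.

```r
# Toy stand-in for the poster's data frame: P17 has only one group
set.seed(2)
eo <- data.frame(PlateID = rep(c("P16", "P17"), each = 6),
                 Group   = c(rep(c("Y", "N"), 3), rep("Y", 6)),
                 IL1Ra   = rnorm(12))

res <- by(eo, eo$PlateID, function(.sub) {
  .sub$Group <- factor(.sub$Group)          # drop unused levels
  if (nlevels(.sub$Group) == 2)
    t.test(IL1Ra ~ Group, data = .sub)
  else
    NULL                                    # plate with only one group
})
res[["P16"]]   # a t-test; res[["P17"]] is NULL
```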

Re: [R] Confidence interval on parameters from optim function

2009-08-19 Thread Ravi Varadhan
Hi Emmanuel,

You can obtain standard error estimate from the Hessian matrix evaluated at
the optimum as: sqrt(diag(solve(ans$hessian))), where `ans' is the object
returned by `optim'. However, `optim' does not return the Hessian matrix by
default.  So, you need to specify `hessian = TRUE' when you call `optim'.
There are two issues that I would think about:

1.  You need to pay attention to the standard regularity conditions that are
required in order for the confidence interval to be valid.  One important
condition is that the solution be in the interior of the parameter space.
Since you are using box-constraints, I would check to make sure that the
solution is not on the boundary. 

2.  The Hessian returned by `optim' is a bit inaccurate.  Often, this is not
a big deal.  There are situations, however, where this matters, as you might
get an indefinite Hessian due to inaccuracy. So, I generally prefer to use
the `hessian' function in the numDeriv package.
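A minimal sketch of that recipe (the objective here is a made-up negative log-likelihood, so the inverse Hessian is a meaningful covariance estimate):

```r
# Toy negative log-likelihood: normal sample, parameters (mean, log sd)
set.seed(1)
x   <- rnorm(200, mean = 5, sd = 2)
nll <- function(p) -sum(dnorm(x, mean = p[1], sd = exp(p[2]), log = TRUE))

ans <- optim(c(mean(x), log(sd(x))), nll,
             method = "L-BFGS-B", hessian = TRUE)
se  <- sqrt(diag(solve(ans$hessian)))       # standard errors at the optimum
ci  <- cbind(lower = ans$par - 1.96 * se,   # approximate 95% Wald intervals
             upper = ans$par + 1.96 * se)
ci
```

As noted above, this is only valid when the solution is interior and the regularity conditions hold.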

Ravi.


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Devred, Emmanuel
Sent: Wednesday, August 19, 2009 9:41 AM
To: r-help@r-project.org
Subject: [R] Confidence interval on parameters from optim function

Hi everyone,

I have two questions:

I would like to get confidence intervals on the coefficients derived
from the optim() function.
I apply optim() to a given function f
 res <- optim(c(0.08,0.04,1.), f, NULL, method="L-BFGS-B", lower=c(0.,0.,0.))
And I would like to get the p-value and confidence intervals associated
with 
 res$par

My second question deals with error message. I am doing a loop with the
optim() function in it, when I get an error message like below, the loop
is stopped, however I would like to change my initial values to avoid
this error message and keep the loop going, so if it crashes when I am
away the program can still run, any idea ?

Error in optim(c(0.08, 0.04, 1), f, NULL, method = "L-BFGS-B", lower =
c(0,  : 
  L-BFGS-B needs finite values of 'fn'

Thank you for any information on these two problems.

  Emmanuel

---
Dr. Emmanuel Devred
Bedford Institute of Oceanography,
1 Challenger Drive,
Dartmouth, Nova Scotia, B2Y 4A2,
Canada

Ph:  (1) 902 426-4681
Fax: (1) 902 426-9388

devr...@mar.dfo-mpo.gc.ca

http://myweb.dal.ca/edevred/
---



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Fitting a logistic regression

2009-08-19 Thread 1Rnwb

With my limited understanding, and just from eyeballing the data, I am not
surprised to see it fit nicely at the end. The reaction at the early time
points is not complete; as time passes (around 20 units) the reaction
generates more metabolite, which can be measured reliably. Your t=0 and t=10
values are basically the same: if I limit the digits to 2, both are 0.20.
You could do a log transform and then try to fit it.


Dani Valverde-4 wrote:
 
 Hello,
 I have this data:
   Time  AMP
  0 0.200
 10 0.1958350
 20 0.2914560
 40 0.6763628
 60 0.8494534
 90 0.9874526
120 1.0477692
 
 where AMP is the concentration of this metabolite with time. If you plot
 the data, you can see that it could be fitted using a logistic
 regression. For this purpose, I used this code:
 
 AMP.nls <- nls(AMP~SSlogis(Time,Asym, xmid, scal), data =
 concentrations,model=T)
 
 When plotting the fitted function, it seems that it fits quite well at
 the end of the time. However, at the beginning it seems that the fit is
 not so good. How can I achieve a better fit? Forgive me if it is a
 stupid question, but I am just starting with non linear regression.
 Thank you,
 
 Dani
 -- 
 [?]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Fitting-a-logistic-regression-tp25041444p25043712.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ridge regression

2009-08-19 Thread spime

Dear all,

I considered an ordinary ridge regression problem. I followed three
different ways:
1. estimate beta without any standardization
2. estimate standardized beta (standardizing X and y) and then again convert
back  
3. estimate beta using lm.ridge() function


X <- matrix(c(1,2,9,3,2,4,7,2,3,5,9,1),4,3)
y <- t(as.matrix(cbind(2,3,4,5)))

n <- nrow(X)
p <- ncol(X)

#Without standardization
intercept <- rep(1,n)
Xn <- cbind(intercept, X)
K <- diag(c(0,rep(1.5,p)))
beta1 <- solve(t(Xn)%*%Xn+K)%*%t(Xn)%*%y
beta1

#with standardization
ys <- scale(y)
Xs <- scale(X)
K <- diag(1.5,p)
bs <- solve(t(Xs)%*%Xs+K)%*%t(Xs)%*%ys
b <- sd(y)*(bs/sd(X))
intercept <- mean(y)-sum(as.matrix(colMeans(X))*b)
beta2 <- rbind(intercept,b)
beta2

#Using lm.ridge function of MASS package
beta3 <- lm.ridge(y~X,lambda=1.5)

I'm getting three different outputs from the above three approaches,
but I expected them to be the same for all. 

> beta1
[,1]
intercept  3.4007944
   0.3977462
   0.2082025
  -0.4829115
> beta2
[,1]
intercept  3.3399855
   0.1639469
   0.0262021
  -0.1228987

> beta3
 X1  X2  X3 
 3.35158977  0.19460958  0.03152778 -0.15546775 

It would be very helpful if anybody could explain why the
outputs come out different. 

regards.
-- 
View this message in context: 
http://www.nabble.com/ridge-regression-tp25045020p25045020.html
Sent from the R help mailing list archive at Nabble.com.
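One reason the three fits above cannot agree: ridge estimates are not invariant to the scale of X, so applying the same lambda to raw predictors, hand-standardized predictors, and lm.ridge's internally standardized predictors penalizes three different things. A toy sketch (data made up here, not the poster's):

```r
set.seed(1)
X <- matrix(rnorm(40), 10, 4)
y <- rnorm(10)

# plain ridge estimator for a given penalty k
ridge <- function(X, y, k) solve(t(X) %*% X + diag(k, ncol(X)), t(X) %*% y)

b1 <- ridge(X,      y, 1.5)
b2 <- ridge(X * 10, y, 1.5) * 10   # rescale X, map coefficients back
max(abs(b1 - b2))                  # clearly nonzero: same lambda, different fit
```

With k = 0 (ordinary least squares) the two would agree exactly; the penalty is what breaks the equivariance.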

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confidence interval on parameters from optim function

2009-08-19 Thread Ravi Varadhan
Emmanuel,

I didn't answer your second question.

You can use the `try' function to capture errors and keep proceeding through
simulations without crashing out:  

?try

If `L-BFGS-B' does not work well, you could try the `spg' function in the
BB package.
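A sketch of that try() pattern (the objective and the restart rule here are made up): a failed optim() call yields a "try-error" object instead of stopping the loop, and the starting values can then be perturbed and retried.

```r
f <- function(p) sum((p - c(0.1, 0.05, 2))^2)   # stand-in objective

fits <- vector("list", 5)
for (i in seq_along(fits)) {
  start <- c(0.08, 0.04, 1)
  for (attempt in 1:5) {
    res <- try(optim(start, f, method = "L-BFGS-B", lower = c(0, 0, 0)),
               silent = TRUE)
    if (!inherits(res, "try-error")) break
    start <- abs(start + runif(3, 0, 0.1))      # jitter and retry
  }
  fits[[i]] <- res                              # fit, or last try-error
}
```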


Ravi.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Devred, Emmanuel
Sent: Wednesday, August 19, 2009 9:41 AM
To: r-help@r-project.org
Subject: [R] Confidence interval on parameters from optim function

Hi everyone,

I have two questions:

I would like to get confidence intervals on the coefficients derived
from the optim() function.
I apply optim() to a given function f
 res <- optim(c(0.08,0.04,1.), f, NULL, method="L-BFGS-B", lower=c(0.,0.,0.))
And I would like to get the p-value and confidence intervals associated
with 
 res$par

My second question deals with error message. I am doing a loop with the
optim() function in it, when I get an error message like below, the loop
is stopped, however I would like to change my initial values to avoid
this error message and keep the loop going, so if it crashes when I am
away the program can still run, any idea ?

Error in optim(c(0.08, 0.04, 1), f, NULL, method = "L-BFGS-B", lower =
c(0,  : 
  L-BFGS-B needs finite values of 'fn'

Thank you for any information on these two problems.

  Emmanuel

---
Dr. Emmanuel Devred
Bedford Institute of Oceanography,
1 Challenger Drive,
Dartmouth, Nova Scotia, B2Y 4A2,
Canada

Ph:  (1) 902 426-4681
Fax: (1) 902 426-9388

devr...@mar.dfo-mpo.gc.ca

http://myweb.dal.ca/edevred/
---



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Problem with predict.coxph

2009-08-19 Thread Michael Conklin
In examining the predict.coxph functions for the library I have with 2.7.1 
versus the library with 2.9.1 I find a major rewrite of the function. A number 
of internal survival functions are no longer present so much of the code has 
changed.  This makes identifying the specific problem beyond my capabilities.  
What I want to do, is generate predictions for specific combinations of 
covariates. The number of combinations I am interested in is different than the 
number of records in the original data file. Any help would be appreciated as 
some of the graphic routines I want to use on the data are only available in 
2.8 or greater - meaning I am currently looking at trying to run two different 
versions of R to get the project done.

TIA,

Michael Conklin 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Michael Conklin
Sent: Tuesday, August 18, 2009 8:26 PM
To: r-help@r-project.org
Subject: [R] Problem with predict.coxph

We occasionally utilize the coxph function in the survival library to fit 
multinomial logit models. (The breslow method produces the same likelihood 
function as the multinomial logit). We then utilize the predict function to 
create summary results for various combinations of covariates.  For example:

mod1 <- coxph(Depvar~Price:Product+strata(ID), data=MyDCMData2, na.action=na.omit, method="breslow")

The model runs fine.

Then we create some new data that is all combinations of Price and Product and 
retrieve the summary linear predictors.

newdata=expand.grid(Price=factor(as.character(1:5)),Product=factor(as.character(1:5)))
## create a utility matrix for all combinations of prices and products
totalut <- predict(mod1, newdata=newdata, type="lp")

Under R 2.7.1 this produces the following output:

> totalut
  [,1]
1   0.01534582
2  -0.07628528
3  -0.88085189
4  -1.19458045
5  -1.03579684
6   0.40065672
7   0.15922492
8  -0.49233524
9  -0.65483441
10 -1.07739920
11  0.27589201
12  0.48055065
13  0.33638585
14 -0.28416678
15 -0.48762319
16  1.06071986
17  0.69041596
18  0.67479476
19  0.36360168
20 -0.09492167
21  0.66554276
22  0.55748465
23  0.37596413
24  0.01612020
25 -0.03567735


The problem is that under R 2.8.1 and R 2.9.1 the previous line fails with the 
following error:

> totalut <- predict(mod1, newdata=newdata, type="lp")
Error in model.frame.default(Terms2, newdata, xlev = object$xlevels) :
  variable lengths differ (found for 'Price')
In addition: Warning message:
'newdata' had 25 rows but variable(s) found have 43350 rows


Does anyone have an idea what is going on?

Best regards,

Michael Conklin



W. Michael Conklin
Chief Methodologist

MarketTools, Inc. | www.markettools.com
6465 Wayzata Blvd | Suite 170 |  St. Louis Park, MN 55426.  PHONE: 952.417.4719 
| CELL: 612.201.8978
This email and attachment(s) may contain confidential and/or proprietary 
information and is intended only for the intended addressee(s) or its 
authorized agent(s). Any disclosure, printing, copying or use of such 
information is strictly prohibited. If this email and/or attachment(s) were 
received in error, please immediately notify the sender and delete all copies


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] scale or not to scale that is the question - prcomp

2009-08-19 Thread Duncan Murdoch

On 8/19/2009 10:14 AM, Petr PIKAL wrote:

Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009 15:25:00:


On 19/08/2009 9:02 AM, Petr PIKAL wrote:
 Thank you
 
 Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009 14:49:52:
 
 On 19/08/2009 8:31 AM, Petr PIKAL wrote:

 Dear all

 
 snip
 
 I would say the answer depends on the meaning of the variables.  In 
the 
 unusual case that they are measured in dimensionless units, it might 
 make sense not to scale.  But if you are using arbitrary units of 
 measurement, do you want your answer to depend on them?  For example, 
if 
 
 you change from Kg to mg, the numbers will become much larger, the 
 variable will contribute much more variance, and it will become a 
more 

 important part of the largest principal component.  Is that sensible?
 
 Basically variables are in percentages (all between 0 and 6%) except 
dus 
 which is present or not present (for the purpose of prcomp transformed 
to 
 0/1 by as.numeric:). The only variable which is not such is iep which 
is 
 basically in range 5-8. So ranges of all variables are quite similar. 
 
 What surprises me is that biplot without scaling I can interpret by 
used 
 variables while biplot with scaling is totally different and those two 


 pictures does not match at all. This is what surprised me as I would 
 expected just a small difference between results from those two 
settings 

 as all numbers are quite comparable and does not differ much.


If you look at the standard deviations in the two cases, I think you can 



see why this happens:

Scaled:

Standard deviations:
[1] 1.3335175 1.2311551 1.0583667 0.7258295 0.2429397

Not Scaled:

Standard deviations:
[1] 1.0030048 0.8400923 0.5679976 0.3845088 0.1531582


The first two sds are close, so small changes to the data will affect 


I see. But I would expect that changes to data made by scaling would not 
change it in such a way that unscaled and scaled results are completely 
different.



their direction a lot.  Your biplots look at the 2nd and 3rd components.


Yes because grouping in 2nd and 3rd component biplot can be easily 
explained by values of some variables (without scaling). 

I must admit that I do not use prcomp very often, and usually scaling 
gives me an explainable result, especially if I use it for variable 
reduction. Therefore I am reluctant to use it in this case.


when I try more standard way


fit <- lm(iep~sio2+al2o3+p2o5+as.numeric(dus), data=rglp)
summary(fit)


Call:
lm(formula = iep ~ sio2 + al2o3 + p2o5 + as.numeric(dus), data = rglp)

Residuals:
 Min   1Q   Median   3Q  Max 
-0.41751 -0.15568 -0.03613  0.20124  0.43046 


Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)      7.12085    0.62257  11.438 8.24e-08 ***
sio2            -0.67250    0.20953  -3.210 0.007498 **
al2o3            0.40534    0.08641   4.691 0.000522 ***
p2o5            -0.76909    0.11103  -6.927 1.59e-05 ***
as.numeric(dus) -0.64020    0.18101  -3.537 0.004094 **


I get quite plausible result which can be interpreted without problems.

My data is the result of a designed experiment (more or less :) and therefore 
all variables are significant. Is that the reason why scaling may be 
inappropriate in this case?


No, I think it's just that the cloud of points is approximately 
spherical in the first 2 or 3 principal components, so the principal 
component directions are somewhat arbitrary.  You just got lucky that 
the 2nd and 3rd components are interpretable:  I wouldn't put too much 
faith in being able to repeat that if you went out and collected a new 
set of data using the same design.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ridge regression

2009-08-19 Thread Frank E Harrell Jr
If you didn't post anonymously I would have made a suggestion.  Full 
names and affiliations should be given.


Frank


spime wrote:

Dear all,

I considered an ordinary ridge regression problem. I followed three
different ways:
1. estimate beta without any standardization
2. estimate standardized beta (standardizing X and y) and then again convert
back  
3. estimate beta using lm.ridge() function



X <- matrix(c(1,2,9,3,2,4,7,2,3,5,9,1),4,3)
y <- t(as.matrix(cbind(2,3,4,5)))

n <- nrow(X)
p <- ncol(X)

#Without standardization
intercept <- rep(1,n)
Xn <- cbind(intercept, X)
K <- diag(c(0,rep(1.5,p)))
beta1 <- solve(t(Xn)%*%Xn+K)%*%t(Xn)%*%y
beta1

#with standardization
ys <- scale(y)
Xs <- scale(X)
K <- diag(1.5,p)
bs <- solve(t(Xs)%*%Xs+K)%*%t(Xs)%*%ys
b <- sd(y)*(bs/sd(X))

intercept <- mean(y)-sum(as.matrix(colMeans(X))*b)
beta2 <- rbind(intercept,b)
beta2

#Using lm.ridge function of MASS package
beta3 <- lm.ridge(y~X,lambda=1.5)

I'm getting three different outputs from the above three approaches,
but I expected them to be the same for all. 


beta1

[,1]
intercept  3.4007944
   0.3977462
   0.2082025
  -0.4829115

beta2

[,1]
intercept  3.3399855
   0.1639469
   0.0262021
  -0.1228987


beta3
 X1  X2  X3 
 3.35158977  0.19460958  0.03152778 -0.15546775 


It would be very helpful if anybody could explain why the
outputs come out different. 


regards.



--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] need help for t.test using by

2009-08-19 Thread David Winsemius


On Aug 19, 2009, at 9:00 AM, 1Rnwb wrote:



any one


I looked at your posting yesterday and could not figure out what your  
statistical question was, nor could I figure out how the t.test  
formula related to the column names in the data.frame. I saw no Group  
or IL1Ra variables in your data. I also tried to import that data and  
found that you had more column headers than columns. I gave up.


--
DW



1Rnwb wrote:


I am trying to do ttest for each plate which has equal number of  
disease
and controls. by searching this forum I found one posting  
suggesting OP to
use by(eo,PlateID, function(.sub) t.test(IL1Ra~Group,data=.sub)).  
when i

modified this for my use I used to get the pvalues for each plate,
recently upgraded to R2.9.x and now i am getting following error  
when i

use this

by(eo,PlateID, function(.sub) t.test(IL1Ra~Group,data=.sub))

Error in t.test.formula(IL1Ra ~ Group, data = .sub) :
 grouping factor must have exactly 2 levels

I checked my data file and the plate ID has P16-P47 and the disease
column is Y/N. I have search all but could not locate the answer.  
Can

anyone suggest me the direction where i am going wrong.
for more detail i amm posting a part of my data.

SampleID  PlateID  Sex  Disease  Duration  Age  prt1  prt2  prt3  prt4
1   P16  F  N  25.33282811  836.08979    20.04582692  295.74   11731.43
2   P16  F  N  32.74912883  243.0652116  53.16487056  383.63   4451.36
3   P16  M  N  3.49961999   181.5587298  23.24604762  522.52   325.52
4   P16  M  N  5.249771666  261.0978097  19.69024684  833.61   2229.04
5   P16  M  N  39.16612385  237.334794   83.97284692  694.03   1204.36
6   P16  F  N  33.49929145  540.486536   29.69569346  895.36   72105.75
7   P16  F  N  3.915997361  5215.378446  35.44324704  1023.63  101680.69
8   P16  F  N  33.49929145  466.5188732  12.93422535  814.26   7169.61
9   P16  F  N  38.1661348   706.5117791  19.69024684  331.62   7296.72
10  P16  M  N  0.916030215  242.3011559  36.40117263  90.13    5962.08
11  P16  F  Y  14975.25  42.08281552  133.8052501  29.0570764   260.55   1280.29
12  P16  M  Y  1095.75   40.41593932  247.6495456  19.69024684  161.64   6685.6
13  P16  F  Y  2922      25.08300169  231.6043765  9.022844487  902.53   37571.39
14  P16  F  Y  12053.25  37.49924765  158.6370595  24.90606549  1.63     471.15
15  P16  F  Y  14610     76.58209602  278.3837309  41.51010914  152.13   16285.49
16  P16  F  Y  3287.25   49.58239171  128.4568603  17.91234645  274.99   41823.21
17  P17  M  N  2.333079994  397.4440508  55.20987654  366.37   90011.27
18  P17  F  N  4.749435461  222.0839813  73.62489675  271.65   903.89
19  P17  F  N  1.749468315  676.8904636  47.7037037   721.19   663.15
20  P17  M  N  33.24946503  413.712486   27.42386831  611.64   4195.86
21  P17  F  N  15.4162131   346.9913086  30.71604938  302.29   2151.47
22  P17  F  N  33.3327405   2598.071161  107.8088094  1800.44  102005.12
23  P17  M  N  9.832656179  3296.535581  54.1563786   2642.73  676.78
24  P17  F  N  3.749446413  488.2589867  15.29151952  562.88   11619.62
25  P17  M  N  9.749380705  159.1536586  23.41690259  132.64   931.14
26  P17  M  Y  12053.25  50.91548265  173.7764722  18.90280088  52.25    15962.05
27  P17  M  Y  3652.5    25.83248096  264.8254141  35.32510288  497.33   15504.95
28  P17  F  Y  28.41607073  415.3599225  35.0617284   1600.67  14224.39
29  P17  F  Y  6209.25   60.74882219  213.7280879  44.41152263  111.36   9018.48
30  P17  M  Y  4748.25   47.33258719  264.0016959  31.24279835  366.81   3809.83
31  P17  M  Y  16071     54.49906148  112.4128794  34.40329218  114.56   1727.02
32  P17  M  Y  7670.25   52.33253243  145.5753317  32.2962963   44.92    15264.84
33  P17  M  Y  8766      31.74913978  220.2561296  34.40329218  130.31   2377.36
34  P17  F  Y  8400.75   40.83231669  194.6662059  48.09876543  407.83   19587.61
35  P17  M  Y  12053.25  36.1661567   212.1613578  96.51775947  621.42   3269.9
36  P17  M  Y  8035.5    57.91540599  115.2852178

Re: [R] Fitting a logistic regression

2009-08-19 Thread Jun Shen
I would suggest a model with a baseline level, something like

nls(AMP~E0+(Emax-E0)*Time**gamma/(EC50**gamma+Time**gamma), data=your.data,
start=list(EC50=50,gamma=2,E0=0.2,Emax=1.2)) -> mod.test

AIC(mod.test) does improve. Hope this helps.
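A self-contained sketch of this suggestion, using the Time/AMP values posted below; the starting values are the ones given above, and the side-by-side comparison with SSlogis is an assumption about what the original poster wants to check:

```r
# Data from the original question
conc <- data.frame(Time = c(0, 10, 20, 40, 60, 90, 120),
                   AMP  = c(0.200, 0.1958350, 0.2914560, 0.6763628,
                            0.8494534, 0.9874526, 1.0477692))

# Sigmoid Emax model with a baseline E0, as suggested
mod.test <- nls(AMP ~ E0 + (Emax - E0) * Time^gamma / (EC50^gamma + Time^gamma),
                data = conc,
                start = list(EC50 = 50, gamma = 2, E0 = 0.2, Emax = 1.2))

# Three-parameter self-starting logistic, for comparison
mod.logis <- nls(AMP ~ SSlogis(Time, Asym, xmid, scal), data = conc)

AIC(mod.logis, mod.test)  # lower AIC indicates the better fit
```

(AIC comparisons between nls fits assume the same response variable and error model in both fits.)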

Jun

On Wed, Aug 19, 2009 at 5:04 AM, Dani Valverde daniel.valve...@uab.catwrote:

 Hello,
 I have this data:
  Time  AMP
 0 0.200
10 0.1958350
20 0.2914560
40 0.6763628
60 0.8494534
90 0.9874526
   120 1.0477692

 where AMP is the concentration of this metabolite with time. If you plot
 the data, you can see that it could be fitted using a logistic
 regression. For this purpose, I used this code:

 AMP.nls <- nls(AMP ~ SSlogis(Time, Asym, xmid, scal), data =
 concentrations, model = T)

 When plotting the fitted function, it seems to fit quite well at
 the end of the time course. However, at the beginning the fit seems
 not so good. How can I achieve a better fit? Forgive me if it is a
 stupid question, but I am just starting with nonlinear regression.
 Thank you,

 Dani
 --
 [?]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jun Shen PhD
PK/PD Scientist
BioPharma Services
Millipore Corporation
15 Research Park Dr.
St Charles, MO 63304
Direct: 636-720-1589

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Package read large file

2009-08-19 Thread Michael Knudsen
On Wed, Aug 19, 2009 at 10:51 AM, Mohamed
Lajnefmohamed.laj...@inserm.fr wrote:

 I am looking for packages that can read large files in R.
 Any suggestions are welcome.

As already pointed out by Jim, your question is not very specific. My
wild guess is that you probably have some memory issues -- if that is
the case, maybe the package bigmemory can alleviate your pain.
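A hedged sketch of what that might look like; the file names are hypothetical, and the whole call is guarded so it only runs when bigmemory and the file are actually available:

```r
# Hypothetical CSV; bigmemory memory-maps the data instead of
# reading it all into RAM at once
if (requireNamespace("bigmemory", quietly = TRUE) &&
    file.exists("big_file.csv")) {
  x <- bigmemory::read.big.matrix("big_file.csv", header = TRUE,
                                  type = "double",
                                  backingfile = "big_file.bin",
                                  descriptorfile = "big_file.desc")
  print(dim(x))       # dimensions, without loading everything into memory
  print(x[1:5, 1:3])  # subsets are read on demand from the file-backed matrix
}
```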

Best,
Michael

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mild and extreme outliers in boxplot

2009-08-19 Thread Ottorino-Luca Pantani

Rnewbie ha scritto:

dear all,

could somebody tell me how I can plot mild outliers as a circle(°) and
extreme outliers as an asterisk(*) in a box-whisker plot?

Thanks very much in advance
  

?boxplot

or

help(bxp)
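Since neither help page draws the two outlier classes differently out of the box, here is a hedged sketch using the usual fence definitions (mild: beyond 1.5*IQR, extreme: beyond 3*IQR) with hypothetical data:

```r
# Hypothetical sample with both mild and extreme outliers
set.seed(42)
x <- c(rnorm(100), 4, 4.5, 8)

boxplot(x, outline = FALSE)        # draw the box without the default outliers
q   <- quantile(x, c(0.25, 0.75))
iqr <- IQR(x)

# Classify points by how far they fall outside the quartiles
mild    <- x[(x < q[1] - 1.5 * iqr & x >= q[1] - 3 * iqr) |
             (x > q[2] + 1.5 * iqr & x <= q[2] + 3 * iqr)]
extreme <- x[x < q[1] - 3 * iqr | x > q[2] + 3 * iqr]

points(rep(1, length(mild)),    mild,    pch = 1)  # circles for mild outliers
points(rep(1, length(extreme)), extreme, pch = 8)  # asterisks for extreme ones
```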

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Embedding lists in matrices and matrices in lists

2009-08-19 Thread Michael Kogan
Thanks, that was the solution! But in fact I didn't want to have this 
list of lists layer at all. And now I'm having trouble writing 
matrices into the database.  It's really really strange... If I write a 
matrix into the database manually everything works, but if I create a 
function which adds a matrix into the database and then run the function 
it fails... Look at this (sorry for the long code block, in fact you can 
skip the sums and test_sums functions):

##test matrix###
m1=matrix(c(
1,0,
0,1
),nrow=2, byrow=TRUE)

#sums###
sums=function(matrix)
{
c(sort(colSums(matrix)),sort(rowSums(matrix)))
}

##test_sums#
test_sums=function(matrix)
{
liste=database[[nrow(matrix),ncol(matrix)]][[1]]
w=1
if(length(liste)>0){
for(i in 1:length(liste)){
if(all(sums(matrix)==sums(liste[[i]]))){
w0=0
}else{
w0=1}
w=w*w0}
}else{
w=2}
w
}

#write to db
write_to_db=function(matrix){
en=nrow(matrix)
fn=ncol(matrix)
if(test_sums(matrix)==1){
print("matrix isn't in the DB yet")
database[en,fn][[1]]=list(database[en,fn][[1]], matrix)
}else if(test_sums(matrix)==2){
print("matrix isn't in the DB entry yet and DB is empty")
database[en,fn][[1]][[1]]=list(matrix)
}else{
print("matrix already exists in the DB")
}
}

##create database##
database=matrix(list(),2,2)

##trying to write a matrix into db using the function##
write_to_db(m1)
database

###writing a matrix into db without using a function###
en=nrow(m1)
fn=ncol(m1)
if(test_sums(m1)==1){
print("matrix isn't in the DB yet")
database[en,fn][[1]]=list(database[en,fn][[1]], m1)
}else if(test_sums(m1)==2){
print("matrix isn't in the DB entry yet and DB is empty")
database[en,fn][[1]][[1]]=list(m1)
}else{
print("matrix already exists in the DB")
}
database

###test whether matrix already is in db##
write_to_db(m1)

And the output is:

[1] "matrix isn't in the DB entry yet and DB is empty"
     [,1] [,2]
[1,] NULL NULL
[2,] NULL NULL
[1] "matrix isn't in the DB entry yet and DB is empty"
     [,1] [,2]
[1,] NULL NULL
[2,] NULL List,1

[1] "matrix already exists in the DB"
So writing the matrix into the database using the function fails while 
writing it using the same commands but not through a function works! 
Maybe it's a problem with all these layers in the database matrix?
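It is not the nested layers: assignments like `database[en,fn][[1]] <- ...` inside a function modify a local copy of `database`, and the global object is left untouched (R's usual copy-on-modify semantics), whereas at top level the same assignment hits the global object directly. A hedged sketch of the two standard fixes, using a simplified `write_to_db`:

```r
database <- matrix(list(), 2, 2)

# Fix 1: return the modified database and reassign at the call site
write_to_db <- function(db, m) {
  db[[nrow(m), ncol(m)]] <- c(db[[nrow(m), ncol(m)]], list(m))
  db                                  # return the updated copy
}
database <- write_to_db(database, diag(2))

# Fix 2: superassignment, writing straight to the enclosing environment
write_to_db2 <- function(m) {
  database[[nrow(m), ncol(m)]] <<- c(database[[nrow(m), ncol(m)]], list(m))
}
write_to_db2(diag(2))

length(database[[2, 2]])  # 2: each stored matrix is one list element
```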


Petr PIKAL schrieb:

Hi

r-help-boun...@r-project.org napsal dne 19.08.2009 13:04:39:

  

Strange, it doesn't work for me:

Error in database[4, 4][[1]][1, ] : incorrect number of dimensions
Execution halted

R version 2.9.0 (2009-04-17) on Arch Linux, no additional packages 
installed.



database[4,4][[1]][,1]

does not work for me either but it is result of your complicated structure 
of nested lists


This works

  

database[4,4][[1]][[1]][,1]


[1] 0 1 1 1

See the structure

Matrix of lists

  

database


 [,1] [,2] [,3] [,4]   [,5]
[1,] NULL NULL NULL NULL   NULL
[2,] NULL NULL NULL NULL   NULL
[3,] NULL NULL NULL NULL   NULL
[4,] NULL NULL NULL List,1 NULL
[5,] NULL NULL NULL NULL   NULL

List of lists
  

database[4,4]


[[1]]
[[1]][[1]]
 [,1] [,2] [,3] [,4]
[1,]    0    1    1    1
[2,]    1    0    1    1
[3,]    1    1    0    1
[4,]    1    1    1    0

List which contains your matrix

  

database[4,4][[1]]


[[1]]
 [,1] [,2] [,3] [,4]
[1,]    0    1    1    1
[2,]    1    0    1    1
[3,]    1    1    0    1
[4,]    1    1    1    0

Here is your matrix

  

database[4,4][[1]][[1]]


 [,1] [,2] [,3] [,4]
[1,]    0    1    1    1
[2,]    1    0    1    1
[3,]    1    1    0    1
[4,]    1    1    1    0
  


Regards
Petr


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Hist kernel density estimates

2009-08-19 Thread maram salem
Dear All,
Attached are the codes of a histogram & a kernel density estimate and the
output they produced.

In fact the variable q is a vector of 1000 simulated values; that is, I
generated 1000 samples from the Pareto distribution, from each sample I
calculated the value of q (a certain function of the sample observations), and thus
I was left with 1000 values of q, and I don't know the distribution of q.

Hence, I used the attached codes for histogram and kernel density estimation
to estimate the density of q.

But what I'm really interested in is to estimate the probability that q is
greater than a certain value, for example P(q > 11000), using the density estimates
I obtained.
Could you help me with a function or some document to do this?
Thank you so much

Maram


  __
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
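A hedged sketch of both standard answers to the tail-probability question; since the real q is not available here, a stand-in vector of 1000 values is simulated in its place:

```r
# Stand-in for the poster's q; replace with the actual simulated values
set.seed(1)
q <- 9000 + 3000 * rexp(1000)

# 1) Empirical estimate: the proportion of values above the cutoff
p.emp <- mean(q > 11000)

# 2) Density-based estimate: integrate the kernel density beyond the cutoff
d <- density(q)
f <- approxfun(d$x, d$y, yleft = 0, yright = 0)
p.dens <- integrate(f, 11000, max(d$x))$value

c(empirical = p.emp, from.density = p.dens)
```

The empirical proportion is usually the simpler and less biased choice; the density-based integral mainly serves as a smoothed cross-check.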


Re: [R] Hist kernel density estimates

2009-08-19 Thread indranil basu
Hi,
I think I've not received the attachment. Please check.
Thanks and Regards.
- Indranil Basu,
Bangalore, India

On Wed, Aug 19, 2009 at 9:01 PM, maram salem marammagdysa...@yahoo.comwrote:

 Dear All,
 Attached are the codes of a histogram & a kernel density estimate and the
 output they produced.

 In fact the variable q is a vector of 1000 simulated values; that is, I
 generated 1000 samples from the Pareto distribution, from each sample I
 calculated the value of q (a certain function of the sample observations), and
 thus I was left with 1000 values of q, and I don't know the distribution of
 q.

 Hence, I used the attached codes for histogram and kernel density
 estimation to estimate the density of q.

 But what I'm really interested in is to estimate the probability that q is
 greater than a certain value, for example P(q > 11000), using the density
 estimates I obtained.
 Could you help me with a function or some document to do this?
 Thank you so much

 Maram



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] scale or not to scale that is the question - prcomp

2009-08-19 Thread Petr PIKAL
Ok

Thank you for your time.

Best regards
Petr Pikal

Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009 16:29:07:

 On 8/19/2009 10:14 AM, Petr PIKAL wrote:
  Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009 15:25:00:
  
  On 19/08/2009 9:02 AM, Petr PIKAL wrote:
   Thank you
   
   Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009 
14:49:52:
   
   On 19/08/2009 8:31 AM, Petr PIKAL wrote:
   Dear all
  
   
   snip
   
   I would say the answer depends on the meaning of the variables. In 

  the 
   unusual case that they are measured in dimensionless units, it 
might 
   make sense not to scale.  But if you are using arbitrary units of 
   measurement, do you want your answer to depend on them?  For 
example, 
  if 
   
   you change from Kg to mg, the numbers will become much larger, the 

   variable will contribute much more variance, and it will become a 
  more 
   important part of the largest principal component.  Is that 
sensible?
   
   Basically variables are in percentages (all between 0 and 6%) 
except 
  dus 
   which is present or not present (for the purpose of prcomp 
transformed 
  to 
   0/1 by as.numeric:). The only variable which is not such is iep 
which 
  is 
   basically in range 5-8. So ranges of all variables are quite 
similar. 
   
   What surprises me is that biplot without scaling I can interpret by 

  used 
   variables while biplot with scaling is totally different and those 
two 
  
   pictures does not match at all. This is what surprised me as I 
would 
   expected just a small difference between results from those two 
  settings 
   as all numbers are quite comparable and does not differ much.
  
  
  If you look at the standard deviations in the two cases, I think you 
can 
  
  see why this happens:
  
  Scaled:
  
  Standard deviations:
  [1] 1.3335175 1.2311551 1.0583667 0.7258295 0.2429397
  
  Not Scaled:
  
  Standard deviations:
  [1] 1.0030048 0.8400923 0.5679976 0.3845088 0.1531582
  
  
  The first two sds are close, so small changes to the data will affect 

  
  I see. But I would expect that changes to data made by scaling would 
not 
  change it in such a way that unscaled and scaled results are 
completely 
  different.
  
  their direction a lot.  Your biplots look at the 2nd and 3rd 
components.
  
  Yes because grouping in 2nd and 3rd component biplot can be easily 
  explained by values of some variables (without scaling). 
  
  I must admit that I do not use prcomp much often and usually scaling 
can 
  give me explainable result, especially if I use it to variable 
  reduction. Therefore I am reluctant to use it in this case.
  
  when I try more standard way
  
  fit <- lm(iep ~ sio2 + al2o3 + p2o5 + as.numeric(dus), data = rglp)
  summary(fit)
  
  Call:
  lm(formula = iep ~ sio2 + al2o3 + p2o5 + as.numeric(dus), data = rglp)
  
  Residuals:
   Min   1Q   Median   3Q  Max 
  -0.41751 -0.15568 -0.03613  0.20124  0.43046 
  
  Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
  (Intercept)      7.12085    0.62257  11.438 8.24e-08 ***
  sio2            -0.67250    0.20953  -3.210 0.007498 **
  al2o3            0.40534    0.08641   4.691 0.000522 ***
  p2o5            -0.76909    0.11103  -6.927 1.59e-05 ***
  as.numeric(dus) -0.64020    0.18101  -3.537 0.004094 **
  
  I get quite plausible result which can be interpreted without 
problems.
  
  My data is the result of a designed experiment (more or less :) and therefore
  all variables are significant. Is that the reason why scaling may be
  inappropriate in this case?
 
 No, I think it's just that the cloud of points is approximately 
 spherical in the first 2 or 3 principal components, so the principal 
 component directions are somewhat arbitrary.  You just got lucky that 
 the 2nd and 3rd components are interpretable:  I wouldn't put too much 
 faith in being able to repeat that if you went out and collected a new 
 set of data using the same design.
 
 Duncan Murdoch
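A minimal illustration of how scaling changes the component directions, using the built-in USArrests data rather than the poster's dataset:

```r
fit.raw    <- prcomp(USArrests)                 # variances dominated by Assault
fit.scaled <- prcomp(USArrests, scale. = TRUE)  # each variable has unit variance

fit.raw$sdev     # component standard deviations in the original units
fit.scaled$sdev  # component standard deviations after scaling

# The loadings differ substantially between the two fits
round(fit.raw$rotation[, 1:2], 2)
round(fit.scaled$rotation[, 1:2], 2)
```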

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] edit.row.names taking row names from the edited dataframe

2009-08-19 Thread Ottorino-Luca Pantani

Erik Iverson ha scritto:


This really has nothing to do with the row names issue you mention, as far as I can tell.  It has to do with this: 


http://stackoverflow.com/questions/1195826/dropping-factor-levels-in-a-subsetted-data-frame-in-r

Best,
Erik Iverson 
  

Perhaps you may also take a look at this
http://wiki.r-project.org/rwiki/doku.php?id=tips:data-manip:drop_unused_levels

df.mydata$problemFactor <- df.mydata$problemFactor[, drop = TRUE]
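A short sketch of the behaviour, with a hypothetical data frame in place of df.mydata:

```r
df  <- data.frame(f = factor(c("a", "b", "c")))
sub <- df[df$f != "c", , drop = FALSE]  # subsetting keeps all original levels
levels(sub$f)                           # "a" "b" "c"

sub$f <- sub$f[, drop = TRUE]           # re-factor, dropping the unused level
levels(sub$f)                           # "a" "b"
```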


8rino

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] graph label greek symbol failure

2009-08-19 Thread Gavin Simpson
On Wed, 2009-08-19 at 14:20 +0100, e-letter wrote:
 On 18/08/2009, Gavin Simpson gavin.simp...@ucl.ac.uk wrote:
  On Tue, 2009-08-18 at 13:06 +0100, e-letter wrote:
  On 17/08/2009, Michael Knudsen micknud...@gmail.com wrote:
   On Mon, Aug 17, 2009 at 12:51 PM, e-letter inp...@gmail.com wrote:
  
   I have tried to add the delta (δ) symbol to the y axis label and the
   result is D, using the command:
  
    ...ylab="δt"...
  
   Try ylab = expression(delta*t) instead.
  
  This does not work, the result is
  expression(delta*t)
 
  It works for the rest of us who suggested this.
 
  plot(1:10, ylab = expression(delta*t))
 
 True, but the following commands fail:
 plot(1:10,ylab="temperature expression(delta*t)")
 plot(1:10,ylab="temperature" expression(delta*t))
 Error: syntax error, unexpected SYMBOL, expecting ',' in
 plot(1:10,ylab="temperature" expression
 
 So I want to be able to have 'δt' and 'temperature δt' as a y-axis label.

Ah, but you never said that. We aren't mind readers you know ;-)

What you need is an expression that will, when used, give you a text
label containing "Temperature δt". What you have done is create a
character string of the literal "...expression(delta*t)", which is of
course why it is printed as the label - after all, you asked R to do
this.

I suggest you read the plotmath help page, accessed via:

?plotmath

executed at the prompt.

I found this expression stuff complicated when I first started out,
until I realised that whatever gets passed to expression() has to be
a syntactically valid R command. So for your particular example we need:

plot(1:10, ylab = expression(Temperature ~ delta*t))

"Temperature" is treated as the character string "Temperature"; the ~
means leave some space between the bits either side of it; delta is a
special name that will get replaced by the relevant glyph; the *
juxtaposes delta and t, i.e. places them next to one another without any
space in between.

I see that Baptiste has also just replied with a similar solution to
mine. Baptiste's solution quotes "temperature " with the spacing being
stated explicitly. Whilst both yield similar results, using ~ and not
having to quote things results in a simpler solution, IMHO.
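The variants discussed above, side by side, plus bquote() for the related case where a computed value must appear in the label (the value 3.2 is purely illustrative):

```r
# Tilde spacing, with a bare name treated as a string
plot(1:10, ylab = expression(Temperature ~ delta * t))

# Quoted-string variant, with the space inside the string
plot(1:10, ylab = expression("temperature " * delta * t))

# bquote() substitutes a computed value into the label via .()
dT <- 3.2
plot(1:10, ylab = bquote(Temperature ~ delta * t == .(dT)))
```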

HTH

Gavin

 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Hist kernel density estimates

2009-08-19 Thread David Winsemius


On Aug 19, 2009, at 11:34 AM, indranil basu wrote:


Hi,
   I think I've not received the attachment. Please check.


In all likelihood the OP did not read and follow the instructions  
regarding attachments that are to be found in the Posting Guide.



Thanks and Regards.
- Indranil Basu,
Bangalore, India

On Wed, Aug 19, 2009 at 9:01 PM, maram salem marammagdysa...@yahoo.com 
wrote:



Dear All,
Attached are the codes of a histogram & a kernel density estimate and the
output they produced.

In fact the variable q is a vector of 1000 simulated values; that is, I
generated 1000 samples from the Pareto distribution, from each sample I
calculated the value of q (a certain function of the sample observations), and
thus I was left with 1000 values of q, and I don't know the distribution of q.

Hence, I used the attached codes for histogram and kernel density
estimation to estimate the density of q.

But what I'm really interested in is to estimate the probability that q is
greater than a certain value, for example P(q > 11000), using the density
estimates I obtained.
Could you help me with a function or some document to do this?
Thank you so much

Maram




David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] visualizzazione colonne

2009-08-19 Thread Ottorino-Luca Pantani

sabrina.michie...@alice.it wrote:

hi,
I have opened a file in R as a data frame of 15000 rows and 29 columns.
In the console, however, only the first and last columns and roughly
the last 8000 rows are displayed.
Is a complete view possible?
Thanks,
Sabrina


  

Hi Sabrina,
please do not forget that this list is in English. Remember it for the future.

I find it very inconvenient to have all the rows displayed at one time,
especially when there are > 15000 of them!


Should I need to look at the whole dataset, I would use page(mydataframe)
(Emacs or XEmacs + ESS needed)

or from the console

fix(mydataframe)  # very dangerous!

or
edit(mydataframe)

Before attempting this, read the help
?edit
?fix

Should you want to look at only some part of the dataframe, try for example

mydataframe[3:15, 7:8]

This will show you the data contained in rows 3 to 15 and
columns 7 to 8.


Ciao






--
Ottorino-Luca Pantani, Università di Firenze

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confidence interval on parameters from optim function

2009-08-19 Thread Arun.stat

Regarding your second question, I guess your target function somehow returns an
undefined value (e.g. the logarithm of zero) for some unfortunate parameter
values in the parameter space.
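A hedged sketch of both pieces: Wald-type intervals from the numerically estimated Hessian (valid only if f is a negative log-likelihood), and tryCatch() to keep a loop running past optimizer failures. The toy objective below is a stand-in for the poster's f:

```r
# Toy negative log-likelihood with a known optimum, standing in for f
f <- function(p) sum((p - c(0.1, 0.05, 1))^2 / c(0.01, 0.01, 0.5)) / 2

res <- optim(c(0.08, 0.04, 1), f, method = "L-BFGS-B",
             lower = c(0, 0, 0), hessian = TRUE)

# Approximate standard errors and 95% Wald confidence intervals
se <- sqrt(diag(solve(res$hessian)))
ci <- cbind(lower = res$par - 1.96 * se, upper = res$par + 1.96 * se)

# Keeping a loop alive when L-BFGS-B hits non-finite values
fit <- tryCatch(optim(c(0.08, 0.04, 1), f, method = "L-BFGS-B",
                      lower = c(0, 0, 0)),
                error = function(e) NULL)
if (is.null(fit)) {
  # retry from different, hopefully safer, starting values here
}
```

For p-values one would additionally need the usual z-statistics res$par / se, again under the negative log-likelihood assumption.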



Devred, Emmanuel wrote:
 
 Hi everyone,
 
 I have two questions:
 
 I would like to get confidence intervals on the coefficients derived
 from the optim() function.
 I apply optim() to a given function f
 res <-
 optim(c(0.08,0.04,1),f,NULL,method="L-BFGS-B",lower=c(0.,0.,0.))
 And I would like to get the p-value and confidence intervals associated
 with 
 res$par
 
 My second question deals with error message. I am doing a loop with the
 optim() function in it, when I get an error message like below, the loop
 is stopped, however I would like to change my initial values to avoid
 this error message and keep the loop going, so if it crashes when I am
 away the program can still run, any idea ?
 
 Error in optim(c(0.08, 0.04, 1), f, NULL, method = "L-BFGS-B", lower =
 c(0,  :
   L-BFGS-B needs finite values of 'fn'
 
 Thank you for any information on these two problems.
 
   Emmanuel
 
 ---
 Dr. Emmanuel Devred
 Bedford Institute of Oceanography,
 1 Challenger Drive,
 Dartmouth, Nova Scotia, B2Y 4A2,
 Canada
 
 Ph:  (1) 902 426-4681
 Fax: (1) 902 426-9388
 
 devr...@mar.dfo-mpo.gc.ca
 
 http://myweb.dal.ca/edevred/
 ---
 
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Confidence-interval-on-parameters-from-optim-function-tp25044388p25046772.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to compute other columns without a column for sample name

2009-08-19 Thread Ottorino-Luca Pantani

sandsky ha scritto:

My data has variable names in the first row and sample names in the first
column. I want to take the log of all the data, but how do I compute it
without including the first column of sample names?

  

log.raw_data=log(raw_data,base=2)

Error in Math.data.frame(list(sample_id = c(1L, 2L, 3L, 4L, 5L, 6L, 7L,  : 
  non-numeric variable in data frame: sample_id


Thank you in advance,

Jin
  

You are trying to calculate a logarithm for a character variable.
You most probably imported the data with read.table (header = FALSE).
try

sapply(raw_data, is.numeric)

to see which columns contain numeric, log-transformable values

8rino
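Putting that together, a sketch with a hypothetical data frame in the posted layout (an id column followed by numeric columns):

```r
raw_data <- data.frame(sample_id = c("s1", "s2", "s3"),
                       a = c(1, 2, 4), b = c(8, 16, 32))

log.raw_data <- raw_data
num <- sapply(raw_data, is.numeric)       # which columns can be logged
log.raw_data[num] <- log(raw_data[num], base = 2)
log.raw_data
#   sample_id a b
# 1        s1 0 3
# 2        s2 1 4
# 3        s3 2 5
```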

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with predict.coxph

2009-08-19 Thread Terry Therneau
-- begin included message ---
We occasionally utilize the coxph function in the survival library to
fit multinomial logit models. (The Breslow method produces the same
likelihood function as the multinomial logit). We then utilize the
predict function to create summary results for various combinations of
covariates.  For example:

...

The problem is that under R 2.8.1 and R 2.9.1 the previous line fails
with the following error:

 totalut <- predict(mod1, newdata = newdata, type = "lp")
Error in model.frame.default(Terms2, newdata, xlev = object$xlevels) :
  variable lengths differ (found for 'Price')
In addition: Warning message:
'newdata' had 25 rows but variable(s) found have 43350 rows

---end inclusion --


 The coxph code was updated to use the standard R methods for
prediction with factors.  I even added test cases -- but obviously I've
missed something.  I'll look into a fix and get back to you.  
  Can I assume that Price and Product were both factors with the 5
levels 1:5?

Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fw: Hist kernel density estimates

2009-08-19 Thread maram salem


For the hist estimate:
par(mex=1.3)
dens <- density(q)
options(scipen=4)
ylim <- range(dens$y)
h <- hist(q, breaks="scott", freq=FALSE, probability=TRUE,
  right=FALSE, xlim=c(9000,16000), ylim=ylim, main="Histogram of q (scott)")
lines(dens)
box()

For the kernel estimate:
options(scipen=4)
d <- density(q, bw="nrd0", kernel="gaussian")
d
plot(d)
 
In fact the variable q is a vector of 1000 simulated values; that is, I
generated 1000 samples from the Pareto distribution, from each sample I
calculated the value of q (a certain function of the sample observations), and thus
I was left with 1000 values of q, and I don't know the distribution of q.

Hence, I used the attached codes for histogram and kernel density estimation
to estimate the density of q.

But what I'm really interested in is to estimate the probability that q is
greater than a certain value, for example P(q > 11000), using the density estimates
I obtained.
Could you help me with a function or some document to do this?
Thank you so much

Maram


      
Dear All,
Attached are the codes of a histogram & a kernel density estimate and the
output they produced. I'll copy the codes here in case there's something wrong
with the attachment.


  __
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] graph label greek symbol failure

2009-08-19 Thread e-letter
On 19/08/2009, Gavin Simpson gavin.simp...@ucl.ac.uk wrote:
 On Wed, 2009-08-19 at 14:20 +0100, e-letter wrote:
 On 18/08/2009, Gavin Simpson gavin.simp...@ucl.ac.uk wrote:
  On Tue, 2009-08-18 at 13:06 +0100, e-letter wrote:
  On 17/08/2009, Michael Knudsen micknud...@gmail.com wrote:
   On Mon, Aug 17, 2009 at 12:51 PM, e-letter inp...@gmail.com wrote:
  
   I have tried to add the delta (δ) symbol to the y axis label and the
   result is D, using the command:
  
    ...ylab="δt"...
  
   Try ylab = expression(delta*t) instead.
  
  This does not work, the result is
  expression(delta*t)
 
  It works for the rest of us who suggested this.
 
  plot(1:10, ylab = expression(delta*t))
 
 True, but the following commands fail:
  plot(1:10,ylab="temperature expression(delta*t)")
  plot(1:10,ylab="temperature" expression(delta*t))
  Error: syntax error, unexpected SYMBOL, expecting ',' in
  plot(1:10,ylab="temperature" expression

 So I want to be able to have 'δt' and 'temperature δt' as a y-axis label.

 Ah, but you never said that. We aren't mind readers you know ;-)

 What you need is an expression that will, when used, give you a text
 label containing "Temperature δt". What you have done is create a
 character string of the literal "...expression(delta*t)", which is of
 course why it is printed as the label - after all, you asked R to do
 this.

 I suggest you read the plotmath help page, accessed via:

 ?plotmath

 executed at the prompt.

 I found this expression stuff complicated when I first started out,
 until I realised that whatever gets passed to expression() has to be
 a syntactically valid R command. So for your particular example we need:

 plot(1:10, ylab = expression(Temperature ~ delta*t))

 "Temperature" is treated as the character string "Temperature"; the ~
 means leave some space between the bits either side of it; delta is a
 special name that will get replaced by the relevant glyph; the *
 juxtaposes delta and t, i.e. places them next to one another without any
 space in between.

Being familiar with LaTeX, I interpreted your description of the tilde
(~) as a non-breaking space, so I used the command:
plot(1:10,ylab=expression(temperature~delta*t))
to give me the result I wanted; in other words, space characters are
not needed between "temperature", ~, and delta as in your suggestion.

Thank you.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] scale or not to scale that is the question - prcomp

2009-08-19 Thread Albyn Jones
scaling changes the metric, ie which things are close to each other.
there is no reason to expect the picture to look the same when you 
change the metric.

On the other hand, your two pictures don't look so different to me.
It appears that the scaled plot is similar to the unscaled plot, with
the roles of the second and third pc reversed, ie the scaled plot is
similar but rotated and distorted.  For example, the observations
forming the strip across the bottom of the first plot form a vertical
strip on the right hand side of the second plot.

albyn

On Wed, Aug 19, 2009 at 02:31:23PM +0200, Petr PIKAL wrote:
 Dear all
 
 here is my data called rglp
 
 structure(list(vzorek = structure(1:17, .Label = c("179/1/1",
 "179/2/1", "180/1", "181/1", "182/1", "183/1", "184/1", "185/1",
 "186/1", "187/1", "188/1", "189/1", "190/1", "191/1", "192/1",
 "R310", "R610L"), class = "factor"), iep = c(7.51, 7.79, 5.14,
 6.35, 5.82, 7.13, 5.95, 7.27, 6.29, 7.5, 7.3, 7.27, 6.46, 6.95,
 6.32, 6.32, 6.34), skupina = c(7.34, 7.34, 5.14, 6.23, 6.23,
 7.34, 6.23, 7.34, 6.23, 7.34, 7.34, 7.34, 6.23, 7.34, 6.23, 6.23,
 6.23), sio2 = c(0.023, 0.011, 0.88, 0.028, 0.031, 0.029, 0.863,
 0.898, 0.95, 0.913, 0.933, 0.888, 0.922, 0.882, 0.923, 1, 1),
 p2o5 = c(0.78, 0.784, 1.834, 1.906, 1.915, 0.806, 1.863,
 0.775, 0.817, 0.742, 0.783, 0.759, 0.787, 0.758, 0.783, 3,
 2), al2o3 = c(5.812, 5.819, 3.938, 5.621, 3.928, 3.901, 5.621,
 5.828, 4.038, 5.657, 3.993, 5.735, 4.002, 5.728, 4.042, 6,
 5), dus = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L), .Label = c("ano", "ne"), class =
 "factor")), .Names = c("vzorek",
 "iep", "skupina", "sio2", "p2o5", "al2o3", "dus"), class = "data.frame",
 row.names = c(NA,
 -17L))
 
 and I try to do principal component analysis. Here is one without scaling
 
 fit <- prcomp(~iep+sio2+al2o3+p2o5+as.numeric(dus), data=rglp, factors=2)
 biplot(fit, choices=2:3,xlabs=rglp$vzorek, cex=.8)
 
 you can see that the data form 3 groups according to the variables sio2 and dus,
 which seems reasonable, as the lowest group has a different value of dus =
 "ano" while the highest group has a low value of sio2.
 
 But when I do the same with scale=T
 
 fit <- prcomp(~iep+sio2+al2o3+p2o5+as.numeric(dus), data=rglp, factors=2,
 scale=T)
 biplot(fit, choices=2:3,xlabs=rglp$vzorek, cex=.8)
 
 I get completely different picture which is not possible to interpret in 
 such an easy way.
 
 So if anybody can advise me whether I should follow the recommendation from the help 
 page (which says "The default is FALSE for consistency with S, but in general 
 scaling is advisable.") or whether I should stay with scale = FALSE and the simply 
 interpretable result?
  
 Thank you.
 
 Petr Pikal
 petr.pi...@precheza.cz
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




Re: [R] scale or not to scale that is the question - prcomp

2009-08-19 Thread Kevin Wright
If you mentally rotate the second biplot by 90 degrees, the plots are not so
different.  This just indicates that the 2nd and 3rd principal components
have switched roles.

Kevin Wright
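Kevin's observation can be checked directly: when two components explain similar amounts of variance, their order (and sign) is nearly arbitrary, and rescaling can swap them. A minimal sketch with synthetic data (not Petr's rglp):

```r
set.seed(1)
z <- matrix(rnorm(300), 100, 3)
z[, 1] <- z[, 1] * 3                  # make column 1 dominate the unscaled fit

fit_raw    <- prcomp(z, scale. = FALSE)
fit_scaled <- prcomp(z, scale. = TRUE)

# When successive sdev values are close, the corresponding directions are
# poorly determined and can trade places between the two analyses.
fit_raw$sdev
fit_scaled$sdev
```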


On Wed, Aug 19, 2009 at 10:09 AM, Petr PIKAL petr.pi...@precheza.cz wrote:

 Ok

 Thank you for your time.

 Best regards
 Petr Pikal

 Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009 16:29:07:

  On 8/19/2009 10:14 AM, Petr PIKAL wrote:
   Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009 15:25:00:
  
   On 19/08/2009 9:02 AM, Petr PIKAL wrote:
Thank you
   
Duncan Murdoch murd...@stats.uwo.ca napsal dne 19.08.2009
 14:49:52:
   
On 19/08/2009 8:31 AM, Petr PIKAL wrote:
Dear all
   
   
snip
   
I would say the answer depends on the meaning of the variables. In

   the
unusual case that they are measured in dimensionless units, it
 might
make sense not to scale.  But if you are using arbitrary units of
measurement, do you want your answer to depend on them?  For
 example,
   if
   
you change from Kg to mg, the numbers will become much larger, the

variable will contribute much more variance, and it will become a
   more
important part of the largest principal component.  Is that
 sensible?
   
Basically variables are in percentages (all between 0 and 6%)
 except
   dus
which is present or not present (for the purpose of prcomp
 transformed
   to
0/1 by as.numeric:). The only variable which is not such is iep
 which
   is
basically in range 5-8. So ranges of all variables are quite
 similar.
   
What surprises me is that biplot without scaling I can interpret by

   used
variables while biplot with scaling is totally different and those
 two
  
pictures does not match at all. This is what surprised me as I
 would
expected just a small difference between results from those two
   settings
as all numbers are quite comparable and does not differ much.
  
  
   If you look at the standard deviations in the two cases, I think you
 can
  
   see why this happens:
  
   Scaled:
  
   Standard deviations:
   [1] 1.3335175 1.2311551 1.0583667 0.7258295 0.2429397
  
   Not Scaled:
  
   Standard deviations:
   [1] 1.0030048 0.8400923 0.5679976 0.3845088 0.1531582
  
  
   The first two sds are close, so small changes to the data will affect

  
   I see. But I would expect that changes to data made by scaling would
 not
   change it in such a way that unscaled and scaled results are
 completely
   different.
  
   their direction a lot.  Your biplots look at the 2nd and 3rd
 components.
  
   Yes because grouping in 2nd and 3rd component biplot can be easily
   explained by values of some variables (without scaling).
  
   I must admit that I do not use prcomp much often and usually scaling
 can
   give me explainable result, especially if I use it to variable
   reduction. Therefore I am reluctant to use it in this case.
  
   when I try more standard way
  
    fit <- lm(iep~sio2+al2o3+p2o5+as.numeric(dus), data=rglp)
   summary(fit)
  
   Call:
   lm(formula = iep ~ sio2 + al2o3 + p2o5 + as.numeric(dus), data = rglp)
  
   Residuals:
Min   1Q   Median   3Q  Max
   -0.41751 -0.15568 -0.03613  0.20124  0.43046
  
   Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
    (Intercept)      7.12085    0.62257  11.438 8.24e-08 ***
    sio2            -0.67250    0.20953  -3.210 0.007498 ** 
    al2o3            0.40534    0.08641   4.691 0.000522 ***
    p2o5            -0.76909    0.11103  -6.927 1.59e-05 ***
    as.numeric(dus) -0.64020    0.18101  -3.537 0.004094 ** 
  
   I get quite plausible result which can be interpreted without
 problems.
  
   My data is a result of designed experiment (more or less :) and
 therefore
   all variables are significant. Is that the reason why scaling may be
   inappropriate in this case?
 
  No, I think it's just that the cloud of points is approximately
  spherical in the first 2 or 3 principal components, so the principal
  component directions are somewhat arbitrary.  You just got lucky that
  the 2nd and 3rd components are interpretable:  I wouldn't put too much
  faith in being able to repeat that if you went out and collected a new
  set of data using the same design.
 
  Duncan Murdoch
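Duncan's kg-to-mg point is easy to demonstrate: multiplying one column by a constant makes it dominate the first unscaled component, while the scaled analysis is unaffected. A sketch with made-up data:

```r
set.seed(2)
d    <- data.frame(a = rnorm(50), b = rnorm(50))
d_mg <- transform(d, a = a * 1000)    # same quantity, different units

# Unscaled PCA: after the unit change, PC1 is essentially just 'a'.
prcomp(d,    scale. = FALSE)$rotation[, 1]
prcomp(d_mg, scale. = FALSE)$rotation[, 1]

# Scaled PCA works on the correlation matrix, so the units cancel out
# (loadings agree up to sign).
all.equal(abs(prcomp(d,    scale. = TRUE)$rotation),
          abs(prcomp(d_mg, scale. = TRUE)$rotation))
```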






Re: [R] graph label greek symbol failure

2009-08-19 Thread Gavin Simpson
On Wed, 2009-08-19 at 17:06 +0100, e-letter wrote:
snip /
 
 Being familiar with latex, I interpret your description of the tilde
 (~) as non-breaking space, therefore I used the command:
 plot(1:10,ylab=expression(temperature~delta*t))
 to give me the result I wanted; in other words the space characters
 are not needed between temperature, ~, and delta in your suggestion.

No, the spacing is not needed, just like any other spacing in R, it is
ignored, but my eye-brain wetware system is /not/ the R parser and as
such I like to space things out so *I* can parse and *understand* my
code; I find it easier to find syntax errors etc that way.

Note that there was a difference between what I posted and Baptiste's
example; there the space was explicitly encoded within "temperature ",
whereas I use ~ as the indicator for spacing; any other spacing was
just for the sake of legibility.

I also don't think this is non-breaking because unless you add an
explicit \n (newline) in your label, the entire label will be rendered
on a single line.

Better to think of 
x ~ y (or x~y if you think that is more readable)

as just putting spacing between x and y. As ?plotmath shows, you can
string multiple ~ together to give extra spacing;

x ~~ y

And so on...
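For reference, the spacing operators can be compared side by side; ~ adds a space, * juxtaposes without one, and repeated ~~ widens the gap:

```r
plot(1:10, ylab = expression(temperature ~ delta * t))
title(main = expression(x * y),  line = 3)   # no space
title(main = expression(x ~ y),  line = 2)   # single space
title(main = expression(x ~~ y), line = 1)   # wider space
```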

Glad you got this working as you wanted.

G

 
 Thank you.
 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



[R] BUGS

2009-08-19 Thread James Lenihan

I am running a BUGS function with following
 
schools.sim <- bugs(data, inits,
 parameters,
 model.file="schools.txt",
 n.chains=3,
 n.iter=1000,
 bugs.directory="E:/Rprograms")
 
My model.file IS in the directory =E:/Rprograms and the code is: 
 
model{
    for (j in 1:J){
    y[j] ~ dnorm (theta[j], tau.y[j])
    theta[j] ~ dnorm (mu.theta, tau.theta)
    tau.y[j] <- pow(sigma.y[j], -2)
    }
    mu.theta ~ dnorm (0.0, 1.0E-6)
    tau.theta <- pow(sigma.theta, -2)
    sigma.theta ~ dunif (0, 1000)
}
 
I am getting the following error which I can't understand.   
 
Error in bugs(data, inits, parameters, model.file = "schools.txt", n.chains = 
3,  : 
  schools.txt does not exist.




Re: [R] Is there a construct for conditional comment?

2009-08-19 Thread Greg Snow
I believe that #if lines for C++ programs is handled by the preprocessor, not 
the compiler.  So if you want the same functionality for R programs, it would 
make sense to just preprocess the R file.
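Short of preprocessing, the closest native R idiom is an ordinary top-level `if`: unlike C++'s `#if 0`, the disabled block must still parse, but it is never evaluated. A sketch:

```r
use_new_code <- FALSE                 # flip this flag to switch blocks

if (use_new_code) {
  message("block chosen when the flag is TRUE")
} else {
  message("block chosen when the flag is FALSE")
}

if (FALSE) {
  # permanently disabled, but must still be valid R syntax
  message("this never runs")
}
```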

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Peng Yu
 Sent: Tuesday, August 18, 2009 3:55 PM
 To: r-h...@stat.math.ethz.ch
 Subject: [R] Is there a construct for conditional comment?
 
 Hi,
 
 In C++, I can use the following construct to choose either of the two
 blocks to comment out, but not both. Depending on whether the number after
 #if is zero or not, one block or the other is kept. I'm wondering
 if such a thing is possible in R?
 
 #if 0
 commented with 0
 #else
 commented with 1
 #endif
 
 Regards,
 Peng
 



Re: [R] BUGS

2009-08-19 Thread stephen sefick
Naively it seems that there is no schools.txt file in your specified directory.

Stephen Sefick

On Wed, Aug 19, 2009 at 12:05 PM, James
Lenihanjamesleni...@sbcglobal.net wrote:

 I am running a BUGS function with following

 schools.sim <- bugs(data, inits,
  parameters,
  model.file="schools.txt",
  n.chains=3,
  n.iter=1000,
  bugs.directory="E:/Rprograms")

 My model.file IS in the directory =E:/Rprograms and the code is:

 model{
     for (j in 1:J){
     y[j] ~ dnorm (theta[j], tau.y[j])
     theta[j] ~ dnorm (mu.theta, tau.theta)
     tau.y[j] <- pow(sigma.y[j], -2)
     }
     mu.theta ~ dnorm (0.0, 1.0E-6)
     tau.theta <- pow(sigma.theta, -2)
     sigma.theta ~ dunif (0, 1000)
 }

 I am getting the following error which I can't understand.

 Error in bugs(data, inits, parameters, model.file = "schools.txt", n.chains = 
 3,  :
   schools.txt does not exist.








-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis


[R] Best performance measure?

2009-08-19 Thread Noah Silverman

Hello,

I'm working on a model to predict probabilities.

I don't really care about binary prediction accuracy.

I do really care about the accuracy of my probability predictions.

Frank was nice enough to point me to the val.prob function from the 
Design library.  It looks very promising for my needs.


I've put together some tests and run the val.prob analysis.  It produces 
some very informative graphs along with a bunch of performance measures.


Unfortunately, I'm not sure which measure, if any, is the best one.  
I'm comparing hundreds of different models/parameter combinations/etc.  
So ideally I'd like a single value or two as the performance measure 
for each one.  That way I can pick the best model from all my 
experiments.


As mentioned above, I'm mainly interested in the accuracy of my 
probability predictions.


Does anyone have an opinion about which measure I should look at??
(I see Dxy, C, R2, D, U, Brier, Emax, Eavg, etc.)

Thanks!!

-N



[R] Errors with RVM and LSSVM from kernlab library

2009-08-19 Thread Noah Silverman

Hello,

In my ongoing quest to develop a best model, I'm testing various forms 
of SVM to see which is best for my application.


I have been using the SVM from the e1071 library without problem for 
several weeks.


Now, I'm interested in RVM and LSSVM to see if I get better performance.

When running RVM or LSSVM on the exact same data as the SVM{e1071}, I 
get an error that I don't understand:


Error in .local(x, ...) : kernel must inherit from class 'kernel'

Does this make sense to anyone?  Can you suggest how to resolve this?

Thanks!

-N
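One thing worth trying (a guess, since the failing call itself isn't shown): pass kernlab an explicit kernel object, or a kernel name plus kpar, rather than whatever e1071-style kernel argument was carried over; kernlab's kernel argument must resolve to an object of class 'kernel':

```r
library(kernlab)
## assumptions: x is a numeric predictor matrix, y the response
fit  <- rvm(x, y, kernel = rbfdot(sigma = 0.1))    # explicit kernel object
fit2 <- lssvm(x, as.factor(y),                     # lssvm does classification
              kernel = "rbfdot", kpar = list(sigma = 0.1))
```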



Re: [R] BUGS

2009-08-19 Thread Uwe Ligges



stephen sefick wrote:

Naively it seems that there is no schools.txt file in your specified directory.

Stephen Sefick

On Wed, Aug 19, 2009 at 12:05 PM, James
Lenihanjamesleni...@sbcglobal.net wrote:

I am running a BUGS function with following

schools.sim <- bugs(data, inits,
 parameters,
 model.file="schools.txt",
 n.chains=3,
 n.iter=1000,
 bugs.directory="E:/Rprograms")

My model.file IS in the directory =E:/Rprograms and the code is:



According to ?bugs:


bugs.directory: directory that contains the WinBUGS executable. If the 
global option R2WinBUGS.bugs.directory is not NULL, it will be used as 
the default.


working.directory: sets working directory during execution of this 
function; WinBUGS' in- and output will be stored in this directory; if 
NULL, a temporary working directory via tempdir is used.


So either specify working.directory in a way that it points to the 
model.txt file or specify a full path name for your model.file


I don't believe that bugs.directory="E:/Rprograms" will do anything 
sensible (as it should point to your WinBUGS installation).
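Putting those two points together, the call would look something like the sketch below; the WinBUGS path is an assumption and must match the actual installation:

```r
library(R2WinBUGS)
schools.sim <- bugs(data, inits, parameters,
                    model.file = "schools.txt",
                    n.chains = 3, n.iter = 1000,
                    working.directory = "E:/Rprograms",             # where schools.txt lives
                    bugs.directory = "C:/Program Files/WinBUGS14")  # assumed WinBUGS path
```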


Uwe Ligges






model{
for (j in 1:J){
y[j] ~ dnorm (theta[j], tau.y[j])
theta[j] ~ dnorm (mu.theta, tau.theta)
tau.y[j] <- pow(sigma.y[j], -2)
}
mu.theta ~ dnorm (0.0, 1.0E-6)
tau.theta <- pow(sigma.theta, -2)
sigma.theta ~ dunif (0, 1000)
}

I am getting the following error which I can't understand.

Error in bugs(data, inits, parameters, model.file = "schools.txt", n.chains = 
3,  :
  schools.txt does not exist.













Re: [R] Best performance measure?

2009-08-19 Thread Frank E Harrell Jr

Noah Silverman wrote:

Hello,

I'm working on a model to predict probabilities.

I don't really care about binary prediction accuracy.

I do really care about the accuracy of my probability predictions.

Frank was nice enough to point me to the val.prob function from the 
Design library.  It looks very promising for my needs.


I've put together some tests and run the val.prob analysis.  It produces 
some very informative graphs along with a bunch of performance measures.


Unfortunately, I'm not sure which measure, if any, is the best one.  
I'm comparing hundreds of different models/parameter combinations/etc.  
So ideally I'd like a single value or two as the performance measure 
for each one.  That way I can pick the best model from all my 
experiments.


As mentioned above, I'm mainly interested in the accuracy of my 
probability predictions.


Does anyone have an opinion about which measure I should look at??
(I see Dxy, C, R2, D, U, Brier, Emax, Eavg, etc.)

Thanks!!

-N


It all depends on the goal, i.e., the relative value you place on 
absolute accuracy vs. discrimination ability. The Brier score combines 
both and, other than interpretability, has many advantages.


Frank
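For reference, the Brier score is simply the mean squared difference between the predicted probabilities and the 0/1 outcomes, so it rewards both calibration and discrimination; a minimal sketch:

```r
brier <- function(p, y) mean((p - y)^2)   # lower is better; 0.25 = always guessing 0.5

set.seed(1)
y <- rbinom(200, 1, 0.3)                  # observed outcomes
p <- rep(0.3, 200)                        # calibrated but non-discriminating predictions
brier(p, y)
```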







--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University



[R] a bug in the offset parameter syntax in the geepack package?

2009-08-19 Thread ahnven

Dear R-users,

I was doing some Poisson regression using the geepack package and I found a
strange behaviour of the geeglm command when defining an offset. Maybe
it's my limited knowledge of R syntax, or maybe it's something else.

Let's make an example. After loading the geepack library, you may write

model=geeglm(y~covariates+offset(log(xxx)),data=data,family=poisson,id=idx)

which is equivalent to

model=geeglm(y~covariates,offset=log(xxx),data=data,family=poisson,id=idx)

however, the command also works with the syntax

model=geeglm(y~covariates,offset(log(xxx)),data=data,family=poisson,id=idx)

giving different (wrong) results!

I discovered it because I was calculating some incidence rates by hand and
then comparing with the glm estimates. In the manual, indeed, the way in
which the offset should be written is not so easy to understand.
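A likely explanation, rather than a bug: with family= and data= given by name, the unnamed offset(log(xxx)) is matched positionally to the next free formal argument, which in glm is weights (geeglm appears to behave analogously), so the model is silently fit with the offset used as prior weights. The trap can be sketched with plain glm and simulated data:

```r
set.seed(1)
d <- data.frame(y = rpois(50, 5), x = rnorm(50), t = runif(50, 1, 10))

f1 <- glm(y ~ x + offset(log(t)), family = poisson, data = d)  # offset in formula
f2 <- glm(y ~ x, offset = log(t), family = poisson, data = d)  # named offset=
all.equal(coef(f1), coef(f2))            # TRUE: the same model

# Unnamed: log(t) lands on 'weights', not 'offset' -- a different model.
f3 <- glm(y ~ x, log(t), family = poisson, data = d)
coef(f3)
```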

Thanks for any response,
Mattia Prosperi


-- 
View this message in context: 
http://www.nabble.com/a-bug-in-the-offset-parameter-syntax-in-the-geepack-package--tp25047143p25047143.html
Sent from the R help mailing list archive at Nabble.com.



[R] Urgent Help

2009-08-19 Thread Bin1aya

Dear Sir,

I am using RClimDex. I get following error. 
 Error in scan(file, what, nmax, sep, dec, quote, skip, nlines,
na.strings,  : 
 scan() expected 'a real', got 'T'

I copied and pasted the climate data from an Excel file into Notepad and tried
to upload it. I do not have any knowledge of programming languages. Please
help me.

Regards,
Binaya Pasakhala



Re: [R] Error with acf/pacf functions

2009-08-19 Thread tmakino

Ok strangely both acf and pacf work with the xts object as long as the xts
package hasn't been loaded.  As soon as I load the package, I get the error
again.
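A possible workaround while the conflict is tracked down (assumption: the series is univariate): strip the index before calling acf, since coredata() returns the plain numeric core without the time attributes that confuse the frequency check:

```r
library(xts)
x <- xts(rnorm(100), order.by = Sys.Date() - 99:0)  # stand-in for the real series
acf(coredata(x))
pacf(coredata(x))
```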

tmakino wrote:
 
 I'm trying to apply a acf/pacf function to a xts object and get the
 following error:
 
  Error in if (frequency > 1 && abs(frequency - round(frequency)) < ts.eps)
  frequency <- round(frequency) : 
    missing value where TRUE/FALSE needed
 
 Any ideas?  Thanks.
 




[R] Ridge regression [Repost]

2009-08-19 Thread Sabyasachi Patra

Dear all,

For an ordinary ridge regression problem, I followed three different
approaches:
1. estimate beta without any standardization
2. estimate standardized beta (standardizing X and y) and then again convert
back  
3. estimate beta using lm.ridge() function


X <- matrix(c(1,2,9,3,2,4,7,2,3,5,9,1),4,3)
y <- as.matrix(c(2,3,4,5))

n <- nrow(X)
p <- ncol(X)

#Without standardization
intercept <- rep(1,n)
Xn <- cbind(intercept, X)
K <- diag(c(0,rep(1.5,p)))
beta1 <- solve(t(Xn)%*%Xn+K)%*%t(Xn)%*%y
beta1

#with standardization
ys <- scale(y)
Xs <- scale(X)
K <- diag(1.5,p)
bs <- solve(t(Xs)%*%Xs+K)%*%t(Xs)%*%ys
b <- sd(y)*(bs/sd(X))
intercept <- mean(y)-sum(as.matrix(colMeans(X))*b)
beta2 <- rbind(intercept,b)
beta2

#Using lm.ridge function of MASS package
beta3 <- lm.ridge(y~X,lambda=1.5)

I'm getting three different results using above described approaches:

> beta1
                [,1]
intercept  3.4007944
           0.3977462
           0.2082025
          -0.4829115
> beta2
                [,1]
intercept  3.3399855
           0.1639469
           0.0262021
          -0.1228987

> beta3
                     X1          X2          X3 
 3.35158977  0.19460958  0.03152778 -0.15546775 

It would be very helpful if anybody could explain why the
outputs differ.
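Part of the discrepancy is intrinsic to ridge regression: unlike OLS, ridge estimates are not invariant to rescaling the columns of X, so penalizing raw X, penalizing standardized X, and lm.ridge's internal standardization are three genuinely different optimization problems. A small sketch of the non-invariance:

```r
ridge <- function(X, y, k) solve(t(X) %*% X + diag(k, ncol(X)), t(X) %*% y)

X <- matrix(c(1,2,9,3, 2,4,7,2, 3,5,9,1), 4, 3)
y <- c(2, 3, 4, 5)

b  <- ridge(X, y, 1.5)
X2 <- X; X2[, 1] <- X2[, 1] / 10   # change the 'units' of column 1
b2 <- ridge(X2, y, 1.5)

# Under OLS the rescaled coefficient would satisfy b2[1]/10 == b[1] exactly;
# under ridge the two fits genuinely differ:
cbind(b, b2 / c(10, 1, 1))
```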

I am extremely sorry for my previous anonymous post.

regards.

-
Sabyasachi Patra
PhD Scholar
Indian institute of Technology Kanpur
India.



Re: [R] lmer with random slopes for 2 or more first-level factors?

2009-08-19 Thread Jonathan Baron
Since nobody has answered yet, let me try.  I'm not 100% sure.  (So
why bother?  To check my own understanding.)

On 08/18/09 20:47, Jason R. Finley wrote:
 I have data from a design in which items are completely nested within  
 subjects.  Subject is the only second-level factor, but I have  
 multiple first-level factors (IVs).  Say there are 2 such independent  
 variables that I am interested in.  What is the proper syntax to fit a  
 mixed-effects model with a by-subject random intercept, and by-subject  
 random slopes for both the 2 IVs?
 
 I can think of at least two possibilities:
 
 lmer(DV ~ IV1 + IV2 + (1 + IV1 + IV2 | Subject))
 
 lmer(DV ~ IV1 + IV2 + (1 + IV1 | Subject) + (1 + IV2 | Subject))
 
 Or maybe there is some other way to do it?  Maybe the correct syntax  
 depends on whether the random effect of subjects on the intercept and  
 slopes are correlated or not?  (If so, how do I proceed?)

I think that the first way assumes correlations of IV1, IV2, and the
Subject intercept.  The second way assumes that each of IV1 and IV2 is
correlated with the Subject intercept, and the intercept is computed
separately for IV1 and IV2.

There are other ways.  You can do, for example, 

lmer(DV ~ IV1 + IV2 + (1 | Subject) + 
    (0 + IV1 | Subject) + (0 + IV2 | Subject))

This assumes, I think, that intercepts and slopes are uncorrelated.
If you can tolerate that assumption, an advantage of doing it this way
is that you can use mcmc sampling, and thus you can use pvals.fnc() in
the languageR package.  (What I do not know are the consequences of
making this assumption if it is false.)

 Finally, what would be the syntax if I wanted to include a random  
 subject effect for the INTERACTION of IV1 and IV2?

lmer(DV ~ IV1 + IV2 + (1 + IV1 * IV2 | Subject))

-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron



Re: [R] double precision

2009-08-19 Thread miller_2555


Roger Bivand wrote:
 
 On Tue, 5 Dec 2006, Yoni Schamroth wrote:
 
 Hi,
 
 I am attempting to query a data frame from a mysql database.
 One of the variables is a unique identification number (numeric) 18
 digits
 long.
 I am struggling to retrieve this variable exactly without any rounding.
 
 Read it as a character - a double is a double:
 
 x <- 6527600583317876352
 y <- 6527600583317876380
 all.equal(x,y)
 [1] TRUE
 storage.mode(x)
 [1] "double"
 
 and why they are equal is a FAQ (only ~16 digits in a double). Integer is
 4-byte. Since they are IDs, not to be used for math, leave them as
 character strings - which they are, like telephone numbers.
 

Resurrecting this post for a moment, the same issue arose when interfacing R
with a Postgres database using the bigint data type (a signed 64-bit integer
ranging from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 as of
this writing). While the underlying cause is summarized above, I'd like to
recommend the inclusion of a 64-bit integer data type into the R base. For
performance reasons, I use R to independently generate a unique transaction
ID that is equivalent to the Postgres-generated bigint (with some
consistency checks --  generally bad design, but vastly quicker than
querying the database for the same value). I currently generate a string
representation and pass that to the DBI, though the process is cumbersome
and likely not as efficient as an arithmetic equivalent (particularly when
using a 64-bit processor architecture). Furthermore, there are additional
gyrations that need to occur when querying the database for bigint values.
Do significant practical challenges exist in the implementation of a 64-bit
integer that would outweigh the faster and cleaner compatibility with
database backends?
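In the meantime, the character-string route (and, where arithmetic on IDs is truly needed, an add-on 64-bit integer class such as the one in the bit64 package on CRAN) can be sketched as follows:

```r
x <- 6527600583317876352    # silently stored as a double: ~16 significant digits
y <- 6527600583317876380
x == y                      # TRUE: both literals round to the same double

id1 <- "6527600583317876352"   # safe: keep IDs as character strings
id2 <- "6527600583317876380"
id1 == id2                     # FALSE: string comparison is exact
```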



Re: [R] Urgent Help

2009-08-19 Thread Stefan Grosse
On Wed, 19 Aug 2009 10:01:19 -0700 (PDT) Bin1aya bpasakh...@gmail.com
wrote:

B I have done copy paste of climate data from excel file to notepad
B and tried to upload. I do not have any knowledge about programming
B languages. Please help me.

There are better ways of importing data from Excel. Have a look at the
wiki:
http://wiki.r-project.org/rwiki/doku.php?id=tips:data-io:ms_windows

The first possible solution path should work...
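If the scan() error persists, note that the "got 'T'" usually means a non-numeric code (such as T for a trace amount, common in precipitation records) sits in a column expected to be numeric. A hedged sketch (the file name and the code are assumptions about your data):

```r
dat <- read.table("climate.txt", header = TRUE,
                  na.strings = c("NA", "T"))  # treat the 'T' code as missing
str(dat)   # measurement columns should now be numeric
```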

hth
Stefan


