[R] how to extract t-test statistics from glm()?
I need to extract t-test statistics from glm(). For example, Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 46.219911.6310 3.974 0.000106 *** Var1 1.0440 0.5948 1.755 0.081088 . Var2 -0.4717 2.0257 -0.233 0.816178 Var3 0.2376 0.1454 1.635 0.104024 And I want to put all the t-values (and if possible, only that is associated with one particular independent variable such as Var3) to another table that can be reused later. I'm new to R and your help is highly appreciated! Eric [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Estimate parameters of copulas
Dear list, Does anyone have any experience on how to estimate the parameters of the specific copula in R or S+?? For example, calculate the conditional d. f. of F(x2|x1) by using the partial derivatives of the Clayton copula, C(u1|u2)=((u2^(-alpha) +u1^(-alpha) - 1 )/ u2^(-alpha) )^-1/alpha-1 Here, how tto estimate the value of alpha based on the observations on (x1, x2)??? fitCopula function?? I could not find some reference on this function.Also does anyone have some example of fitCopula??? Appreciate for your kind help. With regards, Michael [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] EM covergence problem
Hi, R users I am trying to use EM algorithmto estimate a latent class model of discrete choice. The basic model is a logit model which has two variables X=(X1,X2) and the utility is defined as v = Beta*X. There are 3 classes of individuals each has its own Beta values, so Beta is a 3*2 matrix. For each individual, (there are 1000), he make a choice between two randomly generated choice alternatives and has to choose one. The proportion of classes is set to (0.2,0.5,0.3). based on this seting, the proportion parameters alpha always end up with one of them goes to 0 in EM estimation. I checked the code and can not find what's the problem. Maybe this is a bit off topic in R help, but I code it in R and hence in hope that some one here can figure it out. Plus, I have a normal mixture model with success EM result. Thanks in advance. Following is the code: #= X = array(,dim=c(1000,2))#first alternative b= c(2.3,0.7,0.3,0.7,0.1,0.8) alpha = c(0.3,0.5,0.2)#proportion X[,1] = runif(1000,min=5,max=20) X[,2] = runif(1000,min=5,max=20) ##bi = 0.3 Y = array(,dim=c(1000,2))#second alternative Y[,1] = runif(1000,min=5,max=20) Y[,2] = runif(1000,min=5,max=20) V11 = X[1:300,]%*%b[1:2] V12 = Y[1:300,]%*%b[1:2] V21 = X[301:800,]%*%b[3:4] V22 = Y[301:800,]%*%b[3:4] V31 = X[801:1000,]%*%b[5:6] V32 = Y[801:1000,]%*%b[5:6] V1 = rbind(V11,V21,V31)+rnorm(1000) V2 = rbind(V12,V22,V32)+rnorm(1000) oo = exp(V1)+exp(V2) P1 = exp(V1)/oo P2 = exp(V2)/oo D = ifelse(V1V2,0,1) #second part of Q function Q2 = function(Beta,H){ probs = logProbInd(Beta) li = sum(H*probs) return(li) } logProbInd=function(Beta){#X, Y, D values take as is from environment dim(Beta) = c(2,3) Beta = t(Beta) probs = matrix(,nrow = 1000, ncol = 3) for(i in 1:3){ v1 = X%*%Beta[(i-1)*2+1:2] v2 = Y%*%Beta[(i-1)*2+1:2] p1 = exp(v1)/(exp(v1)+exp(v2)) p2 = exp(v2)/(exp(v1)+exp(v2)) probs[,i] = ifelse (D==0,log(p1),log(p2)) } return (probs) } #H [individuals][class] E_step = function(alpha,Beta){#calc posterior of H tmpH = matrix(,nrow = 1000,ncol =3) lprobs = logProbInd(Beta) for(i in 1:3){#classes tmpH[,i] = alpha[i]*exp(lprobs[,i]) } H = tmpH /apply(tmpH,1,sum) return( H) } M_step = function(H,Beta){ #first part use direct estimation aita = apply(H,2,sum)/1000 opt.c = optim(Beta,Q2,H=H,method=BFGS,control = list(fnscale = -1)) lik = opt.c$value return(c(aita,opt.c$par,lik)) } #EM loops alf = c(0.33,0.33,0.33) Bt = seq(0.1,0.6,by=0.1) sc = rep(-8000,5) i=1 while(T){ H= E_step(alpha=alf,Beta=Bt) theta = M_step(H=H,Beta=Bt) print(theta) alf = theta[1:3] Bt = theta[4:9] #check convergence sc[(i%%5) +1] = theta[10] i=i+1 #if((theta[10] - mean(sc) ) 0.0005) break } [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] linear mixed model?
Hi, I got the following example from MASS (10.2 Classic Nested Designs) A cooperative trial in analytical chemistry taken from Analytical Methods Committe (1987). Seven specimens were sent to six laboratories, each three times a month apart for duplicate analysis. The response is the concentration of (unspecified) analyte in g/kg. I understand what is written in the book about the linear mixed model (nested). Laboratories and batches are treated as random and to assess components of variation. However, I have an extral question about it. From the analysis, we can evaluate the variance from different labs. Let's say a training program was done to the labs, I want to see whether after the trainning program, the variance from the lab (sigma_L in the book) becomes smaller or not. Basically I need to compare \sigma_L before and after the training program. How can I do that? I think there should already be some model for this type of analysis. Thank you in advance. Dave [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] A question about a linear mixed model.
Hi, I got the following example from MASS (10.2 Classic Nested Designs) A cooperative trial in analytical chemistry taken from Analytical Methods Committe (1987). Seven specimens were sent to six laboratories, each three times a month apart for duplicate analysis. The response is the concentration of (unspecified) analyte in g/kg. I understand what is written in the book about the linear mixed model (nested). Laboratories and batches are treated as random and to assess components of variation. However, I have an extral question about it. From the analysis, we can evaluate the variance from different labs. Let's say a training program was done to the labs, I want to see whether after the trainning program, the variance from the lab (sigma_L in the book) becomes smaller or not. Basically I need to compare \sigma_L before and after the training program. How can I do that? I think there should already be some model for this type of analysis. Thank you in advance. Dave [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] manipulate group data using column name
Hi, Maybe this is a trivial question but I can not figure out a good solution. I have a data frame fa and want to add a new column sum with the sum value of fa$X1 grouped by fa$X3. fa X1 X2 X3 1 1 11 1 2 2 12 1 3 3 13 1 4 4 14 2 5 5 15 2 6 6 16 2 7 7 17 3 8 8 18 3 9 9 19 3 10 10 20 3 fa$X3 is the index of group i can aggregate(fa[,X1],list(fa$X3),sum) Group.1 x 1 1 6 2 2 15 3 3 34 then I want to add a new column sum in fa and assign the aggregated result to the new column. Is there a solution without using loops? or maybe there is some way can even avoid aggregate operation? Thanks. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] merging two lists but get indexes
Suppose I have two columns of entries, how can I get the union of the two columns? Please note: I input my columns through excel. These entries have text format in excel. Also, out of curiosity, how can I find out the data type of a data frame ? a - read.csv(book1.csv) a n1 n2 1 apple soda 2 orange apple 3 soda green 4red yellow 5 white blue 6 white union(a$n1,a$n2) [1] 2 3 5 4 6 1 I want the actual names instead of the indexes. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] What's meaning of the lambda in nlrq output
I used the nlrq function in the package quantreg. There is a lambda in the output when I set trace=TRUE. With different start point, the results are converged, but the last lambda is different. I want to know the meaning lambda=1 and lambda=0. Many Thanks! Examples of output 1. Where the last lambda=1: 108.6581 : 0.3 8.0 iter0 value 108.658087 final value 49.875178 converged lambda = 0.6304686 49.87518 : 0.3539175 5.6116474 iter0 value 48.123700 final value 48.120120 converged lambda = 0.7646881 48.12012 : 0.4701847 6.4582566 iter0 value 47.833255 final value 47.832257 converged lambda = 0.8588522 47.83226 : 0.5052472 6.8564865 final value 47.802009 converged lambda = 1 47.80201 : 0.5257616 7.0207726 iter0 value 47.776363 final value 47.747994 stopped after 4 iterations lambda = 0.5044797 47.74799 : 0.5474859 7.1612348 final value 47.746067 converged lambda = 1 47.74607 : 0.5503345 7.1881443 iter0 value 47.746099 final value 47.746099 converged lambda = 0.999864 47.7461 : 0.5502692 7.1879762 2. Where the last lambda=0: 54.16497 : 0.6 8.0 iter0 value 54.164967 final value 47.768140 converged lambda = 0.2133083 47.76814 : 0.5560481 7.2205976 iter0 value 47.768140 final value 47.767343 converged lambda = 0.004997877 47.76734 : 0.5557633 7.2172873 iter0 value 47.787967 final value 47.747700 stopped after 3 iterations lambda = 0.1719594 47.7477 : 0.547935 7.165428 iter0 value 47.799883 final value 47.747700 converged lambda = 0 47.7477 : 0.547935 7.165428 -- Shg SUN @China __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] persp() problem
Dear list, I have a problem on persp() x - u1data #first coloum in attached data y - u2data #second coloum in attached data f - function(x,y){qgev(pnorm(rhoF*qnorm(pnorm((qnorm(y)-rho2*qnorm(x)/sqrt(1-rho2^2 +sqrt(1-rhoF^2)*qnorm(0.95)),-0.3935119, 0.4227890, 0.2701648)} z - outer(x,y,f) persp(x,y,z) The R will display: Error in persp.default(x, y, z) : increasing 'x' and 'y' values expected So I try to adjust it to: testx - unique(sort(u1data)) testy - unique(u2data[order(u1data)]) testf - function(testx,testy){qgev(pnorm(rhoF*qnorm(pnorm((qnorm(testy)-rho2*qnorm(testx)/sqrt(1-rho2^2 +sqrt(1-rhoF^2)*qnorm(0.95)),-0.3935119, 0.4227890, 0.2701648)} testz - unique(outer(testx,testy,testf)[order(u1data)]) BUT SAME WARN: Error in persp.default(testx, testy, testz) : increasing 'x' and 'y' values expected So how can I use persp in this situation?? Thanks for any help! Besides that I also want to enquiry on how to build the martix below? [,1] [,2] [,3] [,4] [,5]...[,676] [1,] 1 NA NA NA NA [2,] NA 5 NA NA NA [3,] NA NA 7 NA NA [4,] NA NA NA 9 NA [5,] NA NA NA NA 12 .. .. . [676,] Appreciate for any reply. Thank you, With regards Mc 0.3102202 0.6118165 0.898893 0.1153732 0.889533 0.9403834 0.1939559 0.008136975 0.752973 0.848 0.8709167 0.9363419 0.7327573 0.5331876 0.889305 0.142052 0.7996472 0.9241059 0.5789939 0.6572769 0.904507 0.7628466 0.7408237 0.9154523 0.8063378 0.8264442 0.928366 0.3868742 0.8406373 0.9308378 0.1201804 0.5251654 0.8883977 0.1073146 0.5495656 0.8913543 0.3801476 0.4933598 0.8844978 0.1704377 0.3456871 0.8655715 0.5100424 0.6447163 0.9029398 0.1004252 0.6747806 0.9067755 0.4969747 0.2788576 0.855801 0.9503927 0.4357328 0.877291 0.7038206 0.1551265 0.833016 0.4496465 0.2228675 0.8465589 0.3152238 0.4508357 0.8792615 0.8595227 0.1699521 0.8362594 0.8781077 0.442257 0.8781285 0.9926326 0.4477713 0.8787715 0.5001015 0.4227189 0.8757058 0.7418432 0.2242488 0.8467729 0.8401202 0.4881974 0.8838192 0.3141832 0.5075416 0.886233 0.9387748 0.76381 0.9186662 0.5421865 0.7493546 0.9166607 0.6063368 0.4569826 0.8799987 0.07328759 0.5085049 0.8863852 0.3523433 0.239639 0.8494759 0.2143498 0.1904935 0.8405416 0.0043028 0.2351062 0.8487917 0.2200048 0.525856 0.888467 0.929427 0.7432261 0.91576 0.4657704 0.02713088 0.781553 0.4974512 0.1951512 0.8414202 0.3032740 0.6642617 0.9054128 0.9586954 0.5320473 0.8891269 0.8522317 0.641292 0.9024777 0.7172216 0.7869344 0.9221045 0.5019212 0.5607276 0.8926608 0.6892507 0.722802 0.9130005 0.5197227 0.547926 0.8911097 0.6910355 0.4711651 0.8817452 0.1635166 0.5960693 0.8969858 0.6157443 0.3740314 0.8693667 0.3012652 0.3860136 0.8709832 0.7918442 0.5396737 0.8900836 0.7781783 0.7286279 0.9137793 0.5959097 0.6063727 0.8982006 0.5335883 0.5932636 0.8966064 0.4868495 0.4225826 0.8756897 0.3517952 0.242862 0.8500197 0.8330476 0.638904 0.9021835 0.1146440 0.03340253 0.786976 0.3653549 0.1617220 0.8345306 0.3287211 0.05560717 0.8008241 0.3507715 0.6139207 0.899147 0.3250503 0.4740786 0.882139 0.4259684 0.4225980 0.8756974 0.3831548 0.342748 0.86514 0.9297493 0.2764793 0.8553747 0.5209303 0.1444220 0.830535 0.2882225 0.4086862 0.8739324 0.06696263 0.05796908 0.8020525 0.2569928 0.6683259 0.9059325 0.737885 0.2643880 0.8535031 0.3417781 0.7339068 0.9145383 0.3003221 0.3746508 0.86948 0.6078735 0.08374094 0.8128343 0.7575533 0.2061716 0.8434881 0.04393339 0.2817576 0.856314 0.03873323 0.5468552 0.8910451 0.2659629 0.9524063 0.9580801 0.3318478 0.09184787 0.8157145 0.01326908 0.7941016 0.9232865 0.9531499 0.4246153 0.8758845 0.7878766 0.3739315 0.8693344 0.4278367 0.1187609 0.8239353 0.4751271 0.1594320 0.8340107 0.1298515 0.3193727 0.8618747 0.3889802 0.2466491 0.8506491 0.8919073 0.8100828 0.9256764 0.1007718 0.736085 0.914866 0.349479 0.5494373 0.8913082 0.6323708 0.5641948 0.8930688 0.4892277 0.592541 0.8965224 0.4317011 0.4708038 0.8817252 0.3640796 0.4606731 0.8804793 0.01587231 0.7557596 0.9176387 0.4029896 0.7396784 0.9153261 0.770008 0.6728253 0.9064569 0.7758348 0.3790344 0.870014 0.9256958 0.3866791 0.8709969 0.6965112 0.4335875 0.8770627 0.7327289 0.7813275 0.9212618 0.5994686 0.6169224 0.8994933 0.4861951 0.4077528 0.8737927 0.6227905 0.7492211 0.916635 0.2875190 0.754739 0.9174414 0.3583206 0.7108705 0.911435 0.2379315 0.7160225 0.9121322 0.2701988 0.371379 0.869047 0.2939349 0.2881057 0.8572388 0.686119 0.3435468 0.8652209 0.737684 0.1018767 0.818933 0.1022111 0.4620055 0.8806782 0.06743866 0.001328434 0.7155677 0.3189881 0.8103201 0.9257696 0.002606828 0.2529638 0.85179 0.754083 0.7591491 0.9180267 0.7783337 0.6120818 0.8988813 0.09418842 0.4592762 0.8803417 0.1289346 0.05072358 0.7982676 0.7680676 0.723607 0.913101 0.09463494 0.3502957 0.8662194 0.6247462 0.4354511 0.8773048 0.5472626 0.2640714 0.8534728 0.8046991 0.4033715 0.8731942 0.992037 0.8971756 0.9419815 0.1490033 0.5284286
[R] how to realize integration in C?
Hi, all, In order to save simulation time, I need to call C in R. For the C codes, I need integrate certain functions ( both continous and discrete ), I wonder how to realize the integration in C? I try to use qsimp and trapzd subroutines from Numerical Recipes. But the error continues to appear: Too many steps in routine qsimp . Anyone has any idea why this happens and are there better solutions for the integration realization? Thanks a lot! Best, Lynette Sun __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to realize integration in C?
Hi, Thank you for your suggestions. But I also realized I didn't make myself clear. I actually don't just need the integration results in R. The integration in C is part of my C codes. I need the integration results in C. How to realize that? Any suggestions? Thanks a lot! Best, Lynette - Original Message - From: Lynette Sun [EMAIL PROTECTED] To: r-help r-help@stat.math.ethz.ch Sent: Thursday, November 16, 2006 2:41 PM Subject: [R] how to realize integration in C? Hi, all, In order to save simulation time, I need to call C in R. For the C codes, I need integrate certain functions ( both continous and discrete ), I wonder how to realize the integration in C? I try to use qsimp and trapzd subroutines from Numerical Recipes. But the error continues to appear: Too many steps in routine qsimp . Anyone has any idea why this happens and are there better solutions for the integration realization? Thanks a lot! Best, Lynette Sun __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Which genetic optimization package allows customized crossover and mutation operation
Hi Dirk, Thanks for the reply. I have some GA code in java actually, but failed to install the SJava package in windows XP, really frustratting experience, do not understand why no binary distribution available. Now I am trying to finish my current work purely in java except some data exchange with R via files:(. I use a fake email address in outlookexpress to prevent spam actually, (have already discard some email addresses for that reason). and .. Sun is my real name;) have a nice day. Sun - 原始邮件 发件人: Dirk Eddelbuettel [EMAIL PROTECTED] 收件人: sun [EMAIL PROTECTED] 抄送: r-help@stat.math.ethz.ch 已发送: 2006/11/10(周五), 下午5:08:45 主题: Re: [R] Which genetic optimization package allows customized crossover and mutation operation On 10 November 2006 at 16:19, sun wrote: | No, I am not familiar with that project, actually never heard it before your | reply.:) | | Anyway, I found some code later on from Claudio Agostinelli who developed | this code for his teaching, I think. Unfortornuitly, his code does not work | on my settings. And I do not want to send much time on the object oriented | programing skill in R, so finally I went back to java for this GA work. FWIW I did some recent work with the 'pgapack' C++ library, but I have no R package for show for that. Also, mail to your address bounces and you are not posting with a real name. Both practices are frowned upon, so you may get fewer helpful replies than you could otherwise expect. Cheers, Dirk -- Hell, there are no rules here - we're trying to accomplish something. -- Thomas A. Edison R-help@stat.math.ethz.ch mailing list PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ___ 抢注雅虎免费邮箱-3.5G容量,20M附件! __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Which genetic optimization package allows customized crossover and mutation operation
No, I am not familiar with that project, actually never heard it before your reply.:) Anyway, I found some code later on from Claudio Agostinelli who developed this code for his teaching, I think. Unfortornuitly, his code does not work on my settings. And I do not want to send much time on the object oriented programing skill in R, so finally I went back to java for this GA work. Thanks anyway. Have a nice weekend. Sun - Original Message - From: Spencer Graves [EMAIL PROTECTED] To: sun [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Sent: Thursday, November 09, 2006 3:17 AM Subject: Re: [R] Which genetic optimization package allows customized crossover and mutation operation Are you familiar with www.bioconductor.org? They have a listserve with people who may know more about the question you asked. Spencer Graves sun wrote: Hi, I am looking for genetic optimization package that allow to define my own chromosome solution and crossover/mutation opoerations. I checked genoud and genopt, not with much effort of 'cause, with no luck. Any one can shed some light on this? thanks. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Which genetic optimization package allows customized crossover and mutation operation
Hi, I am looking for genetic optimization package that allow to define my own chromosome solution and crossover/mutation opoerations. I checked genoud and genopt, not with much effort of 'cause, with no luck. Any one can shed some light on this? thanks. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] generate random numbers that sum up to 1
Thanks for the answers. I am sorry if the question is too simple to make everyone thought it is a homework, but it is not. I am setting up a simulation with expected utility theory which use EU = sum(Pi*Ui) form and I have to randomly generate these Pis, I do not really know which distribution these Pis has to follow but just know they have to be equally random hence the question. I thought Dirichlet(1,1, ..., 1) could do the trick for me. Sun - Original Message - From: Duncan Murdoch [EMAIL PROTECTED] To: Alberto Monteiro [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Sent: Wednesday, October 11, 2006 11:39 PM Subject: Re: [R] generate random numbers that sum up to 1 On 10/11/2006 5:04 PM, Alberto Monteiro wrote: I don't have the previous messages, but it seems to me that the solutions didn't quite get the requirements of the problem. For example, it's obvious that for n = 2, the random variables X1 and X2 should be X1 - runif(1) and X2 - 1 - X1; while the solution X - runif(2); X1 - X[1] / sum(X); X2 - X[2] / sum(X) will give different distributions, as shown in this test case: N - 1000 M - matrix(runif(2 * N), 2, N) X - M[1,] / (M[1,] + M[2,]) hist(X) It's obvious that for a generic n-th dimensional set of uniform variables X1, ... X_n subject to the restriction X1 + ... + X_n = 1, the solution is to take a uniform distribution in the (n-1)-th dimensional hyperpyramid generated by the above relation and the restrictions that each X_i = 0. For example, for n = 3, we should sample from the equilateral triangle with vertices c(1,0,0), c(0,1,0) and c(0,0,1). For n = 4, we should sample from the pyramid whose vertices are c(1,0,0,0), c(0,1,0,0), c(0,0,1,0) and c(0,0,0,1). I don't know if there is a simple formula to do this sampling. That's exactly what the Dirichlet(1,1, ..., 1) distribution does. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] generate random numbers that sum up to 1
I am trying to generate a vector of random numbers with the constraint that they have to sum up to one with uniform distribution. eg. {0.1,0.7,0.2 } any function to do this? Thanks. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] formatting data to be analysed using multinomial logistic regression (nnet)
bump. would like to know the answer too. I am about using nnet--multinom to estimate a multinomial logit model, but are not sure if this function handles categorical data input. Thanks for any help. - Original Message - From: Bob Green [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent: Sunday, September 10, 2006 10:24 PM Subject: [R] formatting data to be analysed using multinomial logistic regression (nnet) I am looking into using the multinomial logistic regression option in the nnet library and have two questions about formatting the data. 1. Can data be analysed in the following format or does it need to be transformed into count data, such as the housing data in MASS? Id Crime paranoia hallucinate toc disorg crimhist age 1 2 1 0 1 0 1 25 2 2 0 1 1 1 1 37 3 1 1 0 1 1 0 42 4 3 0 0 0 0 1 25 5 2 1 0 1 0 0 49 2.Can a ratio variable such as $age be included into a model, such as the one below? crimepred - glm (crime ~ paranoia + hallucinate + toc + crimhist, family = poisson, data = mht ) Any assistance with the above is appreciated, regards Bob Green __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] input data format of multinom in nnet
Hi, I 'd like to use multinom to estimate a multinomial logit model of a conjoint survey with a format like: individual 1 --choice -- X1-X2-X3 1-A1.2blue12 1-0-1.4red-13 1-B3.1green---17 1-0-4.2red-14 .. 2C2.1-blue11 ... 0 means this alternative did not be chosen. choice set size is 2 for each choice. As I know at this moment, multinom deal with count data format (iris data). Anybody knows if multinom can take this format as input? Thanks. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] newbie question about index
Thanks for all these replies, all work perfectly. Sun --- Petr Pikal [EMAIL PROTECTED]写道: Hallo probably there are other options but outer(1:3,a, ==)*1 can do what you want. HTH Petr On 31 Aug 2006 at 22:41, z s wrote: Date sent:Thu, 31 Aug 2006 22:41:27 +0800 (CST) From: z s [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Subject: [R] newbie question about index Hi, I am trying to convert a variable a = sample(1:3,100,rep = T) represents choices into a 3X100 dummy varible b with corresponding element set to 1 otherwise 0. eg. a: 1 3 2 1 2 3 1 1 b: 1 0 0 1 0 0 1 1.. 0 0 1 0 1 0 0 0... 0 1 0 0 0 1 0 0... Is there something like b[a] =1 existing? I could not figure this out myself. - Mp3�杩袼�-新歌热歌高速下 [[alternative HTML version deleted]] Petr Pikal [EMAIL PROTECTED] ___ 抢注雅虎免费邮箱-3.5G容量,20M附件! __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Weibull distribution
Thanks for the suggestion! I switched to optimize(), al - optimize(f.fn, lower = 0.1, upper =100,tol=0.001); the warnings were gone and it works stably. But when I tried al - uniroot(f.fn, lower = 0.1, upper =100,tol=0.001); error occured: f() values at end points not of opposite sign. The error seems to me like there is no root found within the interval. I was not able to solve this problem. Thanks! Leaf - Original Message - From: Thomas Lumley, [EMAIL PROTECTED] Sent: 2006-07-21, 09:35:11 To: Valentin Dimitrov, [EMAIL PROTECTED] Subject: Re: [R] Weibull distribution On Fri, 21 Jul 2006, Valentin Dimitrov wrote: Dear Leaf, I modified your code as follows: gamma.fun - function(mu,sd,start=100) { f.fn - function(alpha) {abs(sd^2-mu^2/(gamma(1+1/alpha))^2*(gamma(1+2/alpha)-(gamma(1+1/alpha))^2))} alpha - optim(start, f.fn) beta - mu/gamma(1+1/alpha$par) return(list=c(a=alpha$par,b=beta)); } Now it works properly. First, I added an abs(). You tried to solve an equation by means of the R-function optim(), which finds a minimum. That's why you can find the solution of f(x)=a through minimization of abs(f(x)-a). Second, I deleted the optim-method BFGS from the optim() function, because it is not appropriate in this case. optim() is not appropriate at all in this case -- its help page says to use optimize() for one-dimensional problems. In fact, in one dimension there isn't any need to resort to optimization when you really want root-finding, and uniroot() is more appropriate than optimize(). -thomas [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Weibull distribution
Hi William, Thanks a lot for your response. I checked the package and found that what I want to solve was the opposite, that is, from mean and sd to parameters shape and scale. Could anyone give some hints please? Any suggestion would be appreciated! Leaf - Original Message - From: William Asquith, [EMAIL PROTECTED] Sent: 2006-07-17, 16:18:31 To: Leaf Sun, [EMAIL PROTECTED] Subject: Re: [R] Weibull distribution Do not have answer per se, but if you are seeking some comparisons-- try three parameter Weibull as implemented by the lmomco package. William On Jul 17, 2006, at 1:18 PM, Leaf Sun wrote: Hi all, By its definition, the mean and variance of two-par. Weibull distribution are: (www.wikipedia.org) I was wondering, if given mean and sd. could we parameterize the distribution? I tried this in R. gamma.fun - function(mu,sd,start=100) { f.fn - function(alpha) sd^2-mu^2/(gamma(1+1/alpha))^2*(gamma(1+2/ alpha)-(gamma(1+1/alpha))^2) alpha - optim(start, f.fn,method='BFGS') beta - mu/gamma(1+1/alpha$par) return(list=c(a=alpha$par,b=beta)); } But the problems come up here: 1)the return values of a and b are only related to the input mean, and nothing to do with the sd. For instance, when I apply a mean mu = 3 whatever I use sd=2, sd=4, the function returned the same scale and shape values. gamma.fun(3,4,10); ab 5.112554 3.263178 gamma.fun(3,2,10); ab 5.112554 3.263178 2) the start value determines the results: if I apply mean = 3, and sd=2, with a start of 10, it would return alpha close to 10, if I use a start = 100, it would return alpha close to 100. gamma.fun(3,2,10); ab 5.112554 3.263178 gamma.fun(3,2,100); a b 99.713.017120 Since I am not a statistician, I guess there must be some theoretical reasons wrong with this question. So I am looking forward to some correction and advice to solve these. Thanks a lot in advance! Leaf [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting- guide.html [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] (no subject)
@yahoo.ca Subject: Re: [R] Weibull distribution Message-ID: [EMAIL PROTECTED] X-mailer: Foxmail 6, 3, 103, 21 [cn] Mime-Version: 1.0 Content-Type: multipart/alternative; boundary==003_Dragon527446281311_= This is a multi-part message in MIME format. --=003_Dragon527446281311_= Content-Type: text/plain; charset=gb2312 Content-Transfer-Encoding: 7bit Hi William, Thanks a lot for your response. I checked the package and found that what I want to solve was the opposite, that is, from mean and sd to parameters shape and scale. Could anyone give some hints please? Any suggestion would be appreciated! Leaf - Original Message - From: William Asquith, [EMAIL PROTECTED] Sent: 2006-07-17, 16:18:31 To: Leaf Sun, [EMAIL PROTECTED] Subject: Re: [R] Weibull distribution Do not have answer per se, but if you are seeking some comparisons-- try three parameter Weibull as implemented by the lmomco package. William On Jul 17, 2006, at 1:18 PM, Leaf Sun wrote: Hi all, By its definition, the mean and variance of two-par. Weibull distribution are: (www.wikipedia.org) I was wondering, if given mean and sd. could we parameterize the distribution? I tried this in R. gamma.fun - function(mu,sd,start=100) { f.fn - function(alpha) sd^2-mu^2/(gamma(1+1/alpha))^2*(gamma(1+2/ alpha)-(gamma(1+1/alpha))^2) alpha - optim(start, f.fn,method='BFGS') beta - mu/gamma(1+1/alpha$par) return(list=c(a=alpha$par,b=beta)); } But the problems come up here: 1)the return values of a and b are only related to the input mean, and nothing to do with the sd. For instance, when I apply a mean mu = 3 whatever I use sd=2, sd=4, the function returned the same scale and shape values. gamma.fun(3,4,10); ab 5.112554 3.263178 gamma.fun(3,2,10); ab 5.112554 3.263178 2) the start value determines the results: if I apply mean = 3, and sd=2, with a start of 10, it would return alpha close to 10, if I use a start = 100, it would return alpha close to 100. gamma.fun(3,2,10); ab 5.112554 3.263178 gamma.fun(3,2,100); a b 99.713.017120 Since I am not a statistician, I guess there must be some theoretical reasons wrong with this question. So I am looking forward to some correction and advice to solve these. Thanks a lot in advance! Leaf [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting- guide.html --=003_Dragon527446281311_= Content-Type: text/html; charset=gb2312 Content-Transfer-Encoding: 7bit !DOCTYPE HTML PUBLIC -//W3C//DTD HTML 4.0 Transitional//EN HTMLHEAD META http-equiv=Content-Type content=text/html; charset=gb2312 META content=MSHTML 6.00.2900.2912 name=GENERATOR/HEAD BODY DIVHi William,/DIV DIVnbsp;/DIV DIVThanks a lot for your response. I checked the package and found that what I want to solve was the opposite, that is, from mean and sd to parameters shape and scale. Could anyone give some hints please? Any suggestion would be appreciated!/DIV DIVBRLeaf/DIV DIVnbsp;/DIV DIVnbsp;/DIV DIVnbsp;/DIV DIV- Original Message -/DIV DIVnbsp;/DIV DIVFONT size=2FONT face=TahomaSTRONGFrom:/STRONG William Asquith,nbsp;nbsp;A href=mailto:[EMAIL PROTECTED][EMAIL PROTECTED]/ABRBSent:/B 2006-07-17,nbsp; 16:18:31BRBTo:/B Leaf Sun, A href=mailto:[EMAIL PROTECTED][EMAIL PROTECTED]/ABRBSubject:/Bnbsp; Re: [R] Weibull distribution/FONT/FONT/DIV DIVnbsp;nbsp;/DIV DIV TABLE width=100% TBODY TR TD width=100% BLOCKQUOTE style=PADDING-RIGHT: 0px; PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #00 2px solid; MARGIN-RIGHT: 0px DIVDo nbsp;not nbsp;have nbsp;answer nbsp;per nbsp;se, nbsp;but nbsp;if nbsp;you nbsp;are nbsp;seeking nbsp;some nbsp;comparisons-- nbsp;/DIV DIVtry nbsp;three nbsp;parameter nbsp;Weibull nbsp;as nbsp;implemented nbsp;by nbsp;the nbsp;lmomco nbsp;package./DIV DIVnbsp;/DIV DIVWilliam/DIV DIVOn nbsp;Jul nbsp;17, nbsp;2006, nbsp;at nbsp;1:18 nbsp;PM, nbsp;Leaf nbsp;Sun nbsp;wrote:/DIV DIVnbsp;/DIV DIVgt; nbsp;Hi nbsp;all,/DIV DIVgt;/DIV DIVgt; nbsp;By nbsp;its nbsp;definition, nbsp;the nbsp;mean nbsp;and nbsp;variance nbsp;of nbsp;two-par. nbsp;Weibull nbsp; nbsp;/DIV DIVgt; nbsp;distribution nbsp;are:/DIV DIVgt;/DIV
[R] Weibull distribution
Hi all, By its definition, the mean and variance of two-par. Weibull distribution are: (www.wikipedia.org) I was wondering, if given mean and sd. could we parameterize the distribution? I tried this in R. gamma.fun - function(mu,sd,start=100) { f.fn - function(alpha) sd^2-mu^2/(gamma(1+1/alpha))^2*(gamma(1+2/alpha)-(gamma(1+1/alpha))^2) alpha - optim(start, f.fn,method='BFGS') beta - mu/gamma(1+1/alpha$par) return(list=c(a=alpha$par,b=beta)); } But the problems come up here: 1) the return values of a and b are only related to the input mean, and nothing to do with the sd. For instance, when I apply a mean mu = 3 whatever I use sd=2, sd=4, the function returned the same scale and shape values. gamma.fun(3,4,10); ab 5.112554 3.263178 gamma.fun(3,2,10); ab 5.112554 3.263178 2) the start value determines the results: if I apply mean = 3, and sd=2, with a start of 10, it would return alpha close to 10, if I use a start = 100, it would return alpha close to 100. gamma.fun(3,2,10); ab 5.112554 3.263178 gamma.fun(3,2,100); a b 99.71 3.017120 Since I am not a statistician, I guess there must be some theoretical reasons wrong with this question. So I am looking forward to some correction and advice to solve these. Thanks a lot in advance! Leaf [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] PLS method
dear all, I am a new comer to R and statistic. Now I have a little confuse about the package pls. I have to use 5 components to form a model. There are strong relationship between some of the components, which leads to the changes of the sign of each coeficeince, of course this is unwanted when using the normal regression way. So I choose the way of PLS, which is good at solve this kind of problem. In my work, q is the response and w,c,d,r,o are the 5 components. w c d r oq 1 219.580 0.880 102.742 12.988 0.9380 11 2 245.806 0.900 97.798 11.764 1.0080 12 3 219.850 0.910 93.764 5.608 1.1006 16 4 226.904 0.842 110.080 14.614 0.8398 7 5 250.792 0.868 108.212 14.714 0.8990 10 6 225.264 0.930 96.748 6.906 1.1784 16 7 229.562 0.856 103.204 12.900 0.8730 12 8 239.560 0.880 101.036 11.766 0.9452 12 9 199.008 0.920 91.338 3.918 1.1234 17 10 220.458 0.910 88.322 9.868 1.0746 13 11 201.228 0.910 89.202 10.328 1.0514 14 12 199.160 0.920 90.126 2.088 1.1326 15 13 135.540 0.786 121.506 19.140 0.6934 2 14 296.272 0.864 130.896 22.614 0.9104 6 15 190.766 0.840 108.050 7.336 0.8210 8 I have used the following sentence. b.pls-mvr(q~w+c+d+r+o,data=b,method=simpls) coef.mvr(b.pls) , , 5 comps q w 0.01993749 c 12.42713250 d -0.12050551 r -0.20287088 o 9.63670488 I have found that the sign of each component still cannot be explained by the reality. For instance, the sign of w should be negative rather than positive. b.pls-mvr(q~c+d+r+o,data=b,method=simpls) coef.mvr(b.pls) , , 4 comps q c 76.39196611 d -0.06512864 r -0.18272329 o -3.02212146 When I delete one of the components, the w, I found the coefficients of the rest ones also changes the sign, the component of o. As far as I concerned, this kind of situation should only happened when use the normal regression rather than PLS regression. Is there any wrong with my understanding? Why does this problem happen? I am appreciated for your help and answer. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] number of iteration s exceeded maximum of 50
Thanks to Douglas and all others who responded. I applied nls(y ~ a*x^b, start = list(a = a1, b = b1), control = list(maxiter = 500), trace=TRUE) to increase the number of iterations, found it successful. The suggestion Douglas raised in plotting the data and then tracing the optim numbers is correct because I found when I gave the number of b1 oppositely(say, should be positive, then given negative), nls( ) would never reached the convergence. Thanks for the nice suggestions! Leaf Sorry, I thought it was a straightforward question inside which I was stuck . I used nls( ) to estimate a and b in this function. nls(y~ a*x^b,start=list(a=a1,b=b1) seems the start list I gave was not able to reach convergence and it gave notes: number of iteration s exceeded maximum of 50. Then I putnls.control(maxiter = 50, tol = 1e-05, minFactor = 1/1024) in nls(.. ), and modified the argument of maxiter = 500. But it worked out as the same way and noted : number of iteration s exceeded maximum of 50. I have totally no idea how to set this parameter MAXITER. Thanks for any information! I think you are assuming that values passed to nls.control are persistent and will apply to further calls to nls.They don't.If you want to increase the maximum number of iterations you do it as nls(y ~ a*x^b, start = list(a = a1, b = b1), control = list(maxiter = 500)) but I would suggest that you also use trace = TRUE in the call to nls so you can see where the iterations are going.Merely increasing the number of iterations for an optimization that has gone into never-never land isn't going to help it converge. Two other things to consider: this is a partially linear model in the the parameter `a' appears linearly in the model expression.You may be able to stabilize the iterations using nls(y ~ x^b, start = list(b = b1), control = list(maxiter = 500), trace = TRUE, alg = 'plinear') Finally, and most important, please plot the data before trying to fit a nonlinear model to it so you can see if it has the characteristics that you would expect from data generate by such a model.As Brian Joiner said, Regression without plots is truly a regression. Leaf Hiall, Ifoundr-site-researchnotworkformethese days. WhenIwasdoingnls(),therewasan errornumberofiterationsexceededmaximumof50. Isetnumberinnls.controlwhichissupposedto controlthenumberofiterationsbutitdidn'twork well.Couldanybodywiththisexperiencetellmehow tofixit?Thanksinadvance! Wecannotmakesuggestionsunlessyoutelluswhat youtriedyourself. Idpossible,pleasegib´veareproducibleexamle. UweLigges Leaf [[alternativeHTMLversiondeleted]] __ R-help@stat.math.ethz.chmailinglist https://stat.ethz.ch/mailman/listinfo/r-help PLEASEdoreadthepostingguide! http://www.R-project.org/posting-guide.html [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] number of iteration s exceeded maximum of 50
Sorry, I thought it was a straightforward question inside which I was stuck . I used nls( ) to estimate a and b in this function. nls(y~ a*x^b,start=list(a=a1,b=b1) seems the start list I gave was not able to reach convergence and it gave notes: number of iteration s exceeded maximum of 50. Then I put nls.control(maxiter = 50, tol = 1e-05, minFactor = 1/1024) in nls(.. ), and modified the argument of maxiter = 500. But it worked out as the same way and noted : number of iteration s exceeded maximum of 50. I have totally no idea how to set this parameter MAXITER. Thanks for any information! Leaf Hi all, I found r-site-research not work for me these days. When I was doing nls( ) , there was an error number of iterations exceeded maximum of 50. I set number in nls.control which is supposed to control the number of iterations but it didn't work well. Could anybody with this experience tell me how to fix it? Thanks in advance! We cannot make suggestions unless you tell us what you tried yourself. Id possible, please gib´ve a reproducible examle. Uwe Ligges Leaf [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] number of iteration s exceeded maximum of 50
Hi all, I found r-site-research not work for me these days. When I was doing nls( ) , there was an error number of iterations exceeded maximum of 50. I set number in nls.control which is supposed to control the number of iterations but it didn't work well. Could anybody with this experience tell me how to fix it? Thanks in advance! Leaf [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] small sample size confidence interval by bootstrap
Hi, All: I only have 4 samples. I wish to get a confidence interval around the mean. Is it reasonable? If not, is there a way to compute a confidence interval for such small sample size's mean? Many thanks, U [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Integration subroutine?
Hi all, Why fr2-function(x) { if(x1){x*x}else{1} } integrate(fr2,-1,2) gives the following wrong answer: 3 with absolute error 3.3e-14 Warning message: the condition has length 1 and only the first element will be used in: if (x 1) { while fr-function(x){as.numeric(x1)*(x^2-1)+1} integrate(fr,-1,2) works? 1.67 with absolute error 3.4e-05 Thanks. Lynette Sun Department of Statistics [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] how to get the row name?
Hi R-listers, I have a simple question about a data frame. I sorted a data set by one of the variable in some condition (eg. X=0), the followed is part of the achieved. I was wondering how can I get the row name, i. e. (1202, 2077 , 2328, 3341,... ) and save them as a vector. Thanks! Tag Species X Y Dbh3 Recr4 mort slope elevation aspectSA SR dist1 dist2 dist3 120219103 316 856.0 430.3 21 41 9.87151.42 60.08 25.38 1.02 0.2236068 0.7211103 1.3601471 207729893 316 935.4 482.7 28 41 5.66137.28 13.86 25.14 1.01 0.6403124 0.8944272 1.0630146 232832989 316 910.7 301.5 12 41 8.07137.69 86.16 25.26 1.01 0.300 1.2806248 1.3038405 334145198 316 975.2 2.4 144 41 2.95121.10 173.60 0.00 0.00 0.5656854 1.2727922 1.3416408 ... Regards, Leaf __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] (no subject)
Hi, I am new in writing R extension. I read the Writing R Extensions, and search online but just cannot find the answer. I am trying to add c-code, which depend on GSL C library, into my R package. I am using Max OS-X 10.4.3, powerpc-apple-darwin8-gcc-4.0.1 I begin wtih package.skeleton, and then add my c file into folder src and then add file Makevars: PKG_LIBS=-lgsl -lgslcblas -lm In the zzz.R file, I add the function .First.lib: .First.lib - function(lib, pkg) { library.dynam(anovaGSL, pkg, lib) } I think these two things are enough. then I just run R CMD check anovaGSL (my package name) followed is the error message: Error: .First.lib failed for 'anovaGSL' Call sequence: 2: stop(gettextf(.First.lib failed for '%s', libraryPkgName(package)), domain = NA) 1: library(package, lib.loc = lib.loc, character.only = TRUE, verbose = FALSE) Execution halted See section 'Generic functions and methods' of the 'Writing R Extensions' manual. * checking replacement functions ... WARNING Error: .First.lib failed for 'anovaGSL' Call sequence: 2: stop(gettextf(.First.lib failed for '%s', libraryPkgName(package)), domain = NA) 1: library(package, lib.loc = lib.loc, character.only = TRUE, verbose = FALSE) Execution halted In R, the argument of a replacement function which corresponds to the right hand side must be named 'value'. * checking foreign function calls ... WARNING Error: .First.lib failed for 'anovaGSL' Call sequence: 2: stop(gettextf(.First.lib failed for '%s', libraryPkgName(package)), domain = NA) 1: library(package, lib.loc = lib.loc, character.only = TRUE, verbose = FALSE) Execution halted See section 'System and foreign language interfaces' of the 'Writing R Extensions' manual. * checking Rd files ... OK * checking for missing documentation entries ... ERROR Error: .First.lib failed for 'anovaGSL' appreciate any help!!! wei [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] any package to compare two scoring systems
Hi, I wonder if there is a package aimed at comparing two scoring systems. Such as, to compare a student's performance (scores or ranks) in two different testing systems to find out if they are consistent? Sorry if this question is too naive. It seems easy but just wonder if there are handy packages or special methodologies. Thank you, [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] read.list()
Hi all, I need to write and read a list in R. I did r.site.search, found there is a package rmutil doing this, unfortunately it is not on the list of package. In another words, I can't install it from any CRAN mirror. Anybody has idea about this? or any suggestion about the list? Thanks! Best! Leaf __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] residuals in logistic regression model
In the logistic regression model, there is no residual log (pi/(1-pi)) = beta_0 + beta_1*X_1 + . But glm model will return residuals What is that? How to understand this? Can we put some residual in the logistic regression model by replacing pi with pi' (the estimated pi)? log (pi'/(1-pi')) = beta_0 + beta_1*X_1 + .+ ei Thanks! [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] residuals in logistic regression model
Thanks, Professor. But is it ok to write residuals in the right hand side of the logistic regression formula? Some people said I cannot since the generalized linear model is to use a function to link the expectation to a linear model. So there should not be residuals in the right hand side. My question is that If residuals do exist (as in the glm model output), why not put them in the formula (for example, if I write the left-hand side as the estimated odds-ratio)? Many thanks! Happy Thanksgiving! On 11/24/05, John Fox [EMAIL PROTECTED] wrote: Dear Urania, The residuals method for glm objects can compute several kinds of residuals; the default is deviance residuals. See ?residuals.glm for details and references. I hope this helps. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Urania Sun Sent: Thursday, November 24, 2005 1:36 PM To: r-help@stat.math.ethz.ch Subject: [R] residuals in logistic regression model In the logistic regression model, there is no residual log (pi/(1-pi)) = beta_0 + beta_1*X_1 + . But glm model will return residuals What is that? How to understand this? Can we put some residual in the logistic regression model by replacing pi with pi' (the estimated pi)? log (pi'/(1-pi')) = beta_0 + beta_1*X_1 + .+ ei Thanks! [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] residuals in logistic regression model
Thanks a lot, Professor. Now I know if I put some additive residuals in the right handside of the logistic regression equation, they are different with any glm returned residuals. But is it ever ok or legal to put some additive residuals in the right-hand side of the logistic equation regardless whatever they are when I write the left-hand side as the log odds-ratio of proportion instead of probability? I have a big confusion on this. Thanks a lot. Your book in the library is currently checked out by someone else. I may try to get one of my own. On 11/24/05, John Fox [EMAIL PROTECTED] wrote: Dear Urania, -Original Message- From: Urania Sun [mailto: [EMAIL PROTECTED] Sent: Thursday, November 24, 2005 8:52 PM To: John Fox Cc: r-help@stat.math.ethz.ch Subject: Re: [R] residuals in logistic regression model Thanks, Professor. But is it ok to write residuals in the right hand side of the logistic regression formula? Some people said I cannot since the generalized linear model is to use a function to link the expectation to a linear model. So there should not be residuals in the right hand side. My question is that If residuals do exist (as in the glm model output), why not put them in the formula (for example, if I write the left-hand side as the estimated odds-ratio)? There are several kinds of residuals for generalized linear models, as I mentioned (see ?residuals.glm). The residuals in the glm output are deviance residuals, which are the casewise (signed) components of the residual deviance; differences between y and fitted-y are called response residuals (and aren't generally as useful). The left-hand side of a logit model transformed with the logit-link is the log-odds, not the odds or odds-ratio. The form of the model to which the response residuals applies has the proportion, not the logit, on the left-hand side. These matters are discussed in the references given in ?residuals.glm, and in many other places, such as Sec. 6.6 of my R and S-PLUS Companion to Applied Regression. Many thanks! Happy Thanksgiving! Unfortunately we celebrate Thanksgiving in Canada in October, probably because the weather here in late November leaves little to be thankful for. Regards, John On 11/24/05, John Fox [EMAIL PROTECTED] wrote: Dear Urania, The residuals method for glm objects can compute several kinds of residuals; the default is deviance residuals. See ?residuals.glm for details and references. I hope this helps. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto: [EMAIL PROTECTED] On Behalf Of Urania Sun Sent: Thursday, November 24, 2005 1:36 PM To: r-help@stat.math.ethz.ch Subject: [R] residuals in logistic regression model In the logistic regression model, there is no residual log (pi/(1-pi)) = beta_0 + beta_1*X_1 + . But glm model will return residuals What is that? How to understand this? Can we put some residual in the logistic regression model by replacing pi with pi' (the estimated pi)? log (pi'/(1-pi')) = beta_0 + beta_1*X_1 + .+ ei Thanks! [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html http://www.R-project.org/posting-guide.html http://www.r-project.org/posting-guide.html [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] what does the it when there is a zero events in the Logistic Regression with glm?
Sorry for my stupid mistakes and thanks for your reply. I just have a study on the occurrence of rare events. Although I collected thousands of observations, there are some groups with 0 events. I think it is too crude to drop those 0-events groups. I have read some books about logistic regression searched the r-help maillist. But I donot find enough information about separation. Would you be so kind to give me some suggestions on separation and the better algorithms? Thanks! Sh.G. Sun Prof Brian Ripley wrote: On Tue, 22 Nov 2005, S. Sun wrote: I have a question about the glm. Not really: your question is about understanding logistic regressions. When the events of an observation is 0, the logit function on it is Inf. I wonder how the glm solve it. Note that logit(0) = -Inf whereas logit(1) = Inf. It is the fitted probabilities which are passed to logit, not the empirical proportions. Logistic regression is often applied to Bernouilli trials with 0/1 proportions, with nothing to `solve'. So the issue only arises if the MLE would give 0 (or 1) fitted values, and it cannot in a logistic regression. You have here an example in which the MLE does not exist and the log-likelihood does not attain its maximum. Such situations are known as `separation' and it is well-known that there are better algorithms for such problems. An example: Treat Events Trials A 0 50 B 7 50 C 10 50 D 15 50 E 17 50 Program: treat - factor(c(A, B, C, D, E)) events - c(0, 7, 10, 15, 17) trials - rep(50, 5) glm(cbind(events, trials-events)~treat, family=binomial) What's wrong with it? And are there better ideas? Nothing is `wrong with it'. It finds fitted values which are very close to the observed values. You have chosen an inappropriate model and an inappropriate parametrization (see ?relevel). I presume you did think something is wrong, but you did not tell us what. Please do read the posting guide and try to provide us with enough information to help you. Also, please do sign your messages indicating who you are and what your background is. In cases like this the best advice is to suggest asking your supervisor (if you have one) or to read the literature (but what specifically depends on your background). __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] what does the it when there is a zero events in the Logistic Regression with glm?
Dear all, I have a question about the glm. When the events of an observation is 0, the logit function on it is Inf. I wonder how the glm solve it. An example: Treat Events Trials A 0 50 B 7 50 C 10 50 D 15 50 E 17 50 Program: treat - factor(c(A, B, C, D, E)) events - c(0, 7, 10, 15, 17) trials - rep(50, 5) glm(cbind(events, trials-events)~treat, family=binomial) What's wrong with it? And are there better ideas? -- Sh. Sun __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Point pattern to grid
Hi Roger, Thanks again for your kind help. Yes, I still use the 200K points data applying this program but the good thing is I found it finished in no time. The questions again here are: 1) try0 - lapply(split(as(df1, data.frame), res), mean) When I tried to replace mean to sum, error looks like this: Error in [EMAIL PROTECTED], i, drop = FALSE] : undefined columns selected 2) If I just need to know the number of points in each cells, how can I modify the codes. The codes still a bit beyond me. Thanks! Leaf === At 2005-11-18, 01:39:05 you wrote: === On Thu, 17 Nov 2005, Leaf Sun wrote: Dear all, I'd like to change a point pattern to a grid of cells and use one of the variables as the output. e.g. The point pattern is of a window of (500*500) and several features such as pH, SoilType etc. I like to divide it into a grid with cell size 5*5, and use the mean of the point values falling inside the cell as the output. Is there any package in R working with this? Thanks in advance! This might have been better posted on R-sig-geo. Try this: library(sp) df1 - data.frame(x=runif(1,0,500), y=runif(1,0,500), z=rnorm(1)) coordinates(df1) - c(x, y) summary(df1) # SpatialPointsDataFrame grd - GridTopology(c(2.5,2.5), c(5,5), c(100,100)) sgrd - SpatialGrid(grd) #SpatialGrid bbox(sgrd) res - overlay(sgrd, df1) # find which grid cells the points are in str(res) try0 - lapply(split(as(df1, data.frame), res), mean) # take means by grid cell - assumes all numeric columns in df1 # (soil type??) - maybe write a custom function to handle non-numeric # columns sensibly try01 - vector(mode=list, length=prod(slot(slot(sgrd, grid), cells.dim))) nafill - rep(as.numeric(NA), ncol(as(df1, data.frame))) try01 - lapply(try01, function(x) nafill) # make a container to put the means in with the right number of columns try01[as.integer(names(try0))] - try0 # insert means into correct list elements try1 - data.frame(t(data.frame(try01))) # transpose summary(try1) sgrd1 - SpatialGridDataFrame(slot(sgrd, grid), try1) image(sgrd1, x) image(sgrd1, y) image(sgrd1, z) It goes a bit further than the short description of the sp package in the latest R-News, and will most likely be a new method for overlay in sp. If these are your 200K points, it may take a little longer ... Cheers, Leaf -- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: [EMAIL PROTECTED] = = = = = = = = = = = = = = = = = = = = __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Point pattern to grid
Dear all, I'd like to change a point pattern to a grid of cells and use one of the variables as the output. e.g. The point pattern is of a window of (500*500) and several features such as pH, SoilType etc. I like to divide it into a grid with cell size 5*5, and use the mean of the point values falling inside the cell as the output. Is there any package in R working with this? Thanks in advance! Cheers, Leaf __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Variogram
Sorry about this, I didn't know. I guess I have posted too much garbage on here. Thanks to Edzer for your answers! Leaf === At 2005-11-09, 02:14:21 you wrote: === Leaf, please note that r-help is not the appropriate place to ask package-specific questions. We have r-sig-geo for questions related to geographic data in R, and gstat has a mailing list on its own. The answer is below. -- Edzer Leaf wrote: Dear All, Is there anybody has the experience in using variogram(gstat) ? Please kindly give me some hints about the results. I used variogram() to build a semivariogram plot as: tr.var=variogram(Incr~1,loc=~X+Y,data=TRI2TU,width=5) then fir the variogram to get the parameters as: v.fit = fit.variogram(tr.var,vgm(0.5,Exp,300,1)) v.fit modelpsillrange 1 Nug 1.484879 0.0 2 Exp 3.476700 29.70914 This is the output of v.fit. Can anybody help me write the exponential formula for this variogram? I have the problem in understanding the result. BTW The equation you're looking for is: if h = 0, gamma(h) = 0 if h 0, gamma(h) = 1.484879 + 3.4767 (1 - exp(-h/29.70914)) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html = = = = = = = = = = = = = = = = = = = = __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] spatial epidemiology
Hi, All: Does R have such packages for spatial epidemiology? Many thanks, __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] OLS variables
Dear all, Is there any simple way in R that can I put the all the interactions of the variables in the OLS model? e.g. I have a bunch of variables, x1,x2, x20... I expect then to have interaction (e.g. x1*x2, x3*x4*x5... ) with some combinations(2 way or higher dimensions). Is there any way that I can write the model simpler? Thanks! Leaf __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] OLS variables
Thanks for the information! Leaf === At 2005-11-06, 11:07:31 you wrote: === IMHO, the details section of help(formula) provides a nicer help. Regards, Adai On Sun, 2005-11-06 at 08:27 -0500, John Fox wrote: Dear Leaf, I assume that you're using lm() to fit the model, and that you don't really want *all* of the interactions among 20 predictors: You'd need quite a lot of data to fit a model with 2^20 terms in it, and might have trouble interpreting the results. If you know which interactions you're looking for, then why not specify them directly, as in lm(y ~ x1*x2 + x3*x4*x5 + etc.)? On the other hand, it you want to include all interactions, say, up to three-way, and you've put the variables in a data frame, then lm(y ~ .^3, data=DataFrame) will do it. There are many terms in this model, however, if not quite 2^20. The introductory manual that comes with R has information on model formulas in Section 11. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Leaf Sun Sent: Sunday, November 06, 2005 3:11 AM To: r-help@stat.math.ethz.ch Subject: [R] OLS variables Dear all, Is there any simple way in R that can I put the all the interactions of the variables in the OLS model? e.g. I have a bunch of variables, x1,x2, x20... I expect then to have interaction (e.g. x1*x2, x3*x4*x5... ) with some combinations(2 way or higher dimensions). Is there any way that I can write the model simpler? Thanks! Leaf __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Adaikalavan Ramasamy[EMAIL PROTECTED] Centre for Statistics in Medicine http://www.ihs.ox.ac.uk/csm/ Wolfson College Annexe Tel : 01865 284 408 Linton Road, Oxford OX2 6UD Fax : 01865 284 424 . = = = = = = = = = = = = = = = = = = = = Leaf Sun [EMAIL PROTECTED] 2005-11-06 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] exponential convolution (different lembdas)?
Anyone knows how to realize multiple exponential(different lembdas) convolutons in R or where I can find the final equation? Thanks. Best, Lynette __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Visualizing a Data Distribution -- Was: breaks in hist()
Thanks for all the response. I think plotting a cdf or taking transformation could make the plot look better. But my further question is how to set the breaks to make the histogram concentrate in the interval of (0.01,0.2). I can even ignore the other parts of the values. Thanks! Leaf === At 2005-11-02, 12:07:12 you wrote: === Leaf Sun wrote: The histogram is highly screwed to the right, say, the range of the vector is [0, 2], but 95% of the value is squeezed in the interval (0.01, 0.2). I guess the histogram is as you wrote. See http://web.maths.unsw.edu.au/~tduong/seminars/intro2kde/ for a short explanation. -Original Message- From: Berton Gunter [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 02, 2005 1:10 PM To: 'Leaf Sun'; r-help@stat.math.ethz.ch Subject: [R] Visualizing a Data Distribution -- Was: breaks in hist() Leaf: An interesting question concerning graphical perception. As you have noted, choice of bin boundaries in a histogram can have a big effect on how a distribution is perceived. My $.02 (U.S.): Histograms are a relic of manual data plotting. We have much better alternatives these days that should be used instead. e.g. 1. (my preference, but properly not consumer-friendly). Plot the cdf instead (?ecdf) . 2. Plot a density estimator (?density ; ?densityplot) 3. See David Scott's ash package, perhaps the KernSmooth package also (though density() probably already has anything that you'd need from it). Cheers, -- Bert Gunter Genentech Non-Clinical Statistics South San Francisco, CA The business of the statistician is to catalyze the scientific learning process. - George E. P. Box -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Leaf Sun Sent: Wednesday, November 02, 2005 9:49 AM To: r-help@stat.math.ethz.ch Subject: [R] breaks in hist() Dear listers, A quick question about breaks in hist(). The histogram is highly screwed to the right, say, the range of the vector is [0, 2], but 95% of the value is squeezed in the interval (0.01, 0.2). My question is : how to set the breaks then make the histogram look even? Thanks in advance, Leaf __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html = = = = = = = = = = = = = = = = = = = = __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] multidimensional integration not over a multidimensionalrectangle
Hi, anyone knows about any functions in R can get multidimensional integration not over a multidimensional rectangle (not adapt). For example, I tried the following function f(x,n)=x^n/n! phi.fun-function(x,n) { if (n==1) { x }else{ integrate(phi.fun, lower=0, upper=x, n=n-1)$value } } I could get f(4,2)=4^2/2!=8, but failed in f(4,3)=4^3/3! Thanks Best, Lynette __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] breaks in hist()
Dear listers, A quick question about breaks in hist(). The histogram is highly screwed to the right, say, the range of the vector is [0, 2], but 95% of the value is squeezed in the interval (0.01, 0.2). My question is : how to set the breaks then make the histogram look even? Thanks in advance, Leaf __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] multidimensional integration not over a multidimensional rectangle
Hi, anyone knows about any functions in R can get multidimensional integration not over a multidimensional rectangle (not adapt). For example, I tried the following function f(x,n)=x^n/n! phi.fun-function(x,n) { if (n==1) { x }else{ integrate(phi.fun, lower=0, upper=x, n=n-1)$value } } I could get f(4,2)=4^2/2!=8, but failed in f(4,3)=4^3/3! Thanks Best, Lynette __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Finding the neighbors of the point
Hi Roger and the list, The package is working very well. What surprised me most is the speed. As I mentioned in my previous emails, I have to find the neighbors for around 200,000 individuals. It took no more than 10 minutes for the function to finish the searching and returned enough information (ldnn, lnn). As of the third dimension -Z, I applied the code you sent to me, also worked very well. I only modified some condition that meets the requirement. This package is just great for such neighbor searching. Thank you very much and all the best! Leaf === At 2005-10-25, 04:31:54 you wrote: === On Mon, 24 Oct 2005, Leaf Sun wrote: Running R 2.2.0 on winXP. Computer P4 CPU 3.2G and 1G of RAM. Please try the attached Windows binary package. Look at the help page for ann.dist(). It returns a list of three elements, the first, lnn, gives the index numbers of the neighbours closer than maxdist. From there say you have a vector z where you want the neighbour relation to apply only when z[i] z[j], so res - ann.dist(pts, maxdist=md) glist - vector(mode=list, length=length(res$lnn)) for (i in seq(along=res$lnn)) { if (length(res$lnn[[i]]) 0) { glist[[i]] - ifelse(z[i] z[res$lnn[[i]]], 1, 0) } } so glist tells you which to drop. Alternatively, you can drop them straight away: res - ann.dist(pts, maxdist=md) glist - vector(mode=list, length=length(res$lnn)) for (i in seq(along=res$lnn)) { if (length(res$lnn[[i]]) 0) { glist[[i]] - res$lnn[[i]]][z[i] z[res$lnn[[i } } (neither of these are tried, so the brackets may not match). Please let me know how you get on. Roger === At 2005-10-24, 09:46:28 you wrote: === On Mon, 24 Oct 2005, Leaf Sun wrote: No, I mean I have to find the neighbors of 200,000 points. Your R version and OS - output of version on your machine? Roger === At 2005-10-24, 03:30:41 you wrote: === On Fri, 21 Oct 2005, Leaf Sun wrote: Roger, The data frame is of 200,000 by 15 elements. Do you mean that you need to find distances in 15 dimensions? Roger I've learned some C, long time ago. But I guess I would understand the C codes. Thanks! Leaf === At 2005-10-21, 14:11:38 you wrote: === On Fri, 21 Oct 2005, Leaf Sun wrote: Dear all, I got point data of trees. I was wondering if anybody has experience in searching the neighbors within a specified distance efficiently. XY Z 99 34 65 98 35 29 98 34 28 99 33 33 98 32 23 99 33 21 99 33 22 99 32 24 99 30 23 ... What I want to do is : searching for the neighbors with a distance R for each tree the neighbor must have a bigger Z. The data set is huge so the R-codes is working slowly when I search it without subset it. And huge is how big? For very large problems, you'll need a kd-tree or r-tree approach to divide up the point locations before making the spatial query (I think the retention of neighbours with a larger z is the final step). There do not seem to be such functions in R or contributed packages at present. If you are willing to collaborate, I can pass on a draft package corrected by Christian Sangiorgio for approximate nearest neighbours (an interface to ANN by David Mount and collaborators), but it isn't working yet. So an investment in time and some knowledge of C++ will be useful. Any suggestion would be much appreciated! Leaf -- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: [EMAIL PROTECTED] = = = = = = = = = = = = = = = = = = = = __ Do You Yahoo!? http://mail.yahoo.com -- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: [EMAIL PROTECTED] . = = = = = = = = = = = = = = = = = = = = Leaf Sun [EMAIL PROTECTED] 2005-10-24 __ Do You Yahoo!? http://mail.yahoo.com -- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: [EMAIL PROTECTED
[R] Errors occured
Hi all, Has anybody have the experience in the errors: Error in data.frame(..., check.names=FALSE): arguments imply differing number of rows: 343,15 This is the error occured in the middle of the program. I don't think the data frame has any problem, if there is problem with the program, why it happened in the middle? Does anybody have such an experience? It seems so weird to me. Thanks a lot! Leaf __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Finding the neighbors of the point
Dear all, I got point data of trees. I was wondering if anybody has experience in searching the neighbors within a specified distance efficiently. XY Z 99 34 65 98 35 29 98 34 28 99 33 33 98 32 23 99 33 21 99 33 22 99 32 24 99 30 23 ... What I want to do is : searching for the neighbors with a distance R for each tree the neighbor must have a bigger Z. The data set is huge so the R-codes is working slowly when I search it without subset it. Any suggestion would be much appreciated! Leaf __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Sorting a data frame by one of the variables
Dear all, I have a date frame like this: X Y Z 22 24 4.3 2.3 3.4 5.3 . 57.223.434 What my purpose is: to sort the data frame by either X, Y or Z. sample output is (sorted by X) : X Y Z 2.3 3.4 5.3 . .. 22 24 4.3 ... 57.2 23.4 34 I have no idea how to use sort, order or rank functions. Please help me out. Thanks! Leaf __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] run many linear regressions against the same independent variables in batch
R function lm(response ~ term) allows me to run a linear regression on a single response vector. For example, I have recent one year historical prices for a stock and SP index. I can run regression of the stock prices (as response vector) against the SP index prices (as term vector). Now assume I have 1000 stocks to run the above regressions (against the same SP index prices). The only way I know is that I write a loop. Within each loop I do the regression for one stock price. Is there a batch method to run the 1000 regressions in one shot? Note that this functionality is available in SAS (the SAS procedure reg). Actually, some times we run such regressions for about 300K securities. Performing regressions in loop takes a long time. On the contrary, running on SAS is much faster. Thank you in advance. Heng Sun 212-855-5754 Director Quantitative Risk Depository Trust and Clearing Corporation [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] run many linear regressions against the same independent variables in batch
Thank you Gabor. I suspected R could do that. But I tried a data frame and it did not work. Now I test for matrix and it works. So it seems performing many regressions in one shot works for a matrix, but not a data frame. Heng Gabor Grothendieck [EMAIL PROTECTED] 10/14/2005 12:05 PM To Heng Sun [EMAIL PROTECTED] cc r-help@stat.math.ethz.ch Subject Re: [R] run many linear regressions against the same independent variables in batch This runs a regression of each column (except the first) of matrix state.x77 against the first: lm(state.x77[,-1] ~ state.x77[,1]) On 10/14/05, Heng Sun [EMAIL PROTECTED] wrote: R function lm(response ~ term) allows me to run a linear regression on a single response vector. For example, I have recent one year historical prices for a stock and SP index. I can run regression of the stock prices (as response vector) against the SP index prices (as term vector). Now assume I have 1000 stocks to run the above regressions (against the same SP index prices). The only way I know is that I write a loop. Within each loop I do the regression for one stock price. Is there a batch method to run the 1000 regressions in one shot? Note that this functionality is available in SAS (the SAS procedure reg). Actually, some times we run such regressions for about 300K securities. Performing regressions in loop takes a long time. On the contrary, running on SAS is much faster. Thank you in advance. Heng Sun 212-855-5754 Director Quantitative Risk Depository Trust and Clearing Corporation [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] generalized linear model for multinomial data?
Dear All: Does R have the package as in SAS's generalized logits model for nominal response data? I have searched but cannot find the windows package. Many thanks, HS __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] generalized linear model for multinomial data?
Dear Professor: Thanks! I just found it. It is bundled within the VR package. Hongyu - Original Message - From: John Fox [EMAIL PROTECTED] To: 'Hongyu Sun' [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Sent: Sunday, October 02, 2005 3:27 PM Subject: RE: [R] generalized linear model for multinomial data? Dear Hongyu, See multinom() in the nnet package, associated with Venables and Ripley's Modern Applied Statistics with S. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Hongyu Sun Sent: Sunday, October 02, 2005 3:07 PM To: r-help@stat.math.ethz.ch Subject: [R] generalized linear model for multinomial data? Dear All: Does R have the package as in SAS's generalized logits model for nominal response data? I have searched but cannot find the windows package. Many thanks, HS __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] how to do multiple comparisons in R?
Hi, Sorry I have to bother a question. Does R have the functions to do lsd, tukey, bonferonni, contrast etc. like in SAS? Many thanks, HS __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] the number of cluster
hi, all, how to decide the number of cluster before you use kmeans and hclust? thank you in advance! best -xpsun __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] how to draw the data set processed by hclust?
hi, all i hava a dataset formated as follow: x1 y1 z1 x2 y2 z2 x3 y3 z3 ... how to cluster it with hclust? and the draw the data with color of cluster? thank you in advance! best! xpsun __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] cl$cluster of kmeans
hi, all, i had a problem when i used kmeans as follow: pp - scan(m0028.data, quiet= TRUE) x - matrix(pp, nc=3, byrow=TRUE) cl-kmeans(x, 4, 20) plot(x, col=cl$cluster) save(cl$cluster, file=m0028.ks, ascii=TRUE) Error in save(cl$cluster, file = m0028.ks, ascii = TRUE) : Object cl$cluster not found when i wanted to save the vector of cluster only, a reflection was thrown out as that. could i save cl$cluster only? how? thank you in advance. best, xp sun __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] ploting an ellipse keeps giving errors
library (ellipse) shape1 = c (1, 0, 0,1) dim(shape1) = c(2,2) ellipse (center = c(0,0), shape = shape1, radius = 1) = Error in plot.xy(xy.coords(x, y), type = type, col = col, lty = lty, ...) : plot.new has not been called yet It is really frustrating. Also what do the shape matrix, radius correspond to an ellipse function (x-x0)^2/a + (y-y0)^2/b = 1 ? Please advise! Many thanks, Sun [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] ploting an ellipse keeps giving errors
Thanks a lot. I added that line. library(ellipse) plot(-2:2, type='n') shape1 = c (1, 0, 0,1) dim(shape1) = c(2,2) center1 = c(0,0) radius1 = 1 ellipse (center1, shape1, radius1) But now it said: Error in ellipse(center, shape, radius) : dim- : dims [product 4] do not match the length of object [200] What is wrong? I don't understand the shape and radius's meaning so I have no idea of what object's lenght. Many thanks! Sun - Original Message - From: Patrick Burns [EMAIL PROTECTED] To: Sun [EMAIL PROTECTED] Sent: Wednesday, October 27, 2004 3:51 AM Subject: Re: [R] ploting an ellipse keeps giving errors The error message is telling you that it expects a plot to already be there. You can do something like: plot(-2:2, type='n') Patrick Burns Burns Statistics [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and A Guide for the Unwilling S User) Sun wrote: library (ellipse) shape1 = c (1, 0, 0,1) dim(shape1) = c(2,2) ellipse (center = c(0,0), shape = shape1, radius = 1) = Error in plot.xy(xy.coords(x, y), type = type, col = col, lty = lty, ...) : plot.new has not been called yet It is really frustrating. Also what do the shape matrix, radius correspond to an ellipse function (x-x0)^2/a + (y-y0)^2/b = 1 ? Please advise! Many thanks, Sun [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] ploting an ellipse keeps giving errors
Thank you. I found there are two ellipses 1. R2.0 library (car) 2. R1.9 and R2.0 library (ellipse) And they are different! I can't run 1. But the 2. is kind of specialized for t-distribution confidence and so on. I need to find a general ellipse for an ellipse equation like (x-x0)^2/a + (y-y0)^2/b =1 . Since I used chi-square percentile not t. I am trying to obtain the large sample 95% simultaneous confidence ellipse for two population means (say, weight and height). The input are the two sample means and their covariances. Maybe I have to make my own ellipse function. Sun - Original Message - From: Petr Pikal [EMAIL PROTECTED] To: Sun [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Wednesday, October 27, 2004 4:14 AM Subject: Re: [R] ploting an ellipse keeps giving errors Hi Did you read what ellipse does and how it shall be used? And what about your system, R and ellipse version? From your example i got ellipse (center = c(0,0), shape = shape1, radius = 1) Error in ellipse.default(center = c(0, 0), shape = shape1, radius = 1) : Argument x is missing, with no default Ellipse is not for drawing arbitrary ellipses but has different usage. See its help page and an example. Cheers Petr On 27 Oct 2004 at 3:34, Sun wrote: library (ellipse) shape1 = c (1, 0, 0,1) dim(shape1) = c(2,2) ellipse (center = c(0,0), shape = shape1, radius = 1) = Error in plot.xy(xy.coords(x, y), type = type, col = col, lty = lty, ...) : plot.new has not been called yet It is really frustrating. Also what do the shape matrix, radius correspond to an ellipse function (x-x0)^2/a + (y-y0)^2/b = 1 ? Please advise! Many thanks, Sun [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Petr Pikal [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] ploting an ellipse keeps giving errors
Dear Mr. Fox: Now I got things done using your ellipse in Package car! Just note to plot an empty plot before calling ellipse. And input the correct radius and covariance matrix (I input the inverse of the covariance matrix multiplied by n as the covariance matrix, that was wrong. n should go to divide the righ hand side and square root to get the radius and no inverse needed). Thank you! By the way, I tried to write my own and it does not work only giving two parallel lines.:( Now the ellipse in car works totally well! Sun - Original Message - From: John Fox [EMAIL PROTECTED] To: 'Sun' [EMAIL PROTECTED] Cc: 'Petr Pikal' [EMAIL PROTECTED]; [EMAIL PROTECTED] Sent: Wednesday, October 27, 2004 8:24 AM Subject: RE: [R] ploting an ellipse keeps giving errors Dear Sun, There are indeed (at least) two ellipse functions, and the one in the car package has different arguments from the one in the ellipse package. You should be able to use ellipse() in either package to do what you want; use the t argument for the version in ellipse or the radius argument for the version in car, setting this to the square-root of the critical chisquare or F. (ellipse in car works by deforming a circle.) If height and weight are from the same sample, then they won't be independent, so, e.g., the shape argument to ellipse in car would be their sample covariance matrix. The center argument to ellipse() in car and the centre argument to ellipse() in ellipse should be the vector of means. But the ellipse() function in ellipse and the data.ellipse() function in car already does all this, so why do you feel that you need to roll your own (or perhaps I misunderstand what you want)? I hope this helps. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sun Sent: Wednesday, October 27, 2004 4:25 AM To: Petr Pikal Cc: [EMAIL PROTECTED] Subject: Re: [R] ploting an ellipse keeps giving errors Thank you. I found there are two ellipses 1. R2.0 library (car) 2. R1.9 and R2.0 library (ellipse) And they are different! I can't run 1. But the 2. is kind of specialized for t-distribution confidence and so on. I need to find a general ellipse for an ellipse equation like (x-x0)^2/a + (y-y0)^2/b =1 . Since I used chi-square percentile not t. I am trying to obtain the large sample 95% simultaneous confidence ellipse for two population means (say, weight and height). The input are the two sample means and their covariances. Maybe I have to make my own ellipse function. Sun - Original Message - From: Petr Pikal [EMAIL PROTECTED] To: Sun [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Wednesday, October 27, 2004 4:14 AM Subject: Re: [R] ploting an ellipse keeps giving errors Hi Did you read what ellipse does and how it shall be used? And what about your system, R and ellipse version? From your example i got ellipse (center = c(0,0), shape = shape1, radius = 1) Error in ellipse.default(center = c(0, 0), shape = shape1, radius = 1) : Argument x is missing, with no default Ellipse is not for drawing arbitrary ellipses but has different usage. See its help page and an example. Cheers Petr On 27 Oct 2004 at 3:34, Sun wrote: library (ellipse) shape1 = c (1, 0, 0,1) dim(shape1) = c(2,2) ellipse (center = c(0,0), shape = shape1, radius = 1) = Error in plot.xy(xy.coords(x, y), type = type, col = col, lty = lty, ...) : plot.new has not been called yet It is really frustrating. Also what do the shape matrix, radius correspond to an ellipse function (x-x0)^2/a + (y-y0)^2/b = 1 ? Please advise! Many thanks, Sun [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Petr Pikal [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting
[R] How to use a matrix in pcurve?
Hi, Everyone, I want to calculate the principal curve of a points set. First I read the points'coordinate with function scan, then converted it to matrix with the function matrix, and fit the curve with function principal.curve. Here is my data in the file bmn007.data: 0.023603 -0.086540 -0.001533 0.024349 -0.083877 -0.001454 .. .. 0.025004 -0.083690 -0.001829 0.025562 -0.083877 -0.001857 0.026100 -0.083877 0.90 0.025965 -0.083877 0.002574 and the code as follow: pp - scan(bmn007.data, quiet= TRUE) x - matrix(pp, nc=2, byrow=TRUE) fit - principal.curve(x, plot = TRUE) points(fit,col=red) By now, I got a right result. But when i changed to use pcurve with matrix x as pcurve(x), an error was thrown as following: Estimating starting configuration using : CA Error in h %*% diag(sqrt(d)) : non-conformable arguments How to convert a matrix to the format could be accepted by pcurve? Any help appreciated! Regards - Sun __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] how to draw a multivariate function
Hi, Rusers: Thanks for answering my last questions. I am frustrated in plotting a trinomial pmf function f(x,y | n, pa, pb) = factorial(n)/ (factorial(x) * factorial(y) * factorial (n-x-y))* pa^x * pb^y * ((1-pa-pb)^(n-x-y)) obviously it is a bivariate function of x and y. But I have put a lot of time on this. ** x - seq(0, n, len = n/2+1) # for now I set it to n/2 to control x+y = n y - seq(0, n, len = n/2+1) f = factorial(n)/ (factorial(x) * factorial(y) * factorial (n-x-y))* pa^x * pb^y * ((1-pa-pb)^(n-x-y)) wireframe(f ~ x * y, shade = TRUE) ** well, but it plots nothing out. I wonder if you could help me? Seems R is hard to learn without your help. Many thanks, Sun [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to draw a multivariate function
Hi, All: Thanks. Here is the code n = 30 lamdaa = 4 lamdab = 1.5 pa = lamdaa/n pb = lamdab/n x - seq(0, n/2, len = n/2+1) y - seq(0, n/2, len = n/2+1) f = factorial(n)/ (factorial(x) * factorial(y) * factorial (n-x-y))* pa^x * pb^y * ((1-pa-pb)^(n-x-y)) wireframe(f ~ x * y, shade = TRUE) The above cannot show anything. Just le t you know that now I changed to cloud, it can display something :) cloud(f ~ x * y, shade = TRUE) I have questions: 1. what does x*y mean here? I don't think it is a vector dot multiplication. I guess it will creat all rows of x and y for all possible combinations? Why wireframe cannot show here? 2. How to show the value on the cloud plot? I have no idea of how much the data value is from the plot. 3. Where can I get resources of R? The help file seems not very helpful to me. For example, the lm () function, its weighted least square option does not say clearly the weight = standard deviation. It said it is to minimize sum w*error^2, which mislead us to think it takes variance. I have to ask experienced people. And everytime the answer depends on luck. Thanks, - Original Message - From: Andrew Ward [EMAIL PROTECTED] To: Sun [EMAIL PROTECTED] Sent: Saturday, October 16, 2004 10:15 PM Subject: RE: [R] how to draw a multivariate function Dear Sun, Could you please provide an example that can be run by readers of the list? What you've given is missing at least n and pa. Regards, Andrew C. Ward,[EMAIL PROTECTED] Senior Analyst (Quantitative), Tel: +61 7 3864 0439 Queensland Studies Authority, Fax: +61 7 3229 3318 295 Ann Street, Brisbane Qld 4000, Australia -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Sun Sent: Sunday, 17 October 2004 12:59 PM To: R-help mailing list Subject: [R] how to draw a multivariate function Hi, Rusers: Thanks for answering my last questions. I am frustrated in plotting a trinomial pmf function f(x,y | n, pa, pb) = factorial(n)/ (factorial(x) * factorial(y) * factorial (n-x-y))* pa^x * pb^y * ((1-pa-pb)^(n-x-y)) obviously it is a bivariate function of x and y. But I have put a lot of time on this. ** x - seq(0, n, len = n/2+1) # for now I set it to n/2 to control x+y = n y - seq(0, n, len = n/2+1) f = factorial(n)/ (factorial(x) * factorial(y) * factorial (n-x-y))* pa^x * pb^y * ((1-pa-pb)^(n-x-y)) wireframe(f ~ x * y, shade = TRUE) ** well, but it plots nothing out. I wonder if you could help me? Seems R is hard to learn without your help. Many thanks, Sun [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html ~~ This email (including any attached files) is for the intended recipient(s) only. If you received this email by mistake, please, as a courtesy, tell the sender, then delete this email. The views and opinions are the originator's and do not necessarily reflect those of the Queensland Studies Authority. All reasonable precautions have been taken to ensure that this email contained no viruses at the time it was sent. __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to draw a multivariate function
Thanks, Spencer. Turing it into a function seems really to make difference. Well but I don't know what is the difference between data1 = cbind (x, y, f(x,y)) cloud(data1) and cloud(f(x,y) ~ x *y)? They draw different plots. I don't know which one is correct. I am really confused. Here are codes: n = 30 lamdaa = 4 lamdab = 1.5 pa = lamdaa/n pb = lamdab/n x - seq(0, n/2, len = n/2+1) y - seq(0, n/2, len = n/2+1) f - function(x,y) (factorial(n)/ (factorial(x) * factorial(y) * factorial(n-x-y))* pa^x * pb^y * ((1-pa-pb)^(n-x-y))) data1 = cbind (x, y, f(x,y)) cloud(data1) cloud(f(x,y) ~ x *y) Thanks a lot - Original Message - From: Spencer Graves [EMAIL PROTECTED] To: Sun [EMAIL PROTECTED] Cc: R-help mailing list [EMAIL PROTECTED] Sent: Saturday, October 16, 2004 11:06 PM Subject: Re: [R] how to draw a multivariate function Have you studied the documentation for wireframe including the examples? See also the contour plot examples in Venables and Ripley (2002) Modern Applied Statistics with S, 4th ed. (Springer). It might help if you write your function as follows: f - function(x,y ,n, pa, pb) (factorial(n)/ (factorial(x) * factorial(y) * factorial(n-x-y))* pa^x * pb^y * ((1-pa-pb)^(n-x-y))) I hope this fills in enough gaps that you will be able to complete the remaining steps. Good luck. spencer graves Sun wrote: Hi, Rusers: Thanks for answering my last questions. I am frustrated in plotting a trinomial pmf function f(x,y | n, pa, pb) = factorial(n)/ (factorial(x) * factorial(y) * factorial (n-x-y))* pa^x * pb^y * ((1-pa-pb)^(n-x-y)) obviously it is a bivariate function of x and y. But I have put a lot of time on this. ** x - seq(0, n, len = n/2+1) # for now I set it to n/2 to control x+y = n y - seq(0, n, len = n/2+1) f = factorial(n)/ (factorial(x) * factorial(y) * factorial (n-x-y))* pa^x * pb^y * ((1-pa-pb)^(n-x-y)) wireframe(f ~ x * y, shade = TRUE) ** well, but it plots nothing out. I wonder if you could help me? Seems R is hard to learn without your help. Many thanks, Sun [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Spencer Graves, PhD, Senior Development Engineer O: (408)938-4420; mobile: (408)655-4567 __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to draw a multivariate function
Hi, Thanks. But minimizing 'sum(w*e^2)' means w is the variance instead of the standard deviation. However, the truth is that R takes standard deviation. R will square it! R-help document is not that to be proud of. It is not very clear or helpful sometimes. - Original Message - From: Deepayan Sarkar [EMAIL PROTECTED] To: [EMAIL PROTECTED] Cc: Sun [EMAIL PROTECTED]; Andrew Ward [EMAIL PROTECTED] Sent: Sunday, October 17, 2004 12:04 AM Subject: Re: [R] how to draw a multivariate function On Saturday 16 October 2004 23:12, Sun wrote: Hi, All: Thanks. Here is the code n = 30 lamdaa = 4 lamdab = 1.5 pa = lamdaa/n pb = lamdab/n x - seq(0, n/2, len = n/2+1) y - seq(0, n/2, len = n/2+1) Have you looked at what values of x and y these produce? They include non-integer values. Are you sure you want that? f = factorial(n)/ (factorial(x) * factorial(y) * factorial (n-x-y))* pa^x * pb^y * ((1-pa-pb)^(n-x-y)) wireframe(f ~ x * y, shade = TRUE) The above cannot show anything. Just le t you know that now I changed to cloud, it can display something :) cloud(f ~ x * y, shade = TRUE) I have questions: 1. what does x*y mean here? I don't think it is a vector dot multiplication. I guess it will creat all rows of x and y for all possible combinations? Why wireframe cannot show here? Why guess instead of reading the documentation and looking at the examples? There's a very relevant example in the help page for wireframe. You clearly want to evaluate 'f' at all combinations of x and y, yet you seem to be evaluating it only along the diagonal (x = y). The correct way to do this is (as studying the examples should have suggested to you): g - expand.grid(x = seq(0, n), y = seq(0, n)) g$z - dtrinom(g$x, g$y) wireframe(z ~ x * y, data = g, shade = TRUE) where dtrinom could be defined as dtrinom - function(x, y) { ifelse(x + y n, NA, factorial(n)/ (factorial(x) * factorial(y) * factorial (n-x-y))* pa^x * pb^y * ((1-pa-pb)^(n-x-y))) } although I would suggest working on the log scale for numerical stability: dtrinom - function(x, y) { ifelse(x + y n, NA, exp((lfactorial(n) - lfactorial(x) - lfactorial(y) - lfactorial(n-x-y)) + x * log(pa) + y * log(pb) + (n-x-y) * log(1-pa-pb))) } 2. How to show the value on the cloud plot? I have no idea of how much the data value is from the plot. Read the documentation for the 'scales' argument. 3. Where can I get resources of R? The help file seems not very helpful to me. For example, the lm () function, its weighted least square option does not say clearly the weight = standard deviation. It said it is to minimize sum w*error^2, which mislead us to think it takes variance. I have to ask experienced people. And everytime the answer depends on luck. It's too bad you feel that way. Statistics, and in particular linear modeling, is a non-trivial subject, and R documentation is not supposed to serve as a textbook. If you don't understand what minimizing 'sum(w*e^2)' means, you really do need help from 'experienced people'. Alternatively, look at the references listed in the help page for lm. Hope that helps, Deepayan Thanks, - Original Message - From: Andrew Ward [EMAIL PROTECTED] To: Sun [EMAIL PROTECTED] Sent: Saturday, October 16, 2004 10:15 PM Subject: RE: [R] how to draw a multivariate function Dear Sun, Could you please provide an example that can be run by readers of the list? What you've given is missing at least n and pa. __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] 20,000 * 6 data values
Hello, Rusers: What is the maximum number of data R can handle? Or I have to use SAS? I am trying to do some microarray data analysis. But I am totally new. Did anyone use R to do microarray analysis? Many thanks, Sun __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] KalmanLike: missing exogenous factor?
Prof Ripley, Thanks for explanation. I now understand where KalmanLike fits. I should not use exogenous factor. It should be called exogenous variable or inputs or known effects. My study on how trading sizes impact on stock prices has trading sizes as this exogenous variable. As you said, this should belong to some package. I did internet searches and found something similar but not covering my case. The restrictive access to web makes subscription from my work place not convenient. Sorry. Heng Sun Senior Quantitative Analyst Depository Trust and Clearing Corporation New York, USA 10041 Tel: 212-755-5754 --- Prof Brian Ripley [EMAIL PROTECTED] wrote: On Mon, 11 Oct 2004, Heng Sun wrote: From the help document on KalmanLike, KalmanRun, etc., I see the linear Gaussian state space model is a - T a + R e y = Z' a + eta following the book of Durbin and Koopman. In practice, it is useful to run Kalman filtering/smoothing/forecasting with exogenous factor: a - T a + L b + R e y = Z' a + M b + eta where b is some known vector (a function of time). Some other software like S-plus and Mathematica include the above exogenous factor. SsfPack by Koopman, etal. also has the factor built in the model to accommodate practical uses. So what is the rationale for R to leave off the exogenous factor? Is there a feasible way to convert the general model to the simple model in R? What is the rationale for your raising this? KalmanLike, KalmanRun, etc were written for R 1.5.0 as part of the ts package (see my article in R-news), and the ts applications (see the See Also section) do not need a so-called `exogenous factor' (which is not a `factor'). R does not pretend to have facilities for whatever subject area you mean (but do not say) by `in practice'. That's what addon packages are for (and some do touch on this area). We have no idea who [EMAIL PROTECTED]' is: it is courteous to use a signature and give your affiliation. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] KalmanLike: missing exogenous factor?
From the help document on KalmanLike, KalmanRun, etc., I see the linear Gaussian state space model is a - T a + R e y = Z' a + eta following the book of Durbin and Koopman. In practice, it is useful to run Kalman filtering/smoothing/forecasting with exogenous factor: a - T a + L b + R e y = Z' a + M b + eta where b is some known vector (a function of time). Some other software like S-plus and Mathematica include the above exogenous factor. SsfPack by Koopman, etal. also has the factor built in the model to accommodate practical uses. So what is the rationale for R to leave off the exogenous factor? Is there a feasible way to convert the general model to the simple model in R? Thanks, Heng Sun ___ Declare Yourself - Register online to vote today! __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] how to draw an overlaid plot for multiple curves using different symbols for each curve?
Hello, Rusers: I have a question to ask for your help. The data has three columns: a, b and c. We need to draw an overlaid plot of curves of a versus b for different c. That is, draw many curves in one plot and each curve uses a different symbol. Does R have such functions? I have searched for a while. But seems not. I am new to R. Your help is highly appreciated. Many thanks. Looking forward to your response, Sun __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to draw an overlaid plot for multiple curves usingdifferent symbols for each curve?
Hi, Thanks all (Andrew, Peter, Sean)! By the way, I looked at the xyplot in lattice library. It can draw a versus b, conditioned on c. But it draws curves in different panels and there seems no control on symbols. I will try points. Many thanks again, Sun - Original Message - From: Sean Davis [EMAIL PROTECTED] To: Sun [EMAIL PROTECTED]; R User-Liste [EMAIL PROTECTED] Sent: Sunday, September 19, 2004 6:49 PM Subject: Re: [R] how to draw an overlaid plot for multiple curves usingdifferent symbols for each curve? You might look at library(lattice) for some relatively natural solutions for your problem. Sean - Original Message - From: Sun [EMAIL PROTECTED] To: R User-Liste [EMAIL PROTECTED] Sent: Sunday, September 19, 2004 6:02 PM Subject: [R] how to draw an overlaid plot for multiple curves usingdifferent symbols for each curve? Hello, Rusers: I have a question to ask for your help. The data has three columns: a, b and c. We need to draw an overlaid plot of curves of a versus b for different c. That is, draw many curves in one plot and each curve uses a different symbol. Does R have such functions? I have searched for a while. But seems not. I am new to R. Your help is highly appreciated. Many thanks. Looking forward to your response, Sun __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] help for MLE
Hi, when I write the likelihood function as fn-function(x) -50*log(2*pi)-100*log(sigma)-(1/2*(sum((x-mu)/sigma)^2)) then what should I do since it shows that Error in log(sigma) : Object sigma not found. Thanks edward From: Edward Sun Dear Sir/Madam, I am using R version 1.8.1. I am doing following tast: First generate 100 Gaussion(3,1) numbers, then write the likelihood function to estimate the parameters of Gaussian distribution by direct maximizing the likelihood function. My likelihood function is: fn-function(x) (-50*log((sd(x))^2))-50*log(sqrt(2*pi))-(1/2*((mean(x))^2))*( sum((x-(mean(x))^2)) After I input above function, the '' '' turned to be '' + ''. and it did not work. I am looking for the help to solve this tast by writting a likelihood function. Thanks a lot. Best regards. edward sun -- Notice: This e-mail message, together with any attachments, contains information of Merck Co., Inc. (One Merck Drive, Whitehouse Station, New Jersey, USA 08889), and/or its affiliates (which may be known outside the United States as Merck Frosst, Merck Sharp Dohme or MSD and in Japan as Banyu) that may be confidential, proprietary copyrighted and/or legally privileged. It is intended solely for the use of the individual or entity named on this message. If you are not the intended recipient, and have received this message in error, please notify us immediately by reply e-mail and then delete it from your system. -- __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] help for MLE
Dear Sir/Madam, I am using R version 1.8.1. I am doing following tast: First generate 100 Gaussion(3,1) numbers, then write the likelihood function to estimate the parameters of Gaussian distribution by direct maximizing the likelihood function. My likelihood function is: fn-function(x) (-50*log((sd(x))^2))-50*log(sqrt(2*pi))-(1/2*((mean(x))^2))*(sum((x-(mean(x))^2)) After I input above function, the '' '' turned to be '' + ''. and it did not work. I am looking for the help to solve this tast by writting a likelihood function. Thanks a lot. Best regards. edward sun __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] help for MLE
Dear Sir/Madam, I am using R version 1.8.1. I am doing following tast: First generate 100 Gaussion(3,1) numbers, then write the likelihood function to estimate the parameters of Gaussian distribution by direct maximizing the likelihood function. My likelihood function is: fn-function(x) (-50*log((sd(x))^2))-50*log(sqrt(2*pi))-(1/2*((mean(x))^2))*(sum((x-(mean(x))^2)) After I input above function, the '' '' turned to be '' + ''. and it did not work. I am looking for the help to solve this tast by writting a likelihood function. Thanks a lot. Best regards. edward sun p.s x=rnorm(100, mean=3, sd=1) library(MASS) fitdistr(x, normal) and fitdistr(x, normal, list(mean=0, sd=1)) do not work. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Maximum likelihood estimation in R
Dear Sir, I am a new user of R and I am doing a tast, which is: find the maximum likelihood estimate of the parameter of Gaussian distribution for generated 100 numbers by using x=rnorm(100, mean=3, sd=1). I tried to use following Maximum Likelihood function fn-function(x) (-50*log((sd(x))^2))-50*log(sqrt(2*pi))-(1/2*((mean(x))^2))*(sum((x-(mean(x))^2)), but it did not work. I am looking for the complete syntax to finish my target task. Thanks for your help. Best regards. Edward Sun Germany __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] for help about MLE in R
Dear Sir, I am using R to estimate two parameters in Normal distribution. I generated 100 normal distributed numbers, on which to estimate the parameter. The syntax is: fn-function(x)-50*log((y)^2)+50*log(2*pi)-(1/2*(z^2))*(sum((x-y)^2)) out-nlm(fn, x, hessian=TRUE) but it does not work. Could you please help me to compose the syntax for the purpose that find maximum likelihood estimates of the generated random numbers by direct maximization of the likelihood function? Thanks a lot. Best regards Edward Sun __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Message (The distribution of your message dated Tue, 27...)
The distribution of your message dated Tue, 27 Jan 2004 09:41:55 -0300 with subject test has been postponed because the DOCS list is held. No action is required from you; your message will be reprocessed automatically once the list owner releases the list. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] about parameter fitting of Gld(Generalized Lambda Distribution)
Currently, I am intrested in parameter fitting of Generalized Lambda Distribution.And I have found two packages in R related to Gld,Davies and gld. What's a pity that no method in Davies deals with fitting of gld,and starship used in package:gld is quite time-consuming when sample size is large. So i wonder if there is method of moments or least square implementation available at now. or if u know the method of moments(or moment matching), i think u must know how to solve a optimization problem of two parameters,involing multiple beta functions.plz give me some hints. Thanks in advance. Regards, Jean Sun __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: Re: [R] what does the sum of square of Gaussian RVs with differentvariance obey?
Thanks a lot. It does work. The fitted data match the simulated ones well. Even no need the shifted or scaled version of Chi-squared pdf. Also, I have tested the case of non-independent RVs,generated by linear combining of independent Gaussian RVs,the result is satisfactory too. Regards, J.Sun 2003-09-23 07:07:00 Thomas Lumley wrote On Tue, 23 Sep 2003, Jean Sun wrote: From basic statistics principle,we know,given several i.i.d Gaussian RVs with zero or nonzero mean,the sum of square of them is a central or noncentral Chi-distributed RV.However if these Gaussian RVs have different variances,what does the sum of square of them obey? Nothing very useful. It's a mixture of chisquare(1) variables. One standard approach is to approximate it by a multiple of a chisquared distribution that has the correct mean and variance. -thomas __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] what does the sum of square of Gaussian RVs with different variance obey?
From basic statistics principle,we know,given several i.i.d Gaussian RVs with zero or nonzero mean,the sum of square of them is a central or noncentral Chi-distributed RV.However if these Gaussian RVs have different variances,what does the sum of square of them obey? Thanks in advance. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help