Re: [R] Weibull distribution
Thanks for the suggestion! I switched to optimize():

al <- optimize(f.fn, lower = 0.1, upper = 100, tol = 0.001)

The warnings are gone and it runs stably. But when I tried uniroot():

al <- uniroot(f.fn, lower = 0.1, upper = 100, tol = 0.001)

an error occurred: "f() values at end points not of opposite sign". To me the error reads as if no root was found within the interval. I was not able to solve this problem. Thanks!

Leaf

----- Original Message -----
From: Thomas Lumley, [EMAIL PROTECTED]
Sent: 2006-07-21, 09:35:11
To: Valentin Dimitrov, [EMAIL PROTECTED]
Subject: Re: [R] Weibull distribution

On Fri, 21 Jul 2006, Valentin Dimitrov wrote:

> Dear Leaf,
>
> I modified your code as follows:
>
> gamma.fun <- function(mu,sd,start=100) {
>   f.fn <- function(alpha) {abs(sd^2-mu^2/(gamma(1+1/alpha))^2*(gamma(1+2/alpha)-(gamma(1+1/alpha))^2))}
>   alpha <- optim(start, f.fn)
>   beta <- mu/gamma(1+1/alpha$par)
>   return(list=c(a=alpha$par,b=beta));
> }
>
> Now it works properly. First, I added an abs(). You tried to solve an equation by means of the R function optim(), which finds a minimum. That's why you can find the solution of f(x)=a through minimization of abs(f(x)-a). Second, I deleted the method BFGS from the optim() call, because it is not appropriate in this case.

optim() is not appropriate at all in this case -- its help page says to use optimize() for one-dimensional problems. In fact, in one dimension there isn't any need to resort to optimization when you really want root-finding, and uniroot() is more appropriate than optimize().

-thomas

[[alternative HTML version deleted]]

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
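A minimal sketch of the uniroot() route suggested above; it is not code from the thread. It assumes the standard two-parameter Weibull moments, mean = b*gamma(1 + 1/a) and variance = b^2*(gamma(1 + 2/a) - gamma(1 + 1/a)^2), so that the squared coefficient of variation depends on the shape a alone. The function name weibull_from_moments and the search interval are illustrative choices. If f.fn still wraps the difference in abs(), as in the modified code, it can never be negative, which would be one likely reason for the "not of opposite sign" error; the sketch uses the signed difference instead.

```r
# Hypothetical helper: Weibull shape (a) and scale (b) from a mean and sd.
weibull_from_moments <- function(mu, sd, lower = 0.1, upper = 100) {
  cv2 <- (sd / mu)^2  # target squared coefficient of variation
  # Signed difference between the Weibull CV^2 at shape a and the target;
  # lgamma() avoids overflow of gamma() for small a.
  f <- function(a) exp(lgamma(1 + 2 / a) - 2 * lgamma(1 + 1 / a)) - 1 - cv2
  a <- uniroot(f, lower = lower, upper = upper, tol = 1e-10)$root
  b <- mu / gamma(1 + 1 / a)  # the scale then follows from the mean
  c(a = a, b = b)
}

p <- weibull_from_moments(mu = 3, sd = 2)

# Check: recompute the mean and sd from the fitted parameters
m <- p[["b"]] * gamma(1 + 1 / p[["a"]])
s <- p[["b"]] * sqrt(gamma(1 + 2 / p[["a"]]) - gamma(1 + 1 / p[["a"]])^2)
print(p)
print(c(mean = m, sd = s))
```

Because the shape is found from the ratio sd/mu, the result changes with sd, unlike the optim() attempts described in the original question.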
Re: [R] Weibull distribution
Hi William,

Thanks a lot for your response. I checked the package and found that what I want to solve is the opposite problem, that is, going from the mean and sd to the shape and scale parameters. Could anyone give some hints, please? Any suggestion would be appreciated!

Leaf

----- Original Message -----
From: William Asquith, [EMAIL PROTECTED]
Sent: 2006-07-17, 16:18:31
To: Leaf Sun, [EMAIL PROTECTED]
Subject: Re: [R] Weibull distribution

I do not have an answer per se, but if you are seeking some comparisons, try the three-parameter Weibull as implemented by the lmomco package.

William

On Jul 17, 2006, at 1:18 PM, Leaf Sun wrote:

> Hi all,
>
> By its definition, the mean and variance of the two-parameter Weibull distribution are: (www.wikipedia.org)
>
> I was wondering whether, given the mean and sd, we could parameterize the distribution. I tried this in R:
>
> gamma.fun <- function(mu,sd,start=100) {
>   f.fn <- function(alpha) sd^2-mu^2/(gamma(1+1/alpha))^2*(gamma(1+2/alpha)-(gamma(1+1/alpha))^2)
>   alpha <- optim(start, f.fn,method='BFGS')
>   beta <- mu/gamma(1+1/alpha$par)
>   return(list=c(a=alpha$par,b=beta));
> }
>
> But two problems come up:
>
> 1) The returned values of a and b depend only on the input mean and have nothing to do with the sd. For instance, with a mean mu = 3, whether I use sd=2 or sd=4, the function returns the same scale and shape values.
>
> gamma.fun(3,4,10);
>        a        b
> 5.112554 3.263178
> gamma.fun(3,2,10);
>        a        b
> 5.112554 3.263178
>
> 2) The start value determines the results: if I apply mean = 3 and sd=2 with a start of 10, it returns alpha close to 10; if I use a start = 100, it returns alpha close to 100.
>
> gamma.fun(3,2,10);
>        a        b
> 5.112554 3.263178
> gamma.fun(3,2,100);
>     a        b
> 99.71 3.017120
>
> Since I am not a statistician, I guess there must be something theoretically wrong with this question. So I am looking forward to some correction and advice to solve these. Thanks a lot in advance!
>
> Leaf
[R] Weibull distribution
Hi all,

By its definition, the mean and variance of the two-parameter Weibull distribution are: (www.wikipedia.org)

I was wondering whether, given the mean and sd, we could parameterize the distribution. I tried this in R:

gamma.fun <- function(mu,sd,start=100) {
  f.fn <- function(alpha) sd^2-mu^2/(gamma(1+1/alpha))^2*(gamma(1+2/alpha)-(gamma(1+1/alpha))^2)
  alpha <- optim(start, f.fn,method='BFGS')
  beta <- mu/gamma(1+1/alpha$par)
  return(list=c(a=alpha$par,b=beta));
}

But two problems come up:

1) The returned values of a and b depend only on the input mean and have nothing to do with the sd. For instance, with a mean mu = 3, whether I use sd=2 or sd=4, the function returns the same scale and shape values.

gamma.fun(3,4,10);
       a        b
5.112554 3.263178
gamma.fun(3,2,10);
       a        b
5.112554 3.263178

2) The start value determines the results: if I apply mean = 3 and sd=2 with a start of 10, it returns alpha close to 10; if I use a start = 100, it returns alpha close to 100.

gamma.fun(3,2,10);
       a        b
5.112554 3.263178
gamma.fun(3,2,100);
    a        b
99.71 3.017120

Since I am not a statistician, I guess there must be something theoretically wrong with this question. So I am looking forward to some correction and advice to solve these. Thanks a lot in advance!

Leaf
Re: [R] number of iterations exceeded maximum of 50
Thanks to Douglas and all the others who responded. I applied

nls(y ~ a*x^b, start = list(a = a1, b = b1), control = list(maxiter = 500), trace = TRUE)

to increase the number of iterations, and it was successful. Douglas's suggestion to plot the data and then trace the optimization was correct: I found that when I gave b1 the wrong sign (say, negative when it should be positive), nls() would never reach convergence. Thanks for the nice suggestions!

Leaf

>> Sorry, I thought it was a straightforward question, but I was stuck on it. I used nls() to estimate a and b in this function:
>>
>> nls(y ~ a*x^b, start = list(a = a1, b = b1))
>>
>> It seems the start list I gave could not reach convergence, and it gave the note: number of iterations exceeded maximum of 50. Then I put nls.control(maxiter = 50, tol = 1e-05, minFactor = 1/1024) in nls(...) and changed the argument to maxiter = 500. But it worked out the same way and noted: number of iterations exceeded maximum of 50. I have totally no idea how to set this parameter MAXITER. Thanks for any information!

> I think you are assuming that values passed to nls.control are persistent and will apply to further calls to nls. They don't. If you want to increase the maximum number of iterations you do it as
>
> nls(y ~ a*x^b, start = list(a = a1, b = b1), control = list(maxiter = 500))
>
> but I would suggest that you also use trace = TRUE in the call to nls so you can see where the iterations are going. Merely increasing the number of iterations for an optimization that has gone into never-never land isn't going to help it converge.
>
> Two other things to consider: this is a partially linear model in that the parameter `a' appears linearly in the model expression. You may be able to stabilize the iterations using
>
> nls(y ~ x^b, start = list(b = b1), control = list(maxiter = 500), trace = TRUE, alg = 'plinear')
>
> Finally, and most important, please plot the data before trying to fit a nonlinear model to it so you can see if it has the characteristics that you would expect from data generated by such a model. As Brian Joiner said, "Regression without plots is truly a regression."

>> Leaf

>>>> Hi all,
>>>>
>>>> I found that the R site search has not been working for me these days. When I was running nls(), there was an error: number of iterations exceeded maximum of 50. I set the number in nls.control, which is supposed to control the number of iterations, but it didn't work well. Could anybody with this experience tell me how to fix it? Thanks in advance!

>>> We cannot make suggestions unless you tell us what you tried yourself. If possible, please give a reproducible example.
>>>
>>> Uwe Ligges

>>>> Leaf
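A minimal sketch of the two nls() calls discussed above, run on simulated data; the data, the seed and the log-log starting values are assumptions made for illustration and are not from the thread. It shows maxiter being passed in the same call through control, and the 'plinear' variant in which only b needs a starting value.

```r
set.seed(1)
x <- seq(1, 10, length.out = 50)
y <- 2 * x^1.5 + rnorm(50, sd = 0.1)  # simulated data, true a = 2 and b = 1.5

# Starting values from a log-log linear fit (valid here because every y > 0)
ll <- coef(lm(log(y) ~ log(x)))
start <- list(a = exp(ll[[1]]), b = ll[[2]])

# maxiter is set in this call; it does not persist from an earlier
# nls.control() call
fit <- nls(y ~ a * x^b, start = start,
           control = nls.control(maxiter = 500))

# Partially linear alternative: a enters linearly, so only b is started
fit_pl <- nls(y ~ x^b, start = list(b = ll[[2]]), algorithm = "plinear",
              control = nls.control(maxiter = 500))

print(coef(fit))
print(coef(fit_pl))
```

Adding trace = TRUE to either call prints the iterations, which is the quickest way to see whether a bad starting value is sending the fit in the wrong direction.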
Re: [R] number of iterations exceeded maximum of 50
Sorry, I thought it was a straightforward question, but I was stuck on it. I used nls() to estimate a and b in this function:

nls(y ~ a*x^b, start = list(a = a1, b = b1))

It seems the start list I gave could not reach convergence, and it gave the note: number of iterations exceeded maximum of 50. Then I put nls.control(maxiter = 50, tol = 1e-05, minFactor = 1/1024) in nls(...) and changed the argument to maxiter = 500. But it worked out the same way and noted: number of iterations exceeded maximum of 50. I have totally no idea how to set this parameter MAXITER. Thanks for any information!

Leaf

>> Hi all,
>>
>> I found that the R site search has not been working for me these days. When I was running nls(), there was an error: number of iterations exceeded maximum of 50. I set the number in nls.control, which is supposed to control the number of iterations, but it didn't work well. Could anybody with this experience tell me how to fix it? Thanks in advance!

> We cannot make suggestions unless you tell us what you tried yourself. If possible, please give a reproducible example.
>
> Uwe Ligges

>> Leaf
[R] number of iterations exceeded maximum of 50
Hi all,

I found that the R site search has not been working for me these days. When I was running nls(), there was an error: number of iterations exceeded maximum of 50. I set the number in nls.control, which is supposed to control the number of iterations, but it didn't work well. Could anybody with this experience tell me how to fix it? Thanks in advance!

Leaf
[R] how to get the row name?
Hi R-listers,

I have a simple question about a data frame. I sorted a data set by one of the variables under some condition (e.g. X=0); part of the result follows. I was wondering how I can get the row names, i.e. (1202, 2077, 2328, 3341, ...), and save them as a vector. Thanks!

       Tag Species     X     Y Dbh3 Recr4 mort slope elevation aspect    SA   SR     dist1     dist2     dist3
1202 19103     316 856.0 430.3   21     4    1  9.87    151.42  60.08 25.38 1.02 0.2236068 0.7211103 1.3601471
2077 29893     316 935.4 482.7   28     4    1  5.66    137.28  13.86 25.14 1.01 0.6403124 0.8944272 1.0630146
2328 32989     316 910.7 301.5   12     4    1  8.07    137.69  86.16 25.26 1.01 0.300     1.2806248 1.3038405
3341 45198     316 975.2   2.4  144     4    1  2.95    121.10 173.60  0.00 0.00 0.5656854 1.2727922 1.3416408
...

Regards,
Leaf
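One way this could be done, sketched on a made-up data frame: the values, the extra row and the Species == 316 condition below are placeholders, and only the row-name labels echo the post. rownames() returns the labels as a character vector, and as.integer() converts them when they are numeric.

```r
# Toy data frame with explicit row names (values are placeholders)
trees <- data.frame(Tag = c(19103, 11111, 29893, 32989),
                    Species = c(316, 100, 316, 316),
                    row.names = c("1202", "1500", "2077", "2328"))

# Subsetting keeps the original row names
sub <- trees[trees$Species == 316, ]

ids <- rownames(sub)        # character vector of row names
ids_num <- as.integer(ids)  # numeric version, if the names are numbers
print(ids)
print(ids_num)
```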
[R] read.list()
Hi all,

I need to write and read a list in R. I did r.site.search and found that there is a package, rmutil, that does this; unfortunately it is not on the list of packages. In other words, I can't install it from any CRAN mirror. Does anybody have an idea about this, or any suggestion about the list? Thanks!

Best!
Leaf
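A minimal sketch using base R only, offered as one possible answer rather than as what rmutil's read.list() does. The list contents and temporary file names are made up. saveRDS()/readRDS() store a single object in binary form; dput()/dget() store it as readable R syntax.

```r
plots <- list(id = 1:3, name = "plot A", ph = c(5.5, 6.25))  # made-up list

# Binary round trip
f_rds <- tempfile(fileext = ".rds")
saveRDS(plots, f_rds)
from_rds <- readRDS(f_rds)

# Plain-text round trip (the file holds R code that recreates the list)
f_txt <- tempfile(fileext = ".R")
dput(plots, file = f_txt)
from_txt <- dget(f_txt)

str(from_rds)
str(from_txt)
```

The text form is convenient for inspection, but it may round long decimals, so the binary form is the safer choice when exact values matter.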
Re: [R] Point pattern to grid
Hi Roger,

Thanks again for your kind help. Yes, I am still using the 200K-point data with this program, and the good thing is that it finished in no time. My questions now are:

1) try0 <- lapply(split(as(df1, "data.frame"), res), mean)
When I tried to replace mean with sum, the error looked like this:
Error in [EMAIL PROTECTED], i, drop = FALSE] : undefined columns selected

2) If I just need to know the number of points in each cell, how can I modify the code? The code is still a bit beyond me.

Thanks!

Leaf

=== At 2005-11-18, 01:39:05 you wrote: ===

On Thu, 17 Nov 2005, Leaf Sun wrote:

> Dear all,
>
> I'd like to change a point pattern to a grid of cells and use one of the variables as the output. E.g. the point pattern is in a window of (500*500) and has several features such as pH, SoilType etc. I would like to divide it into a grid with cell size 5*5, and use the mean of the point values falling inside each cell as the output. Is there any package in R that does this? Thanks in advance!

This might have been better posted on R-sig-geo. Try this:

library(sp)
df1 <- data.frame(x=runif(1,0,500), y=runif(1,0,500), z=rnorm(1))
coordinates(df1) <- c("x", "y")
summary(df1) # SpatialPointsDataFrame
grd <- GridTopology(c(2.5,2.5), c(5,5), c(100,100))
sgrd <- SpatialGrid(grd) #SpatialGrid
bbox(sgrd)
res <- overlay(sgrd, df1)
# find which grid cells the points are in
str(res)
try0 <- lapply(split(as(df1, "data.frame"), res), mean)
# take means by grid cell - assumes all numeric columns in df1
# (soil type??) - maybe write a custom function to handle non-numeric
# columns sensibly
try01 <- vector(mode="list", length=prod(slot(slot(sgrd, "grid"), "cells.dim")))
nafill <- rep(as.numeric(NA), ncol(as(df1, "data.frame")))
try01 <- lapply(try01, function(x) nafill)
# make a container to put the means in with the right number of columns
try01[as.integer(names(try0))] <- try0
# insert means into correct list elements
try1 <- data.frame(t(data.frame(try01)))
# transpose
summary(try1)
sgrd1 <- SpatialGridDataFrame(slot(sgrd, "grid"), try1)
image(sgrd1, "x")
image(sgrd1, "y")
image(sgrd1, "z")

It goes a bit further than the short description of the sp package in the latest R-News, and will most likely be a new method for overlay in sp. If these are your 200K points, it may take a little longer ...

> Cheers,
> Leaf

--
Roger Bivand
Economic Geography Section, Department of Economics,
Norwegian School of Economics and Business Administration,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]
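For the two follow-up questions (counts and sums per cell), a base-R sketch that does not use sp at all; it is an alternative under stated assumptions, not a modification of Roger's code. The simulated points, the seed and the variable names are made up; the 500*500 window and 5*5 cells come from the original question. As a possible (unverified) explanation of the error, sum() applied to a whole data frame returns a single number rather than one value per column, so the later steps would no longer see the columns they expect.

```r
set.seed(42)
n <- 1000
pts <- data.frame(x = runif(n, 0, 500), y = runif(n, 0, 500), z = rnorm(n))

cell <- 5             # cell size
ncell <- 500 / cell   # 100 cells along each axis

# Cell index along each axis, 1..100; pmin() keeps a point lying exactly
# on the upper edge (500) in the last cell
ix <- factor(pmin(floor(pts$x / cell) + 1, ncell), levels = 1:ncell)
iy <- factor(pmin(floor(pts$y / cell) + 1, ncell), levels = 1:ncell)

counts <- table(ix, iy)                      # points per cell, 0 where empty
sums   <- tapply(pts$z, list(ix, iy), sum)   # NA where a cell is empty
means  <- tapply(pts$z, list(ix, iy), mean)  # NA where a cell is empty

print(dim(counts))
print(range(counts))
```

Each result is a 100 by 100 matrix indexed by column cell and row cell, so image(means) would give a quick picture of the gridded variable.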
[R] Point pattern to grid
Dear all,

I'd like to change a point pattern to a grid of cells and use one of the variables as the output. E.g. the point pattern is in a window of (500*500) and has several features such as pH, SoilType etc. I would like to divide it into a grid with cell size 5*5, and use the mean of the point values falling inside each cell as the output. Is there any package in R that does this? Thanks in advance!

Cheers,
Leaf
Re: [R] Variogram
Sorry about this, I didn't know. I guess I have posted too much garbage on here. Thanks to Edzer for your answers!

Leaf

=== At 2005-11-09, 02:14:21 you wrote: ===

Leaf,

please note that r-help is not the appropriate place to ask package-specific questions. We have r-sig-geo for questions related to geographic data in R, and gstat has a mailing list of its own. The answer is below.
--
Edzer

Leaf wrote:

> Dear All,
>
> Does anybody have experience in using variogram() (gstat)? Please kindly give me some hints about the results. I used variogram() to build a semivariogram plot as:
>
> tr.var=variogram(Incr~1,loc=~X+Y,data=TRI2TU,width=5)
>
> then fit the variogram to get the parameters as:
>
> v.fit = fit.variogram(tr.var,vgm(0.5,"Exp",300,1))
> v.fit
>   model    psill    range
> 1   Nug 1.484879  0.00000
> 2   Exp 3.476700 29.70914
>
> This is the output of v.fit. Can anybody help me write the exponential formula for this variogram? I have a problem understanding the result. BTW

The equation you're looking for is:

if h = 0, gamma(h) = 0
if h > 0, gamma(h) = 1.484879 + 3.4767 (1 - exp(-h/29.70914))
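The equation above, written as a small R function so it can be evaluated at any lag. The three constants are copied from the fit.variogram() output quoted in the message; the function name gamma_exp and the example lags are illustrative.

```r
# Fitted values copied from the quoted fit.variogram() output
nugget <- 1.484879
psill  <- 3.476700
rng    <- 29.70914

# Exponential semivariogram with nugget; gamma(0) is 0 by definition
gamma_exp <- function(h) {
  ifelse(h == 0, 0, nugget + psill * (1 - exp(-h / rng)))
}

h <- c(0, 5, 29.70914, 100, 1000)
print(data.frame(h = h, gamma = gamma_exp(h)))
```

For large h the value approaches nugget + psill, which is the total sill of the fitted model.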
[R] OLS variables
Dear all,

Is there any simple way in R to put all the interactions of the variables into an OLS model? E.g. I have a bunch of variables, x1, x2, ..., x20. I expect them to have interactions (e.g. x1*x2, x3*x4*x5, ...) in some combinations (2-way or higher). Is there any way that I can write the model more simply? Thanks!

Leaf
Re: [R] OLS variables
Thanks for the information!

Leaf

=== At 2005-11-06, 11:07:31 you wrote: ===

IMHO, the details section of help(formula) provides a nicer help.

Regards, Adai

On Sun, 2005-11-06 at 08:27 -0500, John Fox wrote:

> Dear Leaf,
>
> I assume that you're using lm() to fit the model, and that you don't really want *all* of the interactions among 20 predictors: You'd need quite a lot of data to fit a model with 2^20 terms in it, and might have trouble interpreting the results.
>
> If you know which interactions you're looking for, then why not specify them directly, as in lm(y ~ x1*x2 + x3*x4*x5 + etc.)? On the other hand, if you want to include all interactions, say, up to three-way, and you've put the variables in a data frame, then lm(y ~ .^3, data=DataFrame) will do it. There are many terms in this model, however, if not quite 2^20.
>
> The introductory manual that comes with R has information on model formulas in Section 11.
>
> I hope this helps,
> John
>
> John Fox
> Department of Sociology
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> 905-525-9140x23604
> http://socserv.mcmaster.ca/jfox
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Leaf Sun
> Sent: Sunday, November 06, 2005 3:11 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] OLS variables
>
> Dear all,
>
> Is there any simple way in R to put all the interactions of the variables into an OLS model? E.g. I have a bunch of variables, x1, x2, ..., x20. I expect them to have interactions (e.g. x1*x2, x3*x4*x5, ...) in some combinations (2-way or higher). Is there any way that I can write the model more simply? Thanks!
>
> Leaf

--
Adaikalavan Ramasamy              [EMAIL PROTECTED]
Centre for Statistics in Medicine http://www.ihs.ox.ac.uk/csm/
Wolfson College Annexe            Tel : 01865 284 408
Linton Road, Oxford OX2 6UD       Fax : 01865 284 424
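A short sketch of the formula shortcuts mentioned in John Fox's reply, on simulated data with four predictors; the data frame d, the seed and the coefficients used to generate y are made up. In a formula, x1*x2 expands to x1 + x2 + x1:x2, and .^k expands to all main effects plus interactions up to order k among the other columns of the data frame.

```r
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100),
                x3 = rnorm(100), x4 = rnorm(100))
d$y <- 1 + d$x1 + 0.5 * d$x1 * d$x2 + rnorm(100)

fit_named <- lm(y ~ x1 * x2 + x3 * x4, data = d)  # selected interactions only
fit_2way  <- lm(y ~ .^2, data = d)  # main effects + all two-way interactions
fit_3way  <- lm(y ~ .^3, data = d)  # ... + all three-way interactions

# Number of coefficients, intercept included
print(c(named = length(coef(fit_named)),
        two_way = length(coef(fit_2way)),
        three_way = length(coef(fit_3way))))
```

With 20 predictors the same .^3 formula already produces 1 + 20 + 190 + 1140 coefficients, which illustrates the warning about needing a lot of data.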
Re: [R] Visualizing a Data Distribution -- Was: breaks in hist()
Thanks for all the responses. I think plotting a cdf or taking a transformation could make the plot look better. But my further question is how to set the breaks so that the histogram concentrates on the interval (0.01, 0.2). I can even ignore the other values. Thanks!

Leaf

=== At 2005-11-02, 12:07:12 you wrote: ===

Leaf Sun wrote:

> The histogram is highly skewed to the right, say, the range of the vector is [0, 2], but 95% of the values are squeezed into the interval (0.01, 0.2).

I guess the histogram is as you wrote. See http://web.maths.unsw.edu.au/~tduong/seminars/intro2kde/ for a short explanation.

-----Original Message-----
From: Berton Gunter [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 02, 2005 1:10 PM
To: 'Leaf Sun'; r-help@stat.math.ethz.ch
Subject: [R] Visualizing a Data Distribution -- Was: breaks in hist()

Leaf:

An interesting question concerning graphical perception. As you have noted, choice of bin boundaries in a histogram can have a big effect on how a distribution is perceived.

My $.02 (U.S.): Histograms are a relic of manual data plotting. We have much better alternatives these days that should be used instead, e.g.

1. (my preference, but probably not consumer-friendly) Plot the cdf instead (?ecdf).
2. Plot a density estimator (?density ; ?densityplot).
3. See David Scott's ash package, perhaps the KernSmooth package also (though density() probably already has anything that you'd need from it).

Cheers,

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Leaf Sun
Sent: Wednesday, November 02, 2005 9:49 AM
To: r-help@stat.math.ethz.ch
Subject: [R] breaks in hist()

Dear listers,

A quick question about breaks in hist(). The histogram is highly skewed to the right, say, the range of the vector is [0, 2], but 95% of the values are squeezed into the interval (0.01, 0.2). My question is: how should I set the breaks to make the histogram look even? Thanks in advance,

Leaf
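Two possible ways to set the breaks for the interval asked about, sketched on made-up skewed data; the exponential sample, the seed and the bin widths are assumptions, and only the interval (0.01, 0.2) and the range [0, 2] come from the thread. The breaks argument of hist() accepts a vector of break points, which must cover all the values passed in.

```r
set.seed(1)
x <- pmin(rexp(1000, rate = 15), 2)  # made-up right-skewed data on [0, 2]

# Option 1: keep only the values in (0.01, 0.2), equal-width bins there
inside <- x[x > 0.01 & x < 0.2]
h1 <- hist(inside, breaks = seq(0.01, 0.2, length.out = 20), plot = FALSE)

# Option 2: keep every value, with fine bins inside (0.01, 0.2) and one
# wide bin on each side; with unequal widths hist() draws densities
h2 <- hist(x, breaks = c(0, seq(0.01, 0.2, length.out = 20), 2),
           plot = FALSE)

print(h1$counts)
print(h2$counts)
# plot(h1); plot(h2)  # uncomment to draw the histograms
```

The ecdf() and density() alternatives suggested in the reply avoid the choice of breaks altogether.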
[R] breaks in hist()
Dear listers,

A quick question about breaks in hist(). The histogram is highly skewed to the right, say, the range of the vector is [0, 2], but 95% of the values are squeezed into the interval (0.01, 0.2). My question is: how should I set the breaks to make the histogram look even? Thanks in advance,

Leaf
Re: [R] Finding the neighbors of the point
Hi Roger and the list,

The package is working very well. What surprised me most is the speed. As I mentioned in my previous emails, I have to find the neighbors for around 200,000 individuals. It took no more than 10 minutes for the function to finish searching and return enough information (ldnn, lnn). As for the third dimension, Z, I applied the code you sent me, and it also worked very well. I only modified the condition to meet my requirement. This package is just great for such neighbor searching. Thank you very much and all the best!

Leaf

=== At 2005-10-25, 04:31:54 you wrote: ===

On Mon, 24 Oct 2005, Leaf Sun wrote:

> Running R 2.2.0 on winXP. Computer P4 CPU 3.2G and 1G of RAM.

Please try the attached Windows binary package. Look at the help page for ann.dist(). It returns a list of three elements; the first, lnn, gives the index numbers of the neighbours closer than maxdist. From there, say you have a vector z where you want the neighbour relation to apply only when z[i] < z[j], so

res <- ann.dist(pts, maxdist=md)
glist <- vector(mode="list", length=length(res$lnn))
for (i in seq(along=res$lnn)) {
  if (length(res$lnn[[i]]) > 0) {
    glist[[i]] <- ifelse(z[i] < z[res$lnn[[i]]], 1, 0)
  }
}

so glist tells you which to drop. Alternatively, you can drop them straight away:

res <- ann.dist(pts, maxdist=md)
glist <- vector(mode="list", length=length(res$lnn))
for (i in seq(along=res$lnn)) {
  if (length(res$lnn[[i]]) > 0) {
    glist[[i]] <- res$lnn[[i]][z[i] < z[res$lnn[[i]]]]
  }
}

(neither of these are tried, so the brackets may not match). Please let me know how you get on.

Roger

=== At 2005-10-24, 09:46:28 you wrote: ===

On Mon, 24 Oct 2005, Leaf Sun wrote:

> No, I mean I have to find the neighbors of 200,000 points.

Your R version and OS - output of version on your machine?

Roger

=== At 2005-10-24, 03:30:41 you wrote: ===

On Fri, 21 Oct 2005, Leaf Sun wrote:

> Roger,
>
> The data frame is of 200,000 by 15 elements.

Do you mean that you need to find distances in 15 dimensions?

Roger

> I learned some C a long time ago, but I guess I would understand the C code. Thanks!
>
> Leaf

=== At 2005-10-21, 14:11:38 you wrote: ===

On Fri, 21 Oct 2005, Leaf Sun wrote:

> Dear all,
>
> I have point data of trees. I was wondering if anybody has experience in searching efficiently for the neighbors within a specified distance.
>
>  X  Y  Z
> 99 34 65
> 98 35 29
> 98 34 28
> 99 33 33
> 98 32 23
> 99 33 21
> 99 33 22
> 99 32 24
> 99 30 23
> ...
>
> What I want to do is: for each tree, search for the neighbors within a distance R; a neighbor must have a bigger Z. The data set is huge, so the R code works slowly when I search without subsetting it.

And huge is how big? For very large problems, you'll need a kd-tree or r-tree approach to divide up the point locations before making the spatial query (I think the retention of neighbours with a larger z is the final step). There do not seem to be such functions in R or contributed packages at present. If you are willing to collaborate, I can pass on a draft package corrected by Christian Sangiorgio for approximate nearest neighbours (an interface to ANN by David Mount and collaborators), but it isn't working yet. So an investment in time and some knowledge of C++ will be useful.

> Any suggestion would be much appreciated!
>
> Leaf

--
Roger Bivand
Economic Geography Section, Department of Economics,
Norwegian School of Economics and Business Administration,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]
[R] Errors occurred
Hi all,

Has anybody had experience with this error?

Error in data.frame(..., check.names=FALSE): arguments imply differing number of rows: 343,15

The error occurred in the middle of the program. I don't think the data frame has any problem, and if there is a problem with the program, why did it happen in the middle? Does anybody have such an experience? It seems so weird to me. Thanks a lot!

Leaf
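A small sketch that reproduces this kind of message and shows one way to look for the cause; the vectors a and b are made up to match the two lengths in the error. The message generally means that pieces being combined into a data frame have lengths that cannot be recycled to a common number of rows, and in a loop that can first happen at whichever iteration yields a result of a different length.

```r
a <- seq_len(343)
b <- seq_len(15)

# Reproduce the error and capture its message instead of stopping
msg <- tryCatch(data.frame(a = a, b = b, check.names = FALSE),
                error = function(e) conditionMessage(e))
print(msg)

# Checking the lengths of the pieces shows which one is off
parts <- list(a = a, b = b)
print(lengths(parts))
```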
[R] Finding the neighbors of the point
Dear all,

I have point data of trees. I was wondering if anybody has experience in searching efficiently for the neighbors within a specified distance.

 X  Y  Z
99 34 65
98 35 29
98 34 28
99 33 33
98 32 23
99 33 21
99 33 22
99 32 24
99 30 23
...

What I want to do is: for each tree, search for the neighbors within a distance R; a neighbor must have a bigger Z. The data set is huge, so the R code works slowly when I search without subsetting it. Any suggestion would be much appreciated!

Leaf
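A brute-force sketch of the query on the first five trees listed above, with an assumed radius R = 1.5. It builds the full distance matrix with dist(), so it only illustrates the logic on small data; for 200,000 points the matrix would not fit in memory and a spatial index would be needed instead.

```r
# First five trees from the post; the radius is an assumed value
trees <- data.frame(X = c(99, 98, 98, 99, 98),
                    Y = c(34, 35, 34, 33, 32),
                    Z = c(65, 29, 28, 33, 23))
R <- 1.5

d <- as.matrix(dist(trees[, c("X", "Y")]))  # pairwise distances in the plane

# For each tree i: indices of trees within R that have a bigger Z.
# Tree i never selects itself, because Z[i] > Z[i] is FALSE.
nb <- lapply(seq_len(nrow(trees)), function(i) {
  unname(which(d[i, ] <= R & trees$Z > trees$Z[i]))
})
print(nb)
```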
[R] Sorting a data frame by one of the variables
Dear all,

I have a data frame like this:

   X    Y    Z
  22   24  4.3
 2.3  3.4  5.3
 ...
57.2 23.4   34

My purpose is to sort the data frame by either X, Y or Z. Sample output (sorted by X):

   X    Y    Z
 2.3  3.4  5.3
 ...
  22   24  4.3
 ...
57.2 23.4   34

I have no idea how to use the sort, order or rank functions. Please help me out. Thanks!

Leaf
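A minimal sketch with order(), using the three rows that are legible in the post; the data frame name df is an assumption. order() returns the row positions that put a column in increasing order, and indexing the data frame with that vector reorders whole rows.

```r
df <- data.frame(X = c(22, 2.3, 57.2),
                 Y = c(24, 3.4, 23.4),
                 Z = c(4.3, 5.3, 34))

sorted_x <- df[order(df$X), ]          # increasing X
sorted_z_desc <- df[order(-df$Z), ]    # decreasing Z
sorted_xy <- df[order(df$X, df$Y), ]   # by X, ties broken by Y

print(sorted_x)
```

sort() returns only the sorted values of a single vector and rank() returns ranks, so order() is usually the function to use when the rows of a data frame have to move together.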