[R] about lscv
Thanks in advance.

I am currently calculating the cross-validation bandwidth h for kernel smoothing in R. I looked up the usage of the function, which is

lscv(x,.., exact=FALSE)

My question is what does stand for and mean? Would you mind explaining it specifically for me?

Thanks
Regards

--
View this message in context: http://r.789695.n4.nabble.com/about-lscv-tp4638592.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] about dpik
Thanks in advance.

I use the dpik() function to calculate the bandwidth h. The related code is:

h<-dpik(x,scalest="minim",level=2L,kernel="normal",canonical=FALSE,gridsize=401L,range.x=range(x),truncate=TRUE)

But I get these warning messages:

1: In bkfe(gcounts, 6L, alpha, range.x = c(sa, sb), binned = TRUE) :
  Binning grid too coarse for current (small) bandwidth: consider increasing 'gridsize'
2: In bkfe(gcounts, 4L, alpha, range.x = c(sa, sb), binned = TRUE) :
  Binning grid too coarse for current (small) bandwidth: consider increasing 'gridsize'

I don't know what they mean or how to deal with them.

Thanks
Regards

--
View this message in context: http://r.789695.n4.nabble.com/about-dpik-tp4636738.html
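The warning text itself says the binning grid is too coarse for the current (small) bandwidth. One way to look at this is to compare the spacing of an equally spaced grid, diff(range.x)/(gridsize - 1), with the bandwidth. The sketch below uses base R only. The simulated data, the single outlier, the bandwidth value and the helper name grid_spacing are assumptions for illustration, not the poster's data, and the exact condition that triggers the warning inside KernSmooth is not reproduced here.

```r
# Hypothetical helper: distance between adjacent points of an equally
# spaced grid with `gridsize` points covering `range.x`.
grid_spacing <- function(range.x, gridsize) {
  diff(range.x) / (gridsize - 1)
}

set.seed(1)
# Simulated returns plus one outlier that stretches the range (assumption).
x <- c(rnorm(1000, mean = 0, sd = 0.0077), 0.5)

h <- 0.0009  # a small bandwidth (assumed value)

spacing_401  <- grid_spacing(range(x), 401L)
spacing_5001 <- grid_spacing(range(x), 5001L)

# A spacing larger than h means neighbouring grid points are more than one
# bandwidth apart; a larger gridsize shrinks the spacing.
c(h = h, spacing_401 = spacing_401, spacing_5001 = spacing_5001)
```

If the spacing turns out large relative to the bandwidth, passing a larger gridsize to dpik(), as the warning suggests, is the first thing to try.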
[R] about different bandwidths in one graph
Thank you in advance.

I want to compare different bandwidths h in one graph of a normal distribution. This is the table of bandwidths h:

  rule of thumb (normal)     0.00205
  rule of thumb (Epanech.)   0.00452
  plug-in (normal)           0.0009
  plug-in (Epanech.)         0.002

The conditions are: N=1010, and the data sample is from the normal distribution N(0,0.0077^2). The grid points are taken on [-0.05,0.05] and the increment is 10. The bandwidth takes each of the h values above in turn, and the kernel can be the Epanechnikov kernel or the Gaussian kernel. The following is my code:

#####
# Define the Epanechnikov kernel function
kernel<-function(x){0.75*(1-x^2)*(abs(x)<=1)}
#####
# Define the kernel density estimator
kernden=function(x,z,h,ker){
# parameters: x=variable; h=bandwidth; z=grid point; ker=kernel
nz<-length(z)
nx<-length(x)
x0=rep(1,nx*nz)
dim(x0)=c(nx,nz)
x1=t(x0)
x0=x*x0
x1=z*x1
x0=x0-t(x1)
if(ker==1){x1=kernel(x0/h)}  # Epanechnikov kernel
if(ker==0){x1=dnorm(x0/h)}   # normal kernel
f1=apply(x1,2,mean)/h
return(f1)
}
# Simulation for different bandwidths and different kernels
n=1010                   # n=1010
ker=1                    # ker=1 => Epan; ker=0 => Gaussian
h0=c(0.00452,0.001984)   # set initial bandwidths
z=seq(-0.05,0.05,by=10)  # grid points
nz=length(z)             # number of grid points
x=rnorm(1010, mean=0, sd=0.0077)  # simulate x ~ N(0,0.0077^2)
if(ker==1){h_o=2.34*n^{-0.2}}  # bandwidth for Epanechnikov kernel
if(ker==0){h_o=1.06*n^{-0.2}}  # bandwidth for normal kernel
f1=kernden(x,z,h0[1],ker)
f2=kernden(x,z,h0[2],ker)
f3=kernden(x,z,h0[3],ker)
f4=kernden(x,z,h0[4],ker)
text1=c("True","h=0.0025","h=0.00452","h=0.0009","h=0.002")
data=cbind(dnorm(z),f1,f2,f3,f4)  # combine them as a matrix
win.graph()
matplot(z,data,type="l",lty=1:5,col=1:5,xlab="",ylab="")
legend(-1,0.2,text1,lty=1:5,col=1:5)

But the error message is:

Error in strwidth(legend, units = "user", cex = cex, font = text.font) :
  plot.new has not been called yet

I know something is wrong in the code but I don't know where.

Thanks
Regards

--
View this message in context: http://r.789695.n4.nabble.com/about-different-bandwidths-in-one-graph-tp4636780.html
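On the error itself: legend() draws on an existing plot, so "plot.new has not been called yet" suggests that no plot had been created on the current device when legend() ran. Several things in the posted code may contribute: seq(-0.05, 0.05, by=10) yields a single grid point, h0 has only two elements so h0[3] and h0[4] are NA, the legend position (-1, 0.2) lies far outside the x range, and win.graph() is, as far as I know, only available in the Windows build of R. Below is a minimal sketch of the same estimator. The grid step of 0.001, the pairing of bandwidths with kernels, the true density dnorm(z, sd = 0.0077) and the names epan, kern_den and plot_bandwidths are assumptions, not the poster's final code.

```r
# Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere.
epan <- function(u) 0.75 * (1 - u^2) * (abs(u) <= 1)

# Kernel density estimate of sample x at grid points z with bandwidth h.
kern_den <- function(x, z, h, kernel = c("epan", "normal")) {
  kernel <- match.arg(kernel)
  u <- outer(x, z, "-") / h          # n x length(z) matrix of (x_i - z_j) / h
  k <- if (kernel == "epan") epan(u) else dnorm(u)
  colMeans(k) / h
}

set.seed(1)
n <- 1010
x <- rnorm(n, mean = 0, sd = 0.0077)
z <- seq(-0.05, 0.05, by = 0.001)    # assumed step; by = 10 gives one point

# Bandwidths taken from the table in the post.
dens <- cbind(
  dnorm(z, mean = 0, sd = 0.0077),   # true density
  kern_den(x, z, 0.00205, "normal"), # rule of thumb, normal
  kern_den(x, z, 0.0009,  "normal"), # plug-in, normal
  kern_den(x, z, 0.00452, "epan"),   # rule of thumb, Epanechnikov
  kern_den(x, z, 0.002,   "epan")    # plug-in, Epanechnikov
)
labels <- c("True", "normal, h=0.00205", "normal, h=0.0009",
            "Epan., h=0.00452", "Epan., h=0.002")

# legend() must come after a successful plotting call on the same device.
plot_bandwidths <- function() {
  matplot(z, dens, type = "l", lty = 1:5, col = 1:5,
          xlab = "z", ylab = "density")
  legend("topright", legend = labels, lty = 1:5, col = 1:5)
}
```

Calling plot_bandwidths() draws the five curves and then adds the legend. Using a keyword position such as "topright" avoids having to pick coordinates inside the plotted range.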
Re: [R] About dpik function
The following is the related code and the data. I have tried my best to modify and test it, but the error always happens. I really don't know what it means. Please check it for me.

Thanks in advance.
Regards

Date: Sun, 15 Jul 2012 23:26:57 -0700
From: ml-node+s789695n4636611...@n4.nabble.com
To: chester...@live.cn
Subject: Re: About dpik function

On 15/07/2012 19:57, chester123 wrote:
> Hi there and thanks in advance.
> Nowadays I am working on the plug-in bandwidth selection with R. Firstly, my

So why use package KernSmooth and not the methods in R itself?

> 1010 data is the return rate from Yahoo Finance. Secondly, my code is following:
>
> r=read.table("/Users/user/Desktop/research/a.txt",sep=",",header=TRUE)
> x<-r[8:1010,]
> library(KernSmooth)
> dpik(x,scalest="minim",level=2L,kernel="normal",canonical=FALSE,gridsize=401L,range.x=range(x),truncate=TRUE)
>
> But the error happens like this:
>
> Error in Summary.factor(c(233L, 917L, 381L, 748L, 272L, 242L, 269L, 963L, :
>   range not meaningful for factors
>
> I don't know what's wrong and i am a rookie, please help with that. Thanks!

So what is 'x'? We don't know (your code is not reproducible), but it sure looks like a factor. And from the help:

  x: vector containing the sample on which the kernel density
     estimate is to be constructed.

A factor is not a vector (in this sense).

> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

PLEASE do as you were asked.

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

--
View this message in context: http://r.789695.n4.nabble.com/About-dpik-function-tp4636590p4636625.html
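The factor diagnosis in the reply can be checked with class() or str(). The sketch below is an assumption about what may have happened, not a reconstruction of the poster's file: a small inline data set (hypothetical column name ret) contains one non-numeric entry, so read.table() with stringsAsFactors = TRUE (the default before R 4.0.0) returns a factor, and range() then fails with the same message.

```r
# Inline stand-in for a.txt (hypothetical contents and column name).
txt <- "ret
0.0012
-0.0034
N/A
0.0056"

r <- read.table(text = txt, sep = ",", header = TRUE, stringsAsFactors = TRUE)
x <- r[2:4, ]   # a one-column data frame drops to a vector: here a factor
class(x)

# range() on a factor is an error, as in the post.
res <- try(range(x), silent = TRUE)
inherits(res, "try-error")

# Convert via character, not directly: as.numeric(x) alone would return the
# internal level codes (compare the 233L, 917L, ... in the error message).
x_num <- suppressWarnings(as.numeric(as.character(x)))
x_num <- x_num[!is.na(x_num)]   # drop entries that were not numbers
range(x_num)
```

In practice it is worth finding out why the column was not read as numeric (a stray text entry, a different separator, an extra header row) before converting, since the conversion silently discards whatever could not be parsed.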
[R] about dpik
Thank you for your reply.

I know that x in dpik() has to be a vector. But I don't know how to get a large data set (more than 1000 values) into c(). Here is one attempt with a few values; the resulting h is

[1] 0.001180569

which seems feasible.

x<-c(-0.00109349389485645,-0.00145304131152137,0.00023685387037116,0.00579094886320110,0.00032117330426379,0.00363758302533228,-0.00113344327121731,0.00104726223729409)
library(KernSmooth)
h<-dpik(x,scalest="minim",level=2L,kernel="normal",canonical=FALSE,gridsize=401L,range.x=range(x),truncate=TRUE)
h

But the point is: how do I get more than 1000 numbers into c() to turn them into a vector?

Thanks
Regards

--
View this message in context: http://r.789695.n4.nabble.com/about-dpik-tp4636695.html
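On getting more than 1000 numbers into a vector: they do not need to be typed into c(). A numeric column of a data frame read from a file is already a vector, and scan() can read a file of numbers directly. The sketch below writes a temporary file so that it is self-contained; the file layout, the simulated values and the column name ret are assumptions, not the poster's a.txt.

```r
set.seed(1)
returns <- round(rnorm(1003, mean = 0, sd = 0.0077), 6)

# Write a stand-in file with a header and one column (hypothetical layout).
f <- tempfile(fileext = ".txt")
write.table(data.frame(ret = returns), f, sep = ",",
            row.names = FALSE, quote = FALSE)

# Option 1: read the table and pull out the column by name.
r <- read.table(f, sep = ",", header = TRUE)
x1 <- r$ret

# Option 2: scan() the numbers directly, skipping the header line.
x2 <- scan(f, sep = ",", skip = 1, quiet = TRUE)

is.numeric(x1)
length(x1)
```

Either x1 or x2 can then be passed as the x argument of dpik().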
[R] About dpik function
Hi there and thanks in advance.

I am currently working on plug-in bandwidth selection with R. Firstly, my data are 1010 return rates from Yahoo Finance. Secondly, my code is the following:

r=read.table("/Users/user/Desktop/research/a.txt",sep=",",header=TRUE)
x<-r[8:1010,]
library(KernSmooth)
dpik(x,scalest="minim",level=2L,kernel="normal",canonical=FALSE,gridsize=401L,range.x=range(x),truncate=TRUE)

But this error happens:

Error in Summary.factor(c(233L, 917L, 381L, 748L, 272L, 242L, 269L, 963L, :
  range not meaningful for factors

I don't know what's wrong and I am a rookie, so please help with that. Thanks!

--
View this message in context: http://r.789695.n4.nabble.com/About-dpik-function-tp4636590.html