[R] Help in Compliling user -defined functions in Rpart

2005-08-26 Thread Luwis Tapiwa Diya
I have been trying to write my own user defined function in Rpart.I
imitated the anova splitting rule which is given as an example.In the
work I am doing ,I am calculating the concentration index(ci) ,which
is in between -1 and +1.So my deviance is given by
abs(ci)*(1-abs(ci)).Now when I run rpart incorporating this user
defined function i get the following error message:

 Error in user.split(yback[1:nback], wback[1:nback], xback[1:nback], parms,  : 
unused argument(s) ( ...)

Now I am failing to indentify where I am going wrong (In case I am
have made some mistake).So I was wondering if there is anybody who
have written some user defined functions of theirs and maybe if there
is any documentation with regards to user defined functions and
examples.

Regards ,

Luwis Diya

#User
defined function
#

temp.init-function(y,offset,parms,wt){
if (!is.null(offset)) y-y-offset 
if (is.matrix(y))stop (response must be a vector)

list(y=y,parms=0,numy=1,numresp=1,
   summary=function(yval,dev,wt,ylevel,digits){
 paste(mean=,format(signif(yval,digits)),
   MSE=,format(signif(dev/wt,digits)),
sep='')
})
} 


temp.eval-function(y,wt,parms){
n-length(y)
r-wt
for (i in 1:n-1) {r[i+1]=(sum(wt[1:i])+0.5*wt[i+1])/n} #fractional rank 
r[1]-0.5*wt[1]/n
wmean-sum(y*wt)/sum(wt)
ci-2*sum(wt*(y-wmean)*(r-0.5))/sum(wt*y) #concentration index 
for
socio-economic inequality
dev-abs(ci)*(1-abs(ci))  #deviance following 
the gini impurity approach
list(label=wmean,deviance=dev)
}

 
temp.split-function(y,wt,parms,continous){
n-length(y)
r-wt
for (i in 1:n-1) {r[i+1]=(sum(wt[1:i])+0.5*wt[i+1])/n}
r[1]-0.5*wt[1]/n
wmean-sum(y*wt)/sum(wt)
ci-2*sum(wt*(y-wmean)*(r-0.5))/sum(wt*y)
devci-abs(ci)*(1-abs(ci))

if(continous){
  lss-cumsum(wt*y)[-n]
  rss-sum(wt*y)-lss 
  lw-cumsum(wt)[-n]
  rw-sum(wt)-lw 
  lm-lss/lw
  rm-rss/rw
  
lcss-cumsum(wt[1:length(lm)]*(y[1:length(lm)]-lm)*(r[1:length(lm)]-0.5))
  rcss-sum(wt*(y-wmean)*(r-0.5))-lcss
  lci-2*lcss/lss #concentration index 
for left side
  rci-2*rcss/rss #concentration index 
for right side
  devlci-abs(lci)*(1-abs(lci))   #deviance for left 
side
  devrci-abs(rci)*(1-abs(rci))   #deviance for right 
side  

  goodness-devci-(lw/sum(wt))*devlci-(rw/sum(wt))*devrci
  list(goodness=goodness, direction=sign(lci))
  }
   else {
 ux-sort(unique(x))
 wtsum-tapply(wt,x,sum)
 ysum-tapply(wt*y,x,sum)
 means-ysum/wtsum

 ord-order(means)
   n-length(ord)   
 lss-cumsum(ysum[ord])[-n]
 rss-sum(ysum)-lss 
 lw-cumsum(wtsum[ord])[-n]
 rw-sum(wtsum)-lw 
 lm-lss/lw
 rm-rss/rw
 lysum-tapply(wt*(y-lm)*(r-0.5),x,sum)
 lcss-cumsum(lysum[ord])[-n]
 rcss-sum(lysum)-lcss
 lci-2*lcss/lss
 rci-2*rcss/rss
 devlci-abs(lci)*(1-abs(lci))
 devrci-abs(rci)*(1-abs(rci))

  goodness-devci-0.5*(lw/sum(wt))*devlci-0.5*(rw/sum(wt))*devrci
  list(goodness=goodness, direction=sign(lci))
 }
}

alist-list(eval=temp.eval,split=temp.split,init=temp.init)
tree-rpart(u~pcares+antcare.skilled+riskintb+child.born+married+mage1+mage2,
weights=popweight,method=alist)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Picking a subset of explanatory variables corresponding to a subset or a transformed subsert of the response

2005-07-18 Thread Luwis Tapiwa Diya
I have a dataframe in which I have used a cerain procedure to select
the response a subset of the response and obtained a vector y.Now I
need to fit a glm of the response y given x(the covariates) but now I
get an error that the variables are not of equal length.How can I come
up with a way to pick the x values corresponding to only the subset of
the response I am interested in.

Regards,

Luwis

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Accounting for clustered data in Rpart or Mvpart

2005-05-27 Thread Luwis Tapiwa Diya
I am working on data that is in clusters (eg events from persons in
the family).So I have a clustered data set up and would like to build
a tree(regression /classification tree) taking into account this
clustered nature of the data.

Do anybody know how to do this or maybe the code to take into account
the clustered data in rpart

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html