Amelia Marsh <amelia_marsh08 <at> yahoo.com> writes:

> Hello! (I don't know if I can raise this query here on this forum,
> but I had already raised it on the finance forum and have not received
> any suggestions, so I am now raising it on this list. Sorry for that. The
> query is about what to do if no statistical distribution fits
> the data.)

> I am in risk management and deal with operational risk. As a part
> of the BASEL II guidelines, we need to arrive at the capital charge that
> banks must set aside to counter any operational risk, if it
> happens. As a part of the Loss Distribution Approach (LDA), we need to
> collate past loss events and use these loss amounts. The usual
> process, as practised in the industry, is as follows:

> Using these historical loss amounts and various
> statistical tests and diagnostics (KS test, AD test, PP plot, QQ plot, etc.), we
> try to identify the statistical (continuous) distribution that best fits
> the historical loss data. Then, using the estimated parameters
> of that distribution, we simulate, say, 1 million loss
> amounts and, taking an appropriate percentile (say 99.9%), we
> arrive at the capital charge.

> However, many times the loss data are such that fitting a
> distribution is not possible. The loss data may be
> multimodal or have significant variability, making the fitting of a
> distribution impossible. Can someone guide me on how to deal with such
> data, and what can be done to simulate losses using this historical
> loss data in R?

A skew-(log)-normal fit doesn't look too bad ... (whenever you have
positive data that are this strongly skewed, log-transforming is a
good step)

hist(log10(mydat),col="gray",breaks="FD",freq=FALSE)
## default breaks are much coarser:
## hist(log10(mydat),col="gray",breaks="Sturges",freq=FALSE)
lines(density(log10(mydat)),col=2,lwd=2)
library(fGarch)
ss <- snormFit(log10(mydat))
xvec <- seq(2,6.5,length=101)
lines(xvec,do.call(dsnorm,c(list(x=xvec),as.list(ss$par))),
      col="blue",lwd=2)
## or try a skew-Student-t: not very different:
ss2 <- sstdFit(log10(mydat))
lines(xvec,do.call(dsstd,c(list(x=xvec),as.list(ss2$estimate))),
      col="purple",lwd=2)

There are more flexible distributional families (Johnson, log-spline ...)

Multimodal data are a different can of worms -- consider fitting a
finite mixture model ...
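
If no parametric family fits acceptably, one possible way to carry out the "simulate losses, then take a percentile" step from the question is to draw from the kernel density estimate on the log10 scale (a smoothed bootstrap). This is a swapped-in nonparametric technique, not the skew-normal fit shown above, and it is only a rough sketch: the `mydat` below is synthetic stand-in data (an assumption, since the original data are not shown), and `nsim`, `simlog`, `simloss` and `q999` are illustrative names.

```r
## 'mydat' is synthetic stand-in data (an assumption); substitute real loss amounts
set.seed(101)
mydat <- 10^c(rnorm(300, mean = 3, sd = 0.4), rnorm(100, mean = 4.5, sd = 0.5))

logdat <- log10(mydat)
dd <- density(logdat)   ## Gaussian kernel; dd$bw is the kernel standard deviation

## smoothed bootstrap: resample the observed log-losses, then add kernel noise;
## this is equivalent to drawing from the kernel density estimate itself
nsim <- 1e6
simlog <- sample(logdat, size = nsim, replace = TRUE) +
    rnorm(nsim, mean = 0, sd = dd$bw)
simloss <- 10^simlog    ## back-transform to the original loss scale

## 99.9% quantile of the simulated losses
q999 <- quantile(simloss, probs = 0.999, names = FALSE)
q999
```

A kernel estimate carries almost no mass beyond the largest observed loss, so an extreme quantile obtained this way is driven mainly by the few largest data points; it may be worth comparing it with the same quantile from a parametric or mixture fit before relying on it.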