[R] [R-pkgs] ROCR source code now available on github
Dear all,

the commented source code for the ROCR package (http://cran.r-project.org/web/packages/ROCR) is now available on github -- feel free to fork, add improvements, and contribute back!

https://github.com/ipa-tys/ROCR

Kind regards,
Tobias

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging
Tim,

if I understand correctly, you are trying to get the numerical values of averaged cross-validation curves. Unfortunately, the plot function of ROCR does not return anything in the current version (it's a good suggestion to change this). If you want a quick fix, you could change the plot.performance function of ROCR to return the values you want.

Kind regards,
Tobias

On Thu, Sep 24, 2009 at 3:09 PM, Tim Howard tghow...@gw.dec.state.ny.us wrote:

All,

I'm trying again with a slightly more generic version of my first question. I can extract the plotted values from hist(), boxplot(), and even plot.randomForest(). Observe:

    # get some data
    dat <- rnorm(100)

    # grab histogram data
    hdat <- hist(dat)
    hdat   # provides details of the hist output

    # grab boxplot data
    bdat <- boxplot(dat)
    bdat   # provides details of the boxplot output

    # the same works for randomForest
    library(randomForest)
    data(mtcars)
    RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE, ntree=100), log="y")
    RFdat

    ## But, I can't use this method in ROCR
    library(ROCR)
    data(ROCR.xval)
    RCdat <- plot(perf, avg="threshold")
    RCdat
    ## output: NULL

Does anyone have any tricks for piping or extracting these data? Or, perhaps for steering me in another direction?

Thanks,
Tim

From: Tim Howard tghow...@gw.dec.state.ny.us
Subject: [R] ROCR.plot methods, cross validation averaging
To: osan...@mpi-sb.mpg.de, tobias.s...@mpi-sb.mpg.de, r-help@r-project.org
Message-ID: 4aba1079.6d16.00d...@gw.dec.state.ny.us
Content-Type: text/plain; charset=US-ASCII

Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) -

I think my first question is generic and could apply to many methods, which is why I'm directing this initially to R-help as well as Tobias and Oliver.

Question 1. The plot function in ROCR will average your cross validation data if asked. I'd like to use that averaged data to find a best cutoff but I can't figure out how to grab the actual data that get plotted. A simple redirect of the plot (such as test <- plot(mydata)) doesn't do it.

Question 2. I am asking ROCR to average lists with varying lengths for each list entry. See my example below. None of the ROCR examples have data structured in this manner. Can anyone speak to whether the averaging methods in ROCR allow for this? If I can't easily grab the data as desired from Question 1, can someone help me figure out how to average the lists, by threshold, similarly?

Question 3. If my cross validation data happen to have a list entry whose length = 2, ROCR errors out. Please see the second part of my example. Any suggestions?

    # reproducible examples exemplifying my questions
    ## part one ##
    library(ROCR)
    data(ROCR.xval)

    # set up data so it looks more like my real data
    sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
    testSet <- ROCR.xval

    # do the extraction
    for (i in 1:length(ROCR.xval[[1]])) {
      y <- sample(c(1:350), sampSize[i])
      testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
      testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
    }

    # now massage the data using ROCR, set up for a ROC plot
    # if it errors out here, run the above sample again.
    pred <- prediction(testSet$predictions, testSet$labels)
    perf <- performance(pred, "tpr", "fpr")

    # create the ROC plot, averaging by cutoff value
    plot(perf, avg="threshold")

    # check out the structure of the data
    str(perf)
    # note the ragged edges of the list and that I assume averaging
    # whether it be vertical, horizontal, or threshold, somehow
    # accounts for this?

    ## part two ##
    # add a list entry with only two values
    p...@x.values[[1]] <- c(0, 1)
    p...@y.values[[1]] <- c(0, 1)
    p...@alpha.values[[1]] <- c(Inf, 0)
    plot(perf, avg="threshold")

    ## output results in an error with this message
    # Error in if (from == to) rep.int(from, length.out) else as.vector(c(from, :
    #   missing value where TRUE/FALSE needed

Thanks in advance for your help
Tim Howard
New York Natural Heritage Program
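Since the plot call returns nothing, one workaround for Questions 1 and 2 is to average the curves by hand. The following is only a minimal sketch in base R: avg_by_cutoff() is a hypothetical helper, not a ROCR function, and ROCR's own threshold averaging may differ in detail (grid choice, handling of the infinite cutoff). It assumes the three inputs are lists with one numeric vector per run, laid out like the x.values, y.values and alpha.values slots of a performance object, and it copes with runs of different lengths by interpolating each run onto a shared cutoff grid.

```r
# Hypothetical helper (not part of ROCR): average several curves at common cutoffs.
avg_by_cutoff <- function(x_list, y_list, alpha_list, n_grid = 25) {
  # The first cutoff of a run is usually Inf; make it finite so it can be interpolated.
  alpha_list <- lapply(alpha_list, function(a) {
    a[is.infinite(a)] <- max(a[is.finite(a)]) + 1e-8
    a
  })
  # Restrict the grid to the cutoff range that every run covers.
  lo <- max(sapply(alpha_list, min))
  hi <- min(sapply(alpha_list, max))
  grid <- seq(hi, lo, length.out = n_grid)
  # Interpolate one run onto the grid; rule = 2 holds the end values constant.
  interp <- function(a, v) approx(a, v, xout = grid, rule = 2, ties = mean)$y
  data.frame(
    cutoff = grid,
    fpr = rowMeans(mapply(interp, alpha_list, x_list)),
    tpr = rowMeans(mapply(interp, alpha_list, y_list))
  )
}

# Made-up toy runs of different lengths (ragged lists).
alpha <- list(c(Inf, 0.9, 0.5, 0.1), c(Inf, 0.8, 0.2))
fpr   <- list(c(0, 0, 0.5, 1),       c(0, 0.5, 1))
tpr   <- list(c(0, 0.5, 1, 1),       c(0, 1, 1))

avg  <- avg_by_cutoff(fpr, tpr, alpha)
# One possible "best cutoff": the grid point maximising tpr - fpr.
best <- avg[which.max(avg$tpr - avg$fpr), ]
```

With a real performance object the call would presumably be avg_by_cutoff(perf@x.values, perf@y.values, perf@alpha.values); the criterion tpr - fpr is just one choice among several.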
[R] Writing Reports from R in Microsoft Office Open XML format (follow-up)
Dear Duncan and other R users,

The department in which I work will soon make some decisions to improve our reporting. Since I hope that our solution will support R and Sweave-like functionality (otherwise it wouldn't be an improvement), I hope it's ok to repeat my question from June: is there any news on an odfWeave-like package for weaving Microsoft Word documents (in the Office Open XML format)?

Duncan, any news on the package?

I am also asking on the list again because there might be developments by others in parallel to what Duncan has mentioned below. (For example, maybe someone is thinking of adapting Max Kuhn's excellent odfWeave package to support the XML format of Microsoft Word?)

Kind regards,
Tobias

On Tue, Jun 9, 2009 at 4:22 PM, Duncan Temple Lang dun...@wald.ucdavis.edu wrote:

Yes. We will release a version in the next few weeks when I have time to wrap it all up. There is also a Docbook-based version that uses R extensions to Docbook for authoring structured documents.

D.
Re: [R] Writing Reports from R in Microsoft Office Open XML format (follow-up)
Thanks Duncan and Greg for the replies so far.

Duncan, many thanks for your continued work on this; please let us (or at least me) know when your package will be available.

Greg, the DCOM option sounds great, but we run R on a Linux cluster, and therefore it would be good to be able to write the reports in MS Word XML format from there without relying on Windows-specific functionality.

Kind regards,
Tobias

On Fri, Sep 18, 2009 at 7:23 PM, Greg Snow greg.s...@imail.org wrote:

I read the original post as asking if there is something like odfWeave that works for msword (I assumed windows, but I guess they could be asking about MSword on other platforms, it just sounds like a windows shop). But yes, sword only works on windows (and is in beta version still) and uses a different interface from the standard sweave and odfWeave (process from inside word rather than process a file through R).

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: Duncan Temple Lang [mailto:dun...@wald.ucdavis.edu]
Sent: Friday, September 18, 2009 11:00 AM
To: Greg Snow
Cc: Tobias Sing; r help
Subject: Re: [R] Writing Reports from R in Microsoft Office Open XML format (follow-up)

I believe that their approach is based on DCOM and the post was about Office Open XML. We have had the ability to do this via DCOM for at least 6 years, but unfortunately DCOM is limited to Windows.

Greg Snow wrote:

The people who brought us rexcel are working on sword which is a sweave for ms word, the current version is at: http://rcom.univie.ac.at/download.html

hope this helps,
Re: [R] Strange error with ROCR
Is the probability of the true label the best prediction to feed to the ROCR package, or is it better to use the decision.value?

Since AFAIK they are related by a monotonic transformation, both approaches should lead to the same ROC curve, shouldn't they? (not tested)

On Tue, Aug 4, 2009 at 8:14 PM, Noah Silvermann...@smartmediacorp.com wrote:

Good point. I'm not sure how I missed that. This does lead to an additional question: Is the probability of the true label the best prediction to feed to the ROCR package, or is it better to use the decision.value? Anybody have any experience on this one?

Thanks!
-N

On 8/4/09 3:28 AM, Christian Schulz wrote:

Hi,

you need the score value, have a look at ?predict.svm and at the ROCR example.

    library(e1071)   # provides svm()
    library(ROCR)

    traindata <- as.data.frame(matrix(runif(1000), ncol=10))
    # one label per row of traindata
    trainlabels <- as.factor(sample(c("win", "lose"), nrow(traindata), replace=TRUE, prob=c(0.5, 0.5)))
    model <- svm(traindata, trainlabels, type="C-classification", kernel="radial", cost=10,
                 class.weights=c(win=3, lose=1), scale=FALSE, probability=TRUE)
    # named svm.pred so that it does not mask ROCR's prediction() function
    svm.pred <- predict(model, traindata, decision.values=TRUE, probability=TRUE)
    # numeric scores: first column of the class-probability matrix
    probs <- attr(svm.pred, "probabilities")[, 1]
    pred <- prediction(probs, trainlabels)

HTH
Christian

Hello,

I've come across a strange error... Here is what happens:

    model <- svm(traindata, trainlabels, type="C-classification", kernel="radial", cost=10,
                 class.weights=c(win=3, lose=1), scale=FALSE, probability=TRUE)
    predictions <- predict(model, traindata)
    pred <- prediction(predictions, trainlabels)

This returns an error:

    Error in prediction(predictions, trainlabels) : Format of predictions is invalid.

Yet my predictions is just a matrix of predicted labels. Nothing fancy. (In fact, my steps follow the exact example on the ROCR homepage.) A search through google for "Format of predictions is invalid" returns zero results. Can anyone suggest how I might fix this problem?

Thank You,
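The "not tested" claim above can be checked without any classifier. The sketch below uses simulated scores and a hypothetical rank_auc() helper (the rank-sum form of the AUC, not a ROCR function); it assumes labels coded 0/1 and that the probability is a strictly increasing function of the decision value, which is an assumption about the model rather than something stated in the thread.

```r
# Hypothetical helper: AUC from ranks (Mann-Whitney form), labels coded 0/1.
rank_auc <- function(scores, labels) {
  pos   <- labels == 1
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  (sum(rank(scores)[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

set.seed(1)
labels   <- rbinom(200, 1, 0.5)
decision <- rnorm(200, mean = labels)   # simulated raw decision values
prob     <- plogis(2 * decision - 1)    # a strictly increasing transformation of them

auc_decision <- rank_auc(decision, labels)
auc_prob     <- rank_auc(prob, labels)
```

Because the transformation preserves the ordering of the scores, both calls rank the cases identically and the two AUC values agree. A decreasing transformation would instead flip the curve, so it is worth checking which class the probability column refers to.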
Re: [R] ROCR package question
Waverley,

use @ (instead of $) to extract the slots from the performance object (it uses the S4 class system).

HTH,
Tobias

On Sat, Jul 25, 2009 at 8:20 AM, Waverley waverley.paloa...@gmail.com wrote:

I use ROCR to plot multiple runs' performance. Using the sample code as example:

    # plot ROC curves for several cross-validation runs (dotted
    # in grey), overlaid by the vertical average curve and boxplots
    # showing the vertical spread around the average.
    data(ROCR.xval)
    pred <- prediction(ROCR.xval$predictions, ROCR.xval$labels)
    perf <- performance(pred, "tpr", "fpr")
    plot(perf, col="grey82", lty=3)
    plot(perf, lwd=3, avg="vertical", spread.estimate="boxplot", add=TRUE)

I can follow the code and plot without any problem. However, I don't know how to extract the averaged ROC area under curve value. Can someone help? Thanks.

--
Waverley @ Palo Alto
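To make the @ versus $ point concrete without needing ROCR, here is a small sketch with a made-up S4 class. The class name toyPerformance and its values are invented for illustration; only the slot name y.values mirrors the one mentioned in the thread.

```r
library(methods)

# A made-up S4 class with a list-valued slot, loosely mimicking a performance object.
setClass("toyPerformance", representation(y.name = "character", y.values = "list"))
perf <- new("toyPerformance",
            y.name   = "Area under the ROC curve",
            y.values = list(0.91, 0.88, 0.93))

slot_names <- slotNames(perf)        # lists the available slots
by_at      <- perf@y.values          # slot access with @
by_slot    <- slot(perf, "y.values") # equivalent functional form

# $ is for lists and data frames; on this S4 object it raises an error.
dollar <- tryCatch(perf$y.values, error = function(e) "error")

mean_value <- mean(unlist(perf@y.values))
```

str(perf) is another quick way to see the slots and what each one contains.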
Re: [R] ROCR package question
Waverley,

see help('performance-class') for a description of the slots. Your AUCs will be in p...@y.values, which itself is a list (one list element per run). Thus, you can use functions like unlist or sapply/lapply to access them, e.g.

    mean(unlist(p...@y.values))

Kind regards,
Tobias

On Sat, Jul 25, 2009 at 5:44 PM, Waverley waverley.paloa...@gmail.com wrote:

Thanks for the reply. I am not sure I am following:

1. For the sample code, I tried p...@auc but get "auc object not found".

2. I am SPECIFICALLY interested in the averaged auc value of the multiple runs. How to get that out? I typed perf and it comes out as a list.

3. As for the plot using a whisker plot to see the distribution of the multiple runs, the outliers outside the whiskers are very annoying. How to get rid of the outliers outside the whiskers? I tried to use the boxplot option and put outline=FALSE into the plot call, and it did not work.

Please help me with the specifics of the above 3 questions. Code instead of description would be helpful. Thanks a lot in advance.

--
Waverley @ Palo Alto
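For question 2, a short sketch with the ROCR.xval example data follows. It assumes the ROCR package is installed, and the object name auc is my own choice. Note that it computes the mean of the per-run AUCs, which is not necessarily the same number as the area under the vertically averaged curve drawn by plot().

```r
library(ROCR)
data(ROCR.xval)

pred <- prediction(ROCR.xval$predictions, ROCR.xval$labels)

# Ask for the "auc" measure; y.values then holds one AUC per cross-validation run.
auc         <- performance(pred, "auc")
auc_per_run <- unlist(auc@y.values)

mean_auc <- mean(auc_per_run)   # average over runs
sd_auc   <- sd(auc_per_run)     # spread over runs
```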
Re: [R] ROCR package question
Waverley,

if you want to modify components of the ROCR plot, you need to direct the parameters to the component functions by prefixing them with the name of that component function. In your case, you should add boxplot.outline=FALSE as follows:

    plot(perf, avg="vertical", spread.estimate="boxplot", boxplot.outline=FALSE)

That should solve your issue. Please see below for the full explanation, which is part of the ROCR reference. You can read this either by typing help(package=ROCR), or by looking at the reference PDF, e.g. here: http://rocr.bioinf.mpi-sb.mpg.de/ROCR.pdf

You may also want to have a look at the examples in this slide deck: http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

    Optional graphical parameters to adjust different components of the performance plot. Parameters are directed to their target component by prefixing them with the name of the component (component.parameter, e.g. text.cex). The following components are available: xaxis, yaxis, coloraxis, box (around the plotting region), points, text, plotCI (error bars), boxplot. The names of these components are influenced by the R functions that are used to create them. Thus, par(component) can be used to see which parameters are available for a given component (with the exception of the three axes; use par(axis) here). To adjust the canvas or the performance curve(s), the standard plot parameters can be used without any prefix.

Good luck,
Tobias

On Sat, Jul 25, 2009 at 7:38 PM, Waverley waverley.paloa...@gmail.com wrote:

Thanks for the quick reply. That is very clear for my questions 1 and 2. How about question 3? When I plot, is there a way not to show the whisker plot outliers for evaluating the multiple runs? I have tried to put in the option from the boxplot command, outline=FALSE; however, it did not work. Can you help? Thanks again for your kind help.

--
Waverley @ Palo Alto
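The prefix convention described in the reference excerpt can be pictured as a simple filter on argument names. The sketch below is purely illustrative: route_args() is a hypothetical function written for this note, and it is not how ROCR is actually implemented internally. It only shows why outline=FALSE on its own is not picked up by the boxplot component while boxplot.outline=FALSE is.

```r
# Hypothetical illustration of "component.parameter" routing (not ROCR's code).
route_args <- function(args, component) {
  prefix <- paste0(component, ".")
  hit    <- startsWith(names(args), prefix)
  out    <- args[hit]
  # Strip the prefix so the component function sees its own argument names.
  names(out) <- substring(names(out), nchar(prefix) + 1)
  out
}

# Arguments as they might be passed to a plot call.
args <- list(lwd = 3, outline = FALSE, boxplot.outline = FALSE, plotCI.col = "red")

for_boxplot <- route_args(args, "boxplot")   # only the prefixed argument is routed
for_plotCI  <- route_args(args, "plotCI")
```

In this picture the unprefixed lwd and outline stay with the main plotting call, which matches the behaviour reported in the thread.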
Re: [R] [Classification] lifting score in R
Michael,

a lift chart for evaluating binary scoring classifiers, as I understand it, plots...

    lift score: P(Yhat = + | Y = +) / P(Yhat = +)

against

    rate of positive predictions: P(Yhat = +)

...across the continuum of possible cutoffs.

If you want to do this, here is how you would do it with ROCR:

    library(ROCR)
    x <- your.predicted.scores
    y <- your.true.class.labels
    pred <- prediction(x, y)
    perf <- performance(pred, 'lift', 'rpp')
    plot(perf)

x and y can be vectors, or, in the case of cross-validation, data frames or lists representing the individual cross-validation runs. See the ROCR help pages ?performance, help(package=ROCR) and this slide deck: http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

HTH,
Tobias

On Wed, Jun 24, 2009 at 5:17 PM, Michaelcomtech@gmail.com wrote:

Hi all,

Could anybody give me some pointers to Cross Validation using Lifting Score as error function, as commonly used in data-mining and classification field in marketing and e-commerce research?

Thanks!
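To see what the two quantities above mean numerically, here is a hand computation in base R on six made-up cases. lift_curve() is a hypothetical helper, not a ROCR function; it assumes labels coded 0/1 and no tied scores, and it evaluates the lift after each case in descending score order.

```r
# Hypothetical helper: lift and rate of positive predictions at every cutoff.
lift_curve <- function(scores, labels) {
  ord <- order(scores, decreasing = TRUE)
  lab <- labels[ord]
  tpr <- cumsum(lab) / sum(lab)            # P(Yhat = + | Y = +)
  rpp <- seq_along(lab) / length(lab)      # P(Yhat = +)
  data.frame(cutoff = scores[ord], rpp = rpp, lift = tpr / rpp)
}

# Made-up scores and true labels (1 = positive).
scores <- c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4)
labels <- c(1,   1,   0,   1,   0,   0)

lc <- lift_curve(scores, labels)
```

At the lowest cutoff everything is predicted positive, so both probabilities equal 1 and the lift is 1; a lift of 2 at the top means the highest-scored cases contain positives at twice the base rate.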
[R] Writing Reports from R in Office Open XML format (ooxmlWeave?)
Dear all,

has someone implemented functionality for writing reports from R in Office Open XML format (*), similar to what odfWeave does for the ODF format of OpenOffice? It would be great to have a kind of ooxmlWeave at least for those of us who are forced to work in an MS ecosystem.

(*) Office Open XML is the default, XML-based, file format for MS Word: http://en.wikipedia.org/wiki/Office_Open_XML

Kind regards,
Tobias
Re: [R] ROCR: auc and logarithm plot
1. I have tried to understand how to extract area-under-curve value by looking at the ROCR document and googling. Still I am not sure if I am doing the right thing. Here is my code, is auc1 the auc value?

    pred1 <- prediction(resp1, label1)
    perf1 <- performance(pred1, "tpr", "fpr")
    plot(perf1, type="l", col=1)
    auc1 <- performance(pred1, "auc")
    auc1 <- a...@y.values[[2]]

If you have only one set of predictions and matching class labels, it would be in a...@y.values[[1]]. If you have multiple sets (as from cross-validation or bootstrapping), the AUCs would be in a...@y.values[[1]], a...@y.values[[2]], etc. You can collect all of them, for example, by unlist(p...@y.values). Btw, you can use str(auc1) to see the structure of objects.

2. I have to compare two models that have very close ROCs. I'd like to have a more distinguishable plot of the ROCs. So is it possible to have a logarithmic FP axis which might separate them well? Or zoom in on the part close to the upper-left corner of the ROC plot? Or any other ways to make the ROCs more separate?

To zoom in to a specific part:

    plot(perf1, xlim=c(0,0.2), ylim=c(0.7,1))
    plot(perf2, add=TRUE, lty=2, col='red')

If you want logarithmic axes (though I wouldn't personally do this for a ROC plot), you can set up an empty canvas and add ROC curves to it:

    plot(1, 1, log='x', xlim=c(0.001,1), ylim=c(0,1), type='n')
    plot(perf, add=TRUE)

You can adjust all components of the performance plots. See ?plot.performance and the examples in this slide deck: http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

Hope that helps,
Tobias
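The [[1]] versus [[2]] point can be tried out with the ROCR.simple example data, which contain a single set of predictions. This is a sketch assuming ROCR is installed; the variable names are my own.

```r
library(ROCR)
data(ROCR.simple)

pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
auc  <- performance(pred, "auc")

# With a single run the y.values list has exactly one element,
# so [[1]] is the AUC and [[2]] would fail with "subscript out of bounds".
n_runs    <- length(auc@y.values)
auc_value <- auc@y.values[[1]]

# Zoomed view of the low false-positive-rate region of the ROC curve.
perf <- performance(pred, "tpr", "fpr")
plot(perf, xlim = c(0, 0.2), ylim = c(0, 1))
```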
Re: [R] ROCR: auc and logarithm plot
To color the error bars in ROCR the same way as the performance curve, you need to add one more argument (plotCI.col='red') to your plot call:

    plot(perf2, avg="threshold", lty=2, col=2, spread.estimate="stddev", plotCI.col=2)

The use of 'plotCI.col' is an example of the general mechanism of ROCR to propagate arguments to the components of a plot (also explained in ?plot.performance):

    Optional graphical parameters to adjust different components of the performance plot. Parameters are directed to their target component by prefixing them with the name of the component (component.parameter, e.g. text.cex). The following components are available: xaxis, yaxis, coloraxis, box (around the plotting region), points, text, plotCI (error bars), boxplot. The names of these components are influenced by the R functions that are used to create them. Thus, par(component) can be used to see which parameters are available for a given component (with the exception of the three axes; use par(axis) here). To adjust the canvas or the performance curve(s), the standard plot parameters can be used without any prefix.

Good luck,
Tobias

On Tue, May 12, 2009 at 1:48 PM, Tim timlee...@yahoo.com wrote:

Thanks Tobias! A new question: if I want to draw an average ROC from cross-validation, how do I make the bar color the same as the line color? Here is my code:

    plot(perf2, avg="threshold", lty=2, col=2, spread.estimate="stddev", barcol=2)

Even if I specify barcol=2, the bars are still black, the default color, instead of red (2).

--Tim
Re: [R] Plotting questions (ROCR)
To have several performance curves on a single plot, use the add=TRUE option, e.g. as follows:

    plot(perf1)
    plot(perf2, add=TRUE, col='red')

Please read the help for ?plot.performance. It also tells you how you can adjust all graphical parameters for the individual curves. This slide deck contains several examples that might help you: http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

HTH,
Tobias

On Fri, May 8, 2009 at 3:28 PM, lehe timlee...@yahoo.com wrote:

Thanks! I am now also trying to plot several ROCs in the same figure using the ROCR package. The following code:

    pred1 <- prediction(yest1, ytest)
    perf1 <- performance(pred1, "tpr", "fpr")
    plot(perf1)
    pred2 <- prediction(yest2, ytest)
    perf2 <- performance(pred2, "tpr", "fpr")
    lines(perf2)

will result in an error at lines(perf2):

    Error in as.double(y) : cannot coerce type 'S4' to vector of type 'double'

Is there any way to solve it?

Regards,

Richard Cotton wrote:

1. How to plot several lines in a figure? Suppose I have several sets of points (xi,yi), where xi and yi are equal-length vectors. plot(x1,y1) will give a line connecting these points. Another plot(x2,y2) will erase what was plotted before and plot the new line. Can I have these lines all drawn in the same figure?

    # Draw your plot
    plot(seq(0, 1, length.out=20))
    # Add lines to the existing plot
    lines(runif(20))
    # Add points to the existing plot
    points(runif(20), col="red")

2. How to open another figure window? Repeating plot will redraw in the same window instead of opening another one.

On Windows, windows() will open a new figure window; quartz() on Mac OSX and x11() on Linux do the same.

Regards,
Richie.

Mathematical Sciences Unit
HSL
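The lines(perf2) error arises because lines() expects numeric vectors rather than an S4 object. A sketch of the two ways around it follows, assuming ROCR is installed; the second "model" is just the ROCR.simple scores with added noise, invented here so that there are two curves to draw.

```r
library(ROCR)
data(ROCR.simple)

set.seed(42)
scores1 <- ROCR.simple$predictions
scores2 <- scores1 + rnorm(length(scores1), sd = 0.3)  # made-up second model

perf1 <- performance(prediction(scores1, ROCR.simple$labels), "tpr", "fpr")
perf2 <- performance(prediction(scores2, ROCR.simple$labels), "tpr", "fpr")

# Option 1: let ROCR draw the second curve onto the existing plot.
plot(perf1)
plot(perf2, add = TRUE, col = "red")

# Option 2: pull the numeric vectors out of the slots and use base graphics.
fpr2 <- perf2@x.values[[1]]
tpr2 <- perf2@y.values[[1]]
lines(fpr2, tpr2, col = "blue", lty = 2)
```

Option 2 is also handy when the curve needs to go into a plot that was not created by ROCR.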
Re: [R] Prediction-class ROCR
Regina, to get a simple ROC curve, use the following sequence of commands:

    pred <- prediction(predictions, labels)
    perf <- performance(pred, "tpr", "fpr")
    plot(perf)

In the first line, 'predictions' are the raw predictions (usually numerical) of your classifier, and 'labels' (as you correctly guessed) are the true (binary) classes of your items. The true positive and false positive rates _at various cutoffs_ are then calculated from the raw predictions. The purpose of ROCR is to obtain these (and other) rates; if you already have them, I don't understand from your email what else you want.

Just in case you are uncertain about the overall framework of classification, take a look at this tutorial: Fawcett, T. (2003): ROC graphs: Notes and practical considerations for data mining researchers http://www.hpl.hp.com/techreports/2003/HPL-2003-4.pdf

The following slide deck also contains a brief introduction to the framework, as well as usage examples of ROCR: http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

Hope that helps, Tobias

On Thu, Mar 19, 2009 at 3:01 AM, Regina Beretta Mazaro rberet...@hotmail.com wrote:

Hi, I'm involved in a bioinformatics project at my university, and we're writing a paper comparing some methods for classifying nc-RNA. I've been put in charge of plotting the ROC curves. But I'm new to working with R and I'm having some difficulty with the prediction-class. I don't get where the values of ROCR.simple$predictions, for example, came from (I understand that $labels represents the real classification of each item). And I just have the values for true positive, false positive, true negative and false negative, obtained from the methods' tests. So, I can't plot a graph with my own values. How can I convert these values that I have into the $predictions type needed to run ROCR? Is there any function that does this? Or do I have to redo the tests using another kind of measurement? If someone could help me, I'll be very grateful.

Regina Beretta Mazaro.
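The sequence Tobias describes can be tried end to end with the ROCR.simple example data that ships with the package. The sketch below is only an illustration: the confusion-matrix counts at the end are made-up numbers, included to show why a single set of TP/FP/TN/FN counts gives one point in ROC space rather than a curve.

```r
library(ROCR)
data(ROCR.simple)  # example data shipped with ROCR: numeric scores and 0/1 labels

# Pair the raw classifier scores with the true binary classes
pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)

# True positive rate (y-axis) against false positive rate (x-axis),
# evaluated at every cutoff found in the scores
perf <- performance(pred, "tpr", "fpr")
plot(perf)

# A single confusion matrix corresponds to one cutoff only, i.e. one point.
# The counts below are hypothetical and serve purely as an illustration.
tp <- 40; fn <- 10; fp <- 5; tn <- 45
single_point <- c(fpr = fp / (fp + tn), tpr = tp / (tp + fn))
single_point
```

To get a full curve from one's own classifier, the raw scores (before thresholding) are needed as the first argument of prediction().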
Re: [R] Question about ROCR package
Waverley, you can also use p...@y.values to access the slot (see help("performance-class") for a description of the slots). You might also want to have a look at the code for demo(ROCR) and at this slide deck: http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

HTH, Tobias

On Sat, Feb 7, 2009 at 10:40 PM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote:

Hi Waverley, I forgot to tell you that perf is your performance object. Here is an example from the ROCR package:

    ## computing a simple ROC curve (x-axis: fpr, y-axis: tpr)
    library(ROCR)
    data(ROCR.simple)
    pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
    perf <- performance(pred, "tpr", "fpr")
    # y.values
    unlist(slot(perf, "y.values"))

HTH, Jorge

On Sat, Feb 7, 2009 at 3:17 PM, Waverley waverley.paloa...@gmail.com wrote:

Hi, I have a question about the ROCR package. I got the ROC curve plotted without any problem following the manual. However, I don't know how to extract the values, e.g. y.values (I think it is the area under the curve, the auc measure). The return value is an object of class performance which has slots, and one of the slots is y.values. I type the object and I can see them on screen. But I want to extract the values for further programming and computation. I did a summary of the object and it is of mode S4, which I don't understand. Can someone help? Thanks a lot in advance.

-- Waverley @ Palo Alto
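As a minimal sketch of the slot access discussed in this thread, the following collects the plotted values into a data frame. Note that for a performance object created with "tpr" and "fpr", y.values holds the true positive rates at each cutoff, not the AUC. The name roc_points is chosen here for illustration only.

```r
library(ROCR)
data(ROCR.simple)

pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
perf <- performance(pred, "tpr", "fpr")

# Each slot is a list with one element per run (e.g. per cross-validation
# fold); ROCR.simple has a single run, so [[1]] selects it.
tpr    <- perf@y.values[[1]]
fpr    <- perf@x.values[[1]]
cutoff <- perf@alpha.values[[1]]

# slot() is an equivalent spelling of the @ operator
same <- identical(tpr, slot(perf, "y.values")[[1]])

# One row per cutoff, ready for further computation
roc_points <- data.frame(cutoff = cutoff, fpr = fpr, tpr = tpr)
head(roc_points)
```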
Re: [R] Extracting slots from ROCR prediction objects
Hi Stacey, ROCR uses S4 classes. The elements are accessed using @ instead of $. You can find an example on slide 12 of the following slide deck: http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

Also have a look at the R code that appears when you type demo(ROCR) in R, which contains some more examples related to your question.

Hope that helps, Tobias

On Thu, May 22, 2008 at 8:32 PM, Jorge Ivan Velez [EMAIL PROTECTED] wrote:

Hi Stacey, Try this:

    library(ROCR)
    data(ROCR.simple)
    pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
    perf <- performance(pred, "sens", "spec")
    cuts <- unlist(slot(perf, "alpha.values"))
    cuts

HTH, Jorge

On Thu, May 22, 2008 at 2:08 PM, Stacey Burrows [EMAIL PROTECTED] wrote:

Hi, I have an object from the prediction function from the ROCR package and I would like to extract one of the slots from the object, for example the cutoffs slot. However, the usual techniques ($, [[name]]) of subsetting don't work. How can I access the lists in the slots? Here is an example of what I am working with:

    library(ROCR)
    data(ROCR.simple)
    pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
    str(pred)

    Formal class 'prediction' [package "ROCR"] with 11 slots
      ..@ predictions:List of 1
      .. ..$ : num [1:200] 0.613 0.364 0.432 0.140 0.385 ...
      ..@ labels     :List of 1
      .. ..$ : Ord.factor w/ 2 levels "0"<"1": 2 2 1 1 1 2 2 2 2 1 ...
      ..@ cutoffs    :List of 1
      .. ..$ : num [1:201] Inf 0.991 0.985 0.985 0.983 ...
      ..@ fp         :List of 1
      .. ..$ : num [1:201] 0 0 0 0 1 1 2 3 3 3 ...
      ..@ tp         :List of 1
      .. ..$ : num [1:201] 0 1 2 3 3 4 4 4 5 6 ...
      ..@ tn         :List of 1
      .. ..$ : num [1:201] 107 107 107 107 106 106 105 104 104 104 ...
      ..@ fn         :List of 1
      .. ..$ : num [1:201] 93 92 91 90 90 89 89 89 88 87 ...
      ..@ n.pos      :List of 1
      .. ..$ : int 93
      ..@ n.neg      :List of 1
      .. ..$ : int 107
      ..@ n.pos.pred :List of 1
      .. ..$ : num [1:201] 0 1 2 3 4 5 6 7 8 9 ...
      ..@ n.neg.pred :List of 1
      .. ..$ : num [1:201] 200 199 198 197 196 195 194 193 192 191 ...

Thanks in advance, Stacey
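The slots listed in the str() output above can be read directly with @. The following is a small sketch, assuming a single run as in ROCR.simple; the data frame name counts is hypothetical.

```r
library(ROCR)
data(ROCR.simple)
pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)

slotNames(pred)             # names of all slots of the S4 object

# Each slot is a list with one element per run; [[1]] picks the only run here
cuts <- pred@cutoffs[[1]]

# Confusion-matrix counts at every cutoff, side by side
counts <- data.frame(
  cutoff = cuts,
  tp = pred@tp[[1]],
  fp = pred@fp[[1]],
  tn = pred@tn[[1]],
  fn = pred@fn[[1]]
)
head(counts)
```

The first cutoff is Inf, at which nothing is predicted positive, so tp and fp are both 0 in the first row.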
[R] odfWeave: in multi-page plots only last page appears in document
Dear all, Max, first of all, many thanks for providing the odfWeave package.

My problem: Whenever I have multiple plots in one single chunk of my ODF file, only the last plot gets shown. The problem can be reproduced with this toy example (to be used in an ODF file together with odfWeave; I'm using the newest version, 0.7.3):

    <<plot1, echo=FALSE, fig=TRUE>>=
    for (i in 1:3) {
      plot(1, 1, main=paste('Plot', i))
    }
    @

I thought the solution (I hope there is one) might be found with setImageDefs (e.g. by setting type and/or device to postscript and working with the 'onefile' argument), but I couldn't solve the problem. So maybe this is not the right idea for a solution. In any case, here is my current getImageDefs:

    getImageDefs()
    $type
    [1] "png"
    $device
    [1] "png"
    $plotHeight
    [1] 480
    $plotWidth
    [1] 480
    $dispHeight
    [1] 5
    $dispWidth
    [1] 5
    $args
    list()

Any help appreciated. Kind regards, Tobias
Re: [R] odfWeave: in multi-page plots only last page appears in document
Sarah, thanks for your reply.

On Mon, Apr 14, 2008 at 8:32 PM, Sarah Goslee [EMAIL PROTECTED] wrote:

If you ran that code outside ODFWeave, you'd only get one plot, so why would you expect to get more within ODFWeave?

No, it depends on the device that is used. If I use PDF or postscript, they all go onto different pages of a single file. This is why I was referring to setImageDefs as a guess for a solution in my original post, but I couldn't get it to work. Your suggestion of par/layout is unfortunately not what I'm looking for. I was hoping that the individual plots would come one after the other in the ODF document. And I also hope that the 'pedestrian' solution of breaking the code into separate chunks in the ODF file has an alternative, because often this would require a rewrite of functions. Any other hints?

Thanks, Tobias

    for (i in 1:3) {
      plot(1, 1, main=paste('Plot', i))
    }

You need to add some sort of par() command, or use layout(), to create a single plot that contains all three of the plots created by the loop.

    par(mfrow=c(2,2))
    for (i in 1:3) {
      plot(1, 1, main=paste('Plot', i))
    }

for example. Or, if you want ODFWeave to handle placement, then you need to break that into three separate plots.

Sarah

On Mon, Apr 14, 2008 at 2:20 PM, Tobias Sing [EMAIL PROTECTED] wrote:

Dear all, Max, first of all, many thanks for providing the odfWeave package. My problem: Whenever I have multiple plots in one single chunk of my ODF file, only the last plot gets shown. The problem can be reproduced with this toy example (to be used in an ODF file together with odfWeave; I'm using the newest version, 0.7.3):

    <<plot1, echo=FALSE, fig=TRUE>>=
    for (i in 1:3) {
      plot(1, 1, main=paste('Plot', i))
    }
    @

-- Sarah Goslee http://www.functionaldiversity.org
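The device dependence mentioned in this exchange can be seen with base R alone. The sketch below uses the pdf() device and its onefile argument; it only illustrates how a graphics device handles several pages and is not a fix for odfWeave (whether odfWeave can be made to pick up several pages per chunk is not settled in this thread). The directory and file names are arbitrary choices for the example.

```r
out_dir <- tempfile("plots_")
dir.create(out_dir)

# One multi-page file: each high-level plot() call starts a new page
# (onefile = TRUE is the default for pdf())
single_file <- file.path(out_dir, "all_plots.pdf")
pdf(single_file, onefile = TRUE)
for (i in 1:3) plot(1, 1, main = paste("Plot", i))
invisible(dev.off())

# One file per page: with onefile = FALSE, the %d in the file name
# is replaced by the page number
pdf(file.path(out_dir, "plot%d.pdf"), onefile = FALSE)
for (i in 1:3) plot(1, 1, main = paste("Plot", i))
invisible(dev.off())

per_page <- list.files(out_dir, pattern = "^plot[0-9]+\\.pdf$")
per_page
```

With a plain file name and a single-page bitmap device, by contrast, each new page replaces the previous one, which matches the "only the last plot" symptom described above.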
Re: [R] how to create ROC curve for 2 dimensional classifiers
There are some papers on the topic (google for "roc surfaces"), but no R packages for multi-class ROC analysis. I personally have some doubts about the practical value of these approaches in the case of more than two classes, but others may disagree.

Kind regards, Tobias

On Thu, Feb 21, 2008 at 8:26 PM, Waverley [EMAIL PROTECTED] wrote:

Hi, I understand that for 1d classifiers, you can use the ROCR package. Is there a package that can plot ROC curves for 2d classifiers? One of my colleagues asked me about this. I have been quite puzzled, conceptually, about how you can do the ROC curve for 2d classifiers. Can someone share his/her knowledge or experience? Thanks in advance.

-- Waverley @ Palo Alto
Re: [R] Transfer Crosstable to Word-Document
On Feb 17, 2008 2:49 PM, Udo König [EMAIL PROTECTED] wrote:

[...] Greg: Regarding the odfWeave package: in [2] I found the sentence "The package is currently limited to creating text documents using OpenOffice." So it doesn't seem to work with MS-Word?

Udo, I think odfWeave is exactly what you need here. You can also use it with MS-Word via the SUN ODF Plugin for MS Office. It adds the capability to load and save ODF docs within MS Office. Get the (free as in beer) plugin from here: http://www.sun.com/software/star/odf_plugin/whats_new.jsp

Then write your document in Word, save as ODF, run odfWeave, load the result in Word and save as .doc.

HTH, Tobias
Re: [R] How to search for packages
Hadley,

On Feb 4, 2008 5:03 PM, hadley wickham [EMAIL PROTECTED] wrote:

[...] Before Christmas I started working on a solution for this - http://crantastic.org - a site for searching, reviewing and tagging R packages. Unfortunately I've run out of steam lately (and the lack of a 64-bit ubuntu package for R means it's a bit out of date), but the basic ideas are there. If you like how the site is looking so far please let me know, as it will be motivation for me to get the site finished.

it's just amazing how you still find some time for things besides ggplot2 et al... Appreciating all your work so far, it'd be great to keep you motivated for crantastic as well. I think crantastic would be a good complement to task views:

- more likely to be up-to-date when used by many people
- reflecting the opinions/experiences of various users rather than those of a single task view maintainer

Some additional benefit could be gained by adding a rating system and by allowing packages to be sorted by the number of comments a package has received, average rating, etc. Such sorting options could even be useful within a view for certain selected tags, because I guess some tags - e.g. 'graphics' - might ultimately be given to a large number of packages.

Cheers, Tobias
Re: [R] Calculating AUC from ROCR
Dear Ilham, see ?performance for a list of available performance measures ('auc' gives AUC, 'rmse' gives root-mean-squared error). Here is a link to a slide deck with several examples: http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

HTH, Tobias

On 11/21/07, G Ilhamto [EMAIL PROTECTED] wrote:

Dear R-helper, I am working with ROCR by Tobias Sing et al. to compare the performance of logistic and nnet models on a binary response. I have the performance plots, but I have a problem finding other performance statistics (e.g. MSE/ASE, AUC). Any help on this?

Thanks, Ilham
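A short sketch of the two measures named in the reply, using the ROCR.simple example data in place of the logistic and nnet predictions from the question (those are not available here). Scalar measures are returned as a performance object like any other, with the value stored in the y.values slot.

```r
library(ROCR)
data(ROCR.simple)
pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)

# Area under the ROC curve: a single number, stored in y.values
auc <- performance(pred, "auc")@y.values[[1]]

# Root-mean-squared error between the scores and the 0/1 labels
rmse <- performance(pred, "rmse")@y.values[[1]]

c(auc = auc, rmse = rmse)
```

To compare two models on the same test set, one would build one prediction object per model and compute the same measures for each.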