[R] [R-pkgs] ROCR source code now available on github

2012-05-05 Thread Tobias Sing
Dear all,

the commented source code for the ROCR package
(http://cran.r-project.org/web/packages/ROCR) is now available on
github -- feel free to fork, add improvements, and contribute back!
https://github.com/ipa-tys/ROCR

Kind regards,
  Tobias

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pipe data from plot(). was: ROCR.plot methods, cross validation averaging

2009-09-24 Thread Tobias Sing
Tim,

if I understand correctly, you are trying to get the numerical values
of averaged cross-validation curves.
Unfortunately the plot function of ROCR does not return anything in
the current version (it's a good suggestion to change this).

If you want a quick fix, you could change the plot.performance
function of ROCR to return back the values you wanted.
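
Without patching the package, the averaged values can also be approximated by hand from the slots of the performance object. This is only a rough sketch: it assumes that threshold averaging means interpolating each run's rates onto a common grid of cutoffs and then taking the mean across runs. The helper name interp_on_grid, the grid size of 100 and the "best cutoff" criterion (largest tpr minus fpr) are illustrative choices, and the numbers need not match what plot() draws internally.

```r
library(ROCR)
data(ROCR.xval)

pred <- prediction(ROCR.xval$predictions, ROCR.xval$labels)
perf <- performance(pred, "tpr", "fpr")

# Common grid over all finite cutoffs (the first cutoff of each run is Inf).
all_cutoffs <- unlist(perf@alpha.values)
all_cutoffs <- all_cutoffs[is.finite(all_cutoffs)]
grid <- seq(min(all_cutoffs), max(all_cutoffs), length.out = 100)

# Interpolate one run's rate onto the grid (hypothetical helper).
interp_on_grid <- function(values, cutoffs) {
  ok <- is.finite(cutoffs)
  approx(cutoffs[ok], values[ok], xout = grid, rule = 2, ties = mean)$y
}

# One column per cross-validation run, one row per grid cutoff.
fpr <- mapply(interp_on_grid, perf@x.values, perf@alpha.values)
tpr <- mapply(interp_on_grid, perf@y.values, perf@alpha.values)

avg <- data.frame(cutoff = grid, fpr = rowMeans(fpr), tpr = rowMeans(tpr))

# One possible "best cutoff": the grid point maximising tpr - fpr.
best <- avg[which.max(avg$tpr - avg$fpr), ]
```

Because each run is interpolated separately, runs of different lengths are handled the same way as runs of equal length.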

Kind regards,
  Tobias

On Thu, Sep 24, 2009 at 3:09 PM, Tim Howard tghow...@gw.dec.state.ny.us wrote:
 All,
  I'm trying again with a slightly more generic version of my first question. 
 I can extract the
 plotted values from hist(), boxplot(), and even plot.randomForest(). Observe:

  # get some data
 dat <- rnorm(100)
  # grab histogram data
 hdat <- hist(dat)
 hdat     # provides details of the hist output

  # grab boxplot data
 bdat <- boxplot(dat)
 bdat     # provides details of the boxplot output

  # the same works for randomForest
 library(randomForest)
 data(mtcars)
 RFdat <- plot(randomForest(mpg ~ ., mtcars, keep.forest=FALSE, ntree=100),
               log="y")
 RFdat


 ## But I can't use this method in ROCR
 library(ROCR)
 data(ROCR.xval)
 RCdat <- plot(perf, avg="threshold")

 RCdat
 ## output:  NULL

 Does anyone have any tricks for piping or extracting these data?
 Or, perhaps for steering me in another direction?

 Thanks,
 Tim


 From: Tim Howard tghow...@gw.dec.state.ny.us
 Subject: [R] ROCR.plot methods, cross validation averaging
 To: osan...@mpi-sb.mpg.de, tobias.s...@mpi-sb.mpg.de,
        r-help@r-project.org
 Message-ID: 4aba1079.6d16.00d...@gw.dec.state.ny.us
 Content-Type: text/plain; charset=US-ASCII

 Dear R-help and ROCR developers (Tobias Sing and Oliver Sander) -

 I think my first question is generic and could apply to many methods,
 which is why I'm directing this initially to R-help as well as Tobias and 
 Oliver.

 Question 1. The plot function in ROCR will average your cross validation
 data if asked. I'd like to use that averaged data to find a best cutoff
 but I can't figure out how to grab the actual data that get plotted.
 A simple redirect of the plot (such as test <- plot(mydata)) doesn't do it.

 Question 2. I am asking ROCR to average lists with varying lengths for
 each list entry. See my example below. None of the ROCR examples have data
 structured in this manner. Can anyone speak to whether the averaging
 methods in ROCR allow for this? If I can't easily grab the data as desired
 from Question 1, can someone help me figure out how to average the lists,
 by threshold, similarly?

 Question 3. If my cross validation data happen to have a list entry whose
 length = 2, ROCR errors out. Please see the second part of my example.
 Any suggestions?

 #reproducible examples exemplifying my questions
 ##part one##
 library(ROCR)
 data(ROCR.xval)
  # set up data so it looks more like my real data
 sampSize <- c(4, 55, 20, 75, 350, 250, 6, 120, 200, 25)
 testSet <- ROCR.xval
  # do the extraction
 for (i in 1:length(ROCR.xval[[1]])){
  y <- sample(c(1:350),sampSize[i])
  testSet$predictions[[i]] <- ROCR.xval$predictions[[i]][y]
  testSet$labels[[i]] <- ROCR.xval$labels[[i]][y]
  }
  # now massage the data using ROCR, set up for a ROC plot
  # if it errors out here, run the above sample again.
 pred <- prediction(testSet$predictions, testSet$labels)
 perf <- performance(pred,"tpr","fpr")
  # create the ROC plot, averaging by cutoff value
 plot(perf, avg="threshold")
  # check out the structure of the data
 str(perf)
  # note the ragged edges of the list and that I assume averaging
  # whether it be vertical, horizontal, or threshold, somehow
  # accounts for this?

 ## part two ##
 # add a list entry with only two values
 p...@x.values[[1]] <- c(0,1)
 p...@y.values[[1]] <- c(0,1)
 p...@alpha.values[[1]] <- c(Inf,0)

 plot(perf, avg="threshold")

 ##output results in an error with this message
 # Error in if (from == to) rep.int(from, length.out) else as.vector(c(from,  :
 # missing value where TRUE/FALSE needed


 Thanks in advance for your help
 Tim Howard
 New York Natural Heritage Program



[R] Writing Reports from R in Microsoft Office Open XML format (follow-up)

2009-09-18 Thread Tobias Sing
Dear Duncan and other R users,

The department in which I work will soon make some decisions to
improve our reporting. Since I hope that our solution will support R
and Sweave-like functionality (otherwise it wouldn't be an
improvement), I hope it's OK to repeat my question from June: is
there any news on an odfWeave-like package for weaving Microsoft Word
documents (in the Office Open XML format)?

Duncan, any news on the package? I am also asking on the list again
because there might be developments by others in parallel to what
Duncan has mentioned below?
(For example, maybe someone is thinking of adapting Max Kuhn's
excellent odfWeave package to support the XML format of Microsoft
Word?)

Kind regards,
  Tobias


On Tue, Jun 9, 2009 at 4:22 PM, Duncan Temple Lang
dun...@wald.ucdavis.edu wrote:
 Yes. We will release a version in the next few weeks
 when I have time to wrap it all up.
 There is also a Docbook-based version that uses
 R extensions to Docbook for authoring structured
 documents.

  D.

 Tobias Sing wrote:

 Dear all,

 has someone implemented functionality for writing reports from R in
 Office Open XML format (*), similar to what odfWeave does for the ODF
 format of OpenOffice? It would be great to have a kind of
 ooxmlWeave at least for those of us who are forced to work in an
 MS ecosystem.

 (*) Office Open XML is the default, XML-based, file format for MS
 Word: http://en.wikipedia.org/wiki/Office_Open_XML

 Kind regards,
  Tobias



Re: [R] Writing Reports from R in Microsoft Office Open XML format (follow-up)

2009-09-18 Thread Tobias Sing
Thanks Duncan and Greg for the replies so far.

Duncan, many thanks for your continued work on this; please let us (or
at least me) know when your package will be available.

Greg, the DCOM option sounds great, but we run R on a Linux cluster,
and therefore it would be good to be able to write the reports in MS
Word XML format from there without relying on Windows-specific
functionality.

Kind regards,
  Tobias

On Fri, Sep 18, 2009 at 7:23 PM, Greg Snow greg.s...@imail.org wrote:
 I read the original post as asking if there is something like odfWeave that 
 works for msword (I assumed windows, but I guess they could be asking about 
 MSword on other platforms, it just sounds like a windows shop).

 But yes, sword only works on windows (and is in beta version still) and uses 
 a different interface from the standard sweave and odfWeave (process from 
 inside word rather than process a file through R).



 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111


 -Original Message-
 From: Duncan Temple Lang [mailto:dun...@wald.ucdavis.edu]
 Sent: Friday, September 18, 2009 11:00 AM
 To: Greg Snow
 Cc: Tobias Sing; r help
 Subject: Re: [R] Writing Reports from R in Microsoft Office Open XML
 format (follow-up)


 I believe that their approach is based on DCOM and the post was about
 Office Open XML.
 We have had the ability to do this via DCOM for at least 6 years, but
 unfortunately
 DCOM is limited to Windows.


 Greg Snow wrote:
  The people who brought us rexcel are working on sword which is a
 sweave for ms word, the current version  is at:
 
  http://rcom.univie.ac.at/download.html
 
  hope this helps,
 
 




Re: [R] Strange error with ROCR

2009-08-04 Thread Tobias Sing
 Is the probability of the true label the best prediction to feed to
 the ROCR package, or is it better to use the decision.value

Since AFAIK they are related by a monotonic transformation, both
approaches should lead to the same ROC curve, shouldn't they? (not
tested)
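
A quick way to check this on simulated data is sketched below. It is not run on the original poster's SVM: the decision values are simulated, and plogis() merely stands in for whatever strictly increasing mapping turns decision values into probabilities. If the mapping were decreasing instead (for example because of the sign convention of the decision values), the curve would be mirrored rather than identical.

```r
library(ROCR)

set.seed(1)
labels   <- rep(c(0, 1), each = 100)
decision <- rnorm(200, mean = labels)  # simulated stand-in for decision values
prob     <- plogis(decision)           # strictly increasing transformation

# ROC curve (tpr against fpr) for a given vector of scores.
roc <- function(scores) performance(prediction(scores, labels), "tpr", "fpr")
roc_dec  <- roc(decision)
roc_prob <- roc(prob)

# Only the cutoffs (alpha.values) differ; the plotted points should coincide.
same_curve <- isTRUE(all.equal(roc_dec@x.values, roc_prob@x.values)) &&
  isTRUE(all.equal(roc_dec@y.values, roc_prob@y.values))
same_curve
```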

On Tue, Aug 4, 2009 at 8:14 PM, Noah Silvermann...@smartmediacorp.com wrote:
 Good point.  I'm not sure how I missed that.

 This does lead to an additional question:

 Is the probability of the true label the best prediction to feed to
 the ROCR package, or is it better to use the decision.value

 Anybody have any experience on this one?

 Thanks!

 -N

 On 8/4/09 3:28 AM, Christian Schulz wrote:
 Hi,

 you need the score value; have a look at ?predict.svm and at the
 ROCR example.

 library(e1071)
 library(ROCR)
 traindata <- as.data.frame(matrix(runif(1000),ncol=10))
 trainlabels <-
 as.factor(sample(c("win","lose"),nrow(traindata),replace=TRUE,prob=c(0.5,0.5)))

 model <- svm(traindata,trainlabels, type="C-classification",
 kernel="radial", cost=10,
 class.weights=c(win=3,lose=1), scale=FALSE, probability = TRUE)

 prediction <- predict(model, traindata, decision.values = TRUE,
 probability = TRUE)
 probs <-  attr(prediction, "probabilities")[,1]
 pred <- prediction(probs,trainlabels)

 HTH Christian

 Hello,

 I've come across a strange error...


 Here is what happens:

 model <- svm(traindata,trainlabels, type="C-classification",
 kernel="radial", cost=10,  class.weights=c(win=3,lose=1),
 scale=FALSE, probability = TRUE)
 predictions <- predict(model, traindata)
 pred <- prediction(predictions, trainlabels)


 This returns an error:
 Error in prediction(predictions, trainlabels) :
    Format of predictions is invalid.

 Yet my predictions is just a matrix of predicted labels.  Nothing
 fancy.  (In fact, my step follow the exact example on the ROCR
 homepage.)

 A search through google for "Format of predictions is invalid"
 returns zero results.

 Can anyone suggest how I might fix this problem?

 Thank You,







Re: [R] ROCR package question

2009-07-25 Thread Tobias Sing
Waverley,

use @ (instead of $) to extract the slots from the performance object
(it's S4 class system).

HTH,
  Tobias

On Sat, Jul 25, 2009 at 8:20 AM, Waverleywaverley.paloa...@gmail.com wrote:
 I use ROCR to plot multiple runs' performance.  Using the sample code
 as example:

 # plot ROC curves for several cross-validation runs (dotted
 # in grey), overlaid by the vertical average curve and boxplots
 # showing the vertical spread around the average.
 data(ROCR.xval)
 pred <- prediction(ROCR.xval$predictions, ROCR.xval$labels)
 perf <- performance(pred,"tpr","fpr")
 plot(perf,col="grey82",lty=3)
 plot(perf,lwd=3,avg="vertical",spread.estimate="boxplot",add=TRUE)

 I can follow the code and plot without any problem.  However, I don't
 know how to extract the averaged ROC area under curve value.

 Can someone help?

 Thanks.

 --
 Waverley @ Palo Alto



Re: [R] ROCR package question

2009-07-25 Thread Tobias Sing
Waverley, see help('performance-class') for a description of the slots.

Your AUCs will be in p...@y.values, which itself is a list (one list
element per run).

Thus, you can use functions like unlist or s/lapply to access them, e.g.
mean(unlist(p...@y.values))
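
A minimal sketch of this on the ROCR.xval example data, assuming the performance object is computed with the "auc" measure (the names auc_perf, aucs and mean_auc are just for illustration):

```r
library(ROCR)
data(ROCR.xval)

pred <- prediction(ROCR.xval$predictions, ROCR.xval$labels)

# With measure "auc", y.values holds one AUC per cross-validation run.
auc_perf <- performance(pred, "auc")
aucs     <- unlist(auc_perf@y.values)

# Average AUC across the runs.
mean_auc <- mean(aucs)
mean_auc
```

Note that this is the mean of the per-run AUCs, which is not necessarily the same number as the area under an averaged curve.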

Kind regards,
  Tobias

On Sat, Jul 25, 2009 at 5:44 PM, Waverleywaverley.paloa...@gmail.com wrote:
 Thanks for the reply.  I am not sure I am following:

 1. for the sample code.  I tried p...@auc but get auc object not found
 2. I am SPECIFICALLY interested in the averaged auc value of the
 multiple runs.  How to get that out?  I typed perf and it comes out as
 a list.
 3. as for the plot using whisker plot to see the distribution of the
 multiple runs, the outliers outside the whisker is very annoying.  How
 to get rid of the outline which is outside the whisker?  I tried to
 use boxplot option and put in the following plot code as an option
 outline=FALSE and it did not work.

 Please help me with the specifics of the above 3 questions.  Use code
 instead of description would be helpful.

 Thanks a lot in advance.





Re: [R] ROCR package question

2009-07-25 Thread Tobias Sing
Waverley,

if you want to modify components of the ROCR plot, you need to direct
the parameters to the component functions by prefixing them with the
name of that component function. In your case, you should add
boxplot.outline=FALSE as follows:
plot(perf, avg="vertical", spread.estimate="boxplot", boxplot.outline=FALSE)

That should solve your issue. Please see below for the full
explanation which is part of the ROCR reference.
You can read this either by typing help(package=ROCR), or by looking
at the reference PDF, e.g. here:
http://rocr.bioinf.mpi-sb.mpg.de/ROCR.pdf

You may also want to have a look at the examples in this slide deck:
http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

Optional graphical parameters to adjust different components of the performance
plot. Parameters are directed to their target component by prefixing them with
the name of the component (component.parameter, e.g. text.cex).
The following components are available: xaxis, yaxis, coloraxis, box
(around the plotting region), points, text, plotCI (error bars), boxplot.
The names of these components are influenced by the R functions that are used
to create them. Thus, par(component) can be used to see which parameters
are available for a given component (with the exception of the three axes;
use par(axis) here). To adjust the canvas or the performance curve(s), the
standard plot parameters can be used without any prefix.

Good luck,
  Tobias



On Sat, Jul 25, 2009 at 7:38 PM, Waverleywaverley.paloa...@gmail.com wrote:
 Thanks for the quick reply.  That is very clear for my question 1, 2.

 How about question 3? When I plot, is there way not to show the
 whisker plot outliers for evaluating the multiple runs?  I have tried
 to put the option from boxplot command outline=FALSE, however, it did
 not work.

 Can you help?

 Thanks again for your kind help.








Re: [R] [Classification] lifting score in R

2009-06-24 Thread Tobias Sing
Michael,

a lift chart for evaluating binary scoring classifiers, as I
understand it, plots...

lift score: P(Yhat = + | Y = +)/P(Yhat = +)
against
rate of positive predictions: P(Yhat = +).

...across the continuum of possible cutoffs. If you want to do this,
here is how you would do this with ROCR:

library(ROCR)
x <- your.predicted.scores
y <- your.true.class.labels
pred <- prediction(x, y)
perf - performance(pred, 'lift', 'rpp')
plot(perf)

x and y can be vectors, or, in the case of cross-validation, data
frames or lists representing the individual cross-validation runs.
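
To make the two quantities concrete, here is a small worked example in base R that evaluates them at a single cutoff, directly from the definitions above. The scores, labels and the cutoff of 0.5 are made up for illustration; ROCR computes the same kind of values for every possible cutoff.

```r
# Toy data: eight scored items, four of them truly positive.
scores <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1)
labels <- c(1,   1,   0,   1,   0,   0,   1,   0)

cutoff <- 0.5
yhat   <- scores > cutoff            # predicted positive at this cutoff

rpp  <- mean(yhat)                   # P(Yhat = +)         = 4/8 = 0.5
tpr  <- mean(yhat[labels == 1])      # P(Yhat = + | Y = +) = 3/4 = 0.75
lift <- tpr / rpp                    # 0.75 / 0.5 = 1.5
```

A lift of 1.5 here means that the items above the cutoff contain 1.5 times as many positives as a random selection of the same size would.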
See the ROCR help pages ?performance, help(package=ROCR) and this slide deck:
http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

HTH,
  Tobias



On Wed, Jun 24, 2009 at 5:17 PM, Michaelcomtech@gmail.com wrote:
 Hi all,

 Could anybody give me some pointers to Cross Validation using Lifting
 Score as error function, as commonly used in data-mining and
 classification field in marketing and e-commerce research?

 Thanks!



[R] Writing Reports from R in Office Open XML format (ooxmlWeave?)

2009-06-09 Thread Tobias Sing
Dear all,

has someone implemented functionality for writing reports from R in
Office Open XML format (*), similar to what odfWeave does for the ODF
format of OpenOffice? It would be great to have a kind of
ooxmlWeave at least for those of us who are forced to work in an
MS ecosystem.

(*) Office Open XML is the default, XML-based, file format for MS
Word: http://en.wikipedia.org/wiki/Office_Open_XML

Kind regards,
  Tobias



Re: [R] ROCR: auc and logarithm plot

2009-05-12 Thread Tobias Sing
 1. I have tried to understand how to extract area-under-curve value by 
 looking at the ROCR document and googling. Still I am not sure if I am doing 
 the right thing. Here is my code, is auc1 the auc value?
 
 pred1 <- prediction(resp1,label1)

 perf1 <- performance(pred1,"tpr","fpr")
 plot( perf1, type="l",col=1 )

 auc1 <- performance(pred1,"auc")
 auc1 <- a...@y.values[[2]]
 


If you have only one set of predictions and matching class labels, it
would be in a...@y.values[[1]].
If you have multiple sets (as from cross-validation or bootstrapping),
the AUCs would be in a...@y.values[[1]], a...@y.values[[2]], etc.
You can collect all of them for example by unlist(p...@y.values).

Btw, you can use str(auc1) to see the structure of objects.
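
For the single-set case, a short sketch using the ROCR.simple example data in place of your resp1 and label1 (an assumption, since your data are not shown) illustrates why index [[2]] does not work there:

```r
library(ROCR)
data(ROCR.simple)

pred1 <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
auc1  <- performance(pred1, "auc")

# One set of predictions gives a y.values list of length one.
n_runs    <- length(auc1@y.values)
auc_value <- auc1@y.values[[1]]

# Asking for a second element fails, because there is no second run.
second <- try(auc1@y.values[[2]], silent = TRUE)

str(auc1)   # shows all slots of the performance object
```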


 2. I have to compare two models that have very close ROCs. I'd like to have a 
 more distinguishable plot of the ROCs. So is it possible to have a logarithmic 
 FP axis, which might separate them well? Or zoom in on the part close to 
 the upper-left corner of the ROC plot? Or any other way to make the ROCs more 
 separate?


To zoom in to a specific part:
plot(perf1, xlim=c(0,0.2), ylim=c(0.7,1))
plot(perf2, add=TRUE, lty=2, col='red')

If you want logarithmic axes (though I wouldn't personally do this for
a ROC plot), you can set up an empty canvas and add ROC curves to it:
plot(1,1, log='x', xlim=c(0.001,1), ylim=c(0,1), type='n')
plot(perf, add=TRUE)

You can adjust all components of the performance plots. See
?plot.performance and the examples in this slide deck:
http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

Hope that helps,
  Tobias



Re: [R] ROCR: auc and logarithm plot

2009-05-12 Thread Tobias Sing
To color the error bars in ROCR the same way as the performance curve,
you need to add one more argument (plotCI.col='red') to your plot
call:

plot( perf2,avg="threshold",lty=2,col=2, spread.estimate="stddev", plotCI.col=2)

The use of 'plotCI.col' is an example for the general mechanism of
ROCR to propagate arguments to the components of a plot (also
explained in ?plot.performance):

Optional graphical parameters to adjust different components of the performance
plot. Parameters are directed to their target component by prefixing them with
the name of the component (component.parameter, e.g. text.cex).
The following components are available: xaxis, yaxis, coloraxis, box
(around the plotting region), points, text, plotCI (error bars), boxplot.
The names of these components are influenced by the R functions that are used
to create them. Thus, par(component) can be used to see which parameters
are available for a given component (with the exception of the three axes;
use par(axis) here). To adjust the canvas or the performance curve(s), the
standard plot parameters can be used without any prefix.

Good luck,
  Tobias


On Tue, May 12, 2009 at 1:48 PM, Tim timlee...@yahoo.com wrote:
 Thanks Tobias!
 A new question: if I want to draw an average ROC from cross-validation, how
 to make the bar color same as the line color? Here is my code:

 plot( perf2,avg="threshold",lty=2,col=2,
 spread.estimate="stddev",barcol=2)

 Even if I specify barcol=2, the bars are still black (the default
 color) instead of red (2).

 --Tim




Re: [R] Plotting questions (ROCR)

2009-05-08 Thread Tobias Sing
To have several performance curves on a single plot, use the
add=TRUE option, e.g. as follows:

plot(perf1)
plot(perf2, add=TRUE, col='red')

Please read the help to ?plot.performance. It also tells you how you
can adjust all graphical parameters for the individual curves.
This slide deck contains several examples that might help you:
http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

HTH,
  Tobias


On Fri, May 8, 2009 at 3:28 PM, lehe timlee...@yahoo.com wrote:

 Thanks!
 I am now also trying to plot several ROCs in the same figure using ROCR
 package. The following code:

 pred1 <- prediction(yest1,ytest)
 perf1 <- performance( pred1, "tpr", "fpr" )
 plot( perf1 )
 pred2 <- prediction(yest2,ytest)
 perf2 <- performance( pred2, "tpr", "fpr" )
 lines( perf2 )

 will result in error at lines( perf2 ):

 Error in as.double(y) :   cannot coerce type 'S4' to vector of type
 'double'

 Is there any way to solve it?

 Regards,




 Richard Cotton wrote:

 1. How to plot several lines in a figure? Suppose I have several sets of
 points (xi,yi), where xi and yi are equal-length vector. plot(x1,y1)
 will
 give a line connecting these points. Another plot(x2,y2) will erase what
 plot before and plot the new line. Can I have these lines all drawn in
 the
 same figure?

 #Draw your plot
 plot(seq(0,1,length.out=20))

 #Add lines to the existing plot
 lines(runif(20))

 #Add points to the existing plot
 points(runif(20), col="red")

 2. How to open another figure window? Repeating plot will redraw in the
 same
 window instead of opening another one.

 On Windows, windows() will open a new figure window; quartz() on Mac OSX
 and x11() on Linux do the same.

 Regards,
 Richie.

 Mathematical Sciences Unit
 HSL



 


Re: [R] Prediction-class ROCR

2009-03-19 Thread Tobias Sing
Regina,

to get a simple ROC curve, use the following sequence of commands:
pred <- prediction(predictions, labels)
perf <- performance(pred, "tpr", "fpr")
plot(perf)
In the first line, 'predictions' are the raw predictions (usually
numerical) of your classifier, and labels (as you correctly guessed)
the true (binary) classes of your items. The true positive and false
positive rates _at various cutoffs_ are then calculated from the raw
predictions. The purpose of ROCR is to obtain these (and other) rates
--- if you already have them, I don't understand from your email what
else you want.
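
As a minimal, self-contained sketch of the sequence above, the ROCR.simple
example data shipped with the package can stand in for your own 'predictions'
and 'labels' (the choice of data set is an assumption made for illustration
only):

```r
library(ROCR)

# Example data shipped with ROCR: numeric classifier scores and true binary labels
data(ROCR.simple)

# Step 1: combine raw scores and true labels into a prediction object
pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)

# Step 2: true positive rate (y-axis) vs. false positive rate (x-axis) at every cutoff
perf <- performance(pred, "tpr", "fpr")

# Step 3: draw the ROC curve
plot(perf)
```

Note that a single set of TP/FP/TN/FN counts corresponds to only one cutoff,
i.e. one point in ROC space; the raw scores are what allow a whole curve to be
traced.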

Just in case you are uncertain about the overall framework of
classification, take a look at this tutorial:
Fawcett, T. (2003):  ROC graphs: Notes and practical considerations
for data mining researchers
http://www.hpl.hp.com/techreports/2003/HPL-2003-4.pdf

The following slide deck also contains a brief introduction of the
framework, as well as usage examples of ROCR:
http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

Hope that helps,
  Tobias



On Thu, Mar 19, 2009 at 3:01 AM, Regina Beretta Mazaro
rberet...@hotmail.com wrote:




 Hi,

 I'm involved in a bioinformatics project at my university, and we're writing a
 comparison paper on some methods for classifying nc-RNA. I've been
 put in charge of plotting the ROC curves. But I'm new to working with R
 and I'm having some difficulty with the prediction class. I don't get where
 the values of ROCR.simple$predictions, for example, came from ($labels I
 understand represents the real classification of each item). And I only
 have the values for true positives, false positives, true negatives and false
 negatives, obtained from the methods' tests. So I can't plot a graph with my
 own values. How can I convert the values that I have into the $predictions-type
 input needed to run ROCR? Is there any function that does this? Or do I have to redo
 the tests using another kind of measurement? If someone could help me, I'll be
 very grateful.
 Regina Beretta Mazaro.


Re: [R] Question about ROCR package

2009-02-08 Thread Tobias Sing
Waverley,

you can also use p...@y.values to access the slot (see
help("performance-class") for a description of the slots).
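
A short sketch of both access styles, assuming 'perf' is the performance
object built as in Jorge's example quoted below (the variable names tpr_at and
tpr_slot are made up for illustration):

```r
library(ROCR)
data(ROCR.simple)
pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
perf <- performance(pred, "tpr", "fpr")

# Each slot holds a list with one element per run, so pick the first run
tpr_at   <- perf@y.values[[1]]            # direct slot access with @
tpr_slot <- slot(perf, "y.values")[[1]]   # equivalent access via slot()

# Both routes return the same numeric vector
identical(tpr_at, tpr_slot)
```

For a "tpr"/"fpr" performance object, y.values holds the true positive rates
at each cutoff, not the area under the curve.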

You might also want to have a look at the code for demo(ROCR) and at this
slide deck:
http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

HTH,
  Tobias

On Sat, Feb 7, 2009 at 10:40 PM, Jorge Ivan Velez
jorgeivanve...@gmail.com wrote:
 Hi Waverley,
 I forgot to tell you that perf is your performance object. Here is an
 example from the ROCR package:
 ## computing a simple ROC curve (x-axis: fpr, y-axis: tpr)
 library(ROCR)
 data(ROCR.simple)
 pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
 perf <- performance(pred, "tpr", "fpr")

 # y.values
 unlist(slot(perf, "y.values"))

 HTH,

 Jorge



 On Sat, Feb 7, 2009 at 3:17 PM, Waverley waverley.paloa...@gmail.com wrote:

 Hi,

 I have a question about the ROCR package.  I got the ROC curve plotted
 without any problem following the manual.  However, I don't know how to
 extract the values, e.g. y.values (I think it is the area under the
 curve, the auc measure).  The return value is an object of class performance
 which has slots, and one of the slots is y.values.  I type the object
 and I can see them on screen.  But I want to extract the values for
 further programming and computation.  I did a summary of the object
 and it is of mode S4, which I don't understand.

 Can someone help?

 Thanks a lot in advance.

 --
 Waverley @ Palo Alto



Re: [R] Extracting slots from ROCR prediction objects

2008-05-22 Thread Tobias Sing
Hi Stacey,

ROCR uses S4 classes. The elements are accessed using @ instead of $.
You can find an example on slide 12 of the following slide deck:
http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

Also have a look at the R code that appears when you type demo(ROCR) in R
which contains some more examples related to your question.
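
For instance, a minimal sketch of pulling the cutoffs out of the prediction
object from your example (the variable name 'cutoffs' is just an illustrative
choice):

```r
library(ROCR)
data(ROCR.simple)
pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)

# List the names of all slots of the S4 object
slotNames(pred)

# Slots are lists with one element per run; take the first (and here only) run
cutoffs <- pred@cutoffs[[1]]
head(cutoffs)
```

The [[1]] is needed because every slot is a list, so that objects built from
several cross-validation runs can store one vector per run.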

Hope that helps,
  Tobias


On Thu, May 22, 2008 at 8:32 PM, Jorge Ivan Velez [EMAIL PROTECTED]
wrote:

 Hi Stacey,
 Try this:

 library(ROCR)
 data(ROCR.simple)
 pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
 perf <- performance(pred, "sens", "spec")
 cuts <- unlist(slot(perf, "alpha.values"))
 cuts

 HTH,

 Jorge


 On Thu, May 22, 2008 at 2:08 PM, Stacey Burrows [EMAIL PROTECTED]
 wrote:

  Hi,
 
  I have an object from the prediction function from the ROCR package and I
  would like to extract one of the slots from the object, for example the
  cutoffs slot. However the usual techniques ($, [["name"]]) of subsetting
  don't work. How can I access the lists in the slots?
 
  Here is an example of what I am working with:
 
  library(ROCR)
  data(ROCR.simple)
  pred <- prediction(ROCR.simple$predictions,ROCR.simple$labels)
 
  > str(pred)
  Formal class 'prediction' [package "ROCR"] with 11 slots
   ..@ predictions:List of 1
   .. ..$ : num [1:200] 0.613 0.364 0.432 0.140 0.385 ...
   ..@ labels :List of 1
   .. ..$ : Ord.factor w/ 2 levels "0"<"1": 2 2 1 1 1 2 2 2 2 1 ...
   ..@ cutoffs:List of 1
   .. ..$ : num [1:201]   Inf 0.991 0.985 0.985 0.983 ...
   ..@ fp :List of 1
   .. ..$ : num [1:201] 0 0 0 0 1 1 2 3 3 3 ...
   ..@ tp :List of 1
   .. ..$ : num [1:201] 0 1 2 3 3 4 4 4 5 6 ...
   ..@ tn :List of 1
   .. ..$ : num [1:201] 107 107 107 107 106 106 105 104 104 104 ...
   ..@ fn :List of 1
   .. ..$ : num [1:201] 93 92 91 90 90 89 89 89 88 87 ...
   ..@ n.pos  :List of 1
   .. ..$ : int 93
   ..@ n.neg  :List of 1
   .. ..$ : int 107
   ..@ n.pos.pred :List of 1
   .. ..$ : num [1:201] 0 1 2 3 4 5 6 7 8 9 ...
   ..@ n.neg.pred :List of 1
   .. ..$ : num [1:201] 200 199 198 197 196 195 194 193 192 191 ...
  
 
  Thanks in advance,
  Stacey
 
 


[R] odfWeave: in multi-page plots only last page appears in document

2008-04-14 Thread Tobias Sing
Dear all,

Max, first of all, many thanks for providing the odfWeave package.

My problem: Whenever I have multiple plots in one single chunk of my
ODF file, only the last plot gets shown. The problem can be reproduced
with this toy example (to be used in an ODF file together with
odfWeave -- I'm using the newest version 0.7.3):

<<plot1, echo=FALSE, fig=TRUE>>=
for (i in 1:3) {
  plot(1, 1, main=paste('Plot', i))
}
@

I thought the solution (I hope there is one) might be found with
setImageDefs (e.g. by setting type and/or device to postscript and
working with the 'onefile' argument) , but I couldn't solve the
problem. So maybe this is not the right idea for a solution. In any
case, here is my current getImageDefs:

> getImageDefs()
$type
[1] "png"

$device
[1] "png"

$plotHeight
[1] 480

$plotWidth
[1] 480

$dispHeight
[1] 5

$dispWidth
[1] 5

$args
list()

Any help appreciated.
Kind regards,
  Tobias



Re: [R] odfWeave: in multi-page plots only last page appears in document

2008-04-14 Thread Tobias Sing
Sarah, thanks for your reply.

On Mon, Apr 14, 2008 at 8:32 PM, Sarah Goslee [EMAIL PROTECTED] wrote:
 If you ran that code outside ODFWeave, you'd only get one plot,
  so why would you expect to get more within ODFWeave?

No, it depends on the device that is used. If I use PDF or postscript
they all go into different pages of a single file. This is why I was
referring to setImageDefs as a guess for a solution in my original
post, but couldn't get it to work. Your suggestion of par/layout is
unfortunately not what I'm looking for. I was hoping that the
individual plots would come one after the other in the ODF document.
And I also hope that the 'pedestrian' solution of breaking the code into separate
chunks in the ODF file has an alternative, because often this would
require a rewrite of functions.
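
To illustrate the device-dependent behaviour I mean (outside odfWeave, so this
is not a fix for the problem itself), here is a small sketch using the pdf
device; the output file name is a hypothetical choice:

```r
# Write three plots to one PDF; with onefile = TRUE every new plot()
# call starts a new page in the same file instead of replacing the last plot
out_file <- file.path(tempdir(), "multi_page_example.pdf")  # hypothetical file name

pdf(out_file, onefile = TRUE)
for (i in 1:3) {
  plot(1, 1, main = paste("Plot", i))
}
dev.off()  # close the device so the file is written completely
```

Opening the resulting file should show the three plots on consecutive pages,
which is the behaviour I was hoping to get inside the ODF document.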

Any other hints?

Thanks,
  Tobias





  for (i in 1:3) {
    plot(1,1, main=paste('Plot',i))
  }

  You need to add some sort of par() command, or use layout(), to create
  a single plot that contains all three of the plots created by the loop.

  par(mfrow=c(2,2))
  for (i in 1:3) {
    plot(1,1, main=paste('Plot',i))
  }
  for example.

  Or, if you want ODFWeave to handle placement, then you need to
  break that into three separate plots.

  Sarah




  On Mon, Apr 14, 2008 at 2:20 PM, Tobias Sing [EMAIL PROTECTED] wrote:
   Dear all,
  
Max, first of all, many thanks for providing the odfWeave package.
  
My problem: Whenever I have multiple plots in one single chunk of my
ODF file, only the last plot gets shown. The problem can be reproduced
with this toy example (to be used in an ODF file together with
odfWeave -- I'm using the newest version 0.7.3):
  
<<plot1, echo=FALSE, fig=TRUE>>=
for (i in 1:3) {
  plot(1,1, main=paste('Plot',i))
}
@
  

  --
  Sarah Goslee
  http://www.functionaldiversity.org




Re: [R] how to create ROC curve for 2 dimensional classifiers

2008-02-22 Thread Tobias Sing
There are some papers on the topic (google for "ROC surfaces"), but no R
packages for multi-class ROC analysis.
I personally have some doubts about the practical value of these
approaches in the case of more than two classes, but others may
disagree.

Kind regards,
  Tobias


On Thu, Feb 21, 2008 at 8:26 PM, Waverley [EMAIL PROTECTED] wrote:
 Hi,

  I understand for 1 d classifiers, you can use ROCR package.

  Is there a package you can plot ROC curve for 2d classifiers?  One of
  my colleagues asked me about this.  I have been quite puzzled,
  conceptually, how you can do the ROC curve for 2d classifiers.  Can
  someone share his/her knowledge or experience?

  Thanks in advance.

  --
  Waverley @ Palo Alto



Re: [R] Transfer Crosstable to Word-Document

2008-02-17 Thread Tobias Sing
On Feb 17, 2008 2:49 PM, Udo König [EMAIL PROTECTED] wrote:
 [...]
 Greg:
 Regarding the odfWeave package: in [2] I found the sentence "The package is currently
 limited to creating text documents using OpenOffice". So it doesn't seem to work
 with MS-Word?

Udo,

I think odfWeave is exactly what you need here. You can also use it
with MS-Word via the SUN ODF Plugin for MS Office. It adds the
capability to load and save odf docs within MS Office.

Get the (free as in beer) plugin from here:
http://www.sun.com/software/star/odf_plugin/whats_new.jsp

Then write your document in Word, save as ODF, run odfWeave, load the
result in Word and save as .doc.

HTH,
  Tobias



Re: [R] How to search for packages

2008-02-04 Thread Tobias Sing
Hadley,

On Feb 4, 2008 5:03 PM, hadley wickham [EMAIL PROTECTED] wrote:
 [...]
 Before Christmas I started working on a solution for this -
 http://crantastic.org - a site for searching, reviewing and tagging R
 packages.  Unfortunately I've run out of steam lately (and the lack of
 a 64-bit ubuntu package for R means it's a bit out of date), but the
 basic ideas are there.  If you like how the site is looking so far
 please let me know, as it will be motivation for me to get the site
 finished.

it's just amazing how you still find some time for things besides
ggplot2 et al... Appreciating all your work so far, it'd be great to
keep you motivated for crantastic as well.

I think crantastic would be a good complement to task views:
- more likely to be up-to-date when used by many people
- reflecting the opinion/experiences of various users rather than that
of a single task view maintainer

Some additional benefit could be gained by adding a rating system and
by allowing packages to be sorted by the number of comments a package has
received, average rating, etc. Such sorting options could even be
useful within a view for certain selected tags (because I guess some
tags - e.g. 'graphics' - might ultimately be given to a large number
of packages).

Cheers,
  Tobias



Re: [R] Calculating AUC from ROCR

2007-11-21 Thread Tobias Sing
Dear Ilham,

see ?performance for a list of available performance measures ('auc'
gives AUC, 'rmse' gives root-mean-squared error).
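
A minimal sketch of how these two measures can be pulled out as plain numbers,
using the ROCR.simple example data in place of your logistic or nnet
predictions (the data set and the variable names 'auc' and 'rmse' are
assumptions for illustration):

```r
library(ROCR)
data(ROCR.simple)
pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)

# Scalar measures are stored in the y.values slot, one list element per run
auc  <- performance(pred, "auc")@y.values[[1]]
rmse <- performance(pred, "rmse")@y.values[[1]]

auc
rmse
```

To compare two models, build one prediction object per model from its scores
and the same labels, then compute the measures for each.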

Here is a link to a slide deck with several examples:
http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt

HTH,
  Tobias


On 11/21/07, G Ilhamto [EMAIL PROTECTED] wrote:
 Dear R-helper,

 I am working with ROCR by Tobias Sing et al. to compare the performances of
 logistic and nnet models on a binary response.

 I had the performance plots, but I have problem finding out other
 performance statistics (eg. MSE/ASE, AUC). Any help on this?

 Thanks
 Ilham
