Re: [R] ROCR predictions

2010-08-19 Thread Assa Yeroslaviz
Hello everybody, yes, I'm sorry. I can see it is not so easy to understand. I'll try to explain a bit more. The experiment was used to compare two (protein domain) databases and find out whether or not the results found in one are comparable to those in the second DB. The first column shows the list of

Re: [R] ROCR predictions

2010-08-19 Thread Frank Harrell
At the heart of this you have a problem in incomplete conditioning. You are computing things like Prob(X > x) when you know X=x. Working with a statistician who is well versed in probability models will undoubtedly help.
Frank
Frank E Harrell Jr, Professor and Chairman, School of

Re: [R] ROCR predictions

2010-08-17 Thread Claudia Beleites
Dear Assa,
> I am having a problem building a ROC curve with my data using the ROCR package. I have 10 lists of proteins such as attached (proteinlist.xls).
Your file didn't make it to the list.
> each of the lists was calculated with a different p-value. The goal is to find the optimal

Re: [R] ROCR predictions

2010-08-17 Thread Assa Yeroslaviz
Dear Claudia, thank you for your fast answer. Here again is the table of the data as an example.

Protein ID  Pfam Domain  p-value   Expected  Is Expected  True Positive  False Negative  False Positive  True Negative
NP_11.2     APH          1.15E-05  APH       TRUE         1              0               0               0
NP_11.2     MutS_V       0.0173    APH       FALSE        0              0               1               0

Re: [R] ROCR predictions

2010-08-17 Thread Claudia Beleites
Dear Assa, you need to call prediction with continuous predictions and a _binary_ true class label. You are the only one who can tell whether the p-values are actually predictions and what the class labels are. For the list readers p is just the name of whatever variable, and you didn't
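
The call Claudia describes might look roughly like the sketch below. It assumes the p-values are meant as the predictions and the "Is Expected" column as the binary class label, which only Assa can confirm. The first two rows echo the table posted in the thread; the remaining values and all variable names are made up for illustration. Because ROCR treats larger scores as more "positive", the p-values are flipped with -log10 first.

```r
library(ROCR)

# Hypothetical data: the first two entries come from the posted table,
# the others are invented purely for illustration.
p_value     <- c(1.15e-05, 0.0173, 0.02, 0.04, 2e-06, 0.009)
is_expected <- c(1, 0, 1, 0, 1, 0)   # binary true class label

# Smaller p-value = stronger hit, so turn it into a score where
# larger means "more likely positive" (an assumption about the data).
score <- -log10(p_value)

# prediction() takes continuous predictions and binary labels
pred <- prediction(score, is_expected)

# True positive rate against false positive rate, i.e. the ROC curve
roc <- performance(pred, measure = "tpr", x.measure = "fpr")
plot(roc)

# Area under the curve as a single summary number
auc <- performance(pred, measure = "auc")@y.values[[1]]
print(auc)
```

Passing the precomputed True Positive / False Positive columns instead would not work here: prediction() sweeps the threshold over the scores itself and derives those counts from the labels.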