Re: [R] ON MAC, how to copy a plot on to Word document?
2008/9/7 asdfjkl; [EMAIL PROTECTED]: Yes, I don't know how to copy the plot on Mac and paste on to Word because you can't right click on the graph and say copy as metafile. I'm so surprised I can't find any information about this anywhere on the Internet... The obvious way would be to save it as a PNG/JPEG/BMP/TIFF and then just import it. Regards, Nicky Chorley __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Corrupted PDF files
How are you executing the script? Quite possibly FAQ Q7.22 applies (it will if you use source(), for example).

On Sat, 6 Sep 2008, Nathan Teuscher wrote:

> I have the following code that when executed from the command line works properly and produces a proper PDF. When the script is executed, the PDF produced is considered corrupt. I am using R 2.7.2 on Mac OSX 10.5.4. Thank you in advance for the help!
>
>     library(lattice)
>     pdf(file="CLDiag2.pdf")
>     xyplot( CL ~ HT + WT + AGE + CREA + SEX, data=data2, outer=TRUE,
>         scales=list(x=list(relation="free")),
>         panel=function(...){
>             panel.loess(..., col="red")
>             panel.xyplot(..., pch=".")
>         }
>     )
>     dev.off()
>
> Nathan Teuscher [EMAIL PROTECTED]

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
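A minimal sketch of what FAQ 7.22 implies here: lattice functions return an object that is only drawn when it is printed, and auto-printing does not happen inside source(), functions, or loops. The data frame below is a made-up stand-in for data2 (which is not shown in the thread), and the file is written to a temporary directory purely for illustration.

```r
library(lattice)

# Hypothetical stand-in for the poster's data2, which is not shown in the thread.
set.seed(1)
data2 <- data.frame(CL = rnorm(50), HT = rnorm(50), WT = rnorm(50))

out_file <- file.path(tempdir(), "CLDiag2.pdf")
pdf(file = out_file)

# Build the trellis object first ...
p <- xyplot(CL ~ HT + WT, data = data2, outer = TRUE,
            scales = list(x = list(relation = "free")),
            panel = function(...) {
              panel.loess(..., col = "red")
              panel.xyplot(..., pch = ".")
            })

# ... then print it explicitly, so a page is drawn even when the script
# is run through source() rather than typed at the prompt.
print(p)
dev.off()
```

Without the print() call, the device can be opened and closed without any page being drawn, which would be consistent with a file that PDF viewers reject.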
[R] XML - get node by name
Hi there,

I am trying to rewrite some Java code in R. It deals with reading XML files. I started with the XML package. In Java, I had a very useful method which gave me a node by using:

- name of the node
- index of appearance
- start point: global (false) / local (true)

So, I could do something like this:

    setCurrentChildNode("data", 0);
    getValueOfElement("val", 1, true);    --> gives 45
    setCurrentChildNode("data", 1);
    getValueOfElement("val", 1, true);    --> gives 11
    getValueOfElement("val", 1, false);   --> gives 45

    <root>
      <data loc="1">
        <val i="t1"> 22 </val>
        <val i="t2"> 45 </val>
      </data>
      <data loc="2">
        <val i="t1"> 44 </val>
        <val i="t2"> 11 </val>
      </data>
    </root>

Now, I'd like to do something like this in R. Most important would be to retrieve a node just by its name, not by the whole path. How is it possible? Can anybody help me with this issue?

Antje
Re: [R] Test for equality of complicatedly related average correlations
Thank you very much, Adam. I have to get a bit more familiar with the model you propose in order to understand whether it applies to my problem as well. My question is not really "does time show a different effect" but "which one of two measures is more reliable": my respondents have completed exactly the same questionnaire twice (t=1 and t=2). The questionnaire consisted of two ways of measuring attribute importance, and the better method of measuring these importances is the one that gives the same importances for each respondent in t=1 and t=2. In other words: I want to examine the test-retest reliability of the two measures. Naturally, if the X(t=1,t=2) correlation is higher for a specific respondent than the Y(t=1,t=2) correlation, then for this respondent the method that yields the X-importances is more reliable. All I want to do is to see if this holds for the whole sample as well...

Anyway, thank you again, I will think about your approach.

Ralph

Adam D. I. Kramer-3 wrote:

> Hi Ralph,
>
> I had the same problem you do a few months ago, and realized that the question I had (does time show a different effect for X than Y) was not best modeled as differences between correlations across individuals, but as whether time interacts with condition. I answered this question with
>
>     library(nlme)
>     lme(obs ~ cond*time, random=~cond*time|subj)
>
> ...where obs is the responses on the X or Y variable, cond is a factor of either X or Y, and subj is your subject variable. This fits a hierarchical linear model to the data. The relationship between X and time is sig. diff. from the relationship between Y and time if the cond:time fixed effect is significant.
>
> This approach makes better use of your data, because when you correlate the observations, you're effectively losing variability (because correlations are doubly standardized) as well as degrees of freedom (you have 9 df within each individual, but each correlation is only one number).
> --Adam
>
> On Sat, 6 Sep 2008, Ralph79 wrote:
>
> > Dear R-Users,
> >
> > I am currently looking for a way to test the equality of two correlations that are related in a very special way. Let me describe the situation with an example.
> >
> > - There are 100 respondents, and there are 2 points in time, t=1 and t=2.
> > - For each of the respondents and at each of the time points, I have information on 10 X-variables and on 10 Y-variables.
> > - Based on this information, I calculate two correlations for each respondent: cor(X[t=1],X[t=2]) and cor(Y[t=1],Y[t=2]), with X and Y being the vectors of the corresponding 10 variables.
> > - Now I get the average correlations over the whole sample using Fisher's Z-transformation, i.e. I have mean(cor(X[t=1],X[t=2])) and mean(cor(Y[t=1],Y[t=2])) and want to know if the mean correlations are significantly different!
> >
> > I haven't found any test that deals with exactly my situation. Therefore, I simply apply a paired t-test based on the individual z-correlations. From my point of view this should be ok, because of the z's normality. However, I am unsure if there is a better way to test the hypothesis that I am interested in.
> >
> > I'd be grateful for any comment or hint.
> >
> > Thank you very much,
> > Ralph
> >
> > - Ralph Wirth University Erlangen-Nuremberg, Chair of Statistics GfK Group, Department of Methods and Product Development
> > -- View this message in context: http://www.nabble.com/Test-for-equality-of-complicatedly-related-average-correlations-tp19346312p19346312.html
- Ralph Wirth University Erlangen-Nuremberg, Chair of Statistics GfK Group, Department of Methods and Product Development
-- View this message in context: http://www.nabble.com/Test-for-equality-of-complicatedly-related-average-correlations-tp19346312p19355825.html
Sent from the R help mailing list archive at Nabble.com.
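For reference, the procedure Ralph describes (one correlation per respondent and measure, Fisher z-transform, paired t-test on the z values) might be sketched roughly as below. The simulated matrices and the helper row_cor are assumptions made up for illustration; nothing here is Ralph's actual data or code, and the sketch says nothing about whether the lme approach would be preferable.

```r
set.seed(42)
n_resp  <- 100   # respondents
n_items <- 10    # importances per measure

# Simulated importances (assumption): X is built to be more stable over time than Y.
x1 <- matrix(rnorm(n_resp * n_items), n_resp)
x2 <- x1 + matrix(rnorm(n_resp * n_items, sd = 0.5), n_resp)
y1 <- matrix(rnorm(n_resp * n_items), n_resp)
y2 <- y1 + matrix(rnorm(n_resp * n_items, sd = 1.0), n_resp)

# Hypothetical helper: test-retest correlation for each respondent (row).
row_cor <- function(a, b) {
  sapply(seq_len(nrow(a)), function(i) cor(a[i, ], b[i, ]))
}

# Fisher z-transform of the per-respondent correlations (atanh is Fisher's z).
z_x <- atanh(row_cor(x1, x2))
z_y <- atanh(row_cor(y1, y2))

# Paired t-test on the z values: does X show higher retest reliability than Y?
res <- t.test(z_x, z_y, paired = TRUE)
print(res)

# Average correlations, back-transformed from the mean z.
mean_r_x <- tanh(mean(z_x))
mean_r_y <- tanh(mean(z_y))
```

The test is paired because both z values come from the same respondent; averaging on the z scale and back-transforming with tanh avoids averaging raw correlations directly.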
Re: [R] loop
It's exactly what I was looking for. Thanks a lot.
-- View this message in context: http://www.nabble.com/loop-tp19346683p19356409.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Mode value
Carlos Morales wrote:

> Hello everyone, I would like to know if there is any function to calculate the mode value, or I have to build one to do it.

Hi Carlos,

If you mean the mode of a sample from a discrete distribution, try Mode in the prettyR package.

Jim
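If installing a package is not an option, a base-R version is short. This is only a sketch: sample_mode is a hypothetical name (it is not the prettyR function), and it returns just the first value among ties.

```r
# Hypothetical helper: most frequent value of a discrete sample.
# With ties, the value that appears first in x wins.
sample_mode <- function(x) {
  ux <- unique(x)                      # distinct values, in order of appearance
  counts <- tabulate(match(x, ux))     # how often each distinct value occurs
  ux[which.max(counts)]
}

sample_mode(c(1, 2, 2, 3, 3, 3))
sample_mode(c("a", "b", "b", "c"))
```

Using unique() and match() rather than table() keeps the original type of x (numeric stays numeric, character stays character).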
[R] R_USER - in which file should I include it?
Hello,

I am a newbie. I had my R upgraded from 2.7.1 to 2.7.2, and in doing so I decided to install all 2.7 versions under c:\program files\R\2.7 from now on (2.7.1 is located under .\2.7.1). Although I don't like the idea (I am running Vista), I have edited etc\Renviron.site to contain:

    R_USER=c:/Users/eduardo/Documents/R
    R_LIBS_USER=c:/Users/eduardo/Documents/R/win-library/2.7

As far as getting R to always start from the same location (that is, c:/Users/eduardo/Documents/R) is concerned, etc\Renviron.site didn't help. So I wonder whether someone from the list could help me to:

a) force R to always start from the same location
b) force R to install all new packages in the same location

Many thanks,
Ed

PS. Before sending this email, I read the Windows FAQ and browsed the archives (too many posts on the subject!).
[R] Problem with starting and using R
Dear all,

I encountered a problem on starting and using the R v 2.7.2 installation on my PC running Windows Vista and would appreciate your help. When R was first started, the Rgui returned several error messages:

    Error in structure(.Internal(Sys.getenv(as.character(x), as.character(unset)$
      unsupported conversion
    Error in file.exists(name) : unsupported conversion in 'filenameToWchar'

In addition, a dialog box called 'Information' popped up with the following message:

    Fatal error: unable to restore saved data in .RData

On clicking 'OK', R closed immediately, and the same thing occurs on restarting R. After checking for previous related messages online, I followed one of the earlier recommendations and appended --no-restore-data to the R shortcut target line. After that, R could start without the 'fatal error'. However, some functions such as 'help' and 'setwd' do not work, e.g.

    help()
    Error: could not find function "help"
    setwd(DirName)
    Error in setwd(DirName) : unsupported conversion in 'filenameToWchar'

I then typed 'Sys.getlocale()' and got this:

    Sys.getlocale()
    [1] "LC_COLLATE=Chinese (Traditional)_Hong Kong S.A.R..950;LC_CTYPE=Chinese (Traditional)_Hong Kong S.A.R..950;LC_MONETARY=Chinese (Traditional)_Hong Kong S.A.R..950;LC_NUMERIC=C;LC_TIME=Chinese (Traditional)_Hong Kong S.A.R..950"

Setting LC_ALL=en in the shortcut target does not appear to work in this case, as I got

    During startup - Warning message:
    Setting LC_CTYPE=en failed

Furthermore, I tried the patched version of R 2.7.2 and the same problem occurs. I would be very grateful if anybody could help.

Many thanks,
Thomas
Re: [R] Label 2 groups in PCA different colours
Here is a simple approach to, for instance, plot scores for PC1 and PC2 using different colors:

    scores <- prcomp(yourdata)$x
    plot(scores[1:100,1], scores[1:100,2], pch = 20, col = "blue")
    points(scores[101:200,1], scores[101:200,2], pch = 20, col = "red")

PM

On Sat, Sep 6, 2008 at 11:44 PM, pgseye [EMAIL PROTECTED] wrote:

> Hi,
> I'm wanting to do a PCA on some data which is comprised of two different groups (to see how well the groups are discriminated). Is there a way to change the colour of the datapoints in a biplot so that I can easily see which group is which (eg objects 1-100, red, 101-200, black). Might be simple, but I'm new to R and can't seem to find how to do this.
> Thanks. Paul
> -- View this message in context: http://www.nabble.com/Label-2-groups-in-PCA-different-colours-tp19354077p19354077.html
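A variant of the same idea, sketched with a grouping factor so that a single plot() call colours every point and the axis limits cover both groups. The simulated yourdata and the group labels are assumptions for illustration only.

```r
set.seed(1)

# Simulated stand-in for the poster's data: objects 1-100 and 101-200
# form two groups, measured on 4 variables.
yourdata <- rbind(matrix(rnorm(100 * 4), nrow = 100),
                  matrix(rnorm(100 * 4, mean = 2), nrow = 100))
group <- factor(rep(c("group1", "group2"), each = 100))

scores <- prcomp(yourdata)$x

# One colour per observation, looked up from the group factor:
# objects 1-100 red, objects 101-200 black, as in the question.
palette_by_group <- c("red", "black")
cols <- palette_by_group[as.integer(group)]

plot(scores[, 1], scores[, 2], pch = 20, col = cols,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = levels(group), col = palette_by_group, pch = 20)
```

Plotting all points in one call avoids the situation where the second group falls outside axis limits chosen from the first group alone.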
[R] cohen's kappa
Dear all,

I have a question on Cohen's kappa. Assume I have two datasets: one has 500 objects and 10 methods, and the other has 1000 different objects and 20 different methods. Could I compare the two datasets and conclude that the 10 methods are more concordant than the 20 by looking at some output, for example that of cohen.kappa{concord}?

One more: could anyone explain in brief what the difference is between kappa(Cohen) and kappa(Siegel)?

Thanks,
-- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc.
Did you always know? No, I did not. But I believed... ---Matrix III
Re: [R] ON MAC, how to copy a plot on to Word document?
I think you need to save the plot and import it into Word. AFAIK you can only copy and paste a plot in Windows. Have a look at ?png (there are other formats available).

--- On Sun, 9/7/08, asdfjkl; [EMAIL PROTECTED] wrote:

> From: asdfjkl; [EMAIL PROTECTED]
> Subject: [R] ON MAC, how to copy a plot on to Word document?
> To: r-help@r-project.org
> Received: Sunday, September 7, 2008, 1:46 AM
>
> Yes, I don't know how to copy the plot on Mac and paste on to Word because you can't right click on the graph and say copy as metafile. I'm so surprised I can't find any information about this anywhere on the Internet...
> -- View this message in context: http://www.nabble.com/ON-MAC%2C-how-to-copy-a-plot-on-to-Word-document--tp19354558p19354558.html
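The save-and-import route might look like the sketch below. The file name, the size settings, and the use of the built-in cars data are assumptions chosen for illustration; the resulting file would then be inserted into Word as a picture.

```r
# Write the plot to a PNG file instead of the screen device.
# tempdir() is used only for illustration; any writable path would do.
out_file <- file.path(tempdir(), "myplot.png")

png(out_file, width = 1600, height = 1200, res = 200)  # size in pixels, 200 dpi
plot(cars, main = "Example plot")
dev.off()   # closing the device is what actually finishes the file

out_file
```

A higher res value with proportionally larger width and height gives a sharper image once it is scaled inside the document.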
Re: [R] XML - get node by name
Well, not sure how it's done in R, but here's a way to do it in simple Excel.

http://decisionstats.com/2008/parsing-xml-files-easily/

Parsing XML files easily

To parse an XML (or KML or PMML) file easily without using any complicated software, here is a piece of code that fits right in your Excel sheet. Just import this file using Excel, and then use the function getElement, after pasting the XML code in one cell.

xml-getelement

It is used for simply reading the XML/KML code as a text string. Just paste all the XML code in one cell, and use the start and end arguments (for example start="<constraints>" and end="</constraints>" to get the value of constraints in the XML code). Simply read the value into another cell using the getElement function.

Here's the code if you ever need it. Just paste it into the VB editor of Excel to create the getElement function (if not there already) or simply import the file in the link above.

    Attribute VB_Name = "Module1"
    ' Returns the text between the first occurrence of start and the
    ' next occurrence of finish in xml (plain string search, no XML parsing).
    Public Function getElement(xml As String, start As String, finish As String) As String
        Dim i As Long, j As Long
        For i = 1 To Len(xml)
            If Mid(xml, i, Len(start)) = start Then
                For j = i + Len(start) To Len(xml)
                    If Mid(xml, j, Len(finish)) = finish Then
                        getElement = Mid(xml, i + Len(start), j - i - Len(start))
                        Exit Function
                    End If
                Next j
            End If
        Next i
    End Function

On Sun, Sep 7, 2008 at 1:52 PM, Antje [EMAIL PROTECTED] wrote:

> I try to rewrite some Java-code with R. It deals with reading XML files. I [...]
> Now, I'd like to do something like this in R. Most important would be to retrieve a node just by its name, not by the whole path. How is it possible? Can anybody help me with this issue?
> Antje

-- Regards, Ajay Ohri http://tinyurl.com/liajayohri
Re: [R] XML - get node by name
On 7 September 2008 at 10:22, Antje wrote: | I try to rewrite some Java-code with R. It deals with reading XML files. I [...] | Now, I'd like to do something like this in R. Most important would be to | retrieve a node just by its name, not by the whole path. How is it possible? | | Can anybody help me with this issue? Have you looked at the XML package for R ? Dirk -- Three out of two people have difficulties with fractions. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fwd: request: most repeated sequnce
---------- Forwarded message ----------
From: jim holtman [EMAIL PROTECTED]
Date: Sun, Sep 7, 2008 at 11:42 AM
Subject: Re: [R] request: most repeated sequnce
To: Muhammad Azam [EMAIL PROTECTED]

This should do it for you:

    x=c(1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,3,3,3,4,4,4,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,3,4,4,4,
    0,0,0,0,0,0,1,2,2,2,2,2,0,3,3,0,4,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
    x=array(x,dim=c(3,6,7))
    apply(x,3,function(.mat){
        rows <- table(apply(.mat,1,function(z){
            # remove the zeros
            z <- z[z != 0]
            paste(z,collapse=' ')
        }))
        # remove empty strings
        rows <- rows[names(rows) != ""]
        if (!is.null(rows)){
            return(names(rows)[which.max(rows)])
        } else return(NULL)
    })

    [[1]]
    [1] "1"

    [[2]]
    [1] "1 2 3"

    [[3]]
    [1] "1 2 3 4"

    [[4]]
    [1] "1 2 3 4"

    [[5]]
    [1] "2 2 3 4"

    [[6]]
    character(0)

    [[7]]
    [1] "1"

On Sun, Sep 7, 2008 at 8:08 AM, Muhammad Azam [EMAIL PROTECTED] wrote:

> Dear Jim Holtman
> Thanks a lot for your help. The problem is still there. Please consider this set of values:
>
>     x=c(1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,3,3,3,4,4,4,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,3,4,4,4,
>     0,0,0,0,0,0,1,2,2,2,2,2,0,3,3,0,4,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
>     x=array(x,dim=c(3,6,7))
>     apply(x,3,function(.mat){
>         rows <- table(apply(.mat,1,function(z){
>             # remove the zeros
>             z <- z[z != 0]
>             if (length(z) == 0) return(NULL)
>             paste(z,collapse=' ')
>         }))
>         names(rows[which.max(rows)])
>     })
>
> The output is:
>
>     Error in as.vector(x, mode) : invalid argument 'mode'
>
> Note: rows that consist entirely of zeros should not take part in the search for the most repeated sequence.
> best regards
> Muhammad Azam
>
> ----- Original Message -----
> From: jim holtman [EMAIL PROTECTED]
> To: Muhammad Azam [EMAIL PROTECTED]
> Cc: R-help request [EMAIL PROTECTED]; R Help r-help@r-project.org
> Sent: Sunday, September 7, 2008 12:36:18 AM
> Subject: Re: [R] request: most repeated sequnce
>
> This may come closer since it removes the zeros before comparison:
>
>     x=c(1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,3,3,3,4,4,4,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,3,4,4,4,
>     0,0,0,0,0,0,1,2,2,2,2,2,0,3,3,0,4,4,0,0,0,0,0,0)
>     x=array(x,dim=c(3,6,5))
>     apply(x,3,function(.mat){
>         rows <- table(apply(.mat,1,function(z){
>             # remove the zeros
>             z <- z[z != 0]
>             if (length(z) == 0) return(NULL)
>             paste(z,collapse=' ')
>         }))
>         names(rows[which.max(rows)])
>     })
>
>     [1] "1"       "1 2 3"   "1 2 3 4" "1 2 3 4" "2 2 3 4"
>
> On Sat, Sep 6, 2008 at 12:48 PM, Muhammad Azam [EMAIL PROTECTED] wrote:
>
> > Dear R community
> > Initially I thought my problem had been solved, but I found two cases where it is not, e.g.:
> >
> > 1. All the elements of a sector are zero, e.g.
> >
> >     , , 7
> >          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> >     [1,]    0    0    0    0    0    0    0    0    0     0
> >     [2,]    0    0    0    0    0    0    0    0    0     0
> >     [3,]    0    0    0    0    0    0    0    0    0     0
> >     [4,]    0    0    0    0    0    0    0    0    0     0
> >     [5,]    0    0    0    0    0    0    0    0    0     0
> >
> > 2. The majority of the rows consist of zeros, e.g.
> >
> >     , , 5
> >          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> >     [1,]    4    4    0    0    0    0    0    0    0     0
> >     [2,]    4    4    0    0    0    0    0    0    0     0
> >     [3,]    0    0    0    0    0    0    0    0    0     0
> >     [4,]    0    0    0    0    0    0    0    0    0     0
> >     [5,]    0    0    0    0    0    0    0    0    0     0
> >
> > Actually the zeros are not my values. I get values and fill the remaining parts with zeros, like x=array(0,dim=c(3,6,5)). Now, according to the first strategy, 0 0 0 0 0 0 0 0 0 0 is the most repeated sequence of rows in both of the above cases. But I don't want to consider cases where all elements are zeros, and I am interested in getting 4 4 0 0 0 0 0 0 0 0 or just 4 4 in case 2.
> >
> > Thanks and best regards
> > Muhammad Azam
> >
> > ----- Original Message -----
> > From: jim holtman [EMAIL PROTECTED]
> > To: Muhammad Azam [EMAIL PROTECTED]
> > Cc: R Help r-help@r-project.org; R-help request [EMAIL PROTECTED]
> > Sent: Saturday, September 6, 2008 2:39:19 PM
> > Subject: Re: [R] request: most repeated sequnce
> >
> > Here is a start.
> > You can delete the zeros:
> >
> >     x=c(1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,0,0,0,0,0,0,0,0,0,0,1,1,1,2,2,3,3,3,4,4,4,0,0,0,0,0,0,0,1,1,1,2,2,2,3,3,3,4,4,4,
> >     0,0,0,0,0,0,1,2,2,2,2,2,0,3,3,0,4,4,0,0,0,0,0,0)
> >     x=array(x,dim=c(3,6,5))
> >     apply(x,3,function(.mat){
> >         rows <- table(apply(.mat,1,function(z){
> >             paste(z,collapse=' ')
> >         }))
> >         names(rows[which.max(rows)])
> >     })
> >
> >     [1] "1 0 0 0 0 0" "1 2 3 0 0 0" "1 2 3 4 0 0" "1 2 3 4 0 0" "2 2 3 4 0 0"
> >
> > On Sat, Sep 6, 2008 at 4:54 AM, Muhammad Azam [EMAIL PROTECTED] wrote:
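A variant of the approach in this thread, sketched as a standalone function. It drops the padding zeros from each row, ignores rows that were all zeros, and returns NA for a slice with nothing left. The name most_repeated_rows and the small test array are made up for illustration; this is not Jim's code, only a restatement of the same idea.

```r
# Hypothetical helper: for each slice arr[, , k], find the most frequent row
# after removing padding zeros. All-zero rows are ignored entirely.
most_repeated_rows <- function(arr) {
  lapply(seq_len(dim(arr)[3]), function(k) {
    # Keep the slice as a matrix even if it has a single row.
    m <- matrix(arr[, , k], nrow = dim(arr)[1])
    # One string key per row, built from its non-zero entries.
    keys <- apply(m, 1, function(z) paste(z[z != 0], collapse = " "))
    keys <- keys[nzchar(keys)]          # drop rows that were all zeros
    if (length(keys) == 0) return(NA_character_)
    counts <- table(keys)
    names(counts)[which.max(counts)]
  })
}

# Small example modelled on "case 2" above: two rows of 4 4, the rest zeros,
# plus a second slice that is entirely zero.
arr <- array(0, dim = c(3, 4, 2))
arr[1, 1:2, 1] <- c(4, 4)
arr[2, 1:2, 1] <- c(4, 4)
res <- most_repeated_rows(arr)
res
```

Returning NA rather than character(0) for an empty slice keeps the result one element per slice, which is easier to unlist afterwards.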
Re: [R] XML - get node by name
In particular try this:

    Lines <- '
    <root>
     <data loc="1">
       <val i="t1"> 22 </val>
       <val i="t2"> 45 </val>
     </data>
     <data loc="2">
       <val i="t1"> 44 </val>
       <val i="t2"> 11 </val>
     </data>
    </root>
    '
    library(XML)
    doc <- xmlTreeParse(Lines, asText = TRUE, trim = TRUE, useInternalNodes = TRUE)
    root <- xmlRoot(doc)
    data1 <- getNodeSet(root, "//data")[[1]]
    xmlValue(getNodeSet(data1, "//val")[[1]])
    [1] "22"

On Sun, Sep 7, 2008 at 11:42 AM, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:

> On 7 September 2008 at 10:22, Antje wrote:
> | I try to rewrite some Java-code with R. It deals with reading XML files. I [...]
> | Now, I'd like to do something like this in R. Most important would be to
> | retrieve a node just by its name, not by the whole path. How is it possible?
> |
> | Can anybody help me with this issue?
>
> Have you looked at the XML package for R ?
>
> Dirk
> -- Three out of two people have difficulties with fractions.
Re: [R] ON MAC, how to copy a plot on to Word document?
I just CMD-C'd it and pasted it into OpenOffice with CMD-V. el On 07 Sep 2008, at 16:57 , John Kane wrote: I think you need to save the plot and import it into Word. AFAIK you can only copy and paste a plot in Windows. Have a look at ?png (There are other formats available) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Axis tick label format and rotation
Hi Kurt,

> Please tell me how to format data in a data frame so when currency amount is displayed in a chart the axis tick labels contain leading $ signs.

The easiest way is to add a custom scale:

    vals <- seq(0, 100, by = 10)
    qplot(...) + scale_x_continuous(breaks = vals, labels = paste("$", vals, sep = ""))

> Please also tell me if it is possible to rotate x axis labels using ggplot2.

Not easily, but it will be possible in the next version.

Hadley
-- http://had.co.nz/
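The labels argument above is just a character vector, so it can be built and checked on its own before it is handed to the scale. A small base-R sketch follows; the thousands separator is an added assumption that was not part of the original question.

```r
# Break positions and matching "$" labels, built with base R only.
vals <- seq(0, 100, by = 10)
labels_plain <- paste("$", vals, sep = "")

# For larger amounts, formatC can add a thousands separator first.
big_vals <- c(0, 1000, 25000)
labels_big <- paste("$", formatC(big_vals, format = "d", big.mark = ","), sep = "")

labels_plain
labels_big
```

Either vector can then be passed as labels, together with the corresponding breaks, in the scale call shown in the reply.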
Re: [R] using nls to fit a curve to data
Thanks Ben! Switching over to the gamma pdf and using algorithm="plinear" did the trick.

jpl
-- View this message in context: http://www.nabble.com/using-nls-to-fit-a-curve-to-data-tp19332210p19360761.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] XML - get node by name
Hi Antje

Well, the XML package gives you a variety of ways to parse an XML document and manipulate it in R. Perhaps the approach that best matches the Java style you outline is to use XPath to access nodes. To do this, you use

    doc = xmlTreeParse("filename.xml", useInternalNodes = TRUE)

and then access the elements of interest with XPath queries, e.g. to get the value of the second val element within each data element, use

    xpathApply(doc, "//data", function(n) xmlValue(n[[2]]))

To get the first val node in the first data you could use

    doc[ "//data/val" ] [[1]]

or

    doc[[ "//data[1]/val[1]" ]]

(Note the indexing/subsetting is being done in different languages.)

Being able to access a node by just its name is convenient, but it may not be adequate. You may pick up too many matching nodes. So XPath is a powerful way to be able to use simplicity when it is adequate and more explicit constraints on the path when more specificity is necessary. And XPath is a widespread standard mechanism for XML rather than specific to R or Java.

HTH,
D.

Antje wrote:

> I try to rewrite some Java-code with R. It deals with reading XML files. I [...]
> Now, I'd like to do something like this in R. Most important would be to retrieve a node just by its name, not by the whole path. How is it possible? Can anybody help me with this issue?
> Antje
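To connect the XPath approach back to the Java calls in the original question, here is a rough sketch of a wrapper with the same global/local behaviour and the same 0-based indices. The function get_value_of_element and its arguments are hypothetical names invented for this example (they are not part of the XML package), and the document text is the sample from the question with the whitespace around the values removed.

```r
library(XML)

xml_text <- '<root>
  <data loc="1"><val i="t1">22</val><val i="t2">45</val></data>
  <data loc="2"><val i="t1">44</val><val i="t2">11</val></data>
</root>'
doc <- xmlParse(xml_text, asText = TRUE)

# Hypothetical helper mimicking the Java method from the question.
# index and parent_index are 0-based, as in the Java calls; XPath is 1-based.
#   parent = NULL    -> "global": the index-th <name> in the whole document
#   parent = "data"  -> "local": the index-th <name> inside the chosen parent
get_value_of_element <- function(doc, name, index, parent = NULL, parent_index = 0) {
  if (is.null(parent)) {
    path <- sprintf("(//%s)[%d]", name, index + 1)
  } else {
    path <- sprintf("(//%s)[%d]/%s[%d]", parent, parent_index + 1, name, index + 1)
  }
  trimws(xpathSApply(doc, path, xmlValue))
}

local_first  <- get_value_of_element(doc, "val", 1, parent = "data", parent_index = 0)
local_second <- get_value_of_element(doc, "val", 1, parent = "data", parent_index = 1)
global_val   <- get_value_of_element(doc, "val", 1)
c(local_first, local_second, global_val)
```

The parentheses in "(//val)[2]" matter: they index into the whole document-order list of val nodes, whereas "//val[2]" would select every val that is the second val child of its own parent.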
[R] an error to call 'gee' function in R
Dear List:

I found an error when I called the 'gee' function. I cannot solve or explain it. There are no errors when I use the 'geeglm' function. Both functions fit the GEE model. The project supervisor recommends that I use the 'gee' function, but I cannot explain to him why this error happens. Would you help me solve this problem? I appreciate your help.

In this project I will use 'gee' or 'geeglm' and 'glmer' to fit the simulated multivariate count responses. I generated the data like this:

Set β0 = β1 = 1, μ0 = 3, and n = 50. For each 1 ≤ i ≤ n:
- Simulate xi from N(1, 1).
- Simulate zi0 and zit, where zi0 follows i.i.d. Poisson(μ0) and zit | xi follows i.i.d. Poisson(μit), 1 ≤ t ≤ 3, with log(μit) = log(E(zit | xi)) = β0t + xi β1t = 1 + xi.
- Let yit = zi0 + zit, 1 ≤ t ≤ 3.

So in my data frame, let me call it 'simdata', the first 10 rows look like this:

    id y.1 y.2 y.3           x
     1   3   5   6 -0.06588626
     2   6   7   6 -0.08265981
     3   6   8  13  0.58307719
     4  22  21  28  2.21099940
     5   5  12   8  1.06299869
     6   8  21  24  1.47615784
     7  11   8   9  0.83748390
     8  16  15  16  1.67011313
     9   9   7   7 -0.14181264
    10  31  37  40  2.56751453

This is longitudinal data. I change its shape to analyze it. The changed 'newdata' looks like this:

    id           x time  y
     1 -0.06588626    1  3
     2 -0.08265981    1  6
     3  0.58307719    1  6
     4  2.21099940    1 22
     5  1.06299869    1  5
     6  1.47615784    1  8
     7  0.83748390    1 11
     8  1.67011313    1 16
     9 -0.14181264    1  9
    10  2.56751453    1 31
    ...
     1 -0.06588626    2  5
     2 -0.08265981    2  7
     3  0.58307719    2  8
     4  2.21099940    2 21
     5  1.06299869    2 12
     6  1.47615784    2 21
     7  0.83748390    2  8
     8  1.67011313    2 15
     9 -0.14181264    2  7
    10  2.56751453    2 37
    ...
     1 -0.06588626    3  6
     2 -0.08265981    3  6
     3  0.58307719    3 13
     4  2.21099940    3 28
     5  1.06299869    3  8
     6  1.47615784    3 24
     7  0.83748390    3  9
     8  1.67011313    3 16
     9 -0.14181264    3  7
    10  2.56751453    3 40
    ...

My data 'y' comes from x, so the observations are not independent. What does the 'corstr' argument of the function mean? I tried all of its choices, but the error was still there.
Here is the call I used in my program, with its output:

    mfit1 <- gee(y~x,data=newdata,family=poisson(link="log"),id=id,corstr="exchangeable")

    GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
    gee S-function, version 4.13 modified 98/01/27 (1998)

    Model:
     Link:                      Logarithm
     Variance to Mean Relation: Poisson
     Correlation Structure:     Exchangeable

    Call:
    gee(formula = y ~ x, id = id, data = newdata, family = poisson(link = "log"),
        corstr = "exchangeable")

    Number of observations : 150
    Maximum cluster size   : 1

    Coefficients:
    (Intercept)           x
      1.5849653   0.7937203

    Estimated Scale Parameter:  1.162505
    Number of Iterations:  1

    Working Correlation[1:4,1:4]
    Error in print(x$working.correlation[1:4, 1:4], digits = digits) :
      subscript out of bounds

Is this kind of data not suitable for the function 'gee'? When I tested the two functions on the R data set 'warpbreaks' they worked perfectly, although some returned objects were different. I used the following:

    1. summary(gee(breaks ~ tension, id=wool, data=warpbreaks, corstr="exchangeable"))
    2. summary(geeglm(breaks ~ tension, id=wool, data=warpbreaks, corstr="exchangeable"))

The first one is from the example in the ?gee help file. I will attach the relevant part of my program as an .R file. You can execute it in R. Thanks a lot. I appreciate your help.

Best regards.
Sincerely,
Cynthia Wu
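One thing that stands out in the output above is "Maximum cluster size : 1" even though every id has three observations. gee() assumes that the rows belonging to one cluster are contiguous, and 'newdata' as shown is stacked by time rather than by id, so each row is treated as its own cluster and the working correlation matrix is 1 x 1. The sketch below is only a guess at the cause based on that output: it re-simulates data following the description in the post (the seed and object names are made up), shows the contiguity problem, and sorts by id. The gee call itself is left as a comment because the gee package may not be installed.

```r
set.seed(123)
n  <- 50
x  <- rnorm(n, mean = 1, sd = 1)          # xi ~ N(1, 1)
z0 <- rpois(n, lambda = 3)                # zi0 ~ Poisson(mu0 = 3)

simdata <- data.frame(id = seq_len(n), x = x)
for (t in 1:3) {
  # yit = zi0 + zit, with zit ~ Poisson(exp(1 + xi))
  simdata[[paste0("y.", t)]] <- z0 + rpois(n, lambda = exp(1 + x))
}

# Wide -> long; reshape() stacks all ids for time 1, then time 2, then time 3,
# which reproduces the layout of 'newdata' in the post.
newdata <- reshape(simdata, direction = "long",
                   varying = paste0("y.", 1:3), v.names = "y",
                   timevar = "time", times = 1:3, idvar = "id")

# Longest run of consecutive rows sharing an id: this is the cluster size
# a routine relying on contiguous ids would see.
run_unsorted <- max(rle(newdata$id)$lengths)

# Sort so that each subject's three rows are adjacent.
sorted <- newdata[order(newdata$id, newdata$time), ]
run_sorted <- max(rle(sorted$id)$lengths)

c(unsorted = run_unsorted, sorted = run_sorted)

# With the gee package installed, the model would then be fitted on 'sorted':
# library(gee)
# mfit1 <- gee(y ~ x, id = id, data = sorted,
#              family = poisson(link = "log"), corstr = "exchangeable")
```

In the warpbreaks example the rows are already grouped by wool, which would explain why that call ran without the error.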
Re: [R] re ferring to a group of vectors without explicit enumeration
Thanks very much. This is what I implemented. I found a similar example elsewhere too. It works fine now. jholtman wrote: I would suggest that you use a list to store the values since it is easier to create and reference: output - list() for (i in 1:10) output[[i]] - seq(i) output [[1]] [1] 1 [[2]] [1] 1 2 [[3]] [1] 1 2 3 [[4]] [1] 1 2 3 4 [[5]] [1] 1 2 3 4 5 [[6]] [1] 1 2 3 4 5 6 [[7]] [1] 1 2 3 4 5 6 7 [[8]] [1] 1 2 3 4 5 6 7 8 [[9]] [1] 1 2 3 4 5 6 7 8 9 [[10]] [1] 1 2 3 4 5 6 7 8 9 10 On Sat, Sep 6, 2008 at 5:33 PM, shalu [EMAIL PROTECTED] wrote: I am trying to define 25 vectors of varying lengths, say y1 to y25 in a loop, and then store the results of some computations in them. My problem is about using some sort of concatenation for names. For example, instead of initializing each of y1 through y25, I would like to do it in a loop. Similar to cat and paste for texts, is there anyway of using yi for the vector name where i ranges from 1 to 25, so ultimately it refers to the vector y1,..,y25? Varying lengths is not a problem. To start with each has only length 1 and then I will be adding to each vector based on some results. -- View this message in context: http://www.nabble.com/referring-to-a-group-of-vectors-without-explicit-enumeration-tp19351518p19351518.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
[R] Regression with nominal data
Hi,

y is nominal (3 categories); x1 to x3 are scale variables. What I want is a regression that shows the probability of falling into each of the three categories of y, given the x values. How can I perform such a regression in R?

Thanks for your help,
Sören
Re: [R] Regression with nominal data
Check:

    help("multinom", package = "nnet")

I hope it helps.

Best,
Dimitris

[EMAIL PROTECTED] wrote:
> Hi, y is nominal (3 categories), x1 to 3 is scale. What I want is a regression, showing the probability to fall in one of the three categories of y according to the x. How can I perform such a regression in R? Thanks for your help Sören

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043399
Fax: +31/(0)10/7044657
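A minimal sketch of what the multinom() suggestion could look like in practice. The data are simulated here, and the variable names (y, x1, x2, x3) and coefficients are assumptions made for illustration; they are not the poster's data.

```r
library(nnet)  # recommended package, normally installed with R

set.seed(42)
n <- 300
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

# Simulate a 3-category nominal outcome (purely illustrative coefficients)
w <- cbind(a = 1, b = exp(0.5 + d$x1), c = exp(-0.5 + d$x2))
prob <- w / rowSums(w)
d$y <- factor(apply(prob, 1, function(p) sample(c("a", "b", "c"), 1, prob = p)))

# Multinomial logistic regression of the nominal response on the predictors
fit <- multinom(y ~ x1 + x2 + x3, data = d, trace = FALSE)

# Fitted probability of each category for every observation
probs <- predict(fit, newdata = d, type = "probs")
head(round(probs, 3))
```

Each row of probs holds the three category probabilities for one observation, which is the quantity the question asks for.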
Re: [R] XML - get node by name
On Sun, Sep 7, 2008 at 12:10 PM, Gabor Grothendieck [EMAIL PROTECTED] wrote:
> In particular try this:
>
>     Lines <- '
>     <root>
>      <data loc="1">
>       <val i="t1">22</val>
>       <val i="t2">45</val>
>      </data>
>      <data loc="2">
>       <val i="t1">44</val>
>       <val i="t2">11</val>
>      </data>
>     </root>
>     '
>     library(XML)
>     doc <- xmlTreeParse(Lines, asText = TRUE, trim = TRUE, useInternalNodes = TRUE)
>     root <- xmlRoot(doc)
>     data1 <- getNodeSet(root, "//data")[[1]]
>     xmlValue(getNodeSet(data1, "//val")[[1]])
>     [1] "22"

The last line should be the following (although in this case it actually gives the same answer):

    xmlValue(getNodeSet(data1, "val")[[1]])

> On Sun, Sep 7, 2008 at 11:42 AM, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:
> > On 7 September 2008 at 10:22, Antje wrote:
> > | I try to rewrite some Java-code with R. It deals with reading XML files. I [...]
> > | Now, I'd like to do something like this in R. Most important would be to
> > | retrieve a node just by its name, not by the whole path. How is it possible?
> > |
> > | Can anybody help me with this issue?
> >
> > Have you looked at the XML package for R ?
> >
> > Dirk
> > --
> > Three out of two people have difficulties with fractions.
Re: [R] XML - get node by name
Thanks a lot to Gabor and Duncan! I didn't know that XPath is a standard. I'll give it a deeper look to better understand it.

Oh, I guess I understand a bit more:

    xpathApply(doc, "//val", function(n) xmlValue(n))

would search globally for all nodes named "val" and return their values :-) So that's exactly what I was looking for: not caring about the exact location of a node. I think in my case it should be okay to search for nodes just by their names.

Thanks again!

@ Ajay: Sorry, but I was looking for a solution with R.
@ Dirk: I already used the XML package but didn't know the possibilities to access data as I was used to.

Antje wrote:
> Hi there,
>
> I try to rewrite some Java-code with R. It deals with reading XML files. I started with the XML package. In Java, I had a very useful method which gave me a node by using:
>
> - name of the node
> - index of appearance
> - start point: global (false) / local (true)
>
> So, I could do something like this:
>
>     setCurrentChildNode("data", 0);
>     getValueOfElement("val", 1, true);   --> gives 45
>     setCurrentChildNode("data", 1);
>     getValueOfElement("val", 1, true);   --> gives 11
>     getValueOfElement("val", 1, false);  --> gives 45
>
>     <root>
>      <data loc="1">
>       <val i="t1">22</val>
>       <val i="t2">45</val>
>      </data>
>      <data loc="2">
>       <val i="t1">44</val>
>       <val i="t2">11</val>
>      </data>
>     </root>
>
> Now, I'd like to do something like this in R. Most important would be to retrieve a node just by its name, not by the whole path. How is it possible?
>
> Can anybody help me with this issue?
>
> Antje
[R] Averaging 'blocks' of data
Dear all,

I have a large dataset which I hope to reduce in size, to make it more useable. I hope to do this by taking an average of each 60 x 60 block of values and forming a new data frame out of the averaged values.

How would I go about taking averages of 60 x 60 'blocks' in R, and cycling through the whole dataset, recording each calculated value in a new table/data frame?

Many thanks for any advice offered.

Steve
Re: [R] XML - get node by name
On Sun, Sep 7, 2008 at 2:56 PM, Antje [EMAIL PROTECTED] wrote:
> Thanks a lot to Gabor and Duncan! I didn't know that XPath is a standard. I'll give it a deeper look to better understand it.
>
> Oh, I guess I understand a bit more:
>
>     xpathApply(doc, "//val", function(n) xmlValue(n))

or just

    xpathApply(doc, "//val", xmlValue)

> would search globally for all nodes named "val" and return their values :-) So that's exactly what I was looking for: not caring about the exact location of a node. I think in my case it should be okay to search for nodes just by their names.
>
> Thanks again!
>
> @ Ajay: Sorry, but I was looking for a solution with R.
> @ Dirk: I already used the XML package but didn't know the possibilities to access data as I was used to.
Re: [R] Averaging 'blocks' of data
This was answered last month:

http://tolstoy.newcastle.edu.au/R/e4/help/08/08/19091.html

On Sun, Sep 7, 2008 at 3:32 PM, Steve Murray [EMAIL PROTECTED] wrote:
> Dear all, I have a large dataset which I hope to reduce in size, to make it more useable. I hope to do this by taking an average of each 60 x 60 block of values and forming a new data frame out of the averaged values. How would I go about taking averages of 60 x 60 'blocks' in R, and cycling through the whole dataset, recording each calculated value in a new table/data frame? Many thanks for any advice offered. Steve
Re: [R] Averaging 'blocks' of data
On Sun, Sep 7, 2008 at 12:32 PM, Steve Murray [EMAIL PROTECTED] wrote:
> Dear all,
>
> I have a large dataset which I hope to reduce in size, to make it more useable. I hope to do this by taking an average of each 60 x 60 block of values and forming a new data frame out of the averaged values.

What does the data look like? vector / matrix / list?

> How would I go about taking averages of 60 x 60 'blocks' in R, and cycling through the whole dataset, recording each calculated value in a new table/data frame?

Some form of apply(), tapply(), mapply(), or lapply() would probably do what you want.

> Many thanks for any advice offered.
>
> Steve

Here is a start:

    # step 1. too much data: 10x10 matrix
    m <- matrix(runif(100), ncol=10)

    # step 2. reduce down to a 10x1 vector, averaging-by-row:
    apply(m, 1, mean)

    # step 3. profit.

Dylan
Re: [R] Averaging 'blocks' of data
Gabor - thanks for your suggestion... I had checked the previous post, but I found (as a new user of R) this approach to be too complicated and I had problems gaining the correct output values. If there is a simpler way of doing this, then please feel free to let me know.

Dylan - thanks, your approach is a good start. In answer to your questions, my data are 43200 columns and 16800 rows as a data frame - I will probably have to read the dataset in segments though, as it won't fit into the memory! I've been able to follow your example - how would I be able to apply this technique for finding the average of each 60 x 60 block?

Any other suggestions are of course welcome!

Many thanks again,
Steve
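For data that do fit in memory as a numeric matrix, the row-averaging idea can be extended to square blocks by grouping on block row and block column. The following is a minimal sketch; the helper name block_mean and the small 6 x 6 example are illustrative assumptions, not part of the thread.

```r
# Mean of every size x size block of a numeric matrix.
# Assumes both dimensions are exact multiples of 'size'.
block_mean <- function(m, size) {
  stopifnot(nrow(m) %% size == 0, ncol(m) %% size == 0)
  rgrp <- (row(m) - 1) %/% size   # block row index of every cell
  cgrp <- (col(m) - 1) %/% size   # block column index of every cell
  tapply(m, list(rgrp, cgrp), mean)
}

# Small demonstration: 6 x 6 matrix reduced to 2 x 2 using 3 x 3 blocks
m <- matrix(1:36, nrow = 6)
res <- block_mean(m, 3)
res
```

For the 16800 x 43200 case the same function would be called with size = 60 on each segment that is read in, provided each segment holds a whole number of 60-row bands.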
[R] how to draw a vertical line from points to x-axis
Hello,

I want to know how to draw a line connecting each point to the x-axis perpendicularly (i.e. a vertical line). abline(v=...) seems not to work for my purpose, because it runs over the data point. Can anyone help? Thanks.

Anny
[R] creating a vignette
Dear R Gurus:

How would I create a vignette, please?
Why would a vignette be better than examples, please?

Thanks,
Edna Bell
Re: [R] restructuring datset problem
This should do it for you:

    x
    #   CODE NAME
    # 1    3  aaa
    # 2    3  aab
    # 3    3  aac
    # 4    4  bba
    # 5    4  bbb
    # 6    4  bbc
    # 7    4  bbd
    # 8    5  cca
    # 9    5  ccb
    x.s <- split(x$NAME, x$CODE)
    maxLine <- max(table(x$CODE))
    # pad out the lines
    x.pad <- lapply(x.s, function(line){
        # convert to character
        line <- as.character(line)
        length(line) <- maxLine
        line
    })
    as.data.frame(do.call(rbind, x.pad))
    #    V1  V2   V3   V4
    # 3 aaa aab  aac <NA>
    # 4 bba bbb  bbc  bbd
    # 5 cca ccb <NA> <NA>

On Sun, Sep 7, 2008 at 2:23 PM, Gellrich Mario [EMAIL PROTECTED] wrote:
> Hi,
>
> I've got a question regarding the restructuring of a data set. What I have are municipality zip codes and the names of 5'000 built-up areas within municipalities. The following example shows what I would like to do:
>
> Input (zip codes and names):
>
>     #   CODE NAME
>     # 1    3  aaa
>     # 2    3  aab
>     # 3    3  aac
>     # 4    4  bba
>     # 5    4  bbb
>     # 6    4  bbc
>     # 7    4  bbd
>     # 8    5  cca
>     # 9    5  ccb
>
> Desired output (zip codes and restructured names):
>
>     #   CODE  V2  V3  V4  V5
>     # 1    3 aaa aab aac  NA
>     # 2    4 bba bbb bbc bbd
>     # 3    5 cca ccb  NA  NA
>
> I thought about this problem for several hours and tried functions like aggregate() and t() in combination with for-loops, but didn't arrive at the output above. Can anybody help me?
>
> Best regards,
> Mario

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
Re: [R] how to draw a vertical line from points to x-axis
Anny

Here's one way:

    plot(0:10, 0:10, pch=16)
    lines(rep(0:10, each=3), t(matrix(c(0:10, rep(c(0,NA), each=11)), ncol=3)))

HTH

Peter Alspach

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Anny Huang
> Sent: Monday, 8 September 2008 8:49 a.m.
> To: r-help@r-project.org
> Subject: [R] how to draw a vertical line from points to x-axis
>
> Hello, I want to know how to draw a line connecting each point to the x-axis perpendicularly (i.e. a vertical line). abline(v=...) seems not to work for my purpose, because it runs over the data point. Can anyone help? Thanks. Anny
Re: [R] Averaging 'blocks' of data
Here is a way to do it by reading in 60 lines at a time and computing the means:

    # create some test data
    n <- 360
    x <- matrix(runif(360*16800), nrow=16800)
    cat(x, file="/tempxx.txt")

    # now process the data 60 lines at a time, averaging each 60x60 block
    result <- matrix(0, nrow=6, ncol=280)
    nextLine <- 1  # next output in the result
    # create a list of indices to use to partition the input matrix
    colIndex <- split(seq(16800), (seq(16800) - 1) %/% 60)
    input <- file("/tempxx.txt", "r")
    while (TRUE){
        # use 'scan' to read in 60 lines at a time
        block <- scan(input, what=0, n=60*16800)
        if (length(block) != 60 * 16800) break  # exit if done
        # convert to a matrix
        block <- matrix(block, nrow=60, byrow=TRUE)
        # compute the mean and store it
        result[nextLine,] <- sapply(colIndex, function(.blk){
            mean(block[, .blk])
        })
        nextLine <- nextLine + 1
    }
    close(input)  # release the file connection when done

On Sun, Sep 7, 2008 at 4:46 PM, Steve Murray [EMAIL PROTECTED] wrote:
> Gabor - thanks for your suggestion... I had checked the previous post, but I found (as a new user of R) this approach to be too complicated and I had problems gaining the correct output values. If there is a simpler way of doing this, then please feel free to let me know.
>
> Dylan - thanks, your approach is a good start. In answer to your questions, my data are 43200 columns and 16800 rows as a data frame - I will probably have to read the dataset in segments though, as it won't fit into the memory! I've been able to follow your example - how would I be able to apply this technique for finding the average of each 60 x 60 block?
>
> Any other suggestions are of course welcome!
>
> Many thanks again,
> Steve

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
[R] R_USER - in which file should I include it?
Hello,

I am a newbie. I upgraded R from 2.7.1 to 2.7.2, and in doing so I decided to install all 2.7 versions under c:\program files\R\2.7 from now on (2.7.1 is located under .\2.7.1).

Although I don't like the idea (I am running Vista), I have edited etc\Renviron.site to contain:

    R_USER=c:/Users/eduardo/Documents/R
    R_LIBS_USER=c:/Users/eduardo/Documents/R/win-library/2.7

This did not make R always start from the same location, that is, c:/Users/eduardo/Documents/R. So I wonder whether someone from the list could help me to:

a) force R to always start from the same location
b) force R to install all new packages in the same location

Many thanks,
Ed

PS. Before sending this email, I read the Windows FAQ and browsed the archives (too many posts on the subject!).
Re: [R] how to draw a vertical line from points to x-axis
2008/9/7 Anny Huang [EMAIL PROTECTED]:
> Hello, I want to know how to draw a line connecting each point to the x-axis perpendicularly (i.e. a vertical line). abline(v=...) seems not to work for my purpose, because it runs over the data point. Can anyone help? Thanks.

If your x-axis is at y = zero, then plot with type='h' will do this:

    plot(1:10, runif(10), type='h', ylim=c(0,1))

but it will draw lines *up* if the value is negative:

    plot(1:10, (1:10)-5, type='h')

Or do you really want the lines to come right down to the axis line? In that case a modified version of Peter Alspach's solution, which goes down to the limit of the plot instead of zero, should work. See help(par) for what par()$usr is all about.

    y = 6+0:10
    x = 0:10
    plot(x, y, pch=16, ylim=c(-2,17))
    lines(rep(x, each=3), t(matrix(c(y, rep(c(par()$usr[3], NA), each=11)), ncol=3)))

Barry
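The same drop lines can also be drawn with segments(), which takes the start and end coordinates of each line directly. This is a sketch of an alternative to the lines()/matrix construction above, reusing the x and y values from that example; the variable name bottom is illustrative.

```r
x <- 0:10
y <- 6 + 0:10

plot(x, y, pch = 16, ylim = c(-2, 17))

# par("usr")[3] is the lower y limit of the plotting region,
# i.e. where the x-axis line is drawn
bottom <- par("usr")[3]

# One vertical segment per point, from the axis line up to the point
segments(x0 = x, y0 = bottom, x1 = x, y1 = y)
```

Because the segment ends are given explicitly, no NA separators are needed and the lines stop exactly at each data point.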
Re: [R] restructuring datset problem
Try this:

    # read in data ensuring NAME is character, not factor
    Lines <- "CODE NAME
    1 3 aaa
    2 3 aab
    3 3 aac
    4 4 bba
    5 4 bbb
    6 4 bbc
    7 4 bbd
    8 5 cca
    9 5 ccb
    "
    DF <- read.table(textConnection(Lines), header = TRUE, as.is = TRUE)
    DF$seq = ave(DF$CODE, DF$CODE, FUN = seq_along)
    tapply(DF$NAME, DF[c("CODE", "seq")], c)
    #     seq
    # CODE 1     2     3     4
    #    3 "aaa" "aab" "aac" NA
    #    4 "bba" "bbb" "bbc" "bbd"
    #    5 "cca" "ccb" NA    NA
[R] run optim() on a list
Hi,

I am at my wits' end trying to figure out how to run the optim function on a list. Basically, I have a data set of three columns: Site, Pool and Positivity (the full data set is copied at the end). I want to run the maximum likelihood estimation separately on subsets split by Site.

    data <- read.table(...)
    sp <- split(data, data$Site)

    # My likelihood function is
    like <- function(p, ...){
      for(i in 1:length(Pool)){
        if(Positivity[i]==1) log.l[i] <- log(1-(1-p)^Pool[i])
        else log.l[i] <- Pool[i]*log(1-p)
      }
      return(sum(log.l))
    }

    # Then I run
    lapply(sp, function(x) optim(0.1, like, control=list(fnscale=-1)))

    # But it gives an estimation based on the full data, not separately on sp[[1]], sp[[2]], ...

I tried do.call without success. So, your help would be appreciated.

Weidong Gu
Department of Medicine
University of Alabama, Birmingham
1900 University Blvd., Birmingham, Alabama 35294
Email: [EMAIL PROTECTED]
PH: (205)-975-9053

    Site     Pool Positivity
    UBA_1      22 0
    UBA_1      50 0
    UBA_1      23 0
    UBA_1      25 0
    UBA_1      35 0
    UBA_1      24 0
    UBA_1      26 0
    Cham_res   43 0
    Cham_res   45 0
    Cham_res   34 0
    Cham_res   24 0
    Cham_res   21 0
    Cham_res   16 0
    Cham_res   28 0
    Cham_res   50 0
    Cham_res   50 1
    Cham_res   39 1
    UBA_2      16 0
    UBA_2      18 1
    UBA_2      42 1
    UBA_2      35 1
    UBA_2      50 1
    UBA_2      26 0
    UBA_2      20 0
    UBA_2      16 0
    UBA_2      19 0
    UBA_2      50 0
    UBA_2      26 0
    UBA_2      13 1
    UBA_2      30 1
    UBA_3      17 0
    UBA_3      20 0
    UBA_3      19 0
    UBA_3      50 0
    UBA_3      24 1
    UBA_3      18 1
    UBA_3      16 1
    UBA_3      14 0
    UBA_3      12 0
    UBA_3      15 0
    UBA_3      11 0
    UBA_3      20 1
    UBA_3      19 1
    UBA_3      31 1
    UBA_4      12 0
    UBA_4      11 0
    UBA_4      12 0
    UBA_4      21 0
    UBA_4      33 0
    UBA_4      15 0
    UBA_4      10 0
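One likely reason for the behaviour described above is that like() never uses the subset handed to it by lapply(): Pool and Positivity are looked up outside the function, so every call would see the same full data. A minimal sketch of one way around this is to pass the subset explicitly. The function name loglik, the use of optimize() instead of optim() for this one-parameter problem, the search interval, and the reduced data set (a few rows copied from three of the sites) are all choices made here for illustration.

```r
# A few rows taken from the posted data, three sites only
dat <- data.frame(
  Site       = rep(c("UBA_1", "Cham_res", "UBA_2"), times = c(3, 4, 4)),
  Pool       = c(22, 50, 23,   50, 50, 39, 24,   16, 18, 42, 35),
  Positivity = c(0, 0, 0,      0, 1, 1, 0,       0, 1, 1, 1)
)

# Log-likelihood that takes the data subset 'd' as an explicit argument
loglik <- function(p, d) {
  sum(ifelse(d$Positivity == 1,
             log(1 - (1 - p)^d$Pool),   # pool tested positive
             d$Pool * log(1 - p)))      # pool tested negative
}

# Maximise separately for each site; 'd' is forwarded to loglik()
fits <- lapply(split(dat, dat$Site), function(d)
  optimize(loglik, interval = c(1e-6, 0.5), d = d, maximum = TRUE))

# One estimate of p per site
est <- sapply(fits, `[[`, "maximum")
est
```

A site with no positive pools has its maximum at the lower end of the interval, so its estimate comes out essentially zero. With optim() the same effect could be obtained by passing the subset through its "..." argument.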
[R] Request for advice on character set conversions (those damn Excel files, again ...)
Dear list,

I have to read a not-so-small bunch of not-so-small Excel files, which seem to have traversed Windows 3.1, Windows 95 and Windows NT versions of the thing (with maybe a Mac or two thrown in for good measure...). The problem is that 1) I need to read strings, and 2) those strings may have various encodings. In the same sheet of the same file, some cells may be latin1, some UTF-8 and some CP437 (!).

read.xls() allows me to read those things in sets of data frames. My problem is to convert the encodings to UTF-8 without clobbering those that are already (looking like) UTF-8. I came to the following solution:

    foo <- function(d, from="latin1", to="UTF-8"){
      # Semi-smart conversion of a dataframe between charsets.
      # Needed to ease use of those [EMAIL PROTECTED] Excel files
      # that have survived the Win3.1 -- Win95 -- NT transition,
      # usually in poor shape..
      conv1 <- function(v,from,to) {
        condconv <- function(v,from,to) {
          cnv <- is.na(iconv(v,to,to))
          v[cnv] <- iconv(v[cnv],from,to)
          return(v)
        }
        if (is.factor(v)) {
          l <- condconv(levels(v),from,to)
          levels(v) <- l
          return(v)
        }
        else if (is.character(v)) return(condconv(v,from,to))
        else return(v)
      }
      for(i in names(d)) d[,i] <- conv1(d[,i],from,to)
      return(d)
    }

Any advice for enhancement is welcome...

Sincerely yours,
Emmanuel Charpentier
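The core of the function above is the conditional step: convert only those strings that are not already valid in the target encoding. A minimal sketch of that step for a character vector, assuming a newer R than the one used in the thread (validUTF8() is in base R from version 3.3.0) and assuming the non-UTF-8 strings are latin1; the helper name to_utf8 and the sample strings are illustrative.

```r
# Convert only the elements that are not already valid UTF-8
to_utf8 <- function(v, from = "latin1") {
  bad <- !is.na(v) & !validUTF8(v)
  v[bad] <- iconv(v[bad], from = from, to = "UTF-8")
  v
}

# "caf\xe9" holds the single latin1 byte for e-acute, which is not valid UTF-8
x <- c("plain", "caf\xe9")
res <- to_utf8(x)
validUTF8(res)
```

As with the original function, this cannot tell latin1 from CP437: any byte sequence that fails the UTF-8 check is treated as coming from the one encoding named in from.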
Re: [R] an error to call 'gee' function in R
From the gee() help page:

    Data are assumed to be sorted so that observations on a cluster are contiguous rows for all entities in the formula.

-thomas

On Sun, 7 Sep 2008, Qinglin Wu wrote:
> Dear List:
>
> I found an error when I called the 'gee' function. I cannot solve or explain it. There are no errors when I use the 'geeglm' function. Both functions fit the GEE model. The project supervisor recommends that I use the 'gee' function, but I cannot explain to him why this error happens. Would you help me solve this problem? I appreciate your help.
>
> In this project I will use 'gee' or 'geeglm' and 'glmer' to fit the simulated multivariate count responses. I generated the data like this:
>
> Set $\beta_0 = \beta_1 = 1$, $\mu_0 = 3$, and $n = 50$. For each $1 \le i \le n$:
> Simulate $x_i$ from $N(1, 1)$.
> Simulate $z_{i0}$ and $z_{it}$ independently from $z_{i0} \sim \mathrm{Poisson}(\mu_0)$ and $z_{it} \mid x_i \sim \mathrm{Poisson}(\mu_{it})$, $1 \le t \le 3$, where $\log(\mu_{it}) = \log(E(z_{it} \mid x_i)) = \beta_{0t} + x_i \beta_{1t} = 1 + x_i$.
> Let $y_{it} = z_{i0} + z_{it}$, $1 \le t \le 3$.
>
> So the first 10 rows of my data frame, let me call it 'simdata', look like this:
>
>     id y.1 y.2 y.3           x
>      1   3   5   6 -0.06588626
>      2   6   7   6 -0.08265981
>      3   6   8  13  0.58307719
>      4  22  21  28  2.21099940
>      5   5  12   8  1.06299869
>      6   8  21  24  1.47615784
>      7  11   8   9  0.83748390
>      8  16  15  16  1.67011313
>      9   9   7   7 -0.14181264
>     10  31  37  40  2.56751453
>
> This is longitudinal data. I will change its shape to analyze it. The reshaped 'newdata' looks like this:
>
>     id           x time  y
>      1 -0.06588626    1  3
>      2 -0.08265981    1  6
>      3  0.58307719    1  6
>      4  2.21099940    1 22
>      5  1.06299869    1  5
>      6  1.47615784    1  8
>      7  0.83748390    1 11
>      8  1.67011313    1 16
>      9 -0.14181264    1  9
>     10  2.56751453    1 31
>     ...
>      1 -0.06588626    2  5
>      2 -0.08265981    2  7
>      3  0.58307719    2  8
>      4  2.21099940    2 21
>      5  1.06299869    2 12
>      6  1.47615784    2 21
>      7  0.83748390    2  8
>      8  1.67011313    2 15
>      9 -0.14181264    2  7
>     10  2.56751453    2 37
>     ...
>      1 -0.06588626    3  6
>      2 -0.08265981    3  6
>      3  0.58307719    3 13
>      4  2.21099940    3 28
>      5  1.06299869    3  8
>      6  1.47615784    3 24
>      7  0.83748390    3  9
>      8  1.67011313    3 16
>      9 -0.14181264    3  7
>     10  2.56751453    3 40
>     ...
>
> My data 'y' comes from x, so the observations are not independent. What does the argument 'corstr' mean as defined in the function? I tried all choices, but the error was still there. Here is the call I used in my program:
>
>     mfit1 <- gee(y ~ x, data = newdata, family = poisson(link = "log"), id = id, corstr = "exchangeable")
>
>     GEE: GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
>     gee S-function, version 4.13 modified 98/01/27 (1998)
>
>     Model:
>      Link:                      Logarithm
>      Variance to Mean Relation: Poisson
>      Correlation Structure:     Exchangeable
>
>     Call:
>     gee(formula = y ~ x, id = id, data = newdata, family = poisson(link = "log"), corstr = "exchangeable")
>
>     Number of observations : 150
>     Maximum cluster size   : 1
>
>     Coefficients:
>     (Intercept)           x
>       1.5849653   0.7937203
>
>     Estimated Scale Parameter: 1.162505
>     Number of Iterations: 1
>
>     Working Correlation[1:4,1:4]
>     Error in print(x$working.correlation[1:4, 1:4], digits = digits) :
>       subscript out of bounds
>
> Is this kind of data not suitable for the function 'gee'? When I tested the two functions on the R data set 'warpbreaks', they worked perfectly, although some of the returned objects were different. I used the following:
>
>     1. summary(gee(breaks ~ tension, id = wool, data = warpbreaks, corstr = "exchangeable"))
>     2. summary(geeglm(breaks ~ tension, id = wool, data = warpbreaks, corstr = "exchangeable"))
>
> The first one is from the example in ?gee. I will attach part of my program as an .R file, which you can execute in R. Thanks a lot. I appreciate your help.
>
> Best regards.
> Sincerely,
> Cynthia Wu

Thomas Lumley
Assoc. Professor, Biostatistics
[EMAIL PROTECTED]
University of Washington, Seattle
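The sorting requirement quoted from the help page can be checked before fitting. The sketch below rebuilds the long format from the first three rows of the posted 'simdata' (x rounded) using base reshape(), shows that the stacked result is ordered by time rather than by id, and then sorts it so that each cluster occupies contiguous rows. The helper is_contiguous is hypothetical, and the gee() call is shown only as a comment because it needs the gee package.

```r
# First three subjects from the posted 'simdata' (x rounded for brevity)
simdata <- data.frame(id  = 1:3,
                      y.1 = c(3, 6, 6),
                      y.2 = c(5, 7, 8),
                      y.3 = c(6, 6, 13),
                      x   = c(-0.066, -0.083, 0.583))

# Wide to long: one row per subject and time point, stacked by time
long <- reshape(simdata, direction = "long",
                varying = c("y.1", "y.2", "y.3"), v.names = "y",
                timevar = "time", times = 1:3, idvar = "id")

# TRUE when every id occupies one unbroken run of rows
is_contiguous <- function(id) !any(duplicated(rle(as.vector(id))$values))

before <- is_contiguous(long$id)          # stacked by time: not contiguous

# Sort so that all rows of a subject sit together
sorted <- long[order(long$id, long$time), ]
after <- is_contiguous(sorted$id)

c(before = before, after = after)

# With the gee package installed, the fit would then use the sorted data, e.g.
# gee::gee(y ~ x, id = id, data = sorted, family = poisson, corstr = "exchangeable")
```

A "Maximum cluster size : 1" line in the gee output, as in the message above, is consistent with unsorted rows being treated as separate clusters. Separately, the error message itself shows the print step indexing working.correlation[1:4, 1:4], which presumably cannot succeed when the largest cluster has fewer than four observations; inspecting the working.correlation component of the fitted object directly would sidestep that print step.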
Re: [R] creating a vignette
On 07/09/2008 4:51 PM, Edna Bell wrote:
> Dear R Gurus:
>
> How would I create a vignette, please?
> Why would a vignette be better than examples, please?

You can create a vignette any way you like, but the most common way is with Sweave: write a document that is mainly LaTeX, but with R code inclusions that are executed and displayed.

The main value of a vignette is that it documents more: instead of documenting one or a few functions, it can document a whole package, or a whole type of analysis using several different packages.

Duncan Murdoch
[R] Help parametric boot
Hi R users,

Is there any example of a nonlinear parametric bootstrap? I googled it but I can't find one. I am interested in the parameter estimators of a nonlinear model, but I really don't know how to code the ran.gen function (data set from ?nls).

    fm1 <- nls(weight ~ Asym/(1+exp((xmid-Time)/scal)), data = ChickWeight,
               start=c(Asym=337, xmid=16, scal=8))
    fm1.fun <- function(data){coef(update(fm1,data=data))}
    ran.sim <- function(data,mle){out <- rnorm(n=nrow(data,mle));out}
    fm1.boot <- boot(ChickWeight, statistic = fm1.fun, R=99, sim="parametric",
                     ran.gen=ran.sim, mle=coef(fm1))
    Error in nrow(data, mle) :
      unused argument(s) (c(337.605336871528, 16.0688379710354, 8.00747460385483))

Any suggestion would be very helpful. Many thanks in advance.

Chunhao
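The error comes from the misplaced parenthesis in nrow(data,mle), but ran.gen also needs to return a whole simulated data set rather than a bare vector of random numbers. A minimal sketch of one way to set this up, assuming independent normal errors with constant variance around the fitted curve (an assumption made here, not something stated in the post). The names cw, fm1.mle and the refit inside fm1.fun are illustrative choices.

```r
library(boot)  # recommended package, normally installed with R

cw <- as.data.frame(ChickWeight)   # plain data frame copy of the data

# Logistic growth curve, as in the post
fm1 <- nls(weight ~ Asym / (1 + exp((xmid - Time) / scal)),
           data = cw, start = c(Asym = 337, xmid = 16, scal = 8))

# Statistic: refit the model to a data set and return the coefficients
fm1.fun <- function(data) {
  coef(nls(weight ~ Asym / (1 + exp((xmid - Time) / scal)),
           data = data, start = coef(fm1)))
}

# Everything the simulator needs: fitted means and residual standard error
fm1.mle <- list(fitted = as.numeric(fitted(fm1)), sigma = summary(fm1)$sigma)

# Simulate a new response from the fitted model (assumes normal errors)
ran.sim <- function(data, mle) {
  out <- data
  out$weight <- mle$fitted + rnorm(nrow(data), mean = 0, sd = mle$sigma)
  out
}

set.seed(1)
fm1.boot <- boot(cw, statistic = fm1.fun, R = 99, sim = "parametric",
                 ran.gen = ran.sim, mle = fm1.mle)

# Bootstrap standard errors of Asym, xmid and scal
apply(fm1.boot$t, 2, sd)
```

The normal-error simulator is only one option; a different error model would only require changing ran.sim and the contents of fm1.mle.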
Re: [R] an error to call 'gee' function in R
Dear Thomas Lumley: According to your suggestion I have sorted the 'newdata' by id. When I print the results the error was still there. So I tried to sort the 'newdata' by y, x although I don't think it make sense. The error was still there. Here it was the part of the code: data1=newdata[order(newdata$id),] print(gee model:) mfit1 - gee(y~x,data=data1,family=poisson(link=log),id=id,corstr=exchangeable) print(mfit1) the error is still in the results: GEE: GENERALIZED LINEAR MODELS FOR DEPENDENT DATA gee S-function, version 4.13 modified 98/01/27 (1998) Model: Link: Logarithm Variance to Mean Relation: Poisson Correlation Structure: Exchangeable Call: gee(formula = y ~ x, id = id, data = data1, family = poisson(link = log), corstr = exchangeable) Number of observations : 150 Maximum cluster size : 3 Coefficients: (Intercept) x 1.6389 0.7619 Estimated Scale Parameter: 1.012 Number of Iterations: 1 Working Correlation[1:4,1:4] Error in print(x$working.correlation[1:4, 1:4], digits = digits) : subscript out of bounds I don't know why it is. Thanks for your help. I still attach my part of the programming. This programming you can run in R. Thanks again. Best regards. Sincerely, Cynthia Wu --- On Sun, 9/7/08, Thomas Lumley [EMAIL PROTECTED] wrote: From: Thomas Lumley [EMAIL PROTECTED] Subject: Re: [R] an error to call 'gee' function in R To: Qinglin Wu [EMAIL PROTECTED] Cc: r-help@r-project.org Date: Sunday, September 7, 2008, 6:41 PM From the gee() help page: Data are assumed to be sorted so that observations on a cluster are contiguous rows for all entities in the formula. -thomas On Sun, 7 Sep 2008, Qinglin Wu wrote: Dear List: I found an error when I called the 'gee' function. I cannot solve and explain it. There are no errors when I used the 'geeglm' function. Both functions fit the gee model. The project supervisor recommends me to use the 'gee' function. But I cannot explain to him why this error happens. Would you help me solve this problem? 
I appreciate your help. In this project I will use 'gee' or 'geeglm', and 'glmer', to fit simulated multivariate count responses. I generated the data like this:

Set β0 = β1 = 1, μ0 = 3, and n = 50.
For each 1 ≤ i ≤ n, simulate xi from N(1, 1).
Simulate zi0 and zit, where zi0 follows i.i.d. Poisson(μ0) and zit | xi follows i.i.d. Poisson(μit), 1 ≤ t ≤ 3, with log(μit) = log(E(zit | xi)) = β0t + xi β1t = 1 + xi.
Let yit = zi0 + zit, 1 ≤ t ≤ 3.

The first 10 rows of my data frame, which I call 'simdata', look like this:

id  y.1  y.2  y.3            x
 1    3    5    6  -0.06588626
 2    6    7    6  -0.08265981
 3    6    8   13   0.58307719
 4   22   21   28   2.21099940
 5    5   12    8   1.06299869
 6    8   21   24   1.47615784
 7   11    8    9   0.83748390
 8   16   15   16   1.67011313
 9    9    7    7  -0.14181264
10   31   37   40   2.56751453

These are longitudinal data, so I reshape them before the analysis. The reshaped 'newdata' looks like this:

id            x  time   y
 1  -0.06588626     1   3
 2  -0.08265981     1   6
 3   0.58307719     1   6
 4   2.21099940     1  22
 5   1.06299869     1   5
 6   1.47615784     1   8
 7   0.83748390     1  11
 8   1.67011313     1  16
 9  -0.14181264     1   9
10   2.56751453     1  31
...
 1  -0.06588626     2   5
 2  -0.08265981     2   7
 3   0.58307719     2   8
 4   2.21099940     2  21
 5   1.06299869     2  12
 6   1.47615784     2  21
 7   0.83748390     2   8
 8   1.67011313     2  15
 9  -0.14181264     2   7
10   2.56751453     2  37
...
 1  -0.06588626     3   6
 2  -0.08265981     3   6
 3   0.58307719     3  13
 4   2.21099940     3  28
 5   1.06299869     3   8
 6   1.47615784     3  24
 7   0.83748390     3   9
 8   1.67011313     3  16
 9  -0.14181264     3   7
10   2.56751453     3  40
...

My data 'y' are generated from x, so the observations are not independent. What does the argument 'corstr' mean in the function? I tried all of its choices, but the error was still there.
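A minimal sketch of the simulation and reshaping described above, using base R only. The seed, the use of reshape(), and the secondary sort on time are assumptions made for illustration; they are not taken from the poster's program.

```r
set.seed(1)                      # assumed seed, only so the sketch is reproducible
n     <- 50
mu0   <- 3
beta0 <- 1
beta1 <- 1

x  <- rnorm(n, mean = 1, sd = 1)             # x_i ~ N(1, 1)
z0 <- rpois(n, mu0)                          # shared component z_i0, one per subject
mu <- exp(beta0 + beta1 * x)                 # log(mu_it) = 1 + x_i for t = 1, 2, 3
z  <- matrix(rpois(n * 3, rep(mu, 3)), nrow = n)   # z[i, t], filled column by column
y  <- z0 + z                                 # z0 is recycled down each column: y_it = z_i0 + z_it

# Wide format, one row per subject
simdata <- data.frame(id = 1:n, y.1 = y[, 1], y.2 = y[, 2], y.3 = y[, 3], x = x)

# Long format, one row per subject and time point
newdata <- reshape(simdata, direction = "long",
                   varying = c("y.1", "y.2", "y.3"), v.names = "y",
                   timevar = "time", times = 1:3, idvar = "id")

# reshape() stacks the data time by time, so rows of one id are not adjacent.
# Sorting by id makes each cluster a block of contiguous rows.
data1 <- newdata[order(newdata$id, newdata$time), ]
head(data1)
```

After the sort, each id occupies three consecutive rows, which is the layout the quoted gee() help text asks for.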
Here is the call I used in my program:

mfit1 <- gee(y ~ x, data = newdata, family = poisson(link = log), id = id, corstr = "exchangeable")

GEE: GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
gee S-function, version 4.13 modified 98/01/27 (1998)

Model:
 Link: Logarithm
 Variance to Mean Relation: Poisson
 Correlation Structure: Exchangeable

Call:
gee(formula = y ~ x, id = id, data = newdata, family = poisson(link = log), corstr = "exchangeable")

Number of observations : 150
Maximum cluster size : 1

Coefficients:
(Intercept)           x
  1.5849653   0.7937203

Estimated Scale Parameter: 1.162505
Number of Iterations: 1

Working Correlation[1:4,1:4]
Error in
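The error text in this thread refers to x$working.correlation[1:4, 1:4]. One possible reading, offered as an assumption rather than a confirmed diagnosis, is that a 4 x 4 block is requested from a working correlation matrix that has fewer than four rows when the maximum cluster size is below 4. The sketch below reproduces that kind of failure with a plain base-R matrix; `alpha` and `R` are hypothetical stand-ins, not objects produced by the gee package.

```r
# Hypothetical 3 x 3 exchangeable correlation matrix (alpha is a made-up value)
alpha <- 0.5
R <- matrix(alpha, nrow = 3, ncol = 3)
diag(R) <- 1

# Asking for rows and columns 1:4 of a 3 x 3 matrix fails
msg <- tryCatch(R[1:4, 1:4], error = function(e) conditionMessage(e))
msg

# Indexing no further than the matrix actually extends avoids the error
k <- min(4, nrow(R))
R[1:k, 1:k]
```

If this reading applies, looking at the working.correlation component of the fitted object directly, rather than through print(), may be a way to see the estimate.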
Re: [R] extracting max row from data matrix
Dear Srini,

Here is one way:

# Data set
x = read.table(textConnection("fruit weight
1 apple 1.3
2 apple 1.5
3 apple 1.6
4 orange 1.4
5 orange 1.6"), header = TRUE)

tapply(x$weight, x$fruit, max)
 apple orange
   1.6    1.6

Or try also (note that this indexing relies on the rows being grouped by fruit and happens to work for this example):

x[cumsum(tapply(x$weight, x$fruit, which.max)), ]
   fruit weight
3  apple    1.6
5 orange    1.6

HTH,
Jorge

On Sun, Sep 7, 2008 at 10:24 PM, Srinivas Iyyer [EMAIL PROTECTED] wrote:

Dear group,

I have a data matrix with some replicate items with different values. I want to extract the row with the max value. For example:

x
   fruit weight
1  apple    1.3
2  apple    1.5
3  apple    1.6
4 orange    1.4
5 orange    1.6

x is a data frame. I want to extract the unique items from fruit that have the max weight, that is:

3  apple    1.6
5 orange    1.6

I want to be able to use apply functions. Could someone lend some help please?

Thanks
srini
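A sketch of an alternative that does not depend on row order, assuming the same example data. It splits the row numbers by fruit and picks, within each group, the row number with the largest weight; the names `idx` and `best` are introduced here for illustration.

```r
x <- data.frame(fruit  = c("apple", "apple", "apple", "orange", "orange"),
                weight = c(1.3, 1.5, 1.6, 1.4, 1.6))

# For each fruit, find the row number (in x) of the heaviest item
idx <- sapply(split(seq_len(nrow(x)), x$fruit),
              function(i) i[which.max(x$weight[i])])

best <- x[idx, ]
best
```

Because which.max() returns the first maximum, ties within a group yield a single row per fruit.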
Re: [R] Averaging 'blocks' of data
I'm not sure I exactly understand your problem, but if you are looking for a recursive algorithm for calculating the average by adding only one record at a time, consider:

y[k] = y[k-1] + (x[k] - y[k-1])/k, where y[0] = 0, k = 1, 2, ...

At each stage, y[k] = (x[1] + ... + x[k])/k.

At 04:46 PM 9/7/2008, Steve Murray wrote:

Gabor - thanks for your suggestion... I had checked the previous post, but I found (as a new user of R) this approach to be too complicated and I had problems getting the correct output values. If there is a simpler way of doing this, then please feel free to let me know.

Dylan - thanks, your approach is a good start. In answer to your questions, my data are 43200 columns and 16800 rows as a data frame - I will probably have to read the dataset in segments though, as it won't fit into memory! I've been able to follow your example - how would I be able to apply this technique to find the average of each 60 x 60 block?

Any other suggestions are of course welcome!

Many thanks again,
Steve

Robert A. LaBudde, PhD, PAS, Dpl. ACAFS
e-mail: [EMAIL PROTECTED]
Least Cost Formulations, Ltd.
URL: http://lcfltd.com/
824 Timberlake Drive
Tel: 757-467-0954
Virginia Beach, VA 23464-3239
Fax: 757-467-2947

Vere scire est per causas scire
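The recursion above can be sketched in a few lines of R. The function name `running_mean` and the sample vector are made up for illustration.

```r
# Running mean via y[k] = y[k-1] + (x[k] - y[k-1]) / k, starting from y[0] = 0
running_mean <- function(x) {
  y   <- 0
  out <- numeric(length(x))
  for (k in seq_along(x)) {
    y      <- y + (x[k] - y) / k   # update using only the newest record
    out[k] <- y
  }
  out
}

x  <- c(2, 4, 6, 8)
rm <- running_mean(x)
rm                                  # 2 3 4 5

# Agrees with the direct definition (x[1] + ... + x[k]) / k
all.equal(rm, cumsum(x) / seq_along(x))
```

Only the current mean and the count need to be kept, which is why this form suits data that are read in segments.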
Re: [R] cohen's kappa
One more question:

"The third value (kappa (2*PA-1)) is adjusted for prevalence using the method proposed by Byrt, Bishop and Carlin (1993)" --- from ?cohen.kappa

What does the prevalence refer to?

On Sun, Sep 7, 2008 at 10:43 PM, Weiwei Shi [EMAIL PROTECTED] wrote:

Dear all,

I have a question on Cohen's kappa. Assume I have two datasets: one has 500 objects and 10 methods, and the other has 1000 different objects and 20 different methods. Could I compare the two datasets and conclude that the 10 methods are more concordant than the 20 by looking at some output, for example from cohen.kappa{concord}?

One more: could anyone explain in brief the difference between kappa(Cohen) and kappa(Siegel)?

Thanks,

-- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc. "Did you always know?" "No, I did not. But I believed..." ---Matrix III
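A hand-computed sketch of the two quantities mentioned in the quoted help text, for two raters and a yes/no rating. The 2 x 2 table is hypothetical, and the code uses only base R, not cohen.kappa itself. It assumes that "prevalence" here means how common each rating category is; in the table below almost every object is rated "yes".

```r
# Hypothetical agreement table: rows = rater 1, columns = rater 2
tab <- matrix(c(90, 5,
                 5, 0), nrow = 2, byrow = TRUE,
              dimnames = list(rater1 = c("yes", "no"),
                              rater2 = c("yes", "no")))

n  <- sum(tab)
po <- sum(diag(tab)) / n                        # observed agreement (PA)
pe <- sum(rowSums(tab) * colSums(tab)) / n^2    # agreement expected by chance

kappa_cohen <- (po - pe) / (1 - pe)             # Cohen's kappa
kappa_adj   <- 2 * po - 1                       # the 2*PA - 1 form

c(po = po, pe = pe, kappa = kappa_cohen, adjusted = kappa_adj)
```

With such a lopsided table the chance agreement is already about 0.9, so Cohen's kappa comes out near zero even though the raters agree on 90% of objects, while 2*PA - 1 does not depend on the marginal proportions.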
Re: [R] Regression with nominal data
Soren,

It sounds like you are new to R, so I will refer you to some packages that I think beginners may find more user-friendly.

Zelig is excellent. You could run a series of logistic regressions, coding your dependent variable as follows (a versus b, a versus c, b versus c). See the website below:
http://gking.harvard.edu/zelig/docs/index.html

Alternatively you could try Rattle. See the website below:
http://rattle.togaware.com/rattle-features.html

Or you could try Rcmdr.

HTH
Paul

[EMAIL PROTECTED] wrote:

Hi,

y is nominal (3 categories), x1 to x3 are scale. What I want is a regression showing the probability of falling into one of the three categories of y according to the x's. How can I perform such a regression in R?

Thanks for your help
Sören
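The series of pairwise logistic regressions suggested above can be sketched with glm() from base R, as an alternative to the packages named. The data are simulated and the helper `fit_pair` is hypothetical; this illustrates the pairwise coding only and is not the Zelig interface.

```r
set.seed(42)                                    # assumed seed for reproducibility
n <- 300
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- factor(sample(c("a", "b", "c"), n, replace = TRUE))   # made-up nominal outcome

# Fit a logistic regression on the rows belonging to two of the three categories.
# With a factor response, glm() models the probability of the second level.
fit_pair <- function(data, lev1, lev2) {
  sub <- droplevels(data[data$y %in% c(lev1, lev2), ])
  glm(y ~ x1 + x2 + x3, data = sub, family = binomial)
}

fits <- list(a_vs_b = fit_pair(d, "a", "b"),
             a_vs_c = fit_pair(d, "a", "c"),
             b_vs_c = fit_pair(d, "b", "c"))

lapply(fits, coef)
```

Each fit uses only two categories at a time, so the three sets of coefficients are estimated separately rather than jointly.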
[R] Poisson Distribution - Chi Square Test for Goodness of Fit
Dear R-help,

Chi Square Test for Goodness of Fit

Problem faced: I have discrete data, given below (R script):

No_of_Frauds <- c(1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,2,1,2,2,2,1,1,2,1,1,1,1,4,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,5,1,2,1,1,1,1,1,1,1,3,2,1,1,1,2,1,1,2,1,1,1,1,1,2,1,3,1,2,1,2,14,2,1,1,38,3,3,2,44,1,4,1,4,1,2,2,1,3)

I am trying to fit a Poisson distribution to this data using R. When I run this script in the R console, I get a chi-square statistic as high as 6.95753e+37. When I did the same calculations in Excel, I got a chi-square statistic of 138.34.

Although it is clear that the sample data don't follow a Poisson distribution, and I will have to look for another discrete distribution, my problem is the HIGH value of the chi-square test statistic. When I analyzed further, I understood the problem.

(A) By convention, if the expected frequency of a class is less than 5, we pool such classes into a new class so that its expected frequency is greater than 5, and adjust the observed frequencies accordingly.

X        Oi     Ei   ((Oi - Ei)^2)/Ei
0         0     10               9.96
1        72     23             103.79
2        17     27               3.54
3         5     21              11.85
4         3     12               6.71
5         4      9               2.51
Total   101    101             138.34

When I apply this logic in Excel, I get a reasonable result (i.e. 138.34). However, in Excel too, if I don't apply this logic, my chi-square test statistic is as high as 4.70043E+37.

My question is: how do I modify my R script so that the logic mentioned in (A), i.e. adjusting the expected frequencies (and the observed frequencies accordingly), is applied, so that the expected frequency becomes greater than 5 for every class, thereby giving a reasonable value of the chi-square test statistic?
My R script is given below:

# R SCRIPT for Fitting Poisson Distribution

No_of_Frauds <- c(1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,2,1,2,2,2,1,1,2,1,1,1,1,4,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,5,1,2,1,1,1,1,1,1,1,3,2,1,1,1,2,1,1,2,1,1,1,1,1,2,1,3,1,2,1,2,14,2,1,1,38,3,3,2,44,1,4,1,4,1,2,2,1,3)

N       <- length(No_of_Frauds)
Average <- mean(No_of_Frauds)
Lambda  <- Average
i       <- c(0:(N-1))
pmf     <- dpois(i, Lambda, log = FALSE)

# Ho: The data follow Poisson Distribution Vs H1: Not Ho

# observed frequencies (Oi)
variable.cnts     <- table(No_of_Frauds)
variable.cnts.prs <- dpois(as.numeric(names(variable.cnts)), Lambda)   # R is case-sensitive: Lambda, not lambda
variable.cnts     <- c(variable.cnts, 0)
variable.cnts.prs <- c(variable.cnts.prs, 1 - sum(variable.cnts.prs))

tst         <- chisq.test(variable.cnts, p = variable.cnts.prs)
chi_squared <- as.numeric(unclass(tst)$statistic)
p_value     <- as.numeric(unclass(tst)$p.value)
df          <- tst[2]$parameter

cv1 <- qchisq(p = .01, df = tst[2]$parameter, lower.tail = FALSE, log.p = FALSE)
cv2 <- qchisq(p = .05, df = tst[2]$parameter, lower.tail = FALSE, log.p = FALSE)
cv3 <- qchisq(p = .1, df = tst[2]$parameter, lower.tail = FALSE, log.p = FALSE)

# Expected value
# variable.cnts.prs * sum(variable.cnts)

# if tst > cv reject Ho at alpha level of significance (los)

if (chi_squared > cv1) {
  Conclusion1 <- 'Sample does not come from the postulated probability distribution at 1% los'
} else {
  Conclusion1 <- 'Sample comes from postulated prob. distribution at 1% los'
}

if (chi_squared > cv2) {
  Conclusion2 <- 'Sample does not come from the postulated probability distribution at 5% los'
} else {
  Conclusion2 <- 'Sample comes from postulated prob. distribution at 5% los'
}

if (chi_squared > cv3) {
  Conclusion3 <- 'Sample does not come from the postulated probability distribution at 10% los'
} else {
  Conclusion3 <- 'Sample comes from postulated prob. distribution at 10% los'
}
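One way the pooling rule (A) might be expressed in R is sketched below. It is an illustration under stated assumptions, not a definitive answer: the counts are hypothetical (not the fraud data), the function name `pooled_poisson_gof` is made up, only the upper tail is pooled into a single "K or more" class, and the degrees of freedom are reduced by one extra because lambda is estimated from the data.

```r
# Hypothetical counts, made up for illustration
counts <- rep(c(0, 1, 2, 3, 4, 5, 9), times = c(10, 25, 27, 20, 11, 5, 3))

pooled_poisson_gof <- function(counts, min_expected = 5) {
  n      <- length(counts)
  lambda <- mean(counts)

  # Increase K while the class "K + 1 or more" would still have expected count >= min_expected.
  # On exit, "K or more" is the last tail class whose expected count reaches min_expected.
  K <- 0
  while (n * ppois(K, lambda, lower.tail = FALSE) >= min_expected) K <- K + 1
  if (K < 2) stop("too few classes left after pooling")

  # Classes 0, 1, ..., K - 1 and the pooled class "K or more"
  observed <- tabulate(pmin(counts, K) + 1, nbins = K + 1)
  probs    <- c(dpois(0:(K - 1), lambda),
                ppois(K - 1, lambda, lower.tail = FALSE))
  expected <- n * probs

  statistic <- sum((observed - expected)^2 / expected)
  df        <- (K + 1) - 1 - 1          # classes - 1 - one estimated parameter
  list(K = K, observed = observed, expected = expected,
       statistic = statistic, df = df,
       p.value = pchisq(statistic, df, lower.tail = FALSE))
}

res <- pooled_poisson_gof(counts)
res$observed
round(res$expected, 2)
res$statistic
```

Classes at the low end with small expected counts are not pooled by this sketch; data with a large mean would need the same treatment applied to the lower tail as well.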