[R] Self-describing data files
Hello all, I have an outstanding request to document units [1] or other variable-specific facts in data files. I am thinking in particular of tab-delimited files that are suitable to be read into a data frame. One suggestion/request is to add a line of text under the column headings. My first thought was that it might be preferable to consume the first line in the file (perhaps making it easier to skip?), but then I found the comment.char option, which appears made for the task. Is there a common/preferred way to tackle this? Bill [1] http://articles.cnn.com/1999-09-30/tech/9909_30_mars.metric.02_1_climate-orbiter-spacecraft-team-metric-system?_s=PM:TECH __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
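A minimal sketch of the comment.char idea raised above. The file contents, the `# units:` prefix, and the `name=unit` layout are assumptions made up for illustration, not a convention from the thread. Note that read.delim() turns comments off by default (comment.char = ""), whereas read.table() defaults to "#".

```r
# Write a small tab-delimited file; the layout of the units line is an assumption.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "# units: time=s\tdose=mg\tweight=kg",
  "time\tdose\tweight",
  "0\t1.5\t70",
  "10\t2.0\t82"
), path)

# read.delim() defaults to comment.char = "", so enable '#' comments explicitly.
dat <- read.delim(path, comment.char = "#")

# Recover the units from the comment line and attach them as an attribute.
first  <- readLines(path, n = 1)
fields <- strsplit(sub("^# units: ", "", first), "\t", fixed = TRUE)[[1]]
units  <- setNames(sub("^[^=]*=", "", fields), sub("=.*$", "", fields))
attr(dat, "units") <- units
```

Keeping the units in a comment line means any reader that honours the comment character sees an ordinary header row, while a reader that wants the units can still parse them from the first line.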
[R] plot(aModel) vs. influence.measures()
A while back I asked about getting a list of points that R considers influential after fitting a linear model, and very quickly got a helpful pointer to influence.measures(). But it has happened again. The trouble I am having is that points marked on plots are not flagged in the output from influence.measures(), and I can't read them on the plots. I tried some successive deletion, but then other points (naturally) start to look troublesome. Is there a good way to get a list of suspicious entries at the beginning? In this case, I am trying to help identify possible data entry errors, and I am interested in knowing what R bothered to mark up front. Perhaps the defaults should be telling me that what I want to do is silly, but it sure _seems_ like it would be helpful. Is there a way to control the threshold used by influence.measures() to get it to flag more items at one time? I am learning the hard way, so feel free to tell me that I should be trying to do this some other way. Bill
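A sketch of one way to approach the question above, on made-up data. influence.measures() takes no threshold argument; its flags come from fixed rules. plot() on an lm fit labels the most extreme points in each panel (three by default, via id.n) whether or not any rule is crossed, which is one reason the two can disagree. The 4/n cutoff below is a common rule of thumb chosen for illustration, not an R default.

```r
set.seed(1)
d <- data.frame(x = 1:20)
d$y <- 2 * d$x + rnorm(20)
d$y[7] <- d$y[7] + 15            # planted "data entry error" (hypothetical)
fit <- lm(y ~ x, data = d)

# Rows flagged by the built-in, fixed rules of influence.measures()
infl    <- influence.measures(fit)
flagged <- which(apply(infl$is.inf, 1, any))

# The three largest absolute residuals: roughly what the residual panels
# of plot(fit) label by default, cutoff or no cutoff
top_resid <- order(abs(residuals(fit)), decreasing = TRUE)[1:3]

# A looser, user-chosen screen: Cook's distance above 4/n
cd    <- cooks.distance(fit)
loose <- which(cd > 4 / nrow(d))
```

The raw measures are also available in `infl$infmat`, so any other cutoff can be applied to those columns directly.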
[R] List of influential points?
Hello all, I fit a linear model to some data and used plot() to create diagnostic plots for the fit; I am having trouble reading the points that R is flagging as influential. Is there a way to get the list of influential points from the fit or its summary, etc.? Most likely, there are a few points appearing in almost the same place, making it difficult to read from the plots. Bill
Re: [R] List of influential points?
Michael, That looks like what I need - thanks! Bill
From: Michael Bedward [michael.bedw...@gmail.com] Sent: Monday, November 29, 2010 7:36 PM To: Schwab,Wilhelm K Cc: r-help@r-project.org Subject: Re: [R] List of influential points?
Hi Bill, Have a look at the influence.measures function...
my.lm <- lm( ... )
influence.measures( my.lm )
Hope this helps, Michael
On 30 November 2010 00:13, Schwab,Wilhelm K bsch...@anest.ufl.edu wrote: Hello all, I fit a linear model to some data and used plot() to create diagnostic plots for the fit; I am having trouble reading the points that R is flagging as influential. Is there a way to get the list of influential points from the fit or its summary, etc.? Most likely, there are a few points appearing in almost the same place, making it difficult to read from the plots. Bill
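To make Michael's pointer concrete, here is a small sketch that turns the result of influence.measures() into a plain list of row labels. The built-in `cars` data set stands in for the real data, and the object names are placeholders.

```r
fit  <- lm(dist ~ speed, data = cars)   # 'cars' ships with R; stands in for the real data
infl <- influence.measures(fit)

# Row labels of observations flagged on at least one measure
flagged_rows <- rownames(infl$infmat)[apply(infl$is.inf, 1, any)]

# The same rows together with the values of every measure
flagged_table <- infl$infmat[flagged_rows, , drop = FALSE]
```

Calling `summary(infl)` prints a similar table restricted to the flagged observations.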
[R] [OT] (slightly) - OpenOffice Calc and text files
Hello all, I had a very strange looking problem that turned out to be due to unexpected (by me at least) format changes to one of my data files. We have a small lab study in which each run is represented by a row in a tab-delimited file; each row identifies a repetition of the experiment and associates it with some subjective measurements and times from our notes that get used to index another file with lots of automatically collected data. In short, nothing shocking. In a moment of weakness, I opened the file using OpenOffice Calc (I think it's version 3.2) to edit something that I had mangled when I first entered it, saved it (apparently the mistake), and reran my analysis code. The results were goofy, and the problem was in my code that runs before R ever sees the data. That code was confused by things that I would like to ensure don't happen again, and I suspect that some of you might have thoughts on it. The problems specifically:
(1) OO seems to be a little stingy about producing tab-delimited text; there is stuff online about using the csv export and editing the filter, and folks (presumably like us) saying that it deserves to be a separate option.
(2) Dates that I had formatted with 4-digit years got chopped to YY (did we not learn anything last time? <g>) and times that I had formatted in 24 hours ended up AM/PM.
Have any of you found a nice (or at least predictable) way to use OO Calc to edit files like this? If it insists on thinking for me, I wish it would think in 24 hour time and 4 digit years :) I work on Linux, so Excel is off the table, but another spreadsheet or text editor would be a viable option, as would configuration changes to Calc. Bill
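One way to guard against this kind of silent reformatting is to check the date and time strings before any analysis runs. The sketch below is only an illustration: the column names `date` and `time`, the ISO-style date, and the HH:MM time are assumptions, since the post does not give the actual file layout.

```r
# Return the row numbers whose date or time strings are not in the expected form.
# Column names and formats are assumptions for illustration.
check_formats <- function(df, date_col = "date", time_col = "time") {
  ok_date <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", df[[date_col]])
  ok_time <- grepl("^([01][0-9]|2[0-3]):[0-5][0-9]$", df[[time_col]])
  which(!(ok_date & ok_time))
}

runs <- data.frame(
  date = c("2010-10-13", "10/13/10", "2010-10-13"),
  time = c("13:05", "01:05 PM", "1:05 PM"),
  stringsAsFactors = FALSE
)
bad_rows <- check_formats(runs)   # rows 2 and 3 fail the check
```

Running such a check right after reading the file turns a confusing downstream failure into an immediate list of offending rows.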
Re: [R] [OT] (slightly) - OpenOffice Calc and text files
Albyn, I'll look into it. In fact, I have a small book on it that I bought in my very early days of using Linux. I quickly found TeX Maker (for the obvious), Code::Blocks for C/C++ and I would not have started the move without a working Smalltalk (http://pharo-project.org/home). For editing data files, I really just want something that shows data in an understandable grid and does not do weird stuff thinking it's being helpful. Bill
From: Albyn Jones [jo...@reed.edu] Sent: Wednesday, October 13, 2010 1:39 PM To: Schwab,Wilhelm K Cc: r-help@r-project.org Subject: Re: [R] [OT] (slightly) - OpenOffice Calc and text files
How about emacs? albyn
On Wed, Oct 13, 2010 at 01:13:03PM -0400, Schwab,Wilhelm K wrote: Hello all, . Have any of you found a nice (or at least predictable) way to use OO Calc to edit files like this? If it insists on thinking for me, I wish it would think in 24 hour time and 4 digit years :) I work on Linux, so Excel is off the table, but another spreadsheet or text editor would be a viable option, as would configuration changes to Calc. Bill
-- Albyn Jones Reed College jo...@reed.edu
Re: [R] [OT] (slightly) - OpenOffice Calc and text files
Peter, vi is *really* primitive =:0 R is a little late because I tend to do shape changes prior to invoking R. However, I could load, tweak and re-save, and then bring R back into it later. I never would have thought of it. Thanks! Bill
From: Peter Langfelder [peter.langfel...@gmail.com] Sent: Wednesday, October 13, 2010 1:41 PM To: Schwab,Wilhelm K Cc: r-help@r-project.org Subject: Re: [R] [OT] (slightly) - OpenOffice Calc and text files
On Wed, Oct 13, 2010 at 10:13 AM, Schwab,Wilhelm K bsch...@anest.ufl.edu wrote: Have any of you found a nice (or at least predictable) way to use OO Calc to edit files like this? If it insists on thinking for me, I wish it would think in 24 hour time and 4 digit years :) I work on Linux, so Excel is off the table, but another spreadsheet or text editor would be a viable option, as would configuration changes to Calc.
No idea about Calc, I use it regularly but only to view files (and that mostly csv, not tab-delimited). The most primitive solution is to use a plain text editor such as vi that will save everything as it loaded it except for what you change. The second most primitive idea (or maybe not so primitive after all) is to read the table into R and manually fix it there such as table$column[row] = "ABCD" (this is my favorite way of changing things :)). The third most primitive idea, which I have actually never used but which may be viable, is to load it into R and use the fix() function that pulls up a rather primitive but functional data editor. Peter
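Peter's second idea (read the table into R, fix one cell, save it again) might look like the sketch below. The file, its columns, and the typo are invented for the example. Reading every column as character is a design choice made here so that R does not reinterpret dates, times, or numbers on the way through.

```r
path <- tempfile(fileext = ".txt")
writeLines(c("run\tdate\tstart",
             "1\t2010-10-13\t13:05",
             "2\t2010-10-13\t14:O0"), path)   # letter O typed instead of zero

# colClasses = "character" keeps every cell exactly as it appears in the file
tab <- read.delim(path, colClasses = "character")

# Fix the single mangled cell, in the spirit of table$column[row] = "ABCD"
tab$start[2] <- "14:00"

# Write back tab-delimited, without quotes or row names, so the layout is unchanged
write.table(tab, path, sep = "\t", quote = FALSE, row.names = FALSE)
```

Because only the edited cell changes, a diff of the file before and after should show a single altered line.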
Re: [R] [OT] (slightly) - OpenOffice Calc and text files
It will get a good look, as will gnumeric - thanks to all! Bill
From: Albyn Jones [jo...@reed.edu] Sent: Wednesday, October 13, 2010 2:14 PM To: Schwab,Wilhelm K Subject: Re: [R] [OT] (slightly) - OpenOffice Calc and text files
emacs shows you exactly what is there, nothing more nor less. it isn't a spreadsheet, but tabs will align columns. albyn
On Wed, Oct 13, 2010 at 01:53:46PM -0400, Schwab,Wilhelm K wrote: Albyn, I'll look into it. In fact, I have a small book on it that I bought in my very early days of using Linux. I quickly found TeX Maker (for the obvious), Code::Blocks for C/C++ and I would not have started the move without a working Smalltalk (http://pharo-project.org/home). For editing data files, I really just want something that shows data in an understandable grid and does not do weird stuff thinking it's being helpful. Bill
-- Albyn Jones Reed College jo...@reed.edu
Re: [R] [OT] (slightly) - OpenOffice Calc and text files
I know *what* happened (Calc reformatted the data in ways I did not want or expect). It is not end-of-line conventions; they reformatted the data leaving the structure intact. As to why/how, that could depend on the sequence of operations, so I thought to ask here to see if you had collectively either found something specific to do or to avoid. Gnumeric is now freshly installed and will get some testing; if I don't care for it, I'll look more at emacs. I don't ask much of a spreadsheet (show/edit a grid and maybe hide/show columns for complex data sets), but it would be nice if it did not reformat everything every time I open a file :( So far, gnumeric successfully opened a file; I will be a little less trusting when it comes to saving one. Thanks!! Bill From: Mike Marchywka [marchy...@hotmail.com] Sent: Wednesday, October 13, 2010 3:10 PM To: dwinsem...@comcast.net; Schwab,Wilhelm K Cc: r-help@r-project.org Subject: RE: [R] [OT] (slightly) - OpenOffice Calc and text files From: dwinsem...@comcast.net To: bsch...@anest.ufl.edu Date: Wed, 13 Oct 2010 14:52:21 -0400 CC: r-help@r-project.org Subject: Re: [R] [OT] (slightly) - OpenOffice Calc and text files On Oct 13, 2010, at 1:13 PM, Schwab,Wilhelm K wrote: Hello all, I had a very strange looking problem that turned out to be due to unexpected (by me at least) format changes to one of my data files. We have a small lab study in which each run is represented by a row in a tab-delimited file; each row identifies a repetition of the experiment and associates it with some subjective measurements and times from our notes that get used to index another file with lots of automatically collected data. In short, nothing shocking. In a moment of weakness, I opened the file using (I think it's version 3.2) of OpenOffice Calc to edit something that I had mangled when I first entered it, saved it (apparently the mistake), and reran my analysis code. 
The results were goofy, and the problem was in my code that runs before R ever sees the data. That code was confused by things that I would like to ensure don't happen again, and I suspect that some of you might have thoughts on it. The problems specifically: (1) OO seems to be a little stingy about producing tab-delimited text; there is stuff online about using the csv and editing the filter and folks (presumably like us) saying that it deserves to be a separate option.
You have been little stingy yourself about describing what you did. I see no specifics about the actual data used as input nor the specific operations. I just opened an OO.o Calc workbook and dropped a character vector, 1969-12-31 23:59:50 copied from help(POSIXct) into
Have any of you found a nice (or at least predictable) way to use OO Calc to edit files like this?
I didn't do anything I thought was out of the ordinary and so cannot reproduce your problem. (This was on a Mac, but OO.o is probably going to behave the same across *NIX cultures.) -- David
If it insists on thinking for me, I wish it would think in 24 hour time and 4 digit years :)
Is it possible that you have not done enough thinking for _it_?
I work on Linux, so Excel is off the table, but another spreadsheet or text editor would be a viable option, as would configuration changes to Calc. Bill
Probably instead of guessing and seeing how various things react, you could go get a utility like octal dump or open in an editor that has a hex mode and see what happened. This could be anything: crlf convention, someone turned it to unicode, etc. On linux or cygwin I think you have od available. Then of course, if you know what R likes, you can use sed to fix it...
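The byte-level check suggested at the end of this message can also be done from inside R, without od. This is a sketch on a throwaway file whose contents are invented; it looks for carriage returns and a UTF-8 byte-order mark, the two kinds of change mentioned above.

```r
path <- tempfile(fileext = ".txt")
# Deliberately write Windows-style CR LF line endings for the demonstration
writeBin(charToRaw("run\ttime\r\n1\t13:05\r\n"), path)

bytes <- readBin(path, what = "raw", n = file.info(path)$size)

has_cr  <- any(bytes == as.raw(0x0d))                    # carriage returns present?
has_bom <- length(bytes) >= 3 &&
  identical(bytes[1:3], as.raw(c(0xef, 0xbb, 0xbf)))     # UTF-8 byte-order mark?
n_tabs  <- sum(bytes == as.raw(0x09))                    # tab count, a quick delimiter check

print(bytes[1:10])   # hex view of the first bytes, similar in spirit to od
```

If the flags come back clean, as Bill later reports for his file, the change is in the cell contents and not in the encoding or line endings.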
Re: [R] Deleting observations - can't see the data after that
Josh, How about the direct approach: grep or otherwise search the source? If you can guess the form of a printf()-like statement, then it might be easy to find. More likely, one would have to overlay results of a few different searches. Maybe someone who is actively working on the code will see this and stumble on it?? The idea is simply that one might be able to work backwards from the behavior (which we have both observed) to possible causes and give a little thought to possible absorbing states that maybe should not exist. Bill
From: Joshua Wiley [jwiley.ps...@gmail.com] Sent: Thursday, October 07, 2010 9:14 PM To: Schwab,Wilhelm K Subject: Re: [R] Deleting observations - can't see the data after that
I mostly meant look at whatever past commands you had typed using the up arrow (only useful for a very limited number). I am sure this has already taken enough of your time, and since it's working for you now, I would not worry about looking into it further. I know that I have seen something in the past that produced a short output about an object exactly like what you described (data frame with n rows and m columns), but I cannot remember what it was for the life of me.
On Thu, Oct 7, 2010 at 5:32 PM, Schwab,Wilhelm K bsch...@anest.ufl.edu wrote: Josh, Sounds good. Where would I find the history? I'm working on Linux (Ubuntu 9.10, R 2.9.2); if it's history(), we're out of luck. You guys are allowed to hound; whether or not I can create a suitable example is another story. As far as what was happening, a summary of the object makes a lot of sense, and that's pretty much what it was. Something like a data frame with 16 rows and 5 columns or thereabouts. Bill
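For what it is worth, one situation in which base R prints only a one-line description of a data frame is when the frame has rows but no columns. Nothing in the thread confirms that this is what happened here, and Bill recalls the correct column count, so treat the sketch below purely as an example of that kind of output and of how to inspect an object when print() is unhelpful.

```r
d <- data.frame(a = 1:16, b = rnorm(16))

# Selecting no columns keeps the rows but leaves nothing to show;
# print() then reports only the dimensions in a single line
empty <- d[, FALSE]
msg   <- capture.output(print(empty))

# Checks that say more than print() does in such a case
dims <- dim(empty)     # rows, columns
cls  <- class(empty)   # still "data.frame"
```

`str()` and `head()` on the object are further quick ways to see what it actually contains.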
[R] Deleting observations - can't see the data after that
Hello all, I am loading a data frame, fitting a model, getting diagnostic plots and they are flagging a couple of observations as problematic. Fair enough, and I want to re-fit without them. After I delete an offending row (identified by one of the diagnostic plots), something like data = data[-3,]; then R will no longer print the contents of the data frame; it tells me it is a data frame with specific (correct) number of rows and columns, but won't show me what remains in the frame like it does before the deletion. Is there a way to get around that, either using a different deletion technique or another function? print(data) and show(data) are not helping. Ultimately, I am trying to go through a couple of iterations of finding pathologic points, deleting them and re-fitting. In this case I could guess at what is wrong and probably be correct, but I want to follow the clues as a learning exercise. Once that is complete, I plan to plot everything with the deleted points emphasized. Bill
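The find, delete and re-fit loop described above can be sketched without removing any rows, which also keeps the information needed to emphasize the dropped points later. The data, the use of Cook's distance to pick the worst point, and the fixed two rounds are all assumptions for illustration.

```r
set.seed(42)
d <- data.frame(x = 1:16)
d$y <- 3 + 0.5 * d$x + rnorm(16, sd = 0.3)
d$y[3] <- d$y[3] + 5                 # planted problem point (hypothetical)

keep <- rep(TRUE, nrow(d))           # TRUE = still in the fit
for (i in 1:2) {                     # two rounds of "find, drop, re-fit"
  fit   <- lm(y ~ x, data = d, subset = keep)
  cd    <- cooks.distance(fit)       # names are the original row labels
  worst <- as.integer(names(which.max(cd)))
  keep[worst] <- FALSE
}
final   <- lm(y ~ x, data = d, subset = keep)
dropped <- which(!keep)
```

Since `d` itself is never modified, something like `plot(d$x, d$y, pch = ifelse(keep, 1, 19))` followed by `abline(final)` shows every observation with the excluded ones drawn as filled symbols.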
Re: [R] Deleting observations - can't see the data after that
Josh, Jim, Thanks for responding. So far, it looks like my use of the name data was the problem - that could have taken some time to find. I typically do not attach frames (and did not here), so I end up with lots of this$that in my code. If it gives me any more trouble, I will indeed post an example. Thanks! Bill
-----Original Message----- From: Joshua Wiley [mailto:jwiley.ps...@gmail.com] Sent: Thursday, October 07, 2010 4:46 PM To: Schwab,Wilhelm K Cc: r-help@r-project.org Subject: Re: [R] Deleting observations - can't see the data after that
Hi Bill, Several things come to mind. First, try naming your data frame something besides a function name (data() is also a function). Second, have you attached the data frame? Using:
data = data[-3, ]
worked fine for me when I made up some data. Perhaps you can create a minimal and reproducible example? You might also send us the results of:
sessionInfo()
ls()
search()
Cheers, Josh
On Thu, Oct 7, 2010 at 2:30 PM, Schwab,Wilhelm K bsch...@anest.ufl.edu wrote: Hello all, I am loading a data frame, fitting a model, getting diagnostic plots and they are flagging a couple of observations as problematic. Fair enough, and I want re-fit without them. After I delete an offending row (identified by one of the diagnostic plots), something like data = data[-3,]; then R will no longer print the contents of the data frame; it tells me it is a data frame with specific (correct) number of rows and columns, but won't show me what remains in the frame like it does before the deletion. Is there a way to get around that, either using a different deletion technique or another function? print(data) and show(data) are not helping. Ultimately, I am trying to go through a couple of iterations of find pathologic points, delete and re-fit. In this case I could guess at what is wrong and probably be correct, but I want to follow the clues as a learning exercise. Once that is complete, I plan to plot everything with the deleted points emphasized. Bill
-- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
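A small sketch of the naming point Josh raises, on invented data. A data frame called `data` can be subset normally and the function of the same name in utils stays available, so the name alone does not break row deletion. What does fail is indexing `data` when no data frame of that name exists, because R then finds the function instead.

```r
# A data frame named 'data' behaves like any other when indexed
data <- data.frame(id = 1:5, y = c(2.1, 3.9, 60, 8.2, 9.8))
data <- data[-3, ]                          # drop row 3
still_function <- is.function(utils::data)  # the function is still there

# Indexing the function itself, as happens if the data frame was never
# created in this session, raises an error whose message is captured here
closure_msg <- tryCatch(utils::data[-3, ],
                        error = function(e) conditionMessage(e))
```

Running `search()` and `ls()`, as suggested in the message, shows which objects and attached packages could be supplying a given name.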
Re: [R] Deleting observations - can't see the data after that
Foolish? Try convenient. Can't win for losing today. Anyway, I most certainly did not make the mistake you suggest, though some other mistake is possible. I never said it printed nothing; I was very explicit that it described it as a data frame with the correct number of rows and columns; it simply would not print the data.
-----Original Message----- From: Peter Ehlers [mailto:ehl...@ucalgary.ca] Sent: Thursday, October 07, 2010 6:53 PM To: Schwab,Wilhelm K Cc: r-help@r-project.org Subject: Re: [R] Deleting observations - can't see the data after that
On 2010-10-07 17:13, Schwab,Wilhelm K wrote: Josh, Jim, Thanks for responding. So far, it looks like my use of the name data was the problem - that could have taken some time to find. I typically do not attach frames (and did not here), so I end up with lots of this$that in my code.
While I think it's foolish to call your data.frame 'data', I really doubt that that's the cause of your troubles. More likely you did something else afterwards that caused your data to be 'unprintable'. Or perhaps you goofed up the subsetting with something like data = data(-3,); But I would have expected R to print _some_ thing, if only an error message. Anyway, I'm glad the problem is resolved (for now). -Peter Ehlers
If it gives me any more trouble, I will indeed post an example. Thanks! Bill
Re: [R] Deleting observations - can't see the data after that
First, no lasting hard feelings - I've had two days of people riding me over minutiae like you can't imagine. When you put this in the context of a possible bug, I'll see what I can turn up for you. FWIW, I think it is just the variable name.

Bill

-Original Message-
From: Peter Ehlers [mailto:ehl...@ucalgary.ca]
Sent: Thursday, October 07, 2010 7:10 PM
To: Schwab,Wilhelm K
Cc: r-help@r-project.org
Subject: Re: [R] Deleting observations - can't see the data after that

On 2010-10-07 17:58, Schwab,Wilhelm K wrote:

Foolish? Try convenient. Can't win for losing today. Anyway, I most certainly did not make the mistake you suggest, though some other mistake is possible. I never said it printed nothing; I was very explicit that it described it as a data frame with the correct number of rows and columns; it simply would not print the data.

I didn't mean to be critical. I'm just trying to understand how you managed to get to the stage where R will show you that 'data' is a data frame with specific (correct) number of rows and columns, but won't show me what remains in the frame. This should be reproducible. Who knows, you may have found a bug that should be fixed. So what was the precise message from R when it told you that it had the data frame but wouldn't print it? Can you make up a reproducible example?

-Peter Ehlers

-Original Message-
From: Peter Ehlers [mailto:ehl...@ucalgary.ca]
Sent: Thursday, October 07, 2010 6:53 PM
To: Schwab,Wilhelm K
Cc: r-help@r-project.org
Subject: Re: [R] Deleting observations - can't see the data after that

On 2010-10-07 17:13, Schwab,Wilhelm K wrote:

Josh, Jim, Thanks for responding. So far, it looks like my use of the name data was the problem - that could have taken some time to find. I typically do not attach frames (and did not here), so I end up with lots of this$that in my code.

While I think it's foolish to call your data.frame 'data', I really doubt that that's the cause of your troubles. More likely you did something else afterwards that caused your data to be 'unprintable'. Or perhaps you goofed up the subsetting with something like data = data(-3,); But I would have expected R to print _some_thing, if only an error message. Anyway, I'm glad the problem is resolved (for now).

-Peter Ehlers

If it gives me any more trouble, I will indeed post an example. Thanks!

Bill

-Original Message-
From: Joshua Wiley [mailto:jwiley.ps...@gmail.com]
Sent: Thursday, October 07, 2010 4:46 PM
To: Schwab,Wilhelm K
Cc: r-help@r-project.org
Subject: Re: [R] Deleting observations - can't see the data after that

Hi Bill,

Several things come to mind. First, try naming your data frame something besides a function name (data() is also a function). Second, have you attached the data frame? Using: data = data[-3, ] worked fine for me when I made up some data. Perhaps you can create a minimal and reproducible example? You might also send us the results of:

  sessionInfo()
  ls()
  search()

Cheers, Josh

On Thu, Oct 7, 2010 at 2:30 PM, Schwab,Wilhelm K bsch...@anest.ufl.edu wrote:

Hello all,

I am loading a data frame, fitting a model, getting diagnostic plots and they are flagging a couple of observations as problematic. Fair enough, and I want to re-fit without them. After I delete an offending row (identified by one of the diagnostic plots), something like data = data[-3,]; then R will no longer print the contents of the data frame; it tells me it is a data frame with specific (correct) number of rows and columns, but won't show me what remains in the frame like it does before the deletion. Is there a way to get around that, either using a different deletion technique or another function? print(data) and show(data) are not helping.

Ultimately, I am trying to go through a couple of iterations of finding pathological points, deleting them and re-fitting. In this case I could guess at what is wrong and probably be correct, but I want to follow the clues as a learning exercise. Once that is complete, I plan to plot everything with the deleted points emphasized.

Bill

-- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
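For later readers, here is a minimal sketch of the delete-and-refit step described above. Everything in it is an assumption made for illustration: the data frame `obs`, its columns `x` and `y`, and the planted bad value in row 3 are invented, not Bill's data. The frame is deliberately not called `data`, following Josh's suggestion.

```r
# Hypothetical data: a straight line plus noise, with one planted bad value.
set.seed(1)
obs <- data.frame(x = 1:16, y = 2 * (1:16) + rnorm(16))
obs$y[3] <- 60

fit_all <- lm(y ~ x, data = obs)          # fit with every row

dropped  <- 3                             # row flagged by the diagnostic plots
obs_trim <- obs[-dropped, ]               # negative index drops the row; note the comma
fit_trim <- lm(y ~ x, data = obs_trim)    # re-fit without it

print(dim(obs_trim))                      # rows and columns that remain
print(head(obs_trim))                     # the remaining rows still print normally
```

Keeping the original frame and the vector `dropped` around (rather than overwriting the frame) leaves enough information to plot all points later and overdraw the dropped ones, for example with points().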
Re: [R] Deleting observations - can't see the data after that
Josh,

Sounds good. Where would I find the history? I'm working on Linux (Ubuntu 9.10, R 2.9.2); if it's history(), we're out of luck. You guys are allowed to hound; whether or not I can create a suitable example is another story. As far as what was happening, a summary of the object makes a lot of sense, and that's pretty much what it was. Something like a data frame with 16 rows and 5 columns or thereabouts.

Bill

-Original Message-
From: Joshua Wiley [mailto:jwiley.ps...@gmail.com]
Sent: Thursday, October 07, 2010 7:22 PM
To: Schwab,Wilhelm K
Cc: r-help@r-project.org
Subject: Re: [R] Deleting observations - can't see the data after that

Dear Bill,

We hound because we care---through repeated painful experiences, I have developed an aversion to using function names for my functions/objects (and to irons near my fingers...but that is another story).

If you still have the output from R when you attempted to print your data frame, I would be interested in seeing it. It almost sounds like some sort of summary of the object, rather than the object itself (if that makes any sense). Maybe it's still in your history?

As a side note, depending on the situation, you might get some mileage out of with() to lessen the this$that burden. If you didn't know about it, hopefully it saves you at least a bit of time :)

Here's to a better next two days than your last,

Cheers, Josh

On Thu, Oct 7, 2010 at 5:16 PM, Schwab,Wilhelm K bsch...@anest.ufl.edu wrote:

First, no lasting hard feelings - I've had two days of people riding me over minutiae like you can't imagine. When you put this in the context of a possible bug, I'll see what I can turn up for you. FWIW, I think it is just the variable name.

Bill
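A short illustration of the with() suggestion, for anyone who has not met it. The data frame `obs` and its columns `dose` and `response` are hypothetical; the point is only that with() evaluates an expression inside the data frame, so column names can be used bare without attach().

```r
# Hypothetical data frame, invented for this illustration.
obs <- data.frame(dose     = c(1.0, 2.5, 4.0, 5.5),
                  response = c(0.8, 2.1, 3.9, 5.2))

# The this$that style
r_dollar <- cor(obs$dose, obs$response)

# The same computation with with(): columns are looked up inside 'obs'
r_with <- with(obs, cor(dose, response))

# Modelling functions usually take a data argument, which serves the same purpose
fit <- lm(response ~ dose, data = obs)

print(c(r_dollar, r_with))
```

Unlike attach(), with() changes nothing on the search path, so there is nothing to detach afterwards.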
Re: [R] Deleting observations - can't see the data after that
Josh,

Unfortunately, I created a lot of lines after getting it working, so there was no getting back to it, and right now I can't reproduce it - sorry. If I use one Gnome shell and exit R and re-run it, am I clearing everything? I assume so, but if not, that might be relevant. AFAIK, I do not save and re-use workspaces.

Bill

-Original Message-
From: Joshua Wiley [mailto:jwiley.ps...@gmail.com]
Sent: Thursday, October 07, 2010 8:15 PM
To: Schwab,Wilhelm K
Subject: Re: [R] Deleting observations - can't see the data after that

I mostly meant look at whatever past commands you had typed using the up arrow (only useful for a very limited number). I am sure this has already taken enough of your time, and since it's working for you now, I would not worry about looking into it further. I know that I have seen something in the past that produced a short output about an object exactly like what you described (data frame with n rows and m columns), but I cannot remember what it was for the life of me.

On Thu, Oct 7, 2010 at 5:32 PM, Schwab,Wilhelm K bsch...@anest.ufl.edu wrote:

Josh,

Sounds good. Where would I find the history? I'm working on Linux (Ubuntu 9.10, R 2.9.2); if it's history(), we're out of luck. You guys are allowed to hound; whether or not I can create a suitable example is another story. As far as what was happening, a summary of the object makes a lot of sense, and that's pretty much what it was. Something like a data frame with 16 rows and 5 columns or thereabouts.

Bill
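On the questions of what a restarted session contains and what the name `data` refers to, a rough sketch of checks that can be run and reported. The frame below is invented for illustration and is deliberately given the risky name to show the masking. The `.RData` check rests on R's documented startup behaviour: a saved workspace in the startup directory is restored unless R is started with --no-restore or --vanilla.

```r
# With the default packages attached, utils::data is a function.
print(is.function(utils::data))

# Hypothetical frame, named 'data' on purpose to show that the variable
# and the function can coexist under the same name.
data <- data.frame(id = 1:5, score = c(10, 12, 11, 50, 9))
data <- data[-4, ]                  # square brackets drop row 4

print(is.data.frame(data))          # the variable is found first when used as a value
print(data)                         # and it prints as an ordinary data frame

# Things worth including when asking the list for help
print(ls())                         # objects currently in the workspace
print(search())                     # attached packages and data frames
print(file.exists(".RData"))        # TRUE would mean a saved workspace is present here
```

If `file.exists(".RData")` is TRUE in the directory where R is started, objects from an earlier session may reappear even though nothing was loaded by hand.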
[R] Ordering categories on a boxplot - a serious trap??
Hello all,

I think I probably did something stupid, and R's part was to allow me to do it. My goal was to control the order of factor levels appearing horizontally on a boxplot. Enter search engines and perhaps some creative stupidity on my part, and I came up with the following:

  v=read.table("factor-order.txt",header=TRUE);
  levels(v$doseGroup) = c("L", "M", "H");
  boxplot(v$dose~v$doseGroup);

A good way to see the trap is to evaluate:

  v=read.table("factor-order.txt",header=TRUE);
  par(mfrow=c(2,1));
  boxplot(v$dose~v$doseGroup);
  levels(v$doseGroup) = c("L", "M", "H");
  boxplot(v$dose~v$doseGroup);
  par(mfrow=c(1,1));

The above creates two plots, one correct with the factors in an inconvenient order, and one that is WRONG. In the latter, the labels appear in the desired order, but the data does not move with them. I did not discover the problem until I repeated the same type of plot with something that had a known relationship with the levels, and the result was clearly not correct.

I *think* the problem is assigning to the return value of levels(). How did I think to do that? I'm not really sure, but please look at

https://stat.ethz.ch/pipermail/r-help/2008-August/171884.html

Perhaps it does not say to do exactly what I did, but it sure was easy to follow to the mistake, it appeared to do what I wanted, and the consequences of the mistake are ugly. Perhaps levels() should return something that is immutable?? If I am looking at this correctly, levels() is an accident waiting to happen.

What should I have done? It seems:

  # read data and order factor levels
  v=read.table("factor-order.txt",header=TRUE);
  group = factor(v$doseGroup,levels = c("L", "M", "H") );
  boxplot(v$dose~group);

One disappointment is that the above factor() call apparently needs to be repeated for any subset of v - I'm still trying to get my mind around that one.

Can anyone confirm this? It strikes me as a trap that should be addressed so that an error results rather than a garbage graph.

Bill

--- Wilhelm K. Schwab, Ph.D.
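The trap can be reproduced without the original file. The sketch below uses a small invented data frame in place of factor-order.txt (the doses and group labels are assumptions, chosen so that each group's mean is obvious). It contrasts assigning to levels(), which renames the existing levels in their stored alphabetical order, with factor(..., levels = ), which reorders them and leaves every observation's label alone. The grouping column is built with factor() explicitly because read.table() converted strings to factors by default when this thread was written, and has not done so since R 4.0.0.

```r
# Invented stand-in for factor-order.txt: low doses in L, high doses in H.
v <- data.frame(
  dose      = c(1, 1.2, 5, 5.5, 10, 11),
  doseGroup = factor(c("L", "L", "M", "M", "H", "H"))
)
print(levels(v$doseGroup))            # stored alphabetically: H, L, M

# The trap: this RENAMES H -> L, L -> M, M -> H; the data do not move.
wrong <- v$doseGroup
levels(wrong) <- c("L", "M", "H")

# The fix: this REORDERS the levels and keeps each observation's label.
right <- factor(v$doseGroup, levels = c("L", "M", "H"))

print(data.frame(dose = v$dose, original = v$doseGroup, wrong, right))
print(tapply(v$dose, wrong, mean))    # group means attached to the wrong labels
print(tapply(v$dose, right, mean))    # group means in the intended order
```

In the `wrong` column the lowest doses end up labelled M and the highest labelled L, which is the mislabelled boxplot described above.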
Re: [R] Ordering categories on a boxplot - a serious trap??
Phil,

That works[*], but I still think there is a big problem given how easy it is to do the wrong thing, and that searches lead to dangerous instructions. Hopefully this will serve to keep others out of trouble, but so might an immutable return value from levels().

[*] I have not yet done anything with selecting parts of the data frame. Using a separate factor, I quickly hit trouble with size mismatches, though I could probably work around them by recreating the factor after any such change. Proceeding with caution...

Bill

--- Wilhelm K. Schwab, Ph.D.

-Original Message-
From: Phil Spector [mailto:spec...@stat.berkeley.edu]
Sent: Thursday, February 25, 2010 7:06 PM
To: Schwab,Wilhelm K
Subject: Re: [R] Ordering categories on a boxplot - a serious trap??

Wilhelm -

I don't know if this is correct for your problem because you didn't provide a reproducible example, but perhaps you could try

  v$doseGroup = factor(v$doseGroup,levels=c("L", "M", "H"))

instead of setting the levels directly.

- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu

On Thu, 25 Feb 2010, Schwab,Wilhelm K wrote:

Hello all,

I think I probably did something stupid, and R's part was to allow me to do it. My goal was to control the order of factor levels appearing horizontally on a boxplot. Enter search engines and perhaps some creative stupidity on my part, and I came up with the following:

  v=read.table("factor-order.txt",header=TRUE);
  levels(v$doseGroup) = c("L", "M", "H");
  boxplot(v$dose~v$doseGroup);

A good way to see the trap is to evaluate:

  v=read.table("factor-order.txt",header=TRUE);
  par(mfrow=c(2,1));
  boxplot(v$dose~v$doseGroup);
  levels(v$doseGroup) = c("L", "M", "H");
  boxplot(v$dose~v$doseGroup);
  par(mfrow=c(1,1));

The above creates two plots, one correct with the factors in an inconvenient order, and one that is WRONG. In the latter, the labels appear in the desired order, but the data does not move with them. I did not discover the problem until I repeated the same type of plot with something that had a known relationship with the levels, and the result was clearly not correct.

I *think* the problem is assigning to the return value of levels(). How did I think to do that? I'm not really sure, but please look at

https://stat.ethz.ch/pipermail/r-help/2008-August/171884.html

Perhaps it does not say to do exactly what I did, but it sure was easy to follow to the mistake, it appeared to do what I wanted, and the consequences of the mistake are ugly. Perhaps levels() should return something that is immutable?? If I am looking at this correctly, levels() is an accident waiting to happen.

What should I have done? It seems:

  # read data and order factor levels
  v=read.table("factor-order.txt",header=TRUE);
  group = factor(v$doseGroup,levels = c("L", "M", "H") );
  boxplot(v$dose~group);

One disappointment is that the above factor() call apparently needs to be repeated for any subset of v - I'm still trying to get my mind around that one.

Can anyone confirm this? It strikes me as a trap that should be addressed so that an error results rather than a garbage graph.

Bill

--- Wilhelm K. Schwab, Ph.D.
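On the size mismatches mentioned in the footnote: a sketch, using the same kind of invented data as a stand-in for factor-order.txt, of why storing the reordered factor inside the data frame (as Phil suggests) avoids them. A factor kept as a column is subset together with the other columns, and it keeps its level order. The cutoff of 8 and the name `low` are arbitrary choices for the illustration.

```r
# Invented stand-in for the original file.
v <- data.frame(
  dose      = c(1, 1.2, 5, 5.5, 10, 11),
  doseGroup = c("L", "L", "M", "M", "H", "H"),
  stringsAsFactors = FALSE
)

# Reorder once, in place, so the factor travels with the data frame.
v$doseGroup <- factor(v$doseGroup, levels = c("L", "M", "H"))

# Subsetting rows subsets the factor too: lengths always match.
low <- v[v$dose < 8, ]
print(levels(low$doseGroup))          # order L, M, H is kept; H is now empty
print(table(low$doseGroup))

# Optionally drop levels that no longer occur in the subset.
low$doseGroup <- droplevels(low$doseGroup)
print(levels(low$doseGroup))
```

With the factor stored this way, a call such as boxplot(dose ~ doseGroup, data = low) picks up the intended order without repeating the factor() call for each subset.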