Duncan Murdoch wrote:
> On 28/07/2010 8:10 PM, Ray Brownrigg wrote:
>> NOTE: Now submitted to R-devel, as this seems more appropriate.
>>
>> I may have spoken too soon about this having been fixed. (see below).
>>
>> If I create another "unusual but not 'invalid'" filename in the R
>> subdirectory, the
>> behaviour is different from that reported below, and is similar to the
>> original poster's
>> output (the third "unlink" command, where "xyz" was "~"):
>>
>> circa> ls -al RColorBrewer/R
>> total 140
>> -rwxr-xr-x 1 ray ecs 43988 Apr 17 2005 ColorBrewer.R~*
>> -rw-r--r-- 1 ray ecs 0 Jul 29 09:57 residuals.MCMCglmm.R?xyz
>
> Ray clarified to me that this filename was "residuals.MCMCglmm.R"
> preceded by 3 spaces and followed by a carriage return and "xyz".
>
>> drwxr-xr-x 2 ray ecs 4096 Jul 29 12:02 ./
>> drwxr-xr-x 5 ray ecs 4096 Jul 29 11:49 ../
>> -rwxr-xr-x 1 ray ecs 43988 Jul 29 09:57 ColorBrewer.R*
>> -rwxr-xr-x 1 ray ecs 43988 Apr 17 2005 ColorBrewer.R~*
>> -rw-r--r-- 1 ray ecs 0 Jul 29 09:58 residuals.MCMCglmm.R
>> circa>
>> circa>
>> circa> R CMD build RColorBrewer
>> * checking for file 'RColorBrewer/DESCRIPTION' ... OK
>> * preparing 'RColorBrewer':
>> * checking DESCRIPTION meta-information ... OK
>> * checking whether 'INDEX' is up-to-date ... NO
>> * use '--force' to overwrite the existing 'INDEX'
>> * removing junk files
>> unlink RColorBrewer/R/ ColorBrewer.R~
>> unlink RColorBrewer/R/ColorBrewer.R
>> unlink RColorBrewer/R/ residuals.MCMCglmm.R
>> xyz
>
> That certainly looks bad. I can't reproduce it on Windows; it doesn't
> allow that filename. So I'll have to leave this for a Unix-alike user.

I have been following this from the sideline, because I suck really bad
when it comes to Perl programming. However....

I'm seeing this stuff in the build script:

    ## Remove exclude files.
    open(EXCLUDE, "< $exclude");
    while(<EXCLUDE>) {
        rmtree(glob($_));
    }
    close(EXCLUDE);

Now this comes after

    find(\&find_exclude_files, "$pkgname");

which AFAICT prints a number of file names into EXCLUDE. Now if one of
those file names contains a wildcard, I conjecture that the glob() can
make weird things happen. I don't think we want to glob there, do we?

Another issue is that EXCLUDE seems unprotected against file names with
embedded newlines. Something like find's -print0 would be handy...

> Duncan Murdoch
>
>> unlink RColorBrewer/R/residuals.MCMCglmm.R
>> unlink RColorBrewer/R/ColorBrewer.R~
>> rmdir RColorBrewer/R
>> * checking for LF line-endings in source and make files
>> * checking for empty or unneeded directories
>> * building 'RColorBrewer_1.0-3.tar.gz'
>>
>> circa>
>>
>> Ray Brownrigg
>>
>> On Thu, 29 Jul 2010, Ray Brownrigg wrote:
>>> On Thu, 29 Jul 2010, Duncan Murdoch wrote:
>>>> On 28/07/2010 10:01 AM, Jarrod Hadfield wrote:
>>>>> Hi Marc,
>>>>>
>>>>> Thanks for the info on recovery - most of it can be pieced together from
>>>>> backups but a quick, cheap and easy method of recovery would have been
>>>>> nicer.
>>>>>
>>>>> My main concern is that this could happen again and that the "bug" is
>>>>> not limited to R 2.9. I would think that an accidental carriage return
>>>>> at the end of a file name (even a temporary one) would be a reasonably
>>>>> common phenomenon (I'm surprised I hadn't done it before).
>>>> If you can put together a recipe to reproduce the problem (or a less
>>>> extreme version of R deleting files it shouldn't), we'll certainly fix
>>>> it. But so far all we've got are guesses about what might have gone
>>>> wrong, and I don't think anyone has been able to reproduce the problem
>>>> on current R.
>>> Duncan:
>>>
>>> It looks to me like it has already been fixed, if indeed that was the
>>> problem. In R-2.10.1, I tried to reproduce the problem (using
>>> RColorBrewer, since that was the smallest package I have a local copy of),
>>> and the build produced this:
>>>
>>> * removing junk files
>>> * excluding invalid files from 'RColorBrewer'
>>> Subdirectory 'R' contains invalid file names:
>>> residuals.MCMCglmm.R xyz
>>>
>>> where the space shown between the "R" and the "xyz" was a newline
>>> character. [I didn't dare try using a "~" :-)]
>>>
>>> Ray Brownrigg
>>>
>>>> Duncan Murdoch
>>>>
>>>>> Cheers,
>>>>>
>>>>> Jarrod
>>>>>
>>>>> On 28 Jul 2010, at 14:04, Marc Schwartz wrote:
>>>>>> Jarrod,
>>>>>>
>>>>>> Noting your exchange with Martin, Martin brings up a point that
>>>>>> certainly I missed, which is that somehow the tilde ('~') character
>>>>>> got into the chain of events. As Martin noted, on Linuxen/Unixen
>>>>>> (including OSX), the tilde, when used in the context of file name
>>>>>> globbing, refers to your home directory. Thus, a command such as:
>>>>>>
>>>>>> ls ~
>>>>>>
>>>>>> will list the files in your home directory. Similarly:
>>>>>>
>>>>>> rm ~
>>>>>>
>>>>>> will remove the files there as well. If the -rf argument is added,
>>>>>> then the deletion becomes recursive through that directory tree,
>>>>>> which appears to be the case here.
>>>>>>
>>>>>> I am unclear, as Martin appears to be, as to the steps that caused
>>>>>> this to happen. That may yet be related in some fashion to Duncan's
>>>>>> hypothesis.
>>>>>>
>>>>>> That being said, the use of the tilde character as a suffix to
>>>>>> denote that a file is a backup version, is not limited to Fedora or
>>>>>> Linux, for that matter. It is quite common for many text editors
>>>>>> (eg. Emacs) to use this. As a result, it is also common for many
>>>>>> applications to ignore files that have a tilde suffix.
>>>>>>
>>>>>> Based upon your follow up posts to the original thread, it would
>>>>>> seem that you do not have any backups. The default ext3 file system
>>>>>> that is used on modern Linuxen, by design, makes it a bit more
>>>>>> difficult to recover deleted files. This is due to the unlinking of
>>>>>> file metadata at the file system data structure level, as opposed to
>>>>>> simply marking the file as deleted in the directory structures, as
>>>>>> happens on Windows.
>>>>>>
>>>>>> There is a utility called ext3undel
>>>>>> (http://projects.izzysoft.de/trac/ext3undel ), which is a wrapper of
>>>>>> sorts to other undelete utilities such as PhotoRec and foremost. I
>>>>>> have not used it/them, so cannot speak from personal experience. Thus
>>>>>> it would be a good idea to engage in some reviews of the
>>>>>> documentation and perhaps other online resources before proceeding.
>>>>>> The other
>>>>>> consideration is the Catch-22 of not copying anything new to your
>>>>>> existing HD, for fear of overwriting the lost files with new data. So
>>>>>> you would need to consider an approach of downloading these utilities
>>>>>> via another computer and then running them on the computer in
>>>>>> question from other media, such as a CD/DVD or USB HD.
>>>>>>
>>>>>> A more expensive option would be to use a professional data recovery
>>>>>> service, where you would have to consider the cost of recovery
>>>>>> versus your lost time. One option would be Kroll OnTrack UK
>>>>>> (http://www.ontrackdatarecovery.co.uk/ ). I happen to live about a
>>>>>> quarter mile from their world HQ here in a suburb of Minneapolis. I
>>>>>> have not used them myself, but others that I know have, with good
>>>>>> success. Again, this comes at a
>>>>>> potentially substantial monetary cost.
>>>>>>
>>>>>> The key is that if you have any hope to recover the deleted files,
>>>>>> you do not copy anything new onto the hard drive in the mean time.
>>>>>> Doing so will decrease the possibility of file recovery to near 0.
>>>>>>
>>>>>> As Duncan noted, there is great empathy with your situation. We have
>>>>>> all gone through this at one time or another. In my case, it was
>>>>>> perhaps 20+ years ago, but as a result, I am quite anal retentive
>>>>>> about having backups, which I have done for some time on my systems,
>>>>>> hourly.
>>>>>>
>>>>>> HTH,
>>>>>>
>>>>>> Marc Schwartz
>>>>>>
>>>>>> On Jul 28, 2010, at 5:55 AM, Jarrod Hadfield wrote:
>>>>>>> Hi Martin,
>>>>>>>
>>>>>>> I think this is the most likely reason given that the name in the
>>>>>>> DESCRIPTION file does NOT have a version number. Even so, it is
>>>>>>> very easy to misname a file and then delete it/change its name (as
>>>>>>> I've done here) and I hope current versions of R would not cause
>>>>>>> this problem. Perhaps Fedora should not use ~ as its back up file
>>>>>>> suffixes?
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Jarrod
>>>>>>>
>>>>>>> On 28 Jul 2010, at 11:41, Martin Maechler wrote:
>>>>>>>>>>>>> Jarrod Hadfield <j.hadfi...@ed.ac.uk>
>>>>>>>>>>>>> on Tue, 27 Jul 2010 21:37:09 +0100 writes:
>>>>>>>>> Hi, I ran R (version 2.9.0) CMD build under root in
>>>>>>>>> Fedora (9). When it tried to remove "junk files" it
>>>>>>>>> removed EVERYTHING in my local account! (See below).
>>>>>>>>>
>>>>>>>>> Can anyone tell me what happened,
>>>>>>>> the culprit may lie here:
>>>>>>>>>> * removing junk files
>>>>>>>>>> unlink MCMCglmm_2.05/R/ residuals.MCMCglmm.R
>>>>>>>>>> ~
>>>>>>>> where it seems that someone (you?) has added a newline
>>>>>>>> in the filename, so instead of
>>>>>>>> 'residuals.MCMCglmm.R~'
>>>>>>>> you got
>>>>>>>>
>>>>>>>> 'residuals.MCMCglmm.R
>>>>>>>> ~'
>>>>>>>>
>>>>>>>> and the unlink / rm command interpreted '~' as your home
>>>>>>>> directory.
>>>>>>>>
>>>>>>>> But I can hardly believe it.
>>>>>>>> This explanation seems a bit doubtful to me.. ...
>>>>>>>>
>>>>>>>>> and even more importantly if I can restore what was lost.
>>>>>>>> well, you just get it from the backup. You do daily backups, do
>>>>>>>> you?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Martin Maechler, ETH Zurich
>>>> ______________________________________________
>>>> r-h...@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html and provide commented,
>>>> minimal, self-contained, reproducible code.
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd....@cbs.dk Priv: pda...@gmail.com
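
For what it's worth, the second issue raised in the reply above (a list of
file names written one per line, versus something like find's -print0) can
be sketched in a few lines. This is only an illustration in Python, not the
Perl of the build script, and the file name and both helper functions are
made up for the example. It assumes nothing about how R's build script
actually writes EXCLUDE beyond "one name per line".

```python
import os
import tempfile

# Hypothetical name modelled on the one in the thread:
# three leading spaces, then an embedded newline followed by "~".
BAD_NAME = "   residuals.MCMCglmm.R\n~"


def roundtrip_newline_delimited(names):
    """Write one name per line, then read the list back line by line."""
    with tempfile.TemporaryDirectory() as tmp:
        list_path = os.path.join(tmp, "exclude.txt")
        with open(list_path, "w", newline="") as fh:
            for name in names:
                fh.write(name + "\n")
        with open(list_path, newline="") as fh:
            # Each physical line becomes one record, so an embedded
            # newline silently turns one name into two.
            return [line.rstrip("\n") for line in fh]


def roundtrip_nul_delimited(names):
    """Write names separated by NUL bytes (the idea behind -print0)."""
    with tempfile.TemporaryDirectory() as tmp:
        list_path = os.path.join(tmp, "exclude.bin")
        with open(list_path, "wb") as fh:
            for name in names:
                fh.write(os.fsencode(name) + b"\0")
        with open(list_path, "rb") as fh:
            data = fh.read()
    # NUL cannot occur inside a Unix file name, so splitting on it
    # recovers every name exactly as it was written.
    return [os.fsdecode(chunk) for chunk in data.split(b"\0") if chunk]


if __name__ == "__main__":
    names = [BAD_NAME, "ColorBrewer.R~"]
    print(roundtrip_newline_delimited(names))
    print(roundtrip_nul_delimited(names))
```

With the newline-delimited list, the one odd name comes back as two
records, and the second of them is a bare "~". If that record is then
passed to anything that performs tilde expansion, it would refer to the
home directory, which is consistent with the conjecture above. The
NUL-delimited list returns the names unchanged.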