Thanks for the detailed advice. And thanks, Richard, for your advice too. In the end (before I received your posts) I managed to move all the files into enough smaller directories that I could browse them in Nautilus. From what I saw, it looked very much as though most of the files were ones that had been deleted by emerge before the big disaster. I didn't look at every single one, obviously, but it soon became clear that I wasn't going to find much of any use.
And thanks for giving a practical example of how to use find. I have
always found the man page rather heavy going, so this is the first time
I have felt I have half an idea how to use it.

Robert

On Tue, 2006-09-26 at 08:20 -0500, Boyd Stephen Smith Jr. wrote:
> On Monday 25 September 2006 22:55, Robert Persson <[EMAIL PROTECTED]>
> wrote about '[gentoo-user] I have 146,000 files in lost+found. How do I
> sort them?':
> > Am I likely to find many usable files in that /lost+found directory?
>
> Maybe. I tried to recover a corrupted ext3 /boot recently and was
> unable to pull anything useful out of lost+found that was larger than
> a symlink. :( If a number of files NOT in lost+found were corrupt,
> it's likely most of the files in lost+found are corrupt as well.
>
> That said, /boot data is generally easy to replace, so I put no effort
> into recovering files that were corrupted. If the data was valuable,
> it might be worth it to spend some time sorting those out.
>
> > If I can, how can I best sift through them?
>
> Carefully. :)
>
> > Is there a utility, or something I could drop into a simple bash
> > script, that would look at the first few bytes of the file and, say,
> > identify it as a jpeg or an xml file, so that it could be given an
> > appropriate file extension, deleted or moved?
>
> As the other poster mentioned, the file utility is useful for
> identifying the type of file. Keep in mind, though, that it only looks
> at the first few bytes of the file; if there's corruption later on,
> file won't notice.
>
> > Or is there one that could distinguish a text file from a binary?
>
> Of course, file does this to some extent. A MIME type of text/* is
> generally text, while anything else is binary. But file's output (by
> default) isn't a simple "binary" or "text" string.
>
> Some of the GNU utilities that are meant for text files will complain
> before operating on a binary file, so you could possibly use those for
> this task. (I'm thinking of less and grep.) In particular,
> grep '[^[:print:]]' should return true when run against a file that
> contains non-printable characters (like control characters or NUL and,
> depending on locale, non-7-bit-clean characters).
>
> > Are there any other strategies I could use to sift through these
> > files (assuming it would be worth doing)?
>
> Well, before you write some sort of bash script around file to rename
> stuff, you'll probably want to remove anything that is clearly trash,
> like device nodes or 0-length files. Something like:
> find lost+found \( \! \( -type f -o -type d -o -type l \) -o -empty \) -delete
> should work if you are using GNU find.
> --
> [email protected] mailing list
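P.S. For anyone who digs this thread out of the archives later: before
running that find command with -delete, it is probably worth doing a dry
run with -print first to see exactly what would go. A sketch, assuming
GNU find:

    # Preview what would be removed: anything that is not a plain file,
    # directory or symlink (device nodes, sockets, fifos), plus anything
    # empty.
    find lost+found \( \! \( -type f -o -type d -o -type l \) -o -empty \) -print

    # Once the list looks sane, swap -print for -delete:
    find lost+found \( \! \( -type f -o -type d -o -type l \) -o -empty \) -delete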
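As for the "bash script around file" idea, something along these lines
might do for a first pass. It is only a sketch and untested; it assumes
a file(1) that understands -b (brief output) and -i (print the MIME
type), and /recovered is just a name I made up:

    #!/bin/bash
    # Bucket recovered files by MIME type, giving each one its subtype
    # as an extension, e.g. '#12345' -> /recovered/image/#12345.jpeg
    cd lost+found || exit 1
    for f in *; do
        [ -f "$f" ] || continue                  # plain files only
        mime=$(file -b -i "$f" | cut -d';' -f1)  # "text/plain; charset=..." -> "text/plain"
        major=${mime%%/*}                        # e.g. "text"
        minor=${mime##*/}                        # e.g. "plain"
        mkdir -p "/recovered/$major"
        mv "$f" "/recovered/$major/$f.$minor"
    done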
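The grep test could be wrapped the same way to split text from binary.
Again just a sketch; note I added [:space:] to the character class so
that tabs in ordinary text files don't get them flagged as binary:

    #!/bin/bash
    # Sort plain files into text/ and binary/ using the
    # non-printable-character test described above.
    cd lost+found || exit 1
    mkdir -p ../text ../binary
    for f in *; do
        [ -f "$f" ] || continue
        if grep -q '[^[:print:][:space:]]' "$f"; then
            mv "$f" ../binary/    # grep found a non-printable byte
        else
            mv "$f" ../text/      # printable characters and whitespace only
        fi
    done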

