http://lists.acm.jhu.edu/pipermail/acm/2007-November/006831.html

[ACM] **SPAM** Re: ext2/ext3 on-disk format

Peter Froehlich phf at cs.jhu.edu
Wed Nov 28 11:50:38 EST 2007
Hi all,

Just on the general topic of recovering emails from "trashed" disks,
has it occurred to you that there might be serious investigative
applications for a product that does exactly what Asheesh wants? With
all the emails getting lost in the White House and all... :-)
Seriously, maybe there's even a quick conference paper there. I'd
recommend checking in with Randal and his group.

Cheers,
Peter

On Nov 28, 2007, at 6:44 AM, Antonello Cruz wrote:

> Asheesh,
>
> ext2/ext3 data should be aligned by block (the block size is 1K, 2K,
> or 4K, depending on how the filesystem was created)
> http://en.wikipedia.org/wiki/Ext2
>
> Finding the end of a file that is longer than one block is tricky,
> since the blocks storing that file are not a linked list. They form a
> sort of tree rooted at the inode (see the wiki page). I am not sure
> how long your emails generally are, but if they are shorter than a
> block, your approach for finding the beginning of the message should
> work.
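
To make the "tree rooted at the inode" concrete, here is a rough Python
sketch. It assumes the classic ext2/ext3 block map: 12 direct block
pointers in the inode, followed by one single-, one double-, and one
triple-indirect pointer, with 4-byte block numbers. The name
pointer_level is made up for illustration; the function only reports
which level of the tree holds a given logical block of a file.

```python
DIRECT_POINTERS = 12   # direct block pointers stored in the inode itself
POINTER_SIZE = 4       # ext2/ext3 block numbers are 32-bit


def pointer_level(logical_block, block_size=4096):
    """Return 'direct', 'single', 'double', or 'triple' for a logical block index."""
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    per_block = block_size // POINTER_SIZE  # pointers held by one indirect block
    if logical_block < DIRECT_POINTERS:
        return "direct"
    logical_block -= DIRECT_POINTERS
    if logical_block < per_block:
        return "single"
    logical_block -= per_block
    if logical_block < per_block ** 2:
        return "double"
    logical_block -= per_block ** 2
    if logical_block < per_block ** 3:
        return "triple"
    raise ValueError("beyond the reach of the triple-indirect pointer")
```

With 4K blocks, the first 12 blocks (48 KB) of a file are reached
through direct pointers alone, so a typical single message would not
involve an indirect block at all.
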
>
> Another approach, more cumbersome though, is finding the beginning of
> each message, which will tell you the first block of the file you want
> to recover. Then you can go to the blocks that are supposed to have
> the inodes (you'll need to figure out how ext2/3 is laid out at the
> beginning of the disk) and find the inode corresponding to that file.
> There can be more than one, since a deleted inode from a file
> previously stored at the same block may still point to it. (A hard
> link does not add an inode; it is a second directory entry for the
> same inode.)
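
For the "figure out how ext2/3 is laid out" step, the superblock is the
usual starting point: it sits 1024 bytes into the filesystem and
records, among other things, the block size and the number of inodes
per block group. Below is a minimal sketch, assuming a little-endian
ext2/ext3 superblock and reading only a handful of fields. The function
name read_superblock and the dictionary keys are illustrative, not part
of any library.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # the superblock starts 1024 bytes into the filesystem
EXT2_MAGIC = 0xEF53        # s_magic value shared by ext2 and ext3


def read_superblock(image):
    """Parse a few superblock fields from a bytes-like filesystem image."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    if len(sb) < 58:
        raise ValueError("image too short to contain a superblock")
    inodes_count, blocks_count = struct.unpack_from("<II", sb, 0)
    (log_block_size,) = struct.unpack_from("<I", sb, 24)
    (blocks_per_group,) = struct.unpack_from("<I", sb, 32)
    (inodes_per_group,) = struct.unpack_from("<I", sb, 40)
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2/ext3 superblock (bad magic)")
    return {
        "inodes_count": inodes_count,
        "blocks_count": blocks_count,
        "block_size": 1024 << log_block_size,  # stored as log2(size) - 10
        "blocks_per_group": blocks_per_group,
        "inodes_per_group": inodes_per_group,
    }
```

Passing it the first 2048 bytes of the saved image is enough. The
values should agree with what "tune2fs -l" prints as "Block size" and
"Inodes per group", which is a handy cross-check before trusting the
parse.
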
>
> Keep in mind that I am not an ext2/3 expert or a storage system  
> expert.
>
> Good luck,
>
> Antonello
>
> --- Asheesh Laroia <acm at jhu.asheesh.org> wrote:
>
>> A few months back, I suffered some major data loss on some hard
>> drives. (Lesson learned: RAID is not backup.)  I had a partial backup
>> of my emails that were stored on those drives, but a couple of days
>> before the main drives failed I rm -rf'd the backup.  The partial
>> backup was stored on ext3.
>>
>> Then the main drives failed, so I saved a disk image of the drive
>> where the partial backup was rm'd.
>>
>> So today I'm looking at that saved disk image in a hex editor.  I
>> don't need filenames, and I can identify the sorts of files I want: I
>> want email files (messages, one per file, in Maildirs), and they're
>> really easy to detect: They start with a mail header, which looks
>> something like "Date: Tue, 16 Sep"....
>>
>> But what I do need is a reliable way to detect file boundaries in
>> ext3, preferably a way that works for deleted files also.
>>
>> For file starts - Do they always start at offsets that fit a pattern,
>> like (offset % 2048) == 0?  Then I need only look for email headers
>> at those positions.
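
A rough sketch of that block-aligned scan, assuming file data starts on
block boundaries and that block_size matches the filesystem (a smaller
power of two, such as 1024, also works and just checks more offsets).
The name find_candidate_starts and the default markers tuple are
illustrative; only "Date:" comes from the description above, and real
messages may well begin with other header fields.

```python
def find_candidate_starts(image, block_size=4096, markers=(b"Date: ",)):
    """Return block-aligned offsets whose first bytes match a header marker."""
    prefixes = tuple(markers)
    longest = max(len(m) for m in prefixes)
    offsets = []
    for offset in range(0, len(image), block_size):
        # Only the first few bytes of each block need to be examined.
        head = image[offset:offset + longest]
        if head.startswith(prefixes):
            offsets.append(offset)
    return offsets
```

The image argument can be a bytes object or a read-only mmap of the
disk image, so a large image does not have to be loaded into memory.
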
>>
>> For file ends - Is there file-end zero padding until some block
>> width, like "after the file the rest of the 4096-size block is padded
>> with zeroes"?  Then I can use that to detect that I have the whole
>> message file.
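
If the tail of the last block is indeed zero-filled, the end of a
message can be approximated by stripping trailing NUL bytes from the
final block. This is a small sketch under that assumption, with
illustrative function names. It would also truncate a file that
legitimately ends in NUL bytes, which is unlikely for plain-text mail.

```python
def trim_zero_padding(data):
    """Drop trailing NUL bytes, e.g. the unused tail of a file's last block."""
    return data.rstrip(b"\x00")


def looks_like_last_block(block):
    """Heuristic: a block that ends in a NUL byte is probably a file's final block."""
    return len(block) > 0 and block[-1] == 0
```

A message whose length is an exact multiple of the block size has no
padding at all, so the heuristic can miss the end of such a file.
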
>>
>> The filesystem where the deletes happened can be inspected with
>> things like debugfs or tune2fs.  Assume I don't know anything about
>> filesystems but in general am a reasonable fellow who will try to
>> understand what you teach him.
>>
>> I'd dearly appreciate help, for example from people who took Storage
>> Systems.  If you only know about ext2, tell me anyway - ext3 is quite
>> similar!
>>
>> -- Asheesh.
>>
>> --
>> I finally went to the eye doctor.  I got contacts.  I only need  
>> them to
>> read, so I got flip-ups.
>>  		-- Steven Wright
>> _______________________________________________
>> ACM mailing list
>> ACM at acm.jhu.edu
>> http://lists.acm.jhu.edu/mailman/listinfo/acm
>>

--
Peter H. Froehlich <><><><><><> http://www.cs.jhu.edu/~phf/
OpenPGP: ABC2 9BCC 1445 86E9 4D59  F532 A8B2 BFAE 342B E9D9



