Re: sha1summ of complete directory?
On Tue, Jul 07, 2009 at 12:37:37PM -0400, Scott Gifford wrote:
> The purpose of the ls was to sort the filenames, but looking more closely, bash sorts them already. csh and ksh do the same, and glob(3) sorts by default. I'm not sure if all shells do that or not, but it seems that most do, and if your shell does, those two should be equivalent.

That depends on the value of the LC_COLLATE environment variable.

--
Chris.
==
I contend that we are both atheists. I just believe in one fewer god than you do. When you understand why you dismiss all the other possible gods, you will understand why I dismiss yours.
  -- Stephen F Roberts

--
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
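As a rough illustration of the LC_COLLATE point, the sketch below pins the locale to C before expanding a glob, so the order no longer depends on the caller's environment. The file names are made up for the demonstration, and it assumes a shell whose glob sorting follows the collation locale, as bash does.

```shell
#!/bin/sh
# Sketch: pin the collation locale so glob expansion order is reproducible.
# The directory and file names are invented for this demonstration.
set -eu

demo=$(mktemp -d)
touch "$demo/a" "$demo/B" "$demo/c"

# In the C locale, names sort by byte value, so uppercase sorts before lowercase.
# The assignment happens inside the command substitution, so it does not leak.
c_order=$(cd "$demo" && LC_ALL=C && export LC_ALL && echo *)
echo "C locale glob order: $c_order"    # prints: C locale glob order: B a c

rm -rf "$demo"
```

Many UTF-8 locales would typically order the same names as `a B c`, which is enough to change the hash of the concatenated contents. Setting LC_ALL=C (or at least LC_COLLATE=C) around the hashing command removes that variable.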
Re: sha1summ of complete directory?
On Wed, Jul 15, 2009 at 07:36:24AM -0700, Todd A. Jacobs wrote:
> On Mon, Jul 06, 2009 at 07:30:19PM -0500, Ron Johnson wrote:
> > How would one go about computing a *single* hash value for a complete directory tree?
>
> You might want to look at how git does this. As I understand it, git stores hashes of trees, so the implementation may help you.

Not really... the hash git indexes with is that of the compressed object (which is either a blob, tree, or commit). Tree and commit objects point at other objects (which are also stored by hash). Blobs are the files themselves.

More info:
http://www.gitready.com/beginner/2009/02/17/how-git-stores-your-data.html
http://eagain.net/articles/git-for-computer-scientists/

Cheers,

--
Eric Gerlach, Network Administrator
Federation of Students
University of Waterloo
p: (519) 888-4567 x36329
e: egerl...@feds.uwaterloo.ca
Re: sha1summ of complete directory?
In 20090716151953.ge4...@wks0082.feds.uwaterloo.ca, Eric Gerlach wrote:
> On Wed, Jul 15, 2009 at 07:36:24AM -0700, Todd A. Jacobs wrote:
> > On Mon, Jul 06, 2009 at 07:30:19PM -0500, Ron Johnson wrote:
> > > How would one go about computing a *single* hash value for a complete directory tree?
> >
> > You might want to look at how git does this. As I understand it, git stores hashes of trees, so the implementation may help you.
>
> Not really... the hash git indexes with is that of the compressed object (which is either a blob, tree, or commit).

Actually, I'm fairly sure it hashes the uncompressed object (now[1]), but I'd have to dig into the source code to be sure.

> Tree and commit objects point at other objects (which are also stored by hash). Blobs are the files themselves.

That is one way of calculating a single hash for a complete directory tree. The tree is identified by its hash, which verifies the contents. The contents identify the pointed-to objects by hash, which verifies their contents. Etc.

The hash/sum calculated has the same verification properties as a single-file data-only hash. It *might* not be as cryptographically strong, but that would be a bit surprising and I've seen no papers/pages verifying or refuting its strength.[2]

--
Boyd Stephen Smith Jr.                   ,= ,-_-. =.
b...@iguanasuicide.net                   ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy         `-'(. .)`-'
http://iguanasuicide.net/                    \_/

[1] There was a small period of time during Linus's maintainership of git that it hashed differently than it does now. I can't recall why or when it was changed.

[2] Other than the fact that it uses a 160-bit SHA-1 hash and that *may* be getting too weak to be considered cryptographically secure in the near future. Using SHA-2 is probably better, and you shouldn't lose much strength by truncating to 160 bits if you need that size specifically, but git doesn't support that. Hopefully SHA-3 will be out before it matters, which means git can switch to that.[3]

[3] If they ever decide to switch, it will probably be painful. They might not ever switch, since I don't think that resistance against attackers was the intent, just identification and resistance to random corruption. (CVS and SVN could be silently corrupted for years and it was virtually impossible to tell; that doesn't happen to git repositories.)
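To make the "uncompressed object" point concrete, here is a minimal sketch that reproduces a git blob ID with plain sha1sum. It assumes the object layout described in the git documentation: the header `blob <size>` plus a NUL byte, followed by the file contents, all hashed before zlib compression. The function name `git_blob_sha1` is invented for this example.

```shell
#!/bin/sh
# Sketch: compute a git blob ID without git.
# Assumed layout: "blob <size-in-bytes>" NUL <contents>, hashed with SHA-1
# before any compression takes place.
set -eu

git_blob_sha1() (
    # $1: path to a regular file
    size=$(wc -c < "$1")
    # $((size)) strips any padding some wc implementations add.
    { printf 'blob %s\0' "$((size))"; cat "$1"; } | sha1sum | cut -d' ' -f1
)

tmp=$(mktemp)
: > "$tmp"                       # an empty file
empty_id=$(git_blob_sha1 "$tmp")
echo "$empty_id"                 # prints: e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
rm -f "$tmp"
```

If git is installed, `git hash-object <file>` should print the same ID for the same file. Tree objects are hashed in the same manner over a list of entries (mode, name, and the ID of each blob or subtree), which is how one root ID ends up covering the whole directory.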
Re: sha1summ of complete directory?
On Mon, Jul 06, 2009 at 07:30:19PM -0500, Ron Johnson wrote:
> How would one go about computing a *single* hash value for a complete directory tree?

You might want to look at how git does this. As I understand it, git stores hashes of trees, so the implementation may help you.

--
Oh, look: rocks!
  -- Doctor Who, Destiny of the Daleks
Re: sha1summ of complete directory?
On Tue, Jul 07, 2009 at 12:08:05AM -0400, Scott Gifford wrote:
> Ron Johnson ron.l.john...@cox.net writes:
>
> You will need to figure out what metadata you care about and what you don't, then. For example, do you want to detect a renamed file? A change in mtime/ctime/atime? A change in permissions?
>
> For the file contents, if there are no subdirectories you can use:
>
>     cat `ls` |sha1sum

Which is basically:

    cat * | sha1sum

And it would still fail to detect the difference between:

case 1:
    seq 1 6 > file1

case 2:
    seq 1 3 > file1
    seq 4 6 > file2

(Same content, different metadata)

> ls will sort the files, so they will end up in the same order every time. For metadata, you could come up with the flags to ls that give the metadata you want, then include that before or after the cat, like this:
>
>     (ls -l; cat `ls`) |sha1sum

Same comment about echo *.

Also note that packing files together with their metadata is, in fact, exactly what an archive utility does:

    tar cf - . | sha1sum

This will also pick hidden files and subdirectories.

Oh, and sha1sum is not the only checksum available:

$ ls /usr/bin/*sum
/usr/bin/cksum             /usr/bin/lh_source_md5sum  /usr/bin/sha384sum
/usr/bin/innochecksum      /usr/bin/md5sum            /usr/bin/sha512sum
/usr/bin/jacksum           /usr/bin/sha1sum           /usr/bin/shasum
/usr/bin/kmk_md5sum        /usr/bin/sha224sum         /usr/bin/sum
/usr/bin/lh_binary_md5sum  /usr/bin/sha256sum

Some of those are actually relevant :-)

--
Tzafrir Cohen         | tzaf...@jabber.org | VIM is
http://tzafrir.org.il |                    | a Mutt's
tzaf...@cohens.org.il |                    | best
ICQ# 16849754         |                    | friend
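The case 1 / case 2 objection can be checked directly. The sketch below builds both layouts in temporary directories and compares a contents-only hash with one that also feeds each file name into the hash. The function names are invented for this example, and the second function is only one possible way to make file boundaries visible.

```shell
#!/bin/sh
# Sketch: show that hashing concatenated contents cannot tell where one file
# ends and the next begins, and that mixing in the file names can.
set -eu

# Hash only the concatenated contents of the files in directory $1.
contents_only() (
    cd "$1" && LC_ALL=C && export LC_ALL &&
    cat ./* | sha1sum | cut -d' ' -f1
)

# Hash each file name (NUL-terminated) followed by that file's contents.
names_and_contents() (
    cd "$1" && LC_ALL=C && export LC_ALL &&
    for f in ./*; do printf '%s\0' "$f"; cat "$f"; done | sha1sum | cut -d' ' -f1
)

one=$(mktemp -d); two=$(mktemp -d)
seq 1 6 > "$one/file1"                            # case 1
seq 1 3 > "$two/file1"; seq 4 6 > "$two/file2"    # case 2

c1=$(contents_only "$one");      c2=$(contents_only "$two")
n1=$(names_and_contents "$one"); n2=$(names_and_contents "$two")
echo "contents only:      $c1 $c2"    # the two digests match
echo "names and contents: $n1 $n2"    # the two digests differ

rm -rf "$one" "$two"
```

Like `cat *`, both functions skip hidden files and do not descend into subdirectories.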
Re: sha1summ of complete directory?
Tzafrir Cohen tzaf...@cohens.org.il writes:
> On Tue, Jul 07, 2009 at 12:08:05AM -0400, Scott Gifford wrote:

[...]

> > For the file contents, if there are no subdirectories you can use:
> >
> >     cat `ls` |sha1sum
>
> Which is basically:
>
>     cat * | sha1sum

The purpose of the ls was to sort the filenames, but looking more closely, bash sorts them already. csh and ksh do the same, and glob(3) sorts by default. I'm not sure if all shells do that or not, but it seems that most do, and if your shell does, those two should be equivalent. Thanks Tzafrir, I learned something today! :-)

[...]

> Also note that packing files together with their metadata is, in fact, exactly what an archive utility does:
>
>     tar cf - . | sha1sum
>
> This will also pick hidden files and subdirectories.

If there were a way to specify what metadata to include in the archive, this would be a perfect solution for the OP.

Scott.
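GNU tar does offer some control over the stored metadata. A hedged sketch, assuming GNU tar 1.28 or later (needed for `--sort=name`): it fixes the entry order, timestamps, and ownership, while file names and permission bits stay in the archive. The function name `normalized_tar_sha1` is made up, and other tar implementations may not accept these options.

```shell
#!/bin/sh
# Sketch: hash a tree through tar while pinning metadata that should not count.
# Assumes GNU tar >= 1.28; names, contents and permission bits still matter.
set -eu

normalized_tar_sha1() {
    # $1: directory to hash
    tar -C "$1" --format=gnu --sort=name \
        --mtime=@0 \
        --owner=0 --group=0 --numeric-owner \
        -cf - . | sha1sum | cut -d' ' -f1
}

a=$(mktemp -d); b=$(mktemp -d)
printf 'same contents\n' > "$a/file"
printf 'same contents\n' > "$b/file"
touch -t 200001010000 "$a/file"       # give one copy a very different mtime

echo "tree a: $(normalized_tar_sha1 "$a")"
echo "tree b: $(normalized_tar_sha1 "$b")"

rm -rf "$a" "$b"
```

With the timestamps and ownership pinned, the two trees should produce the same digest even though their mtimes differ. If permission bits should be ignored as well, GNU tar's `--mode` option can force them to a fixed value.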
sha1summ of complete directory?
How would one go about computing a *single* hash value for a complete directory tree?

--
Scooty Puff, Sr
The Doom-Bringer
Re: sha1summ of complete directory?
On Monday 06 July 2009 08:30 pm, Ron Johnson wrote:
> How would one go about computing a *single* hash value for a complete directory tree?
>
> --
> Scooty Puff, Sr
> The Doom-Bringer

Tar the tree and then calculate the sha1sum of the tar file. Easy, no?

Mark
Re: sha1summ of complete directory?
2009/7/7 Mark Neidorff m...@neidorff.com:
> On Monday 06 July 2009 08:30 pm, Ron Johnson wrote:
> > How would one go about computing a *single* hash value for a complete directory tree?

This is what I was thinking. If you don't want to retain the tar file, then pipe it to sha1sum.

Adrian

--
24x7x365 != 24x7x52 Stupid or bad maths?
<erno> hm. I've lost a machine.. literally _lost_. it responds to ping, it works completely, I just can't figure out where in my apartment it is.
Re: sha1summ of complete directory?
On 2009-07-06 20:29, Adrian Levi wrote:
> 2009/7/7 Mark Neidorff m...@neidorff.com:
> > On Monday 06 July 2009 08:30 pm, Ron Johnson wrote:
> > > How would one go about computing a *single* hash value for a complete directory tree?
>
> This is what I was thinking. If you don't want to retain the tar file, then pipe it to sha1sum.

I was thinking about that, but tar stores metadata that I know are different between the two trees, thus invalidating that idea.

--
Scooty Puff, Sr
The Doom-Bringer
Re: sha1summ of complete directory?
On Mon, Jul 06, 2009 at 07:30:19PM -0500, Ron Johnson wrote:
> How would one go about computing a *single* hash value for a complete directory tree?

Hash everything and then hash the result? (If you don't care about metadata, that is. If you do, add a nice stat of everything into the final hash.)

--
A search of his car uncovered pornography, a homemade sex aid, women's stockings and a Jack Russell terrier.
  - http://www.news.com.au/story/0%2C27574%2C24675808-421%2C00.html
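A minimal sketch of the "hash everything, then hash the result" idea, assuming GNU findutils and coreutils (for `find -print0`, `sort -z`, and `xargs -0 -r`). The function name `tree_sha1` is invented here. The per-file digests are listed together with their relative paths, so a renamed file changes the result, while timestamps, ownership, permissions, and empty directories are ignored.

```shell
#!/bin/sh
# Sketch: digest every regular file, then digest the sorted list of
# "<sha1>  <relative path>" lines. Assumes GNU find, sort, and xargs.
set -eu

tree_sha1() (
    # $1: root of the tree; paths are taken relative to it
    cd "$1" &&
    find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r sha1sum |
        sha1sum | cut -d' ' -f1
)

src=$(mktemp -d); copy=$(mktemp -d)
mkdir "$src/sub" "$copy/sub"
printf 'alpha\n' > "$src/a";     printf 'alpha\n' > "$copy/a"
printf 'beta\n'  > "$src/sub/b"; printf 'beta\n'  > "$copy/sub/b"

echo "src:  $(tree_sha1 "$src")"
echo "copy: $(tree_sha1 "$copy")"

rm -rf "$src" "$copy"
```

The two trees above should yield the same digest. To fold in metadata as suggested, the output of a stat-like listing could be added to the stream before the final sha1sum, at the cost of deciding which fields are stable between the trees being compared.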
Re: sha1summ of complete directory?
Ron Johnson ron.l.john...@cox.net writes:
> On 2009-07-06 20:29, Adrian Levi wrote:
> > 2009/7/7 Mark Neidorff m...@neidorff.com:
> > > On Monday 06 July 2009 08:30 pm, Ron Johnson wrote:
> > > > How would one go about computing a *single* hash value for a complete directory tree?
> >
> > This is what I was thinking. If you don't want to retain the tar file, then pipe it to sha1sum.
>
> I was thinking about that, but tar stores metadata that I know are different between the two trees, thus invalidating that idea.

You will need to figure out what metadata you care about and what you don't, then. For example, do you want to detect a renamed file? A change in mtime/ctime/atime? A change in permissions?

For the file contents, if there are no subdirectories you can use:

    cat `ls` |sha1sum

ls will sort the files, so they will end up in the same order every time. For metadata, you could come up with the flags to ls that give the metadata you want, then include that before or after the cat, like this:

    (ls -l; cat `ls`) |sha1sum

Hope this helps,

Scott.
Re: sha1summ of complete directory?
On 2009-07-06 23:08, Scott Gifford wrote:
> Ron Johnson ron.l.john...@cox.net writes:
> > On 2009-07-06 20:29, Adrian Levi wrote:
> > > 2009/7/7 Mark Neidorff m...@neidorff.com:
> > > > On Monday 06 July 2009 08:30 pm, Ron Johnson wrote:
> > > > > How would one go about computing a *single* hash value for a complete directory tree?
> > >
> > > This is what I was thinking. If you don't want to retain the tar file, then pipe it to sha1sum.
> >
> > I was thinking about that, but tar stores metadata that I know are different between the two trees, thus invalidating that idea.
>
> You will need to figure out what metadata you care about and what you don't, then. For example, do you want to detect a renamed file? A change in mtime/ctime/atime? A change in permissions?
>
> For the file contents, if there are no subdirectories you can use:
>
>     cat `ls` |sha1sum
>
> ls will sort the files, so they will end up in the same order every time. For metadata, you could come up with the flags to ls that give the metadata you want, then include that before or after the cat, like this:
>
>     (ls -l; cat `ls`) |sha1sum
>
> Hope this helps,

Thanks, I *think* it did.

--
Scooty Puff, Sr
The Doom-Bringer
Re: sha1summ of complete directory?
On Mon, Jul 06, 2009 at 07:30:19PM -0500, Ron Johnson ron.l.john...@cox.net was heard to say:
> How would one go about computing a *single* hash value for a complete directory tree?

Depending on how important uniqueness is, you could just cat the whole thing (sorting filenames first, of course) and pipe it to your favorite cryptographic hash. Whether that's good enough obviously depends on what you're trying to do.

You could tag on a list of file names and file sizes, although if it's important to be super-correct you'll need a way of uniquely identifying where filenames stop and where the list stops (NUL is probably your friend there) and you'll want to normalize whitespace.

Daniel
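One way to read the NUL suggestion is to frame every file as name, NUL, size, NUL, contents, so that the hashed stream can only be split back into files in one way. The sketch below does that inline rather than as a separate list. It assumes GNU `find -print0`, `sort -z`, and `xargs -0`; the function name `framed_tree_sha1` is invented for this example, and whitespace normalization is left out.

```shell
#!/bin/sh
# Sketch: hash a stream of  <name> NUL <size> NUL <contents>  records, one per
# regular file, in byte-sorted path order. The size says where contents end.
set -eu

framed_tree_sha1() (
    # $1: root of the tree; paths are taken relative to it
    cd "$1" &&
    find . -type f -print0 | LC_ALL=C sort -z |
        xargs -0 sh -c '
            for f do
                n=$(wc -c < "$f")
                printf "%s\0%s\0" "$f" "$((n))"   # $((n)) strips wc padding
                cat "$f"
            done' sh |
        sha1sum | cut -d' ' -f1
)

one=$(mktemp -d); two=$(mktemp -d)
seq 1 6 > "$one/file1"
seq 1 3 > "$two/file1"; seq 4 6 > "$two/file2"

echo "one file:  $(framed_tree_sha1 "$one")"
echo "two files: $(framed_tree_sha1 "$two")"

rm -rf "$one" "$two"
```

Because NUL cannot appear in a file name and the size fixes the length of each record, the two layouts above should hash differently even though their concatenated contents are identical.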