Re: sha1summ of complete directory?

2009-07-20 Thread Chris Bannister
On Tue, Jul 07, 2009 at 12:37:37PM -0400, Scott Gifford wrote:
 The purpose of the ls was to sort the filenames, but looking more
 closely, bash sorts them already.  csh and ksh do the same, and
 glob(3) sorts by default.  I'm not sure if all shells do that or not,
 but it seems that most do, and if your shell does, those two should be
 equivalent.

That depends on the value of the LC_COLLATE environment variable.
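
A minimal sketch of pinning the sort order, assuming bash (which, as far as I know, re-reads its locale when LC_ALL is assigned) and GNU coreutils. The function name hash_flat_dir is made up for illustration. The variable is set inside a subshell before the glob is expanded, so the caller's locale is left alone:

```shell
# Hash the concatenated contents of the non-hidden files in one directory.
# The glob is expanded under the C locale, so the order is byte-wise and
# should not change with the caller's LC_COLLATE setting.
hash_flat_dir() (
    cd "$1" || exit 1
    LC_ALL=C
    export LC_ALL
    cat ./* | sha1sum | cut -d' ' -f1
)
```

Usage would be something like `hash_flat_dir /some/dir`. A one-line prefix such as `LC_ALL=C cat *` should not be relied on for this, since the shell expands `*` itself before cat ever sees the variable.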

-- 
Chris.
==
I contend that we are both atheists. I just believe in one fewer god
than you do. When you understand why you dismiss all the other
possible gods, you will understand why I dismiss yours.
   -- Stephen F Roberts


-- 
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org 
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Re: sha1summ of complete directory?

2009-07-16 Thread Eric Gerlach
On Wed, Jul 15, 2009 at 07:36:24AM -0700, Todd A. Jacobs wrote:
 On Mon, Jul 06, 2009 at 07:30:19PM -0500, Ron Johnson wrote:
 
  How would one go about computing a *single* hash value for a complete
  directory tree?
 
 You might want to look at how git does this. As I understand it, git
 stores hashes of trees, so the implementation may help you.

Not really... the hash git indexes with is that of the compressed object (which
is either a blob, tree, or commit).  Tree and commit objects point at other
objects (which are also stored by hash).  Blobs are the files themselves.

More info:
http://www.gitready.com/beginner/2009/02/17/how-git-stores-your-data.html
http://eagain.net/articles/git-for-computer-scientists/

Cheers,

-- 
Eric Gerlach, Network Administrator
Federation of Students
University of Waterloo
p: (519) 888-4567 x36329
e: egerl...@feds.uwaterloo.ca





Re: sha1summ of complete directory?

2009-07-16 Thread Boyd Stephen Smith Jr.
In 20090716151953.ge4...@wks0082.feds.uwaterloo.ca, Eric Gerlach wrote:
On Wed, Jul 15, 2009 at 07:36:24AM -0700, Todd A. Jacobs wrote:
 On Mon, Jul 06, 2009 at 07:30:19PM -0500, Ron Johnson wrote:
  How would one go about computing a *single* hash value for a complete
  directory tree?

 You might want to look at how git does this. As I understand it, git
 stores hashes of trees, so the implementation may help you.

Not really... the hash git indexes with is that of the compressed object
 (which is either a blob, tree, or commit).

Actually, I'm fairly sure it hashes the uncompressed object (now[1]), but 
I'd have to dig in to the source code to be sure.
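
For reference, the object name of a blob can be reproduced without git, which suggests the hash covers the uncompressed data: it is the SHA-1 of a short header ("blob", the size in bytes, a NUL) followed by the raw file contents. A sketch assuming GNU coreutils; git_blob_hash is an illustrative name, not a git command:

```shell
# Compute the SHA-1 object name git would give a file stored as a blob.
git_blob_hash() {
    # Size in bytes, with any padding from wc stripped.
    size=$(wc -c < "$1" | tr -d ' ')
    # Header "blob <size>" plus a NUL byte, then the uncompressed contents.
    { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}
```

The result should match `git hash-object FILE` in a SHA-1 repository; for an empty file both give e69de29bb2d1d6434b8b29ae775ad8c2e48c5391.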

 Tree and commit objects point
 at other objects (which are also stored by hash).  Blobs are the files
 themselves.

That is one way of calculating a single hash for a complete directory tree.  
The tree is identified by its hash, which verifies the contents.  The 
contents identify the pointed-to objects by hash, which verifies their 
contents.  Etc.

The hash/sum calculated has the same verification properties as a 
single-file data-only hash.  It *might* not be as cryptographically strong, 
but that would be a bit surprising and I've seen no papers/pages verifying 
or refuting its strength.[2]
-- 
Boyd Stephen Smith Jr.   ,= ,-_-. =.
b...@iguanasuicide.net  ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy `-'(. .)`-'
http://iguanasuicide.net/\_/

[1] There was a small period of time during Linus's maintainership of git 
that it hashed differently than it does now.  I can't recall why or when it 
was changed.

[2] Other than the fact that it uses a 160-bit SHA-1 hash and that *may* be 
getting too weak to be considered cryptographically secure in the near 
future.  Using SHA-2 is probably better, and you shouldn't lose much 
strength by truncating to 160 bits if you need that size specifically, but 
git doesn't support that.  Hopefully SHA-3 will be out before it matters, 
which means git can switch to that.[3]

[3] If they ever decide to switch, it will probably be painful.  They might 
not ever switch, since I don't think that resistance against attackers was 
the intent, just identification and resistance to random corruption.  (CVS 
and SVN could be silently corrupted for years and it was virtually 
impossible to tell; that doesn't happen to git repositories.)




Re: sha1summ of complete directory?

2009-07-15 Thread Todd A. Jacobs
On Mon, Jul 06, 2009 at 07:30:19PM -0500, Ron Johnson wrote:

 How would one go about computing a *single* hash value for a complete
 directory tree?

You might want to look at how git does this. As I understand it, git
stores hashes of trees, so the implementation may help you.

-- 
Oh, look: rocks!
-- Doctor Who, Destiny of the Daleks





Re: sha1summ of complete directory?

2009-07-07 Thread Tzafrir Cohen
On Tue, Jul 07, 2009 at 12:08:05AM -0400, Scott Gifford wrote:
 Ron Johnson ron.l.john...@cox.net writes:

 You will need to figure out what metadata you care about and what you
 don't, then.  For example, do you want to detect a renamed file?  A
 change in mtime/ctime/atime?  A change in permissions?
 
 For the file contents, if there are no subdirectories you can use:
 
 cat `ls` |sha1sum

Which is basically:

  cat * | sha1sum

And it would still fail to detect the difference between:

  case 1:
    seq 1 6 > file1

  case 2:
    seq 1 3 > file1
    seq 4 6 > file2

(Same content, different metadata)
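
The two cases can be reproduced in scratch directories to confirm that a plain concatenation hashes them identically. A small sketch, assuming GNU coreutils (seq, mktemp, sha1sum); the function name is illustrative:

```shell
# Build both cases in temporary directories and compare "cat * | sha1sum".
demo_boundary_blindness() {
    one=$(mktemp -d) && two=$(mktemp -d) || return 1
    seq 1 6 > "$one/file1"
    seq 1 3 > "$two/file1"
    seq 4 6 > "$two/file2"
    sum_one=$(cd "$one" && cat ./* | sha1sum)
    sum_two=$(cd "$two" && cat ./* | sha1sum)
    rm -rf "$one" "$two"
    # The byte streams are the same, so the file boundary goes unnoticed.
    if [ "$sum_one" = "$sum_two" ]; then
        echo "identical"
    else
        echo "different"
    fi
}
```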

 
 ls will sort the files, so they will end up in the same order every
 time.
 
 For metadata, you could come up with the flags to ls that give the
 metadata you want, then include that before or after the cat, like
 this:
 
 (ls -l; cat `ls`) |sha1sum

Same comment about echo *.

Also note that packing files together with their metadata is, in fact,
exactly what an archive utility does:

  tar cf - . | sha1sum

This will also pick hidden files and subdirectories.

Oh, and sha1sum is not the only checksum available:

$ ls /usr/bin/*sum
/usr/bin/cksum             /usr/bin/lh_source_md5sum  /usr/bin/sha384sum
/usr/bin/innochecksum      /usr/bin/md5sum            /usr/bin/sha512sum
/usr/bin/jacksum           /usr/bin/sha1sum           /usr/bin/shasum
/usr/bin/kmk_md5sum        /usr/bin/sha224sum         /usr/bin/sum
/usr/bin/lh_binary_md5sum  /usr/bin/sha256sum

Some of those are actually relevant :-)

-- 
Tzafrir Cohen | tzaf...@jabber.org | VIM is
http://tzafrir.org.il || a Mutt's
tzaf...@cohens.org.il ||  best
ICQ# 16849754 || friend





Re: sha1summ of complete directory?

2009-07-07 Thread Scott Gifford
Tzafrir Cohen tzaf...@cohens.org.il writes:

 On Tue, Jul 07, 2009 at 12:08:05AM -0400, Scott Gifford wrote:

[...]

 For the file contents, if there are no subdirectories you can use:
 
 cat `ls` |sha1sum

 Which is basically:

   cat * | sha1sum

The purpose of the ls was to sort the filenames, but looking more
closely, bash sorts them already.  csh and ksh do the same, and
glob(3) sorts by default.  I'm not sure if all shells do that or not,
but it seems that most do, and if your shell does, those two should be
equivalent.

Thanks Tzafrir, I learned something today!  :-)

[...]

 Also note that packing files together with their metadata is, in fact,
 exactly what an archive utility does:

   tar cf - . | sha1sum

 This will also pick hidden files and subdirectories.

If there were a way to specify what metadata to include in the
archive, this would be a perfect solution for the OP.
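
GNU tar can override much of what it records, which gets close to this. A sketch, assuming GNU tar 1.28 or newer (for --sort); the function name and the choice of which fields to pin are illustrative, not a recommendation:

```shell
# Hash a tree through tar while pinning the metadata we do not care about.
hash_tree_tar() {
    # --format=gnu      fix the archive format regardless of build defaults
    # --sort=name       store members in a stable order
    # --mtime=@0        replace every timestamp with the epoch
    # --owner/--group   replace ownership with 0:0, numeric only
    # --mode            normalise permissions (the execute bit is kept)
    tar -C "$1" --format=gnu --sort=name \
        --mtime='@0' \
        --owner=0 --group=0 --numeric-owner \
        --mode='go=rX,u+rw,a-s' \
        -cf - . | sha1sum | cut -d' ' -f1
}
```

File names, sizes, contents and the execute bit still feed into the result; dropping any of the options brings the corresponding metadata back in.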

Scott.





sha1summ of complete directory?

2009-07-06 Thread Ron Johnson


How would one go about computing a *single* hash value for a 
complete directory tree?


--
Scooty Puff, Sr
The Doom-Bringer






Re: sha1summ of complete directory?

2009-07-06 Thread Mark Neidorff
On Monday 06 July 2009 08:30 pm, Ron Johnson wrote:
 How would one go about computing a *single* hash value for a
 complete directory tree?

 --
 Scooty Puff, Sr
 The Doom-Bringer

Tar the tree and then calculate the sha1sum of the tar file.  Easy, no?

Mark





Re: sha1summ of complete directory?

2009-07-06 Thread Adrian Levi
2009/7/7 Mark Neidorff m...@neidorff.com:
 On Monday 06 July 2009 08:30 pm, Ron Johnson wrote:
 How would one go about computing a *single* hash value for a
 complete directory tree?

This is what I was thinking.  If you don't want to retain the tar file,
then pipe it to sha1sum.

Adrian

-- 
24x7x365 != 24x7x52 Stupid or bad maths?
erno hm. I've lost a machine.. literally _lost_. it responds to
ping, it works completely, I just can't figure out where in my
apartment it is.





Re: sha1summ of complete directory?

2009-07-06 Thread Ron Johnson

On 2009-07-06 20:29, Adrian Levi wrote:

2009/7/7 Mark Neidorff m...@neidorff.com:

On Monday 06 July 2009 08:30 pm, Ron Johnson wrote:

How would one go about computing a *single* hash value for a
complete directory tree?


This is what I was thinking.  If you don't want to retain the tar file,
then pipe it to sha1sum.


I was thinking about that, but tar stores metadata that I know are 
different between the two trees, thus invalidating that idea.


--
Scooty Puff, Sr
The Doom-Bringer






Re: sha1summ of complete directory?

2009-07-06 Thread CaT
On Mon, Jul 06, 2009 at 07:30:19PM -0500, Ron Johnson wrote:
 How would one go about computing a *single* hash value for a complete 
 directory tree?

Hash everything and then hash the result?  (That is, if you don't care
about metadata; if you do, add a stat of everything to the final hash.)
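
One way to read this suggestion, as a sketch: hash each file, then hash the sorted list of per-file digests. It assumes GNU find, sort, xargs and sha1sum (for -print0, -z, -0 and -r); the function name is illustrative:

```shell
# Hash every regular file under a directory, then hash the list of digests.
hash_tree() (
    cd "$1" || exit 1
    # Byte-wise collation keeps the file order independent of the locale.
    LC_ALL=C
    export LC_ALL
    find . -type f -print0 | sort -z | xargs -0 -r sha1sum |
        sha1sum | cut -d' ' -f1
)
```

Because the intermediate list carries relative paths next to the digests, a renamed or moved file changes the result, while timestamps, ownership and permissions do not.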

-- 
  A search of his car uncovered pornography, a homemade sex aid, women's 
  stockings and a Jack Russell terrier.
- http://www.news.com.au/story/0%2C27574%2C24675808-421%2C00.html





Re: sha1summ of complete directory?

2009-07-06 Thread Scott Gifford
Ron Johnson ron.l.john...@cox.net writes:

 On 2009-07-06 20:29, Adrian Levi wrote:
 2009/7/7 Mark Neidorff m...@neidorff.com:
 On Monday 06 July 2009 08:30 pm, Ron Johnson wrote:
 How would one go about computing a *single* hash value for a
 complete directory tree?
 This is what I was thinking.  If you don't want to retain the tar file,
 then pipe it to sha1sum.

 I was thinking about that, but tar stores metadata that I know are
 different between the two trees, thus invalidating that idea.

You will need to figure out what metadata you care about and what you
don't, then.  For example, do you want to detect a renamed file?  A
change in mtime/ctime/atime?  A change in permissions?

For the file contents, if there are no subdirectories you can use:

cat `ls` |sha1sum

ls will sort the files, so they will end up in the same order every
time.

For metadata, you could come up with the flags to ls that give the
metadata you want, then include that before or after the cat, like
this:

(ls -l; cat `ls`) |sha1sum
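
The same idea can be extended to subdirectories while choosing the metadata explicitly. A sketch, assuming GNU find, sort, xargs and stat (the -c format option is GNU-specific); here only permission bits and paths are included, and the function name is illustrative:

```shell
# Hash a metadata manifest (octal mode and path) followed by file contents.
hash_tree_with_modes() (
    cd "$1" || exit 1
    LC_ALL=C
    export LC_ALL
    {
        # Manifest: one "mode path" line per regular file, no timestamps.
        find . -type f -print0 | sort -z | xargs -0 -r stat -c '%a %n'
        # Contents of the same files, in the same byte-wise order.
        find . -type f -print0 | sort -z | xargs -0 -r cat
    } | sha1sum | cut -d' ' -f1
)
```

Adding or removing fields in the stat format string is how the "which metadata do you care about" decision would be expressed.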

Hope this helps,

Scott.





Re: sha1summ of complete directory?

2009-07-06 Thread Ron Johnson

On 2009-07-06 23:08, Scott Gifford wrote:

Ron Johnson ron.l.john...@cox.net writes:


On 2009-07-06 20:29, Adrian Levi wrote:

2009/7/7 Mark Neidorff m...@neidorff.com:

On Monday 06 July 2009 08:30 pm, Ron Johnson wrote:

How would one go about computing a *single* hash value for a
complete directory tree?

This is what I was thinking.  If you don't want to retain the tar file,
then pipe it to sha1sum.

I was thinking about that, but tar stores metadata that I know are
different between the two trees, thus invalidating that idea.


You will need to figure out what metadata you care about and what you
don't, then.  For example, do you want to detect a renamed file?  A
change in mtime/ctime/atime?  A change in permissions?

For the file contents, if there are no subdirectories you can use:

cat `ls` |sha1sum

ls will sort the files, so they will end up in the same order every
time.

For metadata, you could come up with the flags to ls that give the
metadata you want, then include that before or after the cat, like
this:

(ls -l; cat `ls`) |sha1sum

Hope this helps,


Thanks, I *think* it did.

--
Scooty Puff, Sr
The Doom-Bringer






Re: sha1summ of complete directory?

2009-07-06 Thread Daniel Burrows
On Mon, Jul 06, 2009 at 07:30:19PM -0500, Ron Johnson ron.l.john...@cox.net 
was heard to say:

 How would one go about computing a *single* hash value for a complete 
 directory tree?

  Depending on how important uniqueness is, you could just cat the
whole thing (sorting filenames first, of course) and pipe it to your
favorite cryptographic hash.  Whether that's good enough obviously
depends on what you're trying to do.  You could tag on a list of file
names and file sizes, although if it's important to be super-correct
you'll need a way of uniquely identifying where filenames stop and
where the list stops (NUL is probably your friend there) and you'll
want to normalize white-space.
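
A sketch of that framing, assuming GNU find, sort and xargs: each file contributes its name, a NUL, its size, a NUL, and then its contents, so the boundaries between files are unambiguous. The function name is illustrative:

```shell
# Hash a tree as a stream of "name NUL size NUL contents" records.
hash_tree_framed() (
    cd "$1" || exit 1
    LC_ALL=C
    export LC_ALL
    find . -type f -print0 | sort -z |
        xargs -0 -r sh -c '
            for f in "$@"; do
                # Frame each file as: name NUL size NUL contents.
                # tr strips any padding that wc puts around the number.
                printf "%s\0%s\0" "$f" "$(wc -c < "$f" | tr -d " ")"
                cat "$f"
            done
        ' sh | sha1sum | cut -d' ' -f1
)
```

With this framing, one file holding lines 1 to 6 and two files holding lines 1 to 3 and 4 to 6 no longer produce the same hash.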

  Daniel

