Adam Olsen rha...@gmail.com (AO) wrote:
AO The Wayback Machine has 150 billion pages, so 2**37. Google's index
AO is a bit larger at over a trillion pages, so 2**40. A little closer
AO than I'd like, but that's still 56294995000 to 1 odds of having
AO *any* collisions between *any* of the
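To make the arithmetic behind odds like these concrete, here is a rough sketch of the usual birthday-bound estimate. The 128-bit hash width and the function name are assumptions for illustration only; the message above does not say which hash it has in mind.

```python
from fractions import Fraction


def collision_probability(n_items: int, hash_bits: int) -> Fraction:
    """Birthday-bound estimate of the chance of at least one collision.

    Uses the approximation (number of pairs) / (number of hash values),
    which is only meaningful while the result is much smaller than 1.
    """
    pairs = n_items * (n_items - 1) // 2
    return Fraction(pairs, 2 ** hash_bits)


# Hypothetical inputs: 2**40 documents and a 128-bit hash.
p = collision_probability(2 ** 40, 128)
print(f"roughly {float(1 / p):.3g} to 1 against any collision")
```

The estimate grows with the square of the number of items, which is why a large index sits much closer to the limit than a single pairwise comparison does.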
Adam Olsen wrote:
On Apr 16, 11:15 am, SpreadTooThin bjobrie...@gmail.com wrote:
And yes, he is right: CRCs and hashes all have some probability of saying that
the files are identical when in fact they are not.
Here's the bottom line. It is either:
A) Several hundred years of mathematics and
Adam Olsen wrote:
On Apr 16, 4:27 pm, Rhodri James rho...@wildebst.demon.co.uk
wrote:
On Thu, 16 Apr 2009 10:44:06 +0100, Adam Olsen rha...@gmail.com wrote:
On Apr 16, 3:16 am, Nigel Rantor wig...@wiggly.org wrote:
Okay, before I tell you about the empirical, real-world evidence I have
could
On Thu, 2009-04-16 at 21:44 -0700, Adam Olsen wrote:
The Wayback Machine has 150 billion pages, so 2**37. Google's index
is a bit larger at over a trillion pages, so 2**40. A little closer
than I'd like, but that's still 56294995000 to 1 odds of having
*any* collisions between *any* of
Adam Olsen wrote:
On Apr 16, 11:15 am, SpreadTooThin bjobrie...@gmail.com wrote:
And yes, he is right: CRCs and hashes all have some probability of saying that
the files are identical when in fact they are not.
Here's the bottom line. It is either:
A) Several hundred years of mathematics and
On Apr 17, 4:54 am, Nigel Rantor wig...@wiggly.org wrote:
Adam Olsen wrote:
On Apr 16, 11:15 am, SpreadTooThin bjobrie...@gmail.com wrote:
And yes, he is right: CRCs and hashes all have some probability of saying that
the files are identical when in fact they are not.
Here's the bottom line. It
On Apr 17, 5:30 am, Tim Wintle tim.win...@teamrubber.com wrote:
On Thu, 2009-04-16 at 21:44 -0700, Adam Olsen wrote:
The Wayback Machine has 150 billion pages, so 2**37. Google's index
is a bit larger at over a trillion pages, so 2**40. A little closer
than I'd like, but that's still
On Apr 17, 9:59 am, norseman norse...@hughes.net wrote:
The more complicated the math the harder it is to keep a higher form of
math from checking (or improperly displacing) a lower one. Which, of
course, breaks the rules. Commonly called improper thinking. A number
of math teasers make use
On Apr 17, 9:59 am, SpreadTooThin bjobrie...@gmail.com wrote:
You know this is just insane. I'd be satisfied with a CRC16 or
something in the situation I'm in.
I have two large files, one local and one remote. Transferring every
byte across the internet to be sure that the two files are
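For the local/remote case described here, one possible approach is to compute a digest on each side and exchange only the digests. The sketch below is a minimal illustration, not something proposed in the thread: the function name, the choice of SHA-256, and the chunk size are all assumptions.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Each machine would run this on its own copy and the two short hex strings would be compared. Equal digests do not strictly prove equal contents, which is the point being argued throughout this thread.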
In message mailman.3934.1239821812.11746.python-l...@python.org, Nigel
Rantor wrote:
Adam Olsen wrote:
The chance of *accidentally* producing a collision, although
technically possible, is so extraordinarily rare that it's completely
overshadowed by the risk of a hardware or software
On Fri, 17 Apr 2009 11:19:31 -0700, Adam Olsen wrote:
Actually, *cryptographic* hashes handle that just fine. Even for files
with just a 1 bit change the output is totally different. This is known
as the Avalanche Effect. Otherwise they'd be vulnerable to attacks.
Which isn't to say you
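The avalanche effect mentioned above can be observed with a small experiment. This is only an illustrative sketch: SHA-256 and the helper name are assumptions, not something named in the message.

```python
import hashlib


def digest_bits_changed(data: bytes, bit_index: int = 0) -> int:
    """Flip one input bit and count how many SHA-256 output bits differ."""
    mutated = bytearray(data)
    mutated[bit_index // 8] ^= 1 << (bit_index % 8)
    before = hashlib.sha256(data).digest()
    after = hashlib.sha256(bytes(mutated)).digest()
    # XOR the digests byte by byte and count the set bits.
    return sum(bin(x ^ y).count("1") for x, y in zip(before, after))


print(digest_bits_changed(b"two nearly identical files"), "of 256 bits differ")
```

For a well-behaved cryptographic hash, the count is expected to land somewhere near half of the output bits.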
On Apr 15, 12:56 pm, Nigel Rantor wig...@wiggly.org wrote:
Adam Olsen wrote:
The chance of *accidentally* producing a collision, although
technically possible, is so extraordinarily rare that it's completely
overshadowed by the risk of a hardware or software failure producing
an incorrect
On Apr 16, 3:16 am, Nigel Rantor wig...@wiggly.org wrote:
Adam Olsen wrote:
On Apr 15, 12:56 pm, Nigel Rantor wig...@wiggly.org wrote:
Adam Olsen wrote:
The chance of *accidentally* producing a collision, although
technically possible, is so extraordinarily rare that it's completely
Adam Olsen wrote:
On Apr 16, 3:16 am, Nigel Rantor wig...@wiggly.org wrote:
Adam Olsen wrote:
On Apr 15, 12:56 pm, Nigel Rantor wig...@wiggly.org wrote:
Adam Olsen wrote:
The chance of *accidentally* producing a collision, although
technically possible, is so extraordinarily rare that it's
Adam Olsen wrote:
On Apr 15, 12:56 pm, Nigel Rantor wig...@wiggly.org wrote:
Adam Olsen wrote:
The chance of *accidentally* producing a collision, although
technically possible, is so extraordinarily rare that it's completely
overshadowed by the risk of a hardware or software failure producing
On 2009-04-16, Adam Olsen rha...@gmail.com wrote:
The chance of *accidentally* producing a collision, although
technically possible, is so extraordinarily rare that it's
completely overshadowed by the risk of a hardware or software
failure producing an incorrect result.
Not when you're
On Apr 16, 3:16 am, Nigel Rantor wig...@wiggly.org wrote:
Adam Olsen wrote:
On Apr 15, 12:56 pm, Nigel Rantor wig...@wiggly.org wrote:
Adam Olsen wrote:
The chance of *accidentally* producing a collision, although
technically possible, is so extraordinarily rare that it's completely
On Apr 16, 8:59 am, Grant Edwards inva...@invalid wrote:
On 2009-04-16, Adam Olsen rha...@gmail.com wrote:
I'm afraid you will need to back up your claims with real files.
Although MD5 is a smaller, older hash (128 bits, so you only need
2**64 files to find collisions),
You don't need
On Thu, 16 Apr 2009 10:44:06 +0100, Adam Olsen rha...@gmail.com wrote:
On Apr 16, 3:16 am, Nigel Rantor wig...@wiggly.org wrote:
Okay, before I tell you about the empirical, real-world evidence I have
could you please accept that hashes collide and that no matter how many
samples you use the
On Apr 16, 11:15 am, SpreadTooThin bjobrie...@gmail.com wrote:
And yes, he is right: CRCs and hashes all have some probability of saying that
the files are identical when in fact they are not.
Here's the bottom line. It is either:
A) Several hundred years of mathematics and cryptography are wrong.
On Apr 16, 4:27 pm, Rhodri James rho...@wildebst.demon.co.uk
wrote:
On Thu, 16 Apr 2009 10:44:06 +0100, Adam Olsen rha...@gmail.com wrote:
On Apr 16, 3:16 am, Nigel Rantor wig...@wiggly.org wrote:
Okay, before I tell you about the empirical, real-world evidence I have
could you please
On Wed, 15 Apr 2009 07:54:20 +0200, Martin wrote:
Perhaps I'm being dim, but how else are you going to decide if two
files are the same unless you compare the bytes in the files?
I'd say checksums, just about every download relies on checksums to
verify you do have indeed the same file.
On Wed, Apr 15, 2009 at 11:03 AM, Steven D'Aprano
ste...@remove.this.cybersource.com.au wrote:
The checksum does look at every byte in each file. Checksumming isn't a
way to avoid looking at each byte of the two files, it is a way of
mapping all the bytes to a single number.
My understanding
On 2009-04-15, Martin mar...@marcher.name wrote:
Hi,
On Mon, Apr 13, 2009 at 10:03 PM, Grant Edwards inva...@invalid wrote:
On 2009-04-13, SpreadTooThin bjobrie...@gmail.com wrote:
I want to compare two binary files and see if they are the same.
I see the filecmp.cmp function but I don't
On 2009-04-15, Martin mar...@marcher.name wrote:
On Wed, Apr 15, 2009 at 11:03 AM, Steven D'Aprano
I'd still say rather burn CPU cycles than development hours (if I got
the question right),
_Hours_? Calling the file compare module takes _one_line_of_code_.
Implementing a file compare from
Martin wrote:
On Wed, Apr 15, 2009 at 11:03 AM, Steven D'Aprano
ste...@remove.this.cybersource.com.au wrote:
The checksum does look at every byte in each file. Checksumming isn't a
way to avoid looking at each byte of the two files, it is a way of
mapping all the bytes to a single number.
My
Grant Edwards wrote:
We all rail against premature optimization, but using a
checksum instead of a direct comparison is premature
unoptimization. ;)
And more than that, will provide false positives for some inputs.
So, basically it's a worse-than-useless approach for determining if two
On Apr 15, 8:04 am, Grant Edwards inva...@invalid wrote:
On 2009-04-15, Martin mar...@marcher.name wrote:
Hi,
On Mon, Apr 13, 2009 at 10:03 PM, Grant Edwards inva...@invalid wrote:
On 2009-04-13, SpreadTooThin bjobrie...@gmail.com wrote:
I want to compare two binary files and see if
On Apr 15, 11:04 am, Nigel Rantor wig...@wiggly.org wrote:
The fact that two md5 hashes are equal does not mean that the sources
they were generated from are equal. To do that you must still perform a
byte-by-byte comparison which is much less work for the processor than
generating an md5 or
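A direct byte-by-byte comparison of the kind described here might look like the following sketch. The function name and chunk size are hypothetical choices made for illustration.

```python
import os


def same_bytes(path_a, path_b, chunk_size=1 << 16):
    """Return True only if both files have identical contents."""
    # Files of different sizes cannot be equal, so skip reading them.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                return False  # stop at the first differing chunk
            if not a:
                return True  # both files reached end of file together
```

Unlike a digest, this stops as soon as a difference is found, so files that differ early are rejected without reading the rest.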
Adam Olsen wrote:
The chance of *accidentally* producing a collision, although
technically possible, is so extraordinarily rare that it's completely
overshadowed by the risk of a hardware or software failure producing
an incorrect result.
Not when you're using them to compare lots of files.
I want to compare two binary files and see if they are the same.
I see the filecmp.cmp function but I don't get a warm fuzzy feeling
that it is doing a byte by byte comparison of two files to see if they
are the same.
What should I be using if not filecmp.cmp?
--
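For reference, the standard library's filecmp.cmp accepts a shallow argument; with shallow=False it compares file contents rather than accepting matching os.stat() signatures as proof of equality. A minimal sketch (the wrapper name is hypothetical):

```python
import filecmp


def files_identical(path_a, path_b):
    """Compare two files by content using the standard library."""
    # shallow=False stops filecmp from treating equal stat signatures
    # (file type, size, modification time) as sufficient evidence.
    return filecmp.cmp(path_a, path_b, shallow=False)
```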
On Apr 13, 8:39 pm, Grant Edwards gra...@visi.com wrote:
On 2009-04-13, Peter Otten __pete...@web.de wrote:
But there's a cache. A change of file contents may go
undetected as long as the file stats don't change:
Good point. You can fool it if you force the stats to their
old values
Hi,
On Mon, Apr 13, 2009 at 10:03 PM, Grant Edwards inva...@invalid wrote:
On 2009-04-13, SpreadTooThin bjobrie...@gmail.com wrote:
I want to compare two binary files and see if they are the same.
I see the filecmp.cmp function but I don't get a warm fuzzy feeling
that it is doing a byte by
SpreadTooThin wrote:
I want to compare two binary files and see if they are the same.
I see the filecmp.cmp function but I don't get a warm fuzzy feeling
that it is doing a byte by byte comparison of two files to see if they
are the same.
What should I be using if not filecmp.cmp?
Well,
On Apr 13, 2:00 pm, Przemyslaw Kaminski cge...@gmail.com wrote:
SpreadTooThin wrote:
I want to compare two binary files and see if they are the same.
I see the filecmp.cmp function but I don't get a warm fuzzy feeling
that it is doing a byte by byte comparison of two files to see if they
On 2009-04-13, SpreadTooThin bjobrie...@gmail.com wrote:
I want to compare two binary files and see if they are the same.
I see the filecmp.cmp function but I don't get a warm fuzzy feeling
that it is doing a byte by byte comparison of two files to see if they
are the same.
Perhaps I'm
On Apr 13, 2:03 pm, Grant Edwards inva...@invalid wrote:
On 2009-04-13, SpreadTooThin bjobrie...@gmail.com wrote:
I want to compare two binary files and see if they are the same.
I see the filecmp.cmp function but I don't get a warm fuzzy feeling
that it is doing a byte by byte comparison
On 2009-04-13, Grant Edwards inva...@invalid wrote:
On 2009-04-13, SpreadTooThin bjobrie...@gmail.com wrote:
I want to compare two binary files and see if they are the same.
I see the filecmp.cmp function but I don't get a warm fuzzy feeling
that it is doing a byte by byte comparison of two
On Apr 13, 2:37 pm, Grant Edwards inva...@invalid wrote:
On 2009-04-13, Grant Edwards inva...@invalid wrote:
On 2009-04-13, SpreadTooThin bjobrie...@gmail.com wrote:
I want to compare two binary files and see if they are the same.
I see the filecmp.cmp function but I don't get a warm
Grant Edwards wrote:
On 2009-04-13, Grant Edwards inva...@invalid wrote:
On 2009-04-13, SpreadTooThin bjobrie...@gmail.com wrote:
I want to compare two binary files and see if they are the same.
I see the filecmp.cmp function but I don't get a warm fuzzy feeling
that it is doing a byte by
On Mon, 13 Apr 2009 15:03:32 -0500, Grant Edwards wrote:
On 2009-04-13, SpreadTooThin bjobrie...@gmail.com wrote:
I want to compare two binary files and see if they are the same. I see
the filecmp.cmp function but I don't get a warm fuzzy feeling that it
is doing a byte by byte comparison
SpreadTooThin wrote:
On Apr 13, 2:37 pm, Grant Edwards inva...@invalid wrote:
On 2009-04-13, Grant Edwards inva...@invalid wrote:
On 2009-04-13, SpreadTooThin bjobrie...@gmail.com wrote:
I want to compare two binary files and see if they are the same.
I see the filecmp.cmp
On 2009-04-13, Peter Otten __pete...@web.de wrote:
But there's a cache. A change of file contents may go
undetected as long as the file stats don't change:
Good point. You can fool it if you force the stats to their
old values after you modify a file and you don't clear the
cache.
--
Grant
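The caching behaviour discussed above can be sidestepped by clearing the cache before comparing. This sketch assumes Python 3.4 or later, where filecmp.clear_cache() is available; the wrapper name is hypothetical.

```python
import filecmp


def compare_fresh(path_a, path_b):
    """Compare two files by content, ignoring any cached earlier result."""
    # filecmp remembers results keyed on each file's stat signature, so a
    # file rewritten with the same size and modification time could
    # otherwise be answered from the cache.
    filecmp.clear_cache()
    return filecmp.cmp(path_a, path_b, shallow=False)
```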