On 01/21/2013 07:06 AM, Ferrous Cranus wrote:

 <snip>


Seriously, you're asking for something that's beyond the power of
humans or computers. You want to identify that something's the same
file, without tracking the change or having any identifiable tag.
That's a fundamentally impossible task.

No, it is difficult but not impossible.
It just cannot be done by tagging the file by:

1. filename
2. filepath
3. hash (math algorithm producing a string based on the file's contents)

We need another way to identify the file WITHOUT using the above attributes.


Repeating the same impossible scenario won't solve it. You need to find some other way to recognize the file. If you can't count on any of name, location, or content, you can't do it.
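
To make the limits concrete, here is a minimal sketch of what a content hash can and cannot tell you. The file names and the helper names content_hash() and demo() are made up for illustration; only hashlib, os and tempfile from the standard library are used. The hash survives a rename, but not an edit:

```python
import hashlib
import os
import tempfile


def content_hash(path):
    """Return the SHA-256 hex digest of the file's bytes."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def demo():
    """Rename a file, then edit it, comparing hashes at each step."""
    with tempfile.TemporaryDirectory() as d:
        old = os.path.join(d, "report.txt")    # hypothetical names
        new = os.path.join(d, "renamed.txt")
        with open(old, "w") as f:
            f.write("version 1\n")
        before = content_hash(old)

        # Same bytes under a new name: the hash is unchanged.
        os.rename(old, new)
        after_rename = content_hash(new)

        # New name AND new bytes: nothing links it to the original any more.
        with open(new, "a") as f:
            f.write("version 2\n")
        after_edit = content_hash(new)

    return before == after_rename, before == after_edit
```

Once both the name and the bytes have changed, the second comparison fails, and neither name, path, nor hash is left to match on.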

Try solving the problem by hand. If you examine the files, and a particular one has both changed names and content, how are you going to decide that it's the "same" one? Define "same" in a way that you could do it by hand, and you're halfway towards a programming solution.

Maybe it'd be obvious from an analogy. Suppose you're HR for a company with 100 employees, and a strange policy of putting paychecks under the windshield wipers of the employees' cars. All the employee cars are kept totally clean of personal belongings, with no registration or license plates. The lot has no reserved parking places, so every car has a random location.

For a while, you just memorize the make/model/color of each car, and everything's fine. But one day several of the employees buy new cars. How do you then associate each car with each employee?

I've got it - you require each one to keep a numbered parking sticker, and they move the sticker when they get a new car.

Or, you give everyone a marked, reserved parking place.

Or you require each employee to report any car exchanges to you, so you can update your records.
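
Translated back to files, the "report any exchanges" option might look roughly like the sketch below. FileRegistry and its methods are hypothetical names, not an existing library, and the sketch assumes every rename goes through the registry. A rename done behind its back loses the file, which is exactly the point of the analogy:

```python
import os
import tempfile
import uuid


class FileRegistry:
    """Hypothetical registry mapping a stable ID to a file's current path."""

    def __init__(self):
        self._paths = {}

    def register(self, path):
        """Issue a new ID (the 'employee number') for an existing file."""
        file_id = uuid.uuid4().hex
        self._paths[file_id] = path
        return file_id

    def rename(self, file_id, new_path):
        """Rename on disk and record the change in the same step."""
        os.rename(self._paths[file_id], new_path)
        self._paths[file_id] = new_path

    def path_of(self, file_id):
        return self._paths[file_id]


def demo():
    with tempfile.TemporaryDirectory() as d:
        reg = FileRegistry()
        first = os.path.join(d, "a.txt")
        with open(first, "w") as f:
            f.write("original\n")
        fid = reg.register(first)

        # Change both the name and the content; the ID still resolves.
        second = os.path.join(d, "b.txt")
        reg.rename(fid, second)
        with open(second, "w") as f:
            f.write("completely different\n")

        current = reg.path_of(fid)
        with open(current) as f:
            return os.path.basename(current), f.read()
```

The identity here lives in the registry, not in the file, so it works only because the change was tracked when it happened.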

If you can solve this one, you can probably solve the other one. Until then, we have no spec.



--
DaveA
--
http://mail.python.org/mailman/listinfo/python-list
