On Thu, Sep 26, 2013 at 10:03 AM, Sam Putman <[email protected]> wrote:

> The notion is to have a consistent way to map between "a" large sound file
> and "the" large sound file. From one perspective it's just a large number,
> and it's nice if two copies of that number are never treated as different
> things.
>

If we're considering the sound value, I think you cannot avoid having
multiple representations for the same meaning. There are different lossless
encodings (like FLAC vs. WAV vs. 7zip'd WAV vs. self-extracting JavaScript)
and lossy encodings (Opus vs. MP3). There will be encodings more or less
suitable for streaming or security concerns. If we 'chunkify' a large sound
for streaming, there is some arbitrary aliasing regarding the size of each
chunk.

So when you discuss a sound file, you are not discussing the value or
meaning but rather a specific, syntactic representation of that meaning.
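
A minimal Python sketch of this point, assuming SHA-256 as the hash and a
made-up list of 16-bit PCM samples standing in for a sound (the sample
values and the `digest` helper are illustrative assumptions, not anything
from the thread). Three lossless representations decode to the same samples
yet hash to three different values:

```python
import hashlib
import struct
import zlib

# Hypothetical "sound": a handful of 16-bit PCM samples (made-up values).
samples = [0, 1200, -1200, 32767, -32768, 5]
fmt_le = "<%dh" % len(samples)  # little-endian signed 16-bit
fmt_be = ">%dh" % len(samples)  # big-endian signed 16-bit

# Three lossless representations of the same samples.
raw_le = struct.pack(fmt_le, *samples)
raw_be = struct.pack(fmt_be, *samples)
packed = zlib.compress(raw_le)


def digest(data: bytes) -> str:
    """Hash the bytes of one representation, not the sound it encodes."""
    return hashlib.sha256(data).hexdigest()


digests = {digest(raw_le), digest(raw_be), digest(packed)}

# Every representation decodes back to the same samples.
same_meaning = (
    list(struct.unpack(fmt_le, raw_le))
    == list(struct.unpack(fmt_be, raw_be))
    == list(struct.unpack(fmt_le, zlib.decompress(packed)))
    == samples
)
```

Here `same_meaning` is true while `digests` holds three distinct entries:
the hash identifies the representation, not the meaning.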

(A little philosophy.)

In my understanding, the difference between information (or data) and pure
mathematical values is that the information has origin, history, context,
inertia, physical spatial-temporal representation, and even physical mass
(related to Boltzmann's constant and Landauer's principle). Information is
something mechanical, and much of computer science might be more accurately
described as information mechanics. From this perspective (which is the
usual one I hold) copies of a number really are different. They have
different locations, different futures. Further, they can only be
considered 'copies' if there was an act of copying (at a specific
spatial-temporal location). A large number constructed by two independent
computations isn't a copy and may have unique meaning.


>
>> For identity, I prefer to formally treat uniqueness as a semantic
>> feature, not a syntactic one.
>>
>
> I entirely agree! Hence the proposal of a function hash(foo) that produces
> a unique value for any given foo, where foo is an integer of arbitrary size
> (aka data). We may then compare the hashes as though they are the values,
> while saving time.
>

How often do we compare very large integers for equality?

I agree that keeping some summary information about a number, perhaps even
a hash, would be useful for quick comparisons of very large integers
(large enough that the cost of keeping the hash in memory is negligible).
But I imagine this would be a rather specialized use-case.
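
A rough sketch of that specialized use-case, assuming SHA-256 and a
hypothetical `BigValue` wrapper (neither is proposed in the thread). The
digest is computed once and kept alongside the data. Differing digests
prove inequality cheaply; equal digests only make equality very likely,
since a fixed-size hash cannot be unique over arbitrarily large inputs, so
the sketch falls back to a full comparison:

```python
import hashlib


class BigValue:
    """Hypothetical wrapper: a large byte string plus a cached digest."""

    def __init__(self, data: bytes):
        self.data = data
        self.digest = hashlib.sha256(data).digest()  # computed once

    def __eq__(self, other):
        if not isinstance(other, BigValue):
            return NotImplemented
        # Differing digests prove inequality without scanning the data.
        if self.digest != other.digest:
            return False
        # Equal digests are not proof of equality; confirm with a full scan.
        return self.data == other.data

    def __hash__(self):
        return hash(self.digest)


a = BigValue(b"\x01" * 1_000_000)
b = BigValue(b"\x01" * 1_000_000)              # independently constructed
c = BigValue(b"\x01" * 999_999 + b"\x02")      # differs in the last byte
```

With these values, `a == b` holds although `a` and `b` are distinct
objects, and `a == c` is rejected from the 32-byte digests alone.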


> Hashing is not associative per se but it may be made to behave
> associatively through various tweaks:
>
> http://en.wikipedia.org/wiki/Merkle_tree
>


Even a Merkle tree or a tiger tree hash has the same problems with aliasing
and associativity of the underlying data.
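
To make the aliasing concrete, here is a simplified Merkle-root sketch in
Python. It assumes SHA-256, fixed-size chunks, and duplication of the last
node on odd levels; the `merkle_root` helper is hypothetical and omits
details that real schemes such as tiger tree hashes add (for example,
distinguishing leaf hashes from interior hashes). The same bytes chunked
two different ways produce two different roots:

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(data: bytes, chunk_size: int) -> bytes:
    """Root hash over fixed-size chunks (simplified, illustrative only)."""
    # Leaves are the hashes of consecutive chunks of the data.
    level = [h(data[i:i + chunk_size])
             for i in range(0, len(data), chunk_size)] or [h(b"")]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # duplicate last node on odd levels
        # Each parent is the hash of its two children concatenated.
        level = [h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


data = bytes(range(256)) * 16          # 4096 bytes of sample data
root_1k = merkle_root(data, 1024)      # 4 leaves
root_512 = merkle_root(data, 512)      # 8 leaves
```

The two roots differ even though `data` is identical, so comparing roots
only identifies the value if both sides already agree on the chunking.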

Best,

Dave
_______________________________________________
fonc mailing list
[email protected]
http://vpri.org/mailman/listinfo/fonc
