I have strings (over the alphabet {A,C,T,G}) of length 30 to 50. I am trying 
to hash them to save space (as I have a few million to a few billion of them). 
I know I should be using a Bloom filter 
(http://en.wikipedia.org/wiki/Bloom_filter) or some other such space-saving 
data structure, but I'm too lazy/busy/inexperienced to do it by hand.
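
For reference, a four-letter alphabet needs only 2 bits per base, so a 
reversible packing is possible without any hashing. There are 4^50 = 2^100 
strings of length 50, so no 64-bit hash can be 1-1 on them, but 2 bits per base 
still fits a 50-mer in 13 bytes instead of 50. Below is a minimal sketch, 
written in Python rather than Julia; the names pack and unpack are hypothetical 
helpers (not library functions), and it assumes the input contains only A, C, G 
and T.

```python
# Hypothetical helpers (not from any library): 2-bit packing of DNA strings.
# Assumes every character is one of A, C, G, T; anything else raises KeyError.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str) -> int:
    """Encode seq as an integer, 2 bits per base.

    A leading sentinel bit of 1 is kept so that the length survives,
    since leading 'A's (code 0) would otherwise vanish.
    """
    value = 1  # sentinel bit
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def unpack(value: int) -> str:
    """Invert pack(): read 2 bits at a time until only the sentinel is left."""
    bases = []
    while value > 1:
        bases.append(BASES[value & 3])
        value >>= 2
    return "".join(reversed(bases))
```

With this sketch, unpack(pack("ACTG")) == "ACTG" holds, which is the round trip 
asked for with dehash(hash("ACTG")). A 50-base string packs into 101 bits, so 
pack(seq).to_bytes(13, "big") would store it in 13 bytes. Unlike a Bloom 
filter this is exact, but it compresses by a factor of about 4 only.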

On Friday, December 5, 2014 5:10:54 PM UTC-8, Jason Merrill wrote:
>
> There might be a good solution to the particular problem you're trying to 
> solve, though. What are you trying to do?
>
> On Friday, December 5, 2014 5:08:08 PM UTC-8, John Myles White wrote:
>>
>> For specialized cases it is possible to achieve 1-1-ness: 
>> http://en.wikipedia.org/wiki/Perfect_hash_function
>>
>> But this is not something that most people aspire to do for most types 
>> since 1-1-ness isn't essential in most applications and is costly to 
>> achieve.
>>
>>  -- John
>>
>> On Dec 5, 2014, at 5:03 PM, David Koslicki wrote:
>>
>> Ah, of course! I was hoping that on certain data types it was 1-1, but I 
>> guess that was a long shot. Thanks for clarifying.
>>
>> On Friday, December 5, 2014 4:57:41 PM UTC-8, Jason Merrill wrote:
>>>
>>> If the space of possible hashes is smaller than the space of possible 
>>> inputs (e.g. the hash is represented with fewer bits than the input data 
>>> is), which is typically the case, then you can use the Pigeonhole Principle 
>>> to prove what John wrote:
>>>
>>> https://en.wikipedia.org/wiki/Pigeonhole_principle
>>>
>>> On Friday, December 5, 2014 4:35:18 PM UTC-8, John Myles White wrote:
>>>>
>>>> This function is impossible to write in generality since hash functions 
>>>> aren't one-to-one. 
>>>>
>>>>  -- John 
>>>>
>>>> On Dec 5, 2014, at 4:32 PM, David Koslicki wrote: 
>>>>
>>>> > Hello, 
>>>> > 
>>>> > Is there a built in function that will undo hash()? 
>>>> > 
>>>> > i.e. I am looking for a function "dehash()" such that 
>>>> > dehash(hash("ACTG")) == "ACTG" 
>>>> > 
>>>> > I can't seem to find this anywhere (documentation, Google, this user 
>>>> > group, etc.). 
>>>> > 
>>>> > Thanks, 
>>>> > 
>>>> > ~David 
>>>>
>>>>
>>
