Re: Question regarding fast-hashing in PGSQL

2019-09-18 Thread Tom Lane
Stephen Conley writes:
> My idea was to hash the string to a bigint, because the likelihood of all 3
> columns colliding is almost 0, and if a duplicate does crop up, it isn't
> the end of the world.
> However, Postgresql doesn't seem to have any 'native' hashing calls that
> result in a bigint.
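For illustration only, the string-to-bigint idea quoted above can be sketched on the client side in Python. This is a rough sketch under stated assumptions, not something proposed in the thread: the function name `hash_to_bigint`, the unit-separator delimiter, and the choice of MD5 truncated to 8 bytes are all hypothetical.

```python
import hashlib


def hash_to_bigint(*columns: str) -> int:
    """Hash several column values into one signed 64-bit integer.

    Hypothetical helper: the name, the delimiter and the use of MD5
    are assumptions made for this sketch.
    """
    # Join with a separator that is unlikely to occur in the data, so
    # ("ab", "c") and ("a", "bc") do not produce the same input string.
    joined = "\x1f".join(columns)
    digest = hashlib.md5(joined.encode("utf-8")).digest()
    # Keep the first 8 of the 16 digest bytes and read them as a signed
    # integer, which matches the range of a PostgreSQL bigint.
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


key = hash_to_bigint("some name", "some city", "2019-09-18")
```

Truncating to 64 bits makes collisions more likely than with the full digest, which fits the stated tolerance for an occasional duplicate.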

Re: Question regarding fast-hashing in PGSQL

2019-09-18 Thread Stephen Conley
This should work perfectly for me. Thank you so much!
On Wed, Sep 18, 2019 at 12:50 PM Adam Brusselback wrote:
> I've had a similar issue in the past.
>
> I used the md5 hash function and stored it in a UUID column for my
> comparisons. Bigger than a bigint, but still much faster than string
>

Re: Question regarding fast-hashing in PGSQL

2019-09-18 Thread Adam Brusselback
I've had a similar issue in the past. I used the md5 hash function and stored it in a UUID column for my comparisons. Bigger than a bigint, but still much faster than string comparisons directly for my use case. UUID works fine for storing md5 hashes and gives you the ability to piggyback on all t
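As a hedged sketch of the approach described above, an MD5 digest is 128 bits, the same width as a UUID, so it can be carried in a UUID value unchanged. The Python below shows that mapping on the client side; the function name `md5_as_uuid` and the delimiter are assumptions for illustration and do not come from the thread.

```python
import hashlib
import uuid


def md5_as_uuid(*columns: str) -> uuid.UUID:
    """Pack the MD5 digest of the joined column values into a UUID.

    Hypothetical helper: the name and the delimiter are assumptions
    made for this sketch.
    """
    # A separator keeps adjacent column values from running together.
    joined = "\x1f".join(columns)
    digest = hashlib.md5(joined.encode("utf-8")).digest()  # 16 bytes
    # uuid.UUID accepts any 16 bytes; no version bits are enforced here.
    return uuid.UUID(bytes=digest)


dedup_key = md5_as_uuid("some name", "some city", "2019-09-18")
```

The resulting value keeps all 128 bits of the digest, so it is larger than a bigint but far less likely to collide.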

Question regarding fast-hashing in PGSQL

2019-09-18 Thread Stephen Conley
Hey there; I have a weird use case where I am basically taking data from many different sources and merging it into a single table, while trying to avoid duplicates as much as possible. None of them share any kind of primary key, but I have determined 3 columns that, together, will almost always