Re: [HACKERS] UUID/GUID discussion leading to request for hexstring bytea?

2006-09-18 Thread Thomas Hallgren

Gevik Babakhani wrote:

 To my opinion GUIDs type need to provide the following in the database.

1. GUID type must accept the correct string format(s), with of without
extra '-'
2. GUID type must internally be stored as small as possible.
3. GUID type must be comparable with == , != , LIKE and (NOT) IS NULL
4. GUID type must have the ability to be indexed, grouped, ordered,
DISTINCT... but not MAX(), MIN() or SUM()


Where do you see a need for LIKE on a GUID?

Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] UUID/GUID discussion leading to request for hexstring bytea?

2006-09-18 Thread Gevik Babakhani
LIKE could come handy if someone wants to abuse the uuid datatype to
store MD5 hash values. However I am not going to implement it if there
is no need for that (assuming it will pass the acceptance test)

On Mon, 2006-09-18 at 10:06 +0200, Thomas Hallgren wrote:
 Gevik Babakhani wrote:
   To my opinion GUIDs type need to provide the following in the database.
  
  1. GUID type must accept the correct string format(s), with of without
  extra '-'
  2. GUID type must internally be stored as small as possible.
  3. GUID type must be comparable with == , != , LIKE and (NOT) IS NULL
  4. GUID type must have the ability to be indexed, grouped, ordered,
  DISTINCT... but not MAX(), MIN() or SUM()
  
 Where do you see a need for LIKE on a GUID?
 
 Regards,
 Thomas Hallgren
 
 


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] UUID/GUID discussion leading to request for hexstring bytea?

2006-09-18 Thread Thomas Hallgren

Gevik Babakhani wrote:

LIKE could come handy if someone wants to abuse the uuid datatype to
store MD5 hash values. However I am not going to implement it if there
is no need for that (assuming it will pass the acceptance test)

  
Perhaps providing LIKE just to encourage abuse is not such a good idea? 
IMHO, a GUID should be comparable for equality and NULL only, not LIKE. 
I also think that ordering is feasible only when looking at parts of the 
GUID, i.e. order by the result of a function that extracts a timestamp 
or a node-address. Magnitude comparison on the GUID as a whole makes no 
sense to me.


Regards,
Thomas Hallgren


On Mon, 2006-09-18 at 10:06 +0200, Thomas Hallgren wrote:
  

Gevik Babakhani wrote:


 To my opinion GUIDs type need to provide the following in the database.

1. GUID type must accept the correct string format(s), with of without
extra '-'
2. GUID type must internally be stored as small as possible.
3. GUID type must be comparable with == , != , LIKE and (NOT) IS NULL
4. GUID type must have the ability to be indexed, grouped, ordered,
DISTINCT... but not MAX(), MIN() or SUM()

  

Where do you see a need for LIKE on a GUID?

Regards,
Thomas Hallgren





  



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] UUID/GUID discussion leading to request for hexstring bytea?

2006-09-18 Thread Gevik Babakhani

 Magnitude comparison on the GUID as a whole makes no 
 sense to me.

I agree. Any kind of comparison except equality has no meaning for the
GUID. (And this is discussed before) I rather have the option to sort
and group for the sake of consistency and compatibility.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] UUID/GUID discussion leading to request for hexstring bytea?

2006-09-18 Thread mark
On Mon, Sep 18, 2006 at 11:12:54AM +0200, Gevik Babakhani wrote:
  Magnitude comparison on the GUID as a whole makes no 
  sense to me.
 I agree. Any kind of comparison except equality has no meaning for the
 GUID. (And this is discussed before) I rather have the option to sort
 and group for the sake of consistency and compatibility.

Thomas: The ability to sort / comparison is required for use with
B-Tree index. I prefer a fast comparison over one with more meaning.
memcmp() is fine with me and it is how I implement it in my UUID
PostgreSQL library.

Gevik: Once you have your patch in a ready state, I'll compare it
against what I have and see if there is anything missing, or if I
did anything better. Possibly not, but it's worth the check.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] UUID/GUID discussion leading to request for hexstring bytea?

2006-09-18 Thread Thomas Hallgren

[EMAIL PROTECTED] wrote:

On Mon, Sep 18, 2006 at 11:12:54AM +0200, Gevik Babakhani wrote:
Magnitude comparison on the GUID as a whole makes no 
sense to me.

I agree. Any kind of comparison except equality has no meaning for the
GUID. (And this is discussed before) I rather have the option to sort
and group for the sake of consistency and compatibility.


Thomas: The ability to sort / comparison is required for use with
B-Tree index. I prefer a fast comparison over one with more meaning.
memcmp() is fine with me and it is how I implement it in my UUID
PostgreSQL library.

Fair enough. Although the magnitudes as such makes little sense, the ability to order will 
make it possible to compare results from different queries etc. Very difficult to do with 
random order.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] UUID/GUID discussion leading to request for hexstring bytea?

2006-09-07 Thread Gevik Babakhani
On Wed, 2006-09-06 at 17:05 -0400, [EMAIL PROTECTED] wrote:

 The UUID type itself has value, however, the value it provides is
 limited. Generation of a UUID doesn't have to occur with the database.
 The application inserting the row can generate the UUID. The UUID type
 itself has limited value, in that the difference between a 36 bytes +
 4 bytes = 40 bytes as a fully expanded char(40), or the packed value
 using the hexstring encode/decode SQL functions of a bytea type at 16
 + 4 bytes = 20 bytes compared to a tightly packed UUID type of 16
 bytes, are very close. The argument can easily be made that if space
 (disk space, index size, I/O bandwidth) is your first priority, than
 a UUID is the wrong type to use. A 64-bit integer may suffice.

Agreed.

 
 I'm also having trouble with the idea that a UUID deserves special
 treatment. I currently have a desire to store both UUID and MD5
 checksum in my rows. They are both 128 bits = 16 bytes, and fit all
 of the same requirements above.

Agreed.

 This makes it seem as if a generic 128-bit data type would be
 desirable. They both have a compatible representation of a hexadecimal
 string. The extra '-' characters in the UUID can be easily added when
 necessary by a HEXSTRING2UUID() sort of function in plpgsql or in
 the claling application.

Agreed

 
 But - MD5 isn't the only checksum that is frequently used. Some argue
 that the MD5 can be shown to be weak in some regards, and that perhaps
 other checksum algorithms such as SHA-1 provide a better guarantee of
 uniqueness. SHA-1 isn't 128-bits. It's 160-bits.

To my knowledge most apps use MD5 checksum because it is easier to
implement and use.  
 
 This is where I start to buy Tom Lane's argument that the 4-byte
 prefix is no big deal. I find it more desirable to have a binary data
 type with a hexadecimal string input and output function. The
 flexibility of being able to use 128-bits or 160-bits is worth this
 4-byte header to me. What I don't want to do is store double size
 fields, stored as hexadecimal.

Agreed.

I have experimented with GUID/UUID type for a while and here are my
findings when having the need to store GUID/UUID/MD5 in the database.

1. Almost always these values are created outside the database. So to my
opinion having all kinds of functions in the database to create such
values becomes second priority except when wanting to use GUIDs auto
generated for PK like the uuid datatype in MS SQL

2. Yet I haven't seen any kind of arithmetic operations (+ - * / %) on
stored GUIDs hence these operations would be meaningless. I also never
did see the need to change specific parts of a GUID 

To my opinion only some of relational/compare operations like == and !=
apply to such values. comparing guid = guid or md5  md5 is also
meaningless.

3. Almost always a GUID/MD5 is not changed when it is generated and
stored except for cases like resetting md5sum of a password that is
changed.

To my opinion GUIDs type need to provide the following in the database.

1. GUID type must accept the correct string format(s), with of without
extra '-'
2. GUID type must internally be stored as small as possible.
3. GUID type must be comparable with == , != , LIKE and (NOT) IS NULL
4. GUID type must have the ability to be indexed, grouped, ordered,
DISTINCT... but not MAX(), MIN() or SUM()

I think option three would be a good one to experiment with. I am
thinking of a 16 length struct to hold the GUID value and corresponding
functionality to achieve the above however possible.

I also think we should have the GUID/UUID as a datatype and not just
functions handling hexstring.







---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] UUID/GUID discussion leading to request for hexstring bytea?

2006-09-07 Thread Martijn van Oosterhout
On Thu, Sep 07, 2006 at 01:27:21PM +0200, Gevik Babakhani wrote:
 To my opinion only some of relational/compare operations like == and !=
 apply to such values. comparing guid = guid or md5  md5 is also
 meaningless.

snip

 4. GUID type must have the ability to be indexed, grouped, ordered,
 DISTINCT... but not MAX(), MIN() or SUM()

Err, for ordered you need to define , , =, =, which means you're
going to get MAX, and MIN for free...

 I also think we should have the GUID/UUID as a datatype and not just
 functions handling hexstring.

Sure, but that will be the I/O format, right?

Have a nice day,

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature