Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-15 Thread Kless
An answer of Jerry Stuckle:

---
 Yes, they must be managed by the language.  Which is why it should be
 part of the standard.  That way, changing databases does not require
 changing code.

 You are correct that putting widely used features into a standard that
 is implemented by everyone is good.

 This does not extend to the conclusion that one should never put in a
 feature until it is standard. Look at any successful software product
 and see how it usually leads the standard rather than follows it.
 People only tend to make standards once they realize things are
 getting out of control, which is long after the products are in use.

Non-standard features just force people to stick with that one
product.  In the long run, the only people who benefit are the product
developers.

 In PostgreSQL they're stored as 16 binary bytes [2], and the core
 database does not include any function for generating UUIDs

 Yep, which in the grand scheme of things, probably makes zero
 difference.  The difference between 16 and 32 bytes in any single row
 is minuscule.

 This is incorrect. UUID at 16 bytes is already long in terms of being
 used as a primary index. In an 8K page, one can only fit 512 UUIDs
 (forgetting the requirement for headers) - if it was stored as 32
 bytes - or 36 bytes, or 40 bytes (with punctuation), it would be at
 less than 256 UUIDs per page. For a join table joining one set of
 UUID to another set, that's 256 vs 128. Doubling the size of an index
 row roughly doubles the time to look up the value.

Incorrect.  Doubling the size of the index has very little effect on
how long it takes to look up a value.  Intelligent databases use a
binary search so doubling the size only means one additional
comparison need be done.  And heavily used indexes are generally
cached in memory anyway.

 I am not in favor of adding more database-specific types to ANY
 database - and I think PostGres doing it was a mistake.

 As somebody who wrote his own module to do UUID for PostgreSQL when I
 needed it in PostgreSQL 8.0, I don't agree. Just as you think defining
 it in a standard is better than each vendor doing it their own way, I
 think doing it in one product is better than each user of the product
 doing it their own way.

Fine.  Whatever you want for your code.  But don't expect the rest of
the world to jump because you want it.

 If there is a demand for it, then it should be added to the SQL
 standard.  That is the correct way to propose a change.  That's why
 there are standards.

 Provide a real example of any similar product doing this. Exactly
 which enhancement to a standard was defined without even a prototype
 existing used in an existing product that purports to implement the
 standard?

 I'm sure one or two examples must exist, but I cannot think of any.
 Every enhancement I can think of that eventually made it into a
 standard, was first implemented within a popular product, and then
 demanded as a standard to be applied to all other products.


Most features added to the SQL standard, for instance.  Like explicit
JOINs, recursive SQL and a bunch more.  Also changes to the C++
standard such as exceptions were at least in the process of being
evaluated and approved before they were in any product.

There's a reason for having a process to propose features to a
product.  And it does not require the proposed change to be in any
product.

--
==
Remove the x from my email address
Jerry Stuckle
JDS Computer Training Corp.
[EMAIL PROTECTED]
==
---

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-15 Thread Abhijit Menon-Sen
At 2008-07-15 08:34:01 -0700, [EMAIL PROTECTED] wrote:

 An answer of Jerry Stuckle:

Please stop cross-posting messages from this list to whatever MySQL list
you're on. It's a boring, pointless waste of time at best, and at worst
will get you written off as a troll in both places pretty soon.

-- ams



Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-15 Thread Mark Mielke
First - please stop copying this list - this is not the "convince Jerry
to include UUID in MySQL" mailing list.


Really - I don't care what he thinks. But, on the subjects themselves 
and how they apply to *PostgreSQL*:



Non-standard features just force people to stick with that one
product.  In the long run, the only people who benefit are the product
developers.
  


I chose PostgreSQL over MySQL because it provided numerous features - 
both standard and non - that I needed on the day I made my decision. I 
don't care about the long run as a user. One might as well say 90% of 
the world is wrong for using Microsoft products, because it locks one 
into Microsoft. One can say this - and people do say this - but none of 
this changes the fact that 90% of the world is relatively happy with 
their choice. They voted with their dollars. All decisions should be
made on a cost-benefit analysis - they should not be based on some
arbitrary code like "I will not choose a solution that locks me in."


Additionally - in the context of MySQL - the main reason I chose 
PostgreSQL over MySQL is because it provided things like CREATE VIEW, 
which MySQL did not at the time. People such as Jerry can pretend that 
standards guarantee that a feature is in all products, but it seems 
quite clear that just because something is a standard does NOT mean it 
is implemented the same everywhere, or even at all. At the time I chose 
PostgreSQL it was my opinion that PostgreSQL was far more 
standards-compliant than MySQL was going to be for at least a few years. 
I am glad I came to the correct conclusion. MySQL implemented ACID as an
afterthought. I mean - come on.



This is incorrect. UUID at 16 bytes is already long in terms of being
used as a primary index. In an 8K page, one can only fit 512 UUIDs
(forgetting the requirement for headers) - if it was stored as 32
bytes - or 36 bytes, or 40 bytes (with punctuation), it would be at
less than 256 UUIDs per page. For a join table joining one set of
UUID to another set, that's 256 vs 128. Doubling the size of an index
row roughly doubles the time to look up the value.



Incorrect.  Doubling the size of the index has very little effect on
how long it takes to look up a value.  Intelligent databases use a
binary search so doubling the size only means one additional
comparison need be done.  And heavily used indexes are generally
cached in memory anyway.
  


Wrong. A binary search that must read double the number of pages, and
compare double the number of bytes, will take double the amount of time.
There are factors that will reduce this, such as if you assume that most
of the pages are in memory or cache memory, therefore the time to read
the page is zero, therefore it's only the time to compare bytes - but at
this point, the majority of the time is spent comparing bytes, and it's
still wrong. If we also account for the fact that a UUID is compared
using a possibly inlined memcmp(), whereas a string is variable sized
and much harder to inline (double the number of operations), it's pretty
clear that the person who would make such a statement as above is wrong.
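
The two quantities being argued over here can be put side by side in a
small Python sketch. It assumes fixed-size keys packed densely into 8K
pages with no headers, pointers or fill factor, so it is an idealised
model and not a description of how PostgreSQL actually lays out a
B-tree. The function names and the 10-million-row figure are invented
for illustration.

```python
import math

PAGE_SIZE = 8192  # bytes; idealised page with no header or pointers


def pages_needed(n_keys: int, key_bytes: int) -> int:
    """Pages required to hold n_keys fixed-size keys, densely packed."""
    per_page = PAGE_SIZE // key_bytes
    return math.ceil(n_keys / per_page)


def binary_search_steps(n_keys: int) -> int:
    """Worst-case comparisons for a binary search over n_keys sorted keys.

    This is floor(log2(n)) + 1, which equals the bit length of n.
    """
    return n_keys.bit_length()


n = 10_000_000  # hypothetical number of indexed rows

for key_bytes in (16, 32):
    print(
        f"{key_bytes}-byte keys: {pages_needed(n, key_bytes)} pages, "
        f"{binary_search_steps(n)} comparisons of {key_bytes} bytes each"
    )
```

In this model, widening the key from 16 to 32 bytes doubles the page
count and the bytes touched per comparison, while the number of
comparisons stays the same; doubling the number of keys instead adds
one comparison. How much either effect matters in practice depends on
caching and workload.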


As another poster wrote - why not double the size of all other data
structures too? It costs nothing, right?


Why does MySQL have 3-byte integer support if they truly believe that
saving 1 byte in 4 doesn't result in a savings for keys?


Cheers,
mark

--
Mark Mielke [EMAIL PROTECTED]



Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-15 Thread Kless
I'm sorry, but it was necessary for certain points to be answered by
someone with wide knowledge of databases, and above all of his own
database. This was the only way, and I believe it has been positive
enough, at least for the end users - everyone who chooses their
database. At least this clarifies how each community works, and
what is true and what is not.

On Jul 15, 6:45 pm, [EMAIL PROTECTED] (Abhijit Menon-Sen) wrote:
 At 2008-07-15 08:34:01 -0700, [EMAIL PROTECTED] wrote:



  An answer of Jerry Stuckle:

 Please stop cross-posting messages from this list to whatever MySQL list
 you're on. It's a boring, pointless waste of time at best, and at worst
 will get you written off as a troll in both places pretty soon.




Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-15 Thread Andrew Dunstan



Kless wrote:

I'm sorry, but it was necessary for certain points to be answered by
someone with wide knowledge of databases, and above all of his own
database. This was the only way, and I believe it has been positive
enough, at least for the end users - everyone who chooses their
database. At least this clarifies how each community works, and
what is true and what is not.



Nonsense. It was not at all necessary.

If someone wants to post on this mailing list they should do it 
themselves. If not, you shouldn't cross-post for them.


cheers

andrew






[HACKERS] Fwd: Proposal - UUID data type

2008-07-14 Thread Kless



-- Forwarded message --
From: Jerry Stuckle [EMAIL PROTECTED]
Date: Jul 13, 10:29 pm
Subject: Proposal - UUID data type
To: comp.databases.mysql


Kless wrote:
 On Jul 13, 9:37 pm, Jerry Stuckle [EMAIL PROTECTED] wrote:
 Rather, I would think you should propose the UUID data type be added to
 the SQL standard.  There are enough variations from the standards now.
 Those variations must be managed by the language and not by the RDBMS,
 where its *main work* is  storing data using the specific data types.
 A lot of languages already have modules to working with UUID [1].

Yes, they must be managed by the language.  Which is why it should be
part of the standard.  That way, changing databases does not require
changing code.

 And as it is, UUID's can be stored as a 32 byte string.  So it isn't as
 if there is not an alternative.
 If they're stored in ASCII form (32 hex digits), would indeed be very
 inefficient. So it's necessary that the RDBMS have a specific data
 type to handle the UUIDs.

 In PostgreSQL they're stored as 16 binary bytes [2], and the core
 database does not include any function for generating UUIDs

 [1]http://en.wikipedia.org/wiki/Uuid#Implementations
 [2]http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/...

Yep, which in the grand scheme of things, probably makes zero
difference.  The difference between 16 and 32 bytes in any single row
is minuscule.

I am not in favor of adding more database-specific types to ANY
database - and I think PostGres doing it was a mistake.

If there is a demand for it, then it should be added to the SQL
standard.  That is the correct way to propose a change.  That's why
there are standards.

--
==
Remove the x from my email address
Jerry Stuckle
JDS Computer Training Corp.
[EMAIL PROTECTED]
==



Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-14 Thread Martijn van Oosterhout
On Mon, Jul 14, 2008 at 01:04:33AM -0700, Kless wrote:
 I am not in favor of adding more database-specific types to ANY
 database - and I think PostGres doing it was a mistake.

So you think that adding full text indexing, gist/gin indexes, text,
geometric types should have waited until the SQL standard specified
them? With that kind of thinking we'd still be in the database stone
age.

One of PostgreSQL's greatest strengths is user-defined types, let's use
it.

 If there is a demand for it, then it should be added to the SQL
 standard.  That is the correct way to propose a change.  That's why
 there are standards.

You are of course free to propose it to them, but the question is if
they'd listen...

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.




Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-14 Thread David E. Wheeler

On Jul 14, 2008, at 01:34, Martijn van Oosterhout wrote:


I am not in favor of adding more database-specific types to ANY
database - and I think PostGres doing it was a mistake.


So you think that adding full text indexing, gist/gin indexes, text,
geometric types should have waited until the SQL standard specified
them? With that kind of thinking we'd still be in the database stone
age.


Besides which, I seriously doubt that the SQL standard limits data  
types.


Best,

David



Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-14 Thread Mark Mielke

Kless wrote:

Yes, they must be managed by the language.  Which is why it should be
part of the standard.  That way, changing databases does not require
changing code.
  


You are correct that putting widely used features into a standard that 
is implemented by everyone is good.


This does not extend to the conclusion that one should never put in a 
feature until it is standard. Look at any successful software product 
and see how it usually leads the standard rather than follows it. People 
only tend to make standards once they realize things are getting out of 
control, which is long after the products are in use.



In PostgreSQL they're stored as 16 binary bytes [2], and the core
database does not include any function for generating UUIDs



Yep, which in the grand scheme of things, probably makes zero
difference.  The difference between 16 and 32 bytes in any single row
is minuscule.
  


This is incorrect. UUID at 16 bytes is already long in terms of being 
used as a primary index. In an 8K page, one can only fit 512 UUIDs 
(forgetting the requirement for headers) - if it was stored as 32 bytes 
- or 36 bytes, or 40 bytes (with punctuation), it would be at less than 
256 UUIDs per page. For a join table joining one set of UUID to another 
set, that's  256 vs  128. Doubling the size of an index row roughly 
doubles the time to look up the value.
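
As a rough check of the page arithmetic above, here is a minimal Python
sketch. It makes the same simplification as the text (page headers and
per-row overhead are ignored), so the results are upper bounds rather
than what a real PostgreSQL page holds. `keys_per_page` is a name made
up for this illustration.

```python
PAGE_SIZE = 8192  # bytes in an 8K page, as in the message above


def keys_per_page(key_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Upper bound on fixed-size keys per page, ignoring all headers."""
    return page_size // key_bytes


# Candidate UUID representations and their sizes in bytes.
representations = {
    "16-byte binary": 16,
    "32 hex digits": 32,
    "36 chars with hyphens": 36,
}

for label, size in representations.items():
    single = keys_per_page(size)     # one UUID per index row
    pair = keys_per_page(2 * size)   # join table: two UUIDs per row
    print(f"{label:>22}: {single:3d} per page, {pair:3d} pairs per page")
```

Under these assumptions the 16-byte form gives 512 keys (256 pairs) per
page and the 32-byte form gives 256 keys (128 pairs), which matches the
figures quoted above.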



I am not in favor of adding more database-specific types to ANY
database - and I think PostGres doing it was a mistake.
  


As somebody who wrote his own module to do UUID for PostgreSQL when I 
needed it in PostgreSQL 8.0, I don't agree. Just as you think defining 
it in a standard is better than each vendor doing it their own way, I 
think doing it in one product is better than each user of the product 
doing it their own way.



If there is a demand for it, then it should be added to the SQL
standard.  That is the correct way to propose a change.  That's why
there are standards.
  


Provide a real example of any similar product doing this. Exactly which 
enhancement to a standard was defined without even a prototype existing 
used in an existing product that purports to implement the standard?


I'm sure one or two examples must exist, but I cannot think of any. 
Every enhancement I can think of that eventually made it into a 
standard, was first implemented within a popular product, and then 
demanded as a standard to be applied to all other products.


Cheers,
mark

--
Mark Mielke [EMAIL PROTECTED]



Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-14 Thread Kless
I am posting the answer of Jerry Stuckle [1] here because it seems
interesting and reasonably logical to me.


[1] 
http://groups.google.com/group/comp.databases.mysql/browse_thread/thread/89557609239a995e
---
Quite frankly, I don't care that PostGres has user-defined types.
They restrict you to a single database, when others might be better
for other reasons.

And yes, I think other things should have been proposed to the SQL
standards committee.  It doesn't take that long to get a good proposal
into the standards.  No, it isn't immediate.  But if there is a case
to be made for it, then the committee will act.

Then all databases get the feature, eventually.

As I said.  Do it the right way.  Submit your proposal.  If you have a
case, it will be added to the SQL standard.  If not, then it's not
that important.
---


On Jul 14, 9:34 am, [EMAIL PROTECTED] (Martijn van Oosterhout) wrote:
 On Mon, Jul 14, 2008 at 01:04:33AM -0700, Kless wrote:
  I am not in favor of adding more database-specific types to ANY
  database - and I think PostGres doing it was a mistake.

 So you think that adding full text indexing, gist/gin indexes, text,
 geometric types should have waited until the SQL standard specified
 them? With that kind of thinking we'd still be in the database stone
 age.

 One of PostgreSQL's greatest strengths is user-defined types, let's use
 it.

  If there is a demand for it, then it should be added to the SQL
  standard.  That is the correct way to propose a change.  That's why
  there are standards.

 You are of course free to propose it to them, but the question is if
 they'd listen...




Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-14 Thread David E. Wheeler

On Jul 14, 2008, at 10:49, Kless wrote:


I am posting the answer of Jerry Stuckle [1] here because it seems
interesting and reasonably logical to me.


It just sounds narrow-minded to me. See:

  http://www.oreillynet.com/pub/a/network/2005/07/29/cjdate.html

Best,

David



Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-14 Thread Andrew Dunstan



Kless wrote:

I am posting the answer of Jerry Stuckle [1] here because it seems
interesting and reasonably logical to me.


[1] 
http://groups.google.com/group/comp.databases.mysql/browse_thread/thread/89557609239a995e
---
Quite frankly, I don't care that PostGres has user-defined types.
They restrict you to a single database, when others might be better
for other reasons.

And yes, I think other things should have been proposed to the SQL
standards committee.  It doesn't take that long to get a good proposal
into the standards.  No, it isn't immediate.  But if there is a case
to be made for it, then the committee will act.

Then all databases get the feature, eventually.

As I said.  Do it the right way.  Submit your proposal.  If you have a
case, it will be added to the SQL standard.  If not, then it's not
that important.
  



The time taken to get something into the standard is a lifetime in
computing terms. If my client has a need for a UDT they need it now, not
when the standards committee gets around to thinking about it.


Many UDTs will be specialised to a single user, and never be candidates 
for inclusion in the standards. Excluding any type that isn't in the 
standard would be to throw away one of Postgres' greatest strengths, one 
we are justly proud of. Maybe you don't care about that, but we do, and 
our clients do.


In any case, a standard for UUIDs would almost certainly not specify how 
it is to be stored, which is where we got into this discussion.


This debate started with a misconception about how Postgres actually 
stores UUIDs, and doesn't seem to have gained much point since then.


cheers

andrew



Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-14 Thread Mark Mielke

Kless wrote:

I am posting the answer of Jerry Stuckle [1] here because it seems
interesting and reasonably logical to me.
  


Jerry's answer isn't a real answer - and we don't care what MySQL does 
or does not do. PostgreSQL developers are not going to invest time into 
helping you get a feature into MySQL - if this is what you are trying to 
do, please stop.


MySQL didn't implement SQL-standard views until what - MySQL 4 or 5?
Obviously standards are not their goal either. In Open Source / Free
Software, the free contributions are from people with itches that they 
scratched. In a company like MySQL, it is more about business value or 
somewhere in between. I was a MySQL 3.x/4.x user until I learned 
PostgreSQL, and I have no intention of going back. They have so many 
incorrect assumptions built into their system, that I chose to switch 
databases instead of arguing with them. It's not worth my time, and I 
don't intend to go back. So, I will not be helping you get UUID into 
MySQL because I just don't care about MySQL...


Cheers,
mark




[1] 
http://groups.google.com/group/comp.databases.mysql/browse_thread/thread/89557609239a995e
---
Quite frankly, I don't care that PostGres has user-defined types.
They restrict you to a single database, when others might be better
for other reasons.

And yes, I think other things should have been proposed to the SQL
standards committee.  It doesn't take that long to get a good proposal
into the standards.  No, it isn't immediate.  But if there is a case
to be made for it, then the committee will act.

Then all databases get the feature, eventually.

As I said.  Do it the right way.  Submit your proposal.  If you have a
case, it will be added to the SQL standard.  If not, then it's not
that important.
---
  


--
Mark Mielke [EMAIL PROTECTED]




Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-14 Thread Gregory Stark
Mark Mielke [EMAIL PROTECTED] writes:

 Kless wrote:
 Yes, they must be managed by the language.  Which is why it should be
 part of the standard.  That way, changing databases does not require
 changing code.
   

 You are correct that putting widely used features into a standard that is
 implemented by everyone is good.

 This does not extend to the conclusion that one should never put in a feature
 until it is standard. Look at any successful software product and see how it
 usually leads the standard rather than follows it. People only tend to make
 standards once they realize things are getting out of control, which is long
 after the products are in use.


To be fair there are two types of standards. Some standards follow
implementations, others lead and prevent the babel of incompatible
implementations from ever developing. Both ways work in the right context.

But (perhaps unfortunately) the SQL spec is very definitely of the former
type. Nothing goes into the SQL spec that one of the implementors hasn't
already done.

But that said Postgres isn't one of the main participants in the spec
committee. We don't really want to add features that later get added to the
spec in an incompatible way and we have no say in the committee to avoid that
situation.

 In PostgreSQL they're stored as 16 binary bytes [2], and the core
 database does not include any function for generating UUIDs
 

 Yep, which in the grand scheme of things, probably makes zero
 difference.  The difference between 16 and 32 bytes in any single row
 is minuscule.

Really? It sounds like 100% difference to me. If you applied that logic to
everything you would have a database which runs at half the speed it would
otherwise.

Keep in mind also that your primary keys have to be stored in every other
table as foreign keys too...
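
To make the foreign-key point concrete, the following Python sketch
totals the key bytes stored across a primary table and the tables that
reference it. The table names and row counts are entirely hypothetical,
and index copies of the key and per-row overhead are left out, so real
totals would be larger.

```python
def key_bytes_stored(key_bytes: int, rows_per_table: dict[str, int]) -> int:
    """Total bytes spent on one key column across all tables that store it."""
    return key_bytes * sum(rows_per_table.values())


# Hypothetical schema: "users" owns the key, the other tables carry it
# as a foreign key.
rows = {"users": 1_000_000, "orders": 5_000_000, "order_items": 20_000_000}

binary = key_bytes_stored(16, rows)    # UUID as 16 binary bytes
hex_text = key_bytes_stored(32, rows)  # UUID as 32 hex characters

print(f"16-byte keys: {binary / 1e6:.0f} MB")
print(f"32-byte keys: {hex_text / 1e6:.0f} MB")
print(f"extra:        {(hex_text - binary) / 1e6:.0f} MB")
```

The per-row difference is only 16 bytes, but it is paid once for every
row in every table that carries the key.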

 I'm sure one or two examples must exist, but I cannot think of any. Every
 enhancement I can think of that eventually made it into a standard, was first
 implemented within a popular product, and then demanded as a standard to be
 applied to all other products.

C99? SMTP? NTP?

It tends to be important for network protocols since there's no gain in having
non-interoperable protocols.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-14 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Mark Mielke [EMAIL PROTECTED] writes:
 I'm sure one or two examples must exist, but I cannot think of any. Every
 enhancement I can think of that eventually made it into a standard, was first
 implemented within a popular product, and then demanded as a standard to be
 applied to all other products.

 C99? SMTP? NTP?

 It tends to be important for network protocols since there's no gain in having
 non-interoperable protocols.

Actually, the IETF's mantra has always been "rough consensus and running
code" (cf http://www.faqs.org/rfcs/rfc2031.html).  Network protocols
don't get standardized in advance of a working prototype, either.

(No, I take that back: there were some that did.  Ever heard of OSI?)

regards, tom lane



Re: [HACKERS] Fwd: Proposal - UUID data type

2008-07-14 Thread Mark Mielke

Gregory Stark wrote:

Mark Mielke [EMAIL PROTECTED] writes:
  

I'm sure one or two examples must exist, but I cannot think of any. Every
enhancement I can think of that eventually made it into a standard, was first
implemented within a popular product, and then demanded as a standard to be
applied to all other products.



C99? SMTP? NTP?

It tends to be important for network protocols since there's no gain in having
non-interoperable protocols.
  


For C99 - GCC had most of the C99 features years before C99 started.
There are now some incompatibilities that need to be dealt with.


For SMTP and NTP I think these protocols are just so old that people
don't realize how much they have evolved, and how many products existed.
I wasn't in the know at the time they were written (I was either a baby
or in grade school), but I bet either: 1) they were written before it
existed at all (not really an enhancement), or 2) they followed the
prototype as it was implemented. There have been many extensions to SMTP
that I am aware of, including support for SSL, that I doubt were in
the standard first. The RFC is a "request for comment". The STD
process came a lot later.


If we grab a phrase from RFC 1305 for NTP - "In Version 3 a new
algorithm to combine the offsets of a number of peer time servers is
presented in Appendix F. This algorithm is modelled on those used by
national standards laboratories to combine the weighted offsets from a
number of standard clocks to construct a synthetic laboratory timescale
more accurate than that of any clock separately." It seems pretty
clear that the standard was updated based upon existing implementation.


To some degree, except for the simplest of designs, it is almost bad to
write down what WILL be done, without having experience, or a prototype
to base one's conclusions on. Ivory tower stuff. The purpose of a
standard is to have one common way that things are done - hopefully the
best way - not just the only way that was considered. :-)


Cheers,
mark

--
Mark Mielke [EMAIL PROTECTED]