Re: Enforce primary key on every table during dev?

2018-03-02 Thread marcelo



On 02/03/2018 01:10 , Daevor The Devoted wrote:



On Fri, Mar 2, 2018 at 12:05 AM, Gavin Flower 
> 
wrote:


On 02/03/18 06:47, Daevor The Devoted wrote:


On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar

>> wrote:


    >Adding a surrogate key to such a table just adds overhead,
    although that could be useful
    >in case specific rows need updating or deleting without also
    modifying the other rows with
    >that same data - normally, only insertions and selections
happen
    on such tables though,
    >and updates or deletes are absolutely forbidden - corrections
    happen by inserting rows with
    >an opposite transaction.

    I routinely add surrogate keys like serial col to a table
already
    having a nice candidate keys
    to make it easy to join tables.  SQL starts looking
ungainly when
    you have a 3 col primary
    key and need to join it with child tables.


I was always of the opinion that a mandatory surrogate key (as
you describe) is good practice.
Sure there may be a unique key according to business logic
(which may be consist of those "ungainly" multiple columns),
but guess what, business logic changes, and then you're
screwed! So using a primary key whose sole purpose is to be a
primary key makes perfect sense to me.


I once worked in a data base that had primary keys of at least 4
columns, all character fields, Primary Key could easily exceed 45
characters.  Parent child structure was at least 4 deep.

A child table only needs to know its parent, so there is no
logical need to include its parent and higher tables primary keys,
and then have to add a field to make the composite primary key
unique!  So if every table has int (or long) primary keys, then a
child only need a single field to reference its parent.

Some apparently safe Natural Keys might change unexpectedly.  A
few years aback there was a long thread on Natural versus
Surrogate keys - plenty of examples were using Natural Keys can
give grief when they had to be changed!  I think it best to
isolate a database from external changes as much as is practicable.

Surrogate keys also simply coding, be it in SQL or Java, or
whatever language is flavour of the month.  Also it makes setting
up testdata and debugging easier.

I almost invariably define a Surrogate key when I design tables.


Cheers,
Gavin


Thank you! I think you have expressed far more clearly what I have 
been trying to say. +10 to you.

Me too. Another +10.


---
El software de antivirus Avast ha analizado este correo electrónico en busca de 
virus.
https://www.avast.com/antivirus


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Daevor The Devoted
On Fri, Mar 2, 2018 at 12:05 AM, Gavin Flower  wrote:

> On 02/03/18 06:47, Daevor The Devoted wrote:
>
>>
>> On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar > > wrote:
>>
>>
>> >Adding a surrogate key to such a table just adds overhead,
>> although that could be useful
>> >in case specific rows need updating or deleting without also
>> modifying the other rows with
>> >that same data - normally, only insertions and selections happen
>> on such tables though,
>> >and updates or deletes are absolutely forbidden - corrections
>> happen by inserting rows with
>> >an opposite transaction.
>>
>> I routinely add surrogate keys like serial col to a table already
>> having a nice candidate keys
>> to make it easy to join tables.  SQL starts looking ungainly when
>> you have a 3 col primary
>> key and need to join it with child tables.
>>
>>
>> I was always of the opinion that a mandatory surrogate key (as you
>> describe) is good practice.
>> Sure there may be a unique key according to business logic (which may be
>> consist of those "ungainly" multiple columns), but guess what, business
>> logic changes, and then you're screwed! So using a primary key whose sole
>> purpose is to be a primary key makes perfect sense to me.
>>
>
> I once worked in a data base that had primary keys of at least 4 columns,
> all character fields, Primary Key could easily exceed 45 characters.
> Parent child structure was at least 4 deep.
>
> A child table only needs to know its parent, so there is no logical need
> to include its parent and higher tables primary keys, and then have to add
> a field to make the composite primary key unique!  So if every table has
> int (or long) primary keys, then a child only need a single field to
> reference its parent.
>
> Some apparently safe Natural Keys might change unexpectedly.  A few years
> aback there was a long thread on Natural versus Surrogate keys - plenty of
> examples were using Natural Keys can give grief when they had to be
> changed!  I think it best to isolate a database from external changes as
> much as is practicable.
>
> Surrogate keys also simply coding, be it in SQL or Java, or whatever
> language is flavour of the month.  Also it makes setting up testdata and
> debugging easier.
>
> I almost invariably define a Surrogate key when I design tables.
>
>
> Cheers,
> Gavin
>
>
> Thank you! I think you have expressed far more clearly what I have been
trying to say. +10 to you.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread marcelo



On 01/03/2018 19:05 , Gavin Flower wrote:

On 02/03/18 06:47, Daevor The Devoted wrote:


On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar > wrote:



    >Adding a surrogate key to such a table just adds overhead,
    although that could be useful
    >in case specific rows need updating or deleting without also
    modifying the other rows with
    >that same data - normally, only insertions and selections happen
    on such tables though,
    >and updates or deletes are absolutely forbidden - corrections
    happen by inserting rows with
    >an opposite transaction.

    I routinely add surrogate keys like serial col to a table already
    having a nice candidate keys
    to make it easy to join tables.  SQL starts looking ungainly when
    you have a 3 col primary
    key and need to join it with child tables.


I was always of the opinion that a mandatory surrogate key (as you 
describe) is good practice.
Sure there may be a unique key according to business logic (which may 
be consist of those "ungainly" multiple columns), but guess what, 
business logic changes, and then you're screwed! So using a primary 
key whose sole purpose is to be a primary key makes perfect sense to me.


I once worked in a data base that had primary keys of at least 4 
columns, all character fields, Primary Key could easily exceed 45 
characters.  Parent child structure was at least 4 deep.


A child table only needs to know its parent, so there is no logical 
need to include its parent and higher tables primary keys, and then 
have to add a field to make the composite primary key unique!  So if 
every table has int (or long) primary keys, then a child only need a 
single field to reference its parent.


Some apparently safe Natural Keys might change unexpectedly.  A few 
years aback there was a long thread on Natural versus Surrogate keys - 
plenty of examples were using Natural Keys can give grief when they 
had to be changed!  I think it best to isolate a database from 
external changes as much as is practicable.


Surrogate keys also simply coding, be it in SQL or Java, or whatever 
language is flavour of the month.  Also it makes setting up testdata 
and debugging easier.


I almost invariably define a Surrogate key when I design tables.


Cheers,
Gavin





+5. I fully agree.

---
El software de antivirus Avast ha analizado este correo electrónico en busca de 
virus.
https://www.avast.com/antivirus




Re: Enforce primary key on every table during dev?

2018-03-01 Thread marcelo



On 01/03/2018 18:41 , Adrian Klaver wrote:

On 03/01/2018 01:26 PM, Ron Johnson wrote:

On 03/01/2018 03:14 PM, Adrian Klaver wrote:

On 03/01/2018 01:03 PM, Ron Johnson wrote:

On 03/01/2018 02:32 PM, David G. Johnston wrote:


There's always the "account number", which is usually synthetic. 
Credit Card numbers are also synthetic. 


Actually, no:

https://en.wikipedia.org/wiki/Payment_card_number

There is a method to the madness, not just random issuance of 
numbers. It was made it relatively easy for folks to *generate 
numbers*. Hence the addition of CSC codes.


Right.  And how do the issuers generate the individual account 
identifier within their IIN ranges?


Who knows, that is their business, though there is nothing to say they 
don't use some sort of internal 'natural' logic. It has been awhile 
since we have gone down this rabbit hole on this list, mostly because 
it is an issue that is usually left at 'we agree to disagree'. Though 
the thing that always strikes me is the assumption that a 
number/surrogate key is less 'natural' then some other sort of tag or 
combination of tags. Because that is what PK's are, a tag to identify 
a record.

+1.






ICD numbers are (relatively)

synthetic, too.

But that doesn't mean we have to use them willy-nilly everywhere.


--
Angular momentum makes the world go 'round.






---
El software de antivirus Avast ha analizado este correo electrónico en busca de 
virus.
https://www.avast.com/antivirus




Re: Enforce primary key on every table during dev?

2018-03-01 Thread Gavin Flower

On 02/03/18 06:47, Daevor The Devoted wrote:


On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar > wrote:



>Adding a surrogate key to such a table just adds overhead,
although that could be useful
>in case specific rows need updating or deleting without also
modifying the other rows with
>that same data - normally, only insertions and selections happen
on such tables though,
>and updates or deletes are absolutely forbidden - corrections
happen by inserting rows with
>an opposite transaction.

I routinely add surrogate keys like serial col to a table already
having a nice candidate keys
to make it easy to join tables.  SQL starts looking ungainly when
you have a 3 col primary
key and need to join it with child tables.


I was always of the opinion that a mandatory surrogate key (as you 
describe) is good practice.
Sure there may be a unique key according to business logic (which may 
be consist of those "ungainly" multiple columns), but guess what, 
business logic changes, and then you're screwed! So using a primary 
key whose sole purpose is to be a primary key makes perfect sense to me.


I once worked in a data base that had primary keys of at least 4 
columns, all character fields, Primary Key could easily exceed 45 
characters.  Parent child structure was at least 4 deep.


A child table only needs to know its parent, so there is no logical need 
to include its parent and higher tables primary keys, and then have to 
add a field to make the composite primary key unique!  So if every table 
has int (or long) primary keys, then a child only need a single field to 
reference its parent.


Some apparently safe Natural Keys might change unexpectedly.  A few 
years aback there was a long thread on Natural versus Surrogate keys - 
plenty of examples were using Natural Keys can give grief when they had 
to be changed!  I think it best to isolate a database from external 
changes as much as is practicable.


Surrogate keys also simply coding, be it in SQL or Java, or whatever 
language is flavour of the month.  Also it makes setting up testdata and 
debugging easier.


I almost invariably define a Surrogate key when I design tables.


Cheers,
Gavin





Re: Enforce primary key on every table during dev?

2018-03-01 Thread Adrian Klaver

On 03/01/2018 01:26 PM, Ron Johnson wrote:

On 03/01/2018 03:14 PM, Adrian Klaver wrote:

On 03/01/2018 01:03 PM, Ron Johnson wrote:

On 03/01/2018 02:32 PM, David G. Johnston wrote:


There's always the "account number", which is usually synthetic. 
Credit Card numbers are also synthetic. 


Actually, no:

https://en.wikipedia.org/wiki/Payment_card_number

There is a method to the madness, not just random issuance of numbers. 
It was made it relatively easy for folks to *generate numbers*. Hence 
the addition of CSC codes.


Right.  And how do the issuers generate the individual account 
identifier within their IIN ranges?


Who knows, that is their business, though there is nothing to say they 
don't use some sort of internal 'natural' logic. It has been awhile 
since we have gone down this rabbit hole on this list, mostly because it 
is an issue that is usually left at 'we agree to disagree'. Though the 
thing that always strikes me is the assumption that a number/surrogate 
key is less 'natural' then some other sort of tag or combination of 
tags. Because that is what PK's are, a tag to identify a record.






ICD numbers are (relatively)

synthetic, too.

But that doesn't mean we have to use them willy-nilly everywhere.


--
Angular momentum makes the world go 'round.



--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Enforce primary key on every table during dev?

2018-03-01 Thread Ron Johnson

On 03/01/2018 03:14 PM, Adrian Klaver wrote:

On 03/01/2018 01:03 PM, Ron Johnson wrote:

On 03/01/2018 02:32 PM, David G. Johnston wrote:


There's always the "account number", which is usually synthetic. Credit 
Card numbers are also synthetic. 


Actually, no:

https://en.wikipedia.org/wiki/Payment_card_number

There is a method to the madness, not just random issuance of numbers. It 
was made it relatively easy for folks to *generate numbers*. Hence the 
addition of CSC codes.


Right.  And how do the issuers generate the individual account identifier 
within their IIN ranges?




ICD numbers are (relatively)

synthetic, too.

But that doesn't mean we have to use them willy-nilly everywhere.


--
Angular momentum makes the world go 'round.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread David G. Johnston
On Thu, Mar 1, 2018 at 2:06 PM, Tim Cross  wrote:

> +1. And a good test of your underlying data model is whether you can
> identify a natural primary key. If you can't, chances are your model is
> immature/flawed and needs more analysis.
>

​https://en.wikipedia.org/wiki/All_models_are_wrong

Unfortunately identifying a natural primary key doesn't guarantee that
one's model is mature, unblemished, and complete - the model writer may
just not know what they don't know.  But they may know enough, or the
application is constrained enough, for it to be useful anyway.

David J.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Adrian Klaver

On 03/01/2018 01:03 PM, Ron Johnson wrote:

On 03/01/2018 02:32 PM, David G. Johnston wrote:


There's always the "account number", which is usually synthetic. Credit 
Card numbers are also synthetic.  


Actually, no:

https://en.wikipedia.org/wiki/Payment_card_number

There is a method to the madness, not just random issuance of numbers. 
It was made it relatively easy for folks to generate numbers. Hence the 
addition of CSC codes.


ICD numbers are (relatively)

synthetic, too.

But that doesn't mean we have to use them willy-nilly everywhere.


--
Angular momentum makes the world go 'round.



--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Enforce primary key on every table during dev?

2018-03-01 Thread Tim Cross

Ron Johnson  writes:

> On 03/01/2018 02:08 PM, marcelo wrote:
>>
>>
>> On 01/03/2018 16:42 , Ron Johnson wrote:
>>> On 03/01/2018 01:11 PM, marcelo wrote:

 On 01/03/2018 16:00 , Ron Johnson wrote:
>>> [snip]
> If your only unique index is a synthetic key, then you can insert the 
> same "business data" multiple times with different synthetic keys.
>
>
> -- 
> Angular momentum makes the world go 'round.
 IMHO, business logic can and must preclude "garbage insertion". Except 
 you are inserting data directly to database using SQL, any n-tier 
 architecture will be checking data validity.
>>>
>>> Any n-tier architecture that's bug-free.
>>>
>> Do you know about unit testing?
>
> Way Back When Dinosaurs Still Roamed The Earth and I first learned the 
> trade, the focus was on proper design instead of throwing crud against the 
> wall and hoping tests caught any bugs. Because, of course, unit tests are 
> only as good as you imagination in devising tests.

+1. And a good test of your underlying data model is whether you can
identify a natural primary key. If you can't, chances are your model is
immature/flawed and needs more analysis.

-- 
Tim Cross



Re: Enforce primary key on every table during dev?

2018-03-01 Thread Ron Johnson

On 03/01/2018 02:32 PM, David G. Johnston wrote:
On Thu, Mar 1, 2018 at 1:24 PM, Ron Johnson >wrote:


Why have the overhead of a second unique index?  If it's "ease of
joins", then I agree with Francisco Olarte and use the business logic
keys in your joins even though it's a bit of extra work.


​The strongest case, for me, when a surrogate key is highly desirable is 
when there is no truly natural key and the best key for the model is 
potentially alterable. Specific, the "name" of something.  If I add myself 
to a database and make name unique, so David Johnston, then someone else 
comes along with the same name and now I want to add the new person as, 
say David A. Johnston AND rename my existing record to David G. Johnston.  
I keep the needed uniqueness ​and don't need to cobble together other data 
elements.  Or, if I were to use email address as the key the same physical 
entity can now change their address without me having to cascade update 
all FK instances too. Avoiding the FK cascade when enforcing a non-ideal 
PK is a major good reason to assign a surrogate.


There's always the "account number", which is usually synthetic. Credit Card 
numbers are also synthetic.  ICD numbers are (relatively) synthetic, too.


But that doesn't mean we have to use them willy-nilly everywhere.


--
Angular momentum makes the world go 'round.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread geoff hoffman
I found this thread very interesting. 

A pivot table is a perfectly valid use case where a compound unique key on two 
or more columns performs the same function as a primary key without one. 

I’m not nearly as familiar with Postgres as I am with MySQL (which is why I 
recently joined this list)... it may be possible to define a collection of 
tables as a primary key. But if only a unique key is specified in this case, 
everyone would still be ok with the result from a logic design standpoint.

 I think Melvin, way up the thread, had the best answer- be the DBA and have a 
review process. Don’t let folks go adding tables as they like.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Ron Johnson

On 03/01/2018 02:44 PM, Daevor The Devoted wrote:



On Thu, Mar 1, 2018 at 10:32 PM, David G. Johnston 
> wrote:


On Thu, Mar 1, 2018 at 1:24 PM, Ron Johnson >wrote:

Why have the overhead of a second unique index?  If it's "ease of
joins", then I agree with Francisco Olarte and use the business
logic keys in your joins even though it's a bit of extra work.


​The strongest case, for me, when a surrogate key is highly desirable
is when there is no truly natural key and the best key for the model
is potentially alterable.  Specific, the "name" of something.  If I
add myself to a database and make name unique, so David Johnston, then
someone else comes along with the same name and now I want to add the
new person as, say David A. Johnston AND rename my existing record to
David G. Johnston.  I keep the needed uniqueness ​and don't need to
cobble together other data elements.  Or, if I were to use email
address as the key the same physical entity can now change their
address without me having to cascade update all FK instances too.
Avoiding the FK cascade when enforcing a non-ideal PK is a major good
reason to assign a surrogate.

David J.


This is exactly my point: you cannot know when a Business Rule is going to 
change. Consider, for example, your Social Security number (or ID number 
as we call it in South Africa). This is unique, right?.


No, the SSN is not unique. 
https://www.computerworld.com/article/2552992/it-management/not-so-unique.html


Tomorrow, however, data of people from multiple countries gets added to 
your DB, and BAM! that ID number is suddenly no longer unique. Business 
Rules can and do change, and we do not know what may change in the future. 
Hence, it is safest to have the surrogate in place from the start, and 
avoid the potential migraine later on.


Disclaimer: this is just my opinion based on my experience (and the pain I 
had to go through when Business Rules changed). I have not done any 
research or conducted any studies on this.



--
Angular momentum makes the world go 'round.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Ron Johnson

On 03/01/2018 02:32 PM, marcelo wrote:



On 01/03/2018 17:21 , Ron Johnson wrote:

On 03/01/2018 02:08 PM, marcelo wrote:



On 01/03/2018 16:42 , Ron Johnson wrote:

On 03/01/2018 01:11 PM, marcelo wrote:


On 01/03/2018 16:00 , Ron Johnson wrote:

[snip]
If your only unique index is a synthetic key, then you can insert the 
same "business data" multiple times with different synthetic keys.



--
Angular momentum makes the world go 'round.
IMHO, business logic can and must preclude "garbage insertion". Except 
you are inserting data directly to database using SQL, any n-tier 
architecture will be checking data validity.

bl
Any n-tier architecture that's bug-free.


Do you know about unit testing?


Way Back When Dinosaurs Still Roamed The Earth and I first learned the 
trade, the focus was on proper design instead of throwing crud against 
the wall and hoping tests caught any bugs.  Because, of course, unit 
tests are only as good as you imagination in devising tests.



So, you are fully convinced that there´s no bug free software... Same as I 
(and you) can code following the business rules, you (and me) can design 
unit tests not from "imagination" but from same rules.
Moreover: you can have a surrogate key (to speedup foreign keys) and 
simultaneously put a unique constraint on the columns requiring it. What´s 
the question?


Implementing tests to cover edge cases is much harder than implementing 
business rules in natural (and foreign) keys.



--
Angular momentum makes the world go 'round.



Re: Enforce primary key on every table during dev?

2018-03-01 Thread marcelo



On 01/03/2018 17:32 , David G. Johnston wrote:
On Thu, Mar 1, 2018 at 1:24 PM, Ron Johnson >wrote:


Why have the overhead of a second unique index?  If it's "ease of
joins", then I agree with Francisco Olarte and use the business
logic keys in your joins even though it's a bit of extra work.


​The strongest case, for me, when a surrogate key is highly desirable 
is when there is no truly natural key and the best key for the model 
is potentially alterable. Specific, the "name" of something.  If I add 
myself to a database and make name unique, so David Johnston, then 
someone else comes along with the same name and now I want to add the 
new person as, say David A. Johnston AND rename my existing record to 
David G. Johnston.  I keep the needed uniqueness ​and don't need to 
cobble together other data elements.  Or, if I were to use email 
address as the key the same physical entity can now change their 
address without me having to cascade update all FK instances too. 
Avoiding the FK cascade when enforcing a non-ideal PK is a major good 
reason to assign a surrogate.


David J.


I suffered myself what David said as an example...


---
El software de antivirus Avast ha analizado este correo electrónico en busca de 
virus.
https://www.avast.com/antivirus


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Daevor The Devoted
On Thu, Mar 1, 2018 at 10:36 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Thu, Mar 1, 2018 at 1:32 PM, marcelo  wrote:
>
>>  What´s the question?
>>
>>
> ​Whether the OP, who hasn't come back, knew they were starting a flame war
> by asking this question...
>
> There is no context-less "right place" to place validation logic, nor are
> the various options mutually exclusive.
>
> David J.
> ​
>

This I can wholeheartedly agree with. And my apologies if I came across as
"flaming". Not my intention at all. I'm simply here to learn (and, well,
offer my opinion from time to time :) )


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Daevor The Devoted
On Thu, Mar 1, 2018 at 10:32 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Thu, Mar 1, 2018 at 1:24 PM, Ron Johnson  wrote:
>
>> Why have the overhead of a second unique index?  If it's "ease of joins",
>> then I agree with Francisco Olarte and use the business logic keys in your
>> joins even though it's a bit of extra work.
>>
>
> ​The strongest case, for me, when a surrogate key is highly desirable is
> when there is no truly natural key and the best key for the model is
> potentially alterable.  Specific, the "name" of something.  If I add myself
> to a database and make name unique, so David Johnston, then someone else
> comes along with the same name and now I want to add the new person as, say
> David A. Johnston AND rename my existing record to David G. Johnston.  I
> keep the needed uniqueness ​and don't need to cobble together other data
> elements.  Or, if I were to use email address as the key the same physical
> entity can now change their address without me having to cascade update all
> FK instances too.  Avoiding the FK cascade when enforcing a non-ideal PK is
> a major good reason to assign a surrogate.
>
> David J.
>
>
This is exactly my point: you cannot know when a Business Rule is going to
change. Consider, for example, your Social Security number (or ID number as
we call it in South Africa). This is unique, right?. Tomorrow, however,
data of people from multiple countries gets added to your DB, and BAM! that
ID number is suddenly no longer unique. Business Rules can and do change,
and we do not know what may change in the future. Hence, it is safest to
have the surrogate in place from the start, and avoid the potential
migraine later on.

Disclaimer: this is just my opinion based on my experience (and the pain I
had to go through when Business Rules changed). I have not done any
research or conducted any studies on this.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Rakesh Kumar


> On Mar 1, 2018, at 12:47 , Daevor The Devoted  wrote:
> 
> 
> I was always of the opinion that a mandatory surrogate key (as you describe) 
> is good practice.
> Sure there may be a unique key according to business logic (which may be 
> consist of those

> "ungainly" multiple columns), but guess what, business logic changes, and 
> then you're screwed!


> So using a primary key whose sole purpose is to be a primary key makes 
> perfect sense to me.

I did not get your point.  Can you explain why a change of business logic makes 
it difficult to change existing 
rows with surrogate key.

thanks.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread David G. Johnston
On Thu, Mar 1, 2018 at 1:32 PM, marcelo  wrote:

>  What´s the question?
>
>
​Whether the OP, who hasn't come back, knew they were starting a flame war
by asking this question...

There is no context-less "right place" to place validation logic, nor are
the various options mutually exclusive.

David J.
​


Re: Enforce primary key on every table during dev?

2018-03-01 Thread marcelo



On 01/03/2018 17:21 , Ron Johnson wrote:

On 03/01/2018 02:08 PM, marcelo wrote:



On 01/03/2018 16:42 , Ron Johnson wrote:

On 03/01/2018 01:11 PM, marcelo wrote:


On 01/03/2018 16:00 , Ron Johnson wrote:

[snip]
If your only unique index is a synthetic key, then you can insert 
the same "business data" multiple times with different synthetic 
keys.



--
Angular momentum makes the world go 'round.
IMHO, business logic can and must preclude "garbage insertion". 
Except you are inserting data directly to database using SQL, any 
n-tier architecture will be checking data validity.

bl
Any n-tier architecture that's bug-free.


Do you know about unit testing?


Way Back When Dinosaurs Still Roamed The Earth and I first learned the 
trade, the focus was on proper design instead of throwing crud against 
the wall and hoping tests caught any bugs.  Because, of course, unit 
tests are only as good as you imagination in devising tests.



So, you are fully convinced that there´s no bug free software... Same as 
I (and you) can code following the business rules, you (and me) can 
design unit tests not from "imagination" but from same rules.
Moreover: you can have a surrogate key (to speedup foreign keys) and 
simultaneously put a unique constraint on the columns requiring it. 
What´s the question?


---
El software de antivirus Avast ha analizado este correo electrónico en busca de 
virus.
https://www.avast.com/antivirus




Re: Enforce primary key on every table during dev?

2018-03-01 Thread David G. Johnston
On Thu, Mar 1, 2018 at 1:24 PM, Ron Johnson  wrote:

> Why have the overhead of a second unique index?  If it's "ease of joins",
> then I agree with Francisco Olarte and use the business logic keys in your
> joins even though it's a bit of extra work.
>

​The strongest case, for me, when a surrogate key is highly desirable is
when there is no truly natural key and the best key for the model is
potentially alterable.  Specific, the "name" of something.  If I add myself
to a database and make name unique, so David Johnston, then someone else
comes along with the same name and now I want to add the new person as, say
David A. Johnston AND rename my existing record to David G. Johnston.  I
keep the needed uniqueness ​and don't need to cobble together other data
elements.  Or, if I were to use email address as the key the same physical
entity can now change their address without me having to cascade update all
FK instances too.  Avoiding the FK cascade when enforcing a non-ideal PK is
a major good reason to assign a surrogate.

David J.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Ron Johnson

On 03/01/2018 02:09 PM, Daevor The Devoted wrote:



On Thu, Mar 1, 2018 at 9:00 PM, Ron Johnson > wrote:


On 03/01/2018 12:32 PM, Daevor The Devoted wrote:



On Thu, Mar 1, 2018 at 8:18 PM, Ron Johnson > wrote:


On 03/01/2018 11:47 AM, Daevor The Devoted wrote:


On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar
> wrote:


>Adding a surrogate key to such a table just adds overhead,
although that could be useful
>in case specific rows need updating or deleting without
also modifying the other rows with
>that same data - normally, only insertions and selections
happen on such tables though,
>and updates or deletes are absolutely forbidden -
corrections happen by inserting rows with
>an opposite transaction.

I routinely add surrogate keys like serial col to a table
already having a nice candidate keys
to make it easy to join tables.  SQL starts looking ungainly
when you have a 3 col primary
key and need to join it with child tables.


I was always of the opinion that a mandatory surrogate key (as
you describe) is good practice.
Sure there may be a unique key according to business logic
(which may be consist of those "ungainly" multiple columns), but
guess what, business logic changes, and then you're screwed!


And so you drop the existing index and build a new one.  I've
done it before, and I'll do it again.


So using a primary key whose sole purpose is to be a primary key
makes perfect sense to me.


I can't stand synthetic keys.  By their very nature, they're so
purposelessly arbitrary, and allow you to insert garbage into the
table.


Could you perhaps elaborate on how a surrogate key allows one to
insert garbage into the table? I'm afraid I don't quite get what
you're saying.


If your only unique index is a synthetic key, then you can insert the
same "business data" multiple times with different synthetic keys.


-- 
Angular momentum makes the world go 'round.




That might be where we're talking past each other: I do not advocate for 
the arbitrary primary key being the only unique index. Absolutely not. 
Whatever the business rules say is unique must also have unique indexes. 
If it's a business constraint on the data, it must be enforced in the DB 
(at least, that's how I try to do things).


Why have the overhead of a second unique index?  If it's "ease of joins", 
then I agree with Francisco Olarte and use the business logic keys in your 
joins even though it's a bit of extra work.


--
Angular momentum makes the world go 'round.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread David G. Johnston
On Thu, Mar 1, 2018 at 1:06 PM, Daevor The Devoted 
wrote:

>
>> This seems like hierarchical data
>

​Hence the "this is contrived" disclaimer - but if one allows for
employee-department to be many-to-many, and thus requiring a joining table,
this still applies even if the specific choice to nouns doesn't make sense.

David J.
​


Re: Enforce primary key on every table during dev?

2018-03-01 Thread marcelo



On 01/03/2018 16:42 , Ron Johnson wrote:

On 03/01/2018 01:11 PM, marcelo wrote:


On 01/03/2018 16:00 , Ron Johnson wrote:

[snip]

If your only unique index is a synthetic key, then you can insert
the same "business data" multiple times with different synthetic keys.


--
Angular momentum makes the world go 'round.

IMHO, business logic can and must preclude "garbage insertion".
Except you are inserting data directly to database using SQL, any
n-tier architecture will be checking data validity.


Any n-tier architecture that's bug-free.


Do you know about unit testing?

---
El software de antivirus Avast ha analizado este correo electrónico en busca de 
virus.
https://www.avast.com/antivirus




Re: Enforce primary key on every table during dev?

2018-03-01 Thread Daevor The Devoted
On Thu, Mar 1, 2018 at 8:52 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Thu, Mar 1, 2018 at 11:32 AM, Daevor The Devoted 
> wrote:
>
>> Could you perhaps elaborate on how a surrogate key allows one to insert
>> garbage into the table? I'm afraid I don't quite get what you're saying.
>>
>
> ​A bit contrived but it makes the point:​
>
> *Company:*
> C1 (id c1)
> C2 (id c2)
>
> *Department:*
> C1-D1 (id d1)
> C1-D2 (id d2)
> C2-D1 (id d3)
> C2-D2 (id d4)
>
> *Employee:*
> C1-E1 (id e1)
> C1-E2 (id e2)
> C2-E1 (id e3)
> C2-E2 (id e4)
>
> *​Employee-Department​:*
> e1-d1
> e2-d2
> e3-d2
> e4-d4
>
> The pair e3-d2 is invalid because e3 belongs to company c2 while d2
> belongs to company c1 - but we've hidden the knowledge ​of c# behind the
> surrogate key and now we can insert garbage into employee-department.
>
> David J.
>
>
This seems like hierarchical data, where employee's parent should be
department, and department's parent is company. So it wouldn't be possible
to "insert garbage" since Company is not stored in the Employee table, only
a reference to Department (and Company determined via Department). Isn't
that how normal hierarchical data works?


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Ron Johnson

On 03/01/2018 01:11 PM, marcelo wrote:


On 01/03/2018 16:00 , Ron Johnson wrote:

[snip]
If your only unique index is a synthetic key, then you can insert the 
same "business data" multiple times with different synthetic keys.



--
Angular momentum makes the world go 'round.
IMHO, business logic can and must preclude "garbage insertion". Except you 
are inserting data directly to database using SQL, any n-tier architecture 
will be checking data validity.


Any n-tier architecture that's bug-free.

--
Angular momentum makes the world go 'round.



Re: Enforce primary key on every table during dev?

2018-03-01 Thread Ron Johnson

On 03/01/2018 01:05 PM, Melvin Davidson wrote:



On Thu, Mar 1, 2018 at 2:00 PM, Ron Johnson > wrote:


On 03/01/2018 12:32 PM, Daevor The Devoted wrote:


[snip]



If your only unique index is a synthetic key, then you can insert the
same "business data" multiple times with different synthetic keys.



-- 
Angular momentum makes the world go 'round.



*If you are going to go to the trouble of having a surrogate/synthetic 
key, then you may as well have a primary key , which is much better. *


I completely agree.

--
Angular momentum makes the world go 'round.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread marcelo


On 01/03/2018 16:00 , Ron Johnson wrote:

On 03/01/2018 12:32 PM, Daevor The Devoted wrote:



On Thu, Mar 1, 2018 at 8:18 PM, Ron Johnson > wrote:



On 03/01/2018 11:47 AM, Daevor The Devoted wrote:


On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar
> wrote:


>Adding a surrogate key to such a table just adds overhead,
although that could be useful
>in case specific rows need updating or deleting without
also modifying the other rows with
>that same data - normally, only insertions and selections
happen on such tables though,
>and updates or deletes are absolutely forbidden -
corrections happen by inserting rows with
>an opposite transaction.

I routinely add surrogate keys like serial col to a table
already having a nice candidate keys
to make it easy to join tables.  SQL starts looking ungainly
when you have a 3 col primary
key and need to join it with child tables.


I was always of the opinion that a mandatory surrogate key (as
you describe) is good practice.
Sure there may be a unique key according to business logic
(which may be consist of those "ungainly" multiple columns), but
guess what, business logic changes, and then you're screwed!


And so you drop the existing index and build a new one.  I've
done it before, and I'll do it again.


So using a primary key whose sole purpose is to be a primary key
makes perfect sense to me.


I can't stand synthetic keys.  By their very nature, they're so
purposelessly arbitrary, and allow you to insert garbage into the
table.


Could you perhaps elaborate on how a surrogate key allows one to 
insert garbage into the table? I'm afraid I don't quite get what 
you're saying.


If your only unique index is a synthetic key, then you can insert the 
same "business data" multiple times with different synthetic keys.



--
Angular momentum makes the world go 'round.
IMHO, business logic can and must preclude "garbage insertion". Except 
you are inserting data directly to database using SQL, any n-tier 
architecture will be checking data validity.



---
El software de antivirus Avast ha analizado este correo electrónico en busca de 
virus.
https://www.avast.com/antivirus


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Melvin Davidson
On Thu, Mar 1, 2018 at 2:00 PM, Ron Johnson  wrote:

> On 03/01/2018 12:32 PM, Daevor The Devoted wrote:
>
>
>
> On Thu, Mar 1, 2018 at 8:18 PM, Ron Johnson  wrote:
>
>>
>> On 03/01/2018 11:47 AM, Daevor The Devoted wrote:
>>
>>
>> On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar 
>> wrote:
>>
>>>
>>> >Adding a surrogate key to such a table just adds overhead, although
>>> that could be useful
>>> >in case specific rows need updating or deleting without also modifying
>>> the other rows with
>>> >that same data - normally, only insertions and selections happen on
>>> such tables though,
>>> >and updates or deletes are absolutely forbidden - corrections happen by
>>> inserting rows with
>>> >an opposite transaction.
>>>
>>> I routinely add surrogate keys like serial col to a table already having
>>> a nice candidate keys
>>> to make it easy to join tables.  SQL starts looking ungainly when you
>>> have a 3 col primary
>>> key and need to join it with child tables.
>>>
>>>
>> I was always of the opinion that a mandatory surrogate key (as you
>> describe) is good practice.
>> Sure there may be a unique key according to business logic (which may be
>> consist of those "ungainly" multiple columns), but guess what, business
>> logic changes, and then you're screwed!
>>
>>
>> And so you drop the existing index and build a new one.  I've done it
>> before, and I'll do it again.
>>
>> So using a primary key whose sole purpose is to be a primary key makes
>> perfect sense to me.
>>
>>
>> I can't stand synthetic keys.  By their very nature, they're so
>> purposelessly arbitrary, and allow you to insert garbage into the table.
>>
>
> Could you perhaps elaborate on how a surrogate key allows one to insert
> garbage into the table? I'm afraid I don't quite get what you're saying.
>
>
> If your only unique index is a synthetic key, then you can insert the same
> "business data" multiple times with different synthetic keys.
>
>
> --
> Angular momentum makes the world go 'round.
>

* If you are going to go to the trouble of having a surrogate/synthetic
key, then you may as well have a primary key , which is much better. *

-- 
*Melvin Davidson*
*Maj. Database & Exploration Specialist*
*Universe Exploration Command – UXC*
Employment by invitation only!


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Ron Johnson

On 03/01/2018 12:32 PM, Daevor The Devoted wrote:



On Thu, Mar 1, 2018 at 8:18 PM, Ron Johnson > wrote:



On 03/01/2018 11:47 AM, Daevor The Devoted wrote:


On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar > wrote:


>Adding a surrogate key to such a table just adds overhead,
although that could be useful
>in case specific rows need updating or deleting without also
modifying the other rows with
>that same data - normally, only insertions and selections happen
on such tables though,
>and updates or deletes are absolutely forbidden - corrections
happen by inserting rows with
>an opposite transaction.

I routinely add surrogate keys like serial col to a table already
having a nice candidate keys
to make it easy to join tables.  SQL starts looking ungainly when
you have a 3 col primary
key and need to join it with child tables.


I was always of the opinion that a mandatory surrogate key (as you
describe) is good practice.
Sure there may be a unique key according to business logic (which may
be consist of those "ungainly" multiple columns), but guess what,
business logic changes, and then you're screwed!


And so you drop the existing index and build a new one. I've done it
before, and I'll do it again.


So using a primary key whose sole purpose is to be a primary key
makes perfect sense to me.


I can't stand synthetic keys.  By their very nature, they're so
purposelessly arbitrary, and allow you to insert garbage into the table.


Could you perhaps elaborate on how a surrogate key allows one to insert 
garbage into the table? I'm afraid I don't quite get what you're saying.


If your only unique index is a synthetic key, then you can insert the same 
"business data" multiple times with different synthetic keys.



--
Angular momentum makes the world go 'round.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread David G. Johnston
On Thu, Mar 1, 2018 at 11:32 AM, Daevor The Devoted 
wrote:

> Could you perhaps elaborate on how a surrogate key allows one to insert
> garbage into the table? I'm afraid I don't quite get what you're saying.
>

​A bit contrived but it makes the point:​

*Company:*
C1 (id c1)
C2 (id c2)

*Department:*
C1-D1 (id d1)
C1-D2 (id d2)
C2-D1 (id d3)
C2-D2 (id d4)

*Employee:*
C1-E1 (id e1)
C1-E2 (id e2)
C2-E1 (id e3)
C2-E2 (id e4)

*​Employee-Department​:*
e1-d1
e2-d2
e3-d2
e4-d4

The pair e3-d2 is invalid because e3 belongs to company c2 while d2 belongs
to company c1 - but we've hidden the knowledge ​of c# behind the surrogate
key and now we can insert garbage into employee-department.

David J.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Francisco Olarte
On Thu, Mar 1, 2018 at 5:22 PM, Steven Lembark  wrote:
> If you can say that "rows containing the same values are not
> duplicates"

Not a native speaker, but "Rows having the same values" seems to me
the definition of duplicate ( ;-), J.K. )

> then you have a database that cannot be queried, audited,
> or managed effectively. The problem is that you cannot identify the
> rows, and thus cannot select related ones, update them (e.g., to
> expire outdated records), or validate the content against any external
> values (e.g., audit POS tapes using the database).

Good point. All the times I've found myself with complete duplicates
allowed I've alwasy found the correct model is no duplicates + count
field ( with possible splits as you pointed below ). I would not have
a "marbles" table with (red, red, blue, white, red, white), I would
switch it to red=3, blue=1, white=2.

Francisco Olarte.



Re: Enforce primary key on every table during dev?

2018-03-01 Thread Francisco Olarte
On Thu, Mar 1, 2018 at 1:07 PM, Rakesh Kumar  wrote:
...
> I routinely add surrogate keys like serial col to a table already having a 
> nice candidate keys
> to make it easy to join tables.  SQL starts looking ungainly when you have a 
> 3 col primary
> key and need to join it with child tables.

It does, but many times useful, let me explain:

table currencies ( code text, description text), primary key code (
i.e. "USD", "US Dollars" )
table sellers ( currency text, id number, .), primary key
(currency, id), foreign key currency references currencies
table buyers ( currency text, id number, .), primary key
(currency, id)  foreign key currency references currencies
table transactions ( currency text, seller_id number, buyer_id number,
trans_id number )
   primery key trans_id,
   foreign key currency references currencies,
   foreign key (currency, seller_id ) references sellers,
   foreign key (currency, buyer_id ) references buyers

This is a bit unwieldy, but it expreses my example constraint, buyers
can only buy from a seller with the same currency, there is no way to
insert a cross-currency transaction.

Of course, 3 femtoseconds after deployment the PHB will decide you can
do cross-currency sales.

Francisco Olarte.



Re: Enforce primary key on every table during dev?

2018-03-01 Thread Daevor The Devoted
On Thu, Mar 1, 2018 at 8:18 PM, Ron Johnson  wrote:

> On 03/01/2018 11:47 AM, Daevor The Devoted wrote:
>
>
> On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar 
> wrote:
>
>>
>> >Adding a surrogate key to such a table just adds overhead, although that
>> could be useful
>> >in case specific rows need updating or deleting without also modifying
>> the other rows with
>> >that same data - normally, only insertions and selections happen on such
>> tables though,
>> >and updates or deletes are absolutely forbidden - corrections happen by
>> inserting rows with
>> >an opposite transaction.
>>
>> I routinely add surrogate keys like serial col to a table already having
>> a nice candidate keys
>> to make it easy to join tables.  SQL starts looking ungainly when you
>> have a 3 col primary
>> key and need to join it with child tables.
>>
>>
> I was always of the opinion that a mandatory surrogate key (as you
> describe) is good practice.
> Sure there may be a unique key according to business logic (which may be
> consist of those "ungainly" multiple columns), but guess what, business
> logic changes, and then you're screwed!
>
>
> And so you drop the existing index and build a new one.  I've done it
> before, and I'll do it again.
>
> So using a primary key whose sole purpose is to be a primary key makes
> perfect sense to me.
>
>
> I can't stand synthetic keys.  By their very nature, they're so
> purposelessly arbitrary, and allow you to insert garbage into the table.
>
> --
> Angular momentum makes the world go 'round.
>

Could you perhaps elaborate on how a surrogate key allows one to insert
garbage into the table? I'm afraid I don't quite get what you're saying.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Francisco Olarte
On Thu, Mar 1, 2018 at 9:20 AM, Alban Hertroys  wrote:

> Not to mention that not all types of tables necessarily have suitable 
> candidates for a primary key.

They do if they are in 1NF. ( no dupes alllowed )

> An example of such tables is a monetary transaction table that contains 
> records for deposits and withdrawals to accounts. It will have lots of 
> foreign key references to other tables, but rows containing the same values 
> are probably not duplicates.

That's a bad example. They would normally have a transaction id, or a
timestamp, or a sequence counter. PKs can expand all non-nullable
columns. You could try to come with a real example, but all the times
I've found these in one of my dessigns is because I didn't correctly
model the "real world".

> Adding a surrogate key to such a table just adds overhead, although that 
> could be useful in case specific rows need updating or deleting without also 
> modifying the other rows with that same data - normally, only insertions and 
> selections happen on such tables though, and updates or deletes are 
> absolutely forbidden - corrections happen by inserting rows with an opposite 
> transaction.

And normally you would need to pinpoint an individual transaction for
selection, hard to do if you do not have a pk.

Francisco Olarte.



Re: Enforce primary key on every table during dev?

2018-03-01 Thread Francisco Olarte
Melvin:

On Thu, Mar 1, 2018 at 1:47 AM, Melvin Davidson  wrote:
> But this begs the question,
> why are "developers" allowed to design database tables? That should be the
> job of the DBA!

That's the DBA wearing her developer hat. ( I agree with the spirit )

Francisco Olarte.



Re: Enforce primary key on every table during dev?

2018-03-01 Thread Ron Johnson

On 03/01/2018 11:47 AM, Daevor The Devoted wrote:


On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar > wrote:



>Adding a surrogate key to such a table just adds overhead, although
that could be useful
>in case specific rows need updating or deleting without also
modifying the other rows with
>that same data - normally, only insertions and selections happen on
such tables though,
>and updates or deletes are absolutely forbidden - corrections happen
by inserting rows with
>an opposite transaction.

I routinely add surrogate keys like serial col to a table already
having a nice candidate keys
to make it easy to join tables.  SQL starts looking ungainly when you
have a 3 col primary
key and need to join it with child tables.


I was always of the opinion that a mandatory surrogate key (as you 
describe) is good practice.
Sure there may be a unique key according to business logic (which may be 
consist of those "ungainly" multiple columns), but guess what, business 
logic changes, and then you're screwed!


And so you drop the existing index and build a new one.  I've done it 
before, and I'll do it again.


So using a primary key whose sole purpose is to be a primary key makes 
perfect sense to me.


I can't stand synthetic keys.  By their very nature, they're so 
purposelessly arbitrary, and allow you to insert garbage into the table.


--
Angular momentum makes the world go 'round.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Daevor The Devoted
On Thu, Mar 1, 2018 at 2:07 PM, Rakesh Kumar  wrote:

>
> >Adding a surrogate key to such a table just adds overhead, although that
> could be useful
> >in case specific rows need updating or deleting without also modifying
> the other rows with
> >that same data - normally, only insertions and selections happen on such
> tables though,
> >and updates or deletes are absolutely forbidden - corrections happen by
> inserting rows with
> >an opposite transaction.
>
> I routinely add surrogate keys like serial col to a table already having a
> nice candidate keys
> to make it easy to join tables.  SQL starts looking ungainly when you have
> a 3 col primary
> key and need to join it with child tables.
>
>
I was always of the opinion that a mandatory surrogate key (as you
describe) is good practice.
Sure there may be a unique key according to business logic (which may be
consist of those "ungainly" multiple columns), but guess what, business
logic changes, and then you're screwed! So using a primary key whose sole
purpose is to be a primary key makes perfect sense to me.


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Ron Johnson

On 03/01/2018 11:07 AM, Steve Atkins wrote:
[snip]

"Every table should have a primary key, whether natural or surrogate"
is a great guideline, and everyone should follow it until they understand
when they shouldn't.


Most people think they know, but they don't.


--
Angular momentum makes the world go 'round.



Re: Enforce primary key on every table during dev?

2018-03-01 Thread Steve Atkins
(Melvin's mail doesn't quote properly; I've tried to fix it but may have broken 
something)

> On Mar 1, 2018, at 8:50 AM, Melvin Davidson  wrote:
> 
> 
> On Thu, Mar 1, 2018 at 11:43 AM, Alban Hertroys  wrote:
> 
> 
>> On 1 March 2018 at 17:22, Steven Lembark  wrote:
>>> If you have a design with un-identified data it means that you havn't
>>> normalized it properly: something is missing from the table with
>>> un-identifiable rows.
>> 
>> While that holds true for a relational model, in reporting for
>> example, it is common practice to denormalize data without a
>> requirement to be able to identify a single record. The use case for
>> such tables is providing quick aggregates on the data. Often this
>> deals with derived data. It's not that uncommon to not have a primary
>> or even a uniquely identifiable key on such tables.
>> 
>> I do not disagree that having a primary key on a table is a bad thing,
>> but I do disagree that a primary key is a requirement for all tables.
>> 
>> More generally: For every rule there are exceptions. Even for this one.
> 
> You may perceive that to be "common practice", but in reality it is not, and 
> in fact a bad one. As was previously stated, PosgreSQL is a _relational_ 
> database,
> and breaking that premise will eventually land you in very big trouble. There 
> is no solid reason not to a primary key for every table.

Sure there is. It's an additional index and significant additional insert / 
update overhead.
If you're never going to retrieve single rows, nor join in such a way that 
uniqueness
on this side is required there's no need for a unique identifier.

It's a rare case that you won't want a primary key, and I'll often add 
a surrogate one for convenience even when it's not actually needed,
but there are cases where it's appropriate not to have one, even in
OLTP work. Log tables, for example.

"Every table should have a primary key, whether natural or surrogate"
is a great guideline, and everyone should follow it until they understand
when they shouldn't.

More generally: For every rule there are exceptions. Even for this one.

Cheers,
  Steve


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Melvin Davidson
On Thu, Mar 1, 2018 at 11:43 AM, Alban Hertroys  wrote:

> On 1 March 2018 at 17:22, Steven Lembark  wrote:
> >
> >> On 03/01/2018 02:20 AM, Alban Hertroys wrote:
> >> [snip]
> >> > Not to mention that not all types of tables necessarily have
> >> > suitable candidates for a primary key. You could add a surrogate
> >> > key based on a serial type, but in such cases that may not serve
> >> > any purpose other than to have some arbitrary primary key.
> >> >
> >> > An example of such tables is a monetary transaction table that
> >> > contains records for deposits and withdrawals to accounts.
>
> (...)
>
> > Start with Date's notion that a database exists to correclty represent
> > data about the real world. Storing un-identified data breaks this
> > since we have no idea what the data means or have any good way of
> > getting it back out. Net result is that any workable relational
> > database will have at least one candidate key for any table in it.
>
> (...)
>
> > If you have a design with un-identified data it means that you havn't
> > normalized it properly: something is missing from the table with
> > un-identifiable rows.
>
> While that holds true for a relational model, in reporting for
> example, it is common practice to denormalize data without a
> requirement to be able to identify a single record. The use case for
> such tables is providing quick aggregates on the data. Often this
> deals with derived data. It's not that uncommon to not have a primary
> or even a uniquely identifiable key on such tables.
>
> I do not disagree that having a primary key on a table is a bad thing,
> but I do disagree that a primary key is a requirement for all tables.
>
> More generally: For every rule there are exceptions. Even for this one.
> --
> If you can't see the forest for the trees,
> Cut the trees and you'll see there is no forest.
>
>



*> it is common practice to denormalize data without a>requirement to be
able to identify a single record *

*You may perceive that to be "common practice", but in reality it is not,
and in fact a bad one. As was previously stated, PosgreSQL is a
_relational_ database,*
*and breaking that premise will eventually land you in very big trouble.
There is no solid reason not to a primary key for every table.*

-- 
*Melvin Davidson*
*Maj. Database & Exploration Specialist*
*Universe Exploration Command – UXC*
Employment by invitation only!


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Alban Hertroys
On 1 March 2018 at 17:22, Steven Lembark  wrote:
>
>> On 03/01/2018 02:20 AM, Alban Hertroys wrote:
>> [snip]
>> > Not to mention that not all types of tables necessarily have
>> > suitable candidates for a primary key. You could add a surrogate
>> > key based on a serial type, but in such cases that may not serve
>> > any purpose other than to have some arbitrary primary key.
>> >
>> > An example of such tables is a monetary transaction table that
>> > contains records for deposits and withdrawals to accounts.

(...)

> Start with Date's notion that a database exists to correclty represent
> data about the real world. Storing un-identified data breaks this
> since we have no idea what the data means or have any good way of
> getting it back out. Net result is that any workable relational
> database will have at least one candidate key for any table in it.

(...)

> If you have a design with un-identified data it means that you havn't
> normalized it properly: something is missing from the table with
> un-identifiable rows.

While that holds true for a relational model, in reporting for
example, it is common practice to denormalize data without a
requirement to be able to identify a single record. The use case for
such tables is providing quick aggregates on the data. Often this
deals with derived data. It's not that uncommon to not have a primary
or even a uniquely identifiable key on such tables.

I do not disagree that having a primary key on a table is a bad thing,
but I do disagree that a primary key is a requirement for all tables.

More generally: For every rule there are exceptions. Even for this one.
-- 
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.



Re: Enforce primary key on every table during dev?

2018-03-01 Thread Steven Lembark

> On 03/01/2018 02:20 AM, Alban Hertroys wrote:
> [snip]
> > Not to mention that not all types of tables necessarily have
> > suitable candidates for a primary key. You could add a surrogate
> > key based on a serial type, but in such cases that may not serve
> > any purpose other than to have some arbitrary primary key.
> >
> > An example of such tables is a monetary transaction table that
> > contains records for deposits and withdrawals to accounts. It will
> > have lots of foreign key references to other tables, but rows
> > containing the same values are probably not duplicates. Adding a
> > surrogate key to such a table just adds overhead, although that
> > could be useful in case specific rows need updating or deleting
> > without also modifying the other rows with that same data -
> > normally, only insertions and selections happen on such tables
> > though, and updates or deletes are absolutely forbidden -
> > corrections happen by inserting rows with an opposite transaction.  
> 
> Wouldn't the natural pk of such a table be timestamp+seqno, just as
> the natural pk of a transaction_detail table be transaction_no+seqno?

Start with Date's notion that a database exists to correclty represent
data about the real world. Storing un-identified data breaks this 
since we have no idea what the data means or have any good way of 
getting it back out. Net result is that any workable relational 
database will have at least one candidate key for any table in it.

If you can say that "rows containing the same values are not
duplicates" then you have a database that cannot be queried, audited,
or managed effectively. The problem is that you cannot identify the 
rows, and thus cannot select related ones, update them (e.g., to 
expire outdated records), or validate the content against any external
values (e.g., audit POS tapes using the database).

In the case of a monitary transaction you need a transaction
table, which will have most of the FK's, and a ledger for the 
transaction amounts.

A minimum candidate key for the transaction table would be account, 
timestamp, authorizing customer id, and channel. This allows two 
people to, say, make deposits at the same time or the same authorizing 
account (e.g., a credit card number) to be processed at the same time 
in two places.

The data for a transaction would include things like the final status, 
in-house authorizing agent, completion time.

The ledger entries would include the transaction SK, sequence within
the transaction, amount, and account. The ledger's candidate key is 
a transaction SK + sequence number -- the amount and account don't 
work because you may end up, say, making multiple deposits of $10
to your checking account on the same transaction.

The ledger's sequence value can be problematic, requiring external
code or moderately messy triggers to manage. Timestamps don't always
work and are subject to clock-skew. One way to avoid this is require
that a single transaction contain only unique amounts and accounts.
At that point the ledger becomes a degenerate table of  transaction id, 
amount, account (i.e., the entire table is nothing but a unique index).

This might require generating multiple database transactions for a
single external process (e.g., a customer walking up to the teller)
but does simplify processing quite a bit.

In both cases, having an SK on the ledger is useful for audit queries,
which might have to process a large number of ledger entries in code.
Extracting the ledger SK's in one query and walking down them using
a unique index can be more effecient than having to extract the values.

Either way, you can identify all of the transactions as unique and 
all of the ledger entries for that transaction. At that point the 
database can be queried for data, updated as necessary, audited 
against external data.

If you have a design with un-identified data it means that you havn't
normalized it properly: something is missing from the table with 
un-identifiable rows.

-- 
Steven Lembark   1505 National Ave
Workhorse Computing Rockford, IL 61103
lemb...@wrkhors.com+1 888 359 3508



Re: Enforce primary key on every table during dev?

2018-03-01 Thread bto...@computer.org
- Original Message -
> From: "Tim Cross" 
> Sent: Wednesday, February 28, 2018 4:07:43 PM
> 
> Jeremy Finzel  writes:
> 
> > We want to enforce a policy, partly just to protect those who might forget,
> > for every table in a particular schema to have a primary key.  This can't
> > be done with event triggers as far as I can see, because it is quite
> > legitimate to do:
> >
> > BEGIN;
> > CREATE TABLE foo (id int);
> > ALTER TABLE foo ADD PRIMARY KEY (id);
> > COMMIT;
> >
> > It would be nice to have some kind of "deferrable event trigger" or some
> > way to enforce that no transaction commits which added a table without a
> > primary key.
> >
> 
> I think you would be better off having an automated report which alerts
> you to tables lacking a primary key and deal with that policy through
> other means. Using triggers in this way often leads to unexpected
> behaviour and difficult to identify bugs. The policy is a management
> policy and probably should be dealt with via management channels rather
> than technical ones. Besides, the likely outcome will be your developers
> will just adopt the practice of adding a serial column to every table,
> which in itself doesn't really add any value.

I concur with other respondents that suggest this is more of a policy issue. In 
fact, you yourself identify it right there in the first sentence as a policy 
issue! 

One tool that changed my life (as a PostgreSQL enthusiast) forever is David 
Wheeler's pgTAP (http://pgtap.org/) tool. It includes a suite of functionality 
to assess the database schema via automated testing. Part of a rigorous 
development environment might include using this tool so that any 
application/database changes be driven by tests, and then your code review 
process would assure that the appropriate tests are added to the pgTAP script 
to confirm that changes meet a policy standard such as what you are demanding. 
I can't imagine doing PostgreSQL development without it now.

Same guy also produced a related tool called Sqitch (http://sqitch.org/) for 
data base change management. Use these tools together, so that before a 
developer is allowed to check in a feature branch, your teams' code review 
process maintains rigorous oversight of modifications.

-- B




Re: Enforce primary key on every table during dev?

2018-03-01 Thread Rakesh Kumar

>Adding a surrogate key to such a table just adds overhead, although that could 
>be useful 
>in case specific rows need updating or deleting without also modifying the 
>other rows with 
>that same data - normally, only insertions and selections happen on such 
>tables though, 
>and updates or deletes are absolutely forbidden - corrections happen by 
>inserting rows with 
>an opposite transaction.

I routinely add surrogate keys like serial col to a table already having a nice 
candidate keys
to make it easy to join tables.  SQL starts looking ungainly when you have a 3 
col primary
key and need to join it with child tables.



Re: Enforce primary key on every table during dev?

2018-03-01 Thread Ron Johnson

On 03/01/2018 02:20 AM, Alban Hertroys wrote:
[snip]

Not to mention that not all types of tables necessarily have suitable 
candidates for a primary key. You could add a surrogate key based on a serial 
type, but in such cases that may not serve any purpose other than to have some 
arbitrary primary key.

An example of such tables is a monetary transaction table that contains records 
for deposits and withdrawals to accounts. It will have lots of foreign key 
references to other tables, but rows containing the same values are probably 
not duplicates.
Adding a surrogate key to such a table just adds overhead, although that could 
be useful in case specific rows need updating or deleting without also 
modifying the other rows with that same data - normally, only insertions and 
selections happen on such tables though, and updates or deletes are absolutely 
forbidden - corrections happen by inserting rows with an opposite transaction.


Wouldn't the natural pk of such a table be timestamp+seqno, just as the 
natural pk of a transaction_detail table be transaction_no+seqno?


--
Angular momentum makes the world go 'round.



Re: Enforce primary key on every table during dev?

2018-03-01 Thread Alban Hertroys

> On 1 Mar 2018, at 1:47, Melvin Davidson  wrote:

> > I think you would be better off having an automated report which alerts
> >you to tables lacking a primary key and deal with that policy through
> >other means. 
> 
> Perhaps a better solution is to have a meeting with the developers and 
> explain to them 
> WHY the policy of enforcing a primary key is important. Also, explain the 
> purpose of
> primary keys and why it is not always suitable to just use an integer or 
> serial as the key,
> but rather why natural unique (even multi column) keys are better. But this 
> begs the question, 
> why are "developers" allowed to design database tables? That should be the 
> job of the DBA! At
> the very minimum, the DBA should be reviewing and have the authority to 
> approve of disapprove 
> of table/schema designs/changes .

Not to mention that not all types of tables necessarily have suitable 
candidates for a primary key. You could add a surrogate key based on a serial 
type, but in such cases that may not serve any purpose other than to have some 
arbitrary primary key.

An example of such tables is a monetary transaction table that contains records 
for deposits and withdrawals to accounts. It will have lots of foreign key 
references to other tables, but rows containing the same values are probably 
not duplicates.
Adding a surrogate key to such a table just adds overhead, although that could 
be useful in case specific rows need updating or deleting without also 
modifying the other rows with that same data - normally, only insertions and 
selections happen on such tables though, and updates or deletes are absolutely 
forbidden - corrections happen by inserting rows with an opposite transaction.

Regards,

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.




Re: Enforce primary key on every table during dev?

2018-02-28 Thread Melvin Davidson
On Wed, Feb 28, 2018 at 4:07 PM, Tim Cross  wrote:

>
> Jeremy Finzel  writes:
>
> > We want to enforce a policy, partly just to protect those who might
> forget,
> > for every table in a particular schema to have a primary key.  This can't
> > be done with event triggers as far as I can see, because it is quite
> > legitimate to do:
> >
> > BEGIN;
> > CREATE TABLE foo (id int);
> > ALTER TABLE foo ADD PRIMARY KEY (id);
> > COMMIT;
> >
> > It would be nice to have some kind of "deferrable event trigger" or some
> > way to enforce that no transaction commits which added a table without a
> > primary key.
> >
>
> I think you would be better off having an automated report which alerts
> you to tables lacking a primary key and deal with that policy through
> other means. Using triggers in this way often leads to unexpected
> behaviour and difficult to identify bugs. The policy is a management
> policy and probably should be dealt with via management channels rather
> than technical ones. Besides, the likely outcome will be your developers
> will just adopt the practice of adding a serial column to every table,
> which in itself doesn't really add any value.
>
> Tim
>
>
> --
> Tim Cross
>
>










*> I think you would be better off having an automated report which
alerts>you to tables lacking a primary key and deal with that policy
through>other means. Perhaps a better solution is to have a meeting with
the developers and explain to them WHY the policy of enforcing a primary
key is important. Also, explain the purpose ofprimary keys and why it is
not always suitable to just use an integer or serial as the key,but rather
why natural unique (even multi column) keys are better. But this begs the
question, why are "developers" allowed to design database tables? That
should be the job of the DBA! Atthe very minimum, the DBA should be
reviewing and have the authority to approve of disapprove of table/schema
designs/changes .*

-- 
*Melvin Davidson*
*Maj. Database & Exploration Specialist*
*Universe Exploration Command – UXC*
Employment by invitation only!


Re: Enforce primary key on every table during dev?

2018-02-28 Thread Tim Cross

Jeremy Finzel  writes:

> We want to enforce a policy, partly just to protect those who might forget,
> for every table in a particular schema to have a primary key.  This can't
> be done with event triggers as far as I can see, because it is quite
> legitimate to do:
>
> BEGIN;
> CREATE TABLE foo (id int);
> ALTER TABLE foo ADD PRIMARY KEY (id);
> COMMIT;
>
> It would be nice to have some kind of "deferrable event trigger" or some
> way to enforce that no transaction commits which added a table without a
> primary key.
>

I think you would be better off having an automated report which alerts
you to tables lacking a primary key and deal with that policy through
other means. Using triggers in this way often leads to unexpected
behaviour and difficult to identify bugs. The policy is a management
policy and probably should be dealt with via management channels rather
than technical ones. Besides, the likely outcome will be your developers
will just adopt the practice of adding a serial column to every table,
which in itself doesn't really add any value.

Tim


-- 
Tim Cross



Re: Enforce primary key on every table during dev?

2018-02-28 Thread David G. Johnston
On Wed, Feb 28, 2018 at 6:34 AM, Jeremy Finzel  wrote:

> We want to enforce a policy, partly just to protect those who might
> forget, for every table in a particular schema to have a primary key.  This
> can't be done with event triggers as far as I can see, because it is quite
> legitimate to do:
>

​Add a query to your test suite that queries the catalogs and fails if this
policy is violated.  There is nothing in a running PostgreSQL server
instance that is going to enforce this for you.

David J.


Re: Enforce primary key on every table during dev?

2018-02-28 Thread John McKown
On Wed, Feb 28, 2018 at 7:57 AM, Adrian Klaver 
wrote:

> On 02/28/2018 05:52 AM, John McKown wrote:
>
>> On Wed, Feb 28, 2018 at 7:34 AM, Jeremy Finzel > >wrote:
>>
>> We want to enforce a policy, partly just to protect those who might
>> forget, for every table in a particular schema to have a primary
>> key.  This can't be done with event triggers as far as I can see,
>> because it is quite legitimate to do:
>>
>> BEGIN;
>> CREATE TABLE foo (id int);
>> ALTER TABLE foo ADD PRIMARY KEY (id);
>> COMMIT;
>>
>> It would be nice to have some kind of "deferrable event trigger" or
>> some way to enforce that no transaction commits which added a table
>> without a primary key.
>>
>> Any ideas?
>>
>> Thanks,
>> Jeremy
>>
>>
>>
>> ​What stops somebody from doing:
>>
>> CREATE TABLE foo (filler text primary key default null, realcol1 int,
>> realcol2 text);
>>
>> And then just never bother to ever insert anything into the column
>> FILLER? It fulfills your stated requirement​ of every table having a
>>
>
> Then you would get this:
>
> test=# CREATE TABLE foo (filler text primary key default null, realcol1
> int, realcol2 text);
> CREATE TABLE
> test=# insert into  foo (realcol1, realcol2) values (1, 'test');
> ERROR:  null value in column "filler" violates not-null constraint
> DETAIL:  Failing row contains (null, 1, test).
>

​Hum, it's been so long, I totally forgot. Which makes me wonder why the
parser doesn't "know" that a default of NULL for a primary key is going to
fail anyway and flag it at CREATE time. Oh, well. Thanks.​



>
> primary key. Of course, you could amend the policy to say a "non-NULL
>> primary key".
>>
>>
>>
>> --
>> I have a theory that it's impossible to prove anything, but I can't prove
>> it.
>>
>> Maranatha! <><
>> John McKown
>>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>



-- 
I have a theory that it's impossible to prove anything, but I can't prove
it.

Maranatha! <><
John McKown


Re: Enforce primary key on every table during dev?

2018-02-28 Thread Melvin Davidson
On Wed, Feb 28, 2018 at 8:57 AM, Adrian Klaver 
wrote:

> On 02/28/2018 05:52 AM, John McKown wrote:
>
>> On Wed, Feb 28, 2018 at 7:34 AM, Jeremy Finzel > >wrote:
>>
>> We want to enforce a policy, partly just to protect those who might
>> forget, for every table in a particular schema to have a primary
>> key.  This can't be done with event triggers as far as I can see,
>> because it is quite legitimate to do:
>>
>> BEGIN;
>> CREATE TABLE foo (id int);
>> ALTER TABLE foo ADD PRIMARY KEY (id);
>> COMMIT;
>>
>> It would be nice to have some kind of "deferrable event trigger" or
>> some way to enforce that no transaction commits which added a table
>> without a primary key.
>>
>> Any ideas?
>>
>> Thanks,
>> Jeremy
>>
>>
>>
>> ​What stops somebody from doing:
>>
>> CREATE TABLE foo (filler text primary key default null, realcol1 int,
>> realcol2 text);
>>
>> And then just never bother to ever insert anything into the column
>> FILLER? It fulfills your stated requirement​ of every table having a
>>
>
> Then you would get this:
>
> test=# CREATE TABLE foo (filler text primary key default null, realcol1
> int, realcol2 text);
> CREATE TABLE
> test=# insert into  foo (realcol1, realcol2) values (1, 'test');
> ERROR:  null value in column "filler" violates not-null constraint
> DETAIL:  Failing row contains (null, 1, test).
>
>
> primary key. Of course, you could amend the policy to say a "non-NULL
>> primary key".
>>
>>
>>
>> --
>> I have a theory that it's impossible to prove anything, but I can't prove
>> it.
>>
>> Maranatha! <><
>> John McKown
>>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
As Adrian pointed out, by definition, PK's create a constraint which are
NOT NULLABLE;

Here is the SQL to check for tables with no primary key.









*SELECT n.nspname,c.relname as table,
c.reltuples::bigint   FROM pg_class c JOIN pg_namespace n ON (n.oid
=c.relnamespace ) WHERE relkind = 'r' AND   relhaspkey =
FALSEORDER BY n.nspname, c.relname;*

-- 
*Melvin Davidson*
*Maj. Database & Exploration Specialist*
*Universe Exploration Command – UXC*
Employment by invitation only!


Re: Enforce primary key on every table during dev?

2018-02-28 Thread Adrian Klaver

On 02/28/2018 05:52 AM, John McKown wrote:
On Wed, Feb 28, 2018 at 7:34 AM, Jeremy Finzel >wrote:


We want to enforce a policy, partly just to protect those who might
forget, for every table in a particular schema to have a primary
key.  This can't be done with event triggers as far as I can see,
because it is quite legitimate to do:

BEGIN;
CREATE TABLE foo (id int);
ALTER TABLE foo ADD PRIMARY KEY (id);
COMMIT;

It would be nice to have some kind of "deferrable event trigger" or
some way to enforce that no transaction commits which added a table
without a primary key.

Any ideas?

Thanks,
Jeremy



​What stops somebody from doing:

CREATE TABLE foo (filler text primary key default null, realcol1 int, 
realcol2 text);


And then just never bother to ever insert anything into the column 
FILLER? It fulfills your stated requirement​ of every table having a 


Then you would get this:

test=# CREATE TABLE foo (filler text primary key default null, realcol1 
int, realcol2 text);

CREATE TABLE
test=# insert into  foo (realcol1, realcol2) values (1, 'test');
ERROR:  null value in column "filler" violates not-null constraint
DETAIL:  Failing row contains (null, 1, test).

primary key. Of course, you could amend the policy to say a "non-NULL 
primary key".




--
I have a theory that it's impossible to prove anything, but I can't 
prove it.


Maranatha! <><
John McKown



--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Enforce primary key on every table during dev?

2018-02-28 Thread John McKown
On Wed, Feb 28, 2018 at 7:34 AM, Jeremy Finzel  wrote:

> We want to enforce a policy, partly just to protect those who might
> forget, for every table in a particular schema to have a primary key.  This
> can't be done with event triggers as far as I can see, because it is quite
> legitimate to do:
>
> BEGIN;
> CREATE TABLE foo (id int);
> ALTER TABLE foo ADD PRIMARY KEY (id);
> COMMIT;
>
> It would be nice to have some kind of "deferrable event trigger" or some
> way to enforce that no transaction commits which added a table without a
> primary key.
>
> Any ideas?
>
> Thanks,
> Jeremy
>


​What stops somebody from doing:

CREATE TABLE foo (filler text primary key default null, realcol1 int,
realcol2 text);

And then just never bother to ever insert anything into the column FILLER?
It fulfills your stated requirement​ of every table having a primary key.
Of course, you could amend the policy to say a "non-NULL primary key".



-- 
I have a theory that it's impossible to prove anything, but I can't prove
it.

Maranatha! <><
John McKown