Re: ​Re: UUID vs Longint primary key

2017-08-08 Thread David Adams via 4D_Tech
> This all started with David making a broad blanket statement about "data
> integrity" and "row duplication" and how using "synthetic" record ID keys
> ruined the ability to automatically​ ​filter out "dupes". (I _think_ that
> was your point David. Please correct me if not.) And in that strict sense,
> if a "row" is really actually

I've been a bit taken aback at how this thread has gone, to be honest.

My point was really basic: Bolding on a random number field makes a row
unique, it doesn't make the underlying *data* unique.

That's it. That's more-or-less me using one of my standard techniques:

1) Find the obvious.
2) Draw a circle around it.
3) Declare it mine.

Patent pending...

Seriously, I'm saying something that just shouldn't be even the tiniest bit
contentious. That point is nothing but an established principle from
database design, it's not something I'd normally consider debatable. I'd
put it in the same category as when someone says, "Race condition? I've
figured out a shortcut." No you haven't. That's pretty much in line with
things the patent office won't even review:

* Perpetual motion machines.

* Squirrel-proof bird feeders.

Until such time as the fundamental laws of nature change, neither of those
inventions can work.

A lot of the stuff about duplicates doesn't bear on the design question
about the table's columns.

Keller 123
Keller 123
Keler 123

How many rooms are there? You have no way to know. It could be 1, 2, or 3:

1: You've got a pure duplicate and a typo-based duplicate.
2: You've got a pure duplicate and two buildings with very similar names.
3: You've got two buildings with very similar names *and you are missing
enough data to identify rows uniquely.*

In that last case,there's a design question. If you haven't identified
enough columns of *data* from the real world to uniquely identify each row,
you don't have a data model that can be implemented in a relational
database. I mean, sure, you can use a tool to store the data - but I don't
see how you model that. What makes each row unique? I mean out in the real
world. Perhaps there's a floor, or a direction, or some kind of vernacular
detail that actual people use to distinguish rooms. In which case, you've
found a missing field! That's part of the point of modeling. But what if
there really is nothing? Well, then it's pretty common to add something,
like a number. But that then (generally) needs to flow out into the real
world. So, you synthesize data in the database, add it to the row *and* to
the real world. Serial numbers are a perfect example of this. Same with
account numbers, customer numbers, etc. And UUIDs are functionally serial
numbers. They have nothing at all to do with the data in the row, they're
purely an implementation-level convenience.

As some people seem to be getting the impression I'm against UUIDs, that's
incorrect. I'm glad they're native (I used them sooner than that), I've got
nothing against them. But they are what they are and not more. Any
complaints I had about how they were implemented in 4D is irrelevant now as
it's been since V14. Too old to matter. I also have been getting the
impression when people say "primary key", they don't always appreciate what
a "key" really is - an attribute or set of attributes that are entirely
about the row of data that also uniquely identify a row. That's a key...off
the top of my head, I'm sure there's a better summary out there. The 4D
world has long had a peculiarly hostile relationship to normalization,
starting with 4D itself (subtables, wrong examples in the manuals, etc.)
I've seen the same cropping up again with how object fields are being
promoted. (Slapping some JSON in a text field and calling your system NoSQL
isn't necessarily helpful...)

But just slapping a random-but-unique number (sequential or not) onto the
row? That easily masks real duplicates. And why other databases prominently
support multi-field constraints on tables - for precisely this reason.
Again, you *can* do this in 4D with a compound index set to unique. So, the
feature exists, I just haven't seen it being used widely. Perhaps it
because engine-level errors feel disproportionately painful to deal with in
4D? That could just be me. Sincerely, I may be more allergic to 4D
engine-level errors than others. I pretty much turn off the unique
attribute and check uniqueness myself. Hmmm. I might be missing upping my
game here. I kind of remember that years ago hitting a duplicate error
wasn't easy to trap for and got ugly. I can't swear that's true, I just
remember having that impression. Long story shorter, I probably
haven't tried 4D's uniqueness controls in years. Do people use them? How
are they in V16? Since I'm doing some Postgres stuff these days, yeah, over
there I set up multi-column constraints, no question.

There are lots of other kinds of duplicates out there that can crop up,
even if you have a good model behind your tables. Life is hard. But it's
important and 

​Re: UUID vs Longint primary key

2017-08-08 Thread steve simpson via 4D_Tech
On Tue, Aug 8, 2017 at 11:00 AM,
​
Chip Scheide <4d_o...@pghrepository.org> wrote:

>
> Worse, I've found that the same product, from the same vendor in
> differing purchase amounts (1 vs case) is the same part number, but
> different pricing! So.. even a check on part numbers is insufficient to
> stop duplicate entries.
>
> ​Well I believe there are "dupes" and then there are dupes. There are
unacceptable dupes and there are dupes you have to or need to live with.
And then there are "near dupes". Like your example above. That is not
"really" a dupe because there is one field/col, pack size, that is NOT
duplicated (case vs 1). On and on.

This all started with David making a broad blanket statement about "data
integrity" and "row duplication" and how using "synthetic" record ID keys
ruined the ability to automatically​

​filter out "dupes". (I _think_ that was your point David. Please correct
me if not.) And in that strict sense, if a "row" is really actually
absolutely duplicated, that is _probably_ bad. Or maybe not if you didn't
include that "pack size" field that would have changed the row to unique.
Or all the other examples cited on this thread about duplicate names that
were not _really_ duplications; they just needed a little more information
included in the "row" to better define it. In fact we all used to
experience it right here nearly daily with our Walt Nelson(Guam) vs Walt
Nelson(Seattle) signatures. Constantly confusing without that one little
added tidbit.

So my point was, ​it all depends. And sometimes you have to design your own
system differently or provide tools within your current system to suss out
what, how, why and when a "dupe" occurred. And how to - or IF to - fix it
or prevent it or even find it.

I too am definitely a convert to using
 UUID over longint
​. AND in using them in preference to some construction using row data
itself which may well change as the business grows/changes and will NOT
play well when you absorb new data sources (buy a competitor for example
with nearly identical inventory items or combine existing standalone data
installations into one big common enterprise bucket, or decide for all
kinds of business reasons to extract a certain batch and combine it with
another batch in a different bucket, etc.etc.etc. Data duplication of one
sort or another are bound to occur in many of these "growth" scenarios and
more often than not the merging and cleaning of those dupes is not
reducible to some sort of algorithm without human hands to help. Or
whatever.

As ​
Neil succinctly describe
​d
:

 - UUID is faster (do to "random" data in the index)
>  - UUID solves problems with distributed systems that sync
>  - UUID fixes the home grown sequence problem with transactions
>  - UUID is not easily readable by human and keeps me from being tempted to
> expose them :)
>
> ​And to Chip's point, I DO sometimes expose those UUIDs as read only info
on certain Admin Review pages. I sometimes place a button to "copy to
pasteboard" if it is appropriate that the admin might desire to do some
searching with that UUID - for as we all know​, they are hellishly
difficult to type. And in many projects I retain that seq longint idea
because it really IS a useful human marker that is easier than a date:time
stamp to read and quickly sort on and in general "glom" as you scan down
long lists of rows. But I've been burned way too many times to ever use it
again as a unique recordID. I now consider it a user interface type aid
only, still useful as a ProductID or some such in many cases. But not as a
Unique-Unvarying-Forever-Regardless-Of-Source-Or-Destination-Record-Key.

Steve Simpson
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-08 Thread Chip Scheide via 4D_Tech
ok - I'm confused.
what is the difference between a display only field showing an internal 
ID number,
and a duplicate, display only ID number showing on the entry form?

 
On Tue, 8 Aug 2017 08:07:00 -0600, npdennis wrote:
>> I find that by placing the internal linking value (non-editable) on an 
>> entry form
>> GREATLY enhances the ability/simplicity of tracking down data issues.
> 
> This same thing can also be done by adding the same field to the same 
> table but not linking off of it internally :)
> 
> 
> 
> --
> Neil Dennis
> 4D Developer since 1990
> 
> 
> GreaText - Designing Software for the Way You Work
> 716 East 1850 N
> North Ogden, UT 84414
> 
> mailto:npden...@greatext.com
> http://www.greatext.com/
> 
> 
---
Gas is for washing parts
Alcohol is for drinkin'
Nitromethane is for racing 
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-08 Thread Chip Scheide via 4D_Tech
David,
how do you tell these two issues apart:
Customer name : John Smith
Customer Name : John Smyth

is this a typo (one should be Smyth and is not, or one should be Smith 
and is not)?
is it real (2 John Smiths with different spellings)

I see this problem with a 'free form' entry inventory data. i.e. the 
user can enter whatever in the item name/description field.

I do not see (for my inventory) a way to keep duplicate entries like:
Glove, latex, Medium
Medium Glove, latex
Latex glove size medium

yes, I can check part numbers, but... same glove from different 
suppliers can/will have different part numbers
ex: 
Joe's supply house: 12345
Fred's house of supplies: 56342-m

Worse, I've found that the same product, from the same vendor in 
differing purchase amounts (1 vs case) is the same part number, but 
different pricing! So.. even a check on part numbers is insufficient to 
stop duplicate entries.


I have setup a means for the users to remove duplicates, but of course 
they never do

always welcome a way to improve this situation
Chip

On Sun, 6 Aug 2017 13:06:28 -0700, David Adams via 4D_Tech wrote:
> Since I'm diving into Postgres these days (mostly on the import side - my
> table designs aren't even right yet - pacing myself), I see that they have
> a wide range of tools for enforcing row uniqueness and referential
---
Gas is for washing parts
Alcohol is for drinkin'
Nitromethane is for racing 
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-08 Thread npdennis via 4D_Tech
> I find that by placing the internal linking value (non-editable) on an 
> entry form
> GREATLY enhances the ability/simplicity of tracking down data issues.

This same thing can also be done by adding the same field to the same table but 
not linking off of it internally :)



--
Neil Dennis
4D Developer since 1990


GreaText - Designing Software for the Way You Work
716 East 1850 N
North Ogden, UT 84414

mailto:npden...@greatext.com
http://www.greatext.com/


**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-08 Thread Chip Scheide via 4D_Tech
yes,
BUT -
I find that by placing the internal linking value (non-editable) on an 
entry form
GREATLY enhances the ability/simplicity of tracking down data issues.

ex (without viewable internal key):
user: "... customer John Smith does not show the correct invoice(s)
Dev/Sys admim: John Smith.. just a minute... hmm There are 43 John 
Smith's in the system, which one do you mean?
user: you know JOHN SMITH!
. [no need to continue  :) ]

ex: (with visible key):
user: "... customer John Smith does not show the correct invoice(s)
Dev/Sys admin: Can you please give me the number in the upper left hand 
corner, just below the screen title 'Customer Information'...
user:  1234567
Dev/Sys admin: just a moment... Ok got it. Now what is the exact 
problem
. [of course at this point getting the user to give useful 
information is next to impossible, but at least there is now a 
reference to the problem record/customer...]


On Sun, 6 Aug 2017 13:08:14 -0600, npdennis via 4D_Tech wrote:
> 
> On the other hand, you do model the data after business relations, 
> but the keys that tie that relation data need/should never be seen in 
> a well designed system. If a user readable key is needed by business, 
> then there should be another data piece that the user can read (like 
> an MRN, medical record number, or an abbreviation that is unique and 
> human readable) But these should never be used to link together data 
> in a structure in primary key foreign key relation.
> 
---
Gas is for washing parts
Alcohol is for drinkin'
Nitromethane is for racing 
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-07 Thread David Adams via 4D_Tech
Since it sounds like pretty much everyone uses UUIDs for links (me too,
although I may not have an actual arrow drawn), maybe se can turn this
thread in a different direction? Given that random synthetic keys do
nothing to avoid duplicate rows - and may actually make duplicate rows more
likely - what do people do to avoid duplicates. To be clear what I mean by
a duplicate:

1  Keller  123
2  Keller 123
3  Keler 123

Rows 1 and 2 are duplicates, row 3 is not. It might be messy/bad data, it
might not - but there's no way for the engine to know that. 1 and 2 are
duplicates, despite having unique made up numbers for a key. So, the
question is: How do you deal with that problem? Here are some solutions
I've used and/or seen:

* Cover your ears and say "neyh, neyk, neyh I CAN'T HEAR YOUR!" It's not a
great technique, but it is simple and cheap to implement.

* Ignore the possibility of duplicate rows unless/until it obviously bites
you. Depending on data, this might be good enough. In same cases, it's just
not acceptable. (Screwing up 100 log Web lines out of 1,000,000?
Meaningless. Screwing up 100 transactions in your accounts? Yeah, not
allowed.)

* Carefully program your system to detect and prevent duplicate rows:
-- at point of entry (client-side/UI, etc.)
-- at the engine level (triggers, indexes)

* Do after-the-fact audits. Either on demand or on a schedule.

That's pretty much the world of techniques I've seen.

Any thoughts?
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-07 Thread Charles Miller via 4D_Tech
It might be my memory, but I do not think that is how it worked in the 
beginning I was working on relational model dbs before SQL was a language

Rdb at DEC for example.

That was one of the ket differences between model types how you created and 
maintained the relations. I do not believe that there was even a thought of 
primary keys

Regards

Chuck

 Chuck Miller Voice: (617) 739-0306
 Informed Solutions, Inc. Fax: (617) 232-1064   
mailto:miller.cjaygmail.com 
 Brookline, MA 02446 USA Registered 4D Developer
   Providers of 4D and Sybase connectivity
  http://www.informed-solutions.com  



> On Aug 7, 2017, at 12:14 PM, steve simpson via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
>> 
>> 
>> On the other hand, you do model the data after business relations, but the
>> keys that tie that relation data need/should never be seen in a well
>> designed system. If a user readable key is needed by business, then there
>> should be another data piece that the user can read (like an MRN, medical
>> record number, or an abbreviation that is unique and human readable) But
>> these should never be used to link together data in a structure in primary
>> key foreign key relation.
>> ​ [snip]​
>> 
>> 
>> 
> +1 My point exactly, only stated much better.​

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

​Re: UUID vs Longint primary key

2017-08-07 Thread steve simpson via 4D_Tech
On Mon, Aug 7, 2017 at 11:51 AM, <
​
<4d_tech-requ...@lists.4d.com>
​
npden...@greatext.com> wrote:

>
> ​[snip]
>
> On the other hand, you do model the data after business relations, but the
> keys that tie that relation data need/should never be seen in a well
> designed system. If a user readable key is needed by business, then there
> should be another data piece that the user can read (like an MRN, medical
> record number, or an abbreviation that is unique and human readable) But
> these should never be used to link together data in a structure in primary
> key foreign key relation.
> ​ [snip]​
>
>
>
+1 My point exactly, only stated much better.​
​
​

Steve Simpson
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-07 Thread David Adams via 4D_Tech
> "Yes, use UUIDs in a 4D context rather than SeqNos. And, by the way, the
way 4D implemented these
> UUIDs - I don't approve with it."

Not really.

The difference between sequence numbers (your own or 4D's) and UUIDs isn't
something I addressed. Don't care. Whatever is better in your system -
conceptually they're more alike than different.

I've been saying pretty plainly that I'm happy to have UUIDs natively in 4D
and that I use them for links all the time.
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-07 Thread Marcus Straßmann via 4D_Tech
David,

in short, you are answering the OP's question as:

"Yes, use UUIDs in a 4D context rather than SeqNos. And, by the way, the way 4D 
implemented these UUIDs - I don't approve with it."

Fine :-)

Marcus


MacStrass - Marcus Straßmann
Softwareentwicklung und Beratung
Auf der Markscheide 35
D-44807 Bochum

Mobil: +49 (173) 374 39 92
eMail: macstr...@macstrass.de

> Am 04.08.2017 um 21:52 schrieb David Adams via 4D_Tech <4d_tech@lists.4d.com>:
> 
> 4D's UUIDs function as globally unique row *serial numbers*. That's great
> for backups and convenient for physical relations, but it has exactly zero
> to do with a real "primary key" or relational integrity. I would have
> preferred to see UUIDs implemented as a virtual field:
> 
> [Customer].UUID()
> 
> or
> 
> Get record UUID ([Customer])
> 
> ...and let 4D manage them internally, automatically, and invisibly. But
> that's not how it worked out.
> 
> Yeah, UUIDs are impossible to type and I'm still not 100% clear on how you
> search for empty ones. I've seen the tech tips etc., and still get some
> really eerie weird results at times.
> 
> In my own code, I tend to run verification methods that check for real
> (logical) keys. Namely, the field or combination of fields that uniquely
> identifies the *data* in the row. A UUID-as-physical-key very easily masks
> duplicate rows because it is, by definition, unique. You row might be
> unique, but your data might not be, if you know what I'm saying.
> 
> Anyway, I tend to scan for duplicate data, orphan records (a child record
> that *requires* a parent and doesn't have one is an orphan. Not all child
> records require parents, but 4D's diagrams, etc. don't distinguish between
> the two.) Oh, an widows. Mostly, I find lurking duplicates and orphans.
> Widows? Not so often. Not sure if that's just me. In theory, none of these
> things should ever happen. But then again, a whole lot of things work out
> better in theory...
> **
> 4D Internet Users Group (4D iNUG)
> FAQ:  http://lists.4d.com/faqnug.html
> Archive:  http://lists.4d.com/archives.html
> Options: http://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-06 Thread David Adams via 4D_Tech
On Sun, Aug 6, 2017 at 5:33 PM, steve simpson via 4D_Tech <
4d_tech@lists.4d.com> wrote:

> David and Chuck, point taken about "rows" being unique because of and
> created with the containing data of each row -- up to a ... well "a point".




The misunderstand of relational theory, design and practice - and open
hostility towards it - is peculiar to the 4D world. With *any* other RDBMS,
all of the stuff I mentioned is just part of what people know and assume.
Because science.

So, I won't be getting into it more than that. But will say, it's well
worth understanding what normalization *actually* is, what it's good for,
what it's not good for, and how you deal with that. It's also worth
understand that transactionally-oriented databases are ideal for a wide
range or problems, ORDBS for a somewhat wider range of problems, and
doc-oriented/NoSQL databases for a very narrow range of problems. There are
good applications for each. (And you can often do different design
topologies in the same RDBMS, even 4D.)

Dan and I wrote a nice summary of normalization in our book and it's up
somewhere for download as a PDF.  A couple of college professors have
contacted us down the years to see about using it in class. ("Of course!")
because the standard database theory texts are, well, dense to the point of
incomprehensibility...what Dan and I wrote is pretty straightforward and
practical.

And lest anyone get hung up on the word theory:
1) I like the theory *because* it helps my practical work.
2) In science, a theory is more substantial than a law. (Just
saying...that's how the words work, but it's commonly misunderstood.)

NB: The whole "normalization sucks" business would make you look really,
really odd in any other RDBMS community so, if anyone ventures out into the
wider world, make sure that you 1) understand normalization, 2) understand
the difference between logical and physical models, and 3) don't tell
anyone that you think normalization doesn't make sense.

Adams out.
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

​Re: UUID vs Longint primary key

2017-08-06 Thread steve simpson via 4D_Tech
David and Chuck, point taken about "rows" being unique because of and
created with the containing data of each row -- up to a ... well "a point".
Like you said, a real pain when real world reality steps in. I believe the
only truism involved here is your statement that "...
​i
t then leaves the task of doing integrity checks and so on to you.
​" Exactly! Regardless of the way you decide to handle 4D record
uniqueness, you will surely encounter ​the ongoing need to continually
manage integrity "on your own". Hoping that your "unique key" will do that
for you is likely not realistic, regardless of your method.

I'm sure that many of us have had to eventually abandon the 4D long int
"auto increment" in favor of some sort of crafted version of it when we
expanded to mixing data sources. For instance, you build and distribute an
application that starts up out the gate with recordID #1, then #2, then #3,
etc with each install. But  5 years later the client wants everyone - all 4
thousand installs - to start sending data to a central accounting
processing point -- retaining uniqueness across the enterprise. Oooops.
Suddenly, you have to make sure that each and every installation has some
sort of retrofitted unique number, something not envisioned when it was
first developed. Whoa. Big problem.

I must say that the only "real" solution to this so far in the 4D world -
for me at least - has been the auto UUID key field. Finally a way to create
a record (and its children) that can be moved around willy-nilly without
angst. OK, granted it does not solve the issue of "integrity" from row to
row. But neither does your own example of finding uniqueness within each
row's actual data (i.e.
​
Keller123, 1
​
Keller123, 2
​
Keller123, etc); that is a task you have to deal with regardless of how you
handle your own unique key - logical ways - important realistic vertical
market biz logic ways -- issues that may be unique to each project -- logic
that may change over time. It has NOTHING to do with the record key in of
of itself. Chuck Miller said in response, "...
​
I am not David but I agree with his assessment. Relational model databases
by definition are not supposed to use keys that have no meaning. They are
supposed to create relations that have meaning". All I can say is that
"meaning" per row, and meaning contained within children rows -- especially
meaning to "humans" -- may not really exist and may have little to cling to
or may even change as projects expand and morph.

I'm really just reacting to what I experience as "theory" mismatch with
real world reality. We need to separate "record uniqueness" as an idea from
"meaning" of the data within a row, since no matter how you handle record
uniqueness you will still need to craft your own unique project needs to
insure "data integrity", which can take many forms and have many challenges
beyond some simple "row" logic. And realize that these needs will likely
change over time as the project morphs. So the first need is to handle "4D
record ID uniqueness". (and in my experience so far, ONLY UUIDs do that
satisfactorily.) Then address your project specific "data integrity" needs
and understand that you'll have to deal with that on an ongoing basis
regardless of how you handle record uniqueness as your app grows and morphs
and expands and begins to interact with other outside data sources.

And BTW, we all have users who get a lot of info from the visual feedback
of incremental "record numbers". OK, no problem. Just another field:
"record number", far different from "unique internal record KEYID". And
likely understandable and fixable for human needs if dupes are introduced
with data mergers. Just another "user comprehension data integrity" task.

Steve Simpson
​Cimarron Software

On Sun, Aug 6, 2017 at 3:00 PM, <4d_tech-requ...@lists.4d.com> wrote:

>
>
> Message: 3
> Date: Sun, 6 Aug 2017 11:03:57 -0400
> From: Chuck Miller <cjmil...@informed-solutions.com>
> To: 4D iNug Technical <4d_tech@lists.4d.com>
> Subject: Re: ​
> ​​
> Re: UUID vs Longint primary key
>
> ​​
> I am not David but I agree with his assessment. Relational model databases
> by definition are not supposed to use keys that have no meaning. They are
> supposed to create relations that have meaning. Even the use of numbers
> breaks the theoretical rule. The problem is that we have pushed all to use
> relational model rather that what he logical model proposes. Who out there
> remembers hierarchical, tree, or inverted structure models.
>
> Regards
> Chuck
> --
>
> Message: 4
> Date: Sun, 6 Aug 2017 08:16:07 -0700
> From: David Adams &l

Re: ​Re: UUID vs Longint primary key

2017-08-06 Thread David Adams via 4D_Tech
> Relational model databases by definition are not supposed to use keys
that have no meaning.

On Sun, Aug 6, 2017 at 12:08 PM, npdennis via 4D_Tech <4d_tech@lists.4d.com>
wrote:
> You may have stated that backwards…

Nope, that's exactly what I meant. But apologies for not making the point
more carefully. I'll make up for that now.

> relational model databases by design should never, never, never use
business data for keys.
> This was the number one rule drilled into me by my college relational
database classes. I
> found this to be very wise advice in the real world.

I agree! But there are two different pieces to this. In the one part, no
synthetic keys and in the other, synthetic keys. When you did your RDBMS
classes, they probably discussed formal design versus system
implementation. When you're creating a normalized *design*, you don't have
artificial/synthetic keys - because your focus is on the pure data itself
and such keys have no meaning. (They're a database artifact, not part of
the real world entity you're modeling.) The "real" primary key gets sorted
out when you're putting your design into second or third "normal form."
(Even if you don't think of it in those terms. I mean the actual design
step, whatever name you use - if you even use a name.) When you go to
*implement* your tables, it's sensible to use a synthetic key. So, in my
example, I used

Building + Room

...because that's enough to uniquely identify each room with no spare
values in the field, no redundant rows, and nothing that belongs in another
table. (Well, you would have a unique set of Building records, as noted in
the earlier post.) So, that's a correct and sensible *design* and is how
I'd think about the actual data. But implementation? Another story. The
building is renamed and I have to go back and update all of the related
room records? PITA.

It's 100% normal to think about the formal table design and the physical
implementation as distinct steps with different rules. And your VIN
example, like Social Security Number, is an excellent example of a "natural
key" that turns out not to be a great idea. I use synthetic keys for *all*
of my tables (well, there might be a rare exception where I use a natural
key, but that's super rare...and I probably end up regretting that most of
the time.)

In my earlier post I was just assuming that the distinction between design
and implementation was clear, but those steps aren't distinct in 4D. (They
are distinct in design tools and used to be a pretty common concept - they
still are distinct in design-oriented RDBMS development tools.)

There's a natural tension here, and there are two competing tasks that
require work to handle. You only get one of them for free, the other you
have to take care of on your own.

* Natural keys and multi-field keys the truly define a row are a *pain* in
the real world where "unique" values aren't always entered
correctly/stable/etc. To avoid hassles with updates, synthetic keys are
super helpful. They give you _stable keys_. (Mutable keys are absolutely a
pain.)

 Synthetic keys take away the pain of related table changes on UPDATE.

* Synthetic keys have _nothing_ to do with the real data in the row. That's
kind of the point. Which means that you get *no help* in preventing
duplicate rows of data. Like my example of the duplicate building + room
with different IDs.

 Synthetic keys give you the pain of having to hand-check for
duplicates.

The problem shows up when you have "unique" rows with duplicate data. That
gets ugly in a big fat hurry when it's a 1 table and children link to
different rows in the 1 table that aren't really "different." Note that I'm
not talking here about the kind of messy data - like customer names and
addresses - where you need to merge records. That's a related but different
question. My building + room example is nice and clean. The data is
identical, it's just that somehow a duplicate got entered. (I don't want to
muddy the waters with messy data, it's not necessary or helpful for the
current discussion.)

Here's a possibly apocryphal tangent from my first years in the work world.
Back when things were typed. On typewriters. It was customary to put at the
bottom of each page 1/5, 2/5 and so on. I heard tell of a report that had a
sixth page added. Rather than retype all of it, here's what the boss got
back at the bottom of the six pages:

1/5
2/5
3/5
4/5
6/6
Not the same thing, but I always thought of it as a good example ;-)

In 4D, you get a benefit from using synthetic keys - you get stable keys
for update for free. I'll take that deal! But you get no automatic help
with duplicate control. You can add this by implementing a compound index
marked as unique. Or you can write custom scanner code to audit your data.
The exact options depend on the database, but the work is required either
way. In 4D databases, what I most often see is that people aren't checking
for duplicate rows either at the engine level 

Re: ​Re: UUID vs Longint primary key

2017-08-06 Thread npdennis via 4D_Tech
> Relational model databases by definition are not supposed to use keys that 
> have no meaning.

You may have stated that backwards… relational model databases by design should 
never, never, never use business data for keys. This was the number one rule 
drilled into me by my college relational database classes. I found this to be 
very wise advice in the real world. It came in handy when building a database 
for vehicles and the suggestion was to use VIN as a primary key since they were 
unique… it turns our that as unique as VIN should be, they are not… it was no 
problem for out database to handle vehicles with the same VIN because we 
followed the non-business-data-for-primary-key rule.

On the other hand, you do model the data after business relations, but the keys 
that tie that relation data need/should never be seen in a well designed 
system. If a user readable key is needed by business, then there should be 
another data piece that the user can read (like an MRN, medical record number, 
or an abbreviation that is unique and human readable) But these should never be 
used to link together data in a structure in primary key foreign key relation.

Just my two cents.

Neil

--
Neil Dennis
4D Developer since 1990


GreaText - Designing Software for the Way You Work
716 East 1850 N
North Ogden, UT 84414

mailto:npden...@greatext.com
http://www.greatext.com/



**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-06 Thread David Adams via 4D_Tech
On Sun, Aug 6, 2017 at 7:47 AM, steve simpson via 4D_Tech <
4d_tech@lists.4d.com> wrote:

> On  Fri, 4 Aug 2017 12:52:28 -0700
> ​
> David Adams 
> ​ wrote:​
>
> >
> > ​[snip] ​
> > 4D's UUIDs function as globally unique row *serial numbers*. That's great
> > for backups and convenient for physical relations, but it has exactly
> zero
> > to do with a real "primary key" or relational integrity.
> > ​ [snip]​
>
>
Sure, but only briefly (collective sigh of relief, I'm sure.)

When you design a relational database, you sort out the "keys" by finding
the field or combination of fields in a table that make the row unique. The
next part of that design is that each field in the row should be about the
row and nothing but the row. What does a UUID or longint ID have to do with
the data in the row? Nothing at all. This kind of "synthetic key" is an
implementation-level accommodation. It has nothing to do with the data,
it's just something bolted on for practical, real-world reasons. For
example, you might have a unique customer name (just saying) and they might
want to change it later. Ugh! With a unique-but-meaningless ID, that's no
problem.

I use synthetic keys all of the time and have forever. So, no quarrel with
using them for links and so on. But they do bring on another problem: These
sorts of IDs not only do not maintain the uniqueness of your rows, they
very easily obscure duplicate rows.

Imagine that you've got an asset management system and it stores the
location of each desk in a facility. There's a table for every room in the
facility. So, you've got rooms and desks. Both are finite, both are real
and tangible. So, over in your room table youve got something like

Building Keller
Room123

Back in the real world, the building names are unique and, within a
building, room names or numbers are unique. So, you've got a unique row
here. Imagine using that data as a key:

Keller123

That's a sound key and the row design is also solid. But this kind of key
is a real pain down the row when things change. It may be that a new
numbering scheme is introduced or a building is renamed for some reason. (A
big donation, etc.) The actual collection of real-world buildings and rooms
hasn't changed (in this example), but you need to update the records to
reflect the new real-world scheme. So you have to go back and change the
[Room] record, and then any linked [Desk] records, or any other links. Ugh.

I'll be most, if not all of us, got burned by using a "real" key like this
early on. I did. So, longints and now UUIDs. Easier! So, now I'll use a
longint:

1  Keller 123

I link on 1, I can rename the building or room, no drama.

So what was I talking about? So far I'm a fan of using longints/UUIds (I
am.) The problem is this:

1  Keller 123
2  Keller 123
3  Keller 124

You've got unique *rows* but duplicated *data*. There are two Keller 123
records - but they've got different unique longints bolted on so they
aren't treated as duplicates. The "unique" IDs are masking the duplicates.
Notice that the "unique ID" has nothing at all to do with the rooms in the
world. It's just a unique number. Unique rows, duplicate data.

A UUID, longint, etc. is a practical way to link records (and the one I
use), but it then leaves the task of doing integrity checks and so on to
you.

As I understand it, UUIDs were added to provide globally unique serial
numbers to records for the sake of journaled backups. That's a good goal,
and 4D Backups seems to work really well now.  I never upgraded my main
source from V13 because, for quite some time, it was a bit of a mess to
sort out the UUIDs and I couldn't be blethered to pour the time into
figuring out. After a few years, I found a reliable set of SQL instructions
to do a retrofit. In any case, I'm working on something now that I believe
began in V15 so it's already been sorted out there from the start.
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-06 Thread Chuck Miller via 4D_Tech
I am not David but I agree with his assessment. Relational model databases by 
definition are not supposed to use keys that have no meaning. They are supposed 
to create relations that have meaning. Even the use of numbers breaks the 
theoretical rule. The problem is that we have pushed all to use relational 
model rather that what he logical model proposes. Who out there remembers 
hierarchical, tree, or inverted structure models. 

Regards 
Chuck

Sent from my iPhone

> On Aug 6, 2017, at 10:47 AM, steve simpson via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
> On  Fri, 4 Aug 2017 12:52:28 -0700
> ​
> David Adams 
> ​ wrote:​
> 
>> 
>> ​[snip] ​
>> 4D's UUIDs function as globally unique row *serial numbers*. That's great
>> for backups and convenient for physical relations, but it has exactly zero
>> to do with a real "primary key" or relational integrity.
>> ​ [snip]​
>> 
> 
> ​Care to elaborate on that statement? I don't get why you'd say that.​
> _
> Steve Simpson
> **
> 4D Internet Users Group (4D iNUG)
> FAQ:  http://lists.4d.com/faqnug.html
> Archive:  http://lists.4d.com/archives.html
> Options: http://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

​Re: UUID vs Longint primary key

2017-08-06 Thread steve simpson via 4D_Tech
On  Fri, 4 Aug 2017 12:52:28 -0700
​
David Adams 
​ wrote:​

>
> ​[snip] ​
> 4D's UUIDs function as globally unique row *serial numbers*. That's great
> for backups and convenient for physical relations, but it has exactly zero
> to do with a real "primary key" or relational integrity.
> ​ [snip]​
>

​Care to elaborate on that statement? I don't get why you'd say that.​
_
Steve Simpson
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-04 Thread Nigel Greenlee via 4D_Tech
+2

Nigel Greenlee


> On 4 Aug 2017, at 12:09, Marcus Straßmann via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
> +1
> 
> 
> MacStrass - Marcus Straßmann
> Softwareentwicklung und Beratung
> Auf der Markscheide 35
> D-44807 Bochum
> 
> Mobil: +49 (173) 374 39 92
> eMail: macstr...@macstrass.de
> 
> Am 04.08.2017 um 10:39 schrieb Herr Alexander Heintz via 4D_Tech 
> <4d_tech@lists.4d.com>:
> 
>>> Someone can explain when is better use UUID and when Longint field in 
>>> primary key?
>> 
>> If you can even remotely think of a scenario where you might ever be 
>> challenged to merge databases, go with a UUID!!!
>> It will save you incredible headaches!
>> Switched to UUIDs long ago, never looked back, best decision ever!
>> 
>> Of course if such a scenario is unthinkable, you can use longints
>> 
>> **
>> 4D Internet Users Group (4D iNUG)
>> FAQ:  http://lists.4d.com/faqnug.html
>> Archive:  http://lists.4d.com/archives.html
>> Options: http://lists.4d.com/mailman/options/4d_tech
>> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
>> **
> 
> **
> 4D Internet Users Group (4D iNUG)
> FAQ:  http://lists.4d.com/faqnug.html
> Archive:  http://lists.4d.com/archives.html
> Options: http://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-04 Thread Marcus Straßmann via 4D_Tech
+1


MacStrass - Marcus Straßmann
Softwareentwicklung und Beratung
Auf der Markscheide 35
D-44807 Bochum

Mobil: +49 (173) 374 39 92
eMail: macstr...@macstrass.de

Am 04.08.2017 um 10:39 schrieb Herr Alexander Heintz via 4D_Tech 
<4d_tech@lists.4d.com>:

>> Someone can explain when is better use UUID and when Longint field in 
>> primary key?
> 
> If you can even remotely think of a scenario where you might ever be 
> challenged to merge databases, go with a UUID!!!
> It will save you incredible headaches!
> Switched to UUIDs long ago, never looked back, best decision ever!
> 
> Of course if such a scenario is unthinkable, you can use longints
> 
> **
> 4D Internet Users Group (4D iNUG)
> FAQ:  http://lists.4d.com/faqnug.html
> Archive:  http://lists.4d.com/archives.html
> Options: http://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-04 Thread Kirk Brooks via 4D_Tech
I'll add my vote to Jody's approach.

I prefer UUIDs for primary keys for all the reasons mentioned.
Having a longint index field is useful as well but for different reasons as
Jody explains. My most common use is to be able to quickly sort a table in
the order records were created. And it is superior for referencing records
- attempting to type a UUID is not something one wants to do.

On Fri, Aug 4, 2017 at 6:51 AM, Jody Bevan via 4D_Tech <4d_tech@lists.4d.com
> wrote:

> On another note, we assign a counter ourselves (not auto generated). There
> are times where we reset counters and being able to do that is great.
>

Resetting the counter can be done with ​SET DATABASE PARAMETER ( [Table] ;
Table sequence number ; value ). I haven't had any issues with bad sequence
numbers since about vers 6.

-- 
Kirk Brooks
San Francisco, CA
===

*The only thing necessary for the triumph of evil is for good men to do
nothing.*

*- Edmund Burke*
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-04 Thread Jody Bevan via 4D_Tech
So what might be not so good with UUIDs? 

After watching how people work with our systems for 26 years (most of it 
without UUID capability) I noticed that they often referred to the record by 
the Unique LongInt we created for each record. For the first year we hid our 
unique ID - why would they want that. Then we exposed it to the admins for a 
year. Then through popular request we exposed it to everyone. Then we not only 
exposed it but changed the name of the display of the field to JonokeMed #. 
Referring to record # 150, or 275,983 is much easier than referring to a long 
UUID.

A UUID just is not a workable solution for that - at least in my mind. 
Therefore our solution today is that we do use UUIDs for all records for the 
reason of any possibility of merging records. We have had clients move across 
the Canada to join up with another client, and they need the records merged. 
Arrh a lot of work to write that the first time, and then continue to support 
it with each new table.

We use the UUID as the Primary Key record, but still have a Key field. This 
number is exposed to the end users to use as they wish. On another note, we 
assign a counter ourselves (not auto generated). There are times where we reset 
counters and being able to do that is great. 

Jody Bevan
ARGUS Productions Inc.
Developer

Argus Productions Inc. 




> On Aug 4, 2017, at 7:40 AM, Keith Culotta via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
> The UUID has been the easiest to work with.  When a customer needs a 
> recognizable ID, an auto-incremented Longint can be added to the table.  
> UUIDs are perfect for avoiding conflicts when updating or exchanging records 
> between databases.  They can also be used as Process IDs or SVG IDs when one 
> of those is linked to a record.  All this can be accomplished with a Longint, 
> but it's more straightforward with UUIDs.  
> 
> Keith - CDI
> 
>> On Aug 4, 2017, at 3:27 AM, stardata.info via 4D_Tech <4d_tech@lists.4d.com> 
>> wrote:
>> 
>> Hi All,
>> 
>> Someone can explain when is better use UUID and when Longint field in 
>> primary key?
>> 
>> Thanks
>> Ferdinando

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-04 Thread npdennis via 4D_Tech
> I can’t speak to any performance differences when using them as indexed key 
> fields for searches … they likely have similar performance, but that’s just a 
> guess.

I can speak to the performance, they are not the same. UUID is faster.

4D uses BTrees (binary trees or sometimes called red/black trees) to store 
index structures. A sequential number is the slowest possible way to create or 
add to a BTree. The treee needs to remain balanced, so there is a lot of 
rebalancing the needs to occur when numbers are added sequentially. UUID’s are 
much more random and require less tree balancing and are therefore faster for 
delete, create operations.

I started using UUID since version 11 and love them.

--
Neil Dennis
4D Developer since 1990


GreaText - Designing Software for the Way You Work
716 East 1850 N
North Ogden, UT 84414

mailto:npden...@greatext.com
http://www.greatext.com/


**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-04 Thread Ronald Rosell via 4D_Tech
I can’t speak to any performance differences when using them as indexed key 
fields for searches … they likely have similar performance, but that’s just a 
guess.

But here’s one situation where UUID has an advantage.  We historically used the 
Sequence Number (a longint) as the primary key, until we discovered a 
shortcoming:  Merging data from multiple data files leads to duplicated primary 
keys.  With a sequence number, unless you somehow modify it (modifying the 
automatically-generated Sequence Number with some sort of instance prefix for 
the datafile) you will run into situations where different datafiles will have 
the same sequence numbers for various records.  That’s not a problem if those 
files will forever remain separate, but it requires a lot of re-sequencing of 
those tables and related tables if the files are ever merged or need to 
communicate with one another while referencing primary keys.

With UUID, it is highly unlikely (nearly impossible) that two data files will 
generate the same UUID for any records;  the chances are nearly 100% that all 
of the UUID’s will be unique, across every instance of your database.  So, if 
you later set up some kind of inter-database communication referencing records 
by primary key, or have to merge data files, you’re much less likely to run 
into issues.
__

Ron Rosell
President
StreamLMS


> On Aug 4, 2017, at 1:27 AM, stardata.info via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
> Hi All,
> 
> Someone can explain when is better use UUID and when Longint field in primary 
> key?
> 
> Thanks
> Ferdinando
> **
> 4D Internet Users Group (4D iNUG)
> FAQ:  http://lists.4d.com/faqnug.html
> Archive:  http://lists.4d.com/archives.html
> Options: http://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**