Re: ​Re: UUID vs Longint primary key

2017-08-08 Thread David Adams via 4D_Tech
> This all started with David making a broad blanket statement about "data
> integrity" and "row duplication" and how using "synthetic" record ID keys
> ruined the ability to automatically​ ​filter out "dupes". (I _think_ that
> was your point David. Please correct me if not.) And in that strict sense,
> if a "row" is really actually

I've been a bit taken aback at how this thread has gone, to be honest.

My point was really basic: Bolding on a random number field makes a row
unique, it doesn't make the underlying *data* unique.

That's it. That's more-or-less me using one of my standard techniques:

1) Find the obvious.
2) Draw a circle around it.
3) Declare it mine.

Patent pending...

Seriously, I'm saying something that just shouldn't be even the tiniest bit
contentious. That point is nothing but an established principle from
database design, it's not something I'd normally consider debatable. I'd
put it in the same category as when someone says, "Race condition? I've
figured out a shortcut." No you haven't. That's pretty much in line with
things the patent office won't even review:

* Perpetual motion machines.

* Squirrel-proof bird feeders.

Until such time as the fundamental laws of nature change, neither of those
inventions can work.

A lot of the stuff about duplicates doesn't bear on the design question
about the table's columns.

Keller 123
Keller 123
Keler 123

How many rooms are there? You have no way to know. It could be 1, 2, or 3:

1: You've got a pure duplicate and a typo-based duplicate.
2: You've got a pure duplicate and two buildings with very similar names.
3: You've got two buildings with very similar names *and you are missing
enough data to identify rows uniquely.*

In that last case,there's a design question. If you haven't identified
enough columns of *data* from the real world to uniquely identify each row,
you don't have a data model that can be implemented in a relational
database. I mean, sure, you can use a tool to store the data - but I don't
see how you model that. What makes each row unique? I mean out in the real
world. Perhaps there's a floor, or a direction, or some kind of vernacular
detail that actual people use to distinguish rooms. In which case, you've
found a missing field! That's part of the point of modeling. But what if
there really is nothing? Well, then it's pretty common to add something,
like a number. But that then (generally) needs to flow out into the real
world. So, you synthesize data in the database, add it to the row *and* to
the real world. Serial numbers are a perfect example of this. Same with
account numbers, customer numbers, etc. And UUIDs are functionally serial
numbers. They have nothing at all to do with the data in the row, they're
purely an implementation-level convenience.

As some people seem to be getting the impression I'm against UUIDs, that's
incorrect. I'm glad they're native (I used them sooner than that), I've got
nothing against them. But they are what they are and not more. Any
complaints I had about how they were implemented in 4D is irrelevant now as
it's been since V14. Too old to matter. I also have been getting the
impression when people say "primary key", they don't always appreciate what
a "key" really is - an attribute or set of attributes that are entirely
about the row of data that also uniquely identify a row. That's a key...off
the top of my head, I'm sure there's a better summary out there. The 4D
world has long had a peculiarly hostile relationship to normalization,
starting with 4D itself (subtables, wrong examples in the manuals, etc.)
I've seen the same cropping up again with how object fields are being
promoted. (Slapping some JSON in a text field and calling your system NoSQL
isn't necessarily helpful...)

But just slapping a random-but-unique number (sequential or not) onto the
row? That easily masks real duplicates. And why other databases prominently
support multi-field constraints on tables - for precisely this reason.
Again, you *can* do this in 4D with a compound index set to unique. So, the
feature exists, I just haven't seen it being used widely. Perhaps it
because engine-level errors feel disproportionately painful to deal with in
4D? That could just be me. Sincerely, I may be more allergic to 4D
engine-level errors than others. I pretty much turn off the unique
attribute and check uniqueness myself. Hmmm. I might be missing upping my
game here. I kind of remember that years ago hitting a duplicate error
wasn't easy to trap for and got ugly. I can't swear that's true, I just
remember having that impression. Long story shorter, I probably
haven't tried 4D's uniqueness controls in years. Do people use them? How
are they in V16? Since I'm doing some Postgres stuff these days, yeah, over
there I set up multi-column constraints, no question.

There are lots of other kinds of duplicates out there that can crop up,
even if you have a good model behind your tables. Life is hard. But it's
important and 

​Re: UUID vs Longint primary key

2017-08-08 Thread steve simpson via 4D_Tech
On Tue, Aug 8, 2017 at 11:00 AM,
​
Chip Scheide <4d_o...@pghrepository.org> wrote:

>
> Worse, I've found that the same product, from the same vendor in
> differing purchase amounts (1 vs case) is the same part number, but
> different pricing! So.. even a check on part numbers is insufficient to
> stop duplicate entries.
>
> ​Well I believe there are "dupes" and then there are dupes. There are
unacceptable dupes and there are dupes you have to or need to live with.
And then there are "near dupes". Like your example above. That is not
"really" a dupe because there is one field/col, pack size, that is NOT
duplicated (case vs 1). On and on.

This all started with David making a broad blanket statement about "data
integrity" and "row duplication" and how using "synthetic" record ID keys
ruined the ability to automatically​

​filter out "dupes". (I _think_ that was your point David. Please correct
me if not.) And in that strict sense, if a "row" is really actually
absolutely duplicated, that is _probably_ bad. Or maybe not if you didn't
include that "pack size" field that would have changed the row to unique.
Or all the other examples cited on this thread about duplicate names that
were not _really_ duplications; they just needed a little more information
included in the "row" to better define it. In fact we all used to
experience it right here nearly daily with our Walt Nelson(Guam) vs Walt
Nelson(Seattle) signatures. Constantly confusing without that one little
added tidbit.

So my point was, ​it all depends. And sometimes you have to design your own
system differently or provide tools within your current system to suss out
what, how, why and when a "dupe" occurred. And how to - or IF to - fix it
or prevent it or even find it.

I too am definitely a convert to using
 UUID over longint
​. AND in using them in preference to some construction using row data
itself which may well change as the business grows/changes and will NOT
play well when you absorb new data sources (buy a competitor for example
with nearly identical inventory items or combine existing standalone data
installations into one big common enterprise bucket, or decide for all
kinds of business reasons to extract a certain batch and combine it with
another batch in a different bucket, etc.etc.etc. Data duplication of one
sort or another are bound to occur in many of these "growth" scenarios and
more often than not the merging and cleaning of those dupes is not
reducible to some sort of algorithm without human hands to help. Or
whatever.

As ​
Neil succinctly describe
​d
:

 - UUID is faster (do to "random" data in the index)
>  - UUID solves problems with distributed systems that sync
>  - UUID fixes the home grown sequence problem with transactions
>  - UUID is not easily readable by human and keeps me from being tempted to
> expose them :)
>
> ​And to Chip's point, I DO sometimes expose those UUIDs as read only info
on certain Admin Review pages. I sometimes place a button to "copy to
pasteboard" if it is appropriate that the admin might desire to do some
searching with that UUID - for as we all know​, they are hellishly
difficult to type. And in many projects I retain that seq longint idea
because it really IS a useful human marker that is easier than a date:time
stamp to read and quickly sort on and in general "glom" as you scan down
long lists of rows. But I've been burned way too many times to ever use it
again as a unique recordID. I now consider it a user interface type aid
only, still useful as a ProductID or some such in many cases. But not as a
Unique-Unvarying-Forever-Regardless-Of-Source-Or-Destination-Record-Key.

Steve Simpson
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: Preventing Duplicate Data WAS UUID vs Longint primary key

2017-08-08 Thread Jody Bevan via 4D_Tech
Chip:

I have always tried doing at the point of data entry. Obviously no code is 
perfect, and either is the object between the keyboard and the chair. My 
experience is that few Administrators would take the time to filter through 
duplicates. Getting them to do system maintenance was almost impossible. They 
had too many other responsibilities. Typically they only worked on this if a 
problem was identified. For that we have a built in record merger.

Application: Medical

If a phone number is associated with the record for an individual - then I will 
use that. How it is implemented depends on the application (even within the 
application).

Lets say I am adding in a person. They enter in a name (which is not that good 
as people will call themselves by different names). I remember one patient that 
would present with a different name (About 6 personalities) in a walk in 
clinic. If a new record was going to be created they would try and get a phone 
number from the patient. When the phone number was put in a dialog would come 
up displaying all the names of people with the same phone number. It let the 
staff (if they ‘felt like it’) to try and filter for duplicates at that point.

For those staff that were diligent it prevented many duplicates in the medical 
data base. We had clinics that had 420,000 ‘regular’ patients. Then they had a 
walk in that had another 500,000 irregular patients. Their Administration was 
more eager to keep the duplicates down.

We also used the Health Insurance Number as another indicator. Not all 
patient’s would present with their health care card though. With ‘universal’ 
health care it is not as critical.

Application: National Order / Deliver Desk

We do the same in Canada for Postal Code and Street name

With a national order delivery system we had the addresses divided up so that 
we could look up Postal Codes (Canada) by address. This was also used for 
keeping duplicates down. This is when I learned that in Canada Postal Codes 
change weekly (nationally - there are postal codes changing somewhere).


Looking in your system for unique or even semi-unique data can be used to help 
keep duplicates down.
In the medical system we let them have a picture in the patient’s record that 
was presented (if security settings permitted) at the time of entry to the 
facility.

Those are the ones that I have used.

Jody



Jody Bevan
ARGUS Productions Inc.
Developer

Argus Productions Inc. 




> On Aug 8, 2017, at 8:25 AM, Chip Scheide via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
> Jody,
> what are your normal duplicate reducing/removal technic(s)?

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-08 Thread Chip Scheide via 4D_Tech
ok - I'm confused.
what is the difference between a display only field showing an internal 
ID number,
and a duplicate, display only ID number showing on the entry form?

 
On Tue, 8 Aug 2017 08:07:00 -0600, npdennis wrote:
>> I find that by placing the internal linking value (non-editable) on an 
>> entry form
>> GREATLY enhances the ability/simplicity of tracking down data issues.
> 
> This same thing can also be done by adding the same field to the same 
> table but not linking off of it internally :)
> 
> 
> 
> --
> Neil Dennis
> 4D Developer since 1990
> 
> 
> GreaText - Designing Software for the Way You Work
> 716 East 1850 N
> North Ogden, UT 84414
> 
> mailto:npden...@greatext.com
> http://www.greatext.com/
> 
> 
---
Gas is for washing parts
Alcohol is for drinkin'
Nitromethane is for racing 
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: Preventing Duplicate Data WAS UUID vs Longint primary key

2017-08-08 Thread Chip Scheide via 4D_Tech
Jody,
what are your normal duplicate reducing/removal technic(s)?

On Mon, 7 Aug 2017 12:28:21 -0600, Jody Bevan via 4D_Tech wrote:
> John:
> 
> Thanks for sharing this. I had not thought of this way after all 
> these years. I will take the idea and apply where appropriate, in 
> addition to my normal code to reduce duplicates.
> 
> 
> Jody Bevan
> ARGUS Productions Inc.
> Developer
> 
> Argus Productions Inc. 
> 
> 
> 
> 
>> On Aug 7, 2017, at 11:50 AM, John Baughman via 4D_Tech 
>> <4d_tech@lists.4d.com> wrote:
>> 
>> What I have done is to have a field in the table  that contains the 
>> keys that make the record unique with all spaces, special 
>> characters, and vowels removed, as well as eliminating any 
>> consecutive consonants . See my example below. So far this has 
>> worked pretty well for me and I guess would fall under David’s 
>> category of "Carefully program your system to detect and prevent 
>> duplicate rows”
>> 
>> John
>> 
>> ―― Example 
>> $firstName:=[Contacts]firstName (John)
>> $lastName:=[Contacts]lastName (Baughman)
>> $company:=[Contacts]company (BY’te DESIGN Hawwaii)  notice I have 
>> mistakenly put more than 1 w in Hawaii.
>> $DupeCheck:= $firstName+ $lastName+$company
>> [Contacts]DupeCheck:=AlphaOnlyNoVowels ($DupeCheck;"*”)  
>> //AlphaOnlyNoVowels does the heavy lifting. The asterisk tells the 
>> method to remove consecutive consonants.
>> 
>> [Contacts]DupeCheck now contains  “JHNBGHMNBYTDSGNHW”
>> 
>> I wrap the above in a duplicate checking method for the [Contacts] 
>> table called ContactsDuplicateManager
>> 
>> Whenever a record is updated or created in the Contacts table…
>> 
>> $DupeCheck:= ContactsDuplicateManager ("isDuplcate”;[Contacts]ID)
>> 
>> The ContactsDuplicate method creates the check string as above and 
>> searches the contacts table for duplicates using the 
>> [Contacts]DupeCheck field. If no duplicates are found it returns the 
>> check string. If a duplicate is found it returns the check string 
>> with a prepended asterisk.  The contact ID if passed prevents the 
>> dupe check from finding the record being updated. If this is a new 
>> record 0 is passed for the Contact ID. So…
>> 
>> If ($DupeCheck =“*@“
>>   Handle the duplicate in context. If, for example this is a user 
>> updating or creating a contact record, warn the user of the possible 
>> duplicate with available options.
>> 
>> else
>>   [Contacts]DupeCheck:=$DupeCheck
>>   SAVE RECORD([Contacts)
>> 
>> end if
>> 
>> -
>> 
> 
> **
> 4D Internet Users Group (4D iNUG)
> FAQ:  http://lists.4d.com/faqnug.html
> Archive:  http://lists.4d.com/archives.html
> Options: http://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **
---
Gas is for washing parts
Alcohol is for drinkin'
Nitromethane is for racing 
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-08 Thread Chip Scheide via 4D_Tech
David,
how do you tell these two issues apart:
Customer name : John Smith
Customer Name : John Smyth

is this a typo (one should be Smyth and is not, or one should be Smith 
and is not)?
is it real (2 John Smiths with different spellings)

I see this problem with a 'free form' entry inventory data. i.e. the 
user can enter whatever in the item name/description field.

I do not see (for my inventory) a way to keep duplicate entries like:
Glove, latex, Medium
Medium Glove, latex
Latex glove size medium

yes, I can check part numbers, but... same glove from different 
suppliers can/will have different part numbers
ex: 
Joe's supply house: 12345
Fred's house of supplies: 56342-m

Worse, I've found that the same product, from the same vendor in 
differing purchase amounts (1 vs case) is the same part number, but 
different pricing! So.. even a check on part numbers is insufficient to 
stop duplicate entries.


I have setup a means for the users to remove duplicates, but of course 
they never do

always welcome a way to improve this situation
Chip

On Sun, 6 Aug 2017 13:06:28 -0700, David Adams via 4D_Tech wrote:
> Since I'm diving into Postgres these days (mostly on the import side - my
> table designs aren't even right yet - pacing myself), I see that they have
> a wide range of tools for enforcing row uniqueness and referential
---
Gas is for washing parts
Alcohol is for drinkin'
Nitromethane is for racing 
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-08 Thread npdennis via 4D_Tech
> I find that by placing the internal linking value (non-editable) on an 
> entry form
> GREATLY enhances the ability/simplicity of tracking down data issues.

This same thing can also be done by adding the same field to the same table but 
not linking off of it internally :)



--
Neil Dennis
4D Developer since 1990


GreaText - Designing Software for the Way You Work
716 East 1850 N
North Ogden, UT 84414

mailto:npden...@greatext.com
http://www.greatext.com/


**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-08 Thread Chip Scheide via 4D_Tech
yes,
BUT -
I find that by placing the internal linking value (non-editable) on an 
entry form
GREATLY enhances the ability/simplicity of tracking down data issues.

ex (without viewable internal key):
user: "... customer John Smith does not show the correct invoice(s)
Dev/Sys admim: John Smith.. just a minute... hmm There are 43 John 
Smith's in the system, which one do you mean?
user: you know JOHN SMITH!
. [no need to continue  :) ]

ex: (with visible key):
user: "... customer John Smith does not show the correct invoice(s)
Dev/Sys admin: Can you please give me the number in the upper left hand 
corner, just below the screen title 'Customer Information'...
user:  1234567
Dev/Sys admin: just a moment... Ok got it. Now what is the exact 
problem
. [of course at this point getting the user to give useful 
information is next to impossible, but at least there is now a 
reference to the problem record/customer...]


On Sun, 6 Aug 2017 13:08:14 -0600, npdennis via 4D_Tech wrote:
> 
> On the other hand, you do model the data after business relations, 
> but the keys that tie that relation data need/should never be seen in 
> a well designed system. If a user readable key is needed by business, 
> then there should be another data piece that the user can read (like 
> an MRN, medical record number, or an abbreviation that is unique and 
> human readable) But these should never be used to link together data 
> in a structure in primary key foreign key relation.
> 
---
Gas is for washing parts
Alcohol is for drinkin'
Nitromethane is for racing 
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: Preventing Duplicate Data WAS UUID vs Longint primary key

2017-08-07 Thread David Adams via 4D_Tech
I've found that after hours "helper" routines that fire off and run in the
> background working through data to flag dupes for admin oversight next
day is
> popular with some managers who prefer to make their own decision about
whether
> some stuff really is a dupe or not. Some duplicate data has to be
eliminated
> "right now" before the record is saved; whereas some might be interesting
to
> investigate a little more in depth. The whys and wherefores --what caused
it,
> who caused it, where did it come from, and why, is it legit or a real
mistake;
> such questions can often lead to better processes, better training, better
> form design, better import provisioning, or better import pre-cleaning,
etc.
> It all depends on the project and biz needs.

Nicely described. Human judgement is pretty important or critical with some
"duplicate" row choices. Says the guy with a name so common he's seen his
name taken of a flight manifest because they already had one David Adams on
board. Man, that guy sucks. Absolutely agree about the training benefits of
*quick* feedback on duplicates. You run a scan at night, find a list of
possible duplicates and then can go and find the person responsible. Like
you said, the perfect opportuity for better data entry tools ("Oh, the big
screen is too hard to enter when we get a phone call, so we always just add
a new customr record"), better training (see previous) and, hopefully,
better communications amongst everyone involved.
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

​Re: Preventing Duplicate Data WAS UUID vs Longint primary key

2017-08-07 Thread steve simpson via 4D_Tech
On Mon, Aug 7, 2017 at 3:00 PM,
​
​
David Adams  wrote:

>
>
> > How do you deal with that problem (Preventing duplicate data)
>

Definitely
​
"Carefully program your system to detect and prevent duplicate rows" as
appropriate. Generally such a Dupe Check can take many forms depending on
the business needs, the data in question, and the data entry
process/environment. I've not yet found a solution that fits all. Generally
it is much easier to do this if the "data entry user" is a browser post -
where you have the time and space to do more complicated look-ups. (More
and more of my own projects are web front ends to 4D in the backend.)
 ​I've used similar constructions as John's ContactsDuplicateManager example
as well, although I steer away from storing extra data if I can. For less
immediacy data needs I've found that after hours "helper" routines that
fire off and run in the background working through data to flag dupes for
admin oversight next day is popular with some managers who prefer to make
their own decision about whether some stuff really is a dupe or not. Some
duplicate data has to be eliminated "right now" before the record is saved;
whereas some might be interesting to investigate a little more in depth.
The whys and wherefores --what caused it, who caused it, where did it come
from, and why, is it legit or a real mistake; such questions can often lead
to better processes, better training, better form design, better import
provisioning, or better import pre-cleaning, etc. It all depends on the
project and biz needs.

Steve Simpson
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: Preventing Duplicate Data WAS UUID vs Longint primary key

2017-08-07 Thread David Adams via 4D_Tech
As a variant on John's technique, I'll combine fields into a single text
block that I then run through a fast hashing algorithm that returns a
longint.

What good is that longint? It helps in two cases:

* If you're comparing two copies of the same record during an update/sync,
etc., then you can hash the new copy and see if the hash differs from the
stored original. If they match, you can figure there's been no update.
[Subject to availability, limitations apply. See 'hashing' for complete
terms and details.]

* When you don't know if the row is a duplicate or not, hash the incoming
data and see if it matches something else. If it  does not match, you've
got a new row. If it does match one or more rows, you *might* have a
duplicate, but at least you only need to gets a small # of records to find
out.

Regarding hashing, I don't use SHA1, MD5, etc. Because I don't need them
and don't want the overhead. Instead I use some hashing functions from an
old (10+ years ago) tech notes. They still work great and are a good match
for exactly this sort of application.
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: Preventing Duplicate Data WAS UUID vs Longint primary key

2017-08-07 Thread Jody Bevan via 4D_Tech
John:

Thanks for sharing this. I had not thought of this way after all these years. I 
will take the idea and apply where appropriate, in addition to my normal code 
to reduce duplicates.


Jody Bevan
ARGUS Productions Inc.
Developer

Argus Productions Inc. 




> On Aug 7, 2017, at 11:50 AM, John Baughman via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
> What I have done is to have a field in the table  that contains the keys that 
> make the record unique with all spaces, special characters, and vowels 
> removed, as well as eliminating any consecutive consonants . See my example 
> below. So far this has worked pretty well for me and I guess would fall under 
> David’s category of "Carefully program your system to detect and prevent 
> duplicate rows”
> 
> John
> 
> —— Example 
> $firstName:=[Contacts]firstName (John)
> $lastName:=[Contacts]lastName (Baughman)
> $company:=[Contacts]company (BY’te DESIGN Hawwaii)  notice I have mistakenly 
> put more than 1 w in Hawaii.
> $DupeCheck:= $firstName+ $lastName+$company
> [Contacts]DupeCheck:=AlphaOnlyNoVowels ($DupeCheck;"*”)  //AlphaOnlyNoVowels 
> does the heavy lifting. The asterisk tells the method to remove consecutive 
> consonants.
> 
> [Contacts]DupeCheck now contains  “JHNBGHMNBYTDSGNHW”
> 
> I wrap the above in a duplicate checking method for the [Contacts] table 
> called ContactsDuplicateManager
> 
> Whenever a record is updated or created in the Contacts table…
> 
> $DupeCheck:= ContactsDuplicateManager ("isDuplcate”;[Contacts]ID)
> 
> The ContactsDuplicate method creates the check string as above and searches 
> the contacts table for duplicates using the [Contacts]DupeCheck field. If no 
> duplicates are found it returns the check string. If a duplicate is found it 
> returns the check string with a prepended asterisk.  The contact ID if passed 
> prevents the dupe check from finding the record being updated. If this is a 
> new record 0 is passed for the Contact ID. So…
> 
> If ($DupeCheck =“*@“
>   Handle the duplicate in context. If, for example this is a user updating or 
> creating a contact record, warn the user of the possible duplicate with 
> available options.
> 
> else
>   [Contacts]DupeCheck:=$DupeCheck
>   SAVE RECORD([Contacts)
> 
> end if
> 
> -
> 

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: Preventing Duplicate Data WAS UUID vs Longint primary key

2017-08-07 Thread John Baughman via 4D_Tech
What I have done is to have a field in the table  that contains the keys that 
make the record unique with all spaces, special characters, and vowels removed, 
as well as eliminating any consecutive consonants . See my example below. So 
far this has worked pretty well for me and I guess would fall under David’s 
category of "Carefully program your system to detect and prevent duplicate rows”

John

—— Example 
$firstName:=[Contacts]firstName (John)
$lastName:=[Contacts]lastName (Baughman)
$company:=[Contacts]company (BY’te DESIGN Hawwaii)  notice I have mistakenly 
put more than 1 w in Hawaii.
$DupeCheck:= $firstName+ $lastName+$company
[Contacts]DupeCheck:=AlphaOnlyNoVowels ($DupeCheck;"*”)  //AlphaOnlyNoVowels 
does the heavy lifting. The asterisk tells the method to remove consecutive 
consonants.

[Contacts]DupeCheck now contains  “JHNBGHMNBYTDSGNHW”

I wrap the above in a duplicate checking method for the [Contacts] table called 
ContactsDuplicateManager

Whenever a record is updated or created in the Contacts table…

$DupeCheck:= ContactsDuplicateManager ("isDuplcate”;[Contacts]ID)

The ContactsDuplicate method creates the check string as above and searches the 
contacts table for duplicates using the [Contacts]DupeCheck field. If no 
duplicates are found it returns the check string. If a duplicate is found it 
returns the check string with a prepended asterisk.  The contact ID if passed 
prevents the dupe check from finding the record being updated. If this is a new 
record 0 is passed for the Contact ID. So…

If ($DupeCheck =“*@“
   Handle the duplicate in context. If, for example this is a user updating or 
creating a contact record, warn the user of the possible duplicate with 
available options.

else
   [Contacts]DupeCheck:=$DupeCheck
   SAVE RECORD([Contacts)
  
end if

-





> On Aug 7, 2017, at 6:56 AM, Dennis, Neil via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
>> How do you deal with that problem (Preventing duplicate data)
> 
> When unique data is required because of a business need, I do implement one 
> of your suggested methods: "Carefully program your system to detect and 
> prevent duplicate rows."
> 
> I would suggest not doing this in a trigger, but instead on data entry 
> (imports, user entry). The 4D command "Find in Field" works in many of these 
> cases.
> 
> Neil
> 
> 
> 
> 
> --
> 
> 
> Privacy Disclaimer: This message contains confidential information and is 
> intended only for the named addressee. If you are not the named addressee you 
> should not disseminate, distribute or copy this email. Please delete this 
> email from your system and notify the sender immediately by replying to this 
> email.  If you are not the intended recipient you are notified that 
> disclosing, copying, distributing or taking any action in reliance on the 
> contents of this information is strictly prohibited.
> 
> The Alternative Investments division of UMB Fund Services provides a full 
> range of services to hedge funds, funds of funds and private equity funds.  
> Any tax advice in this communication is not intended to be used, and cannot 
> be used, by a client or any other person or entity for the purpose of (a) 
> avoiding penalties that may be imposed on any taxpayer or (b) promoting, 
> marketing, or recommending to another party any matter addressed herein.
> **
> 4D Internet Users Group (4D iNUG)
> FAQ:  http://lists.4d.com/faqnug.html
> Archive:  http://lists.4d.com/archives.html
> Options: http://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **

John Baughman
Kailua, Hawaii
(808) 262-0328
john...@hawaii.rr.com





**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: Preventing Duplicate Data WAS UUID vs Longint primary key

2017-08-07 Thread Chuck Miller via 4D_Tech
perhaps you can use sounded and store that in a hidden field.

Regards
Chuck

 Chuck Miller Voice: (617) 739-0306
 Informed Solutions, Inc. Fax: (617) 232-1064   
 mailto:cjmillerinformed-solutions.com 
 Brookline, MA 02446 USA Registered 4D Developer
   Providers of 4D and Sybase connectivity
  http://www.informed-solutions.com  

This message and any attached documents contain information which may be 
confidential, subject to privilege or exempt from disclosure under applicable 
law.  These materials are intended only for the use of the intended recipient. 
If you are not the intended recipient of this transmission, you are hereby 
notified that any distribution, disclosure, printing, copying, storage, 
modification or the taking of any action in reliance upon this transmission is 
strictly prohibited.  Delivery of this message to any person other than the 
intended recipient shall not compromise or waive such confidentiality, 
privilege or exemption from disclosure as to this communication. 

> On Aug 7, 2017, at 12:56 PM, Dennis, Neil via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
> When unique data is required because of a business need, I do implement one 
> of your suggested methods: "Carefully program your system to detect and 
> prevent duplicate rows."
> 
> I would suggest not doing this in a trigger, but instead on data entry 
> (imports, user entry). The 4D command "Find in Field" works in many of these 
> cases.
> 
> Neil

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Preventing Duplicate Data WAS UUID vs Longint primary key

2017-08-07 Thread Dennis, Neil via 4D_Tech
> How do you deal with that problem (Preventing duplicate data)

When unique data is required because of a business need, I do implement one of 
your suggested methods: "Carefully program your system to detect and prevent 
duplicate rows."

I would suggest not doing this in a trigger, but instead on data entry 
(imports, user entry). The 4D command "Find in Field" works in many of these 
cases.

Neil




--


Privacy Disclaimer: This message contains confidential information and is 
intended only for the named addressee. If you are not the named addressee you 
should not disseminate, distribute or copy this email. Please delete this email 
from your system and notify the sender immediately by replying to this email.  
If you are not the intended recipient you are notified that disclosing, 
copying, distributing or taking any action in reliance on the contents of this 
information is strictly prohibited.

The Alternative Investments division of UMB Fund Services provides a full range 
of services to hedge funds, funds of funds and private equity funds.  Any tax 
advice in this communication is not intended to be used, and cannot be used, by 
a client or any other person or entity for the purpose of (a) avoiding 
penalties that may be imposed on any taxpayer or (b) promoting, marketing, or 
recommending to another party any matter addressed herein.
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-07 Thread David Adams via 4D_Tech
Since it sounds like pretty much everyone uses UUIDs for links (me too,
although I may not have an actual arrow drawn), maybe se can turn this
thread in a different direction? Given that random synthetic keys do
nothing to avoid duplicate rows - and may actually make duplicate rows more
likely - what do people do to avoid duplicates. To be clear what I mean by
a duplicate:

1  Keller  123
2  Keller 123
3  Keler 123

Rows 1 and 2 are duplicates, row 3 is not. It might be messy/bad data, it
might not - but there's no way for the engine to know that. 1 and 2 are
duplicates, despite having unique made up numbers for a key. So, the
question is: How do you deal with that problem? Here are some solutions
I've used and/or seen:

* Cover your ears and say "neyh, neyk, neyh I CAN'T HEAR YOUR!" It's not a
great technique, but it is simple and cheap to implement.

* Ignore the possibility of duplicate rows unless/until it obviously bites
you. Depending on data, this might be good enough. In same cases, it's just
not acceptable. (Screwing up 100 log Web lines out of 1,000,000?
Meaningless. Screwing up 100 transactions in your accounts? Yeah, not
allowed.)

* Carefully program your system to detect and prevent duplicate rows:
-- at point of entry (client-side/UI, etc.)
-- at the engine level (triggers, indexes)

* Do after-the-fact audits. Either on demand or on a schedule.

That's pretty much the world of techniques I've seen.

Any thoughts?
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-07 Thread Charles Miller via 4D_Tech
It might be my memory, but I do not think that is how it worked in the 
beginning I was working on relational model dbs before SQL was a language

Rdb at DEC for example.

That was one of the ket differences between model types how you created and 
maintained the relations. I do not believe that there was even a thought of 
primary keys

Regards

Chuck

 Chuck Miller Voice: (617) 739-0306
 Informed Solutions, Inc. Fax: (617) 232-1064   
mailto:miller.cjaygmail.com 
 Brookline, MA 02446 USA Registered 4D Developer
   Providers of 4D and Sybase connectivity
  http://www.informed-solutions.com  



> On Aug 7, 2017, at 12:14 PM, steve simpson via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
>> 
>> 
>> On the other hand, you do model the data after business relations, but the
>> keys that tie that relation data need/should never be seen in a well
>> designed system. If a user readable key is needed by business, then there
>> should be another data piece that the user can read (like an MRN, medical
>> record number, or an abbreviation that is unique and human readable) But
>> these should never be used to link together data in a structure in primary
>> key foreign key relation.
>> ​ [snip]​
>> 
>> 
>> 
> +1 My point exactly, only stated much better.​

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

​Re: UUID vs Longint primary key

2017-08-07 Thread steve simpson via 4D_Tech
On Mon, Aug 7, 2017 at 11:51 AM, <
​
<4d_tech-requ...@lists.4d.com>
​
npden...@greatext.com> wrote:

>
> ​[snip]
>
> On the other hand, you do model the data after business relations, but the
> keys that tie that relation data need/should never be seen in a well
> designed system. If a user readable key is needed by business, then there
> should be another data piece that the user can read (like an MRN, medical
> record number, or an abbreviation that is unique and human readable) But
> these should never be used to link together data in a structure in primary
> key foreign key relation.
> ​ [snip]​
>
>
>
+1 My point exactly, only stated much better.​
​
​

Steve Simpson
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-07 Thread David Adams via 4D_Tech
> "Yes, use UUIDs in a 4D context rather than SeqNos. And, by the way, the
way 4D implemented these
> UUIDs - I don't approve with it."

Not really.

The difference between sequence numbers (your own or 4D's) and UUIDs isn't
something I addressed. Don't care. Whatever is better in your system -
conceptually they're more alike than different.

I've been saying pretty plainly that I'm happy to have UUIDs natively in 4D
and that I use them for links all the time.
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-07 Thread Marcus Straßmann via 4D_Tech
David,

in short, you are answering the OP's question as:

"Yes, use UUIDs in a 4D context rather than SeqNos. And, by the way, the way 4D 
implemented these UUIDs - I don't approve with it."

Fine :-)

Marcus


MacStrass - Marcus Straßmann
Softwareentwicklung und Beratung
Auf der Markscheide 35
D-44807 Bochum

Mobil: +49 (173) 374 39 92
eMail: macstr...@macstrass.de

> Am 04.08.2017 um 21:52 schrieb David Adams via 4D_Tech <4d_tech@lists.4d.com>:
> 
> 4D's UUIDs function as globally unique row *serial numbers*. That's great
> for backups and convenient for physical relations, but it has exactly zero
> to do with a real "primary key" or relational integrity. I would have
> preferred to see UUIDs implemented as a virtual field:
> 
> [Customer].UUID()
> 
> or
> 
> Get record UUID ([Customer])
> 
> ...and let 4D manage them internally, automatically, and invisibly. But
> that's not how it worked out.
> 
> Yeah, UUIDs are impossible to type and I'm still not 100% clear on how you
> search for empty ones. I've seen the tech tips etc., and still get some
> really eerie weird results at times.
> 
> In my own code, I tend to run verification methods that check for real
> (logical) keys. Namely, the field or combination of fields that uniquely
> identifies the *data* in the row. A UUID-as-physical-key very easily masks
> duplicate rows because it is, by definition, unique. You row might be
> unique, but your data might not be, if you know what I'm saying.
> 
> Anyway, I tend to scan for duplicate data, orphan records (a child record
> that *requires* a parent and doesn't have one is an orphan. Not all child
> records require parents, but 4D's diagrams, etc. don't distinguish between
> the two.) Oh, an widows. Mostly, I find lurking duplicates and orphans.
> Widows? Not so often. Not sure if that's just me. In theory, none of these
> things should ever happen. But then again, a whole lot of things work out
> better in theory...
> **
> 4D Internet Users Group (4D iNUG)
> FAQ:  http://lists.4d.com/faqnug.html
> Archive:  http://lists.4d.com/archives.html
> Options: http://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-06 Thread David Adams via 4D_Tech
On Sun, Aug 6, 2017 at 5:33 PM, steve simpson via 4D_Tech <
4d_tech@lists.4d.com> wrote:

> David and Chuck, point taken about "rows" being unique because of and
> created with the containing data of each row -- up to a ... well "a point".




The misunderstand of relational theory, design and practice - and open
hostility towards it - is peculiar to the 4D world. With *any* other RDBMS,
all of the stuff I mentioned is just part of what people know and assume.
Because science.

So, I won't be getting into it more than that. But will say, it's well
worth understanding what normalization *actually* is, what it's good for,
what it's not good for, and how you deal with that. It's also worth
understand that transactionally-oriented databases are ideal for a wide
range or problems, ORDBS for a somewhat wider range of problems, and
doc-oriented/NoSQL databases for a very narrow range of problems. There are
good applications for each. (And you can often do different design
topologies in the same RDBMS, even 4D.)

Dan and I wrote a nice summary of normalization in our book and it's up
somewhere for download as a PDF.  A couple of college professors have
contacted us down the years to see about using it in class. ("Of course!")
because the standard database theory texts are, well, dense to the point of
incomprehensibility...what Dan and I wrote is pretty straightforward and
practical.

And lest anyone get hung up on the word theory:
1) I like the theory *because* it helps my practical work.
2) In science, a theory is more substantial than a law. (Just
saying...that's how the words work, but it's commonly misunderstood.)

NB: The whole "normalization sucks" business would make you look really,
really odd in any other RDBMS community so, if anyone ventures out into the
wider world, make sure that you 1) understand normalization, 2) understand
the difference between logical and physical models, and 3) don't tell
anyone that you think normalization doesn't make sense.

Adams out.
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

​Re: UUID vs Longint primary key

2017-08-06 Thread steve simpson via 4D_Tech
David and Chuck, point taken about "rows" being unique because of and
created with the containing data of each row -- up to a ... well "a point".
Like you said, a real pain when real world reality steps in. I believe the
only truism involved here is your statement that "...
​i
t then leaves the task of doing integrity checks and so on to you.
​" Exactly! Regardless of the way you decide to handle 4D record
uniqueness, you will surely encounter ​the ongoing need to continually
manage integrity "on your own". Hoping that your "unique key" will do that
for you is likely not realistic, regardless of your method.

I'm sure that many of us have had to eventually abandon the 4D long int
"auto increment" in favor of some sort of crafted version of it when we
expanded to mixing data sources. For instance, you build and distribute an
application that starts up out the gate with recordID #1, then #2, then #3,
etc with each install. But  5 years later the client wants everyone - all 4
thousand installs - to start sending data to a central accounting
processing point -- retaining uniqueness across the enterprise. Oooops.
Suddenly, you have to make sure that each and every installation has some
sort of retrofitted unique number, something not envisioned when it was
first developed. Whoa. Big problem.

I must say that the only "real" solution to this so far in the 4D world -
for me at least - has been the auto UUID key field. Finally a way to create
a record (and its children) that can be moved around willy-nilly without
angst. OK, granted it does not solve the issue of "integrity" from row to
row. But neither does your own example of finding uniqueness within each
row's actual data (i.e.
​
Keller123, 1
​
Keller123, 2
​
Keller123, etc); that is a task you have to deal with regardless of how you
handle your own unique key - logical ways - important realistic vertical
market biz logic ways -- issues that may be unique to each project -- logic
that may change over time. It has NOTHING to do with the record key in of
of itself. Chuck Miller said in response, "...
​
I am not David but I agree with his assessment. Relational model databases
by definition are not supposed to use keys that have no meaning. They are
supposed to create relations that have meaning". All I can say is that
"meaning" per row, and meaning contained within children rows -- especially
meaning to "humans" -- may not really exist and may have little to cling to
or may even change as projects expand and morph.

I'm really just reacting to what I experience as "theory" mismatch with
real world reality. We need to separate "record uniqueness" as an idea from
"meaning" of the data within a row, since no matter how you handle record
uniqueness you will still need to craft your own unique project needs to
insure "data integrity", which can take many forms and have many challenges
beyond some simple "row" logic. And realize that these needs will likely
change over time as the project morphs. So the first need is to handle "4D
record ID uniqueness". (and in my experience so far, ONLY UUIDs do that
satisfactorily.) Then address your project specific "data integrity" needs
and understand that you'll have to deal with that on an ongoing basis
regardless of how you handle record uniqueness as your app grows and morphs
and expands and begins to interact with other outside data sources.

And BTW, we all have users who get a lot of info from the visual feedback
of incremental "record numbers". OK, no problem. Just another field:
"record number", far different from "unique internal record KEYID". And
likely understandable and fixable for human needs if dupes are introduced
with data mergers. Just another "user comprehension data integrity" task.

Steve Simpson
​Cimarron Software

On Sun, Aug 6, 2017 at 3:00 PM, <4d_tech-requ...@lists.4d.com> wrote:

>
>
> Message: 3
> Date: Sun, 6 Aug 2017 11:03:57 -0400
> From: Chuck Miller <cjmil...@informed-solutions.com>
> To: 4D iNug Technical <4d_tech@lists.4d.com>
> Subject: Re: ​
> ​​
> Re: UUID vs Longint primary key
>
> ​​
> I am not David but I agree with his assessment. Relational model databases
> by definition are not supposed to use keys that have no meaning. They are
> supposed to create relations that have meaning. Even the use of numbers
> breaks the theoretical rule. The problem is that we have pushed all to use
> relational model rather that what he logical model proposes. Who out there
> remembers hierarchical, tree, or inverted structure models.
>
> Regards
> Chuck
> --
>
> Message: 4
> Date: Sun, 6 Aug 2017 08:16:07 -0700
> From: David Adams &l

Re: ​Re: UUID vs Longint primary key

2017-08-06 Thread David Adams via 4D_Tech
> Relational model databases by definition are not supposed to use keys
that have no meaning.

On Sun, Aug 6, 2017 at 12:08 PM, npdennis via 4D_Tech <4d_tech@lists.4d.com>
wrote:
> You may have stated that backwards…

Nope, that's exactly what I meant. But apologies for not making the point
more carefully. I'll make up for that now.

> relational model databases by design should never, never, never use
business data for keys.
> This was the number one rule drilled into me by my college relational
database classes. I
> found this to be very wise advice in the real world.

I agree! But there are two different pieces to this. In the one part, no
synthetic keys and in the other, synthetic keys. When you did your RDBMS
classes, they probably discussed formal design versus system
implementation. When you're creating a normalized *design*, you don't have
artificial/synthetic keys - because your focus is on the pure data itself
and such keys have no meaning. (They're a database artifact, not part of
the real world entity you're modeling.) The "real" primary key gets sorted
out when you're putting your design into second or third "normal form."
(Even if you don't think of it in those terms. I mean the actual design
step, whatever name you use - if you even use a name.) When you go to
*implement* your tables, it's sensible to use a synthetic key. So, in my
example, I used

Building + Room

...because that's enough to uniquely identify each room with no spare
values in the field, no redundant rows, and nothing that belongs in another
table. (Well, you would have a unique set of Building records, as noted in
the earlier post.) So, that's a correct and sensible *design* and is how
I'd think about the actual data. But implementation? Another story. The
building is renamed and I have to go back and update all of the related
room records? PITA.

It's 100% normal to think about the formal table design and the physical
implementation as distinct steps with different rules. And your VIN
example, like Social Security Number, is an excellent example of a "natural
key" that turns out not to be a great idea. I use synthetic keys for *all*
of my tables (well, there might be a rare exception where I use a natural
key, but that's super rare...and I probably end up regretting that most of
the time.)

In my earlier post I was just assuming that the distinction between design
and implementation was clear, but those steps aren't distinct in 4D. (They
are distinct in design tools and used to be a pretty common concept - they
still are distinct in design-oriented RDBMS development tools.)

There's a natural tension here, and there are two competing tasks that
require work to handle. You only get one of them for free, the other you
have to take care of on your own.

* Natural keys and multi-field keys the truly define a row are a *pain* in
the real world where "unique" values aren't always entered
correctly/stable/etc. To avoid hassles with updates, synthetic keys are
super helpful. They give you _stable keys_. (Mutable keys are absolutely a
pain.)

 Synthetic keys take away the pain of related table changes on UPDATE.

* Synthetic keys have _nothing_ to do with the real data in the row. That's
kind of the point. Which means that you get *no help* in preventing
duplicate rows of data. Like my example of the duplicate building + room
with different IDs.

 Synthetic keys give you the pain of having to hand-check for
duplicates.

The problem shows up when you have "unique" rows with duplicate data. That
gets ugly in a big fat hurry when it's a 1 table and children link to
different rows in the 1 table that aren't really "different." Note that I'm
not talking here about the kind of messy data - like customer names and
addresses - where you need to merge records. That's a related but different
question. My building + room example is nice and clean. The data is
identical, it's just that somehow a duplicate got entered. (I don't want to
muddy the waters with messy data, it's not necessary or helpful for the
current discussion.)

Here's a possibly apocryphal tangent from my first years in the work world.
Back when things were typed. On typewriters. It was customary to put at the
bottom of each page 1/5, 2/5 and so on. I heard tell of a report that had a
sixth page added. Rather than retype all of it, here's what the boss got
back at the bottom of the six pages:

1/5
2/5
3/5
4/5
6/6
Not the same thing, but I always thought of it as a good example ;-)

In 4D, you get a benefit from using synthetic keys - you get stable keys
for update for free. I'll take that deal! But you get no automatic help
with duplicate control. You can add this by implementing a compound index
marked as unique. Or you can write custom scanner code to audit your data.
The exact options depend on the database, but the work is required either
way. In 4D databases, what I most often see is that people aren't checking
for duplicate rows either at the engine level 

Re: ​Re: UUID vs Longint primary key

2017-08-06 Thread npdennis via 4D_Tech
> Relational model databases by definition are not supposed to use keys that 
> have no meaning.

You may have stated that backwards… relational model databases by design should 
never, never, never use business data for keys. This was the number one rule 
drilled into me by my college relational database classes. I found this to be 
very wise advice in the real world. It came in handy when building a database 
for vehicles and the suggestion was to use VIN as a primary key since they were 
unique… it turns our that as unique as VIN should be, they are not… it was no 
problem for out database to handle vehicles with the same VIN because we 
followed the non-business-data-for-primary-key rule.

On the other hand, you do model the data after business relations, but the keys 
that tie that relation data need/should never be seen in a well designed 
system. If a user readable key is needed by business, then there should be 
another data piece that the user can read (like an MRN, medical record number, 
or an abbreviation that is unique and human readable) But these should never be 
used to link together data in a structure in primary key foreign key relation.

Just my two cents.

Neil

--
Neil Dennis
4D Developer since 1990


GreaText - Designing Software for the Way You Work
716 East 1850 N
North Ogden, UT 84414

mailto:npden...@greatext.com
http://www.greatext.com/



**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-06 Thread David Adams via 4D_Tech
On Sun, Aug 6, 2017 at 7:47 AM, steve simpson via 4D_Tech <
4d_tech@lists.4d.com> wrote:

> On  Fri, 4 Aug 2017 12:52:28 -0700
> ​
> David Adams 
> ​ wrote:​
>
> >
> > ​[snip] ​
> > 4D's UUIDs function as globally unique row *serial numbers*. That's great
> > for backups and convenient for physical relations, but it has exactly
> zero
> > to do with a real "primary key" or relational integrity.
> > ​ [snip]​
>
>
Sure, but only briefly (collective sigh of relief, I'm sure.)

When you design a relational database, you sort out the "keys" by finding
the field or combination of fields in a table that make the row unique. The
next part of that design is that each field in the row should be about the
row and nothing but the row. What does a UUID or longint ID have to do with
the data in the row? Nothing at all. This kind of "synthetic key" is an
implementation-level accommodation. It has nothing to do with the data,
it's just something bolted on for practical, real-world reasons. For
example, you might have a unique customer name (just saying) and they might
want to change it later. Ugh! With a unique-but-meaningless ID, that's no
problem.

I use synthetic keys all of the time and have forever. So, no quarrel with
using them for links and so on. But they do bring on another problem: These
sorts of IDs not only do not maintain the uniqueness of your rows, they
very easily obscure duplicate rows.

Imagine that you've got an asset management system and it stores the
location of each desk in a facility. There's a table for every room in the
facility. So, you've got rooms and desks. Both are finite, both are real
and tangible. So, over in your room table youve got something like

Building Keller
Room123

Back in the real world, the building names are unique and, within a
building, room names or numbers are unique. So, you've got a unique row
here. Imagine using that data as a key:

Keller123

That's a sound key and the row design is also solid. But this kind of key
is a real pain down the row when things change. It may be that a new
numbering scheme is introduced or a building is renamed for some reason. (A
big donation, etc.) The actual collection of real-world buildings and rooms
hasn't changed (in this example), but you need to update the records to
reflect the new real-world scheme. So you have to go back and change the
[Room] record, and then any linked [Desk] records, or any other links. Ugh.

I'll be most, if not all of us, got burned by using a "real" key like this
early on. I did. So, longints and now UUIDs. Easier! So, now I'll use a
longint:

1  Keller 123

I link on 1, I can rename the building or room, no drama.

So what was I talking about? So far I'm a fan of using longints/UUIds (I
am.) The problem is this:

1  Keller 123
2  Keller 123
3  Keller 124

You've got unique *rows* but duplicated *data*. There are two Keller 123
records - but they've got different unique longints bolted on so they
aren't treated as duplicates. The "unique" IDs are masking the duplicates.
Notice that the "unique ID" has nothing at all to do with the rooms in the
world. It's just a unique number. Unique rows, duplicate data.

A UUID, longint, etc. is a practical way to link records (and the one I
use), but it then leaves the task of doing integrity checks and so on to
you.

As I understand it, UUIDs were added to provide globally unique serial
numbers to records for the sake of journaled backups. That's a good goal,
and 4D Backups seems to work really well now.  I never upgraded my main
source from V13 because, for quite some time, it was a bit of a mess to
sort out the UUIDs and I couldn't be blethered to pour the time into
figuring out. After a few years, I found a reliable set of SQL instructions
to do a retrofit. In any case, I'm working on something now that I believe
began in V15 so it's already been sorted out there from the start.
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: ​Re: UUID vs Longint primary key

2017-08-06 Thread Chuck Miller via 4D_Tech
I am not David but I agree with his assessment. Relational model databases by 
definition are not supposed to use keys that have no meaning. They are supposed 
to create relations that have meaning. Even the use of numbers breaks the 
theoretical rule. The problem is that we have pushed all to use relational 
model rather that what he logical model proposes. Who out there remembers 
hierarchical, tree, or inverted structure models. 

Regards 
Chuck

Sent from my iPhone

> On Aug 6, 2017, at 10:47 AM, steve simpson via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
> On  Fri, 4 Aug 2017 12:52:28 -0700
> ​
> David Adams 
> ​ wrote:​
> 
>> 
>> ​[snip] ​
>> 4D's UUIDs function as globally unique row *serial numbers*. That's great
>> for backups and convenient for physical relations, but it has exactly zero
>> to do with a real "primary key" or relational integrity.
>> ​ [snip]​
>> 
> 
> ​Care to elaborate on that statement? I don't get why you'd say that.​
> _
> Steve Simpson
> **
> 4D Internet Users Group (4D iNUG)
> FAQ:  http://lists.4d.com/faqnug.html
> Archive:  http://lists.4d.com/archives.html
> Options: http://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

​Re: UUID vs Longint primary key

2017-08-06 Thread steve simpson via 4D_Tech
On  Fri, 4 Aug 2017 12:52:28 -0700
​
David Adams 
​ wrote:​

>
> ​[snip] ​
> 4D's UUIDs function as globally unique row *serial numbers*. That's great
> for backups and convenient for physical relations, but it has exactly zero
> to do with a real "primary key" or relational integrity.
> ​ [snip]​
>

​Care to elaborate on that statement? I don't get why you'd say that.​
_
Steve Simpson
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-04 Thread Nigel Greenlee via 4D_Tech
+2

Nigel Greenlee


> On 4 Aug 2017, at 12:09, Marcus Straßmann via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
> +1
> 
> 
> MacStrass - Marcus Straßmann
> Softwareentwicklung und Beratung
> Auf der Markscheide 35
> D-44807 Bochum
> 
> Mobil: +49 (173) 374 39 92
> eMail: macstr...@macstrass.de
> 
> Am 04.08.2017 um 10:39 schrieb Herr Alexander Heintz via 4D_Tech 
> <4d_tech@lists.4d.com>:
> 
>>> Someone can explain when is better use UUID and when Longint field in 
>>> primary key?
>> 
>> If you can even remotely think of a scenario where you might ever be 
>> challenged to merge databases, go with a UUID!!!
>> It will save you incredible headaches!
>> Switched to UUIDs long ago, never looked back, best decision ever!
>> 
>> Of course if such a scenario is unthinkable, you can use longints
>> 
>> **
>> 4D Internet Users Group (4D iNUG)
>> FAQ:  http://lists.4d.com/faqnug.html
>> Archive:  http://lists.4d.com/archives.html
>> Options: http://lists.4d.com/mailman/options/4d_tech
>> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
>> **
> 
> **
> 4D Internet Users Group (4D iNUG)
> FAQ:  http://lists.4d.com/faqnug.html
> Archive:  http://lists.4d.com/archives.html
> Options: http://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-04 Thread Marcus Straßmann via 4D_Tech
+1


MacStrass - Marcus Straßmann
Softwareentwicklung und Beratung
Auf der Markscheide 35
D-44807 Bochum

Mobil: +49 (173) 374 39 92
eMail: macstr...@macstrass.de

Am 04.08.2017 um 10:39 schrieb Herr Alexander Heintz via 4D_Tech 
<4d_tech@lists.4d.com>:

>> Someone can explain when is better use UUID and when Longint field in 
>> primary key?
> 
> If you can even remotely think of a scenario where you might ever be 
> challenged to merge databases, go with a UUID!!!
> It will save you incredible headaches!
> Switched to UUIDs long ago, never looked back, best decision ever!
> 
> Of course if such a scenario is unthinkable, you can use longints
> 
> **
> 4D Internet Users Group (4D iNUG)
> FAQ:  http://lists.4d.com/faqnug.html
> Archive:  http://lists.4d.com/archives.html
> Options: http://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-04 Thread Kirk Brooks via 4D_Tech
I'll add my vote to Jody's approach.

I prefer UUIDs for primary keys for all the reasons mentioned.
Having a longint index field is useful as well but for different reasons as
Jody explains. My most common use is to be able to quickly sort a table in
the order records were created. And it is superior for referencing records
- attempting to type a UUID is not something one wants to do.

On Fri, Aug 4, 2017 at 6:51 AM, Jody Bevan via 4D_Tech <4d_tech@lists.4d.com
> wrote:

> On another note, we assign a counter ourselves (not auto generated). There
> are times where we reset counters and being able to do that is great.
>

Resetting the counter can be done with ​SET DATABASE PARAMETER ( [Table] ;
Table sequence number ; value ). I haven't had any issues with bad sequence
numbers since about vers 6.

-- 
Kirk Brooks
San Francisco, CA
===

*The only thing necessary for the triumph of evil is for good men to do
nothing.*

*- Edmund Burke*
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-04 Thread Jody Bevan via 4D_Tech
So what might be not so good with UUIDs? 

After watching how people work with our systems for 26 years (most of it 
without UUID capability) I noticed that they often referred to the record by 
the Unique LongInt we created for each record. For the first year we hid our 
unique ID - why would they want that. Then we exposed it to the admins for a 
year. Then through popular request we exposed it to everyone. Then we not only 
exposed it but changed the name of the display of the field to JonokeMed #. 
Referring to record # 150, or 275,983 is much easier than referring to a long 
UUID.

A UUID just is not a workable solution for that - at least in my mind. 
Therefore our solution today is that we do use UUIDs for all records for the 
reason of any possibility of merging records. We have had clients move across 
the Canada to join up with another client, and they need the records merged. 
Arrh a lot of work to write that the first time, and then continue to support 
it with each new table.

We use the UUID as the Primary Key record, but still have a Key field. This 
number is exposed to the end users to use as they wish. On another note, we 
assign a counter ourselves (not auto generated). There are times where we reset 
counters and being able to do that is great. 

Jody Bevan
ARGUS Productions Inc.
Developer

Argus Productions Inc. 




> On Aug 4, 2017, at 7:40 AM, Keith Culotta via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
> The UUID has been the easiest to work with.  When a customer needs a 
> recognizable ID, an auto-incremented Longint can be added to the table.  
> UUIDs are perfect for avoiding conflicts when updating or exchanging records 
> between databases.  They can also be used as Process IDs or SVG IDs when one 
> of those is linked to a record.  All this can be accomplished with a Longint, 
> but it's more straightforward with UUIDs.  
> 
> Keith - CDI
> 
>> On Aug 4, 2017, at 3:27 AM, stardata.info via 4D_Tech <4d_tech@lists.4d.com> 
>> wrote:
>> 
>> Hi All,
>> 
>> Someone can explain when is better use UUID and when Longint field in 
>> primary key?
>> 
>> Thanks
>> Ferdinando

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-04 Thread npdennis via 4D_Tech
> I can’t speak to any performance differences when using them as indexed key 
> fields for searches … they likely have similar performance, but that’s just a 
> guess.

I can speak to the performance, they are not the same. UUID is faster.

4D uses BTrees (binary trees or sometimes called red/black trees) to store 
index structures. A sequential number is the slowest possible way to create or 
add to a BTree. The treee needs to remain balanced, so there is a lot of 
rebalancing the needs to occur when numbers are added sequentially. UUID’s are 
much more random and require less tree balancing and are therefore faster for 
delete, create operations.

I started using UUID since version 11 and love them.

--
Neil Dennis
4D Developer since 1990


GreaText - Designing Software for the Way You Work
716 East 1850 N
North Ogden, UT 84414

mailto:npden...@greatext.com
http://www.greatext.com/


**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: UUID vs Longint primary key

2017-08-04 Thread Ronald Rosell via 4D_Tech
I can’t speak to any performance differences when using them as indexed key 
fields for searches … they likely have similar performance, but that’s just a 
guess.

But here’s one situation where UUID has an advantage.  We historically used the 
Sequence Number (a longint) as the primary key, until we discovered a 
shortcoming:  Merging data from multiple data files leads to duplicated primary 
keys.  With a sequence number, unless you somehow modify it (modifying the 
automatically-generated Sequence Number with some sort of instance prefix for 
the datafile) you will run into situations where different datafiles will have 
the same sequence numbers for various records.  That’s not a problem if those 
files will forever remain separate, but it requires a lot of re-sequencing of 
those tables and related tables if the files are ever merged or need to 
communicate with one another while referencing primary keys.

With UUID, it is highly unlikely (nearly impossible) that two data files will 
generate the same UUID for any records;  the chances are nearly 100% that all 
of the UUID’s will be unique, across every instance of your database.  So, if 
you later set up some kind of inter-database communication referencing records 
by primary key, or have to merge data files, you’re much less likely to run 
into issues.
__

Ron Rosell
President
StreamLMS


> On Aug 4, 2017, at 1:27 AM, stardata.info via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
> Hi All,
> 
> Someone can explain when is better use UUID and when Longint field in primary 
> key?
> 
> Thanks
> Ferdinando
> **
> 4D Internet Users Group (4D iNUG)
> FAQ:  http://lists.4d.com/faqnug.html
> Archive:  http://lists.4d.com/archives.html
> Options: http://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **

**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

UUID vs Longint primary key

2017-08-04 Thread stardata.info via 4D_Tech

Hi All,

Someone can explain when is better use UUID and when Longint field in 
primary key?


Thanks
Ferdinando
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**