On Thu, 9 Jun 2005, Steve Dibb wrote:

I'm working on creating a new table in our database, and I'm wondering -- why bother with the possibility of NULL values in a field? It seems like it always adds more work then its worth.

What's the benefit of adding the option of IS NULL when you usually have to check to see if its also blank (field = '' OR field IS NULL)?

Others have already covered the NULL != relational aspect pretty well, but here's another way to look at it. The relational model is based on set theory and predicate logic. If you've taken (what used to be called) CS 235 at BYU (or was it 236...), then you're familiar with predicate logic.

If you look at each table (relation) in your database, you can see each one as representing a particular 'predicate' in the area you're trying to model. For example a table:

Employee
--------
id
first name
last name

represents the predicate:

Employee [ID] is known by [First Name] [Last Name]

In predicate theory, which is based on 2-valued-logic (2VL), each predicate can evaluate to either true or false. Each row (tuple) in your table (relation) contains values that you can "plug-in" to your predicate to provide true assertions about the enterprise you're trying to model.

E.g.,

Employee 2 is known by Jim Jones
Employee 3 is known as Bill Brown
etc.

If you take the logical AND of each of the rows (tuples) in your table (relation) you then have a (possibly very long) assertion of truth about your enterprise. These represent the "axioms" of information from which you can extrapolate further information about your enterprise using relational operators.

I wont go into detail here, but a query (and if you've taken CS 236, you remember the proofs you did with predicate logic) is nothing more than a proof using the axioms metioned above to forumlate new predicates evaluating true about your enterprise. If you have designed your database well, and have stuck with 2VL (true and false) then you can exercise a large degree of trust in the results of your queries, because the results are mathematically provable.

Now when you introduce NULL into the picture, all of a sudden your predicates are no longer 2VL assertions, since now you have true, false, and "maybe". You've now moved into 3VL which is much more complex. And pretty much your ease to mathematically prove (as you can with 2VL) the validity of your query results just went out the window.

Designing databases without using nulls is challenging, but I have found, in my experience, is well worth the effort.

For further reading, I'd recommend O'Reilly's new book by C. J. Date called Database in Depth. For anyone who works with databases, a good grasp of predicate logic and set theory will do wonders for your understanding of what "relational databases" are all about. Yes it's "theory", but it's extremely practical theory.

FWIW,

-- Dan


.===================================.
| This has been a P.L.U.G. mailing. |
|      Don't Fear the Penguin.      |
|  IRC: #utah at irc.freenode.net   |
`==================================='

Reply via email to