<<
I have a datetime field in a large table where, obviously, almost every entry 
has a different value.  We never search for a specific value except for maybe a 
DEXTRACT where I assume an index on the field would be useless unless we add a 
computed column for that value, right?  But we often search where the field is 
not null or null.   Do indexes help with a "is null"?    And if they would, I 
assume I would decide whether to index based on how many are null; if 50% are 
null then an index probably wouldn't help but if only 5% are null then it might.

>>

Everything you said is correct (and NULLs are stored and detected by indexes).

However, when the goal is to index for only a tiny number of rows in a table 
(that is, you're only interested in finding the 1% of rows with NULL values, or 
the 1% with a 'Y' in a certain column) you might consider an alternative 
strategy.  For instance, you can create a table with only one column, the 
primary key column from the table you want to search.  Then, you can drop the 
primary key value for each row that has a NULL value in the column you're 
concerned about.  To "search" you join this table back to the original.  This 
requires more programming work but avoids the overhead of an extra index just 
to find that very small number of rows.

In some cases, you can do this without the extra table.  For instance, you'll 
often see a PhoneNumbers table with a column IsDefault that is supposed to 
contain 'Y' for one default phone number per person.  A better way to implement 
this is instead to put a DefaultPhoneNumberID column on the People table.  
There isn't enough information to know whether a structure like that might make 
sense in your situation.

Also, note that you can convert DEXTRACT searches to use the DATETIME column 
index directly if you express them as BETWEEN 01/01/2012 00:00:00 AND 
12/31/2012 11:59:59.  You can let the user type in dates and messy construction 
of the SQL where they can't see it.

And, finally, I think the computed DEXTRACT(DateTimeColumn) solution that you 
suggested (with an index on the computed column) is a perfectly valid 
optimization.
--
Larry

Reply via email to