I don't think I am willing to say generated columns are not useful without
indexes, sorry if previous email led to that, actually I have not
actually thought much about it.
I believe I was letting
my predjudice through, I have always thought that an interesting set
of problems could be solved if the optimizer work was done - so I
was concentrating on that. And I see that optimizer work as a stepping
stone to even more possible interesting stuff in the future - things
like gist indexes.
Having said that, I liked the generated column syntax since it seemed to
be a standard syntax, which seemed better than coming up with our own.
Rick Hillegas wrote:
I don't anticipate any significant time savings by scaling back the kind
of expressions we support. In any event, sarg-matching of
indexableExpressions should not involve text comparisons. I imagine it
would involve normalizing the parse trees and comparing the normalized
subgraphs.
If Mike is right and generated columns aren't useful unless the
optimizer can turn indexableExpressions into sargs, then the choice
between generated columns and expression indexes seems largely
religious to me. I am leaning toward expression indexes because they are
slightly simpler for the user: she only has to declare an index, not
declare a special column and then declare an index.
By the way, this is a very lively, fruitful, and clarifying email
thread. Thanks especially to Dan, Mike, Satheesh, and Manish.
Cheers,
-Rick
Satheesh Bandaram wrote:
Manish Khettry wrote:
I think using arbitrarily complex expressions to create indexes may be
overkill. Dan's idea of restricting expression indexes on function
calls on a single base column (actually one _or more_ base columns
shouldn't be that much harder) is probably better.
While it may be OK to start with a single base column, I think one or
more base column references would be good to support. Useful for
functional indexes on *area*, for example. Or total compensation,
which may involve base salary, bonus, commissions etc.
Isn't storing the expression text for a generated column also
problematic? I can think of a few other things
-- The expression will also have to be compiled and evaluated for DDL
like alter table (drop column) possibly others.
Doesn't Derby already support evaluating an expression at DDL time for
a default clause? Alter table add column evaluates this default
expression at the DDL time.
-- We'll need a good way to check if two expressions are the same.
Obviously a textual comparison may not work. Making sure that two
expr's are the same is not trivial if we want to do it right-- say
(x+y) > 10 is the same as 10 < (y+x).
Right... Common expression detection needs to be added. Would also be
useful to eliminate multiple evaluations of any common (sub-)expressions.
Satheesh
Manish
On 8/3/05, Rick Hillegas <[EMAIL PROTECTED]> wrote:
Thanks, Jeff. I agree. Recompiling these expressions can occur when we
compile the triggering SELECT/UPDATE/INSERT statements. Compiling a
SELECT/UPDATE/INSERT is a relatively heavyweight operation and I don't
think anyone will notice a little extra time spent recompiling these
expressions.
Cheers,
-Rick
Jeffrey Lichtman wrote:
The expression used for the generated column must be stored in the
data dictionary. Does derby serialize expressions? I was looking
around in the classes in impl/sq/compile (ValueNode and such) and
these classes don't seem to implement Formatable.
It would be better to store the expressions as text, and to parse and
bind them when needed. I learned the hard way while working on the
internals of other database systems that storing internal data
structures in the data dictionary causes trouble when the structures
change. Also, it's useful for users to be able to look at the text of
their DDL statements.
- Jeff Lichtman
[EMAIL PROTECTED]
Check out Swazoo Koolak's Web
Jukebox at
http://swazoo.com/