Can anyone comment on this behavior? Namely, can I get feedback as
to whether my interpretation of what makes a bestPlan--meaning the
best plan in this round--is correct? And if so, is this a bug in
the current Derby optimizer? And if so, any ideas as to how this
could be causing the two
The optimizer will have to remember this best plan even when it
considers a different join order for the outer query, because it's
possible that the subquery's best plan for that optimization will be
the best one overall.
It occurs to me that the code structure already supports this. Every
I believe the optimizer currently keeps track of only two best
plans - the best access path for each table in a query (or
subquery) as it's currently being considered in a join order, and
the best overall join order and path for each table for the best
plan it's found so far.
Am I right in
How could a predicate be pushable to only one side of a union? Can
you provide an example of a predicate that can be pushed only to one side?
If we take something along the lines of:
select ... from
t2,
(select * from t1 union values (1,2), (3,4), (5,6)) X1 (a,b)
where X1.a = t2.i;
In
4 - Make sure predicates that are pushed down into subqueries of a
UnionNode are correctly pulled back up (if they are unscoped) or
discarded (if they are scoped) for every permutation seen during optimization.
Predicates may have to be copied when pushed into the children of a
UnionNode.
I haven't been following this thread very closely, so please forgive
me if this has already been considered. I assume the problem is that
the backup algorithm creates a directory to contain the backup, so if
the location of the backup is in the database it tries to copy the
contents of the
This vote is for establishing Bryan Pendleton as a committer for Derby.
Please vote +1 if you approve of Bryan as a committer.
+1
-Jeff Lichtman
[EMAIL PROTECTED]
Check out Swazoo Koolak's Web
The tables referenced by this query in particular are SYS.SYSTABLES,
SYS.SYSSCHEMAS, SYS.SYSCONSTRAINTS, SYS.SYSFOREIGNKEYS,
SYS.SYSCONGLOMERATES, SYS.SYSCOLUMNS, SYS.SYSKEYS.
I'm not sure what you need, but I'll try to give an idea of the
relationships between these tables.
SYSTABLES has
Derby knows how to use an index to quickly find a minimum or a
maximum (by traversing down one side of the B-tree or the other).
It doesn't know how to do both in the same query, which would take
two traversals.
Is the work to fix this the same as making IN list use multiple probes,
and/or
...or as a cross product:
SELECT MIN(A.Id), MAX(B.Id) FROM Customer A, Customer B
This might be fairly efficient, given that each branch of the cross
product only returns a single row. I guess this could be a possible
internal translation as well.
Roy
In this case Derby would do the
1) SELECT MIN(Id) FROM Customer and SELECT MAX(Id) FROM Customer
are both fast, but SELECT MIN(Id), MAX(Id) FROM Customer is slow,
taking 5 seconds. Why?
Derby knows how to use an index to quickly find a minimum or a
maximum (by traversing down one side of the B-tree or the other). It
Here's my attempt at documenting
the optimizer overrides that are introduced by DERBY
-573.
The documentation should say which overrides are associated with the FROM
clause (joinOrder) and which with a TableExpression (everything
else).
Also, there's a typo: i t instead of it:
Important: Make
Would be great if we put some thought into this, rather than enabling a
somewhat messy output. It would be great if we came up with an XML
format that would allow easier compatible changes in the future.
Dan.
This tracing feature is intended for developers who are debugging
optimizer
I am curious about your statement This tracing feature is intended
for developers who are debugging optimizer problems. I don't believe
it should ever be exposed as a supported feature.
As soon as you expose a feature you have to support it, which means
documentation, compatibility
That said, it seems reasonable to think that the optimizer should at
least consider doing a hash join on the subquery, in which case the
join between T2 and T3 could be materialized and then a hash-join
could be done using the predicate x1.j = t1.i.
I think the best thing would be to
Based on logic in the code, the example query isn't flattenable. . .
That's because whoever wrote the code made it handle only the
simplest case. I doubt it would be hard to make it flatten many other
types of table subqueries.
My general philosophy toward query performance issues is that
What can be done about this? It looks like FromVTI.java implements
Optimizable, which I'm assuming is the interface to supply indexed
lookup. But, like I said, my head is swimming trying to figure out
if I either 1) don't understand how to enable an indexed lookup for
my VTI or 2) if it's
Is this what you were referring
to, Jeff?
derby.language.logQueryPlan
Thanks, but that's not it. logQueryPlan causes the final query plan to be
written to derby.log. I was thinking of a trace function that creates a
trail of all the decisions and calculations the optimizer makes in
derby.log.
This vote is for establishing Rick Hillegas as a committer for
Derby. Please vote +1 if you approve of Rick as a committer.
+1
-Jeff Lichtman
[EMAIL PROTECTED]
Check out Swazoo Koolak's Web
As a related itch, we might want to give some thought to tools for
tracking and sealing off security breaches. For instance, audit
trails and tools to deny access to malicious users and clients.
-Rick
Good idea. One possible tool could be a class/method inspector that
would look for risky
I'd like to setup a lunch for those of us in the Bay Area who are
working on Derby. I'm thinking of Wednesday February 1 in San
Francisco. Any interest?
If contributing advice counts as working, I'd like to be included.
Yes, I'm interested.
-Jeff Lichtman
is it possible to reuse the class
org.apache.derby.impl.sql.compile.ParserImpl to transform an SQL
query into a query tree (consisting of relational algebra operators).
The query tree produced by the parser is not in the form of
relational algebra operators. At this point of processing in
This is a repeat of my vote, which I previously sent only to derby-dev:
Logo candidates (place an X by your choice).
10. [X] (these all belong together...)
http://issues.apache.org/jira/secure/attachment/12321022/derby_logo_only.jpg
Here's my vote:
Logo candidates (place an X by your choice).
10. [X] (these all belong together...)
http://issues.apache.org/jira/secure/attachment/12321022/derby_logo_only.jpg
http://issues.apache.org/jira/secure/attachment/12321023/derby_with_text.jpg
CREATE TABLE t2 (i INTEGER, j INTEGER NOT NULL);
CREATE INDEX idx3 on t2(i);
CREATE UNIQUE INDEX idx4 on t2(j);
SELECT i FROM t2; -- use index idx3 (covers query)
SELECT j FROM t2; -- Do a table scan
Shouldn't the last SELECT use idx4 since the index is UNIQUE *and* j
is NOT NULL?
--
For those wondering how unique indices with multiple null values may
be implemented, it can be done pretty simple: The columns with null
values are not part of the index. This will if course affect the
execution plan.
Example:
CREATE TABLE t (i INTEGER, a INTEGER NOT NULL, b INTEGER);
CREATE
I have figured out what's behind the stack overflow reported by
Daniel Skiles in the derby-user list. To summarize, he gets a stack
overflow from this query:
SELECT path from filsystemfiles where
path not in (select path from existingfiles)
The stack trace shows a large number of
It looks like you have identified a number of serious defects here.
Your analysis seems solid to me and I doubt anyone is going to argue
with you. Could you log a bug so we can fix these issues hopefully
over the coming months?
Thanks,
-Rick
OK. This is now DERBY-634.
I'm a little unclear why this making UnaryOperatorNode extend
ParameterNode and the subsequent changes are needed. Currently
triggers/views can check to see if the statement has a parameter node
anywhere in the tree without having every node implement ParameterNode,
so what is different about
SQLToJavaValueNode.getJSQLType after checking isParameterNode does
ParameterNode casting so it can get to getJSQLType method. But since
UnaryOperatorNode is not of the type ParameterNode, the casting
fails, as expected. I can remove the casting but that would mean to
implement getJSQLType
I've been thinking about where - ? should be allowed in queries.
Currently, parameters are allowed only in certain situations - the
documentation lists seventeen different places where parameters can
be used. These places were chosen because it's possible to figure out
the type of the
I agree with your and Dan's point about possible precision loss. I
am trying to find a way where rather than hard coding the -?/+? to
Double, try to get it's datatype from the context where it is used.
There are circumstances where you can figure this out, but it's
ambiguous in the general
where t1.int_column + ? = - ?
The first parameter gets bound to the type of t1.int_column and then
second parameter gets bound to the same type too.
OK, then how about:
where - ? = t1.int_column + ?
-Jeff Lichtman
I have a simple patch to allow dynamic parameters for unary minus
and plus. The parameter will be bound to DOUBLE. The patch is
attached to JIRA. Can a commiter please commit it if everything
looks good? I have run all the tests and no new failures. Have also
added a test for this.
I'm not
This looks to me like an edge case. Rather than disable an entire
datatype, I'd recommend logging bugs on these edge cases. Unless
code archeology at IBM discloses some serious problem like data
corruption, I would recommend re-enabling the BOOLEAN datatype.
-Rick
I agree.
I have looked
However, before someone undisables the lines identified by Jeff, I
will mention that more work than meets the eye went into it - at
least into adjusting the tests. . .
Of course any new feature should have tests written for it. I didn't
mean that someone should hack out a line of code to
The expression used for the generated column must be stored in the
data dictionary. Does derby serialize expressions? I was looking
around in the classes in impl/sq/compile (ValueNode and such) and
these classes don't seem to implement Formatable.
It would be better to store the expressions
I'm wondering if creating java object for each tuple and let the gc do its
work would be more performant than having a reusable ByteBuffer that contains
many raw tuples? What do you think?
When we created Derby (nee Cloudscape), object instantiation and garbage
collection were still quite
The architecture of the Derby optimizer makes it difficult to do
further optimizations.
Yikes! Tread lightly there hoss, I think he (Jeffrey Lichtman) is
listening now.
I'm not that thin-skinned. Besides, if there can't be a frank and open
discussion of the strengths and weaknesses
1. Since you wrote the original optimizer if you were currently the lead
architect what would your recommendations be for enhancing or improving the
optimizer?
Ok, Dan, shove over!
It's hard to consider the optimizer by itself. Many optimizer enhancements
would work with changes in other areas,
Does someone have plan to contribute documentation of optimization of join
order and access path selection in derby?
No one has answered this yet, so I'll try to help. I wrote the original
implementation of the optimizer, but it's been a few years since I've
looked at it.
The optimizer
I think new Throwable(FRED).printStackTrace(System.out); may be some
leftover debugging code from a developer. Does anyone know more about
it?
However it got in there, it shouldn't be there and should be removed.
-Jeff Lichtman
[EMAIL
I am merely suggesting that it would be more efficient if JavaCC produced
'switch' statements for these rather than the nested 'if' statements.
So far, I haven't been able to determine how to modify the sqlgrammar.jj
file to achieve this.
It's time for me to out myself. I wrote most of
This would appear to indicate that the optimizer is rejecting a
NestedLoopJoinStrategy because the table is too big (i.e. larger than the
1MB default).
The logic to reject a join strategy that uses too much memory is in
OptimizerImpl.costBasedCostOptimizable(). It gives an estimated number of
The PROPERTIES keyword and optimizer hints have been removed
from Cloudscape before the code was contributed to Derby.
I'm not sure where you've found the PROPERTIES keyword, but
as far as I can tell it is no longer part of the manual.
The reason it was removed is because the syntax is
Well as I thought, it wasn't that hard to add it back in. . .
I just added a data type JAVA to the parser. . .
I'm curious about this datatype. Is it a new keyword to indicate to the
parser that a class name follows? What is the syntax?
Originally in Cloudscape, one only had to use the class
Does
Derby support rollbacks? And, if not, does anyone have any suggestions
for both working around it as well as a plan as to weather it will be
implemented or not?
Yes, it does, through the rollback method on the JDBC Connection class.
See the following for details:
47 matches
Mail list logo