[ 
https://issues.apache.org/jira/browse/DERBY-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750405#action_12750405
 ] 

Knut Anders Hatlen commented on DERBY-4357:
-------------------------------------------

Thanks for the clarifications, Rick.

I agree that if we add a mechanism to let the table function return the results 
in a specific order, it would be pointless to do an extra sort in Derby. I 
would argue though that there is a difference between pushing restrictions and 
ensuring ordering.

With the ordering, the table function itself declares that it can return the 
rows ordered by column X, so it's reasonable to expect that it's true that they 
will in fact be ordered by X. Also, Derby won't ever ask it to order by column 
Y. So for column X, where you have an easy way to do the ordering inside the 
table function, you can use the power of Java to order the rows, whereas for 
column Y you rely on the power of SQL, so you get to combine the strengths of 
both Java and SQL.

With the restrictions, the table function has no way to say that it only 
accepts restrictions on a certain column, or only certain kinds of operators. 
So if the table function is required to enforce all restrictions, you can no 
longer use the power of SQL for those restrictions where you have no advantages 
of implementing them in Java. (Also, though perhaps not likely to ever be a 
problem, new operators cannot be added later without breaking the existing 
restricted VTIs, as old table functions wouldn't know how to handle them, but 
they are not allowed to ignore them.)

As an example, say that we have a table function F(K,V) which is simply 
wrapping a hash table. A restriction such as K='abc' would be very helpful to 
push down, as it could be used as an argument to Hashtable.get() and reduce the 
number of rows returned to one (or zero). However, restrictions on K with 
operators such as <,>,<=,>=, or any restriction on V, would require that the 
entire Hashtable was scanned. One would also have to implement evaluation of 
each operator on each column, and one would have to handle arbitrarily nested 
AND/OR operators. Whereas this is indeed possible to do, it sounds like wasted 
work to reimplement what's already implemented in the SQL engine for no added 
benefit.

If the restrictions are only seen as a hint, this table function would need 
only a couple of lines of code to extract the equals restrictions on K. The 
full table scan needed for more complex restrictions could be left to the SQL 
engine, which should be able to perform the scan just as efficiently as the 
user's hand-coded scan. Also, the hand-coded scan will probably have undergone 
less testing and will be more error-prone than the code in the SQL engine.

> TableFunctions provide no information to limit underlying query
> ---------------------------------------------------------------
>
>                 Key: DERBY-4357
>                 URL: https://issues.apache.org/jira/browse/DERBY-4357
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.4.1.3, 10.4.2.0, 10.5.1.1, 10.5.2.0, 10.5.3.0
>         Environment: ALL
>            Reporter: Chris Goodacre
>         Attachments: RestrictedTableFunctions.html
>
>
> The API specification for TableFunctions cannot provide information to the 
> implementer of the TableFunction about the details of the query.  For 
> example: 
> (a) I defined a table function named MyFunction with columns a,b, & c
> (b) I bind the table function properly using the CREATE FUNCTION SQL.
> User executes the following SQL:
> select a,b from table ( MyFunction() ) where c = 123
> Without passing the column list and/or where clause as arguments to the table 
> function, my implementation can not know that it only needs two of the three 
> columns, and only rows where c = 123.
> For TableFunctions that are built to integrate distant/legacy data, the cost 
> of the query can be prohibitive.   It would be better if information 
> regarding the columns in the select and restrictions from the where clause 
> could be passed to the developer.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to