Re: [jira] Updated: (DERBY-883) Enhance GROUP BY clause to support expressions instead of just column references.

Army Wed, 26 Jul 2006 14:12:39 -0700

Hi Manish,

Thanks for addressing my comments. The patch is looking better now--especiallywith the whitespace issues in version 4. Some follow-up comments below.


Manish Khettry wrote:
>> 1) TernarnyOpNode.isEquivalent(): I think we need to check "receiver"
>> equivalence as well, in addition to left and right operands.
>
> yes, you are right on this one. The new patch fixes this.

Thanks for fixing this. If possible, could you add a test case to your JUnittest to demonstrate? Could just be the query that I posted in my previous comments.

Actually the existing error message no longer makes sense with group by
expressions. For example, the line of code that you point to, also catches
errors like this:
select c1+1,  avg(c2) from test group by c1+10;
Does the existing error message make sense for this query?


Not with this query, no.  But two things to note here:

1) Before your patch, if you replace "group by c1+10" with "group by c2" you'llget error 42Y36 saying that column reference C1 is invalid, even though it'sactually part of an expression ("c1+1")--so that type of behavior *does* alreadyexist, even if it doesn't really "make sense". If you change the SQLSTATE anderror message for this case, the result is a change (albeit a pretty small one)in customer apps, which is really the issue here.

2) The key expression in my previous comment was "in cases like this", where"like this" meant queries where the existing error message *does* make sense.

For example, with the following query it makes sense to include the columnreference because we know what it is:


ij> select c1 from test group by c2;

ERROR 42Y36: Column reference 'C1' is invalid. For a SELECT list with a GROUPBY, the list may only contain grouping columns and valid aggregate expressions.

And that's what we'll get before your patch. But after your patch we a) get adifferent SQLSTATE and b) lose the column reference name:


ij> select c1 from test group by c2;

ERROR 42Y30: The SELECT list of a grouped query contains at least one invalidexpression. If a SELECT list has a GROUP BY, the list may only contain validgrouping expressions and valid aggregate expressions.

So now we've changed the way an existing query behaves (by returning a differentSQLSTATE)--which (I think?) means we need to mark this as "Existing ApplicationImpact"--*and* we're now providing the user with less information.

If it's possible to figure out when we have a simple column reference (c1)verses an expression (c1+1) and to throw a corresponding error, I think thatwould be ideal. But if you decide that it's not possible or not worth it topreserve the error for queries with simple column references, then feel free toleave this as it is. If that's your decision, though, I think you should askderby-dev if a different SQLSTATE for the same query counts as "ExistingApplication Impact", and if so, mark this issue as appropriate.

+ return (operator.equals(other.operator)
+        || (operand == other.operand)
+        || ((operand != null) && operand.isEquivalent(other)));


If you look at bindUnaryOperator for UnaryOperatorNode, you will see that
it is possible for operand to be null; not so in BinaryOperatorNode.

So the assumption here is that if (operand == other.operand) then both operandsare null? If so, a simple comment saying that could be helpful.

In GroupByList.java:

<snip code snippet>


That is a good question-- this bit of code add invisible column references

when a column reference in the group by list, is not in the projectionlist;

i.e.

select c2, sum(c3) from test group by c1;

will add c1 to the projection list as a generated column. I'm not sure if
you want to do the same kind of rewrite for expressions. I thought this
wasn't worth doing but I'm not sure anymore.

I just tried this query with and without your patch and it fails in both caseswith error 42Y36/42Y30. So I guess I'm a little confused as to why Derby wouldadd "c1" to the projection list and what good it does us?

If you're not sure whether or not it's worth adding equivalent logic forexpressions, do you plan to follow-up with this later? If not, a commentexplaining what you have found (or what your uncertainties are) might help otherdevelopers down the road (esp. if a bug comes up in this area later).

Most if not all of the routines which are deleted were made redundant by
this fix. For example, getColumnReference in GroupByColumn is not only
unused it makes no sense with group by expressions. I think its good to get
rid of this code but if you want, I'd be perfectly willing to add dead code
back in the source tree.

I guess it wasn't clear to me how all of the deleted methods were made redundantby *this* fix. In your comment when you posted the patch you just said "unusedmethods zapped", which sounded to me like you just happened to notice that themethods were unused and therefore deleted them. That kind of spontaneous"cleanup" is, like the whitespace changes, something that should beposted/committed as a separate issue. If, however, the methods are relevant anduseful before your changes but redundant and deletable after your changes, thenit's fine to have them in with this patch.

I disagree with you here-- when a contributor has spent a considerable
amount of time understanding and making a lot of changes to a class, it
makes sense to remove dead code at the same time. If you don't, stuff like
this will likely get ignored or slip through the cracks, imho.

I didn't mean to say that the changes were not useful nor that they shouldn't becommitted; just that it might make it easier to track "cleanup" vs "functional"changes by filing a separate Jira issue to remove the dead code and posting anisolated patch to that issue. The changes will still go in--no slipping throughthe cracks--but they will go in as part of their *own* Jira commit, which seemscleaner to me.

In any event, that was just a suggestion. No need to let this prevent thecommit of the patch.

7) There are a couple of places where you have commented lines out instead

yes, you're right. I've removed it.


There are still a couple of these lingering:

GroupByColumn.java:

        public String getColumnName()
        {
-               return colRef.getColumnName();
+               //return colRef.getColumnName();
+               return columnExpression.getColumnName();

ColumnReference.java:

+       protected boolean isEquivalent(ValueNode o) throws StandardException
+       {
+               if (!isSameNodeType(o)) {
+                       return false;
+               }
+               ColumnReference other = (ColumnReference)o;
+               return (tableNumber == other.tableNumber
+                               && columnName.equals(other.getColumnName()));
+               //return source.isEquivalent(other.source);
+       }

A. Unnecessary white-space changes?


Atleast some of these were done to improve existing indentation. Next time,
I'll resist the urge to improve readability.

Thanks for cleaning a bunch of these whitespace diffs up in the version fourpatch. That said, there are still quite a few white-space changes left--did youintentionally leave those in? Ex.


GroupByNode.java:

@@ -470,10 +489,10 @@
                        ** bottom project restrict.
                        */
                        newRC = (ResultColumn) getNodeFactory().getNode(
-                                                                               
C_NodeTypes.RESULT_COLUMN,
-                                                                               
"##aggregate result",
-                                                                               
aggregate.getNewNullResultExpression(),
-                                                                               
getContextManager());
+                                       C_NodeTypes.RESULT_COLUMN,
+                                       "##aggregate result",
+                                       aggregate.getNewNullResultExpression(),
+                                       getContextManager());
                        newRC.markGenerated();
                        newRC.bindResultColumnToExpression();
                        bottomRCL.addElement(newRC);

B. Lines longer than 80 chars:


I try hard to to keep things within 80 chars with tab stop set to 4 chars.
Do you see any changes where this is not the case?

Again, this is a fairly minor thing--you don't have to waste too much time onthis. When I look at the patch I still seem some lines that are longer than 80chars with a 4-char tab, but as you said this can be editor-dependent. So longas you're aware of the issue and trying to respect it, I'm not going to complain.


Some other things to note:

1 -- SubstituteExpressionVisitor.java:

* The copyright says 1998, 2004; I think this is supposed to just say 2006since that's when the file was created.

2 -- I see that in version 4 of the patch you updated UnaryOperatorNode to havethe following logic in isEquivalent():


+               UnaryOperatorNode other = (UnaryOperatorNode)o;
+               return (operator.equals(other.operator) &&
+                       ((operand == other.operand)||
+                        ((operand != null) && 
operand.isEquivalent(other.operand))));

Did you become aware of this problem because of some test(s) that you wererunning? If so, it'd be great if you could add the actual test case(s) to yourJUnit test for regression purposes. The more tests there are to check for thesekinds of subtleties, the better...


And finally:

> Manish Khettry updated DERBY-883:
> ---------------------------------
>
>     Attachment: 883.patch4.txt
>
> I found a problem in one of my changes. Also got rid of a lot of
> white space only diffs.

In the future it would be helpful if you could say what changed between versionsof the patch, so that reviewers don't have to scour the patches to try to figureout what changed. I managed to find out what "a problem" meant in this case,but it's generally a good idea to explicitly state the problem/changes in thecomments. Makes my life as a reviewer easier :)


Thanks for continuing to improve these patches,
Army

Re: [jira] Updated: (DERBY-883) Enhance GROUP BY clause to support expressions instead of just column references.

Reply via email to