I'm pretty sure this is a problem caused by how I'm using bind variables in 
the GROUP BY clause.  I guess the database has no way of knowing which of 
the different path[?] in the query correlate to other path[?] in the 
SELECT/GROUPING SETS and it's doing the best it can with limited 
information.  Adding a sort should/does not change the data, only the order 
of the rows.  I don't know if it's a postgresql bug or if it's just 
misuse/abuse of bind variables but the following script replicates the 
issue using straight SQL in the psql tool.  I think I can move this to a 
PostgreSQL forum since it doesn't seem to be jOOQ related.

Thanks again for providing the insight into the difference between the 
query executed using fetch() and the inlined query I was looking at.  Saved 
me from going to plan B and doing it in Excel ;-)

CREATE TABLE bind_group_by (
    path text [] PRIMARY KEY,
    value int
);
INSERT INTO bind_group_by (path, value) VALUES
    (ARRAY ['A1', 'B1', 'C1', 'D1'], 0),
    (ARRAY ['A1', 'B1', 'C1', 'D2'], 1),
    (ARRAY ['A1', 'B1', 'C1', 'D3'], 2),
    (ARRAY ['A1', 'B1', 'C2', 'D1'], 3),
    (ARRAY ['A1', 'B1', 'C2', 'D2'], 4),
    (ARRAY ['A1', 'B1', 'C2', 'D3'], 5),
    (ARRAY ['A1', 'B1', 'C3', 'D1'], 6),
    (ARRAY ['A1', 'B1', 'C3', 'D2'], 7),
    (ARRAY ['A1', 'B1', 'C3', 'D3'], 8),
    (ARRAY ['A1', 'B2', 'C1', 'D1'], 9),
    (ARRAY ['A1', 'B2', 'C1', 'D2'], 10),
    (ARRAY ['A1', 'B2', 'C1', 'D3'], 11),
    (ARRAY ['A1', 'B2', 'C2', 'D1'], 12),
    (ARRAY ['A1', 'B2', 'C2', 'D2'], 13),
    (ARRAY ['A1', 'B2', 'C2', 'D3'], 14),
    (ARRAY ['A1', 'B2', 'C3', 'D1'], 15),
    (ARRAY ['A1', 'B2', 'C3', 'D2'], 16),
    (ARRAY ['A1', 'B2', 'C3', 'D3'], 17),
    (ARRAY ['A1', 'B3', 'C1', 'D1'], 18),
    (ARRAY ['A1', 'B3', 'C1', 'D2'], 19),
    (ARRAY ['A1', 'B3', 'C1', 'D3'], 20),
    (ARRAY ['A1', 'B3', 'C2', 'D1'], 21),
    (ARRAY ['A1', 'B3', 'C2', 'D2'], 22),
    (ARRAY ['A1', 'B3', 'C2', 'D3'], 23),
    (ARRAY ['A1', 'B3', 'C3', 'D1'], 24),
    (ARRAY ['A1', 'B3', 'C3', 'D2'], 25),
    (ARRAY ['A1', 'B3', 'C3', 'D3'], 26),
    (ARRAY ['A2', 'B1', 'C1', 'D1'], 27),
    (ARRAY ['A2', 'B1', 'C1', 'D2'], 28),
    (ARRAY ['A2', 'B1', 'C1', 'D3'], 29),
    (ARRAY ['A2', 'B1', 'C2', 'D1'], 30),
    (ARRAY ['A2', 'B1', 'C2', 'D2'], 31),
    (ARRAY ['A2', 'B1', 'C2', 'D3'], 32),
    (ARRAY ['A2', 'B1', 'C3', 'D1'], 33),
    (ARRAY ['A2', 'B1', 'C3', 'D2'], 34),
    (ARRAY ['A2', 'B1', 'C3', 'D3'], 35),
    (ARRAY ['A2', 'B2', 'C1', 'D1'], 36),
    (ARRAY ['A2', 'B2', 'C1', 'D2'], 37),
    (ARRAY ['A2', 'B2', 'C1', 'D3'], 38),
    (ARRAY ['A2', 'B2', 'C2', 'D1'], 39),
    (ARRAY ['A2', 'B2', 'C2', 'D2'], 40),
    (ARRAY ['A2', 'B2', 'C2', 'D3'], 41),
    (ARRAY ['A2', 'B2', 'C3', 'D1'], 42),
    (ARRAY ['A2', 'B2', 'C3', 'D2'], 43),
    (ARRAY ['A2', 'B2', 'C3', 'D3'], 44),
    (ARRAY ['A2', 'B3', 'C1', 'D1'], 45),
    (ARRAY ['A2', 'B3', 'C1', 'D2'], 46),
    (ARRAY ['A2', 'B3', 'C1', 'D3'], 47),
    (ARRAY ['A2', 'B3', 'C2', 'D1'], 48),
    (ARRAY ['A2', 'B3', 'C2', 'D2'], 49),
    (ARRAY ['A2', 'B3', 'C2', 'D3'], 50),
    (ARRAY ['A2', 'B3', 'C3', 'D1'], 51),
    (ARRAY ['A2', 'B3', 'C3', 'D2'], 52),
    (ARRAY ['A2', 'B3', 'C3', 'D3'], 53),
    (ARRAY ['A3', 'B1', 'C1', 'D1'], 54),
    (ARRAY ['A3', 'B1', 'C1', 'D2'], 55),
    (ARRAY ['A3', 'B1', 'C1', 'D3'], 56),
    (ARRAY ['A3', 'B1', 'C2', 'D1'], 57),
    (ARRAY ['A3', 'B1', 'C2', 'D2'], 58),
    (ARRAY ['A3', 'B1', 'C2', 'D3'], 59),
    (ARRAY ['A3', 'B1', 'C3', 'D1'], 60),
    (ARRAY ['A3', 'B1', 'C3', 'D2'], 61),
    (ARRAY ['A3', 'B1', 'C3', 'D3'], 62),
    (ARRAY ['A3', 'B2', 'C1', 'D1'], 63),
    (ARRAY ['A3', 'B2', 'C1', 'D2'], 64),
    (ARRAY ['A3', 'B2', 'C1', 'D3'], 65),
    (ARRAY ['A3', 'B2', 'C2', 'D1'], 66),
    (ARRAY ['A3', 'B2', 'C2', 'D2'], 67),
    (ARRAY ['A3', 'B2', 'C2', 'D3'], 68),
    (ARRAY ['A3', 'B2', 'C3', 'D1'], 69),
    (ARRAY ['A3', 'B2', 'C3', 'D2'], 70),
    (ARRAY ['A3', 'B2', 'C3', 'D3'], 71),
    (ARRAY ['A3', 'B3', 'C1', 'D1'], 72),
    (ARRAY ['A3', 'B3', 'C1', 'D2'], 73),
    (ARRAY ['A3', 'B3', 'C1', 'D3'], 74),
    (ARRAY ['A3', 'B3', 'C2', 'D1'], 75),
    (ARRAY ['A3', 'B3', 'C2', 'D2'], 76),
    (ARRAY ['A3', 'B3', 'C2', 'D3'], 77),
    (ARRAY ['A3', 'B3', 'C3', 'D1'], 78),
    (ARRAY ['A3', 'B3', 'C3', 'D2'], 79),
    (ARRAY ['A3', 'B3', 'C3', 'D3'], 80);

SELECT 'static' AS query;
SELECT path[1], path[2], path[3], path, sum(value)
FROM bind_group_by
GROUP BY GROUPING SETS (
    (path[1], path[2], path[3], path),
    (path[1], path[2], path[3]),
    (path[1], path[2]),
    (path[1]),
    ()
)
ORDER BY 1, 2, 3, 4
;

SELECT 'prepared' AS query;
PREPARE prepared_group_by (int, int, int, int, int, int, int, int, int, 
int, int, int) AS
    SELECT path[$1], path[$2], path[$3], path, sum(value)
    FROM bind_group_by
    GROUP BY GROUPING SETS (
        (path[$4], path[$5], path[$6], path),
        (path[$7], path[$8], path[$9]),
        (path[$10], path[$11]),
        (path[$12]),
        ()
    )
    ORDER BY 1, 2, 3, 4
;
EXECUTE prepared_group_by (1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 1);

-- cleanup
DEALLOCATE prepared_group_by;
DROP TABLE bind_group_by;


On Tuesday, September 8, 2020 at 12:15:40 PM UTC-4 Aner Perez wrote:

> **hangs head in shame**  I wasn't really worried about the output sort 
> order, only that the correct data was generated.  I was going to tackle 
> sorting after I got the correct results in the data but only if it looked 
> wrong.  Oops.
>
> So what you're saying is that my change "fixed" the query but only by 
> coincidence.  The only part I don't understand is how the 2nd result set I 
> provided in my initial post above is possible.  The data I provided is 
> complete with some simple hand editing that I triple checked (but not 
> quadruple checked).  How is it possible that in some result rows the 
> *worker* column has an array of values in it but the *level1* column 
> which should be the first element of that array is null?   Notice that both 
> result sets have the same number of columns and the same the same numeric 
> data albeit in a different order.  It's just the level1, level2 and level3 
> columns that seem to have null in them for all groupings except for the 
> first.
>
> Even if I don't have a SORT BY, shouldn't the data generated be the same 
> except for it being presented in a different order?
>
> "right now" I'm wrapping my head around what I need to do to get the 
> correct sort with grouping sets and the grouping() function.
>
> On Tuesday, September 8, 2020 at 10:53:26 AM UTC-4 [email protected] 
> wrote:
>
>> On Tue, Sep 8, 2020 at 4:26 PM Aner Perez <[email protected]> wrote:
>>
>>> As always, your comments are spot on.
>>>
>>> I was aware that I had no ORDER BY but this was on purpose because I 
>>> wanted the rolled up sums to be displayed under the group they were 
>>> summarizing.  
>>>
>>
>> This is based on a wrong assumption. You should **never** assume any 
>> ordering, it may be completely arbitrary. Please use the approach that I've 
>> shown to keep control over the ordering.
>>  
>>
>>> The bind variables was the actual issue and it didn't even cross my mind 
>>> that they could affect the query results (or that I was actually using bind 
>>> variables anywhere in this query)!
>>>
>>
>> Again, don't think of this as "bind values affecting the results" and 
>> thus "bind values are bad" or whatever. You didn't specify any ordering, so 
>> the ordering is **random** (while it may sometimes be what you expected, 
>> **by 
>> accident**)
>>  
>>
>>> I had to change my custom field function from this version that uses 
>>> bind variables under the hood (from jOOQ issue #229)
>>>
>>>     public static <T> Field<T> arrayGet(Field<T[]> field, int index) {
>>>         return (Field) field("{0}[{1}]", 
>>> field.getDataType().getType().getComponentType(), field, index);
>>>     }
>>>
>>> to this which does not use bind variables:
>>>
>>>     public static <T> Field<T> arrayGet(Field<T[]> field, int index) {
>>>         return (Field) field(field.getQualifiedName()+"["+index+"]", 
>>> field.getDataType().getType().getComponentType());
>>>     }
>>>
>>
>> I strongly recommend you revert this and apply an ORDER BY clause. Don't 
>> rely on such a funky implementation detail of a SQL execution planner. Your 
>> observed ordering is **still** accidental, and any other funky 
>> side-effect like the one you've observed here will "destroy" your expected 
>> ordering again, which you got only by accident, perhaps at midnight during 
>> your well-deserved vacation when that super urgent report is running, and 
>> you'll have to fix it.
>>
>> Besides, please don't concatenate strings to produce identifiers through 
>> plain SQL API:
>> https://blog.jooq.org/2020/03/04/never-concatenate-strings-with-jooq/
>>
>> If you must follow the approach of avoiding bind values, then you can 
>> wrap your field and index using DSL.inline():
>>
>> https://www.jooq.org/doc/latest/manual/sql-building/bind-values/inlined-parameters/
>>  
>>
>>> I also didn't remember the existence of the grouping() function that can 
>>> be used in the select.  It's hard to internalize some of these things until 
>>> you actually have a use for them.  It'll come in useful in the near future.
>>>
>>
>> By "near", I'm sure, you mean "right now" 😉
>>
>

-- 
You received this message because you are subscribed to the Google Groups "jOOQ 
User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jooq-user/0c044309-d653-4364-bc98-b59670953deen%40googlegroups.com.

Reply via email to