[jira] [Commented] (PHOENIX-3773) Implement FIRST_VALUES aggregate function

Loknath Priyatham Teja Singamsetty (JIRA) Fri, 02 Jun 2017 01:49:01 -0700

    [ https://issues.apache.org/jira/browse/PHOENIX-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034333#comment-16034333 ]


Loknath Priyatham Teja Singamsetty  commented on PHOENIX-3773:
--------------------------------------------------------------

[~jamestaylor]  

bq. Then, once you have all the values, combine them together using the 
PArrayDataType.appendItemToArray() method

I tried this last week as well, but for some reason the output was not as 
expected. Yesterday I made changes to use PArrayDataType.appendItemToArray() 
and, upon debugging, found two things:

a) For fixed-length data types, appendItemToArray() appears to prepend the new 
bytes to arrayBytes, which reverses the order in which the array is 
constructed. For the time being I used the prependItemToArray() method 
instead, which fixed this issue. The following lines of code in 
appendItemToArray() seem to be the reason: as I read them, they copy the new 
bytes to the front of the array and the older bytes to the end.

{quote}
            newArray = new byte[length + elementLength];

            System.arraycopy(arrayBytes, offset, newArray, 0, length);
            System.arraycopy(elementBytes, elementOffset, newArray, length, elementLength);
{quote}
b) For variable-length data types, array construction fails with an 
ArrayIndexOutOfBoundsException. Here is the stack trace:

java.lang.ArrayIndexOutOfBoundsException: 32767
        at org.apache.phoenix.schema.types.PArrayDataType.prependItemToArray(PArrayDataType.java:545)
        at org.apache.phoenix.expression.aggregator.FirstLastValueBaseClientAggregator.evaluate(FirstLastValueBaseClientAggregator.java:117)
        at org.apache.phoenix.schema.KeyValueSchema.toBytes(KeyValueSchema.java:112)
        at org.apache.phoenix.schema.KeyValueSchema.toBytes(KeyValueSchema.java:93)
        at org.apache.phoenix.expression.aggregator.Aggregators.toBytes(Aggregators.java:112)
        at org.apache.phoenix.iterate.BaseGroupedAggregatingResultIterator.next(BaseGroupedAggregatingResultIterator.java:82)
        at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:778)
        at org.apache.phoenix.end2end.FirstValuesFunctionIT.varcharDatatypeSimpleTest(FirstValuesFunctionIT.java:100)

I'm debugging this further. 
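
For reference, here is a minimal standalone sketch of the two 
System.arraycopy calls quoted under (a), taken in isolation from 
PArrayDataType. The class and method names are hypothetical and this is not 
the Phoenix implementation; it only mirrors the copy order of those two lines. 
Run on its own, that copy order leaves the existing bytes in front and the new 
element at the end, so the reversal seen for fixed-length types may come from 
the order in which the method is called rather than from these two lines alone.

```java
import java.util.Arrays;

final class ArrayCopyOrderSketch {

    // Hypothetical helper: mirrors only the two quoted arraycopy calls,
    // not the rest of PArrayDataType.appendItemToArray().
    static byte[] copyAsQuoted(byte[] arrayBytes, int offset, int length,
                               byte[] elementBytes, int elementOffset,
                               int elementLength) {
        byte[] newArray = new byte[length + elementLength];
        // Existing array bytes are written starting at index 0.
        System.arraycopy(arrayBytes, offset, newArray, 0, length);
        // New element bytes are written right after the existing bytes.
        System.arraycopy(elementBytes, elementOffset, newArray, length,
                elementLength);
        return newArray;
    }

    public static void main(String[] args) {
        byte[] existing = {1, 2, 3, 4};
        byte[] element = {9, 9};
        byte[] result = copyAsQuoted(existing, 0, existing.length,
                element, 0, element.length);
        System.out.println(Arrays.toString(result)); // prints [1, 2, 3, 4, 9, 9]
    }
}
```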
 

bq. Probably a good idea to have a test that asks for the top 3 values when 
there are only 2 values to make sure that case works too (if you don't have 
that already).

That test case is already included.
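
To illustrate the behaviour that case checks, here is a small standalone 
sketch. It uses java.util.PriorityQueue as a stand-in for the bounded queue 
mentioned in the issue description; it is not the FIRST_VALUES implementation, 
and the class and method names are hypothetical. The point is only that asking 
for more values than exist should return the values that are there, in order, 
without padding or failing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class FirstNSketch {

    // Hypothetical helper: returns up to n smallest values in ascending order.
    static List<Integer> firstN(List<Integer> values, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Max-heap: the largest retained value sits at the head and is
        // evicted first whenever the heap grows beyond n entries.
        PriorityQueue<Integer> heap =
                new PriorityQueue<>(Collections.reverseOrder());
        for (Integer v : values) {
            heap.add(v);
            if (heap.size() > n) {
                heap.poll();
            }
        }
        List<Integer> result = new ArrayList<>(heap);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        // Only two values exist although three are requested.
        System.out.println(firstN(Arrays.asList(20, 10), 3));      // prints [10, 20]
        // More values than requested: only the three smallest are kept.
        System.out.println(firstN(Arrays.asList(5, 1, 4, 2), 3));  // prints [1, 2, 4]
    }
}
```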



> Implement FIRST_VALUES aggregate function
> -----------------------------------------
>
>                 Key: PHOENIX-3773
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3773
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: James Taylor
>            Assignee: Loknath Priyatham Teja Singamsetty 
>              Labels: SFDC
>             Fix For: 4.11.0
>
>         Attachments: PHOENIX-3773_4.x-HBase-0.98.patch, 
> PHOENIX-3773_master.patch, PHOENIX-3773.patch, PHOENIX-3773.v2.patch, 
> PHOENIX-3773.v3.patch
>
>
> Similar to FIRST_VALUE, but would allow the user to specify how many values 
> to keep. This could use a MinMaxPriorityQueue under the covers and be much 
> more efficient than using multiple NTH_VALUE calls to do the same like this:
> {code}
> SELECT entity_id,
>        NTH_VALUE(user_id,1) WITHIN GROUP (ORDER BY last_read_date DESC) as nth1_user_id,
>        NTH_VALUE(user_id,2) WITHIN GROUP (ORDER BY last_read_date DESC) as nth2_user_id,
>        NTH_VALUE(user_id,3) WITHIN GROUP (ORDER BY last_read_date DESC) as nth3_user_id,
>        count(*)
> FROM  MY_TABLE 
> WHERE tenant_id='00Dx0000000XXXX'
> AND entity_id in ('0D5x000000ABCD','0D5x000000ABCE')
> GROUP BY entity_id;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
