Re: Urgent problem with Phoenix function LAST_VALUE and joins

2021-09-29 Thread Istvan Toth
git clone https://github.com/apache/phoenix
mvn clean package
bin/phoenix_sandbox.py
use the host and port displayed to start sqlline.py

On Thu, Sep 30, 2021 at 4:21 AM Simon Mottram 
wrote:

> Sounds like a plan thanks.
>
> Is there any doco to help with getting setup?  I assume the phoenix dev
> channel is the place to ask questions.
>
> Cheers
>
> Simon
> --
> *From:* Istvan Toth 
> *Sent:* 29 September 2021 7:37 PM
> *To:* user@phoenix.apache.org 
> *Subject:* Re: Urgent problem with Phoenix function LAST_VALUE and joins
>
> Please try to reproduce with the latest released Phoenix version, or git
> master HEAD before opening the JIRA.
> You can use bin/phoenix_sandbox.py in the source repo/distribution to
> quickly start a test environment without installing Phoenix in a real
> cluster.
>
>
> On Wed, Sep 29, 2021 at 12:36 AM Simon Mottram <
> simon.mott...@cucumber.co.nz> wrote:
>
> Ok, should I bug this in your JIRA?
>
> I will try and setup a more reproducible series of steps
>
> Cheers
>
> S
> --
> *From:* Kadir Ozdemir 
> *Sent:* 29 September 2021 11:32 AM
> *To:* Geoffrey Jacoby 
> *Cc:* user 
> *Subject:* Re: Urgent problem with Phoenix function LAST_VALUE and joins
>
> Geoffrey,
>
> Thank you for pointing out this. I mixed up release numbers. Yes, the
> paging feature cannot be the reason here. My comment for disabling the
> paging feature should be ignored.
>
> Kadir
>
> On Tue, Sep 28, 2021 at 7:44 AM Geoffrey Jacoby 
> wrote:
>
> Kadir,
>
> The server-side paging changes are in Phoenix 5.1, which is at feature
> parity with 4.16 for HBase 1.x, not 5.0 (which was released in 2018 and is
> roughly equivalent to Phoenix 4.14).
>
> So if Simon's environment is Phoenix 5.0, it can't be affected by the
> server-side changes. (Though I see that he's running HBase 2.2, which I
> thought wasn't supported until Phoenix 5.1?)
>
> Geoffrey
>
> On Tue, Sep 28, 2021 at 4:07 AM Simon Mottram <
> simon.mott...@cucumber.co.nz> wrote:
>
> Hi thanks for the quick response
>
> I'm going to appear thick now, brace yourself.   By client do you mean the
> queryserver running in EMR which does have an hbase-site.xml or my Java API
> as a property, we don't have an hbase-site.xml (unless it's embedded in the
> thin client jar.
>
> Cheers
>
> S
> --
> *From:* Kadir Ozdemir 
> *Sent:* 28 September 2021 7:45 PM
> *To:* user 
> *Subject:* Re: Urgent problem with Phoenix function LAST_VALUE and joins
>
> In Phoenix 5.0 we introduced a new server side paging feature to eliminate
> timeouts due long running server side computations such aggregation and
> joins. I wonder if this new feature caused your issue. If so, the work
> around is to disable it by setting phoenix.server.paging.enabled to false
> in hbase-site.xml. This is a client side config param so you just need to
> restart your client. Hope this will fix your issue.
>
> On Mon, Sep 27, 2021 at 9:23 PM Simon Mottram <
> simon.mott...@cucumber.co.nz> wrote:
>
> Hi
>
> Got my fingers crossed that there's a work around for this as this really
> is a big problem for us
>
> We are using:
>
> Amazon EMR
>
> Release label:emr-6.1.0
> Hadoop distribution:Amazon
> Applications:Hbase 2.2.5, Hive 3.1.2, Phoenix 5.0.0, Pig 0.17.0
>
> Thin Client version:
> phoenix-5.0.0-HBase-2.0-thin-client.jar
>
> We get the following error when doing an LAST_VALUE aggregation where
>
>   1.  A JOIN is empty
>   2.  The column is INTEGER or DATETIME
>
> Remote driver error: IllegalArgumentException: offset (25) + length (4)
> exceed the capacity of the array: 25
>
> The query that breaks is:
>
> SELECT
> "BIOMATERIAL_NAME",
> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
> FROM
> VARIABLE_VALUES_QA.OBSERVATION
> JOIN VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
> ON DOCID = OBSERVATION_VALUE_ID
>   AND TAGNAME = 'TRIAL_ID'
> AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe'
> GROUP BY
> "BIOMATERIAL_NAME"
> LIMIT 10
>  OFFSET 0;
>
> I can refactor this using EXIST but get same error, presumably the driver
> knows to treat them the same:
>
> SELECT
> "BIOMATERIAL_NAME",
> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
> FROM
> VARIABLE_VALUES_QA.OBSERVATION
> WHERE
> EXISTS (
> SELECT
> DOCID
> FROM
> VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
> WHERE
> DOCID = OBSERVATION_VALUE_ID
> AND TAGNAME = 'TRIAL_ID'
> AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe')
> GROUP BY
> "BIOMATERIAL_NAME"
> LIMIT 10
>  OFFSET 0;
>
> If we remove the external reference we get no error, regardless of whether
> there are any hits or not
>
> -- these all work
> There are no hits for this query
>
> SELECT
> "BIOMATERIAL_NAME",
> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
> FROM
> VARIABLE_VALUES_QA.OBSERVATION
> WHERE
> BIOMATERIAL_TYPE = 'aardvark'

Re: Urgent problem with Phoenix function LAST_VALUE and joins

2021-09-29 Thread Simon Mottram
Sounds like a plan thanks.

Is there any doco to help with getting setup?  I assume the phoenix dev channel 
is the place to ask questions.

Cheers

Simon

From: Istvan Toth 
Sent: 29 September 2021 7:37 PM
To: user@phoenix.apache.org 
Subject: Re: Urgent problem with Phoenix function LAST_VALUE and joins

Please try to reproduce with the latest released Phoenix version, or git master 
HEAD before opening the JIRA.
You can use bin/phoenix_sandbox.py in the source repo/distribution to quickly 
start a test environment without installing Phoenix in a real cluster.


On Wed, Sep 29, 2021 at 12:36 AM Simon Mottram 
mailto:simon.mott...@cucumber.co.nz>> wrote:
Ok, should I bug this in your JIRA?

I will try and setup a more reproducible series of steps

Cheers

S

From: Kadir Ozdemir mailto:kozde...@salesforce.com>>
Sent: 29 September 2021 11:32 AM
To: Geoffrey Jacoby mailto:gjac...@salesforce.com>>
Cc: user mailto:user@phoenix.apache.org>>
Subject: Re: Urgent problem with Phoenix function LAST_VALUE and joins

Geoffrey,

Thank you for pointing out this. I mixed up release numbers. Yes, the paging 
feature cannot be the reason here. My comment for disabling the paging feature 
should be ignored.

Kadir

On Tue, Sep 28, 2021 at 7:44 AM Geoffrey Jacoby 
mailto:gjac...@salesforce.com>> wrote:
Kadir,

The server-side paging changes are in Phoenix 5.1, which is at feature parity 
with 4.16 for HBase 1.x, not 5.0 (which was released in 2018 and is roughly 
equivalent to Phoenix 4.14).

So if Simon's environment is Phoenix 5.0, it can't be affected by the 
server-side changes. (Though I see that he's running HBase 2.2, which I thought 
wasn't supported until Phoenix 5.1?)

Geoffrey

On Tue, Sep 28, 2021 at 4:07 AM Simon Mottram 
mailto:simon.mott...@cucumber.co.nz>> wrote:
Hi thanks for the quick response

I'm going to appear thick now, brace yourself.   By client do you mean the 
queryserver running in EMR which does have an hbase-site.xml or my Java API as 
a property, we don't have an hbase-site.xml (unless it's embedded in the thin 
client jar.

Cheers

S

From: Kadir Ozdemir mailto:kozde...@salesforce.com>>
Sent: 28 September 2021 7:45 PM
To: user mailto:user@phoenix.apache.org>>
Subject: Re: Urgent problem with Phoenix function LAST_VALUE and joins

In Phoenix 5.0 we introduced a new server side paging feature to eliminate 
timeouts due long running server side computations such aggregation and joins. 
I wonder if this new feature caused your issue. If so, the work around is to 
disable it by setting phoenix.server.paging.enabled to false in hbase-site.xml. 
This is a client side config param so you just need to restart your client. 
Hope this will fix your issue.

On Mon, Sep 27, 2021 at 9:23 PM Simon Mottram 
mailto:simon.mott...@cucumber.co.nz>> wrote:
Hi

Got my fingers crossed that there's a work around for this as this really is a 
big problem for us

We are using:

Amazon EMR

Release label:emr-6.1.0
Hadoop distribution:Amazon
Applications:Hbase 2.2.5, Hive 3.1.2, Phoenix 5.0.0, Pig 0.17.0

Thin Client version:
phoenix-5.0.0-HBase-2.0-thin-client.jar

We get the following error when doing an LAST_VALUE aggregation where

  1.  A JOIN is empty
  2.  The column is INTEGER or DATETIME

Remote driver error: IllegalArgumentException: offset (25) + length (4) exceed 
the capacity of the array: 25

The query that breaks is:

SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
JOIN VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
ON DOCID = OBSERVATION_VALUE_ID
  AND TAGNAME = 'TRIAL_ID'
AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe'
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
 OFFSET 0;

I can refactor this using EXIST but get same error, presumably the driver knows 
to treat them the same:

SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
WHERE
EXISTS (
SELECT
DOCID
FROM
VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
WHERE
DOCID = OBSERVATION_VALUE_ID
AND TAGNAME = 'TRIAL_ID'
AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe')
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
 OFFSET 0;

If we remove the external reference we get no error, regardless of whether 
there are any hits or not

-- these all work
There are no hits for this query

SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
WHERE
BIOMATERIAL_TYPE = 'aardvark'
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
 OFFSET 0;

Lots of hits for this query:

SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 

Re: Urgent problem with Phoenix function LAST_VALUE and joins

2021-09-29 Thread

There seem to be no obvious unsubscribe to this list. Unlike all other Apache 
lists. Given that we have (thank God) decommissioned all our Phoenix instances 
how do I delete myself from this list?

Thanks! 

> On 29 Sep 2021, at 07:37, Istvan Toth  wrote:
> 
> 
> Please try to reproduce with the latest released Phoenix version, or git 
> master HEAD before opening the JIRA.
> You can use bin/phoenix_sandbox.py in the source repo/distribution to quickly 
> start a test environment without installing Phoenix in a real cluster.
> 
> 
>> On Wed, Sep 29, 2021 at 12:36 AM Simon Mottram 
>>  wrote:
>> Ok, should I bug this in your JIRA?
>> 
>> I will try and setup a more reproducible series of steps
>> 
>> Cheers
>> 
>> S
>> From: Kadir Ozdemir 
>> Sent: 29 September 2021 11:32 AM
>> To: Geoffrey Jacoby 
>> Cc: user 
>> Subject: Re: Urgent problem with Phoenix function LAST_VALUE and joins
>>  
>> Geoffrey,
>> 
>> Thank you for pointing out this. I mixed up release numbers. Yes, the paging 
>> feature cannot be the reason here. My comment for disabling the paging 
>> feature should be ignored.
>> 
>> Kadir
>> 
>> On Tue, Sep 28, 2021 at 7:44 AM Geoffrey Jacoby  
>> wrote:
>> Kadir,
>> 
>> The server-side paging changes are in Phoenix 5.1, which is at feature 
>> parity with 4.16 for HBase 1.x, not 5.0 (which was released in 2018 and is 
>> roughly equivalent to Phoenix 4.14). 
>> 
>> So if Simon's environment is Phoenix 5.0, it can't be affected by the 
>> server-side changes. (Though I see that he's running HBase 2.2, which I 
>> thought wasn't supported until Phoenix 5.1?) 
>> 
>> Geoffrey
>> 
>> On Tue, Sep 28, 2021 at 4:07 AM Simon Mottram  
>> wrote:
>> Hi thanks for the quick response
>> 
>> I'm going to appear thick now, brace yourself.   By client do you mean the 
>> queryserver running in EMR which does have an hbase-site.xml or my Java API 
>> as a property, we don't have an hbase-site.xml (unless it's embedded in the 
>> thin client jar.
>> 
>> Cheers
>> 
>> S
>> From: Kadir Ozdemir 
>> Sent: 28 September 2021 7:45 PM
>> To: user 
>> Subject: Re: Urgent problem with Phoenix function LAST_VALUE and joins
>>  
>> In Phoenix 5.0 we introduced a new server side paging feature to eliminate 
>> timeouts due long running server side computations such aggregation and 
>> joins. I wonder if this new feature caused your issue. If so, the work 
>> around is to disable it by setting phoenix.server.paging.enabled to false in 
>> hbase-site.xml. This is a client side config param so you just need to 
>> restart your client. Hope this will fix your issue.
>> 
>> On Mon, Sep 27, 2021 at 9:23 PM Simon Mottram  
>> wrote:
>> Hi
>> 
>> Got my fingers crossed that there's a work around for this as this really is 
>> a big problem for us
>> 
>> We are using:
>> 
>> Amazon EMR
>> 
>> Release label:emr-6.1.0
>> Hadoop distribution:Amazon
>> Applications:Hbase 2.2.5, Hive 3.1.2, Phoenix 5.0.0, Pig 0.17.0
>> 
>> Thin Client version:
>> phoenix-5.0.0-HBase-2.0-thin-client.jar
>> 
>> We get the following error when doing an LAST_VALUE aggregation where
>> 
>>   1.  A JOIN is empty
>>   2.  The column is INTEGER or DATETIME
>> 
>> Remote driver error: IllegalArgumentException: offset (25) + length (4) 
>> exceed the capacity of the array: 25
>> 
>> The query that breaks is:
>> 
>> SELECT
>> "BIOMATERIAL_NAME",
>> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
>> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
>> FROM
>> VARIABLE_VALUES_QA.OBSERVATION
>> JOIN VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
>> ON DOCID = OBSERVATION_VALUE_ID
>>   AND TAGNAME = 'TRIAL_ID'
>> AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe'
>> GROUP BY
>> "BIOMATERIAL_NAME"
>> LIMIT 10
>>  OFFSET 0;
>> 
>> I can refactor this using EXIST but get same error, presumably the driver 
>> knows to treat them the same:
>> 
>> SELECT
>> "BIOMATERIAL_NAME",
>> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
>> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
>> FROM
>> VARIABLE_VALUES_QA.OBSERVATION
>> WHERE
>> EXISTS (
>> SELECT
>> DOCID
>> FROM
>> VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
>> WHERE
>> DOCID = OBSERVATION_VALUE_ID
>> AND TAGNAME = 'TRIAL_ID'
>> AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe')
>> GROUP BY
>> "BIOMATERIAL_NAME"
>> LIMIT 10
>>  OFFSET 0;
>> 
>> If we remove the external reference we get no error, regardless of whether 
>> there are any hits or not
>> 
>> -- these all work
>> There are no hits for this query
>> 
>> SELECT
>> "BIOMATERIAL_NAME",
>> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
>> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
>> FROM
>> VARIABLE_VALUES_QA.OBSERVATION
>> WHERE
>> BIOMATERIAL_TYPE = 'aardvark'
>> GROUP BY
>> "BIOMATERIAL_NAME"
>> LIMIT 10
>>  OFFSET 0;
>> 
>> Lots of hits for this query:
>> 
>> SELECT
>> "BIOMATERIAL_NAME",
>> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
>> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
>> FROM
>> 

Next Tech Talk on Online Data Format Change in Phoenix

2021-09-29 Thread Kadir Ozdemir
Hi All,

Did you ever need to change the underlying data format of your table which
is already in production and serving your customers. For example, you
wanted to change the primary key of your table, or its column encoding or
even its storage format. And you wanted to do this without interrupting the
services depending on this table. If you did, then you would want to
attend the tech talk by Gokcen Iskender on Oct 07 (next Thursday at 9:00
PST) and see how this will be possible in Phoenix. Hope to see you there.
Please see details at https://phoenix.apache.org/tech_talks.html.

Thanks,
Kadir


Invitation: Apache Phoenix Tech Talk @ Thu Oct 7, 2021 9am - 10:30am (PDT) (user@phoenix.apache.org)

2021-09-29 Thread kadir
BEGIN:VCALENDAR
PRODID:-//Google Inc//Google Calendar 70.9054//EN
VERSION:2.0
CALSCALE:GREGORIAN
METHOD:REQUEST
BEGIN:VEVENT
DTSTART:20211007T16Z
DTEND:20211007T173000Z
DTSTAMP:20210929T182931Z
ORGANIZER;CN=ka...@gsuite.cloud.apache.org:mailto:kadir@gsuite.cloud.apache
 .org
UID:2839cfd3o2o7grqo05pa55e...@google.com
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=user@phoenix.apache.org;X-NUM-GUESTS=0:mailto:user@phoenix.apache.o
 rg
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=ACCEPTED;RSVP=TRUE
 ;CN=ka...@gsuite.cloud.apache.org;X-NUM-GUESTS=0:mailto:kadir@gsuite.cloud.
 apache.org
ATTENDEE;CUTYPE=INDIVIDUAL;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=
 TRUE;CN=d...@phoenix.apache.org;X-NUM-GUESTS=0:mailto:d...@phoenix.apache.org
X-MICROSOFT-CDO-OWNERAPPTID:-2090667828
CREATED:20210929T182930Z
DESCRIPTION:Title: Online Data Format Change in PhoenixAbstract: Ph
 oenix lets the users to make certain changes to the table/index schemas via
  ALTER TABLE and ALTER INDEX. Some schema changes such as changing the row 
 key (primary keys)\, the type of a column\, the table storage format\, the 
 column encoding\, etc. requires table/index rows to be re-written. This tal
 k introduces Online Data Format change feature which will let Phoenix users
  to make those changes with no service interruption. The focus will be on c
 hanging column encoding and table storage format.Presenter : Gokcen
  IskenderMeeting Link : https://salesforce.zoom.us/j/8624511?pw
 d=MXFNTkYrUzFoRFkxYnlybnlwNnc4UT09This meeting will be recorded.Please see https://phoenix.apache.org/tech_talks.html;>https:
 //phoenix.apache.org/tech_talks.html for the info on upcomi
 ng meetings\, and the recordings of the previous meetings.\n\n-::~:~::~:~:~
 :~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-\
 nDo not edit this section of the description.\n\nView your event at https:/
 /calendar.google.com/calendar/event?action=VIEW=MjgzOWNmZDNvMm83Z3JxbzA
 1cGE1NWVuZ3IgdXNlckBwaG9lbml4LmFwYWNoZS5vcmc=Mjkja2FkaXJAZ3N1aXRlLmNsb3
 VkLmFwYWNoZS5vcmc2Y2ZiYjY0OTU1YzMzYmZjNTBlZjVjNzJkMzEyNzljODdjYjAwNTBj=
 America%2FLos_Angeles=en=1.\n-::~:~::~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:
 ~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~::~:~::-
LAST-MODIFIED:20210929T182930Z
LOCATION:
SEQUENCE:0
STATUS:CONFIRMED
SUMMARY:Apache Phoenix Tech Talk
TRANSP:OPAQUE
END:VEVENT
END:VCALENDAR


invite.ics
Description: application/ics


Re: Urgent problem with Phoenix function LAST_VALUE and joins

2021-09-29 Thread Istvan Toth
Please try to reproduce with the latest released Phoenix version, or git
master HEAD before opening the JIRA.
You can use bin/phoenix_sandbox.py in the source repo/distribution to
quickly start a test environment without installing Phoenix in a real
cluster.


On Wed, Sep 29, 2021 at 12:36 AM Simon Mottram 
wrote:

> Ok, should I bug this in your JIRA?
>
> I will try and setup a more reproducible series of steps
>
> Cheers
>
> S
> --
> *From:* Kadir Ozdemir 
> *Sent:* 29 September 2021 11:32 AM
> *To:* Geoffrey Jacoby 
> *Cc:* user 
> *Subject:* Re: Urgent problem with Phoenix function LAST_VALUE and joins
>
> Geoffrey,
>
> Thank you for pointing out this. I mixed up release numbers. Yes, the
> paging feature cannot be the reason here. My comment for disabling the
> paging feature should be ignored.
>
> Kadir
>
> On Tue, Sep 28, 2021 at 7:44 AM Geoffrey Jacoby 
> wrote:
>
> Kadir,
>
> The server-side paging changes are in Phoenix 5.1, which is at feature
> parity with 4.16 for HBase 1.x, not 5.0 (which was released in 2018 and is
> roughly equivalent to Phoenix 4.14).
>
> So if Simon's environment is Phoenix 5.0, it can't be affected by the
> server-side changes. (Though I see that he's running HBase 2.2, which I
> thought wasn't supported until Phoenix 5.1?)
>
> Geoffrey
>
> On Tue, Sep 28, 2021 at 4:07 AM Simon Mottram <
> simon.mott...@cucumber.co.nz> wrote:
>
> Hi thanks for the quick response
>
> I'm going to appear thick now, brace yourself.   By client do you mean the
> queryserver running in EMR which does have an hbase-site.xml or my Java API
> as a property, we don't have an hbase-site.xml (unless it's embedded in the
> thin client jar.
>
> Cheers
>
> S
> --
> *From:* Kadir Ozdemir 
> *Sent:* 28 September 2021 7:45 PM
> *To:* user 
> *Subject:* Re: Urgent problem with Phoenix function LAST_VALUE and joins
>
> In Phoenix 5.0 we introduced a new server side paging feature to eliminate
> timeouts due long running server side computations such aggregation and
> joins. I wonder if this new feature caused your issue. If so, the work
> around is to disable it by setting phoenix.server.paging.enabled to false
> in hbase-site.xml. This is a client side config param so you just need to
> restart your client. Hope this will fix your issue.
>
> On Mon, Sep 27, 2021 at 9:23 PM Simon Mottram <
> simon.mott...@cucumber.co.nz> wrote:
>
> Hi
>
> Got my fingers crossed that there's a work around for this as this really
> is a big problem for us
>
> We are using:
>
> Amazon EMR
>
> Release label:emr-6.1.0
> Hadoop distribution:Amazon
> Applications:Hbase 2.2.5, Hive 3.1.2, Phoenix 5.0.0, Pig 0.17.0
>
> Thin Client version:
> phoenix-5.0.0-HBase-2.0-thin-client.jar
>
> We get the following error when doing an LAST_VALUE aggregation where
>
>   1.  A JOIN is empty
>   2.  The column is INTEGER or DATETIME
>
> Remote driver error: IllegalArgumentException: offset (25) + length (4)
> exceed the capacity of the array: 25
>
> The query that breaks is:
>
> SELECT
> "BIOMATERIAL_NAME",
> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
> FROM
> VARIABLE_VALUES_QA.OBSERVATION
> JOIN VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
> ON DOCID = OBSERVATION_VALUE_ID
>   AND TAGNAME = 'TRIAL_ID'
> AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe'
> GROUP BY
> "BIOMATERIAL_NAME"
> LIMIT 10
>  OFFSET 0;
>
> I can refactor this using EXIST but get same error, presumably the driver
> knows to treat them the same:
>
> SELECT
> "BIOMATERIAL_NAME",
> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
> FROM
> VARIABLE_VALUES_QA.OBSERVATION
> WHERE
> EXISTS (
> SELECT
> DOCID
> FROM
> VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
> WHERE
> DOCID = OBSERVATION_VALUE_ID
> AND TAGNAME = 'TRIAL_ID'
> AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe')
> GROUP BY
> "BIOMATERIAL_NAME"
> LIMIT 10
>  OFFSET 0;
>
> If we remove the external reference we get no error, regardless of whether
> there are any hits or not
>
> -- these all work
> There are no hits for this query
>
> SELECT
> "BIOMATERIAL_NAME",
> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
> FROM
> VARIABLE_VALUES_QA.OBSERVATION
> WHERE
> BIOMATERIAL_TYPE = 'aardvark'
> GROUP BY
> "BIOMATERIAL_NAME"
> LIMIT 10
>  OFFSET 0;
>
> Lots of hits for this query:
>
> SELECT
> "BIOMATERIAL_NAME",
> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
> FROM
> VARIABLE_VALUES_QA.OBSERVATION
> GROUP BY
> "BIOMATERIAL_NAME"
> LIMIT 10  OFFSET 0;
>
> I've tried weird things like:
>
> Comparing exists to TRUE to try and force it into a normal BOOLEAN value,
> same IllegalArgumentException.
>
>  SELECT
> "BIOMATERIAL_NAME",
> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"