Re: Issues with LAST_VALUE aggregation UDF

2018-09-11 Thread Abhishek Gupta
Thank you so much for the response, Josh.

Will check on 4.14 and get back to you with the results.

Thanks

On Tue, Sep 11, 2018 at 8:45 PM Josh Elser  wrote:

> The versions you provided in the description make it sound like you're
> actually using HDP's distribution of Apache Phoenix, not an official
> Apache Phoenix release.
>
> Please test against an Apache Phoenix release or contact Hortonworks for
> support. It would not be unheard of that this issue has already been fixed.
>
> On 9/11/18 10:01 AM, Abhishek Gupta wrote:
> > Hi,
> >
> > We are trying to use the LAST_VALUE aggregation UDF in our Phoenix SQL
> > queries that serve REST APIs over analytics data. However, we are seeing
> > a number of issues/limitations in this function.
> > First, there seems to be a bug where it fails for sparse columns that
> > have NULLs. I have created a JIRA ticket for it with the details of the
> > exception and steps to reproduce:
> > https://issues.apache.org/jira/browse/PHOENIX-4898
> >
> > Second, when computing the last value the function ignores all of the
> > group by columns except the one named in the order by clause.
> > For example, the following dataset has primary key columns pk1-pk5 and
> > metric columns m1-m3; I intend to do a SUM on m2 and m3 and take the
> > last value of m1:
> >
> > | pk1   | pk2           | pk3   | pk4    | pk5    | m1    | m2  | m3   |
> > ------------------------------------------------------------------------
> > | test  | 20180801      | app1  | plat1  | lang1  | 1     | 10  | 100  |
> > | test  | 20180801      | app1  | plat1  | lang2  | 2     | 10  | 100  |
> > | test  | 20180801      | app1  | plat2  | lang1  | 3     | 10  | 100  |
> > | test  | 20180801      | app2  | plat1  | lang1  | 4     | 10  | 100  |
> > | test  | 201808010030  | app1  | plat1  | lang1  | 10    | 10  | 100  |
> > | test  | 201808010030  | app1  | plat1  | lang2  | 20    | 10  | 100  |
> > | test  | 201808010030  | app1  | plat2  | lang1  | 30    | 10  | 100  |
> > | test  | 201808010030  | app2  | plat1  | lang1  | 40    | 10  | 100  |
> > | test  | 201808010100  | app1  | plat1  | lang1  | 100   | 10  | 100  |
> > | test  | 201808010100  | app1  | plat1  | lang2  | 200   | 10  | 100  |
> > | test  | 201808010100  | app1  | plat2  | lang1  | 300   | 10  | 100  |
> > | test  | 201808010100  | app2  | plat1  | lang1  | 400   | 10  | 100  |
> > | test  | 201808010130  | app1  | plat1  | lang1  | 1000  | 10  | 100  |
> > | test  | 201808010130  | app1  | plat1  | lang2  | 2000  | 10  | 100  |
> > | test  | 201808010130  | app1  | plat2  | lang1  | 3000  | 10  | 100  |
> > | test  | 201808010130  | app2  | plat1  | lang1  | 4000  | 10  | 100  |
> > | test  | 201808010200  | app1  | plat1  | lang1  | 1     | 10  | 100  |
> > | test  | 201808010200  | app1  | plat1  | lang2  | 2     | 10  | 100  |
> > | test  | 201808010200  | app1  | plat2  | lang1  | 3     | 10  | 100  |
> > | test  | 201808010200  | app2  | plat1  | lang1  | 4     | 10  | 100  |
> >
> >
> > If I run the following query (using all primary key columns in group by)
> >
> > select TO_CHAR(TO_DATE(pk2,'yyyyMMddHHmm'),'yyyyMMddHH') as
> > t,pk3,pk4,pk5, last_value(m1) within group (order by pk2 asc) as oi,
> > sum(m2), sum(m3) from test_table group by pk1,t,pk3,pk4,pk5;
> >
> > I get the correct last value, since for each pk2 value used in the order
> > by there is only one row per group:
> >
> >
> > +-------------+------+--------+--------+-------+----------+----------+
> > |      T      | pk3  |  pk4   |  pk5   |  OI   | SUM(M2)  | SUM(M3)  |
> > +-------------+------+--------+--------+-------+----------+----------+
> > | 2018080100  | app1 | plat1  | lang1  | 10    | 20       | 200      |
> > | 2018080100  | app1 | plat1  | lang2  | 20    | 20       | 200      |
> > | 2018080100  | app1 | plat2  | lang1  | 30    | 20       | 200      |
> > | 2018080100  | app2 | plat1  | lang1  | 40    | 20       | 200      |
> > | 2018080101  | app1 | plat1  | lang1  | 1000  | 20       | 200      |
> > | 2018080101  | app1 | plat1  | lang2  |

Re: Issue in upgrading phoenix : java.lang.ArrayIndexOutOfBoundsException: SYSTEM:CATALOG 63

2018-09-11 Thread Tanvi Bhandari
" mapped hbase tables to phoenix and created them explicitly from phoenix
sqlline client. I first created schema corresponding to namespace and then
tables." By this statement, I meant the same. I re-created my tables since
I had the DDLs with me.

After that I tried getting the count of records in my table which gave me 8
records (expected result). - *select count(*) from "myTable"*;
But when I performed the *select * from "myTable";* it is not returning any
result.
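One possible cause, offered only as a guess: Phoenix 4.10 and later default
to encoded column qualifiers, while data written by a 4.6 client uses the
plain column names, so a freshly re-created table may count rows (count(*)
does not need to decode the value columns) yet project none of them. If
that is the case here, re-creating the table with column encoding disabled
might help, e.g.:

-- Hypothetical sketch: re-create the table without column encoding so the
-- qualifiers written by the old 4.6 client stay readable. The column list
-- is a placeholder; use the real DDL.
CREATE TABLE "myTable" (
    "id" VARCHAR PRIMARY KEY,
    "cf"."col1" VARCHAR
) COLUMN_ENCODED_BYTES = 0;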

On Wed, Sep 12, 2018 at 1:55 AM Thomas D'Silva 
wrote:

> Since you dropped all the system tables, all the phoenix metadata was
> lost. If you have the ddl statements used to create your tables, you can
> try rerunning them.
>
> On Tue, Sep 11, 2018 at 9:32 AM, Tanvi Bhandari 
> wrote:
>
>> Hi,
>>
>>
>>
>> I am trying to upgrade the Phoenix binaries in my setup from phoenix-4.6
>> (where schemas were optional) to phoenix-4.14 (where a schema is
>> required).
>>
>> Earlier, I had the phoenix-4.6-hbase-1.1 binaries. When I run
>> phoenix-4.14-hbase-1.3 on the same data, HBase comes up fine, but when I
>> try to connect to Phoenix using the sqlline client, I get the following
>> error on the *console*:
>>
>>
>>
>> 18/09/07 04:22:48 WARN ipc.CoprocessorRpcChannel: Call failed on
>> IOException
>>
>> org.apache.hadoop.hbase.DoNotRetryIOException:
>> org.apache.hadoop.hbase.DoNotRetryIOException: SYSTEM:CATALOG: 63
>>
>> at
>> org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:120)
>>
>> at
>> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3572)
>>
>> at
>> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:16422)
>>
>> at
>> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7435)
>>
>> at
>> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1875)
>>
>> at
>> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1857)
>>
>> at
>> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32209)
>>
>> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)
>>
>> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
>>
>> at
>> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
>>
>> at
>> org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
>>
>> at java.lang.Thread.run(Thread.java:745)
>>
>> Caused by: java.lang.ArrayIndexOutOfBoundsException: 63
>>
>> at org.apache.phoenix.schema.PTableImpl.init(PTableImpl.java:517)
>>
>> at
>> org.apache.phoenix.schema.PTableImpl.<init>(PTableImpl.java:421)
>>
>> at
>> org.apache.phoenix.schema.PTableImpl.makePTable(PTableImpl.java:406)
>>
>> at
>> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1046)
>>
>> at
>> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:587)
>>
>>at
>> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.loadTable(MetaDataEndpointImpl.java:1305)
>>
>> at
>> org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3568)
>>
>> ... 10 more
>>
>>
>>
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>> Method)
>>
>> at
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>>
>> at
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>
>> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>>
>> at
>> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>>
>> at
>> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
>>
>> at
>> org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:326)
>>
>> at
>> org.apache.hadoop.hbase.protobuf.ProtobufUtil.execService(ProtobufUtil.java:1629)
>>
>> at
>> org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel$1.call(RegionCoprocessorRpcChannel.java:104)
>>
>> at
>> org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel$1.call(RegionCoprocessorRpcChannel.java:94)
>>
>> at
>> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:136)
>>
>> at
>> org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel.callExecService(RegionCoprocessorRpcChannel.java:107)
>>
>> at
>> org.apache.hadoop.hbase.ipc.CoprocessorRpcChannel.callMethod(CoprocessorRpcChannel.java:56)
>>
>> at
>> org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService$Stub.getVersion(MetaDataProtos.java:16739)
>>
>> at
>> 

Issues with LAST_VALUE aggregation UDF

2018-09-11 Thread Abhishek Gupta
Hi,

We are trying to use the LAST_VALUE aggregation UDF in our Phoenix SQL
queries that serve REST APIs over analytics data. However, we are seeing a
number of issues/limitations in this function.
First, there seems to be a bug where it fails for sparse columns that have
NULLs. I have created a JIRA ticket for it with the details of the
exception and steps to reproduce:
https://issues.apache.org/jira/browse/PHOENIX-4898
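For reference, a minimal sketch of the kind of repro described (the table
and column names here are hypothetical; the authoritative steps are in the
JIRA above):

-- Hypothetical repro sketch: LAST_VALUE over a sparse column with NULLs.
CREATE TABLE IF NOT EXISTS sparse_test (
    id INTEGER NOT NULL,
    ts INTEGER NOT NULL,
    m1 INTEGER,  -- sparse metric; some rows carry no value
    CONSTRAINT pk PRIMARY KEY (id, ts)
);

UPSERT INTO sparse_test (id, ts, m1) VALUES (1, 1, 10);
UPSERT INTO sparse_test (id, ts) VALUES (1, 2);  -- m1 is NULL here

-- Per the report, an aggregation of this shape fails once m1 has NULLs:
SELECT id, LAST_VALUE(m1) WITHIN GROUP (ORDER BY ts ASC) AS last_m1
FROM sparse_test
GROUP BY id;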

Second, when computing the last value the function ignores all of the group
by columns except the one named in the order by clause.
For example, the following dataset has primary key columns pk1-pk5 and
metric columns m1-m3; I intend to do a SUM on m2 and m3 and take the last
value of m1:

| pk1   | pk2           | pk3   | pk4    | pk5    | m1    | m2  | m3   |
------------------------------------------------------------------------
| test  | 20180801      | app1  | plat1  | lang1  | 1     | 10  | 100  |
| test  | 20180801      | app1  | plat1  | lang2  | 2     | 10  | 100  |
| test  | 20180801      | app1  | plat2  | lang1  | 3     | 10  | 100  |
| test  | 20180801      | app2  | plat1  | lang1  | 4     | 10  | 100  |
| test  | 201808010030  | app1  | plat1  | lang1  | 10    | 10  | 100  |
| test  | 201808010030  | app1  | plat1  | lang2  | 20    | 10  | 100  |
| test  | 201808010030  | app1  | plat2  | lang1  | 30    | 10  | 100  |
| test  | 201808010030  | app2  | plat1  | lang1  | 40    | 10  | 100  |
| test  | 201808010100  | app1  | plat1  | lang1  | 100   | 10  | 100  |
| test  | 201808010100  | app1  | plat1  | lang2  | 200   | 10  | 100  |
| test  | 201808010100  | app1  | plat2  | lang1  | 300   | 10  | 100  |
| test  | 201808010100  | app2  | plat1  | lang1  | 400   | 10  | 100  |
| test  | 201808010130  | app1  | plat1  | lang1  | 1000  | 10  | 100  |
| test  | 201808010130  | app1  | plat1  | lang2  | 2000  | 10  | 100  |
| test  | 201808010130  | app1  | plat2  | lang1  | 3000  | 10  | 100  |
| test  | 201808010130  | app2  | plat1  | lang1  | 4000  | 10  | 100  |
| test  | 201808010200  | app1  | plat1  | lang1  | 1     | 10  | 100  |
| test  | 201808010200  | app1  | plat1  | lang2  | 2     | 10  | 100  |
| test  | 201808010200  | app1  | plat2  | lang1  | 3     | 10  | 100  |
| test  | 201808010200  | app2  | plat1  | lang1  | 4     | 10  | 100  |


If I run the following query (using all primary key columns in group by)

select TO_CHAR(TO_DATE(pk2,'yyyyMMddHHmm'),'yyyyMMddHH') as t,pk3,pk4,pk5,
last_value(m1) within group (order by pk2 asc) as oi, sum(m2), sum(m3) from
test_table group by pk1,t,pk3,pk4,pk5;


I get the correct last value, since for each pk2 value used in the order by
there is only one row per group:

+-------------+------+--------+--------+-------+----------+----------+
|      T      | pk3  |  pk4   |  pk5   |  OI   | SUM(M2)  | SUM(M3)  |
+-------------+------+--------+--------+-------+----------+----------+
| 2018080100  | app1 | plat1  | lang1  | 10    | 20       | 200      |
| 2018080100  | app1 | plat1  | lang2  | 20    | 20       | 200      |
| 2018080100  | app1 | plat2  | lang1  | 30    | 20       | 200      |
| 2018080100  | app2 | plat1  | lang1  | 40    | 20       | 200      |
| 2018080101  | app1 | plat1  | lang1  | 1000  | 20       | 200      |
| 2018080101  | app1 | plat1  | lang2  | 2000  | 20       | 200      |
| 2018080101  | app1 | plat2  | lang1  | 3000  | 20       | 200      |
| 2018080101  | app2 | plat1  | lang1  | 4000  | 20       | 200      |
| 2018080102  | app1 | plat1  | lang1  | 1     | 20       | 100      |
| 2018080102  | app1 | plat1  | lang2  | 2     | 10       | 100      |
| 2018080102  | app1 | plat2  | lang1  | 3     | 10       | 100      |
| 2018080102  | app2 | plat1  | lang1  | 4     | 10       | 100      |
+-------------+------+--------+--------+-------+----------+----------+


However, if I group by fewer than all of the primary key columns, the
LAST_VALUE function ignores the remaining group by columns when sorting
and returns an incorrect last value:

select TO_CHAR(TO_DATE(pk2,'yyyyMMddHHmm'),'yyyyMMddHH') as t, pk3,
last_value(m1) within group (order by pk2 asc) as oi, sum(m2), sum(m3) from
test_table group by pk1,t,pk3;


+-------------+------+-------+----------+----------+
|
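One possible workaround for this second issue, sketched here but untested:
compute the last value by joining back to the row holding MAX(pk2) for each
group, so that every group-by column participates. Table and column names
follow the example above.

-- Untested workaround sketch: take m1 from the row with MAX(pk2) per group
-- and join it to the aggregated sums, honoring all group-by columns.
SELECT a.t, a.pk3, a.pk4, a.pk5, b.m1 AS oi, a.sm2, a.sm3
FROM (
    SELECT TO_CHAR(TO_DATE(pk2,'yyyyMMddHHmm'),'yyyyMMddHH') AS t,
           pk3, pk4, pk5,
           MAX(pk2) AS max_pk2,
           SUM(m2)  AS sm2,
           SUM(m3)  AS sm3
    FROM test_table
    GROUP BY pk1, TO_CHAR(TO_DATE(pk2,'yyyyMMddHHmm'),'yyyyMMddHH'),
             pk3, pk4, pk5
) a
JOIN test_table b
  ON  b.pk2 = a.max_pk2
  AND b.pk3 = a.pk3
  AND b.pk4 = a.pk4
  AND b.pk5 = a.pk5;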

Speakers needed for Apache DC Roadshow

2018-09-11 Thread Rich Bowen
We need your help to make the Apache Washington DC Roadshow on Dec 4th a 
success.


What do we need most? Speakers!

We're bringing a unique DC flavor to this event by mixing Open Source
Software with talks about Apache projects as well as OSS CyberSecurity,
OSS in Government, and OSS Career advice.


Please take a look at: http://www.apachecon.com/usroadshow18/

(Note: You are receiving this message because you are subscribed to one 
or more mailing lists at The Apache Software Foundation.)


Rich, for the ApacheCon Planners

--
rbo...@apache.org
http://apachecon.com
@ApacheCon


Issue in upgrading phoenix : java.lang.ArrayIndexOutOfBoundsException: SYSTEM:CATALOG 63

2018-09-11 Thread Tanvi Bhandari
Hi,



I am trying to upgrade the Phoenix binaries in my setup from phoenix-4.6
(where schemas were optional) to phoenix-4.14 (where a schema is
required).

Earlier, I had the phoenix-4.6-hbase-1.1 binaries. When I run
phoenix-4.14-hbase-1.3 on the same data, HBase comes up fine, but when I
try to connect to Phoenix using the sqlline client, I get the following
error on the *console*:



18/09/07 04:22:48 WARN ipc.CoprocessorRpcChannel: Call failed on IOException

org.apache.hadoop.hbase.DoNotRetryIOException:
org.apache.hadoop.hbase.DoNotRetryIOException: SYSTEM:CATALOG: 63

at
org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:120)

at
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3572)

at
org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:16422)

at
org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7435)

at
org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1875)

at
org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1857)

at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32209)

at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)

at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)

at
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)

at
org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)

at java.lang.Thread.run(Thread.java:745)

Caused by: java.lang.ArrayIndexOutOfBoundsException: 63

at org.apache.phoenix.schema.PTableImpl.init(PTableImpl.java:517)

at org.apache.phoenix.schema.PTableImpl.<init>(PTableImpl.java:421)

at
org.apache.phoenix.schema.PTableImpl.makePTable(PTableImpl.java:406)

at
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1046)

at
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:587)

   at
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.loadTable(MetaDataEndpointImpl.java:1305)

at
org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3568)

... 10 more



at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)

at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)

at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

at java.lang.reflect.Constructor.newInstance(Constructor.java:423)

at
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)

at
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)

at
org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:326)

at
org.apache.hadoop.hbase.protobuf.ProtobufUtil.execService(ProtobufUtil.java:1629)

at
org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel$1.call(RegionCoprocessorRpcChannel.java:104)

at
org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel$1.call(RegionCoprocessorRpcChannel.java:94)

at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:136)

at
org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel.callExecService(RegionCoprocessorRpcChannel.java:107)

at
org.apache.hadoop.hbase.ipc.CoprocessorRpcChannel.callMethod(CoprocessorRpcChannel.java:56)

at
org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService$Stub.getVersion(MetaDataProtos.java:16739)

at
org.apache.phoenix.query.ConnectionQueryServicesImpl$5.call(ConnectionQueryServicesImpl.java:1271)

at
org.apache.phoenix.query.ConnectionQueryServicesImpl$5.call(ConnectionQueryServicesImpl.java:1263)

at org.apache.hadoop.hbase.client.HTable$15.call(HTable.java:1736)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)





*Region-server logs are as follows: *

2018-09-07 03:23:36,170 ERROR
[B.defaultRpcServer.handler=1,queue=1,port=29062]
coprocessor.MetaDataEndpointImpl: loading system catalog table inside
getVersion failed

java.lang.ArrayIndexOutOfBoundsException: 63

   at
org.apache.phoenix.schema.PTableImpl.init(PTableImpl.java:517)

   at
org.apache.phoenix.schema.PTableImpl.<init>(PTableImpl.java:421)

   at
org.apache.phoenix.schema.PTableImpl.makePTable(PTableImpl.java:406)

   at

Re: Issue in upgrading phoenix : java.lang.ArrayIndexOutOfBoundsException: SYSTEM:CATALOG 63

2018-09-11 Thread Thomas D'Silva
Since you dropped all the system tables, all the phoenix metadata was lost.
If you have the ddl statements used to create your tables, you can try
rerunning them.
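If it helps, a sketch of what re-running the DDL can look like on 4.14 with
namespace mapping enabled (the schema, table, and column names below are
placeholders, not the actual DDL):

-- Placeholder sketch: re-create the schema (mapped to an HBase namespace),
-- then the table inside it, before re-running any index or view DDL.
CREATE SCHEMA IF NOT EXISTS MY_SCHEMA;

CREATE TABLE IF NOT EXISTS MY_SCHEMA.MY_TABLE (
    id  VARCHAR NOT NULL PRIMARY KEY,
    val VARCHAR
);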

On Tue, Sep 11, 2018 at 9:32 AM, Tanvi Bhandari 
wrote:

> Hi,
>
>
>
> I am trying to upgrade the Phoenix binaries in my setup from phoenix-4.6
> (where schemas were optional) to phoenix-4.14 (where a schema is
> required).
>
> Earlier, I had the phoenix-4.6-hbase-1.1 binaries. When I run
> phoenix-4.14-hbase-1.3 on the same data, HBase comes up fine, but when I
> try to connect to Phoenix using the sqlline client, I get the following
> error on the *console*:
>
>
>
> 18/09/07 04:22:48 WARN ipc.CoprocessorRpcChannel: Call failed on
> IOException
>
> org.apache.hadoop.hbase.DoNotRetryIOException: 
> org.apache.hadoop.hbase.DoNotRetryIOException:
> SYSTEM:CATALOG: 63
>
> at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:120)
>
> at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3572)
>
> at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:16422)
>
> at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7435)
>
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1875)
>
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1857)
>
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32209)
>
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)
>
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
>
> at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
>
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
>
> at java.lang.Thread.run(Thread.java:745)
>
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 63
>
> at org.apache.phoenix.schema.PTableImpl.init(PTableImpl.java:517)
>
> at org.apache.phoenix.schema.PTableImpl.<init>(PTableImpl.java:421)
>
> at org.apache.phoenix.schema.PTableImpl.makePTable(PTableImpl.java:406)
>
> at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:1046)
>
> at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:587)
>
> at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.loadTable(MetaDataEndpointImpl.java:1305)
>
> at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3568)
>
> ... 10 more
>
>
>
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>
> at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>
> at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
>
> at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:326)
>
> at org.apache.hadoop.hbase.protobuf.ProtobufUtil.execService(ProtobufUtil.java:1629)
>
> at org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel$1.call(RegionCoprocessorRpcChannel.java:104)
>
> at org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel$1.call(RegionCoprocessorRpcChannel.java:94)
>
> at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:136)
>
> at org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel.callExecService(RegionCoprocessorRpcChannel.java:107)
>
> at org.apache.hadoop.hbase.ipc.CoprocessorRpcChannel.callMethod(CoprocessorRpcChannel.java:56)
>
> at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService$Stub.getVersion(MetaDataProtos.java:16739)
>
> at org.apache.phoenix.query.ConnectionQueryServicesImpl$5.call(ConnectionQueryServicesImpl.java:1271)
>
> at org.apache.phoenix.query.ConnectionQueryServicesImpl$5.call(ConnectionQueryServicesImpl.java:1263)
>
> at org.apache.hadoop.hbase.client.HTable$15.call(HTable.java:1736)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>
> at java.lang.Thread.run(Thread.java:745)
>
>
>
>
>
> *Region-server logs are as follows: 

Re: Issues with LAST_VALUE aggregation UDF

2018-09-11 Thread Josh Elser
The versions you provided in the description make it sound like you're 
actually using HDP's distribution of Apache Phoenix, not an official 
Apache Phoenix release.


Please test against an Apache Phoenix release or contact Hortonworks for 
support. It would not be unheard of that this issue has already been fixed.


On 9/11/18 10:01 AM, Abhishek Gupta wrote:

Hi,

We are trying to use the LAST_VALUE aggregation UDF in our Phoenix SQL
queries that serve REST APIs over analytics data. However, we are seeing
a number of issues/limitations in this function.
First, there seems to be a bug where it fails for sparse columns that
have NULLs. I have created a JIRA ticket for it with the details of the
exception and steps to reproduce:

https://issues.apache.org/jira/browse/PHOENIX-4898

Second, when computing the last value the function ignores all of the
group by columns except the one named in the order by clause.
For example, the following dataset has primary key columns pk1-pk5 and
metric columns m1-m3; I intend to do a SUM on m2 and m3 and take the
last value of m1:


| pk1   | pk2           | pk3   | pk4    | pk5    | m1    | m2  | m3   |
------------------------------------------------------------------------
| test  | 20180801      | app1  | plat1  | lang1  | 1     | 10  | 100  |
| test  | 20180801      | app1  | plat1  | lang2  | 2     | 10  | 100  |
| test  | 20180801      | app1  | plat2  | lang1  | 3     | 10  | 100  |
| test  | 20180801      | app2  | plat1  | lang1  | 4     | 10  | 100  |
| test  | 201808010030  | app1  | plat1  | lang1  | 10    | 10  | 100  |
| test  | 201808010030  | app1  | plat1  | lang2  | 20    | 10  | 100  |
| test  | 201808010030  | app1  | plat2  | lang1  | 30    | 10  | 100  |
| test  | 201808010030  | app2  | plat1  | lang1  | 40    | 10  | 100  |
| test  | 201808010100  | app1  | plat1  | lang1  | 100   | 10  | 100  |
| test  | 201808010100  | app1  | plat1  | lang2  | 200   | 10  | 100  |
| test  | 201808010100  | app1  | plat2  | lang1  | 300   | 10  | 100  |
| test  | 201808010100  | app2  | plat1  | lang1  | 400   | 10  | 100  |
| test  | 201808010130  | app1  | plat1  | lang1  | 1000  | 10  | 100  |
| test  | 201808010130  | app1  | plat1  | lang2  | 2000  | 10  | 100  |
| test  | 201808010130  | app1  | plat2  | lang1  | 3000  | 10  | 100  |
| test  | 201808010130  | app2  | plat1  | lang1  | 4000  | 10  | 100  |
| test  | 201808010200  | app1  | plat1  | lang1  | 1     | 10  | 100  |
| test  | 201808010200  | app1  | plat1  | lang2  | 2     | 10  | 100  |
| test  | 201808010200  | app1  | plat2  | lang1  | 3     | 10  | 100  |
| test  | 201808010200  | app2  | plat1  | lang1  | 4     | 10  | 100  |


If I run the following query (using all primary key columns in group by)

select TO_CHAR(TO_DATE(pk2,'yyyyMMddHHmm'),'yyyyMMddHH') as
t,pk3,pk4,pk5, last_value(m1) within group (order by pk2 asc) as oi,
sum(m2), sum(m3) from test_table group by pk1,t,pk3,pk4,pk5;

I get the correct last value, since for each pk2 value used in the order
by there is only one row per group:



+-------------+------+--------+--------+-------+----------+----------+
|      T      | pk3  |  pk4   |  pk5   |  OI   | SUM(M2)  | SUM(M3)  |
+-------------+------+--------+--------+-------+----------+----------+
| 2018080100  | app1 | plat1  | lang1  | 10    | 20       | 200      |
| 2018080100  | app1 | plat1  | lang2  | 20    | 20       | 200      |
| 2018080100  | app1 | plat2  | lang1  | 30    | 20       | 200      |
| 2018080100  | app2 | plat1  | lang1  | 40    | 20       | 200      |
| 2018080101  | app1 | plat1  | lang1  | 1000  | 20       | 200      |
| 2018080101  | app1 | plat1  | lang2  | 2000  | 20       | 200      |
| 2018080101  | app1 | plat2  | lang1  | 3000  | 20       | 200      |
| 2018080101  | app2 | plat1  | lang1  | 4000  | 20       | 200      |
| 2018080102  | app1 | plat1  | lang1  | 1     | 20       | 100      |
| 2018080102  | app1 | plat1  | lang2  | 2     | 10       | 100      |
| 2018080102  | app1 | plat2  | lang1  |