[jira] [Commented] (PHOENIX-4724) Efficient Equi-Depth histogram for streaming data

2018-05-17 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480191#comment-16480191
 ] 

James Taylor commented on PHOENIX-4724:
---

+1. Excellent work, [~vincentpoon] !

> Efficient Equi-Depth histogram for streaming data
> -
>
> Key: PHOENIX-4724
> URL: https://issues.apache.org/jira/browse/PHOENIX-4724
> Project: Phoenix
>  Issue Type: Sub-task
>Affects Versions: 4.15.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Major
> Attachments: PHOENIX-4724.v1.patch, PHOENIX-4724.v2.patch
>
>
> Equi-Depth histogram from 
> http://web.cs.ucla.edu/~zaniolo/papers/Histogram-EDBT2011-CamReady.pdf, but 
> without the sliding window - we assume a single window over the entire data 
> set.
> Used to generate the bucket boundaries of a histogram where each bucket has 
> the same # of items.
> This is useful, for example, for pre-splitting an index table, by feeding in 
> data from the indexed column.
> Works on streaming data - the histogram is dynamically updated for each new 
> value.
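
For readers skimming the thread, here is a minimal, hypothetical sketch of the streaming idea (not the patch's actual code): keep bucket upper bounds sorted, count each incoming value into its bucket, and split any bucket whose count exceeds a threshold so bucket depths stay roughly equal.

{code:java}
// Hypothetical sketch of a streaming equi-depth histogram; class and method
// names are illustrative, not the patch's implementation.
import java.util.TreeMap;

public class EquiDepthSketch {
    // bucket upper bound -> count of values seen in that bucket
    private final TreeMap<Long, Long> buckets = new TreeMap<>();
    private final long maxCount; // split threshold, e.g. 2 * expectedTotal / numBuckets

    public EquiDepthSketch(long maxCount) {
        this.maxCount = maxCount;
    }

    public void add(long value) {
        Long bound = buckets.ceilingKey(value);
        if (bound == null) bound = value;          // extend the rightmost bucket
        long c = buckets.merge(bound, 1L, Long::sum);
        if (c > maxCount) {                        // split an overfull bucket in two
            Long prev = buckets.lowerKey(bound);
            long lo = (prev == null) ? 0 : prev;   // assumes non-negative values
            long mid = lo / 2 + bound / 2;         // crude midpoint; the paper splits
            if (mid > lo && mid < bound) {         // at the bucket's approximate median
                buckets.put(mid, c / 2);
                buckets.put(bound, c - c / 2);
            }
        }
    }

    // The bucket upper bounds are the equi-depth boundaries, e.g. split points.
    public Iterable<Long> boundaries() {
        return buckets.keySet();
    }
}
{code}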



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4704) Presplit index tables when building asynchronously

2018-05-17 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4704:
--
Fix Version/s: 5.0.0

> Presplit index tables when building asynchronously
> --
>
> Key: PHOENIX-4704
> URL: https://issues.apache.org/jira/browse/PHOENIX-4704
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4704.master.v1.patch
>
>
> For large data tables with many regions, if we build the index asynchronously 
> using the IndexTool, the index table will initially face a hotspot as all data 
> region mappers attempt to write to the sole new index region.  This can 
> potentially lead to the index getting disabled if writes to the index table 
> time out during this hotspotting.
> We can add an optional step (or perhaps activate it based on the count of 
> regions in the data table) to the IndexTool to first do a MR job to gather 
> stats on the indexed column values, and then attempt to presplit the index 
> table before we do the actual index build MR job.
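
For illustration, a hedged sketch of what the presplit step could look like once the stats job has produced boundary keys, using the plain HBase admin API (names and the split-point encoding are assumptions, not the patch; the real flow and its option flags live in IndexTool, which might instead create the table with split points up front):

{code:java}
// Hedged sketch of the presplit step: split the empty index table at the
// sampled boundary keys before the bulk build starts.
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class PresplitSketch {
    // splitPoints: boundaries of the indexed column values from the stats job,
    // already encoded as index table row keys.
    static void presplit(String indexTable, List<byte[]> splitPoints) throws IOException {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            TableName tn = TableName.valueOf(indexTable);
            for (byte[] point : splitPoints) {
                // Each split adds a region boundary, so the build's mappers
                // spread their writes instead of hotspotting one region.
                admin.split(tn, point);
            }
        }
    }
}
{code}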



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4704) Presplit index tables when building asynchronously

2018-05-17 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480176#comment-16480176
 ] 

James Taylor commented on PHOENIX-4704:
---

+1. Great work, [~vincentpoon]! Let's get this nice feature into 4.14.

> Presplit index tables when building asynchronously
> --
>
> Key: PHOENIX-4704
> URL: https://issues.apache.org/jira/browse/PHOENIX-4704
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4704.master.v1.patch
>
>
> For large data tables with many regions, if we build the index asynchronously 
> using the IndexTool, the index table will initially face a hotspot as all data 
> region mappers attempt to write to the sole new index region.  This can 
> potentially lead to the index getting disabled if writes to the index table 
> time out during this hotspotting.
> We can add an optional step (or perhaps activate it based on the count of 
> regions in the data table) to the IndexTool to first do a MR job to gather 
> stats on the indexed column values, and then attempt to presplit the index 
> table before we do the actual index build MR job.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4704) Presplit index tables when building asynchronously

2018-05-17 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4704:
--
Fix Version/s: 4.14.0

> Presplit index tables when building asynchronously
> --
>
> Key: PHOENIX-4704
> URL: https://issues.apache.org/jira/browse/PHOENIX-4704
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4704.master.v1.patch
>
>
> For large data tables with many regions, if we build the index asynchronously 
> using the IndexTool, the index table will initially face a hotspot as all data 
> region mappers attempt to write to the sole new index region.  This can 
> potentially lead to the index getting disabled if writes to the index table 
> time out during this hotspotting.
> We can add an optional step (or perhaps activate it based on the count of 
> regions in the data table) to the IndexTool to first do a MR job to gather 
> stats on the indexed column values, and then attempt to presplit the index 
> table before we do the actual index build MR job.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4742) DistinctPrefixFilter potentially seeks to lesser key when descending or null value

2018-05-17 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480170#comment-16480170
 ] 

James Taylor commented on PHOENIX-4742:
---

Anyone got time for a code review: [~sergey.soldatov], [~rajeshbabu], 
[~tdsilva]?

> DistinctPrefixFilter potentially seeks to lesser key when descending or null 
> value
> --
>
> Key: PHOENIX-4742
> URL: https://issues.apache.org/jira/browse/PHOENIX-4742
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4742_v1.patch
>
>
> DistinctPrefixFilter seeks to a smaller key than the current key (which 
> causes an infinite loop in HBase 1.4 and seeks to every row in other HBase 
> versions). This happens when:
>  # Last column of distinct is descending. We currently always add a 0x01 
> byte, but since the separator byte is 0xFF when descending, the seek key is 
> too small.
>  # Last column value is null. In this case, instead of adding a 0x01 byte, we 
> need to increment in-place the null value of the last distinct column. 
> This was discovered due to 
> OrderByIT.testOrderByReverseOptimizationWithNUllsLastBug3491 hanging in 
> master.
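
To make the descending case concrete, here is a small standalone demo (illustrative, not Phoenix code) of why appending a 0x01 byte yields a too-small seek key when the separator byte is 0xFF:

{code:java}
// Illustrative only: why a 0x01 suffix makes a too-small seek key for DESC columns.
import org.apache.hadoop.hbase.util.Bytes;

public class SeekKeyDemo {
    public static void main(String[] args) {
        byte[] prefix = Bytes.toBytes("a");

        // Ascending: separator is 0x00, so prefix + 0x01 sorts after every
        // row that still belongs to this distinct prefix.
        byte[] ascRow  = Bytes.add(prefix, new byte[] { 0x00 }, Bytes.toBytes("x"));
        byte[] ascSeek = Bytes.add(prefix, new byte[] { 0x01 });
        System.out.println(Bytes.compareTo(ascSeek, ascRow) > 0);   // true: hint moves forward

        // Descending: separator is 0xFF, so the same 0x01 suffix sorts *before*
        // rows inside the prefix -- the hint points at a lesser key.
        byte[] descRow  = Bytes.add(prefix, new byte[] { (byte) 0xFF }, Bytes.toBytes("x"));
        byte[] descSeek = Bytes.add(prefix, new byte[] { 0x01 });
        System.out.println(Bytes.compareTo(descSeek, descRow) > 0); // false: hint moves backward
    }
}
{code}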



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4742) DistinctPrefixFilter potentially seeks to lesser key when descending or null value

2018-05-17 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4742:
--
Attachment: PHOENIX-4742_v1.patch

> DistinctPrefixFilter potentially seeks to lesser key when descending or null 
> value
> --
>
> Key: PHOENIX-4742
> URL: https://issues.apache.org/jira/browse/PHOENIX-4742
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4742_v1.patch
>
>
> DistinctPrefixFilter seeks to a smaller key than the current key (which 
> causes an infinite loop in HBase 1.4 and seeks to every row in other HBase 
> versions). This happens when:
>  # Last column of distinct is descending. We currently always add a 0x01 
> byte, but since the separator byte is 0xFF when descending, the seek key is 
> too small.
>  # Last column value is null. In this case, instead of adding a 0x01 byte, we 
> need to increment in-place the null value of the last distinct column. 
> This was discovered due to 
> OrderByIT.testOrderByReverseOptimizationWithNUllsLastBug3491 hanging in 
> master.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4742) DistinctPrefixFilter potentially seeks to lesser key when descending or null value

2018-05-17 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4742:
--
Description: 
DistinctPrefixFilter seeks to a smaller key than the current key (which causes 
an infinite loop in HBase 1.4 and seeks to every row in other HBase versions). 
This happens when:
 # Last column of distinct is descending. We currently always add a 0x01 byte, 
but since the separator byte is 0xFF when descending, the seek key is too small.
 # Last column value is null. In this case, instead of adding a 0x01 byte, we 
need to increment in-place the null value of the last distinct column. 

This was discovered due to 
OrderByIT.testOrderByReverseOptimizationWithNUllsLastBug3491 hanging in master.

  was:OrderByIT.testOrderByReverseOptimizationWithNUllsLastBug3491 is the only 
test failing on master (i.e. HBase 1.4). It's getting into an infinite loop 
when a reverse scan is done for the DistinctPrefixFilter. It'd be nice to fix 
this so we can do a release for HBase 1.4. At a minimum, we could disable 
DistinctPrefixFilter when a reverse scan is being done (for HBase 1.4 only).


> DistinctPrefixFilter potentially seeks to lesser key when descending or null 
> value
> --
>
> Key: PHOENIX-4742
> URL: https://issues.apache.org/jira/browse/PHOENIX-4742
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
>
> DistinctPrefixFilter seeks to a smaller key than the current key (which 
> causes an infinite loop in HBase 1.4 and seeks to every row in other HBase 
> versions). This happens when:
>  # Last column of distinct is descending. We currently always add a 0x01 
> byte, but since the separator byte is 0xFF when descending, the seek key is 
> too small.
>  # Last column value is null. In this case, instead of adding a 0x01 byte, we 
> need to increment in-place the null value of the last distinct column. 
> This was discovered due to 
> OrderByIT.testOrderByReverseOptimizationWithNUllsLastBug3491 hanging in 
> master.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4742) DistinctPrefixFilter potentially seeks to lesser key when descending or null value

2018-05-17 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4742:
--
Summary: DistinctPrefixFilter potentially seeks to lesser key when 
descending or null value  (was: Prevent infinite loop with HBase 1.4 and 
DistinctPrefixFilter)

> DistinctPrefixFilter potentially seeks to lesser key when descending or null 
> value
> --
>
> Key: PHOENIX-4742
> URL: https://issues.apache.org/jira/browse/PHOENIX-4742
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Sergey Soldatov
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
>
> OrderByIT.testOrderByReverseOptimizationWithNUllsLastBug3491 is the only test 
> failing on master (i.e. HBase 1.4). It's getting into an infinite loop when a 
> reverse scan is done for the DistinctPrefixFilter. It'd be nice to fix this 
> so we can do a release for HBase 1.4. At a minimum, we could disable 
> DistinctPrefixFilter when a reverse scan is being done (for HBase 1.4 only).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4742) DistinctPrefixFilter potentially seeks to lesser key when descending or null value

2018-05-17 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4742:
--
Fix Version/s: 5.0.0
   4.14.0

> DistinctPrefixFilter potentially seeks to lesser key when descending or null 
> value
> --
>
> Key: PHOENIX-4742
> URL: https://issues.apache.org/jira/browse/PHOENIX-4742
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Sergey Soldatov
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
>
> OrderByIT.testOrderByReverseOptimizationWithNUllsLastBug3491 is the only test 
> failing on master (i.e. HBase 1.4). It's getting into an infinite loop when a 
> reverse scan is done for the DistinctPrefixFilter. It'd be nice to fix this 
> so we can do a release for HBase 1.4. At a minimum, we could disable 
> DistinctPrefixFilter when a reverse scan is being done (for HBase 1.4 only).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PHOENIX-4742) DistinctPrefixFilter potentially seeks to lesser key when descending or null value

2018-05-17 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor reassigned PHOENIX-4742:
-

Assignee: James Taylor  (was: Sergey Soldatov)

> DistinctPrefixFilter potentially seeks to lesser key when descending or null 
> value
> --
>
> Key: PHOENIX-4742
> URL: https://issues.apache.org/jira/browse/PHOENIX-4742
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
>
> OrderByIT.testOrderByReverseOptimizationWithNUllsLastBug3491 is the only test 
> failing on master (i.e. HBase 1.4). It's getting into an infinite loop when a 
> reverse scan is done for the DistinctPrefixFilter. It'd be nice to fix this 
> so we can do a release for HBase 1.4. At a minimum, we could disable 
> DistinctPrefixFilter when a reverse scan is being done (for HBase 1.4 only).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4741) Shade disruptor dependency

2018-05-17 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480151#comment-16480151
 ] 

James Taylor commented on PHOENIX-4741:
---

If we're already shading, shouldn't we close this as "Not a Problem"?

> Shade disruptor dependency 
> ---
>
> Key: PHOENIX-4741
> URL: https://issues.apache.org/jira/browse/PHOENIX-4741
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.0, 5.0.0
>Reporter: Jungtaek Lim
>Assignee: Ankit Singhal
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
>
> We should shade the disruptor dependency to avoid conflicts with the versions 
> used by other frameworks like Storm, Hive, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-05-17 Thread Thomas D'Silva (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480129#comment-16480129
 ] 

Thomas D'Silva commented on PHOENIX-3955:
-

At upgrade time, if you check all tables and find a table with multiple column 
families whose property values aren't in sync, then change them all to the 
value of the default column family. If the table has indexes, change the 
property values to keep them in sync as well. This should cover tables created 
with older clients that might have inconsistent properties, right?
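
A rough sketch of that upgrade-time sync pass, assuming plain HBase 1.x admin APIs and treating the default column family as the source of truth (the class, method, and default-CF handling here are illustrative, not an actual Phoenix upgrade hook):

{code:java}
// Rough sketch of the sync pass described above (illustrative only).
import java.io.IOException;

import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;

public class PropertySyncSketch {
    static void syncFamilies(Admin admin, TableName table, byte[] defaultCf) throws IOException {
        HTableDescriptor desc = admin.getTableDescriptor(table);
        HColumnDescriptor source = desc.getFamily(defaultCf);
        for (HColumnDescriptor cf : desc.getColumnFamilies()) {
            if (cf.getTimeToLive() != source.getTimeToLive()
                    || cf.getKeepDeletedCells() != source.getKeepDeletedCells()
                    || cf.getScope() != source.getScope()) {
                // Align this CF's TTL, KEEP_DELETED_CELLS, and REPLICATION_SCOPE
                // with the default column family.
                cf.setTimeToLive(source.getTimeToLive());
                cf.setKeepDeletedCells(source.getKeepDeletedCells());
                cf.setScope(source.getScope());
                admin.modifyColumn(table, cf);
            }
        }
        // The same values would then be applied to each index table's families.
    }
}
{code}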

> Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync 
> between the physical data table and index tables
> --
>
> Key: PHOENIX-3955
> URL: https://issues.apache.org/jira/browse/PHOENIX-3955
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Chinmay Kulkarni
>Priority: Major
>
> We need to make sure that indexes inherit the REPLICATION_SCOPE, 
> KEEP_DELETED_CELLS and TTL properties from the base table. Otherwise we can 
> run into situations where the data was removed (or not removed) from the data 
> table but was removed (or not removed) from the index. Or vice-versa. We also 
> need to make sure that any ALTER TABLE SET TTL or ALTER TABLE SET 
> KEEP_DELETED_CELLS statements propagate the properties to the indexes too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4741) Shade disruptor dependency

2018-05-17 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480128#comment-16480128
 ] 

Ankit Singhal commented on PHOENIX-4741:


I just checked; Phoenix is already shading the disruptor library in its client. 
Reducing the priority for now; let me know, [~kabhwan], if you still see a 
conflict with the disruptor dependency.

> Shade disruptor dependency 
> ---
>
> Key: PHOENIX-4741
> URL: https://issues.apache.org/jira/browse/PHOENIX-4741
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.0, 5.0.0
>Reporter: Jungtaek Lim
>Assignee: Ankit Singhal
>Priority: Blocker
> Fix For: 4.14.0, 5.0.0
>
>
> We should shade the disruptor dependency to avoid conflicts with the versions 
> used by other frameworks like Storm, Hive, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4741) Shade disruptor dependency

2018-05-17 Thread Ankit Singhal (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated PHOENIX-4741:
---
Priority: Major  (was: Blocker)

> Shade disruptor dependency 
> ---
>
> Key: PHOENIX-4741
> URL: https://issues.apache.org/jira/browse/PHOENIX-4741
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.0, 5.0.0
>Reporter: Jungtaek Lim
>Assignee: Ankit Singhal
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
>
> We should shade the disruptor dependency to avoid conflicts with the versions 
> used by other frameworks like Storm, Hive, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4743) ALTER TABLE ADD COLUMN for global index should not modify HBase metadata if failed

2018-05-17 Thread Chinmay Kulkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-4743:
--
Description: 
When you issue an "ALTER TABLE" for a global index to add a column, Phoenix 
throws a SQLException, but the HBase metadata for the global index table is 
still modified.

Steps to reproduce:
 # Create the base data table: 
{code:java}
create table if not exists z_base_table (id INTEGER not null primary key, host 
VARCHAR(10), flag boolean);{code}

 # Create a global index on top of this table: 
{code:java}
create index global_z_index on z_base_table(HOST);{code}

 # Add a column to the global index table:
{code:java}
alter table global_z_index add cf1.age INTEGER;{code}

This will throw an exception in Phoenix, but HBase metadata for the global 
index table is still modified. Stack trace:
{noformat}
Error: ERROR 1010 (42M01): Not allowed to mutate table. Cannot add/drop column 
referenced by VIEW columnName=GLOBAL_Z_INDEX (state=42M01,code=1010)
 java.sql.SQLException: ERROR 1010 (42M01): Not allowed to mutate table. Cannot 
add/drop column referenced by VIEW columnName=GLOBAL_Z_INDEX
 at 
org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:494)
 at 
org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:150)
 at 
org.apache.phoenix.schema.MetaDataClient.processMutationResult(MetaDataClient.java:3049)
 at org.apache.phoenix.schema.MetaDataClient.addColumn(MetaDataClient.java:3503)
 at org.apache.phoenix.schema.MetaDataClient.addColumn(MetaDataClient.java:3210)
 at 
org.apache.phoenix.jdbc.PhoenixStatement$ExecutableAddColumnStatement$1.execute(PhoenixStatement.java:1432)
 at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:408)
 at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:391)
 at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
 at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:390)
 at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:378)
 at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1825)
 at sqlline.Commands.execute(Commands.java:822)
 at sqlline.Commands.sql(Commands.java:732)
 at sqlline.SqlLine.dispatch(SqlLine.java:813)
 at sqlline.SqlLine.begin(SqlLine.java:686)
 at sqlline.SqlLine.start(SqlLine.java:398)
 at sqlline.SqlLine.main(SqlLine.java:291){noformat}
 

  was:
When you issue an "ALTER TABLE" for a global index to add a column, Phoenix 
throws a SQLException, but the HBase metadata for the global index table is 
still modified.

Steps to reproduce:
 # Create the base data table: 

{code:java}
create table if not exists z_base_table (id INTEGER not null primary key, host 
VARCHAR(10), flag boolean);{code}

 # Create a global index on top of this table:
{code:java}
create index global_z_index on z_base_table(HOST);{code}

 # Alter the global index table to add a column:
{code:java}
alter table global_z_index add cf1.age INTEGER;{code}

This will throw an exception in Phoenix, but HBase metadata for the global 
index table is still modified. Stack trace:

 
{noformat}
Error: ERROR 1010 (42M01): Not allowed to mutate table. Cannot add/drop column 
referenced by VIEW columnName=GLOBAL_Z_INDEX (state=42M01,code=1010)
 java.sql.SQLException: ERROR 1010 (42M01): Not allowed to mutate table. Cannot 
add/drop column referenced by VIEW columnName=GLOBAL_Z_INDEX
 at 
org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:494)
 at 
org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:150)
 at 
org.apache.phoenix.schema.MetaDataClient.processMutationResult(MetaDataClient.java:3049)
 at org.apache.phoenix.schema.MetaDataClient.addColumn(MetaDataClient.java:3503)
 at org.apache.phoenix.schema.MetaDataClient.addColumn(MetaDataClient.java:3210)
 at 
org.apache.phoenix.jdbc.PhoenixStatement$ExecutableAddColumnStatement$1.execute(PhoenixStatement.java:1432)
 at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:408)
 at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:391)
 at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
 at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:390)
 at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:378)
 at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1825)
 at sqlline.Commands.execute(Commands.java:822)
 at sqlline.Commands.sql(Commands.java:732)
 at sqlline.SqlLine.dispatch(SqlLine.java:813)
 at sqlline.SqlLine.begin(SqlLine.java:686)
 at sqlline.SqlLine.start(SqlLine.java:398)
 at sqlline.SqlLine.main(SqlLine.java:291){noformat}
 


> ALTER TABLE ADD COLUMN for global index should not modify HBase metadata if 
> failed
> 

[jira] [Updated] (PHOENIX-4743) ALTER TABLE ADD COLUMN for global index should not modify HBase metadata if failed

2018-05-17 Thread Chinmay Kulkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinmay Kulkarni updated PHOENIX-4743:
--
Description: 
When you issue an "ALTER TABLE" for a global index to add a column, Phoenix 
throws a SQLException, but the HBase metadata for the global index table is 
still modified.

Steps to reproduce:
 * Create a global index on top of this table: 
{code:java}
create index global_z_index on z_base_table(HOST);{code}

 * Create the base data table: 
{code:java}
create table if not exists z_base_table (id INTEGER not null primary key, host 
VARCHAR(10), flag boolean);{code}

 * Add a column to the global index table:
{code:java}
alter table global_z_index add cf1.age INTEGER;{code}

This will throw an exception in Phoenix, but HBase metadata for the global 
index table is still modified. Stack trace:
{noformat}
Error: ERROR 1010 (42M01): Not allowed to mutate table. Cannot add/drop column 
referenced by VIEW columnName=GLOBAL_Z_INDEX (state=42M01,code=1010)
 java.sql.SQLException: ERROR 1010 (42M01): Not allowed to mutate table. Cannot 
add/drop column referenced by VIEW columnName=GLOBAL_Z_INDEX
 at 
org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:494)
 at 
org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:150)
 at 
org.apache.phoenix.schema.MetaDataClient.processMutationResult(MetaDataClient.java:3049)
 at org.apache.phoenix.schema.MetaDataClient.addColumn(MetaDataClient.java:3503)
 at org.apache.phoenix.schema.MetaDataClient.addColumn(MetaDataClient.java:3210)
 at 
org.apache.phoenix.jdbc.PhoenixStatement$ExecutableAddColumnStatement$1.execute(PhoenixStatement.java:1432)
 at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:408)
 at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:391)
 at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
 at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:390)
 at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:378)
 at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1825)
 at sqlline.Commands.execute(Commands.java:822)
 at sqlline.Commands.sql(Commands.java:732)
 at sqlline.SqlLine.dispatch(SqlLine.java:813)
 at sqlline.SqlLine.begin(SqlLine.java:686)
 at sqlline.SqlLine.start(SqlLine.java:398)
 at sqlline.SqlLine.main(SqlLine.java:291){noformat}
 

  was:
When you issue an "ALTER TABLE" for a global index to add a column, Phoenix 
throws a SQLException, but the HBase metadata for the global index table is 
still modified.

Steps to reproduce:
 # Create the base data table: 
{code:java}
create table if not exists z_base_table (id INTEGER not null primary key, host 
VARCHAR(10), flag boolean);{code}

 # Create a global index on top of this table: 
{code:java}
create index global_z_index on z_base_table(HOST);{code}

 # Add a column to the global index table:
{code:java}
alter table global_z_index add cf1.age INTEGER;{code}

This will throw an exception in Phoenix, but HBase metadata for the global 
index table is still modified. Stack trace:
{noformat}
Error: ERROR 1010 (42M01): Not allowed to mutate table. Cannot add/drop column 
referenced by VIEW columnName=GLOBAL_Z_INDEX (state=42M01,code=1010)
 java.sql.SQLException: ERROR 1010 (42M01): Not allowed to mutate table. Cannot 
add/drop column referenced by VIEW columnName=GLOBAL_Z_INDEX
 at 
org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:494)
 at 
org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:150)
 at 
org.apache.phoenix.schema.MetaDataClient.processMutationResult(MetaDataClient.java:3049)
 at org.apache.phoenix.schema.MetaDataClient.addColumn(MetaDataClient.java:3503)
 at org.apache.phoenix.schema.MetaDataClient.addColumn(MetaDataClient.java:3210)
 at 
org.apache.phoenix.jdbc.PhoenixStatement$ExecutableAddColumnStatement$1.execute(PhoenixStatement.java:1432)
 at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:408)
 at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:391)
 at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
 at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:390)
 at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:378)
 at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1825)
 at sqlline.Commands.execute(Commands.java:822)
 at sqlline.Commands.sql(Commands.java:732)
 at sqlline.SqlLine.dispatch(SqlLine.java:813)
 at sqlline.SqlLine.begin(SqlLine.java:686)
 at sqlline.SqlLine.start(SqlLine.java:398)
 at sqlline.SqlLine.main(SqlLine.java:291){noformat}
 


> ALTER TABLE ADD COLUMN for global index should not modify HBase metadata if 
> failed
> 

[jira] [Created] (PHOENIX-4743) ALTER TABLE ADD COLUMN for global index should not modify HBase metadata if failed

2018-05-17 Thread Chinmay Kulkarni (JIRA)
Chinmay Kulkarni created PHOENIX-4743:
-

 Summary: ALTER TABLE ADD COLUMN for global index should not modify 
HBase metadata if failed
 Key: PHOENIX-4743
 URL: https://issues.apache.org/jira/browse/PHOENIX-4743
 Project: Phoenix
  Issue Type: Bug
Reporter: Chinmay Kulkarni


When you issue an "ALTER TABLE" for a global index to add a column, Phoenix 
throws a SQLException, but the HBase metadata for the global index table is 
still modified.

Steps to reproduce:
 # Create the base data table: 

{code:java}
create table if not exists z_base_table (id INTEGER not null primary key, host 
VARCHAR(10), flag boolean);{code}

 # Create a global index on top of this table:
{code:java}
create index global_z_index on z_base_table(HOST);{code}

 # Alter the global index table to add a column:
{code:java}
alter table global_z_index add cf1.age INTEGER;{code}

This will throw an exception in Phoenix, but HBase metadata for the global 
index table is still modified. Stack trace:

 
{noformat}
Error: ERROR 1010 (42M01): Not allowed to mutate table. Cannot add/drop column 
referenced by VIEW columnName=GLOBAL_Z_INDEX (state=42M01,code=1010)
 java.sql.SQLException: ERROR 1010 (42M01): Not allowed to mutate table. Cannot 
add/drop column referenced by VIEW columnName=GLOBAL_Z_INDEX
 at 
org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:494)
 at 
org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:150)
 at 
org.apache.phoenix.schema.MetaDataClient.processMutationResult(MetaDataClient.java:3049)
 at org.apache.phoenix.schema.MetaDataClient.addColumn(MetaDataClient.java:3503)
 at org.apache.phoenix.schema.MetaDataClient.addColumn(MetaDataClient.java:3210)
 at 
org.apache.phoenix.jdbc.PhoenixStatement$ExecutableAddColumnStatement$1.execute(PhoenixStatement.java:1432)
 at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:408)
 at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:391)
 at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
 at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:390)
 at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:378)
 at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1825)
 at sqlline.Commands.execute(Commands.java:822)
 at sqlline.Commands.sql(Commands.java:732)
 at sqlline.SqlLine.dispatch(SqlLine.java:813)
 at sqlline.SqlLine.begin(SqlLine.java:686)
 at sqlline.SqlLine.start(SqlLine.java:398)
 at sqlline.SqlLine.main(SqlLine.java:291){noformat}
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-05-17 Thread Chinmay Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480007#comment-16480007
 ] 

Chinmay Kulkarni commented on PHOENIX-3955:
---

[~jamestaylor] [~tdsilva] [~gjacoby] In the upgrade path, I guess we would have 
to do 2 things then: For every table, make sure these properties are in sync 
amongst all column families; and ensure these properties are in sync for each 
index table. In the first case, I guess we can use the default CF as the source 
of truth.

What about the case where a table is created with an old Phoenix client, so 
these properties have different values amongst its own column families, and we 
then try to create an index on this table with a new Phoenix client? Since the 
base table's properties are out of sync amongst its own CFs, we won't know 
which properties to inherit during index creation. One solution is to force an 
entire upgrade/throw an UpgradeRequiredException, but "EXECUTE UPGRADE" does a 
lot of other stuff which we don't require at this point.

Is it worth the effort to introduce some new command like "SYNC TABLE <table 
name>" which syncs these properties amongst all its column families and also 
all the indexes of that table?

> Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync 
> between the physical data table and index tables
> --
>
> Key: PHOENIX-3955
> URL: https://issues.apache.org/jira/browse/PHOENIX-3955
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Chinmay Kulkarni
>Priority: Major
>
> We need to make sure that indexes inherit the REPLICATION_SCOPE, 
> KEEP_DELETED_CELLS and TTL properties from the base table. Otherwise we can 
> run into situations where the data was removed (or not removed) from the data 
> table but was removed (or not removed) from the index. Or vice-versa. We also 
> need to make sure that any ALTER TABLE SET TTL or ALTER TABLE SET 
> KEEP_DELETED_CELLS statements propagate the properties to the indexes too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4704) Presplit index tables when building asynchronously

2018-05-17 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479977#comment-16479977
 ] 

Vincent Poon commented on PHOENIX-4704:
---

and thanks [~aertoria] for the TABLESAMPLE feature that makes this possible!

> Presplit index tables when building asynchronously
> --
>
> Key: PHOENIX-4704
> URL: https://issues.apache.org/jira/browse/PHOENIX-4704
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Major
> Attachments: PHOENIX-4704.master.v1.patch
>
>
> For large data tables with many regions, if we build the index asynchronously 
> using the IndexTool, the index table will initially face a hotspot as all data 
> region mappers attempt to write to the sole new index region.  This can 
> potentially lead to the index getting disabled if writes to the index table 
> time out during this hotspotting.
> We can add an optional step (or perhaps activate it based on the count of 
> regions in the data table) to the IndexTool to first do a MR job to gather 
> stats on the indexed column values, and then attempt to presplit the index 
> table before we do the actual index build MR job.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4704) Presplit index tables when building asynchronously

2018-05-17 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479971#comment-16479971
 ] 

Vincent Poon commented on PHOENIX-4704:
---

[~jamestaylor] mind doing a review?  This enhances IndexTool with an option to 
use TABLESAMPLE to sample the data table and presplit the index table.  There's 
also an option to only split if the data table has > N regions.
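
For reference, the sampling side can be exercised from plain JDBC like this (a hypothetical snippet: the table, column, and 1% rate are made up for the example, not IndexTool's actual defaults):

{code:java}
// Illustrative JDBC snippet: sample the indexed column with TABLESAMPLE.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SampleIndexedColumn {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
             Statement stmt = conn.createStatement();
             // Sample roughly 1% of rows; HOST values would feed the histogram
             // that computes the index table's split points.
             ResultSet rs = stmt.executeQuery(
                 "SELECT HOST FROM Z_BASE_TABLE TABLESAMPLE(1)")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
{code}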

 

> Presplit index tables when building asynchronously
> --
>
> Key: PHOENIX-4704
> URL: https://issues.apache.org/jira/browse/PHOENIX-4704
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Major
> Attachments: PHOENIX-4704.master.v1.patch
>
>
> For large data tables with many regions, if we build the index asynchronously 
> using the IndexTool, the index table will initially face a hotspot as all data 
> region mappers attempt to write to the sole new index region.  This can 
> potentially lead to the index getting disabled if writes to the index table 
> time out during this hotspotting.
> We can add an optional step (or perhaps activate it based on the count of 
> regions in the data table) to the IndexTool to first do a MR job to gather 
> stats on the indexed column values, and then attempt to presplit the index 
> table before we do the actual index build MR job.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4704) Presplit index tables when building asynchronously

2018-05-17 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4704:
--
Attachment: PHOENIX-4704.master.v1.patch

> Presplit index tables when building asynchronously
> --
>
> Key: PHOENIX-4704
> URL: https://issues.apache.org/jira/browse/PHOENIX-4704
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Major
> Attachments: PHOENIX-4704.master.v1.patch
>
>
> For large data tables with many regions, if we build the index asynchronously 
> using the IndexTool, the index table will initially face a hotspot as all data 
> region mappers attempt to write to the sole new index region.  This can 
> potentially lead to the index getting disabled if writes to the index table 
> time out during this hotspotting.
> We can add an optional step (or perhaps activate it based on the count of 
> regions in the data table) to the IndexTool to first do a MR job to gather 
> stats on the indexed column values, and then attempt to presplit the index 
> table before we do the actual index build MR job.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PHOENIX-4704) Presplit index tables when building asynchronously

2018-05-17 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon reassigned PHOENIX-4704:
-

Assignee: Vincent Poon

> Presplit index tables when building asynchronously
> --
>
> Key: PHOENIX-4704
> URL: https://issues.apache.org/jira/browse/PHOENIX-4704
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Major
>
> For large data tables with many regions, if we build the index asynchronously 
> using the IndexTool, the index table will initially face a hotspot as all data 
> region mappers attempt to write to the sole new index region.  This can 
> potentially lead to the index getting disabled if writes to the index table 
> time out during this hotspotting.
> We can add an optional step (or perhaps activate it based on the count of 
> regions in the data table) to the IndexTool to first do a MR job to gather 
> stats on the indexed column values, and then attempt to presplit the index 
> table before we do the actual index build MR job.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4742) Prevent infinite loop with HBase 1.4 and DistinctPrefixFilter

2018-05-17 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479832#comment-16479832
 ] 

James Taylor commented on PHOENIX-4742:
---

FYI, I'm looking at this now, [~sergey.soldatov]. It's a bug in the way 
descending row keys are handled by DistinctPrefixFilter. We're generating a 
seek next hint that's smaller than the current key. In HBase 1.4 this causes an 
infinite loop. In other versions of HBase, this seems to just cause the filter 
to go to the next row. I'll try to tweak the logic to handle descending keys 
correctly.
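
For readers unfamiliar with the mechanism, a bare-bones illustration (not DistinctPrefixFilter itself) of how a filter drives seeking via hints, and why a hint that isn't strictly greater than the current key can loop:

{code:java}
// Bare-bones illustration of hint-driven seeking (not DistinctPrefixFilter).
// filterKeyValue() asks for a seek and getNextCellHint() supplies the target;
// if the target is not strictly greater than the current key, the scanner can
// keep re-seeking to the same place (the HBase 1.4 infinite loop above).
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.KeyValueUtil;
import org.apache.hadoop.hbase.filter.FilterBase;
import org.apache.hadoop.hbase.util.Bytes;

public class HintingFilter extends FilterBase {
    private byte[] hintRow;

    @Override
    public ReturnCode filterKeyValue(Cell cell) {
        // Skip the rest of this distinct prefix by seeking past it.
        hintRow = nextKeyAfter(CellUtil.cloneRow(cell));
        return ReturnCode.SEEK_NEXT_USING_HINT;
    }

    @Override
    public Cell getNextCellHint(Cell cell) {
        return KeyValueUtil.createFirstOnRow(hintRow);
    }

    private byte[] nextKeyAfter(byte[] row) {
        // Appending 0x01 works for ascending keys with a 0x00 separator; the
        // bug above is that this is NOT greater for DESC (0xFF separator) or
        // null last-column values.
        return Bytes.add(row, new byte[] { 0x01 });
    }
}
{code}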

> Prevent infinite loop with HBase 1.4 and DistinctPrefixFilter
> -
>
> Key: PHOENIX-4742
> URL: https://issues.apache.org/jira/browse/PHOENIX-4742
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Sergey Soldatov
>Priority: Major
>
> OrderByIT.testOrderByReverseOptimizationWithNUllsLastBug3491 is the only test 
> failing on master (i.e. HBase 1.4). It's getting into an infinite loop when a 
> reverse scan is done for the DistinctPrefixFilter. It'd be nice to fix this 
> so we can do a release for HBase 1.4. At a minimum, we could disable 
> DistinctPrefixFilter when a reverse scan is being done (for HBase 1.4 only).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4742) Prevent infinite loop with HBase 1.4 and DistinctPrefixFilter

2018-05-17 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479828#comment-16479828
 ] 

Sergey Soldatov commented on PHOENIX-4742:
--

That happens because of the changes in filters in HBase 1.4 and 2.0. Somehow 
the matcher keeps returning SEEK_NEXT_USING_HINT instead of SEEK_NEXT_ROW, so 
we are getting stuck at the last cell. Let me dig a bit.

> Prevent infinite loop with HBase 1.4 and DistinctPrefixFilter
> -
>
> Key: PHOENIX-4742
> URL: https://issues.apache.org/jira/browse/PHOENIX-4742
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Sergey Soldatov
>Priority: Major
>
> OrderByIT.testOrderByReverseOptimizationWithNUllsLastBug3491 is the only test 
> failing on master (i.e. HBase 1.4). It's getting into an infinite loop when a 
> reverse scan is done for the DistinctPrefixFilter. It'd be nice to fix this 
> so we can do a release for HBase 1.4. At a minimum, we could disable 
> DistinctPrefixFilter when a reverse scan is being done (for HBase 1.4 only).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PHOENIX-4742) Prevent infinite loop with HBase 1.4 and DistinctPrefixFilter

2018-05-17 Thread Sergey Soldatov (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Soldatov reassigned PHOENIX-4742:


Assignee: Sergey Soldatov

> Prevent infinite loop with HBase 1.4 and DistinctPrefixFilter
> -
>
> Key: PHOENIX-4742
> URL: https://issues.apache.org/jira/browse/PHOENIX-4742
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Sergey Soldatov
>Priority: Major
>
> OrderByIT.testOrderByReverseOptimizationWithNUllsLastBug3491 is the only test 
> failing on master (i.e. HBase 1.4). It's getting into an infinite loop when a 
> reverse scan is done for the DistinctPrefixFilter. It'd be nice to fix this 
> so we can do a release for HBase 1.4. At a minimum, we could disable 
> DistinctPrefixFilter when a reverse scan is being done (for HBase 1.4 only).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-3655) Metrics for PQS

2018-05-17 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479635#comment-16479635
 ] 

Josh Elser commented on PHOENIX-3655:
-

{quote}The stuff you are talking about is implemented as a part of 
hbase-hadoop2-compat library on which phoenix-core depends. I would prefer 
re-using the code that provides the {{HBaseMetrics2HadoopMetricsAdapter}} for 
converting the metrics. Also, {{GlobalMetricRegistriesAdapter}} class helps me 
directly dump the hbase registry to JMX.
{quote}
Maybe we're talking past each other. I am talking about the classes/interfaces 
defined in hbase-metrics-api. These classes/interfaces are what 
HBaseMetrics2HadoopMetricsAdapter uses and I think that is what would be good to 
use. BaseSourceImpl is not a part of this collection. I am suggesting that 
Phoenix depend on nothing metrics-related in hbase-hadoop2-compat.

> Metrics for PQS
> ---
>
> Key: PHOENIX-3655
> URL: https://issues.apache.org/jira/browse/PHOENIX-3655
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.8.0
> Environment: Linux 3.13.0-107-generic kernel, v4.9.0-HBase-0.98
>Reporter: Rahul Shrivastava
>Assignee: Karan Mehta
>Priority: Major
> Fix For: 4.15.0
>
> Attachments: MetricsforPhoenixQueryServerPQS.pdf
>
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> Phoenix Query Server runs as a separate process from its thin client. 
> Metrics collection is currently done by PhoenixRuntime.java, i.e. at the 
> Phoenix driver level. We need the following:
> 1. For every jdbc statement/prepared statement run by PQS, we need the 
> capability to collect metrics at the PQS level and push the data to an 
> external sink, e.g. a file, JMX, or other custom sources. 
> 2. Besides this, global metrics could be periodically collected and pushed 
> to the sink. 
> 3. PQS can be configured to turn on metrics collection and the type of 
> collection (runtime or global) via hbase-site.xml.
> 4. The sink could be configured via an interface in hbase-site.xml. 
> All metrics definitions: https://phoenix.apache.org/metrics.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-3655) Metrics for PQS

2018-05-17 Thread Thomas D'Silva (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479636#comment-16479636
 ] 

Thomas D'Silva commented on PHOENIX-3655:
-

I think the raw value is still useful. Let's say a client started writing a lot 
more data this month vs. last month, etc. 

> Metrics for PQS
> ---
>
> Key: PHOENIX-3655
> URL: https://issues.apache.org/jira/browse/PHOENIX-3655
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.8.0
> Environment: Linux 3.13.0-107-generic kernel, v4.9.0-HBase-0.98
>Reporter: Rahul Shrivastava
>Assignee: Karan Mehta
>Priority: Major
> Fix For: 4.15.0
>
> Attachments: MetricsforPhoenixQueryServerPQS.pdf
>
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> Phoenix Query Server runs as a separate process from its thin client. 
> Metrics collection is currently done by PhoenixRuntime.java, i.e. at the 
> Phoenix driver level. We need the following:
> 1. For every jdbc statement/prepared statement run by PQS, we need the 
> capability to collect metrics at the PQS level and push the data to an 
> external sink, e.g. a file, JMX, or other custom sources. 
> 2. Besides this, global metrics could be periodically collected and pushed 
> to the sink. 
> 3. PQS can be configured to turn on metrics collection and the type of 
> collection (runtime or global) via hbase-site.xml.
> 4. The sink could be configured via an interface in hbase-site.xml. 
> All metrics definitions: https://phoenix.apache.org/metrics.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-3655) Metrics for PQS

2018-05-17 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479630#comment-16479630
 ] 

Karan Mehta commented on PHOENIX-3655:
--

{quote}The rest of what you were saying is about on track: we use the 
hbase-metrics-api to construct the things we're measuring, sending them to the 
MetricsRegistry. Then, HBase (or the reporter we configure on the Registry) 
would take care of pushing those to Hadoop Metrics2 sink. There may be 
something in place already with the underlying dropwizard metrics 
implementation to push all of these to JMX (as an aside).
{quote}
[~elserj] The stuff you are talking about is implemented as a part of 
hbase-hadoop2-compat library on which phoenix-core depends. I would prefer 
re-using the code that provides the {{HBaseMetrics2HadoopMetricsAdapter}} for 
converting the metrics. Also, {{GlobalMetricRegistriesAdapter}} class helps me 
directly dump the hbase registry to JMX.
{quote}I fear you might be getting sucked into some old metrics cruft in HBase, 
[~karanmehta93]. BaseSourceImpl and other classes in hbase-hadoop2-compat are 
vestigial to prevent having to rewrite all of hbase-server to use the new 
hbase-metrics-api. I would think that if you're coming in here fresh, you could 
just use hbase-metrics-api only and ignore all of that other stuff.
{quote}
 Agreed to some extent, please refer to my old comment and advise how we can 
proceed.
{quote}GLOBAL_MUTATION_BYTES is useful for trending, to see how the amount of 
data written by a client changes over time
{quote}
This is *NOT* per client, be advised, [~tdsilva]. This number will keep 
growing; we would ideally need the rate at which it is growing. If it's not the 
peak hours of the day, the number of mutations will be lower, and vice versa. A 
Gauge to track the raw value might not be super useful here. The hbase-metrics 
Meter is a Dropwizard Meter and provides the rate of change over the last 1, 5, 
and 15 minutes, which might be useful here (though I'm not sure).
{quote}Are you planning on only converting GlobalClientMetrics to use hbase 
metrics? Or are you going to change MutationMetricQueue, OverAllQueryMetrics 
etc as well?
{quote}
Only the GlobalClientMetrics for this Jira. The current metrics implementations 
are simple counters internally. If we plan to expose them as hbase counters, 
the work should be trivial. However, if we decide to model them as other types, 
more work might be involved. I don't want to change how the metrics are 
exported from various queues.

Also, these metrics keep track of the number of samples of the metric. How can 
that be used in our case? I could not find any Metric that uses it.

[~apurtell] thoughts?

> Metrics for PQS
> ---
>
> Key: PHOENIX-3655
> URL: https://issues.apache.org/jira/browse/PHOENIX-3655
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.8.0
> Environment: Linux 3.13.0-107-generic kernel, v4.9.0-HBase-0.98
>Reporter: Rahul Shrivastava
>Assignee: Karan Mehta
>Priority: Major
> Fix For: 4.15.0
>
> Attachments: MetricsforPhoenixQueryServerPQS.pdf
>
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> Phoenix Query Server runs as a separate process from its thin client. 
> Metrics collection is currently done by PhoenixRuntime.java, i.e. at the 
> Phoenix driver level. We need the following:
> 1. For every jdbc statement/prepared statement run by PQS, we need the 
> capability to collect metrics at the PQS level and push the data to an 
> external sink, e.g. a file, JMX, or other custom sources. 
> 2. Besides this, global metrics could be periodically collected and pushed 
> to the sink. 
> 3. PQS can be configured to turn on metrics collection and the type of 
> collection (runtime or global) via hbase-site.xml.
> 4. The sink could be configured via an interface in hbase-site.xml. 
> All metrics definitions: https://phoenix.apache.org/metrics.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4692) ArrayIndexOutOfBoundsException in ScanRanges.intersectScan

2018-05-17 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479617#comment-16479617
 ] 

Sergey Soldatov commented on PHOENIX-4692:
--

[~jamestaylor] +1 if there is no other way to avoid adding SkipScanFilter more 
than once. 

> ArrayIndexOutOfBoundsException in ScanRanges.intersectScan
> --
>
> Key: PHOENIX-4692
> URL: https://issues.apache.org/jira/browse/PHOENIX-4692
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.0
>Reporter: Sergey Soldatov
>Assignee: James Taylor
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4692-IT.patch, PHOENIX-4692_v1.patch
>
>
> ScanRanges.intersectScan may fail with AIOOBE if a salted table is used.
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 1
>   at org.apache.phoenix.util.ScanUtil.getKey(ScanUtil.java:333)
>   at org.apache.phoenix.util.ScanUtil.getMinKey(ScanUtil.java:317)
>   at 
> org.apache.phoenix.compile.ScanRanges.intersectScan(ScanRanges.java:371)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:1074)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:631)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.(BaseResultIterators.java:501)
>   at 
> org.apache.phoenix.iterate.ParallelIterators.(ParallelIterators.java:62)
>   at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:274)
>   at 
> org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:364)
>   at 
> org.apache.phoenix.execute.HashJoinPlan.iterator(HashJoinPlan.java:234)
>   at 
> org.apache.phoenix.execute.DelegateQueryPlan.iterator(DelegateQueryPlan.java:144)
>   at 
> org.apache.phoenix.execute.DelegateQueryPlan.iterator(DelegateQueryPlan.java:139)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:314)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:293)
>   at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:292)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:285)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:1798)
> {noformat}
> Script to reproduce:
> {noformat}
> CREATE TABLE TEST (PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL,  ID1 INTEGER, 
> ID2 INTEGER CONSTRAINT PK PRIMARY KEY(PK1 , PK2))SALT_BUCKETS = 4;
> upsert into test values (1,1,1,1);
> upsert into test values (2,2,2,2);
> upsert into test values (2,3,1,2);
> create view TEST_VIEW as select * from TEST where PK1 in (1,2);
> CREATE INDEX IDX_VIEW ON TEST_VIEW (ID1);
>   select /*+ INDEX(TEST_VIEW IDX_VIEW) */ * from TEST_VIEW where ID1 = 1  
> ORDER BY ID2 LIMIT 500 OFFSET 0;
> {noformat}
> That happens because we have a point lookup optimization which reduces 
> RowKeySchema to a single field, while we have more than one slot due to salting. 
> [~jamestaylor] can you please take a look? I'm not sure whether it should be 
> fixed on the ScanUtil level or we just should not use point lookup in such 
> cases.
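
As background on the extra slot (an illustrative sketch, not Phoenix's actual SaltingUtil; names are made up): a salted table prepends a salt byte computed from the row key hash, so the row key schema has one more leading field than the declared primary key.

{code:java}
// Illustrative only: how a salt byte prefixes the row key in a salted table.
import java.util.Arrays;

public class SaltSketch {
    static byte[] salt(byte[] rowKey, int saltBuckets) {
        // Deterministic bucket from the key's hash (mask keeps it non-negative).
        byte saltByte = (byte) ((Arrays.hashCode(rowKey) & 0x7fffffff) % saltBuckets);
        byte[] salted = new byte[rowKey.length + 1];
        salted[0] = saltByte;                  // the extra leading key slot
        System.arraycopy(rowKey, 0, salted, 1, rowKey.length);
        return salted;
    }
}
{code}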



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-3655) Metrics for PQS

2018-05-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479556#comment-16479556
 ] 

Andrew Purtell commented on PHOENIX-3655:
-

I would argue for converting the metrics to hbase metrics. Let the hbase 
metrics impl classes deal with legacy.

If the new API still has rough edges that cause legacy to leak in, we should 
fix that. 

bq. I am not sure how metrics such as GLOBAL_MUTATION_BYTES will help us track 
the operations. Is it fine to just publish the raw value? 

Yeah, as a Gauge, for simple export of a raw value.

Where we care about the distribution of values, use a Histogram. 

Use a Meter for something we are tracking as a rate (like ops/sec).

Use a Timer if we are tracking the time it takes to do something (like op latency).
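
Concretely, that guidance maps onto the Dropwizard API (which, per the discussion above, backs the hbase-metrics implementation) roughly like this; the metric names and the wiring are illustrative, not Phoenix code:

{code:java}
// Illustrative mapping of the metric-type guidance above onto Dropwizard.
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

import com.codahale.metrics.Gauge;
import com.codahale.metrics.Histogram;
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

public class PqsMetricsSketch {
    private final MetricRegistry registry = new MetricRegistry();
    private final AtomicLong mutationBytes = new AtomicLong();

    // Gauge: simple export of a raw value, e.g. GLOBAL_MUTATION_BYTES.
    private final Gauge<Long> mutationBytesGauge =
        registry.register("mutationBytes", (Gauge<Long>) mutationBytes::get);

    // Histogram: distribution of values, e.g. batch sizes.
    private final Histogram batchSize = registry.histogram("batchSize");

    // Meter: rate of events, e.g. queries/sec (1/5/15-minute rates come free).
    private final Meter queries = registry.meter("queries");

    // Timer: latency of an operation, e.g. statement execution.
    private final Timer execTime = registry.timer("executeLatency");

    void onExecute(long bytes, int batch, long nanos) {
        mutationBytes.addAndGet(bytes);
        batchSize.update(batch);
        queries.mark();
        execTime.update(nanos, TimeUnit.NANOSECONDS);
    }
}
{code}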

> Metrics for PQS
> ---
>
> Key: PHOENIX-3655
> URL: https://issues.apache.org/jira/browse/PHOENIX-3655
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.8.0
> Environment: Linux 3.13.0-107-generic kernel, v4.9.0-HBase-0.98
>Reporter: Rahul Shrivastava
>Assignee: Karan Mehta
>Priority: Major
> Fix For: 4.15.0
>
> Attachments: MetricsforPhoenixQueryServerPQS.pdf
>
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> Phoenix Query Server runs as a separate process from its thin client. 
> Metrics collection is currently done by PhoenixRuntime.java, i.e. at the 
> Phoenix driver level. We need the following:
> 1. For every jdbc statement/prepared statement run by PQS, we need the 
> capability to collect metrics at the PQS level and push the data to an 
> external sink, e.g. a file, JMX, or other custom sources. 
> 2. Besides this, global metrics could be periodically collected and pushed 
> to the sink. 
> 3. PQS can be configured to turn on metrics collection and the type of 
> collection (runtime or global) via hbase-site.xml.
> 4. The sink could be configured via an interface in hbase-site.xml. 
> All metrics definitions: https://phoenix.apache.org/metrics.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-3655) Metrics for PQS

2018-05-17 Thread Thomas D'Silva (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479534#comment-16479534
 ] 

Thomas D'Silva commented on PHOENIX-3655:
-

GLOBAL_MUTATION_BYTES is useful for trending, to see how the amount of data 
written by a client changes over time. 

Are you planning on only converting GlobalClientMetrics to use hbase metrics? 
Or are you going to change MutationMetricQueue, OverAllQueryMetrics etc as well?

> Metrics for PQS
> ---
>
> Key: PHOENIX-3655
> URL: https://issues.apache.org/jira/browse/PHOENIX-3655
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.8.0
> Environment: Linux 3.13.0-107-generic kernel, v4.9.0-HBase-0.98
>Reporter: Rahul Shrivastava
>Assignee: Karan Mehta
>Priority: Major
> Fix For: 4.15.0
>
> Attachments: MetricsforPhoenixQueryServerPQS.pdf
>
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> Phoenix Query Server runs as a separate process from its thin client. 
> Metrics collection is currently done by PhoenixRuntime.java, i.e. at the 
> Phoenix driver level. We need the following:
> 1. For every JDBC statement/prepared statement run by PQS, the capability 
> to collect metrics at the PQS level and push the data to an external sink, 
> i.e. a file, JMX, or other custom sources.
> 2. Besides this, global metrics could be periodically collected and pushed 
> to the sink.
> 3. PQS can be configured to turn on metrics collection and the type of 
> collection (runtime or global) via hbase-site.xml.
> 4. The sink could be configured via an interface in hbase-site.xml.
> All metrics are defined at https://phoenix.apache.org/metrics.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-3655) Metrics for PQS

2018-05-17 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479510#comment-16479510
 ] 

Josh Elser commented on PHOENIX-3655:
-

{quote}Do you want to move towards {{MetricRegistry}} provided by HBase? 
{quote}
Yeah, definitely. Getting us on hbase-metrics _should_ be a gain for us (no 
more janky Hadoop metrics2).

The rest of what you were saying is about on track: we use the 
hbase-metrics-api to construct the things we're measuring and send them to 
the MetricRegistry. Then HBase (or the reporter we configure on the registry) 
would take care of pushing those to a Hadoop Metrics2 sink. As an aside, 
there may be something in place already in the underlying dropwizard metrics 
implementation to push all of these to JMX.

I fear you might be getting sucked into some old metrics cruft in HBase, 
[~karanmehta93]. BaseSourceImpl and other classes in hbase-hadoop2-compat are 
vestigial to prevent having to rewrite all of hbase-server to use the new 
hbase-metrics-api. I would think that if you're coming in here fresh, you could 
just use hbase-metrics-api only and ignore all of that other stuff.

> Metrics for PQS
> ---
>
> Key: PHOENIX-3655
> URL: https://issues.apache.org/jira/browse/PHOENIX-3655
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.8.0
> Environment: Linux 3.13.0-107-generic kernel, v4.9.0-HBase-0.98
>Reporter: Rahul Shrivastava
>Assignee: Karan Mehta
>Priority: Major
> Fix For: 4.15.0
>
> Attachments: MetricsforPhoenixQueryServerPQS.pdf
>
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> Phoenix Query Server runs as a separate process from its thin client. 
> Metrics collection is currently done by PhoenixRuntime.java, i.e. at the 
> Phoenix driver level. We need the following:
> 1. For every JDBC statement/prepared statement run by PQS, the capability 
> to collect metrics at the PQS level and push the data to an external sink, 
> i.e. a file, JMX, or other custom sources.
> 2. Besides this, global metrics could be periodically collected and pushed 
> to the sink.
> 3. PQS can be configured to turn on metrics collection and the type of 
> collection (runtime or global) via hbase-site.xml.
> 4. The sink could be configured via an interface in hbase-site.xml.
> All metrics are defined at https://phoenix.apache.org/metrics.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-3655) Metrics for PQS

2018-05-17 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479490#comment-16479490
 ] 

Karan Mehta commented on PHOENIX-3655:
--

I dug further into the code. HBase has a {{BaseSourceImpl}} class which 
contains the HBase metric registry as well as the Hadoop metric registry, 
both of which eventually get registered with JMX through different code 
paths.

If we extend this class (as all the other metrics do as of now), the default 
call to the constructor will internally create a Hadoop registry as well as 
an HBase registry; however, it will mark the HBase registry as an existing 
source. This is a snippet of code from the constructor:
{code:java}
1. this.metricsRegistry = (new DynamicMetricsRegistry(metricsName)).setContext(metricsContext);
2. BaseSourceImpl.DefaultMetricsSystemInitializer.INSTANCE.init(metricsName);
3. DefaultMetricsSystem.instance().register(metricsJmxContext, metricsDescription, this);
4. this.registry = MetricRegistries.global().create(this.getMetricRegistryInfo());
5. this.metricsAdapter = new HBaseMetrics2HadoopMetricsAdapter();
6. this.init();{code}

Line 3 registers all metrics with Hadoop metrics by default. Line 4 creates 
the corresponding HBase registry, but the call to 
{{this.getMetricRegistryInfo()}} specifies existingSource as true, which is 
why {{GlobalMetricRegistriesAdapter.doRun()}} would skip it. That method does 
multiple things: first it registers the registry with JMX (not required here, 
since that has already been done), and it also creates an 
{{HBaseMetrics2HadoopMetricsAdapter}} for it. We are probably missing 
something here. *We cannot have metrics from both types of registries, since 
the backend won't do anything even if both were registered. We should 
probably put a check before registering any metric there.*
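
Something like this is what I mean by a check (an illustrative sketch only, 
assuming {{MetricRegistryInfo#isExistingSource}} is accessible from the 
registry):

{code:java}
import org.apache.hadoop.hbase.metrics.MetricRegistry;

public class RegistrationGuardSketch {
    // Skip JMX re-registration for registries flagged as existing sources,
    // since their registration already happened through the Hadoop path.
    static boolean shouldRegisterWithJmx(MetricRegistry registry) {
        return !registry.getMetricRegistryInfo().isExistingSource();
    }
}
{code}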

*Another potential issue at the HBase layer is explained below.*
{{GlobalMetricRegistriesAdapter}} is a thread that is scheduled to run every 
10 seconds and collect/register HBase registries to JMX. The executor service 
for executing this task is initialized as part of 
{{BaseSourceImpl.DefaultMetricsSystemInitializer#init()}}, which is called at 
line 2 in the code shown above. This forces us to use at least one 
Hadoop-based metrics registry even though we don't want to. I came across 
this issue because PQS doesn't have any metrics yet: when I did a POC for 
publishing metrics solely through an HBase registry, they were never 
published.

I believe that rather than writing a shim layer to convert the metrics, it's 
better and probably easier to rewrite the metrics themselves 
(GlobalClientMetrics). I am investigating which type of Metric corresponds 
best to the metrics exposed by Phoenix; Phoenix metrics keep track of the 
number of samples as well.
A good example here is {{GLOBAL_OPEN_PHOENIX_CONNECTIONS}}, which can be 
mapped directly to a Counter; however, I am not sure how metrics such as 
{{GLOBAL_MUTATION_BYTES}} will help us track the operations. Is it fine to 
just publish the raw value? It only ever grows over time and doesn't give 
any insight on its own, if I understand correctly. 
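
To make the question concrete, the mapping I have in mind looks roughly like 
this (the {{GlobalClientMetrics}} names are real Phoenix enums; the registry 
wiring is an illustrative sketch, not a patch):

{code:java}
import org.apache.hadoop.hbase.metrics.MetricRegistry;
import org.apache.phoenix.monitoring.GlobalClientMetrics;

public class GlobalMetricsMappingSketch {
    static void map(MetricRegistry registry) {
        // A current-value metric exports cleanly as a Gauge
        registry.register("openPhoenixConnections", () ->
                GlobalClientMetrics.GLOBAL_OPEN_PHOENIX_CONNECTIONS.getMetric().getValue());
        // An ever-growing total can only be published raw; any trending or
        // rate computation would have to happen downstream of the sink
        registry.register("mutationBytes", () ->
                GlobalClientMetrics.GLOBAL_MUTATION_BYTES.getMetric().getValue());
    }
}
{code}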

[~apurtell] [~tdsilva] [~elserj] Any thoughts?
 

> Metrics for PQS
> ---
>
> Key: PHOENIX-3655
> URL: https://issues.apache.org/jira/browse/PHOENIX-3655
> Project: Phoenix
>  Issue Type: New Feature
>Affects Versions: 4.8.0
> Environment: Linux 3.13.0-107-generic kernel, v4.9.0-HBase-0.98
>Reporter: Rahul Shrivastava
>Assignee: Karan Mehta
>Priority: Major
> Fix For: 4.15.0
>
> Attachments: MetricsforPhoenixQueryServerPQS.pdf
>
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> Phoenix Query Server runs as a separate process from its thin client. 
> Metrics collection is currently done by PhoenixRuntime.java, i.e. at the 
> Phoenix driver level. We need the following:
> 1. For every JDBC statement/prepared statement run by PQS, the capability 
> to collect metrics at the PQS level and push the data to an external sink, 
> i.e. a file, JMX, or other custom sources.
> 2. Besides this, global metrics could be periodically collected and pushed 
> to the sink.
> 3. PQS can be configured to turn on metrics collection and the type of 
> collection (runtime or global) via hbase-site.xml.
> 4. The sink could be configured via an interface in hbase-site.xml.
> All metrics are defined at https://phoenix.apache.org/metrics.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4742) Prevent infinite loop with HBase 1.4 and DistinctPrefixFilter

2018-05-17 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479474#comment-16479474
 ] 

James Taylor commented on PHOENIX-4742:
---

[~sergey.soldatov] - any insight into this? I seem to remember you made a 
change to our filters to accommodate HBase 1.4. Was this change perhaps not 
done for DistinctPrefixFilter?
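
For reference, the minimal mitigation suggested in the description might look 
roughly like this ({{Scan#isReversed}} is real HBase API; the version probe 
and the filter's constructor arguments are illustrative assumptions):

{code:java}
import org.apache.hadoop.hbase.client.Scan;
import org.apache.phoenix.filter.DistinctPrefixFilter;
import org.apache.phoenix.schema.RowKeySchema;

public class DistinctPrefixGuardSketch {
    // Skip the distinct-prefix optimization for reverse scans on HBase 1.4,
    // falling back to a plain scan instead of looping forever.
    static void maybeAddFilter(Scan scan, RowKeySchema schema, int prefixColumns,
            String hbaseVersion) {
        if (scan.isReversed() && hbaseVersion.startsWith("1.4")) {
            return;
        }
        scan.setFilter(new DistinctPrefixFilter(schema, prefixColumns));
    }
}
{code}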

> Prevent infinite loop with HBase 1.4 and DistinctPrefixFilter
> -
>
> Key: PHOENIX-4742
> URL: https://issues.apache.org/jira/browse/PHOENIX-4742
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Priority: Major
>
> OrderByIT.testOrderByReverseOptimizationWithNUllsLastBug3491 is the only test 
> failing on master (i.e. HBase 1.4). It's getting into an infinite loop when a 
> reverse scan is done for the DistinctPrefixFilter. It'd be nice to fix this 
> so we can do a release for HBase 1.4. At a minimum, we could disable 
> DistinctPrefixFilter when a reverse scan is being done (for HBase 1.4 only).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PHOENIX-4742) Prevent infinite loop with HBase 1.4 and DistinctPrefixFilter

2018-05-17 Thread James Taylor (JIRA)
James Taylor created PHOENIX-4742:
-

 Summary: Prevent infinite loop with HBase 1.4 and 
DistinctPrefixFilter
 Key: PHOENIX-4742
 URL: https://issues.apache.org/jira/browse/PHOENIX-4742
 Project: Phoenix
  Issue Type: Bug
Reporter: James Taylor


OrderByIT.testOrderByReverseOptimizationWithNUllsLastBug3491 is the only test 
failing on master (i.e. HBase 1.4). It's getting into an infinite loop when a 
reverse scan is done for the DistinctPrefixFilter. It'd be nice to fix this so 
we can do a release for HBase 1.4. At a minimum, we could disable 
DistinctPrefixFilter when a reverse scan is being done (for HBase 1.4 only).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4692) ArrayIndexOutOfBoundsException in ScanRanges.intersectScan

2018-05-17 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479464#comment-16479464
 ] 

James Taylor commented on PHOENIX-4692:
---

Please review, [~sergey.soldatov]. The problem was that the hash join was 
compiling with the same Scan more than once, so the SkipScanFilter was being 
added to it multiple times (which is a no-no). This patch prevents that.
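
Roughly, the shape of the idea (a sketch only, not the actual patch; the 
helper and its signature are illustrative):

{code:java}
import java.io.IOException;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.phoenix.filter.SkipScanFilter;

public class ScanPerPlanSketch {
    // Give each compilation its own copy of the Scan so a SkipScanFilter is
    // never layered onto the same Scan object twice.
    static Scan scanForPlan(Scan shared, SkipScanFilter filter) throws IOException {
        Scan copy = new Scan(shared); // HBase Scan copy constructor
        copy.setFilter(filter);       // applied to the copy, never the shared Scan
        return copy;
    }
}
{code}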

[~maryannxue] - any insights into this? Is this the correct fix?

> ArrayIndexOutOfBoundsException in ScanRanges.intersectScan
> --
>
> Key: PHOENIX-4692
> URL: https://issues.apache.org/jira/browse/PHOENIX-4692
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.0
>Reporter: Sergey Soldatov
>Assignee: James Taylor
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4692-IT.patch, PHOENIX-4692_v1.patch
>
>
> ScanRanges.intersectScan may fail with AIOOBE if a salted table is used.
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 1
>   at org.apache.phoenix.util.ScanUtil.getKey(ScanUtil.java:333)
>   at org.apache.phoenix.util.ScanUtil.getMinKey(ScanUtil.java:317)
>   at 
> org.apache.phoenix.compile.ScanRanges.intersectScan(ScanRanges.java:371)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:1074)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:631)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:501)
>   at 
> org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62)
>   at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:274)
>   at 
> org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:364)
>   at 
> org.apache.phoenix.execute.HashJoinPlan.iterator(HashJoinPlan.java:234)
>   at 
> org.apache.phoenix.execute.DelegateQueryPlan.iterator(DelegateQueryPlan.java:144)
>   at 
> org.apache.phoenix.execute.DelegateQueryPlan.iterator(DelegateQueryPlan.java:139)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:314)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:293)
>   at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:292)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:285)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:1798)
> {noformat}
> Script to reproduce:
> {noformat}
> CREATE TABLE TEST (PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL,  ID1 INTEGER, 
> ID2 INTEGER CONSTRAINT PK PRIMARY KEY(PK1 , PK2))SALT_BUCKETS = 4;
> upsert into test values (1,1,1,1);
> upsert into test values (2,2,2,2);
> upsert into test values (2,3,1,2);
> create view TEST_VIEW as select * from TEST where PK1 in (1,2);
> CREATE INDEX IDX_VIEW ON TEST_VIEW (ID1);
>   select /*+ INDEX(TEST_VIEW IDX_VIEW) */ * from TEST_VIEW where ID1 = 1  
> ORDER BY ID2 LIMIT 500 OFFSET 0;
> {noformat}
> That happens because we have a point lookup optimization which reduces the 
> RowKeySchema to a single field, while we have more than one slot due to 
> salting. [~jamestaylor] can you please take a look? I'm not sure whether it 
> should be fixed at the ScanUtil level or whether we just should not use 
> point lookup in such cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4692) ArrayIndexOutOfBoundsException in ScanRanges.intersectScan

2018-05-17 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4692:
--
Attachment: PHOENIX-4692_v1.patch

> ArrayIndexOutOfBoundsException in ScanRanges.intersectScan
> --
>
> Key: PHOENIX-4692
> URL: https://issues.apache.org/jira/browse/PHOENIX-4692
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.0
>Reporter: Sergey Soldatov
>Assignee: James Taylor
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4692-IT.patch, PHOENIX-4692_v1.patch
>
>
> ScanRanges.intersectScan may fail with AIOOBE if a salted table is used.
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 1
>   at org.apache.phoenix.util.ScanUtil.getKey(ScanUtil.java:333)
>   at org.apache.phoenix.util.ScanUtil.getMinKey(ScanUtil.java:317)
>   at 
> org.apache.phoenix.compile.ScanRanges.intersectScan(ScanRanges.java:371)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:1074)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:631)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:501)
>   at 
> org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62)
>   at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:274)
>   at 
> org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:364)
>   at 
> org.apache.phoenix.execute.HashJoinPlan.iterator(HashJoinPlan.java:234)
>   at 
> org.apache.phoenix.execute.DelegateQueryPlan.iterator(DelegateQueryPlan.java:144)
>   at 
> org.apache.phoenix.execute.DelegateQueryPlan.iterator(DelegateQueryPlan.java:139)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:314)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:293)
>   at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:292)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:285)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:1798)
> {noformat}
> Script to reproduce:
> {noformat}
> CREATE TABLE TEST (PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL,  ID1 INTEGER, 
> ID2 INTEGER CONSTRAINT PK PRIMARY KEY(PK1 , PK2))SALT_BUCKETS = 4;
> upsert into test values (1,1,1,1);
> upsert into test values (2,2,2,2);
> upsert into test values (2,3,1,2);
> create view TEST_VIEW as select * from TEST where PK1 in (1,2);
> CREATE INDEX IDX_VIEW ON TEST_VIEW (ID1);
>   select /*+ INDEX(TEST_VIEW IDX_VIEW) */ * from TEST_VIEW where ID1 = 1  
> ORDER BY ID2 LIMIT 500 OFFSET 0;
> {noformat}
> That happens because we have a point lookup optimization which reduces the 
> RowKeySchema to a single field, while we have more than one slot due to 
> salting. [~jamestaylor] can you please take a look? I'm not sure whether it 
> should be fixed at the ScanUtil level or whether we just should not use 
> point lookup in such cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4692) ArrayIndexOutOfBoundsException in ScanRanges.intersectScan

2018-05-17 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-4692:
--
Fix Version/s: (was: 4.15.0)
   5.0.0
   4.14.0

> ArrayIndexOutOfBoundsException in ScanRanges.intersectScan
> --
>
> Key: PHOENIX-4692
> URL: https://issues.apache.org/jira/browse/PHOENIX-4692
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.0
>Reporter: Sergey Soldatov
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4692-IT.patch
>
>
> ScanRanges.intersectScan may fail with AIOOBE if a salted table is used.
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 1
>   at org.apache.phoenix.util.ScanUtil.getKey(ScanUtil.java:333)
>   at org.apache.phoenix.util.ScanUtil.getMinKey(ScanUtil.java:317)
>   at 
> org.apache.phoenix.compile.ScanRanges.intersectScan(ScanRanges.java:371)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:1074)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:631)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:501)
>   at 
> org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62)
>   at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:274)
>   at 
> org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:364)
>   at 
> org.apache.phoenix.execute.HashJoinPlan.iterator(HashJoinPlan.java:234)
>   at 
> org.apache.phoenix.execute.DelegateQueryPlan.iterator(DelegateQueryPlan.java:144)
>   at 
> org.apache.phoenix.execute.DelegateQueryPlan.iterator(DelegateQueryPlan.java:139)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:314)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:293)
>   at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:292)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:285)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:1798)
> {noformat}
> Script to reproduce:
> {noformat}
> CREATE TABLE TEST (PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL,  ID1 INTEGER, 
> ID2 INTEGER CONSTRAINT PK PRIMARY KEY(PK1 , PK2))SALT_BUCKETS = 4;
> upsert into test values (1,1,1,1);
> upsert into test values (2,2,2,2);
> upsert into test values (2,3,1,2);
> create view TEST_VIEW as select * from TEST where PK1 in (1,2);
> CREATE INDEX IDX_VIEW ON TEST_VIEW (ID1);
>   select /*+ INDEX(TEST_VIEW IDX_VIEW) */ * from TEST_VIEW where ID1 = 1  
> ORDER BY ID2 LIMIT 500 OFFSET 0;
> {noformat}
> That happens because we have a point lookup optimization which reduces the 
> RowKeySchema to a single field, while we have more than one slot due to 
> salting. [~jamestaylor] can you please take a look? I'm not sure whether it 
> should be fixed at the ScanUtil level or whether we just should not use 
> point lookup in such cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (PHOENIX-4692) ArrayIndexOutOfBoundsException in ScanRanges.intersectScan

2018-05-17 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor reassigned PHOENIX-4692:
-

Assignee: James Taylor

> ArrayIndexOutOfBoundsException in ScanRanges.intersectScan
> --
>
> Key: PHOENIX-4692
> URL: https://issues.apache.org/jira/browse/PHOENIX-4692
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.0
>Reporter: Sergey Soldatov
>Assignee: James Taylor
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4692-IT.patch
>
>
> ScanRanges.intersectScan may fail with AIOOBE if a salted table is used.
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 1
>   at org.apache.phoenix.util.ScanUtil.getKey(ScanUtil.java:333)
>   at org.apache.phoenix.util.ScanUtil.getMinKey(ScanUtil.java:317)
>   at 
> org.apache.phoenix.compile.ScanRanges.intersectScan(ScanRanges.java:371)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:1074)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:631)
>   at 
> org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:501)
>   at 
> org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62)
>   at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:274)
>   at 
> org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:364)
>   at 
> org.apache.phoenix.execute.HashJoinPlan.iterator(HashJoinPlan.java:234)
>   at 
> org.apache.phoenix.execute.DelegateQueryPlan.iterator(DelegateQueryPlan.java:144)
>   at 
> org.apache.phoenix.execute.DelegateQueryPlan.iterator(DelegateQueryPlan.java:139)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:314)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:293)
>   at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:292)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:285)
>   at 
> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:1798)
> {noformat}
> Script to reproduce:
> {noformat}
> CREATE TABLE TEST (PK1 INTEGER NOT NULL, PK2 INTEGER NOT NULL,  ID1 INTEGER, 
> ID2 INTEGER CONSTRAINT PK PRIMARY KEY(PK1 , PK2))SALT_BUCKETS = 4;
> upsert into test values (1,1,1,1);
> upsert into test values (2,2,2,2);
> upsert into test values (2,3,1,2);
> create view TEST_VIEW as select * from TEST where PK1 in (1,2);
> CREATE INDEX IDX_VIEW ON TEST_VIEW (ID1);
>   select /*+ INDEX(TEST_VIEW IDX_VIEW) */ * from TEST_VIEW where ID1 = 1  
> ORDER BY ID2 LIMIT 500 OFFSET 0;
> {noformat}
> That happens because we have a point lookup optimization which reduces the 
> RowKeySchema to a single field, while we have more than one slot due to 
> salting. [~jamestaylor] can you please take a look? I'm not sure whether it 
> should be fixed at the ScanUtil level or whether we just should not use 
> point lookup in such cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-3270) Remove @Ignore tag for TransactionIT.testNonTxToTxTableFailure()

2018-05-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479346#comment-16479346
 ] 

Hudson commented on PHOENIX-3270:
-

FAILURE: Integrated in Jenkins build Phoenix-4.x-HBase-0.98 #1895 (See 
[https://builds.apache.org/job/Phoenix-4.x-HBase-0.98/1895/])
PHOENIX-3270 Remove @Ignore tag for (jtaylor: rev 
0c107b773bbcbdd13446ac32710216b268140c01)
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/tx/ParameterizedTransactionIT.java


> Remove @Ignore tag for TransactionIT.testNonTxToTxTableFailure()
> 
>
> Key: PHOENIX-3270
> URL: https://issues.apache.org/jira/browse/PHOENIX-3270
> Project: Phoenix
>  Issue Type: Task
>Reporter: James Taylor
>Assignee: James Taylor
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
>
> We should remove the @Ignore tag for 
> TransactionIT.testNonTxToTxTableFailure(). The test passes and it's not 
> clear why the tag was added in the first place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (PHOENIX-4741) Shade disruptor dependency

2018-05-17 Thread Ankit Singhal (JIRA)
Ankit Singhal created PHOENIX-4741:
--

 Summary: Shade disruptor dependency 
 Key: PHOENIX-4741
 URL: https://issues.apache.org/jira/browse/PHOENIX-4741
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.14.0, 5.0.0
Reporter: Jungtaek Lim
Assignee: Ankit Singhal
 Fix For: 4.14.0, 5.0.0


We should shade the disruptor dependency to avoid conflicts with the versions 
used by other frameworks like Storm, Hive, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-4740) FIRST_VALUES fails when using salt_buckets and order by

2018-05-17 Thread Valliet (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valliet updated PHOENIX-4740:
-
Description: 
Hi,

so I'm running Phoenix on HBase 1.2, and found this gem:

First, without salt_buckets, everything works fine:

 
{code:java}
create table emp (
 emp_code VARCHAR not null,
 bu_code VARCHAR not null,
 territory_codes VARCHAR,
 salary DOUBLE,
 CONSTRAINT pk PRIMARY KEY (emp_code, bu_code));
upsert into emp values('emp1', 'bu1', 'FR', 1000);
upsert into emp values('emp1', 'bu2', 'EN', 1000);
upsert into emp values('emp2', 'bu1', 'US', 1000);
upsert into emp values('emp2', 'bu2', 'DE', 1000);
upsert into emp values('emp2', 'bu3', 'AF', 1000);
SELECT emp_code, first_values(territory_codes, 10) within group (order by 
territory_codes asc), sum(salary) as total from emp group by emp_code order by 
total desc limit 100;
 
+----------+----------------------------------------------------------+--------+
| EMP_CODE | FIRST_VALUES(TERRITORY_CODES, true, TERRITORY_CODES, 10) | TOTAL  |
+----------+----------------------------------------------------------+--------+
| emp2     | [AF, DE, US]                                             | 3000.0 |
| emp1     | [EN, FR]                                                 | 2000.0 |
+----------+----------------------------------------------------------+--------+
{code}
Then I add SALT_BUCKETS to the table, and if I use 'order by total', the 
FIRST_VALUES results come back empty:

 
{code:java}
 
drop table emp;

create table emp (
 emp_code VARCHAR not null,
 bu_code VARCHAR not null, 
 territory_codes VARCHAR, 
 salary DOUBLE, 
 CONSTRAINT pk PRIMARY KEY (emp_code, bu_code)) SALT_BUCKETS=10;
upsert into emp values('emp1', 'bu1', 'FR', 1000);
upsert into emp values('emp1', 'bu2', 'EN', 1000);
upsert into emp values('emp2', 'bu1', 'US', 1000);
upsert into emp values('emp2', 'bu2', 'DE', 1000);
upsert into emp values('emp2', 'bu3', 'AF', 1000);
SELECT emp_code, first_values(territory_codes, 10) within group (order by 
territory_codes asc), sum(salary) as total from emp group by emp_code order by 
total desc limit 100;
+----------+----------------------------------------------------------+--------+
| EMP_CODE | FIRST_VALUES(TERRITORY_CODES, true, TERRITORY_CODES, 10) | TOTAL  |
+----------+----------------------------------------------------------+--------+
| emp2     |                                                          | 3000.0 |
| emp1     |                                                          | 2000.0 |
+----------+----------------------------------------------------------+--------+
2
SELECT emp_code, first_values(territory_codes, 10) within group (order by 
territory_codes asc), sum(salary) as total from emp group by emp_code limit 100;
+----------+----------------------------------------------------------+--------+
| EMP_CODE | FIRST_VALUES(TERRITORY_CODES, true, TERRITORY_CODES, 10) | TOTAL  |
+----------+----------------------------------------------------------+--------+
| emp1     | [EN, FR]                                                 | 2000.0 |
| emp2     | [AF, DE, US]                                             | 3000.0 |
+----------+----------------------------------------------------------+--------+
{code}

Cheers,

-manu

  was:
Hi,

so I'm running Phoenix on HBase 1.2, and found this gem:

First, without salt_buckets, everything works fine:

 
{code:java}
create table emp (
 emp_code VARCHAR not null,
 bu_code VARCHAR not null,
 territory_codes VARCHAR,
 salary DOUBLE,
 CONSTRAINT pk PRIMARY KEY (emp_code, bu_code));
upsert into emp values('emp1', 'bu1', 'FR', 1000);
upsert into emp values('emp1', 'bu2', 'EN', 1000);
upsert into emp values('emp2', 'bu1', 'US', 1000);
upsert into emp values('emp2', 'bu2', 'DE', 1000);
upsert into emp values('emp2', 'bu3', 'AF', 1000);
SELECT emp_code, first_values(territory_codes, 10) within group (order by 
territory_codes asc), sum(salary) as total from emp group by emp_code order by 
total desc limit 100;
 
+----------+----------------------------------------------------------+--------+
| EMP_CODE | FIRST_VALUES(TERRITORY_CODES, true, TERRITORY_CODES, 10) | TOTAL  |
+----------+----------------------------------------------------------+--------+
| emp2     | [AF, DE, US]                                             | 3000.0 |
| emp1     | [EN, FR]                                                 | 2000.0 |
+----------+----------------------------------------------------------+--------+
{code}
Then I add SALT_BUCKETS to the table, and if I use 'order by total', the 
FIRST_VALUES results come back empty:

 
{code:java}
 
drop table emp;

create table emp (
 emp_code VARCHAR not null,
 bu_code VARCHAR not null, 
 territory_codes VARCHAR, 
 salary DOUBLE, 
 CONSTRAINT pk PRIMARY KEY (emp_code, bu_code)) SALT_BUCKETS=10;
upsert into emp values('emp1', 'bu1', 'FR', 1000);
upsert into emp values('emp1', 'bu2', 'EN', 1000);
upsert into emp values('emp2', 'bu1', 'US', 1000);
upsert into emp values('emp2', 'bu2', 'DE', 1000);
upsert into emp values('emp2', 'bu3', 'AF', 1000);
SELECT emp_code, first_values(territory_codes, 10) within group (order by 
territory_codes asc), sum(salary) as total from emp group by emp_code order by 
total desc limit 100;

[jira] [Created] (PHOENIX-4740) FIRST_VALUES fails when using salt_buckets and order by

2018-05-17 Thread Valliet (JIRA)
Valliet created PHOENIX-4740:


 Summary: FIRST_VALUES fails when using salt_buckets and order by
 Key: PHOENIX-4740
 URL: https://issues.apache.org/jira/browse/PHOENIX-4740
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.13.1
 Environment: BigInsight 4.2.0.0

HBase 1.2

phoenix-4.13.1-HBase-1.2
Reporter: Valliet
 Attachments: emp.sql

Hi,

so I'm running Phoenix on HBase 1.2, and found this gem:

First, without salt_buckets, everything works fine:

 
{code:java}
create table emp (
 emp_code VARCHAR not null,
 bu_code VARCHAR not null,
 territory_codes VARCHAR,
 salary DOUBLE,
 CONSTRAINT pk PRIMARY KEY (emp_code, bu_code));
upsert into emp values('emp1', 'bu1', 'FR', 1000);
upsert into emp values('emp1', 'bu2', 'EN', 1000);
upsert into emp values('emp2', 'bu1', 'US', 1000);
upsert into emp values('emp2', 'bu2', 'DE', 1000);
upsert into emp values('emp2', 'bu3', 'AF', 1000);
SELECT emp_code, first_values(territory_codes, 10) within group (order by 
territory_codes asc), sum(salary) as total from emp group by emp_code order by 
total desc limit 100;
 
+----------+----------------------------------------------------------+--------+
| EMP_CODE | FIRST_VALUES(TERRITORY_CODES, true, TERRITORY_CODES, 10) | TOTAL  |
+----------+----------------------------------------------------------+--------+
| emp2     | [AF, DE, US]                                             | 3000.0 |
| emp1     | [EN, FR]                                                 | 2000.0 |
+----------+----------------------------------------------------------+--------+
{code}
Then I add SALT_BUCKETS to the table, and if I use 'order by total', the 
FIRST_VALUES results come back empty:

 
{code:java}
 
drop table emp;

create table emp (
 emp_code VARCHAR not null,
 bu_code VARCHAR not null, 
 territory_codes VARCHAR, 
 salary DOUBLE, 
 CONSTRAINT pk PRIMARY KEY (emp_code, bu_code)) SALT_BUCKETS=10;
upsert into emp values('emp1', 'bu1', 'FR', 1000);
upsert into emp values('emp1', 'bu2', 'EN', 1000);
upsert into emp values('emp2', 'bu1', 'US', 1000);
upsert into emp values('emp2', 'bu2', 'DE', 1000);
upsert into emp values('emp2', 'bu3', 'AF', 1000);
SELECT emp_code, first_values(territory_codes, 10) within group (order by 
territory_codes asc), sum(salary) as total from emp group by emp_code order by 
total desc limit 100;
+----------+----------------------------------------------------------+--------+
| EMP_CODE | FIRST_VALUES(TERRITORY_CODES, true, TERRITORY_CODES, 10) | TOTAL  |
+----------+----------------------------------------------------------+--------+
| emp2     |                                                          | 3000.0 |
| emp1     |                                                          | 2000.0 |
+----------+----------------------------------------------------------+--------+
2
SELECT emp_code, first_values(territory_codes, 10) within group (order by 
territory_codes asc), sum(salary) as total from emp group by emp_code limit 100;
+----------+----------------------------------------------------------+--------+
| EMP_CODE | FIRST_VALUES(TERRITORY_CODES, true, TERRITORY_CODES, 10) | TOTAL  |
+----------+----------------------------------------------------------+--------+
| emp1     | [EN, FR]                                                 | 2000.0 |
| emp2     | [AF, DE, US]                                             | 3000.0 |
+----------+----------------------------------------------------------+--------+
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4701) Write client-side metrics asynchronously to SYSTEM.LOG

2018-05-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478688#comment-16478688
 ] 

Hudson commented on PHOENIX-4701:
-

SUCCESS: Integrated in Jenkins build Phoenix-4.x-HBase-1.3 #136 (See 
[https://builds.apache.org/job/Phoenix-4.x-HBase-1.3/136/])
PHOENIX-4701 Write client-side metrics asynchronously to (jtaylor: rev 
2015345a023f0adb59174443ec1328bb1399f11b)
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/QueryWithLimitIT.java


> Write client-side metrics asynchronously to SYSTEM.LOG
> --
>
> Key: PHOENIX-4701
> URL: https://issues.apache.org/jira/browse/PHOENIX-4701
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: James Taylor
>Assignee: Ankit Singhal
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4701.patch, PHOENIX-4701_addendum.patch, 
> PHOENIX-4701_addendum2.patch, PHOENIX-4701_master.patch, 
> PHOENIX-4701_v2.patch, PHOENIX-4701_wip1.patch, PHOENIX-4701_wip2.patch, 
> PHOENIX-4701_wip3.patch
>
>
> Rather than inventing a new, different set of client-side metrics to 
> persist, we should just persist our [client 
> metrics|http://phoenix.apache.org/metrics.html] in the SYSTEM.LOG. The 
> metrics capture all the same information as your QueryLogInfo (and much 
> more), roll all the information up to a single set of metrics for each 
> Phoenix statement (aggregating/merging parallel scans, etc.), and can emit 
> a single log line (which could be written in a single upsert statement). At 
> SFDC, we emit this information into a file system log in a layer above (and 
> use Splunk to produce nifty dashboards for monitoring), but this could 
> easily be emitted directly in Phoenix and go through your asynchronous 
> write path (and then use Phoenix queries to produce the same kind of 
> dashboards). The only missing piece would be to add the concept of a log 
> level to each metric to enable statically controlling which metrics are 
> output.
> With this approach, the SYSTEM.LOG table could be declared immutable and 
> use our dense storage format with a single byte for column encoding, 
> getting a 3-5x perf gain. This would also open the door for users to 
> potentially add secondary indexes on the table. See the schema identified 
> in the wip2 patch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4701) Write client-side metrics asynchronously to SYSTEM.LOG

2018-05-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478663#comment-16478663
 ] 

Hudson commented on PHOENIX-4701:
-

FAILURE: Integrated in Jenkins build Phoenix-4.x-HBase-0.98 #1894 (See 
[https://builds.apache.org/job/Phoenix-4.x-HBase-0.98/1894/])
PHOENIX-4701 Write client-side metrics asynchronously to (jtaylor: rev 
006df25cdd7337ca16b98bed92aa6ed24b054496)
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/QueryWithLimitIT.java


> Write client-side metrics asynchronously to SYSTEM.LOG
> --
>
> Key: PHOENIX-4701
> URL: https://issues.apache.org/jira/browse/PHOENIX-4701
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: James Taylor
>Assignee: Ankit Singhal
>Priority: Major
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4701.patch, PHOENIX-4701_addendum.patch, 
> PHOENIX-4701_addendum2.patch, PHOENIX-4701_master.patch, 
> PHOENIX-4701_v2.patch, PHOENIX-4701_wip1.patch, PHOENIX-4701_wip2.patch, 
> PHOENIX-4701_wip3.patch
>
>
> Rather than inventing a new, different set of client-side metrics to 
> persist, we should just persist our [client 
> metrics|http://phoenix.apache.org/metrics.html] in the SYSTEM.LOG. The 
> metrics capture all the same information as your QueryLogInfo (and much 
> more), roll all the information up to a single set of metrics for each 
> Phoenix statement (aggregating/merging parallel scans, etc.), and can emit 
> a single log line (which could be written in a single upsert statement). At 
> SFDC, we emit this information into a file system log in a layer above (and 
> use Splunk to produce nifty dashboards for monitoring), but this could 
> easily be emitted directly in Phoenix and go through your asynchronous 
> write path (and then use Phoenix queries to produce the same kind of 
> dashboards). The only missing piece would be to add the concept of a log 
> level to each metric to enable statically controlling which metrics are 
> output.
> With this approach, the SYSTEM.LOG table could be declared immutable and 
> use our dense storage format with a single byte for column encoding, 
> getting a 3-5x perf gain. This would also open the door for users to 
> potentially add secondary indexes on the table. See the schema identified 
> in the wip2 patch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-4685) Properly handle connection caching for Phoenix inside RegionServers

2018-05-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478662#comment-16478662
 ] 

Hudson commented on PHOENIX-4685:
-

FAILURE: Integrated in Jenkins build Phoenix-4.x-HBase-0.98 #1894 (See 
[https://builds.apache.org/job/Phoenix-4.x-HBase-0.98/1894/])
PHOENIX-4685 Properly handle connection caching for Phoenix inside 
(ankitsinghal59: rev 928f5f80d6f434a914df544b28f0b7bbabc2e3d1)
* (edit) phoenix-core/src/main/java/org/apache/phoenix/util/ServerUtil.java
* (edit) phoenix-core/src/test/java/org/apache/phoenix/query/BaseTest.java


> Properly handle connection caching for Phoenix inside RegionServers
> ---
>
> Key: PHOENIX-4685
> URL: https://issues.apache.org/jira/browse/PHOENIX-4685
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Rajeshbabu Chintaguntla
>Assignee: Rajeshbabu Chintaguntla
>Priority: Blocker
> Fix For: 4.14.0, 5.0.0
>
> Attachments: PHOENIX-4685.patch, PHOENIX-4685_5.x-HBase-2.0.patch, 
> PHOENIX-4685_addendum.patch, PHOENIX-4685_addendum2.patch, 
> PHOENIX-4685_addendum3.patch, PHOENIX-4685_addendum4.patch, 
> PHOENIX-4685_jstack, PHOENIX-4685_v2.patch, PHOENIX-4685_v3.patch, 
> PHOENIX-4685_v4.patch, PHOENIX-4685_v5.patch
>
>
> Currently, trying to write data to an indexed table fails with an OOME 
> (unable to create native threads), though it works fine with the 4.7.x 
> branches. Many threads are created for meta lookups and shared pools, 
> leaving no space to create new threads. This happens even with 
> short-circuit writes enabled.
> {noformat}
> 2018-04-08 13:06:04,747 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=9,queue=0,port=16020] 
> index.PhoenixIndexFailurePolicy: handleFailure failed
> java.io.IOException: java.lang.reflect.UndeclaredThrowableException
> at org.apache.hadoop.hbase.security.User.runAsLoginUser(User.java:185)
> at 
> org.apache.phoenix.index.PhoenixIndexFailurePolicy.handleFailureWithExceptions(PhoenixIndexFailurePolicy.java:217)
> at 
> org.apache.phoenix.index.PhoenixIndexFailurePolicy.handleFailure(PhoenixIndexFailurePolicy.java:143)
> at 
> org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:160)
> at 
> org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:144)
> at 
> org.apache.phoenix.hbase.index.Indexer.doPostWithExceptions(Indexer.java:632)
> at org.apache.phoenix.hbase.index.Indexer.doPost(Indexer.java:607)
> at 
> org.apache.phoenix.hbase.index.Indexer.postBatchMutateIndispensably(Indexer.java:590)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$30.call(RegionCoprocessorHost.java:1037)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$30.call(RegionCoprocessorHost.java:1034)
> at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
> at 
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
> at 
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postBatchMutateIndispensably(RegionCoprocessorHost.java:1034)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$MutationBatchOperation.doPostOpCleanupForMiniBatch(HRegion.java:3533)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:3914)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3822)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3753)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1027)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:959)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:922)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2666)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42014)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: java.lang.reflect.UndeclaredThrowableException
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761)
> at 
>