[jira] [Created] (PHOENIX-2787) support IF EXISTS for ALTER TABLE SET options

2016-03-21 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-2787:
-

 Summary: support IF EXISTS for ALTER TABLE SET options
 Key: PHOENIX-2787
 URL: https://issues.apache.org/jira/browse/PHOENIX-2787
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 4.8.0
Reporter: Vincent Poon
Priority: Trivial


A nice-to-have improvement to the grammar:

ALTER TABLE my_table IF EXISTS SET options

Currently, 'IF EXISTS' only works when dropping or adding a column.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PHOENIX-3334) ConnectionQueryServicesImpl should close HConnection if init fails

2016-09-26 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-3334:
-

 Summary: ConnectionQueryServicesImpl should close HConnection if init fails
 Key: PHOENIX-3334
 URL: https://issues.apache.org/jira/browse/PHOENIX-3334
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.7.0
Reporter: Vincent Poon


We are seeing ZK connection leaks when there's an error during Phoenix 
connection creation.  ConnectionQueryServicesImpl grabs an HConnection during 
init, which creates a ZK ClientCnxn, which in turn starts two threads 
(EventThread, SendThread).  Later in the Phoenix connection init, an exception 
is thrown (in our case, because of an incorrect server jar version).  Phoenix 
bubbles up the exception but never explicitly calls close on the HConnection, 
so the ZK threads stay alive.

This was perhaps partially by design, as the HConnectionImplementation is 
supposed to have a DelayedClosing reaper thread that reaps any stale ZK 
connections.  However, because of HBASE-11354, that reaper never gets started 
(we are running HBase 0.98).

In any case, the reaper was deprecated in HBASE-6778, so clients should 
close the connection themselves.
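
A minimal sketch of the cleanup being asked for, not the actual Phoenix patch. 
FakeConnection, QueryServicesSketch and checkCompatibility are hypothetical 
stand-ins for HConnection, ConnectionQueryServicesImpl and the client/server 
version check.  The only point is that the connection acquired during init 
gets closed on the failure path before the exception bubbles up.

```java
import java.sql.SQLException;

// Hypothetical stand-in for HConnection; records whether close() was called.
class FakeConnection implements AutoCloseable {
    boolean closed = false;

    @Override
    public void close() {
        closed = true; // a real connection would also stop its ZK threads here
    }
}

// Hypothetical stand-in for ConnectionQueryServicesImpl.
class QueryServicesSketch {
    FakeConnection connection;

    // Simulates init(): open the connection, then run a check that may fail.
    void init(boolean serverCompatible) throws SQLException {
        connection = new FakeConnection();
        boolean success = false;
        try {
            checkCompatibility(serverCompatible);
            success = true;
        } finally {
            if (!success) {
                // Close explicitly instead of relying on a reaper thread.
                connection.close();
            }
        }
    }

    // Simulates the version check that failed in the report above.
    private void checkCompatibility(boolean serverCompatible) throws SQLException {
        if (!serverCompatible) {
            throw new SQLException("incompatible server jar version");
        }
    }
}
```

The success flag keeps the connection open on the normal path and closes it 
for any exception type, checked or unchecked.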

{code}
at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.query.ConnectionQueryServicesImpl.checkClientServerCompatibility(ConnectionQueryServicesImpl.java:1167) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1034) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1370) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2116) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:828) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:183) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:335) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:323) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:321) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1274) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2275) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2244) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:78) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2244) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:233) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:135) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:202) ~[phoenix-core-4.7.0-sfdc-1.0.8.jar:4.7.0-sfdc-1.0.8]
at java.sql.DriverManager.getConnection(DriverManager.java:664) ~[na:1.8.0_60]
at java.sql.DriverManager.getConnection(DriverManager.java:270) ~[na:1.8.0_60]
{code}





[jira] [Commented] (PHOENIX-3806) IndexUpdateManager spending a lot of time sorting mutations on Index rebuild

2017-04-24 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982230#comment-15982230
 ] 

Vincent Poon commented on PHOENIX-3806:
---

What I find is that if I create 100 versions of a row, 
NonTxIndexBuilder#createTimestampBatchesFromMutation creates 100 batches based 
on timestamp.

Then for each batch, we call addMutationsForBatch, which in step 3 loops and 
adds index updates for all timestamps up to the current batch's timestamp.  
All the index updates are DELETE updates.

What this means is that overall, the number of updates is the sum of the 
series 1...100.  So you end up with something like 5050 index updates, and 
because of the issues described above, you're sorting 5050 times.

And of course as you create more versions, the numbers quickly become 
infeasible.
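
To make the growth concrete, here is a small sketch of the arithmetic only.  
It assumes, as described above, that batch k emits one index update for each 
of the timestamps 1..k; the class and method names are made up for 
illustration and are not Phoenix code.

```java
// Counts index updates when batch k re-emits updates for timestamps 1..k.
class IndexUpdateCountSketch {
    // Total updates across all batches: 1 + 2 + ... + versions.
    static long totalIndexUpdates(int versions) {
        long total = 0;
        for (int batch = 1; batch <= versions; batch++) {
            total += batch; // batch k adds updates for timestamps 1..k
        }
        return total;
    }

    public static void main(String[] args) {
        for (int versions : new int[] {10, 100, 1000}) {
            System.out.println(versions + " versions: "
                + totalIndexUpdates(versions) + " index updates");
        }
    }
}
```

The total is n(n+1)/2, so it grows quadratically with the number of versions, 
and if every added update triggers a sort the work grows faster still.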

> IndexUpdateManager spending a lot of time sorting mutations on Index rebuild
> 
>
> Key: PHOENIX-3806
> URL: https://issues.apache.org/jira/browse/PHOENIX-3806
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>
> Here's the stack trace. The Array contains 50001 Delete Mutations in this 
> case.
> It seems the code is sorting this over and over again.
> {code}
> Thread 170 (B.DefaultRpcServer.handler=67,queue=7,port=60020):
>   State: RUNNABLE
>   Blocked count: 220598
>   Waited count: 377933
>   Stack:
> java.util.TimSort.binarySort(TimSort.java:296)
> java.util.TimSort.sort(TimSort.java:239)
> java.util.Arrays.sort(Arrays.java:1438)
> org.apache.phoenix.hbase.index.covered.update.SortedCollection.iterator(SortedCollection.java:78)
> org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager.fixUpCurrentUpdates(IndexUpdateManager.java:128)
> org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager.addIndexUpdate(IndexUpdateManager.java:115)
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addCurrentStateMutationsForBatch(NonTxIndexBuilder.java:333)
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addUpdateForGivenTimestamp(NonTxIndexBuilder.java:258)
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addMutationsForBatch(NonTxIndexBuilder.java:231)
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.batchMutationAndAddUpdates(NonTxIndexBuilder.java:109)
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.getIndexUpdate(NonTxIndexBuilder.java:71)
> org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call(IndexBuildManager.java:137)
> org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call(IndexBuildManager.java:133)
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
> com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:61)
> org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submit(BaseTaskRunner.java:58)
> org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submitUninterruptible(BaseTaskRunner.java:99)
> org.apache.phoenix.hbase.index.builder.IndexBuildManager.getIndexUpdate(IndexBuildManager.java:144)
> org.apache.phoenix.hbase.index.Indexer.preBatchMutateWithExceptions(Indexer.java:324)
> Thread 169 (B.DefaultRpcServer.handler=66,queue=6,port=60020):
> {code}
> [~jamestaylor]





[jira] [Commented] (PHOENIX-4057) Do not issue index updates for out of order mutation during index maintenance

2017-08-02 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16111757#comment-16111757
 ] 

Vincent Poon commented on PHOENIX-4057:
---

[~jamestaylor] I guess this would also mean we can't run a scrutiny as of an 
older timestamp?  As that would essentially be a point-in-time query comparison 
of data vs index table, and there could potentially be data table writes 
without corresponding index updates with this change?

> Do not issue index updates for out of order mutation during index maintenance 
> --
>
> Key: PHOENIX-4057
> URL: https://issues.apache.org/jira/browse/PHOENIX-4057
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.12.0, 4.11.1
>
> Attachments: PHOENIX-4057_v1.patch, PHOENIX-4057_wip1.patch
>
>
> Index maintenance is not correct when rows arrive out of order (see 
> PHOENIX-4052). In particular, out of order deletes end up with a spurious Put 
> in the index. Rather than corrupt the secondary index, we can instead just 
> ignore out-of-order mutations. The only downside is that point-in-time 
> queries against an index will not work correctly.





[jira] [Commented] (PHOENIX-3994) Index RPC priority still depends on the controller factory property in hbase-site.xml

2017-07-13 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086062#comment-16086062
 ] 

Vincent Poon commented on PHOENIX-3994:
---

v4 patch lgtm, [~samarthjain]

> Index RPC priority still depends on the controller factory property in 
> hbase-site.xml
> -
>
> Key: PHOENIX-3994
> URL: https://issues.apache.org/jira/browse/PHOENIX-3994
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
>Reporter: Sergey Soldatov
>Assignee: Samarth Jain
>Priority: Critical
> Fix For: 4.12.0, 4.11.1
>
> Attachments: PHOENIX-3994_addendum.patch, PHOENIX-3994.patch, 
> PHOENIX-3994_v2.patch, PHOENIX-3994_v3.patch, PHOENIX-3994_v4.patch
>
>
> In PHOENIX-3360 we tried to remove the dependency on the 
> hbase.rpc.controllerfactory.class property in hbase-site.xml, since it causes 
> problems on the client side (if the client uses the server-side 
> configuration, all client requests may be sent with index priority). The 
> committed solution sets the controller factory programmatically for the 
> coprocessor environment in the Indexer class, but it turns out that this 
> doesn't work because the environment configuration is not used when the 
> coprocessor connection is created. We need a better solution, since this 
> issue may cause accidental locks and failures that are hard to identify and 
> avoid.





[jira] [Updated] (PHOENIX-3948) Enable shorter time outs for server-side index writes

2017-07-13 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3948:
--
Attachment: PHOENIX-3948.master.v2.patch

> Enable shorter time outs for server-side index writes
> -
>
> Key: PHOENIX-3948
> URL: https://issues.apache.org/jira/browse/PHOENIX-3948
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
>  Labels: globalMutableSecondaryIndex
> Attachments: PHOENIX-3948.master.v1.patch, 
> PHOENIX-3948.master.v2.patch
>
>
> The default timeouts are far too high for a RS->RS global index update as 
> we're holding on to a handler thread when the retries occur. We should be 
> able to set them to be in line with the client-side timeouts.





[jira] [Commented] (PHOENIX-3807) Add server level metrics for secondary indexes

2017-07-13 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086550#comment-16086550
 ] 

Vincent Poon commented on PHOENIX-3807:
---

Apologies [~elserj], forgot about that and forgot to mention it.  It looks like 
your patch is ready for when a new 1.4 branch is available though.

Originally I was thinking HBASE-18060 could make it into 1.3 but it never got 
committed there.

> Add server level metrics for secondary indexes
> --
>
> Key: PHOENIX-3807
> URL: https://issues.apache.org/jira/browse/PHOENIX-3807
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Vincent Poon
> Fix For: 4.12.0
>
> Attachments: PHOENIX-3807.v1.master.patch, PHOENIX-3807.v2.patch, 
> PHOENIX-3807.v3.patch, Screen Shot 2017-05-31 at 4.00.16 PM.png
>
>
> Add server level metrics for secondary indexes
> - Histogram metrics for time to complete all secondary index updates per 
> primary table update per index type. Will help us trend perf over time and 
> catch impending issues. 
> - Histogram metrics for number of updates dispatched to secondary index 
> stores to service one primary table update per index type. Will help us catch 
> inefficient behavior or problematic schema design. 
> - Count of deployed secondary indexes by type. Will help understand and trend 
> index usage, or catch deploy of an unwanted index type.





[jira] [Commented] (PHOENIX-3948) Enable shorter time outs for server-side index writes

2017-07-10 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081268#comment-16081268
 ] 

Vincent Poon commented on PHOENIX-3948:
---

[~gjacoby] Yeah, an operator would need to add this new config.  I originally 
put in a new default, but wasn't sure if that was right, as some people may 
have tweaked their hbase.client.retries.number, or the real culprit, 
hbase.client.serverside.retries.multiplier, which takes the first number and 
multiplies it by 10.  In HBase master there is even a note:
 "// TODO: Fix this. Not all connections from server side should have 10 times 
the retries."

So presumably if they fixed that up there, then the fix would get inherited in 
the Indexer.  But otherwise, folks will need to add this property override.
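
A rough sketch of the arithmetic behind that multiplier, under assumptions: a 
plain Map stands in for the HBase Configuration, the two property names are 
the ones mentioned above, and the fallback values are passed in by the caller 
rather than taken from any particular HBase release.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: computes retries for a server-side connection.
class ServerSideRetriesSketch {
    static final String RETRIES = "hbase.client.retries.number";
    static final String MULTIPLIER = "hbase.client.serverside.retries.multiplier";

    // Effective retries = configured retries * server-side multiplier.
    // The defaults are caller-supplied assumptions, not HBase's real defaults.
    static int effectiveRetries(Map<String, String> conf,
                                int defaultRetries, int defaultMultiplier) {
        int retries = Integer.parseInt(
            conf.getOrDefault(RETRIES, String.valueOf(defaultRetries)));
        int multiplier = Integer.parseInt(
            conf.getOrDefault(MULTIPLIER, String.valueOf(defaultMultiplier)));
        return retries * multiplier;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(RETRIES, "5");
        System.out.println("with multiplier 10: " + effectiveRetries(conf, 5, 10));
        conf.put(MULTIPLIER, "1"); // the kind of override an operator would add
        System.out.println("with multiplier 1: " + effectiveRetries(conf, 5, 10));
    }
}
```

This is why a retry count that looks modest on the client can hold a handler 
thread for much longer on the server side unless the multiplier is overridden.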

> Enable shorter time outs for server-side index writes
> -
>
> Key: PHOENIX-3948
> URL: https://issues.apache.org/jira/browse/PHOENIX-3948
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
>  Labels: globalMutableSecondaryIndex
> Attachments: PHOENIX-3948.master.v1.patch
>
>
> The default timeouts are far too high for a RS->RS global index update as 
> we're holding on to a handler thread when the retries occur. We should be 
> able to set them to be in line with the client-side timeouts.





[jira] [Updated] (PHOENIX-3948) Enable shorter time outs for server-side index writes

2017-07-10 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3948:
--
Attachment: PHOENIX-3948.master.v1.patch

> Enable shorter time outs for server-side index writes
> -
>
> Key: PHOENIX-3948
> URL: https://issues.apache.org/jira/browse/PHOENIX-3948
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
>  Labels: globalMutableSecondaryIndex
> Attachments: PHOENIX-3948.master.v1.patch
>
>
> The default timeouts are far too high for a RS->RS global index update as 
> we're holding on to a handler thread when the retries occur. We should be 
> able to set them to be in line with the client-side timeouts.





[jira] [Assigned] (PHOENIX-3948) Enable shorter time outs for server-side index writes

2017-07-10 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon reassigned PHOENIX-3948:
-

Assignee: Vincent Poon

> Enable shorter time outs for server-side index writes
> -
>
> Key: PHOENIX-3948
> URL: https://issues.apache.org/jira/browse/PHOENIX-3948
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
>  Labels: globalMutableSecondaryIndex
>
> The default timeouts are far too high for a RS->RS global index update as 
> we're holding on to a handler thread when the retries occur. We should be 
> able to set them to be in line with the client-side timeouts.





[jira] [Commented] (PHOENIX-3948) Enable shorter time outs for server-side index writes

2017-07-10 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081215#comment-16081215
 ] 

Vincent Poon commented on PHOENIX-3948:
---

[~jamestaylor] [~samarthjain] I was able to set the timeout in the HTable 
factory when the HTable is created.  Let me know what you think.

> Enable shorter time outs for server-side index writes
> -
>
> Key: PHOENIX-3948
> URL: https://issues.apache.org/jira/browse/PHOENIX-3948
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
>  Labels: globalMutableSecondaryIndex
> Attachments: PHOENIX-3948.master.v1.patch
>
>
> The default timeouts are far too high for a RS->RS global index update as 
> we're holding on to a handler thread when the retries occur. We should be 
> able to set them to be in line with the client-side timeouts.





[jira] [Commented] (PHOENIX-3807) Add server level metrics for secondary indexes

2017-07-12 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16084407#comment-16084407
 ] 

Vincent Poon commented on PHOENIX-3807:
---

[~elserj] 100% agree with your naming suggestions.  I was planning to put up a 
new patch but got bogged down in other work.  You can go for it, otherwise I'll 
try to get back to this in a couple days.  Thanks for the idea.

> Add server level metrics for secondary indexes
> --
>
> Key: PHOENIX-3807
> URL: https://issues.apache.org/jira/browse/PHOENIX-3807
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Vincent Poon
> Attachments: PHOENIX-3807.v1.master.patch, Screen Shot 2017-05-31 at 
> 4.00.16 PM.png
>
>
> Add server level metrics for secondary indexes
> - Histogram metrics for time to complete all secondary index updates per 
> primary table update per index type. Will help us trend perf over time and 
> catch impending issues. 
> - Histogram metrics for number of updates dispatched to secondary index 
> stores to service one primary table update per index type. Will help us catch 
> inefficient behavior or problematic schema design. 
> - Count of deployed secondary indexes by type. Will help understand and trend 
> index usage, or catch deploy of an unwanted index type.





[jira] [Updated] (PHOENIX-3948) Enable shorter time outs for server-side index writes

2017-07-14 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3948:
--
Attachment: PHOENIX-3948.master.v3.patch

New v3 patch sets the timeout at 48 seconds, less than the HBase default rpc 
timeout of 60 seconds, so that the client writing to the data table has a 
chance to receive a response before retrying on a new handler.

> Enable shorter time outs for server-side index writes
> -
>
> Key: PHOENIX-3948
> URL: https://issues.apache.org/jira/browse/PHOENIX-3948
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
>  Labels: globalMutableSecondaryIndex
> Attachments: PHOENIX-3948.master.v1.patch, 
> PHOENIX-3948.master.v2.patch, PHOENIX-3948.master.v3.patch
>
>
> The default timeouts are far too high for a RS->RS global index update as 
> we're holding on to a handler thread when the retries occur. We should be 
> able to set them to be in line with the client-side timeouts.





[jira] [Commented] (PHOENIX-4051) Prevent out-of-order updates for mutable index updates

2017-07-27 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16104172#comment-16104172
 ] 

Vincent Poon commented on PHOENIX-4051:
---

Hey [~jamestaylor], I see this in HRegion:

{code}
  // we should record the timestamp only after we have acquired the rowLock,
  // otherwise, newer puts/deletes are not guaranteed to have a newer timestamp
  now = EnvironmentEdgeManager.currentTimeMillis();
  byte[] byteNow = Bytes.toBytes(now);
{code}

So why would this patch be necessary?  I do see that in HRegion they set the 
timestamp before this code:
{code}
lock(this.updatesLock.readLock(), numReadyToWrite);
  locked = true;
{code}

So that would mean this interleaving is possible:
# thread 1 generates ts1
# thread 2 generates ts2 and acquires the updatesLock
# thread 1 then acquires the updatesLock and passes the Indexer ts1, which is 
out of order

In which case your patch gets around that.  I'm just trying to understand what 
updatesLock is, and why it's relevant when we already hold the row locks 
(HRegion#doMiniBatchMutate, Step 1).

LGTM - also would be nice if you could include your test as part of this patch?
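
The interleaving in question can be replayed in a single-threaded sketch.  
This is not HRegion or Indexer code: the class and method names are invented, 
the "lock" is just the order in which the two writers are processed, and the 
counter stands in for the clock.  It compares taking the timestamp before the 
lock with taking it while the lock is held.

```java
import java.util.ArrayList;
import java.util.List;

// Single-threaded replay of the two-writer interleaving; no real locks.
class TimestampOrderSketch {
    // Returns timestamps in lock-acquisition order (thread 2, then thread 1).
    static List<Long> lockOrder(boolean stampWhileLocked) {
        long clock = 0;
        List<Long> seen = new ArrayList<>();
        if (stampWhileLocked) {
            long ts2 = ++clock; // thread 2 holds the lock, then takes its timestamp
            seen.add(ts2);
            long ts1 = ++clock; // thread 1 holds the lock next, then takes its timestamp
            seen.add(ts1);
        } else {
            long ts1 = ++clock; // thread 1 generates ts1 before locking
            long ts2 = ++clock; // thread 2 generates ts2 before locking
            seen.add(ts2);      // thread 2 acquires the lock first
            seen.add(ts1);      // thread 1 follows with an older timestamp
        }
        return seen;
    }

    // True if each timestamp is >= the one before it.
    static boolean inOrder(List<Long> timestamps) {
        for (int i = 1; i < timestamps.size(); i++) {
            if (timestamps.get(i) < timestamps.get(i - 1)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("stamp before lock: " + lockOrder(false));
        System.out.println("stamp while locked: " + lockOrder(true));
    }
}
```

Only the second variant hands timestamps downstream in non-decreasing order, 
which is the property that resetting the timestamp under the lock relies on.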

> Prevent out-of-order updates for mutable index updates
> --
>
> Key: PHOENIX-4051
> URL: https://issues.apache.org/jira/browse/PHOENIX-4051
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
> Attachments: PHOENIX-4051_v1.patch
>
>
> Out-of-order processing of data rows during index maintenance causes mutable 
> indexes to become out of sync with regard to the data table. Here's a simple 
> example to illustrate the issue:
> # Assume table T(K,V) and index X(V,K).
> # Upsert T(A, 1) at t10. Index updates: Put X(1,A) at t10.
> # Upsert T(A, 3) at t30. Index updates: Delete X(1,A) at t29, Put X(3,A) at 
> t30.
> # Upsert T(A,2) at t20. Index updates: Delete X(1,A) at t19, Put X(2,A) at 
> t20, Delete X(2,A) at t29
> Ideally, we'd want to remove the Delete X(1,A) at t29 since this isn't 
> correct in terms of timeline consistency, but we can't do that with HBase 
> without support for deleting/undoing Delete markers. 
> The above is not what is occurring. Instead, when T(A,2) comes in, the Put 
> X(2,A) will occur at t20, but the Delete won't occur. This causes more index 
> rows than data rows, essentially making it invalid.
> A quick fix is to reset the timestamp of the data table mutations to the 
> current time within the preBatchMutate call, when the row is exclusively 
> locked. This skirts the issue because then timestamps won't overlap.





[jira] [Created] (PHOENIX-4050) Index rows created by 4.9 client against 4.10 server are not deleted properly

2017-07-26 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-4050:
-

 Summary: Index rows created by 4.9 client against 4.10 server are not deleted properly
 Key: PHOENIX-4050
 URL: https://issues.apache.org/jira/browse/PHOENIX-4050
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.10.0
Reporter: Vincent Poon


For mutable secondary indexes, the index rows created by a 4.9 client against a 
4.10 server do not get deleted after issuing a "delete from" statement against 
the data table.





[jira] [Commented] (PHOENIX-3807) Add server level metrics for secondary indexes

2017-07-19 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093576#comment-16093576
 ] 

Vincent Poon commented on PHOENIX-3807:
---

+1 [~elserj]

> Add server level metrics for secondary indexes
> --
>
> Key: PHOENIX-3807
> URL: https://issues.apache.org/jira/browse/PHOENIX-3807
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Vincent Poon
> Fix For: 4.12.0
>
> Attachments: PHOENIX-3807.v1.master.patch, PHOENIX-3807.v2.patch, 
> PHOENIX-3807.v3.patch, Screen Shot 2017-05-31 at 4.00.16 PM.png
>
>
> Add server level metrics for secondary indexes
> - Histogram metrics for time to complete all secondary index updates per 
> primary table update per index type. Will help us trend perf over time and 
> catch impending issues. 
> - Histogram metrics for number of updates dispatched to secondary index 
> stores to service one primary table update per index type. Will help us catch 
> inefficient behavior or problematic schema design. 
> - Count of deployed secondary indexes by type. Will help understand and trend 
> index usage, or catch deploy of an unwanted index type.





[jira] [Commented] (PHOENIX-4042) Add hadoop-metrics2 server level metrics for secondary indexes

2017-07-19 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093879#comment-16093879
 ] 

Vincent Poon commented on PHOENIX-4042:
---

+1, thanks [~elserj] !

> Add hadoop-metrics2 server level metrics for secondary indexes
> --
>
> Key: PHOENIX-4042
> URL: https://issues.apache.org/jira/browse/PHOENIX-4042
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 4.12.0
>
> Attachments: PHOENIX-4042.001.patch
>
>
> PHOENIX-3807 has a patch which uses the new hbase-metrics API for capturing 
> Phoenix secondary indexing metrics inside of the RS.
> However, the hbase-metrics API doesn't show up until HBase-1.4. Let's put 
> together an implementation that works with hadoop metrics2 to hold us over 
> until that point.





[jira] [Commented] (PHOENIX-3994) Index RPC priority still depends on the controller factory property in hbase-site.xml

2017-07-12 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16084815#comment-16084815
 ] 

Vincent Poon commented on PHOENIX-3994:
---

[~samarthjain] I looked at the comment you linked to earlier - in 
RegionCoprocessorHost#getTableCoprocessorAttrsFromSchema, it does look like 
they make a copy of the config.  So technically I think you're right that the 
conf is limited in scope, but it's probably still safer to do a copy.  If you 
do it in the HTableFactory you would only have to do it once anyway, and all 
HTables would get it.
The clone in UngroupedAggregateRegionObserver looks like it could be tweaked to 
happen only once; right now it appears to clone every time just to set that 
one property, which is static.

> Index RPC priority still depends on the controller factory property in 
> hbase-site.xml
> -
>
> Key: PHOENIX-3994
> URL: https://issues.apache.org/jira/browse/PHOENIX-3994
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
>Reporter: Sergey Soldatov
>Assignee: Samarth Jain
>Priority: Critical
> Fix For: 4.12.0, 4.11.1
>
> Attachments: PHOENIX-3994_addendum.patch, PHOENIX-3994.patch, 
> PHOENIX-3994_v2.patch, PHOENIX-3994_v3.patch
>
>
> In PHOENIX-3360 we tried to remove the dependency on the 
> hbase.rpc.controllerfactory.class property in hbase-site.xml, since it causes 
> problems on the client side (if the client uses the server-side 
> configuration, all client requests may be sent with index priority). The 
> committed solution sets the controller factory programmatically for the 
> coprocessor environment in the Indexer class, but it turns out that this 
> doesn't work because the environment configuration is not used when the 
> coprocessor connection is created. We need a better solution, since this 
> issue may cause accidental locks and failures that are hard to identify and 
> avoid.





[jira] [Commented] (PHOENIX-3994) Index RPC priority still depends on the controller factory property in hbase-site.xml

2017-07-12 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16084641#comment-16084641
 ] 

Vincent Poon commented on PHOENIX-3994:
---

[~samarthjain] changes LGTM overall, but had one question

{code:title=Indexer.java|borderStyle=solid}
 
env.getConfiguration().setClass(RpcControllerFactory.CUSTOM_CONTROLLER_CONF_KEY,
    InterRegionServerIndexRpcControllerFactory.class,
    RpcControllerFactory.class);
{code}

Would setting this config on the environment cause other coprocessors on the 
same region to use the IndexHandlers?  Perhaps we could avoid that by putting 
the config in your new HTable factory in IndexWriterUtils.  We could clone the 
config there so it doesn't affect anything else.  I was thinking about this 
because I need to try to do something similar in PHOENIX-3948 to avoid side 
effects on the environment.
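
A minimal sketch of the clone-then-override idea, with java.util.Properties 
standing in for the Hadoop Configuration so it runs without HBase on the 
classpath.  The class and method names are hypothetical, and the property key 
is the one named in the issue description below; with a real Configuration 
the copy would presumably be made through its copy constructor instead.

```java
import java.util.Properties;

// Illustrative only: apply an override to a private copy of a shared config.
class ConfigCloneSketch {
    static final String CONTROLLER_KEY = "hbase.rpc.controllerfactory.class";

    // Returns a copy with the override applied; `shared` is left untouched.
    static Properties withIndexController(Properties shared, String controllerClass) {
        Properties copy = new Properties();
        copy.putAll(shared); // copy existing settings into the private instance
        copy.setProperty(CONTROLLER_KEY, controllerClass);
        return copy;
    }
}
```

Because the override lives only in the copy handed to the table factory, 
anything else reading the shared environment configuration keeps whatever 
controller factory it had before.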


> Index RPC priority still depends on the controller factory property in 
> hbase-site.xml
> -
>
> Key: PHOENIX-3994
> URL: https://issues.apache.org/jira/browse/PHOENIX-3994
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.11.0
>Reporter: Sergey Soldatov
>Assignee: Samarth Jain
>Priority: Critical
> Fix For: 4.12.0, 4.11.1
>
> Attachments: PHOENIX-3994_addendum.patch, PHOENIX-3994.patch, 
> PHOENIX-3994_v2.patch, PHOENIX-3994_v3.patch
>
>
> In PHOENIX-3360 we tried to remove the dependency on the 
> hbase.rpc.controllerfactory.class property in hbase-site.xml, since it causes 
> problems on the client side (if the client uses the server-side 
> configuration, all client requests may be sent with index priority). The 
> committed solution sets the controller factory programmatically for the 
> coprocessor environment in the Indexer class, but it turns out that this 
> doesn't work because the environment configuration is not used when the 
> coprocessor connection is created. We need a better solution, since this 
> issue may cause accidental locks and failures that are hard to identify and 
> avoid.





[jira] [Updated] (PHOENIX-4039) Increase default number of RPC retries for our index rebuild task

2017-07-20 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4039:
--
Attachment: PHOENIX-4039_addendum.patch

> Increase default number of RPC retries for our index rebuild task
> -
>
> Key: PHOENIX-4039
> URL: https://issues.apache.org/jira/browse/PHOENIX-4039
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Fix For: 4.12.0, 4.11.1
>
> Attachments: PHOENIX-4039_addendum.patch
>
>
> The default value of 0 is too low for the rebuild index task to even initiate 
> rebuilding.





[jira] [Commented] (PHOENIX-4039) Increase default number of RPC retries for our index rebuild task

2017-07-20 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095429#comment-16095429
 ] 

Vincent Poon commented on PHOENIX-4039:
---

[~jamestaylor] Attached addendum

> Increase default number of RPC retries for our index rebuild task
> -
>
> Key: PHOENIX-4039
> URL: https://issues.apache.org/jira/browse/PHOENIX-4039
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Fix For: 4.12.0, 4.11.1
>
> Attachments: PHOENIX-4039_addendum.patch
>
>
> The default value of 0 is too low for the rebuild index task to even initiate 
> rebuilding.





[jira] [Commented] (PHOENIX-3983) Index rebuild scans should not be using the ServerRpcControllerFactory

2017-06-29 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16068747#comment-16068747
 ] 

Vincent Poon commented on PHOENIX-3983:
---

The patch fixes the data table scans by specifically removing it from the 
controller chain which sets the index priority.  But should we try doing the 
opposite?  That is, instead of making a special case for non-index operations, 
we should make a special case for index operations.

I think the current IndexRpcController is too broad in that it stamps all 
non-system table operations with index priority.  This could have unintended 
consequences for users of HTable who have enabled the Phoenix 
ServerRpcControllerFactory.

For instance, the canary and hbase shell use HTable as well, and they also read 
from the server-side hbase-site.xml.  Those requests would now get handled by 
index handlers.  Furthermore the operator would now have to know that they have 
to adjust the IndexRPC handler pool size if they want to increase the pool for 
server-server rpcs.

Having said that, I don't see an immediate easy solution :).  The easiest would 
be if in IndexRpcController we could determine from a TableName whether it is 
an index, but we would have to pass in a special TableName to the HTable in the 
IndexCommitter and then do an instanceof inside IndexRpcController.
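
To illustrate the marker-TableName idea, here is a minimal sketch. All of the 
types below are hypothetical stand-ins (not the real HBase TableName or the 
Phoenix controller), and the priority values are placeholders:

```java
/** Hypothetical stand-in for a table name; not the real HBase TableName. */
class TableNameSketch {
    final String name;

    TableNameSketch(String name) {
        this.name = name;
    }
}

/** Marker subclass that only the index writer would construct. */
final class IndexTableNameSketch extends TableNameSketch {
    IndexTableNameSketch(String name) {
        super(name);
    }
}

final class IndexPrioritySketch {
    static final int NORMAL_PRIORITY = 0;    // placeholder value
    static final int INDEX_PRIORITY = 1000;  // placeholder value

    /** Index priority is applied only when the caller passed the marker type. */
    static int priorityFor(TableNameSketch table) {
        return (table instanceof IndexTableNameSketch) ? INDEX_PRIORITY : NORMAL_PRIORITY;
    }
}
```

With something like this, a plain HTable user (canary, hbase shell) would keep 
the normal priority, and only writes that go through the index write path would 
be stamped with index priority.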

> Index rebuild scans should not be using the ServerRpcControllerFactory
> --
>
> Key: PHOENIX-3983
> URL: https://issues.apache.org/jira/browse/PHOENIX-3983
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-3983.patch
>
>






[jira] [Commented] (PHOENIX-3983) Index rebuild scans should not be using the ServerRpcControllerFactory

2017-06-30 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070701#comment-16070701
 ] 

Vincent Poon commented on PHOENIX-3983:
---

HBASE-15816 requires us to set the priority on each Operation.  One way to 
maintain backwards compatibility would be to use reflection to dynamically 
check if the method setPriority exists on Operation.  Another way would be to 
have some interface, call it PrioritySetter, which we then call for each 
operation.  Then depending on the HBase version we inject either a 
DummyPrioritySetter or an actual PrioritySetter which calls setPriority on the 
Operation.

Cleanest way code-wise would be to just have another branch.  We could then get 
rid of the existing priority setting stuff and just add the new calls to 
Operation#setPriority.
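
A minimal sketch of how the reflection check and the PrioritySetter interface 
could fit together. Everything here is hypothetical: OldOperation and 
NewOperation are test doubles rather than HBase classes, and the 
setPriority(int) signature is an assumption:

```java
import java.lang.reflect.Method;

/** Hypothetical interface, as floated in the comment above. */
interface PrioritySetter {
    void setPriority(Object operation, int priority);
}

final class PrioritySetters {
    /** Picks an implementation based on whether the class exposes a public setPriority(int). */
    static PrioritySetter forOperationClass(Class<?> operationClass) {
        final Method setter;
        try {
            setter = operationClass.getMethod("setPriority", int.class);
        } catch (NoSuchMethodException e) {
            // Older API without the method: do nothing (the "dummy" setter).
            return (operation, priority) -> { };
        }
        return (operation, priority) -> {
            try {
                setter.invoke(operation, priority);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("setPriority call failed", e);
            }
        };
    }
}

/** Test double for an operation class that predates setPriority. */
class OldOperation { }

/** Test double for an operation class that has setPriority. */
class NewOperation {
    int priority = -1;

    public void setPriority(int priority) {
        this.priority = priority;
    }
}
```

The lookup happens once per class, so the per-operation cost is a single 
reflective invoke (or nothing at all on the older API).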

> Index rebuild scans should not be using the ServerRpcControllerFactory
> --
>
> Key: PHOENIX-3983
> URL: https://issues.apache.org/jira/browse/PHOENIX-3983
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-3983.patch
>
>






[jira] [Commented] (PHOENIX-3983) Index rebuild scans should not be using the ServerRpcControllerFactory

2017-07-02 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071725#comment-16071725
 ] 

Vincent Poon commented on PHOENIX-3983:
---

[~lhofhansl] Yes, check out the controller setPriority logic:
https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/hadoop/hbase/ipc/controller/IndexRpcController.java

First in the chain is MetadataRpcController, then IndexRpcController.

I also confirmed that e.g. hbase shell uses the index handler now.  Basically 
any server-side rpc calls that are not system tables (and not replication, 
because it's a special case as described above), use the index handlers.

> Index rebuild scans should not be using the ServerRpcControllerFactory
> --
>
> Key: PHOENIX-3983
> URL: https://issues.apache.org/jira/browse/PHOENIX-3983
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Samarth Jain
> Attachments: PHOENIX-3983.patch
>
>






[jira] [Commented] (PHOENIX-3806) IndexUpdateManager spending a lot of time sorting mutations on Index rebuild

2017-04-24 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981677#comment-15981677
 ] 

Vincent Poon commented on PHOENIX-3806:
---

Some more detail on this issue:

What we found was that the sorting being done in 
IndexUpdateManager#fixUpCurrentUpdates can potentially be called a large number 
of times, and it seems to only happen when triggered from the 
BuildIndexScheduleTask code path.  In our case the collection being sorted 
had 50001 Deletes because even though the index rebuilding code batches the 
mutations it is replaying, there is a raw scan that happens which can surface 
all the versions for the mutations.  From a quick glance, it seems the raw scan 
is determined by whether BaseScannerRegionObserver.IGNORE_NEWER_MUTATIONS is 
set.

So there are a number of avenues to explore here - can the batching be done 
better?  Do we need all versions?  Certainly it seems the sorting in 
fixUpCurrentUpdates can be done better, by sorting once rather than on every 
loop iteration.  I'll look into that further.
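
As a sketch of the "sort once" idea (the stack trace in the issue shows 
Arrays.sort being reached from SortedCollection.iterator): the class below is 
hypothetical and only illustrates re-sorting lazily, when something was added 
since the last sort, instead of on every iteration request:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical collection that sorts only when its contents changed since the last sort. */
final class LazySortedList {
    private final List<Long> items = new ArrayList<>();
    private boolean dirty = false;
    private int sortCount = 0;

    void add(long value) {
        items.add(value);
        dirty = true;
    }

    /** Returns a read-only sorted view; sorts at most once per batch of additions. */
    List<Long> sortedView() {
        if (dirty) {
            Collections.sort(items);
            sortCount++;
            dirty = false;
        }
        return Collections.unmodifiableList(items);
    }

    /** Number of sorts actually performed, for checking the behavior. */
    int sortCount() {
        return sortCount;
    }
}
```

Asking for the sorted view repeatedly without adding anything in between then 
costs one sort in total rather than one per call.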

Steps to reproduce (perhaps can be converted to a unit test):
1)  Produce a large number of versions for a single row
2)  Disable the index on the table with "alter index <index_name> on <table_name> disable"
3)  Make sure the index_disable_timestamp is set (filed PHOENIX-3810)
4) The rebuilder should trigger the sort for a large number of deletes


> IndexUpdateManager spending a lot of time sorting mutations on Index rebuild
> 
>
> Key: PHOENIX-3806
> URL: https://issues.apache.org/jira/browse/PHOENIX-3806
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>
> Here's the stack trace. The Array contains 50001 Delete Mutations in this 
> case.
> It seems the code is sorting this over and over again.
> {code}
> Thread 170 (B.DefaultRpcServer.handler=67,queue=7,port=60020):
>   State: RUNNABLE
>   Blocked count: 220598
>   Waited count: 377933
>   Stack:
> java.util.TimSort.binarySort(TimSort.java:296)
> java.util.TimSort.sort(TimSort.java:239)
> java.util.Arrays.sort(Arrays.java:1438)
> 
> org.apache.phoenix.hbase.index.covered.update.SortedCollection.iterator(SortedCollection.java:78)
> 
> org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager.fixUpCurrentUpdates(IndexUpdateManager.java:128)
> 
> org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager.addIndexUpdate(IndexUpdateManager.java:115)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addCurrentStateMutationsForBatch(NonTxIndexBuilder.java:333)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addUpdateForGivenTimestamp(NonTxIndexBuilder.java:258)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addMutationsForBatch(NonTxIndexBuilder.java:231)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.batchMutationAndAddUpdates(NonTxIndexBuilder.java:109)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.getIndexUpdate(NonTxIndexBuilder.java:71)
> 
> org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call(IndexBuildManager.java:137)
> 
> org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call(IndexBuildManager.java:133)
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 
> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
> 
> com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:61)
> 
> org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submit(BaseTaskRunner.java:58)
> 
> org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submitUninterruptible(BaseTaskRunner.java:99)
> 
> org.apache.phoenix.hbase.index.builder.IndexBuildManager.getIndexUpdate(IndexBuildManager.java:144)
> 
> org.apache.phoenix.hbase.index.Indexer.preBatchMutateWithExceptions(Indexer.java:324)
> Thread 169 (B.DefaultRpcServer.handler=66,queue=6,port=60020):
> {code}
> [~jamestaylor]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (PHOENIX-3810) index_disable_timestamp not set by ALTER INDEX DISABLE

2017-04-24 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-3810:
-

 Summary: index_disable_timestamp not set by ALTER INDEX DISABLE
 Key: PHOENIX-3810
 URL: https://issues.apache.org/jira/browse/PHOENIX-3810
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.10.0
Reporter: Vincent Poon


index_disable_timestamp seems to be set to null after issuing a "ALTER INDEX 
DISABLE".  As a result, the index rebuilding tasks don't start as expected.





[jira] [Commented] (PHOENIX-2460) Implement scrutiny command to validate whether or not an index is in sync with the data table

2017-08-17 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16131556#comment-16131556
 ] 

Vincent Poon commented on PHOENIX-2460:
---

[~jamestaylor] [~samarthjain] Mind taking a look at this PR when you get time?
https://github.com/apache/phoenix/pull/269

Thanks!

> Implement scrutiny command to validate whether or not an index is in sync 
> with the data table
> -
>
> Key: PHOENIX-2460
> URL: https://issues.apache.org/jira/browse/PHOENIX-2460
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
> Attachments: PHOENIX-2460.patch
>
>
> We should have a process that runs to verify that an index is valid against a 
> data table and potentially fixes it if discrepancies are found. This could 
> either be a MR job or a low priority background task.





[jira] [Commented] (PHOENIX-3824) Mutable Index partial rebuild adds more than one index row for updated data row

2017-05-02 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994307#comment-15994307
 ] 

Vincent Poon commented on PHOENIX-3824:
---

[~lhofhansl] it turned out that the two are related.  Short summary is, 
normally when you do an update to a data table row, in the preBatchMutate hook 
you generate the index update (so you can write it to WAL).  To get the index 
update, you grab the current state of the row (since you're in preBatchMutate, 
it's the pre-update state of the row).  That way, you can figure out the 
existing index row, and issue a Delete for it, and then Put the new index row.

Well when you're doing an index rebuild, all your data table rows are written 
already.  So when you "grab the current state of the row", it's the same as the 
mutation you're replaying.  Since nothing has 'changed', so to speak, the 
delete isn't issued.  Hence you end up with the extra index row.
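
A toy model of the two cases, assuming an index row key of the form 
"value,rowKey". The class and method are hypothetical and much simpler than the 
real index builder:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of index maintenance for one data row with one indexed value. */
final class IndexUpdateSketch {
    /**
     * Builds the index updates for writing newValue, given the row state that
     * was read before generating the update (null if the row did not exist).
     */
    static List<String> indexUpdates(String rowKey, String currentValue, String newValue) {
        List<String> updates = new ArrayList<>();
        if (currentValue != null && !currentValue.equals(newValue)) {
            // The indexed value changed, so the old index row has to be deleted.
            updates.add("Del " + currentValue + "," + rowKey);
        }
        updates.add("Put " + newValue + "," + rowKey);
        return updates;
    }
}
```

In the normal path the state read is the pre-update value, so 
indexUpdates("1", "A", "B") gives a Delete of A,1 followed by a Put of B,1. In 
the rebuild path the data row is already written, so the state read is "B" as 
well, only the Put of B,1 comes out, and the stale A,1 index row is left behind.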

PHOENIX-3806 then gets triggered because there's some logic to handle 
out-of-order updates.  The way they handle out-of-order-updates is, if you get 
a mutation that isn't the latest timestamp (i.e. backwards in time), the code 
then rolls up through each version up to present.  That way you know the present 
index state, and if it has changed, you hide your current (back in time) index 
update by issuing a Delete after your Put.  If you have many versions, this 
"roll up" ends up being done for each one, hence the arithmetic summation 
problem.
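
To put a number on the summation, here is a small sketch. It assumes that 
replaying the i-th oldest of n versions rolls through the n - i newer ones, 
which is my reading of the behavior and not something measured:

```java
/** Counts roll-up steps under the assumption stated in the text above. */
final class RollUpCostSketch {
    /** Total steps if version i (oldest first) walks through the (versions - i) newer ones. */
    static long rollUpSteps(int versions) {
        long steps = 0;
        for (int i = 1; i <= versions; i++) {
            steps += versions - i;
        }
        return steps;
    }

    /** Closed form of the same sum: n(n-1)/2. */
    static long closedForm(int versions) {
        return (long) versions * (versions - 1) / 2;
    }
}
```

For a row with 50001 versions that comes to 1,250,025,000 steps, so the cost 
grows quadratically with the number of versions.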

I believe the simple fix is to make sure you don't scan for newer versions when 
you "grab the current state of the row".  There's actually code that tries to 
do that but I think there's a bug.  I'm still writing proper tests, etc, but I 
think that should fix it.

I haven't figured out PHOENIX-3825, though.  I don't know if the code is built 
to handle that, and actually it's tricky to make it work with this one.

> Mutable Index partial rebuild adds more than one index row for updated data 
> row
> ---
>
> Key: PHOENIX-3824
> URL: https://issues.apache.org/jira/browse/PHOENIX-3824
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Vincent Poon
>
> If you follow this sequence:
> 1) disable index
> 2) write an update to a data table row
> 3) trigger the BuildIndexScheduleTask partial rebuild
> then you end up with two index rows for the one data table row.





[jira] [Commented] (PHOENIX-3847) Handle out of order rows during index maintenance

2017-05-12 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008933#comment-16008933
 ] 

Vincent Poon commented on PHOENIX-3847:
---

[~jamestaylor] Tried this out and the behavior seems to be as you described.  
Is there a reason we don't issue the delete at the old timestamp?  i.e. Delete 
of A,1 at 1000.  If we did that, it seems the result would be the same 
regardless of order?

> Handle out of order rows during index maintenance
> -
>
> Key: PHOENIX-3847
> URL: https://issues.apache.org/jira/browse/PHOENIX-3847
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>
> Based on the investigation and work done in PHOENIX-3825 plus the existence 
> of the ignoreNewerMutations flag, it seems that out of order rows are not 
> handled correctly during index maintenance. When the user handles replaying 
> failed batches, we force them to submit them in timestamp order. As long as 
> the user provides the original timestamp, the order shouldn't matter. 
> Regardless of the order the server processes data table mutations, the 
> resulting index rows should be the same and should purely be based on the 
> cell time stamp of the data rows. Ideally, we shouldn't need the 
> ignoreNewerMutations flag at all. Perhaps that was the intent with 
> IndexUpdateManager.fixUpCurrentUpdates(), but it doesn't seem to be working.
> Would it work to simply generate all the index rows for the mutating data 
> rows for all versions? We should walk through a series of examples to see if 
> this would work.  For example, with the following data table:
> |Type|RowKey|Value|Timestamp
> | Put | 1 | A | 1000
> | Put | 1 | C | 3000
> the index table would look like this:
> |Type|RowKey|Timestamp
> | Put | A,1 | 1000
> | Del | A,1 | 3000
> | Put | C,1 | 3000
> Then if a Put comes in out of order at 2000, the data table would look like 
> this:
> |Type|RowKey|Value|Timestamp
> | Put | 1 | A | 1000
> | Put | 1 | B | 2000
> | Put | 1 | C | 3000
> and the index table should look like this:
> |Type|RowKey|Timestamp
> | Put | A,1 | 1000
> | Del | A,1 | 2000
> | Put | B,1 | 2000
> | Del | B,1 | 3000
> | Put | C,1 | 3000
> Given that we can't reverse Delete markers, I'm not sure we can get there 
> completely. We'd still have a Delete of A,1 @ 3000. But perhaps this is not a 
> problem? We'd need to play this out further and include scenarios with row 
> delete as well.
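
The "should look like" index tables in the quoted description can be derived 
purely from the data row's cell timestamps. A minimal sketch, assuming an index 
row key of "value,rowKey"; the class is hypothetical and ignores row deletes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Derives the ideal index rows for one data row from its versions (timestamp -> value). */
final class IndexTimelineSketch {
    static List<String> expectedIndexRows(String rowKey, Map<Long, String> versions) {
        // Sort by timestamp so the order in which versions arrived does not matter.
        TreeMap<Long, String> byTime = new TreeMap<>(versions);
        List<String> rows = new ArrayList<>();
        String previous = null;
        for (Map.Entry<Long, String> e : byTime.entrySet()) {
            if (previous != null && !previous.equals(e.getValue())) {
                // The older index row stops being valid when the next version starts.
                rows.add("Del " + previous + "," + rowKey + " @ " + e.getKey());
            }
            rows.add("Put " + e.getValue() + "," + rowKey + " @ " + e.getKey());
            previous = e.getValue();
        }
        return rows;
    }
}
```

Feeding in A@1000 and C@3000 gives the three-row index table, and adding B@2000 
afterwards gives the five-row one. This only computes the target state; it does 
not address the already-written Delete of A,1 @ 3000 that cannot be reversed.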





[jira] [Commented] (PHOENIX-3847) Handle out of order rows during index maintenance

2017-05-12 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008938#comment-16008938
 ] 

Vincent Poon commented on PHOENIX-3847:
---

I guess then point queries wouldn't work.  Hmm yea not sure we can get around 
the extra A,1 at 3000

> Handle out of order rows during index maintenance
> -
>
> Key: PHOENIX-3847
> URL: https://issues.apache.org/jira/browse/PHOENIX-3847
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>
> Based on the investigation and work done in PHOENIX-3825 plus the existence 
> of the ignoreNewerMutations flag, it seems that out of order rows are not 
> handled correctly during index maintenance. When the user handles replaying 
> failed batches, we force them to submit them in timestamp order. As long as 
> the user provides the original timestamp, the order shouldn't matter. 
> Regardless of the order the server processes data table mutations, the 
> resulting index rows should be the same and should purely be based on the 
> cell time stamp of the data rows. Ideally, we shouldn't need the 
> ignoreNewerMutations flag at all. Perhaps that was the intent with 
> IndexUpdateManager.fixUpCurrentUpdates(), but it doesn't seem to be working.
> Would it work to simply generate all the index rows for the mutating data 
> rows for all versions? We should walk through a series of examples to see if 
> this would work.  For example, with the following data table:
> |Type|RowKey|Value|Timestamp
> | Put | 1 | A | 1000
> | Put | 1 | C | 3000
> the index table would look like this:
> |Type|RowKey|Timestamp
> | Put | A,1 | 1000
> | Del | A,1 | 3000
> | Put | C,1 | 3000
> Then if a Put comes in out of order at 2000, the data table would look like 
> this:
> |Type|RowKey|Value|Timestamp
> | Put | 1 | A | 1000
> | Put | 1 | B | 2000
> | Put | 1 | C | 3000
> and the index table should look like this:
> |Type|RowKey|Timestamp
> | Put | A,1 | 1000
> | Del | A,1 | 2000
> | Put | B,1 | 2000
> | Del | B,1 | 3000
> | Put | C,1 | 3000
> Given that we can't reverse Delete markers, I'm not sure we can get there 
> completely. We'd still have a Delete of A,1 @ 3000. But perhaps this is not a 
> problem? We'd need to play this out further and include scenarios with row 
> delete as well.





[jira] [Updated] (PHOENIX-3825) Mutable Index rebuild does not write an index version for each data row version

2017-05-11 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3825:
--
Attachment: PHOENIX-3825.master.v1.patch

> Mutable Index rebuild does not write an index version for each data row 
> version
> ---
>
> Key: PHOENIX-3825
> URL: https://issues.apache.org/jira/browse/PHOENIX-3825
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Vincent Poon
> Attachments: PHOENIX-3825.master.v1.patch
>
>
> 1) Write a row
> 2) Disable the index
> 3) write a series of updates to the data row
> 4) trigger the BuildIndexScheduleTask partial rebuild
> The index table will only have 1 new version, whereas the data row had many 
> versions





[jira] [Assigned] (PHOENIX-3825) Mutable Index rebuild does not write an index version for each data row version

2017-05-11 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon reassigned PHOENIX-3825:
-

 Assignee: Vincent Poon
Affects Version/s: 4.11.0
   4.10.0

As I was fixing the integration tests, I realized fixUpCurrentUpdates is also 
used for out-of-order updates.  The logic is arguably too complicated to handle 
that case, and I started refactoring but it was looking like a pretty big code 
change.

I think the simplest fix for now is to check for "ignoreNewerMutations", which 
is true when replaying for partial rebuilds.  In that case, we skip that 
method, and all the index versions get created appropriately.  I added a unit 
test for that so we don't regress.
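
Roughly, the shape of the change is as follows. This is a hypothetical sketch 
and not the patch itself: the pruning step is a simplified stand-in for what 
fixUpCurrentUpdates does, and the class names are made up:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical batch of pending index updates with an optional pruning step. */
final class IndexUpdateBatchSketch {
    /** Minimal update: an index row key plus the timestamp it is written at. */
    static final class Update {
        final String indexRow;
        final long timestamp;

        Update(String indexRow, long timestamp) {
            this.indexRow = indexRow;
            this.timestamp = timestamp;
        }
    }

    private final boolean ignoreNewerMutations;
    private final List<Update> pending = new ArrayList<>();

    IndexUpdateBatchSketch(boolean ignoreNewerMutations) {
        this.ignoreNewerMutations = ignoreNewerMutations;
    }

    void add(Update update) {
        if (!ignoreNewerMutations) {
            // Stand-in for the pruning: drop pending updates to the same index row
            // that the incoming update (same or later timestamp) would hide.
            pending.removeIf(p -> p.indexRow.equals(update.indexRow)
                    && p.timestamp <= update.timestamp);
        }
        pending.add(update);
    }

    int size() {
        return pending.size();
    }
}
```

With the flag set (the partial rebuild replay), every version is kept in the 
batch; without it, older updates to the same index row are collapsed as before.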

> Mutable Index rebuild does not write an index version for each data row 
> version
> ---
>
> Key: PHOENIX-3825
> URL: https://issues.apache.org/jira/browse/PHOENIX-3825
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.10.0, 4.11.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
> Attachments: PHOENIX-3825.master.v1.patch
>
>
> 1) Write a row
> 2) Disable the index
> 3) write a series of updates to the data row
> 4) trigger the BuildIndexScheduleTask partial rebuild
> The index table will only have 1 new version, whereas the data row had many 
> versions





[jira] [Commented] (PHOENIX-3825) Mutable Index rebuild does not write an index version for each data row version

2017-05-09 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16003653#comment-16003653
 ] 

Vincent Poon commented on PHOENIX-3825:
---

[~jamestaylor]  Any idea what this method is for?
IndexUpdateManager#fixUpCurrentUpdates(..)

When I comment out all but the last line of that method, this JIRA is fixed, 
and the normal index updates seem to work as well.

But that method seems to be there for a reason.  From what I can tell, it's 
trying to do the opposite of this JIRA (but I could be wrong).  It seems like 
it's trying to avoid unnecessary index writes.  If there are index mutations 
that are covered/hidden by the current index mutation, then they are marked for 
removal from the batch of index updates.  That way, only the final visible 
mutation is written.

What do you think is the correct behavior here?  Write all versions, or only 
the latest?

> Mutable Index rebuild does not write an index version for each data row 
> version
> ---
>
> Key: PHOENIX-3825
> URL: https://issues.apache.org/jira/browse/PHOENIX-3825
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Vincent Poon
>
> 1) Write a row
> 2) Disable the index
> 3) write a series of updates to the data row
> 4) trigger the BuildIndexScheduleTask partial rebuild
> The index table will only have 1 new version, whereas the data row had many 
> versions





[jira] [Commented] (PHOENIX-3807) Add server level metrics for secondary indexes

2017-05-17 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014516#comment-16014516
 ] 

Vincent Poon commented on PHOENIX-3807:
---

[~elserj] Cool, I'm working on HBASE-18060 which is a backport of "HBASE-9774 
HBase native metrics and metric collection for coprocessors"

With that we'll be able to report these server level metrics through HBase's 
metrics system.  Looking forward to seeing what you have so we can possibly 
integrate the two.

> Add server level metrics for secondary indexes
> --
>
> Key: PHOENIX-3807
> URL: https://issues.apache.org/jira/browse/PHOENIX-3807
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Josh Elser
>
> Add server level metrics for secondary indexes
> - Histogram metrics for time to complete all secondary index updates per 
> primary table update per index type. Will help us trend perf over time and 
> catch impending issues. 
> - Histogram metrics for number of updates dispatched to secondary index 
> stores to service one primary table update per index type. Will help us catch 
> inefficient behavior or problematic schema design. 
> - Count of deployed secondary indexes by type. Will help understand and trend 
> index usage, or catch deploy of an unwanted index type.





[jira] [Commented] (PHOENIX-3825) Mutable Index rebuild does not write an index version for each data row version

2017-05-09 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16003772#comment-16003772
 ] 

Vincent Poon commented on PHOENIX-3825:
---

Actually, looking at TestIndexUpdateManager, looks like the goal was in fact to 
cancel out Puts and Deletes for the same row.  So I guess it's a matter of what 
the correct expected behavior is.

> Mutable Index rebuild does not write an index version for each data row 
> version
> ---
>
> Key: PHOENIX-3825
> URL: https://issues.apache.org/jira/browse/PHOENIX-3825
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Vincent Poon
>
> 1) Write a row
> 2) Disable the index
> 3) write a series of updates to the data row
> 4) trigger the BuildIndexScheduleTask partial rebuild
> The index table will only have 1 new version, whereas the data row had many 
> versions





[jira] [Updated] (PHOENIX-3807) Add server level metrics for secondary indexes

2017-05-31 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3807:
--
Attachment: PHOENIX-3807.v1.master.patch

> Add server level metrics for secondary indexes
> --
>
> Key: PHOENIX-3807
> URL: https://issues.apache.org/jira/browse/PHOENIX-3807
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Josh Elser
> Attachments: PHOENIX-3807.v1.master.patch, Screen Shot 2017-05-31 at 
> 4.00.16 PM.png
>
>
> Add server level metrics for secondary indexes
> - Histogram metrics for time to complete all secondary index updates per 
> primary table update per index type. Will help us trend perf over time and 
> catch impending issues. 
> - Histogram metrics for number of updates dispatched to secondary index 
> stores to service one primary table update per index type. Will help us catch 
> inefficient behavior or problematic schema design. 
> - Count of deployed secondary indexes by type. Will help understand and trend 
> index usage, or catch deploy of an unwanted index type.





[jira] [Updated] (PHOENIX-3807) Add server level metrics for secondary indexes

2017-05-31 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3807:
--
Attachment: (was: PHOENIX-3807.v1.master.patch)

> Add server level metrics for secondary indexes
> --
>
> Key: PHOENIX-3807
> URL: https://issues.apache.org/jira/browse/PHOENIX-3807
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Josh Elser
> Attachments: PHOENIX-3807.v1.master.patch, Screen Shot 2017-05-31 at 
> 4.00.16 PM.png
>
>
> Add server level metrics for secondary indexes
> - Histogram metrics for time to complete all secondary index updates per 
> primary table update per index type. Will help us trend perf over time and 
> catch impending issues. 
> - Histogram metrics for number of updates dispatched to secondary index 
> stores to service one primary table update per index type. Will help us catch 
> inefficient behavior or problematic schema design. 
> - Count of deployed secondary indexes by type. Will help understand and trend 
> index usage, or catch deploy of an unwanted index type.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (PHOENIX-3824) Mutable Index partial rebuild adds more than one index row for updated data row

2017-05-02 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-3824:
-

 Summary: Mutable Index partial rebuild adds more than one index 
row for updated data row
 Key: PHOENIX-3824
 URL: https://issues.apache.org/jira/browse/PHOENIX-3824
 Project: Phoenix
  Issue Type: Bug
Reporter: Vincent Poon


If you follow this sequence:
1) disable index
2) write an update to a data table row
3) trigger the BuildIndexScheduleTask partial rebuild

then you end up with two index rows for the one data table row.





[jira] [Created] (PHOENIX-3825) Mutable Index rebuild does not write an index version for each data row version

2017-05-02 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-3825:
-

 Summary: Mutable Index rebuild does not write an index version for 
each data row version
 Key: PHOENIX-3825
 URL: https://issues.apache.org/jira/browse/PHOENIX-3825
 Project: Phoenix
  Issue Type: Bug
Reporter: Vincent Poon


1) Write a row
2) Disable the index
3) write a series of updates to the data row
4) trigger the BuildIndexScheduleTask partial rebuild

The index table will only have 1 new version, whereas the data row had many 
versions





[jira] [Updated] (PHOENIX-3824) Mutable Index partial rebuild adds more than one index row for updated data row

2017-05-05 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3824:
--
Attachment: PHOENIX-3824.v1.patch

> Mutable Index partial rebuild adds more than one index row for updated data 
> row
> ---
>
> Key: PHOENIX-3824
> URL: https://issues.apache.org/jira/browse/PHOENIX-3824
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.10.0
>Reporter: Vincent Poon
> Attachments: PHOENIX-3824.v1.patch
>
>
> If you follow this sequence:
> 1) disable index
> 2) write an update to a data table row
> 3) trigger the BuildIndexScheduleTask partial rebuild
> then you end up with two index rows for the one data table row.





[jira] [Updated] (PHOENIX-3824) Mutable Index partial rebuild adds more than one index row for updated data row

2017-05-05 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3824:
--
Attachment: PHOENIX-3824.v2.patch

Fixed line lengths and test timeout

> Mutable Index partial rebuild adds more than one index row for updated data 
> row
> ---
>
> Key: PHOENIX-3824
> URL: https://issues.apache.org/jira/browse/PHOENIX-3824
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.10.0
>Reporter: Vincent Poon
> Attachments: PHOENIX-3824.v1.patch, PHOENIX-3824.v2.patch
>
>
> If you follow this sequence:
> 1) disable index
> 2) write an update to a data table row
> 3) trigger the BuildIndexScheduleTask partial rebuild
> then you end up with two index rows for the one data table row.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PHOENIX-3806) IndexUpdateManager spending a lot of time sorting mutations on Index rebuild

2017-05-05 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3806:
--
Attachment: PHOENIX-3806.v1.patch

> IndexUpdateManager spending a lot of time sorting mutations on Index rebuild
> 
>
> Key: PHOENIX-3806
> URL: https://issues.apache.org/jira/browse/PHOENIX-3806
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: PHOENIX-3806.v1.patch
>
>
> Here's the stack trace. The Array contains 50001 Delete Mutations in this 
> case.
> It seems the code is sorting this over and over again.
> {code}
> Thread 170 (B.DefaultRpcServer.handler=67,queue=7,port=60020):
>   State: RUNNABLE
>   Blocked count: 220598
>   Waited count: 377933
>   Stack:
> java.util.TimSort.binarySort(TimSort.java:296)
> java.util.TimSort.sort(TimSort.java:239)
> java.util.Arrays.sort(Arrays.java:1438)
> 
> org.apache.phoenix.hbase.index.covered.update.SortedCollection.iterator(SortedCollection.java:78)
> 
> org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager.fixUpCurrentUpdates(IndexUpdateManager.java:128)
> 
> org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager.addIndexUpdate(IndexUpdateManager.java:115)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addCurrentStateMutationsForBatch(NonTxIndexBuilder.java:333)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addUpdateForGivenTimestamp(NonTxIndexBuilder.java:258)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addMutationsForBatch(NonTxIndexBuilder.java:231)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.batchMutationAndAddUpdates(NonTxIndexBuilder.java:109)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.getIndexUpdate(NonTxIndexBuilder.java:71)
> 
> org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call(IndexBuildManager.java:137)
> 
> org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call(IndexBuildManager.java:133)
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 
> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
> 
> com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:61)
> 
> org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submit(BaseTaskRunner.java:58)
> 
> org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submitUninterruptible(BaseTaskRunner.java:99)
> 
> org.apache.phoenix.hbase.index.builder.IndexBuildManager.getIndexUpdate(IndexBuildManager.java:144)
> 
> org.apache.phoenix.hbase.index.Indexer.preBatchMutateWithExceptions(Indexer.java:324)
> Thread 169 (B.DefaultRpcServer.handler=66,queue=6,port=60020):
> {code}
> [~jamestaylor]





[jira] [Commented] (PHOENIX-3824) Mutable Index partial rebuild adds more than one index row for updated data row

2017-05-08 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001541#comment-16001541
 ] 

Vincent Poon commented on PHOENIX-3824:
---

[~lhofhansl] Yes it's a hot code path in that it's called anytime we want to 
generate an index update.  However the results are cached per mutation, so it's 
only called once for a mutation with many versions.

It's not actually sorting.  There's a call to guava's Ordering#min(), which 
just does an O(N) comparison of cell lists.  And for the comparison, I only 
compare the last cell in each list, since it's assumed they're already ordered 
from newest to oldest.  So overall an O(N) operation where N is your number of 
cell lists (families), not number of cells.  If you have one family with a huge 
number of cells it should return in constant time.
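
As a rough illustration of the lookup described above, here is a minimal Java sketch. It is not the patch code: it models each family's cell list as a plain list of timestamps ordered newest to oldest, and it uses java.util.Collections.min as a stand-in for Guava's Ordering#min so that it runs without extra dependencies. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class OldestCellSketch {

    // Assumption: every list is non-empty and already ordered newest to oldest,
    // so the last element of a list is that family's oldest timestamp.
    static long oldestTimestamp(List<List<Long>> cellListsByFamily) {
        Comparator<List<Long>> byLastCell =
                Comparator.comparingLong((List<Long> cells) -> cells.get(cells.size() - 1));
        // One pass over the lists: the cost grows with the number of families,
        // not with the number of cells inside each family.
        List<Long> oldest = Collections.min(cellListsByFamily, byLastCell);
        return oldest.get(oldest.size() - 1);
    }

    public static void main(String[] args) {
        List<List<Long>> families = Arrays.asList(
                Arrays.asList(30L, 20L),
                Arrays.asList(25L, 5L),
                Arrays.asList(40L));
        System.out.println(oldestTimestamp(families)); // prints 5
    }
}
```

With a single family, only one list is inspected and only its last element is read, which matches the constant-time case mentioned above.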

> Mutable Index partial rebuild adds more than one index row for updated data 
> row
> ---
>
> Key: PHOENIX-3824
> URL: https://issues.apache.org/jira/browse/PHOENIX-3824
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.10.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
> Fix For: 4.11.0
>
> Attachments: PHOENIX-3824.v1.patch, PHOENIX-3824.v2.patch
>
>
> If you follow this sequence:
> 1) disable index
> 2) write an update to a data table row
> 3) trigger the BuildIndexScheduleTask partial rebuild
> then you end up with two index rows for the one data table row.





[jira] [Commented] (PHOENIX-3824) Mutable Index partial rebuild adds more than one index row for updated data row

2017-05-04 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15997685#comment-15997685
 ] 

Vincent Poon commented on PHOENIX-3824:
---

[~jamestaylor] Can you take a look at the logic in the patch?  Previously, when 
we ignoredNewerMutations, we set the scanner max timestamp to the first entry 
in the cell list, which is the newest timestamp.  But it seems that when replaying 
mutations, we should use the oldest timestamp in the current mutation; 
otherwise we'll fetch data that is in the current mutation being replayed.
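
To make the concern above concrete, here is a small Java sketch. It is a simplified model, not Phoenix or HBase code: stored versions are plain timestamps, and the scan is assumed to treat its max timestamp as an exclusive upper bound. All names and numbers are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReplayTimestampSketch {

    // Models a scan whose upper timestamp bound is exclusive (an assumption of
    // this sketch): only versions strictly older than the bound are returned.
    static List<Long> visibleBefore(List<Long> storedTimestamps, long maxTsExclusive) {
        List<Long> visible = new ArrayList<>();
        for (long ts : storedTimestamps) {
            if (ts < maxTsExclusive) {
                visible.add(ts);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        // Hypothetical row: version 10 predates the replayed mutation,
        // versions 20 and 30 were written by the mutation being replayed.
        List<Long> stored = Arrays.asList(10L, 20L, 30L);
        List<Long> replayedNewestFirst = Arrays.asList(30L, 20L);

        long newest = replayedNewestFirst.get(0);
        long oldest = replayedNewestFirst.get(replayedNewestFirst.size() - 1);

        // Bounding by the newest timestamp still exposes version 20,
        // which belongs to the mutation being replayed.
        System.out.println(visibleBefore(stored, newest)); // prints [10, 20]
        // Bounding by the oldest timestamp exposes only the prior state.
        System.out.println(visibleBefore(stored, oldest)); // prints [10]
    }
}
```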

> Mutable Index partial rebuild adds more than one index row for updated data 
> row
> ---
>
> Key: PHOENIX-3824
> URL: https://issues.apache.org/jira/browse/PHOENIX-3824
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Vincent Poon
>
> If you follow this sequence:
> 1) disable index
> 2) write an update to a data table row
> 3) trigger the BuildIndexScheduleTask partial rebuild
> then you end up with two index rows for the one data table row.





[jira] [Commented] (PHOENIX-3806) IndexUpdateManager spending a lot of time sorting mutations on Index rebuild

2017-05-04 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15997680#comment-15997680
 ] 

Vincent Poon commented on PHOENIX-3806:
---

Put up a patch that simply changes the SortedCollection to a TreeSet, so that 
we don't have to do a sort every time we get an iterator on a collection.

For a mutation with 200 versions, this brings the runtime down from 90 seconds 
to around 1 second.

The test for this patch will be part of PHOENIX-3824.  That also fixes the 
arithmetic series summation problem described earlier.
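
The difference between the two approaches can be sketched roughly as follows. This is not the Phoenix SortedCollection or the patch itself: SortOnIterate is a hypothetical model of a collection that re-sorts whenever an iterator is requested, and the TreeSet half shows the alternative, where order is maintained on insert. Note that a TreeSet also drops elements that compare as equal, which a list-backed collection would keep.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

public class SortedIterationSketch {

    /** Hypothetical model: sorts the backing list each time an iterator is requested. */
    static final class SortOnIterate {
        private final List<Integer> items = new ArrayList<>();
        int sortCount = 0;

        void add(int value) {
            items.add(value);
        }

        Iterator<Integer> iterator() {
            Collections.sort(items); // full sort of a list that keeps growing
            sortCount++;
            return items.iterator();
        }
    }

    // Interleaves adds with iterator requests and reports how many sorts ran.
    static int sortsForInterleavedAdds(int n) {
        SortOnIterate collection = new SortOnIterate();
        for (int i = n; i > 0; i--) {
            collection.add(i);
            collection.iterator();
        }
        return collection.sortCount;
    }

    // Same access pattern with a TreeSet: no sort is needed to obtain an iterator.
    static List<Integer> treeSetOrder(int n) {
        TreeSet<Integer> set = new TreeSet<>();
        for (int i = n; i > 0; i--) {
            set.add(i);     // O(log size) insert keeps the set ordered
            set.iterator(); // iterates in order without re-sorting
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(sortsForInterleavedAdds(200)); // prints 200
        System.out.println(treeSetOrder(3));              // prints [1, 2, 3]
    }
}
```

In the first variant each of the 200 iterator requests sorts a list that is one element longer than the last, so the total work adds up across requests; the TreeSet variant pays only for the inserts.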

> IndexUpdateManager spending a lot of time sorting mutations on Index rebuild
> 
>
> Key: PHOENIX-3806
> URL: https://issues.apache.org/jira/browse/PHOENIX-3806
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>
> Here's the stack trace. The Array contains 50001 Delete Mutations in this 
> case.
> It seems the code is sorting this over and over again.
> {code}
> Thread 170 (B.DefaultRpcServer.handler=67,queue=7,port=60020):
>   State: RUNNABLE
>   Blocked count: 220598
>   Waited count: 377933
>   Stack:
> java.util.TimSort.binarySort(TimSort.java:296)
> java.util.TimSort.sort(TimSort.java:239)
> java.util.Arrays.sort(Arrays.java:1438)
> 
> org.apache.phoenix.hbase.index.covered.update.SortedCollection.iterator(SortedCollection.java:78)
> 
> org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager.fixUpCurrentUpdates(IndexUpdateManager.java:128)
> 
> org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager.addIndexUpdate(IndexUpdateManager.java:115)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addCurrentStateMutationsForBatch(NonTxIndexBuilder.java:333)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addUpdateForGivenTimestamp(NonTxIndexBuilder.java:258)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.addMutationsForBatch(NonTxIndexBuilder.java:231)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.batchMutationAndAddUpdates(NonTxIndexBuilder.java:109)
> 
> org.apache.phoenix.hbase.index.covered.NonTxIndexBuilder.getIndexUpdate(NonTxIndexBuilder.java:71)
> 
> org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call(IndexBuildManager.java:137)
> 
> org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call(IndexBuildManager.java:133)
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 
> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
> 
> com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:61)
> 
> org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submit(BaseTaskRunner.java:58)
> 
> org.apache.phoenix.hbase.index.parallel.BaseTaskRunner.submitUninterruptible(BaseTaskRunner.java:99)
> 
> org.apache.phoenix.hbase.index.builder.IndexBuildManager.getIndexUpdate(IndexBuildManager.java:144)
> 
> org.apache.phoenix.hbase.index.Indexer.preBatchMutateWithExceptions(Indexer.java:324)
> Thread 169 (B.DefaultRpcServer.handler=66,queue=6,port=60020):
> {code}
> [~jamestaylor]





[jira] [Updated] (PHOENIX-3807) Add server level metrics for secondary indexes

2017-05-31 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3807:
--
Attachment: PHOENIX-3807.v1.master.patch

Here's a patch which uses the HBase native metrics API.  There's a check for 
HBase version >= 1.4 for backwards compatibility, and a config flag.

> Add server level metrics for secondary indexes
> --
>
> Key: PHOENIX-3807
> URL: https://issues.apache.org/jira/browse/PHOENIX-3807
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Josh Elser
> Attachments: PHOENIX-3807.v1.master.patch
>
>
> Add server level metrics for secondary indexes
> - Histogram metrics for time to complete all secondary index updates per 
> primary table update per index type. Will help us trend perf over time and 
> catch impending issues. 
> - Histogram metrics for number of updates dispatched to secondary index 
> stores to service one primary table update per index type. Will help us catch 
> inefficient behavior or problematic schema design. 
> - Count of deployed secondary indexes by type. Will help understand and trend 
> index usage, or catch deploy of an unwanted index type.





[jira] [Updated] (PHOENIX-3807) Add server level metrics for secondary indexes

2017-05-31 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3807:
--
Attachment: Screen Shot 2017-05-31 at 4.00.16 PM.png

Example pic of what the metrics look like, as viewed from the RS debug dump 
webpage.

> Add server level metrics for secondary indexes
> --
>
> Key: PHOENIX-3807
> URL: https://issues.apache.org/jira/browse/PHOENIX-3807
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Josh Elser
> Attachments: PHOENIX-3807.v1.master.patch, Screen Shot 2017-05-31 at 
> 4.00.16 PM.png
>
>
> Add server level metrics for secondary indexes
> - Histogram metrics for time to complete all secondary index updates per 
> primary table update per index type. Will help us trend perf over time and 
> catch impending issues. 
> - Histogram metrics for number of updates dispatched to secondary index 
> stores to service one primary table update per index type. Will help us catch 
> inefficient behavior or problematic schema design. 
> - Count of deployed secondary indexes by type. Will help understand and trend 
> index usage, or catch deploy of an unwanted index type.





[jira] [Comment Edited] (PHOENIX-3807) Add server level metrics for secondary indexes

2017-05-31 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16032161#comment-16032161
 ] 

Vincent Poon edited comment on PHOENIX-3807 at 5/31/17 11:01 PM:
-

Here's a patch which uses the HBase native metrics API.  There's a check for 
HBase version >= 1.4 for backwards compatibility, and a config flag.
[~jamestaylor]


was (Author: vincentpoon):
Here's a patch which uses the HBase native metrics API.  There's a check for 
HBase version >= 1.4 for backwards compatibility, and a config flag.

> Add server level metrics for secondary indexes
> --
>
> Key: PHOENIX-3807
> URL: https://issues.apache.org/jira/browse/PHOENIX-3807
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Josh Elser
> Attachments: PHOENIX-3807.v1.master.patch
>
>
> Add server level metrics for secondary indexes
> - Histogram metrics for time to complete all secondary index updates per 
> primary table update per index type. Will help us trend perf over time and 
> catch impending issues. 
> - Histogram metrics for number of updates dispatched to secondary index 
> stores to service one primary table update per index type. Will help us catch 
> inefficient behavior or problematic schema design. 
> - Count of deployed secondary indexes by type. Will help understand and trend 
> index usage, or catch deploy of an unwanted index type.





[jira] [Updated] (PHOENIX-4169) Explicitly cap timeout for index disable RPC on compaction

2017-09-11 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4169:
--
Attachment: PHOENIX-4169.master.patch
PHOENIX-4169.0.98.patch

> Explicitly cap timeout for index disable RPC on compaction
> --
>
> Key: PHOENIX-4169
> URL: https://issues.apache.org/jira/browse/PHOENIX-4169
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Critical
> Attachments: PHOENIX-4169.0.98.patch, PHOENIX-4169.master.patch
>
>
> In PHOENIX-3953 we're marking the mutable global index as disabled with an 
> index_disable_timestamp of 0 from the compaction hook. This is potentially a 
> server-to-server RPC, and HConnectionManager#setServerSideHConnectionRetries 
> makes it such that the HBase client config on the server side has 10 times 
> the number of retries, lasting hours.
> To avoid a hung coprocessor hook, we should explicitly cap the number of 
> retries here.
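
A minimal sketch of the capping idea, under stated assumptions: a plain Map stands in for the HBase client configuration, the base retry count of 35 and the cap of 5 are illustrative values only, and the helper name is hypothetical. Only the "hbase.client.retries.number" key and the 10x server-side multiplier come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

public class CappedRetriesSketch {

    static final String RETRIES_KEY = "hbase.client.retries.number";

    // Returns a copy of the settings with the retry count capped at maxRetries.
    // The input map is left untouched so other users of it are unaffected.
    static Map<String, String> withCappedRetries(Map<String, String> conf, int maxRetries) {
        Map<String, String> copy = new HashMap<>(conf);
        int current = Integer.parseInt(
                copy.getOrDefault(RETRIES_KEY, String.valueOf(maxRetries)));
        copy.put(RETRIES_KEY, String.valueOf(Math.min(current, maxRetries)));
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> serverSide = new HashMap<>();
        // Hypothetical numbers: a base of 35 retries, multiplied by 10 on the server side.
        serverSide.put(RETRIES_KEY, String.valueOf(35 * 10));

        Map<String, String> capped = withCappedRetries(serverSide, 5);
        System.out.println(capped.get(RETRIES_KEY));     // prints 5
        System.out.println(serverSide.get(RETRIES_KEY)); // prints 350
    }
}
```

Working on a copy keeps the cap local to the one RPC made from the coprocessor hook instead of changing retry behavior for everything that shares the configuration.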



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4214) Scans which write should not block region split or close

2017-09-15 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168806#comment-16168806
 ] 

Vincent Poon commented on PHOENIX-4214:
---

[~lhofhansl] Yes, this patch blocks only new write requests, and lets existing 
ones finish.  I was thinking that for the case of local index creation, maybe we 
could write to HFiles directly, then bulk load into HBase at the end?  This would 
avoid the flush/split deadlock issue in PHOENIX-3111 altogether.  And HBase 
1.3+ supports replication of bulk loads, I believe.

Normal upsert select and deletes would still need this patch, though.

> Scans which write should not block region split or close
> 
>
> Key: PHOENIX-4214
> URL: https://issues.apache.org/jira/browse/PHOENIX-4214
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
> Attachments: splitDuringUpsertSelect_wip.patch
>
>
> PHOENIX-3111 introduced a scan reference counter which is checked during 
> region preSplit and preClose.  However, a steady stream of UPSERT SELECT or 
> DELETE can keep the count above 0 indefinitely, preventing or greatly 
> delaying a region split or close.
> We should try to avoid starvation of the split / close request, and 
> fail/reject queries where appropriate.





[jira] [Updated] (PHOENIX-4219) Index gets out of sync on HBase 1.x

2017-09-19 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4219:
--
Attachment: PHOENIX-4219_test.patch

> Index gets out of sync on HBase 1.x
> ---
>
> Key: PHOENIX-4219
> URL: https://issues.apache.org/jira/browse/PHOENIX-4219
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
> Attachments: PHOENIX-4219_test.patch
>
>
> When writing batches in parallel with multiple background threads, it seems 
> the index sometimes gets out of sync.  This only happens on the master and 
> 4.x-HBase-1.2.
> The tests pass for 4.x-HBase-0.98.
> See the attached test, which writes with 2 background threads and a batch size 
> of 100.





[jira] [Created] (PHOENIX-4219) Index gets out of sync on HBase 1.x

2017-09-19 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-4219:
-

 Summary: Index gets out of sync on HBase 1.x
 Key: PHOENIX-4219
 URL: https://issues.apache.org/jira/browse/PHOENIX-4219
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.12.0
Reporter: Vincent Poon


When writing batches in parallel with multiple background threads, it seems the 
index sometimes gets out of sync.  This only happens on the master and 
4.x-HBase-1.2.
The tests pass for 4.x-HBase-0.98.

See the attached test, which writes with 2 background threads and a batch size 
of 100.





[jira] [Updated] (PHOENIX-4214) Scans which write should not block region split or close

2017-09-22 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4214:
--
Attachment: PHOENIX-4214.master.v1.patch

Attaching patch for master. [~jamestaylor] [~samarthjain], please review.

I find that if I throw an IOException from doPostScannerOpen, the exception 
gets bubbled back to the client, which gets an IOException wrapped around an 
UnknownScannerException.
Is there a specific exception I should be throwing to get the HBase client to 
retry, without the phoenix client having to handle the exception?

> Scans which write should not block region split or close
> 
>
> Key: PHOENIX-4214
> URL: https://issues.apache.org/jira/browse/PHOENIX-4214
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
> Attachments: PHOENIX-4214.master.v1.patch, 
> splitDuringUpsertSelect_wip.patch
>
>
> PHOENIX-3111 introduced a scan reference counter which is checked during 
> region preSplit and preClose.  However, a steady stream of UPSERT SELECT or 
> DELETE can keep the count above 0 indefinitely, preventing or greatly 
> delaying a region split or close.
> We should try to avoid starvation of the split / close request, and 
> fail/reject queries where appropriate.





[jira] [Commented] (PHOENIX-4220) Upper bound not being used in partial index rebuilder

2017-09-20 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173540#comment-16173540
 ] 

Vincent Poon commented on PHOENIX-4220:
---

Is there a test we can write for this? We didn't have one on PHOENIX-3525, so 
we didn't catch this.
But +1 to the patch!

> Upper bound not being used in partial index rebuilder
> -
>
> Key: PHOENIX-4220
> URL: https://issues.apache.org/jira/browse/PHOENIX-4220
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.12.0
>
> Attachments: PHOENIX-4220.patch
>
>
> Though we're setting the scan upper and lower bound, it's not taking effect 
> in the index rebuilder. Thus, we're rebuilding from the lower bound to the 
> latest timestamp which is bad if the table is taking a lot of writes.





[jira] [Commented] (PHOENIX-4214) Scans which write should not block region split or close

2017-09-21 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174344#comment-16174344
 ] 

Vincent Poon commented on PHOENIX-4214:
---

Was trying to port this to master, but other unrelated tests were failing 
due to PHOENIX-4219.
I'm in the process of extracting the relevant tests just for this JIRA, and will 
hopefully post an update soon.

> Scans which write should not block region split or close
> 
>
> Key: PHOENIX-4214
> URL: https://issues.apache.org/jira/browse/PHOENIX-4214
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
> Attachments: splitDuringUpsertSelect_wip.patch
>
>
> PHOENIX-3111 introduced a scan reference counter which is checked during 
> region preSplit and preClose.  However, a steady stream of UPSERT SELECT or 
> DELETE can keep the count above 0 indefinitely, preventing or greatly 
> delaying a region split or close.
> We should try to avoid starvation of the split / close request, and 
> fail/reject queries where appropriate.





[jira] [Updated] (PHOENIX-3111) Possible Deadlock/delay while building index, upsert select, delete rows at server

2017-09-14 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3111:
--
Attachment: splitDuringUpsertSelect.patch

Here's a patch that has the following tests to reproduce the issue; they're part 
of a larger suite I was writing to test global mutable secondary indexing:
testRegionCloseDuringUpsertSelect
testSplitDuringUpsertSelect

The patch includes something like what [~samarthjain] suggested: when a 
split/close has been requested, I throw IOException for any new incoming scans 
that require a write.  This at least allows the split/close to happen 
eventually, as the scansRefCounter won't go up, while still allowing the 
existing operations in progress to finish.

The loop in preClose is rather dangerous, since if a scanner thread is 
interrupted, there is no guarantee the finally block will run, and so the 
scanRefCounter might never get back to 0.  I encountered this in the test when 
the miniCluster was attempting to shut down; I'm not sure whether there are other 
scenarios where this might happen in actual production usage.  I throw an 
IOException during an interrupt there to avoid this.

Note that you currently can't run all the tests in the suite, so just run the 
individual tests you want to try.

> Possible Deadlock/delay while building index, upsert select, delete rows at 
> server
> --
>
> Key: PHOENIX-3111
> URL: https://issues.apache.org/jira/browse/PHOENIX-3111
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Sergio Peleato
>Assignee: Rajeshbabu Chintaguntla
>Priority: Critical
> Fix For: 4.8.0
>
> Attachments: PHOENIX-3111_addendum.patch, PHOENIX-3111.patch, 
> PHOENIX-3111_v2.patch, splitDuringUpsertSelect.patch
>
>
> There is a possible deadlock while building a local index or running upsert 
> select or delete at the server. The situation might happen as follows.
> In these queries we scan mutations from the table and write them back to the 
> same table, so the memstore might reach the blocking memstore size threshold. 
> RegionTooBusyException might then be thrown back to the client, and the 
> queries might retry scanning.
> Take the local index build case: we first scan from the data table, prepare 
> index mutations, and write them back to the same table.
> The memstore may fill up here as well, in which case we try to flush the 
> region. But if a split happens in between, the split might be waiting for the 
> write lock on the region in order to close it, and the flush waits for the 
> read lock because the write lock is in the queue until the local index build 
> has completed. The local index build won't complete because we are not 
> allowed to write until there is a flush. This might not be a complete 
> deadlock, but the queries might take a lot of time to complete in these cases.
> {noformat}
> "regionserver//192.168.0.53:16201-splits-1469165876186" #269 prio=5 
> os_prio=31 tid=0x7f7fb2050800 nid=0x1c033 waiting on condition 
> [0x000139b68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0006ede72550> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1422)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1370)
> - locked <0x0006ede69d00> (a java.lang.Object)
> at 
> org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:394)
> at 
> org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:278)
> at 
> org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:561)
> at 
> org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82)
> at 
> org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   

[jira] [Updated] (PHOENIX-3111) Possible Deadlock/delay while building index, upsert select, delete rows at server

2017-09-14 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3111:
--
Attachment: (was: splitDuringUpsertSelect.patch)

> Possible Deadlock/delay while building index, upsert select, delete rows at 
> server
> --
>
> Key: PHOENIX-3111
> URL: https://issues.apache.org/jira/browse/PHOENIX-3111
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Sergio Peleato
>Assignee: Rajeshbabu Chintaguntla
>Priority: Critical
> Fix For: 4.8.0
>
> Attachments: PHOENIX-3111_addendum.patch, PHOENIX-3111.patch, 
> PHOENIX-3111_v2.patch
>
>
> There is a possible deadlock while building a local index or running upsert 
> select or delete at the server. The situation might happen as follows.
> In these queries we scan mutations from the table and write them back to the 
> same table, so the memstore might reach the blocking memstore size threshold. 
> RegionTooBusyException might then be thrown back to the client, and the 
> queries might retry scanning.
> Take the local index build case: we first scan from the data table, prepare 
> index mutations, and write them back to the same table.
> The memstore may fill up here as well, in which case we try to flush the 
> region. But if a split happens in between, the split might be waiting for the 
> write lock on the region in order to close it, and the flush waits for the 
> read lock because the write lock is in the queue until the local index build 
> has completed. The local index build won't complete because we are not 
> allowed to write until there is a flush. This might not be a complete 
> deadlock, but the queries might take a lot of time to complete in these cases.
> {noformat}
> "regionserver//192.168.0.53:16201-splits-1469165876186" #269 prio=5 
> os_prio=31 tid=0x7f7fb2050800 nid=0x1c033 waiting on condition 
> [0x000139b68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0006ede72550> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1422)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1370)
> - locked <0x0006ede69d00> (a java.lang.Object)
> at 
> org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:394)
> at 
> org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:278)
> at 
> org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:561)
> at 
> org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82)
> at 
> org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
> - <0x0006ee132098> (a 
> java.util.concurrent.ThreadPoolExecutor$Worker)
> {noformat}
> {noformat}
> "MemStoreFlusher.0" #170 prio=5 os_prio=31 tid=0x7f7fb6842000 nid=0x19303 
> waiting on condition [0x0001388e9000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0006ede72550> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
> at 
> 

[jira] [Created] (PHOENIX-4214) Scans which write should not block region split or close

2017-09-14 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-4214:
-

 Summary: Scans which write should not block region split or close
 Key: PHOENIX-4214
 URL: https://issues.apache.org/jira/browse/PHOENIX-4214
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.12.0
Reporter: Vincent Poon


PHOENIX-3111 introduced a scan reference counter which is checked during region 
preSplit and preClose.  However, a steady stream of UPSERT SELECT or DELETE can 
keep the count above 0 indefinitely, preventing or greatly delaying a region 
split or close.

We should try to avoid starvation of the split / close request, and fail/reject 
queries where appropriate.





[jira] [Updated] (PHOENIX-4214) Scans which write should not block region split or close

2017-09-14 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4214:
--
Attachment: splitDuringUpsertSelect_wip.patch

Here's a patch that has the following tests to reproduce the issue. They're 
part of a larger suite I was writing to test global mutable secondary indexing:
testRegionCloseDuringUpsertSelect
testSplitDuringUpsertSelect
Patch includes something like what Samarth Jain suggested - when a split/close 
has been requested, I throw IOException for any new incoming scans that require 
a write. This at least allows the split/close to happen eventually, as the 
scansRefCounter won't go up, while still allowing for the existing operations 
in progress to finish.
The loop in preClose is rather dangerous, since if a scanner thread is 
interrupted, there is no guarantee the finally block will run, and so the 
scanRefCounter might never get back to 0. I encountered this in the test when 
the miniCluster was attempting to shut down; I'm not sure if there are other 
scenarios where this might happen in actual production usage. I throw an 
IOException during an interrupt there to avoid this.
Note that you currently can't run all the tests in the suite, so just run the 
individual tests you want to try.
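
The approach described above can be sketched roughly as follows. This is a 
minimal illustration, not the code in the attached patch: the class 
WritingScanGate and all of its method names are hypothetical, and a plain int 
guarded by synchronized stands in for the scan reference counter. New writing 
scans are rejected with an IOException once a split/close has been requested, 
scans already in progress are allowed to finish, and an interrupt while waiting 
is turned into an IOException instead of looping forever.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

// Hypothetical helper; names are illustrative and not taken from the patch.
class WritingScanGate {
    private int scansInFlight = 0;          // stands in for the scan reference counter
    private boolean closeRequested = false; // set once a split/close has been requested

    // Called when a scan that writes (UPSERT SELECT, DELETE) starts.
    synchronized void enter() throws IOException {
        if (closeRequested) {
            throw new IOException("Region split/close pending; rejecting new writing scan");
        }
        scansInFlight++;
    }

    // Called from a finally block when the scan finishes.
    synchronized void exit() {
        scansInFlight--;
        notifyAll();
    }

    // Called from the split/close path: stop admitting scans, then wait for the rest.
    synchronized void requestCloseAndAwait() throws IOException {
        closeRequested = true;
        try {
            while (scansInFlight > 0) {
                wait();
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag and give up rather than waiting forever.
            Thread.currentThread().interrupt();
            throw new InterruptedIOException("Interrupted while waiting for scans to finish");
        }
    }

    synchronized int inFlight() {
        return scansInFlight;
    }
}
```

Because no new scans are admitted after the request, the count can only go 
down, so the split/close is no longer starved by a steady stream of queries. A 
real implementation would presumably also need to clear the flag if the split 
or close is abandoned.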

> Scans which write should not block region split or close
> 
>
> Key: PHOENIX-4214
> URL: https://issues.apache.org/jira/browse/PHOENIX-4214
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
> Attachments: splitDuringUpsertSelect_wip.patch
>
>
> PHOENIX-3111 introduced a scan reference counter which is checked during 
> region preSplit and preClose.  However, a steady stream of UPSERT SELECT or 
> DELETE can keep the count above 0 indefinitely, preventing or greatly 
> delaying a region split or close.
> We should try to avoid starvation of the split / close request, and 
> fail/reject queries where appropriate.





[jira] [Commented] (PHOENIX-3111) Possible Deadlock/delay while building index, upsert select, delete rows at server

2017-09-14 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167136#comment-16167136
 ] 

Vincent Poon commented on PHOENIX-3111:
---

Opened PHOENIX-4214; please move the discussion there.

> Possible Deadlock/delay while building index, upsert select, delete rows at 
> server
> --
>
> Key: PHOENIX-3111
> URL: https://issues.apache.org/jira/browse/PHOENIX-3111
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Sergio Peleato
>Assignee: Rajeshbabu Chintaguntla
>Priority: Critical
> Fix For: 4.8.0
>
> Attachments: PHOENIX-3111_addendum.patch, PHOENIX-3111.patch, 
> PHOENIX-3111_v2.patch
>
>
> There is a possible deadlock while building a local index or running UPSERT 
> SELECT or DELETE at the server. It can happen as follows.
> These queries scan rows from a table and write mutations back to the same 
> table, so the memstore may reach the blocking memstore size threshold. A 
> RegionTooBusyException may then be thrown back to the client, and the query 
> may retry the scan.
> Take a local index build as an example: we first scan the data table, prepare 
> index mutations, and write them back to the same table.
> The memstore can fill up here as well, in which case we try to flush the 
> region. But if a split happens in between, the split may be waiting for the 
> write lock on the region in order to close it, while the flush waits for the 
> read lock because the write lock request is queued ahead of it until the 
> local index build completes. The local index build won't complete because 
> writes are not allowed until there is a flush. This might not be a complete 
> deadlock, but the queries may take a long time to complete in these cases.
> {noformat}
> "regionserver//192.168.0.53:16201-splits-1469165876186" #269 prio=5 
> os_prio=31 tid=0x7f7fb2050800 nid=0x1c033 waiting on condition 
> [0x000139b68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0006ede72550> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
> at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1422)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1370)
> - locked <0x0006ede69d00> (a java.lang.Object)
> at 
> org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:394)
> at 
> org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:278)
> at 
> org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:561)
> at 
> org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82)
> at 
> org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
> - <0x0006ee132098> (a 
> java.util.concurrent.ThreadPoolExecutor$Worker)
> {noformat}
> {noformat}
> "MemStoreFlusher.0" #170 prio=5 os_prio=31 tid=0x7f7fb6842000 nid=0x19303 
> waiting on condition [0x0001388e9000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0006ede72550> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
> at 
> 

[jira] [Commented] (PHOENIX-4242) Fix Indexer post-compact hook logging of NPE and TableNotFound

2017-10-04 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192341#comment-16192341
 ] 

Vincent Poon commented on PHOENIX-4242:
---

[~jamestaylor]
In our postCompact hook, we call getTableNoCache which calls 
MetaDataClient#updateCache.  However in there, we bail out early if it's a 
System table, instead of populating the result.
https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java#L604

If I comment out that portion, then my test passes for the NPE part of this 
Jira.  Any idea if/why we need this?

> Fix Indexer post-compact hook logging of NPE and TableNotFound
> --
>
> Key: PHOENIX-4242
> URL: https://issues.apache.org/jira/browse/PHOENIX-4242
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>
> The post-compact hook in the Indexer seems to log extraneous log messages 
> indicating NPE or TableNotFound.  The TableNotFound exceptions seem to 
> indicate actual table names prefixed with MERGE or RESTORE, and sometimes 
> suffixed with a digit, so perhaps these are views or something similar.
> Examples:
> 2017-09-28 13:35:03,118 WARN  [ctions-1506410238599] index.Indexer - Unable 
> to permanently disable indexes being partially rebuild for SYSTEM.SEQUENCE
> java.lang.NullPointerException
> 2017-09-28 10:20:56,406 WARN  [ctions-1506410238415] index.Indexer - Unable 
> to permanently disable indexes being partially rebuild for 
> MERGE_PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA2
> org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table 
> undefined. tableName=MERGE_PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA2





[jira] [Created] (PHOENIX-4279) PhoenixMRJobSubmitter not submitting index rebuild jobs

2017-10-06 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-4279:
-

 Summary: PhoenixMRJobSubmitter not submitting index rebuild jobs
 Key: PHOENIX-4279
 URL: https://issues.apache.org/jira/browse/PHOENIX-4279
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.13.0
Reporter: Vincent Poon


We had an index that was disabled.  We tried to rebuild it with "ALTER INDEX 
my_index ON my_table REBUILD ASYNC".  The index_state was updated to 'b', but 
PhoenixMRJobSubmitter didn't submit rebuild jobs.  The getCandidateJobs query 
didn't return any results.





[jira] [Created] (PHOENIX-4267) Add mutable index chaos tests

2017-10-02 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-4267:
-

 Summary: Add mutable index chaos tests
 Key: PHOENIX-4267
 URL: https://issues.apache.org/jira/browse/PHOENIX-4267
 Project: Phoenix
  Issue Type: Improvement
Reporter: Vincent Poon


Tests that kill regionservers or close regions while batch writes to an indexed 
table are happening.
Index scrutiny is run at the end of each test to verify the index is in sync 
afterwards.





[jira] [Updated] (PHOENIX-4267) Add mutable index chaos tests

2017-10-02 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4267:
--
Attachment: PHOENIX-4267.v1.master.patch

[~jamestaylor] These weren't passing before because of PHOENIX-3112, but should 
pass once that is in.

> Add mutable index chaos tests
> -
>
> Key: PHOENIX-4267
> URL: https://issues.apache.org/jira/browse/PHOENIX-4267
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>
> Tests that kill regionservers or close regions while batch writes to an 
> indexed table are happening.
> Index scrutiny is run at the end of each test to verify the index is in sync 
> afterwards.





[jira] [Updated] (PHOENIX-4267) Add mutable index chaos tests

2017-10-02 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4267:
--
Attachment: (was: PHOENIX-4267.v1.master.patch)

> Add mutable index chaos tests
> -
>
> Key: PHOENIX-4267
> URL: https://issues.apache.org/jira/browse/PHOENIX-4267
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>
> Tests that kill regionservers or close regions while batch writes to an 
> indexed table are happening.
> Index scrutiny is run at the end of each test to verify the index is in sync 
> afterwards.





[jira] [Assigned] (PHOENIX-4267) Add mutable index chaos tests

2017-10-02 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon reassigned PHOENIX-4267:
-

Assignee: Vincent Poon

> Add mutable index chaos tests
> -
>
> Key: PHOENIX-4267
> URL: https://issues.apache.org/jira/browse/PHOENIX-4267
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>
> Tests that kill regionservers or close regions while batch writes to an 
> indexed table are happening.
> Index scrutiny is run at the end of each test to verify the index is in sync 
> afterwards.





[jira] [Updated] (PHOENIX-4267) Add mutable index chaos tests

2017-10-02 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4267:
--
Attachment: PHOENIX-4267.v1.master.patch

> Add mutable index chaos tests
> -
>
> Key: PHOENIX-4267
> URL: https://issues.apache.org/jira/browse/PHOENIX-4267
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Vincent Poon
>Assignee: Vincent Poon
> Attachments: PHOENIX-4267.v1.master.patch
>
>
> Tests that kill regionservers or close regions while batch writes to an 
> indexed table are happening.
> Index scrutiny is run at the end of each test to verify the index is in sync 
> afterwards.





[jira] [Commented] (PHOENIX-4269) IndexScrutinyToolIT is flapping

2017-10-10 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199425#comment-16199425
 ] 

Vincent Poon commented on PHOENIX-4269:
---

ping [~jamestaylor]

> IndexScrutinyToolIT is flapping
> ---
>
> Key: PHOENIX-4269
> URL: https://issues.apache.org/jira/browse/PHOENIX-4269
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.13.0
>Reporter: James Taylor
>Assignee: Vincent Poon
> Attachments: PHOENIX-4269.master.patch
>
>
> In a local test run (not able to repro when run separately), I saw the 
> following failure:
> {code}
> [ERROR] Tests run: 20, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 193.228 s <<< FAILURE! - in org.apache.phoenix.end2end.IndexScrutinyToolIT
> [ERROR] 
> testBothDataAndIndexAsSource[0](org.apache.phoenix.end2end.IndexScrutinyToolIT)
>   Time elapsed: 11.708 s  <<< FAILURE!
> java.lang.AssertionError: expected:<1> but was:<0>
>   at 
> org.apache.phoenix.end2end.IndexScrutinyToolIT.testBothDataAndIndexAsSource(IndexScrutinyToolIT.java:344)
> {code}





[jira] [Commented] (PHOENIX-4242) Fix Indexer post-compact hook logging of NPE and TableNotFound

2017-10-10 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199388#comment-16199388
 ] 

Vincent Poon commented on PHOENIX-4242:
---

[~jamestaylor] Does it matter if the index table gets compacted?  If the data 
table is being compacted while indexes on it are disabled, we don't want to 
remove the version history that is crucial for rebuilding the index.  But does 
it matter if we clear the index table's version history, whether it's disabled 
or not?  We don't seem to ever scan through the index version history anywhere.

> Fix Indexer post-compact hook logging of NPE and TableNotFound
> --
>
> Key: PHOENIX-4242
> URL: https://issues.apache.org/jira/browse/PHOENIX-4242
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
> Attachments: PHOENIX-4747.v1.master.patch
>
>
> The post-compact hook in the Indexer seems to log extraneous log messages 
> indicating NPE or TableNotFound.  The TableNotFound exceptions seem to 
> indicate actual table names prefixed with MERGE or RESTORE, and sometimes 
> suffixed with a digit, so perhaps these are views or something similar.
> Examples:
> 2017-09-28 13:35:03,118 WARN  [ctions-1506410238599] index.Indexer - Unable 
> to permanently disable indexes being partially rebuild for SYSTEM.SEQUENCE
> java.lang.NullPointerException
> 2017-09-28 10:20:56,406 WARN  [ctions-1506410238415] index.Indexer - Unable 
> to permanently disable indexes being partially rebuild for 
> MERGE_PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA2
> org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table 
> undefined. tableName=MERGE_PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA2





[jira] [Updated] (PHOENIX-4242) Fix Indexer post-compact hook logging of NPE and TableNotFound

2017-10-05 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4242:
--
Attachment: PHOENIX-4747.v1.master.patch

[~jamestaylor] PhoenixRuntime#getTableNoCache() isn't working for System 
tables, because we set alwaysHitServer to true when calling updateCache(), 
which skips this logic:
https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java#L584

Seems the simple fix is to set the table in the result on the line further 
down, please see attached patch.

For the TableNotFoundExceptions, I think the reason might be that we're 
creating HBase non-Phoenix tables, which for some reason have the Indexer 
coprocessor loaded.  Any idea how that might happen?  We don't seem to have 
this problem with UngroupedAggregateRegionObserver.

> Fix Indexer post-compact hook logging of NPE and TableNotFound
> --
>
> Key: PHOENIX-4242
> URL: https://issues.apache.org/jira/browse/PHOENIX-4242
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
> Attachments: PHOENIX-4747.v1.master.patch
>
>
> The post-compact hook in the Indexer seems to log extraneous log messages 
> indicating NPE or TableNotFound.  The TableNotFound exceptions seem to 
> indicate actual table names prefixed with MERGE or RESTORE, and sometimes 
> suffixed with a digit, so perhaps these are views or something similar.
> Examples:
> 2017-09-28 13:35:03,118 WARN  [ctions-1506410238599] index.Indexer - Unable 
> to permanently disable indexes being partially rebuild for SYSTEM.SEQUENCE
> java.lang.NullPointerException
> 2017-09-28 10:20:56,406 WARN  [ctions-1506410238415] index.Indexer - Unable 
> to permanently disable indexes being partially rebuild for 
> MERGE_PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA2
> org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table 
> undefined. tableName=MERGE_PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA2





[jira] [Commented] (PHOENIX-4179) Use max timestamp of projected cells for cell timestamp returned to client

2017-09-08 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16158970#comment-16158970
 ] 

Vincent Poon commented on PHOENIX-4179:
---

looks good [~jamestaylor], +1

> Use max timestamp of projected cells for cell timestamp returned to client
> --
>
> Key: PHOENIX-4179
> URL: https://issues.apache.org/jira/browse/PHOENIX-4179
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.12.0
>
> Attachments: PHOENIX-4179.patch
>
>
> The timestamp that we use for the cell that gets serialized back to the 
> client is somewhat random, as it'll be the timestamp of the first cell. 
> Instead, we should use the max timestamp we see across all cells that are 
> projected.





[jira] [Commented] (PHOENIX-3815) Only disable indexes on which write failures occurred

2017-08-29 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16145787#comment-16145787
 ] 

Vincent Poon commented on PHOENIX-3815:
---

[~jamestaylor] now that we've lowered the index write timeout (PHOENIX-3948), I 
suppose TrackingWriter isn't as bad.  However, there are still some corner 
cases where it will be noticeably slower than ParallelWriter.  Imagine all 
threads in the writer pool (default of 10) are in use and there is heavy write 
traffic - the writes will then be executed serially as threads become 
available, and you'd have to wait for each to fail in the worst case where all 
indexes are failing.

But I think these might be relatively rare cases.  Also if the failure policy 
is to disable the index, then it doesn't matter too much.  Either way, we 
should at a minimum extract out the common code in the classes if we're not 
going to remove one - they pretty much do the same thing but use a different 
TaskRunner.

If the failure policy is to leave the index enabled, then you might care about 
failing fast, as you could face repeated failures if something is wrong with 
the index RS.  We could also have a rate counter for # of failures in a given 
time window.  If # of failures exceeds that, we fail fast like ParallelWriter.  
Otherwise, use TrackingWriter logic in the normal happy case.  But again, only 
matters if you plan to leave the index enabled.
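
A rough sketch of the rate counter idea, for discussion only. The class 
FailureRateWindow and its methods are hypothetical and are not part of 
ParallelWriter or TrackingWriter; timestamps are passed in explicitly so the 
behaviour is easy to test. The caller would check shouldFailFast() before 
choosing between the fail-fast path and the tracking path.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sliding-window failure counter; not an existing Phoenix class.
class FailureRateWindow {
    private final int maxFailures;   // failures tolerated within the window
    private final long windowMillis; // length of the time window
    private final Deque<Long> failureTimes = new ArrayDeque<>();

    FailureRateWindow(int maxFailures, long windowMillis) {
        this.maxFailures = maxFailures;
        this.windowMillis = windowMillis;
    }

    // Record one index write failure observed at the given time.
    synchronized void recordFailure(long nowMillis) {
        failureTimes.addLast(nowMillis);
        evictExpired(nowMillis);
    }

    // True when the number of failures inside the window exceeds the limit.
    synchronized boolean shouldFailFast(long nowMillis) {
        evictExpired(nowMillis);
        return failureTimes.size() > maxFailures;
    }

    // Drop failures that are at least windowMillis old.
    private void evictExpired(long nowMillis) {
        while (!failureTimes.isEmpty()
                && nowMillis - failureTimes.peekFirst() >= windowMillis) {
            failureTimes.removeFirst();
        }
    }
}
```

With a limit of 2 failures per 1000 ms, three failures at t=0, 100 and 200 
would switch to fail-fast at t=300, and the writer would fall back to the 
normal path once the older failures age out of the window.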

> Only disable indexes on which write failures occurred
> -
>
> Key: PHOENIX-3815
> URL: https://issues.apache.org/jira/browse/PHOENIX-3815
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
> Fix For: 4.12.0
>
> Attachments: PHOENIX-3815.v1.patch
>
>
> We currently disable all indexes if any of them fail to be written to. We 
> really only should disable the one in which the write failed.





[jira] [Commented] (PHOENIX-4137) Document IndexScrutinyTool

2017-08-29 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146082#comment-16146082
 ] 

Vincent Poon commented on PHOENIX-4137:
---

[~jamestaylor] Attached the documentation update

> Document IndexScrutinyTool
> --
>
> Key: PHOENIX-4137
> URL: https://issues.apache.org/jira/browse/PHOENIX-4137
> Project: Phoenix
>  Issue Type: Task
>Reporter: James Taylor
>Assignee: Vincent Poon
> Fix For: 4.12.0
>
> Attachments: secondary_indexing.md
>
>
> Now that PHOENIX-2460 has been committed, we need to update our website 
> documentation to describe how to use it. For an overview of updating the 
> website, see http://phoenix.apache.org/building_website.html. For 
> IndexScrutinyTool, it's probably enough to add a section in 
> https://phoenix.apache.org/secondary_indexing.html (which lives in 
> ./site/source/src/site/markdown/secondary_indexing.md) describing the purpose 
> and possible arguments to the MR job. Something similar to the table for our 
> bulk loader here: 
> https://phoenix.apache.org/bulk_dataload.html#Loading_via_MapReduce.





[jira] [Updated] (PHOENIX-2460) Implement scrutiny command to validate whether or not an index is in sync with the data table

2017-08-29 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-2460:
--
Attachment: secondary_indexing.md

> Implement scrutiny command to validate whether or not an index is in sync 
> with the data table
> -
>
> Key: PHOENIX-2460
> URL: https://issues.apache.org/jira/browse/PHOENIX-2460
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
> Fix For: 4.12.0
>
> Attachments: PHOENIX-2460.patch, secondary_indexing.md
>
>
> We should have a process that runs to verify that an index is valid against a 
> data table and potentially fixes it if discrepancies are found. This could 
> either be a MR job or a low priority background task.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PHOENIX-3815) Only disable indexes on which write failures occurred

2017-08-29 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3815:
--
Attachment: PHOENIX-3815.0.98.v2.patch
PHOENIX-3815.master.v2.patch

> Only disable indexes on which write failures occurred
> -
>
> Key: PHOENIX-3815
> URL: https://issues.apache.org/jira/browse/PHOENIX-3815
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
> Fix For: 4.12.0
>
> Attachments: PHOENIX-3815.0.98.v2.patch, 
> PHOENIX-3815.master.v2.patch, PHOENIX-3815.v1.patch
>
>
> We currently disable all indexes if any of them fail to be written to. We 
> really only should disable the one in which the write failed.





[jira] [Updated] (PHOENIX-3815) Only disable indexes on which write failures occurred

2017-08-29 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-3815:
--
Attachment: PHOENIX-3815.master.v2.patch

> Only disable indexes on which write failures occurred
> -
>
> Key: PHOENIX-3815
> URL: https://issues.apache.org/jira/browse/PHOENIX-3815
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
> Fix For: 4.12.0
>
> Attachments: PHOENIX-3815.0.98.v2.patch, 
> PHOENIX-3815.master.v2.patch, PHOENIX-3815.v1.patch
>
>
> We currently disable all indexes if any of them fail to be written to. We 
> really only should disable the one in which the write failed.





[jira] [Commented] (PHOENIX-3953) Clear INDEX_DISABLED_TIMESTAMP and disable index on compaction

2017-09-07 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157344#comment-16157344
 ] 

Vincent Poon commented on PHOENIX-3953:
---

lgtm

> Clear INDEX_DISABLED_TIMESTAMP and disable index on compaction
> --
>
> Key: PHOENIX-3953
> URL: https://issues.apache.org/jira/browse/PHOENIX-3953
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
>  Labels: globalMutableSecondaryIndex
> Fix For: 4.12.0
>
> Attachments: PHOENIX-3953_4.x-HBase-0.98_addendum2.patch, 
> PHOENIX-3953_addendum1.patch, PHOENIX-3953_addendum2.patch, 
> PHOENIX-3953.patch, PHOENIX-3953_v2.patch
>
>
> To guard against a compaction occurring (which would potentially clear delete 
> markers and puts that the partial index rebuild process counts on to properly 
> catch up an index with the data table), we should clear the 
> INDEX_DISABLED_TIMESTAMP and mark the index as disabled. This could be done 
> in the post compaction coprocessor hook. At this point, a manual rebuild of 
> the index would be required.





[jira] [Commented] (PHOENIX-3953) Clear INDEX_DISABLED_TIMESTAMP and disable index on compaction

2017-09-06 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155761#comment-16155761
 ] 

Vincent Poon commented on PHOENIX-3953:
---

[~jamestaylor] In PHOENIX-3948, I found that HBase 0.98 has a bug (or 
"feature") where HConnectionManager#setServerSideHConnectionRetries makes it 
such that the HBase client config on the server side has 10 times the number of 
retries, such that it will retry for hours.  It's worth double checking whether 
we need to lower it here too, inside the compaction critical path.

> Clear INDEX_DISABLED_TIMESTAMP and disable index on compaction
> --
>
> Key: PHOENIX-3953
> URL: https://issues.apache.org/jira/browse/PHOENIX-3953
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
>  Labels: globalMutableSecondaryIndex
> Fix For: 4.12.0
>
> Attachments: PHOENIX-3953_addendum1.patch, PHOENIX-3953.patch, 
> PHOENIX-3953_v2.patch
>
>
> To guard against a compaction occurring (which would potentially clear delete 
> markers and puts that the partial index rebuild process counts on to properly 
> catch up an index with the data table), we should clear the 
> INDEX_DISABLED_TIMESTAMP and mark the index as disabled. This could be done 
> in the post compaction coprocessor hook. At this point, a manual rebuild of 
> the index would be required.





[jira] [Updated] (PHOENIX-4181) Drop tenant views columns when base view column is dropped

2017-09-07 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4181:
--
Attachment: PHOENIX-4181_test.master.patch

test to repro the issue

> Drop tenant views columns when base view column is dropped
> --
>
> Key: PHOENIX-4181
> URL: https://issues.apache.org/jira/browse/PHOENIX-4181
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
> Attachments: PHOENIX-4181_test.master.patch
>
>
> If you create a base table, a base view on the base table, and then a tenant 
> view on the base view, when a base view column is dropped, it should get 
> dropped from the tenant views as well.
> This is currently not happening.  See the attached test.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (PHOENIX-4181) Drop tenant views columns when base view column is dropped

2017-09-07 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-4181:
-

 Summary: Drop tenant views columns when base view column is dropped
 Key: PHOENIX-4181
 URL: https://issues.apache.org/jira/browse/PHOENIX-4181
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.12.0
Reporter: Vincent Poon


If you create a base table, a base view on the base table, and then a tenant 
view on the base view, when a base view column is dropped, it should get 
dropped from the tenant views as well.

This is currently not happening.  See the attached test.





[jira] [Updated] (PHOENIX-4169) Explicitly cap timeout for index disable RPC on compaction

2017-09-12 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4169:
--
Attachment: PHOENIX-4169.master.v4.patch
PHOENIX-4169.0.98.v4.patch

[~jamestaylor] updated patch with Indexer changes as well.

I created a separate config with separate timeouts because writes that update 
the index state should have longer timeouts than normal index updates.


> Explicitly cap timeout for index disable RPC on compaction
> --
>
> Key: PHOENIX-4169
> URL: https://issues.apache.org/jira/browse/PHOENIX-4169
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Critical
> Attachments: PHOENIX-4169.0.98.patch, PHOENIX-4169.0.98.v2.patch, 
> PHOENIX-4169.0.98.v3.patch, PHOENIX-4169.0.98.v4.patch, 
> PHOENIX-4169.master.patch, PHOENIX-4169.master.v2.patch, 
> PHOENIX-4169.master.v3.patch, PHOENIX-4169.master.v4.patch
>
>
> In PHOENIX-3953 we're marking the mutable global index as disabled with an 
> index_disable_timestamp of 0 from the compaction hook. This is potentially a 
> server-server RPC, and HConnectionManager#setServerSideHConnectionRetries 
> makes it such that the HBase client config on the server side has 10 times 
> the number of retries, lasting hours.
> To avoid a hung coprocessor hook, we should explicitly cap the number of 
> retries here.





[jira] [Updated] (PHOENIX-4169) Explicitly cap timeout for index disable RPC on compaction

2017-09-12 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4169:
--
Attachment: PHOENIX-4169.0.98.v3.patch
PHOENIX-4169.master.v3.patch

[~samarthjain] Thanks, updated the patch with your suggestion

> Explicitly cap timeout for index disable RPC on compaction
> --
>
> Key: PHOENIX-4169
> URL: https://issues.apache.org/jira/browse/PHOENIX-4169
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Critical
> Attachments: PHOENIX-4169.0.98.patch, PHOENIX-4169.0.98.v2.patch, 
> PHOENIX-4169.0.98.v3.patch, PHOENIX-4169.master.patch, 
> PHOENIX-4169.master.v2.patch, PHOENIX-4169.master.v3.patch
>
>
> In PHOENIX-3953 we're marking the mutable global index as disabled with an 
> index_disable_timestamp of 0 from the compaction hook. This is potentially a 
> server-server RPC, and HConnectionManager#setServerSideHConnectionRetries 
> makes it such that the HBase client config on the server side has 10 times 
> the number of retries, lasting hours.
> To avoid a hung coprocessor hook, we should explicitly cap the number of 
> retries here.





[jira] [Created] (PHOENIX-4169) Explicitly cap timeout for index disable RPC on compaction

2017-09-06 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-4169:
-

 Summary: Explicitly cap timeout for index disable RPC on compaction
 Key: PHOENIX-4169
 URL: https://issues.apache.org/jira/browse/PHOENIX-4169
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.12.0
Reporter: Vincent Poon
Assignee: Vincent Poon


In PHOENIX-3953 we're marking the mutable global index as disabled with an 
index_disable_timestamp of 0 from the compaction hook. This is potentially a 
server-server RPC, and HConnectionManager#setServerSideHConnectionRetries makes 
it such that the HBase client config on the server side has 10 times the number 
of retries, lasting hours.
To avoid a hung coprocessor hook, we should explicitly cap the number of 
retries here.
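
A minimal sketch of the capping idea, not the code from the attached patches. 
It models the client config as a plain map so it runs without HBase on the 
classpath. The two key names are the HBase client settings as best recalled 
and should be checked against the HBase version in use; the default values, 
the cap constant and the class name are assumptions made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a plain map stands in for the HBase Configuration object.
class CappedRetryConfig {
    // HBase client keys as best recalled; verify against your HBase version.
    static final String RETRIES = "hbase.client.retries.number";
    static final String SERVER_MULTIPLIER = "hbase.client.serverside.retries.multiplier";

    // Hypothetical cap for the index-disable RPC; not a real Phoenix setting.
    static final int INDEX_DISABLE_MAX_RETRIES = 2;

    /** Mimics the server-side behaviour: retries are multiplied (assumed default 10x). */
    static int serverSideRetries(Map<String, String> conf) {
        int retries = Integer.parseInt(conf.getOrDefault(RETRIES, "31"));
        int multiplier = Integer.parseInt(conf.getOrDefault(SERVER_MULTIPLIER, "10"));
        return retries * multiplier;
    }

    /** Returns a copy of the config whose effective retry count cannot exceed the cap. */
    static Map<String, String> cappedCopy(Map<String, String> conf) {
        Map<String, String> copy = new HashMap<>(conf);
        int retries = Integer.parseInt(conf.getOrDefault(RETRIES, "31"));
        copy.put(RETRIES, String.valueOf(Math.min(retries, INDEX_DISABLE_MAX_RETRIES)));
        // Neutralise the multiplier so the cap is what actually applies.
        copy.put(SERVER_MULTIPLIER, "1");
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(RETRIES, "35");
        System.out.println("uncapped retries: " + serverSideRetries(conf));
        System.out.println("capped retries:   " + serverSideRetries(cappedCopy(conf)));
    }
}
```

Working on a copy keeps the cap local to the index-disable call, so other 
server-side clients that share the original config are unaffected.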





[jira] [Commented] (PHOENIX-3953) Clear INDEX_DISABLED_TIMESTAMP and disable index on compaction

2017-09-06 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155722#comment-16155722
 ] 

Vincent Poon commented on PHOENIX-3953:
---

[~jamestaylor] Is there a way for us to signal that this happened, perhaps by 
setting the indexDisableTimestamp to a special negative value?  At a minimum, I 
think we should add a log line to indicate this.  Otherwise from an operator 
perspective, we would see a disabled index and have to triage why the rebuilder 
didn't fix it.

> Clear INDEX_DISABLED_TIMESTAMP and disable index on compaction
> --
>
> Key: PHOENIX-3953
> URL: https://issues.apache.org/jira/browse/PHOENIX-3953
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
>  Labels: globalMutableSecondaryIndex
> Fix For: 4.12.0
>
> Attachments: PHOENIX-3953_addendum1.patch, PHOENIX-3953.patch, 
> PHOENIX-3953_v2.patch
>
>
> To guard against a compaction occurring (which would potentially clear delete 
> markers and puts that the partial index rebuild process counts on to properly 
> catch up an index with the data table), we should clear the 
> INDEX_DISABLED_TIMESTAMP and mark the index as disabled. This could be done 
> in the post compaction coprocessor hook. At this point, a manual rebuild of 
> the index would be required.





[jira] [Comment Edited] (PHOENIX-3953) Clear INDEX_DISABLED_TIMESTAMP and disable index on compaction

2017-09-06 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155722#comment-16155722
 ] 

Vincent Poon edited comment on PHOENIX-3953 at 9/6/17 5:17 PM:
---

[~jamestaylor] Is there a way for us to signal that this happened, perhaps by 
setting the indexDisableTimestamp to a special negative value?  At a minimum, I 
think we should add a log line to indicate this.  Otherwise from an operator 
perspective, we would see a disabled index and have to triage why the rebuilder 
didn't fix it.

Perhaps a new index state would make it even clearer.


was (Author: vincentpoon):
[~jamestaylor] Is there a way for us to signal that this happened, perhaps by 
setting the indexDisableTimestamp to a special negative value?  At a minimum, I 
think we should add a log line to indicate this.  Otherwise from an operator 
perspective, we would see a disabled index and have to triage why the rebuilder 
didn't fix it.

> Clear INDEX_DISABLED_TIMESTAMP and disable index on compaction
> --
>
> Key: PHOENIX-3953
> URL: https://issues.apache.org/jira/browse/PHOENIX-3953
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
>  Labels: globalMutableSecondaryIndex
> Fix For: 4.12.0
>
> Attachments: PHOENIX-3953_addendum1.patch, PHOENIX-3953.patch, 
> PHOENIX-3953_v2.patch
>
>
> To guard against a compaction occurring (which would potentially clear delete 
> markers and puts that the partial index rebuild process counts on to properly 
> catch up an index with the data table), we should clear the 
> INDEX_DISABLED_TIMESTAMP and mark the index as disabled. This could be done 
> in the post compaction coprocessor hook. At this point, a manual rebuild of 
> the index would be required.





[jira] [Updated] (PHOENIX-4169) Explicitly cap timeout for index disable RPC on compaction

2017-09-12 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4169:
--
Attachment: PHOENIX-4169.master.v2.patch
PHOENIX-4169.0.98.v2.patch

added a log line for when the index disable timestamp is cleared

> Explicitly cap timeout for index disable RPC on compaction
> --
>
> Key: PHOENIX-4169
> URL: https://issues.apache.org/jira/browse/PHOENIX-4169
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Critical
> Attachments: PHOENIX-4169.0.98.patch, PHOENIX-4169.0.98.v2.patch, 
> PHOENIX-4169.master.patch, PHOENIX-4169.master.v2.patch
>
>
> In PHOENIX-3953 we're marking the mutable global index as disabled with an 
> index_disable_timestamp of 0 from the compaction hook. This is potentially a 
> server-server RPC, and HConnectionManager#setServerSideHConnectionRetries 
> makes it such that the HBase client config on the server side has 10 times 
> the number of retries, lasting hours.
> To avoid a hung coprocessor hook, we should explicitly cap the number of 
> retries here.





[jira] [Commented] (PHOENIX-4214) Scans which write should not block region split or close

2017-09-26 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181554#comment-16181554
 ] 

Vincent Poon commented on PHOENIX-4214:
---

Thanks [~jamestaylor], I'll take a look.  IIRC the WIP was working on 0.98 so 
should be easy to adapt.  I'll try to get a patch up for 0.98 soon.

> Scans which write should not block region split or close
> 
>
> Key: PHOENIX-4214
> URL: https://issues.apache.org/jira/browse/PHOENIX-4214
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
> Attachments: PHOENIX-4214-4.x-HBase-0.98_v1.patch, 
> PHOENIX-4214.master.v1.patch, splitDuringUpsertSelect_wip.patch
>
>
> PHOENIX-3111 introduced a scan reference counter which is checked during 
> region preSplit and preClose.  However, a steady stream of UPSERT SELECT or 
> DELETE can keep the count above 0 indefinitely, preventing or greatly 
> delaying a region split or close.
> We should try to avoid starvation of the split / close request, and 
> fail/reject queries where appropriate.





[jira] [Created] (PHOENIX-4238) Add support for salted and shared index tables to IndexScrutinyTool MR

2017-09-26 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-4238:
-

 Summary: Add support for salted and shared index tables to 
IndexScrutinyTool MR
 Key: PHOENIX-4238
 URL: https://issues.apache.org/jira/browse/PHOENIX-4238
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.12.0
Reporter: Vincent Poon
Assignee: Vincent Poon


The IndexScrutinyTool MR job doesn't work for salted and shared index tables.  
We should add support for this, similar to PHOENIX-4233.





[jira] [Comment Edited] (PHOENIX-3815) Only disable indexes on which write failures occurred

2017-09-26 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181579#comment-16181579
 ] 

Vincent Poon edited comment on PHOENIX-3815 at 9/26/17 9:09 PM:


+1, [~jamestaylor]


was (Author: vincentpoon):
+1, @James Taylor

> Only disable indexes on which write failures occurred
> -
>
> Key: PHOENIX-3815
> URL: https://issues.apache.org/jira/browse/PHOENIX-3815
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
> Fix For: 4.12.0
>
> Attachments: PHOENIX-3815.0.98.v2.patch, 
> PHOENIX-3815.master.v2.patch, PHOENIX-3815.v1.patch, PHOENIX-3815_v3.patch
>
>
> We currently disable all indexes if any of them fail to be written to. We 
> really only should disable the one in which the write failed.





[jira] [Commented] (PHOENIX-3815) Only disable indexes on which write failures occurred

2017-09-26 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181579#comment-16181579
 ] 

Vincent Poon commented on PHOENIX-3815:
---

+1, @James Taylor

> Only disable indexes on which write failures occurred
> -
>
> Key: PHOENIX-3815
> URL: https://issues.apache.org/jira/browse/PHOENIX-3815
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
> Fix For: 4.12.0
>
> Attachments: PHOENIX-3815.0.98.v2.patch, 
> PHOENIX-3815.master.v2.patch, PHOENIX-3815.v1.patch, PHOENIX-3815_v3.patch
>
>
> We currently disable all indexes if any of them fail to be written to. We 
> really only should disable the one in which the write failed.





[jira] [Commented] (PHOENIX-4233) IndexScrutiny test tool does not work for salted and shared index tables

2017-09-26 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181562#comment-16181562
 ] 

Vincent Poon commented on PHOENIX-4233:
---

 [~jamestaylor] Created PHOENIX-4238 to add this to the MR-based scrutiny tool

> IndexScrutiny test tool does not work for salted and shared index tables
> 
>
> Key: PHOENIX-4233
> URL: https://issues.apache.org/jira/browse/PHOENIX-4233
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: James Taylor
> Fix For: 4.12.0
>
> Attachments: PHOENIX-4233.patch
>
>
> Our IndexScrutiny test-only tool does not handle salted tables or local or 
> view indexes correctly.





[jira] [Created] (PHOENIX-4242) Fix Indexer post-compact hook logging of NPE and TableNotFound

2017-09-28 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-4242:
-

 Summary: Fix Indexer post-compact hook logging of NPE and 
TableNotFound
 Key: PHOENIX-4242
 URL: https://issues.apache.org/jira/browse/PHOENIX-4242
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.12.0
Reporter: Vincent Poon


The post-compact hook in the Indexer seems to log extraneous log messages 
indicating NPE or TableNotFound.  The TableNotFound exceptions seem to indicate 
actual table names prefixed with MERGE or RESTORE, and sometimes suffixed with 
a digit, so perhaps these are views or something similar.
Examples:
2017-09-28 13:35:03,118 WARN  [ctions-1506410238599] index.Indexer - Unable to 
permanently disable indexes being partially rebuild for SYSTEM.SEQUENCE
java.lang.NullPointerException
2017-09-28 10:20:56,406 WARN  [ctions-1506410238415] index.Indexer - Unable to 
permanently disable indexes being partially rebuild for 
MERGE_PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA2
org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table 
undefined. tableName=MERGE_PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA2






[jira] [Commented] (PHOENIX-4219) Index gets out of sync on HBase 1.x

2017-09-28 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16185148#comment-16185148
 ] 

Vincent Poon commented on PHOENIX-4219:
---

Yea I've been looking and seeing weird things as well
- The test always seems to end with same number of data and index table rows.  
When I examine the bad rows from the scrutiny output, there is one data row 
without an index row, and there is an index row with bad covered col values, so 
probably an extra index row for a given data table row.
- It doesn't seem to be a batching thing, because I get a failure even if I set 
batch size to 1, or get rid of executeBatch() altogether and just use execute() 
for each row.


> Index gets out of sync on HBase 1.x
> ---
>
> Key: PHOENIX-4219
> URL: https://issues.apache.org/jira/browse/PHOENIX-4219
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Priority: Blocker
> Attachments: PHOENIX-4219_test.patch, PHOENIX-4219_test_v2.patch
>
>
> When writing batches in parallel with multiple background threads, it seems 
> the index sometimes gets out of sync.  This only happens on the master and 
> 4.x-HBase-1.2.
> The tests pass for 4.x-HBase-0.98
> See the attached test, which writes with 2 background threads with batch size 
> of 100.





[jira] [Commented] (PHOENIX-4219) Index gets out of sync on HBase 1.x

2017-09-28 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16185178#comment-16185178
 ] 

Vincent Poon commented on PHOENIX-4219:
---

It seems to be a problem handling certain input.
It's more a function of totalRows than of nBatches, nRowsPerBatch, or nThreads.  
If I play with any combination of the variables such that totalRows = 1200, I 
always get a failure, with the id of the orphaned data row always being the 
same.
If totalRows is 1100 or less, the test always seems to pass.

The id is sequential, and the random number generator has a seed of 0.  So the 
input will always be the same from run to run.  Something in the input past 
1200 totalRows is causing things to fail.

However, if I use a totalRows greater than 1200, the id of the orphaned row 
changes (though it stays the same from run to run when totalRows is held 
constant).
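
To illustrate why a fixed seed makes the input repeatable, here is a small 
standalone sketch. It is not the attached test; the class name, the value 
range and the helper are hypothetical. It shows two properties relied on 
above: the same seed gives the same sequence on every run, and a larger 
totalRows only appends rows to the input produced by a smaller one.

```java
import java.util.Arrays;
import java.util.Random;

// Sketch only: stands in for the test's row generator, which is not shown here.
class SeededInput {
    /** Generates one pseudo-random value per sequential id, using a fixed seed. */
    static int[] generate(long seed, int totalRows) {
        Random rng = new Random(seed);
        int[] values = new int[totalRows];
        for (int id = 0; id < totalRows; id++) {
            values[id] = rng.nextInt(1000); // hypothetical value range
        }
        return values;
    }

    public static void main(String[] args) {
        int[] first = generate(0L, 1200);
        int[] second = generate(0L, 1200);
        // Same seed, same totalRows: identical input from run to run.
        System.out.println("repeatable: " + Arrays.equals(first, second));

        int[] shorter = generate(0L, 1100);
        // The 1100-row input is a prefix of the 1200-row input.
        System.out.println("prefix:     "
                + Arrays.equals(shorter, Arrays.copyOf(first, 1100)));
    }
}
```

Because the shorter run is a prefix of the longer one, a failure that appears 
only at 1200 rows points at the rows added after 1100, or at something that 
depends on the total volume, rather than at run-to-run randomness.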

> Index gets out of sync on HBase 1.x
> ---
>
> Key: PHOENIX-4219
> URL: https://issues.apache.org/jira/browse/PHOENIX-4219
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Priority: Blocker
> Attachments: PHOENIX-4219_test.patch, PHOENIX-4219_test_v2.patch
>
>
> When writing batches in parallel with multiple background threads, it seems 
> the index sometimes gets out of sync.  This only happens on the master and 
> 4.x-HBase-1.2.
> The tests pass for 4.x-HBase-0.98
> See the attached test, which writes with 2 background threads with batch size 
> of 100.





[jira] [Assigned] (PHOENIX-4263) Add test for partial index rebuild of index on view

2017-09-30 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon reassigned PHOENIX-4263:
-

Assignee: Vincent Poon

> Add test for partial index rebuild of index on view
> ---
>
> Key: PHOENIX-4263
> URL: https://issues.apache.org/jira/browse/PHOENIX-4263
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Vincent Poon
>
> I don't think we have a test case for this, but we should add one.





[jira] [Assigned] (PHOENIX-4242) Fix Indexer post-compact hook logging of NPE and TableNotFound

2017-09-30 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon reassigned PHOENIX-4242:
-

Assignee: Vincent Poon

> Fix Indexer post-compact hook logging of NPE and TableNotFound
> --
>
> Key: PHOENIX-4242
> URL: https://issues.apache.org/jira/browse/PHOENIX-4242
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>
> The post-compact hook in the Indexer seems to log extraneous log messages 
> indicating NPE or TableNotFound.  The TableNotFound exceptions seem to 
> indicate actual table names prefixed with MERGE or RESTORE, and sometimes 
> suffixed with a digit, so perhaps these are views or something similar.
> Examples:
> 2017-09-28 13:35:03,118 WARN  [ctions-1506410238599] index.Indexer - Unable 
> to permanently disable indexes being partially rebuild for SYSTEM.SEQUENCE
> java.lang.NullPointerException
> 2017-09-28 10:20:56,406 WARN  [ctions-1506410238415] index.Indexer - Unable 
> to permanently disable indexes being partially rebuild for 
> MERGE_PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA2
> org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table 
> undefined. tableName=MERGE_PLATFORM_ENTITY.PLATFORM_IMMUTABLE_ENTITY_DATA2





[jira] [Commented] (PHOENIX-4238) MR IndexScrutinyTool break with salted tables and indexes on views

2017-09-29 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16186189#comment-16186189
 ] 

Vincent Poon commented on PHOENIX-4238:
---

lgtm, thanks [~churromorales]

> MR IndexScrutinyTool break with salted tables and indexes on views
> --
>
> Key: PHOENIX-4238
> URL: https://issues.apache.org/jira/browse/PHOENIX-4238
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Assignee: churro morales
> Attachments: PHOENIX-4238.patch, PHOENIX-4238.v1.patch
>
>
> The IndexScrutinyTool MR job doesn't work for salted and shared index 
> tables.  We should add support for this, similar to PHOENIX-4233.





[jira] [Commented] (PHOENIX-4219) Index gets out of sync on HBase 1.x

2017-09-29 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16186748#comment-16186748
 ] 

Vincent Poon commented on PHOENIX-4219:
---

Still haven't figured this one out, but it seems to be a problem with the 
scrutiny index query.  If you run with totalRows=1200, it always fails on id 
704, complaining that comment_count is null.
But when I query the index table manually for that row, it looks fine.
In fact, if I take the exact same index table query from the scrutiny and 
append where ":ORGANIZATION_ID"='704' to it, the comment_count looks fine.

So it seems to be a problem while iterating through the result set of the index 
table query.  At a certain point, it comes back with a null for the 
comment_count, despite what's actually in the table.
Perhaps there's something wrong with the batching of the results.

> Index gets out of sync on HBase 1.x
> ---
>
> Key: PHOENIX-4219
> URL: https://issues.apache.org/jira/browse/PHOENIX-4219
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Priority: Blocker
> Attachments: PHOENIX-4219_test.patch, PHOENIX-4219_test_v2.patch
>
>
> When writing batches in parallel with multiple background threads, it seems 
> the index sometimes gets out of sync.  This only happens on the master and 
> 4.x-HBase-1.2.
> The tests pass for 4.x-HBase-0.98
> See the attached test, which writes with 2 background threads with batch size 
> of 100.





[jira] [Assigned] (PHOENIX-4238) Add support for salted and shared index tables to IndexScrutinyTool MR

2017-09-27 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon reassigned PHOENIX-4238:
-

Assignee: churro morales  (was: Vincent Poon)

> Add support for salted and shared index tables to IndexScrutinyTool MR
> --
>
> Key: PHOENIX-4238
> URL: https://issues.apache.org/jira/browse/PHOENIX-4238
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Assignee: churro morales
>
> The IndexScrutinyTool MR job doesn't work for salted and shared index 
> tables.  We should add support for this, similar to PHOENIX-4233.





[jira] [Updated] (PHOENIX-4214) Scans which write should not block region split or close

2017-09-27 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated PHOENIX-4214:
--
Attachment: PHOENIX-4214.master.v2.patch
PHOENIX-4214-0.98-v2.patch

[~jamestaylor] Attaching updated v2 patches for master and 0.98.  Tests pass 
on 4.x-HBase-1.1 and 4.x-HBase-0.98.

Main changes were 
- moving the scanReferenceCounter increment back into the try/finally block to 
ensure the scanner gets closed, by introducing a new boolean.
- setting the client retries limit properly, and tweaking the test timeouts

BTW, with [~samarthjain]'s help, I found that the HBase client does retry 
transparently when the exception is thrown from doPostScannerOpen while a 
region is splitting/closing, as long as the client retry settings are set 
properly.
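
The general pattern can be sketched as follows. This is an illustration under 
assumptions, not the code in the v2 patches: ScanGate, its methods and the 
acquired flag are hypothetical names. New writing scans are rejected once a 
split or close has been requested, so the count can drain to zero, and the 
boolean records whether the counter was incremented so the finally block only 
decrements what it took.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: hypothetical stand-in for the region-level scan reference counter.
class ScanGate {
    private final AtomicInteger activeScans = new AtomicInteger();
    private volatile boolean closing = false;

    /** Tries to register a writing scan; fails once a split/close is pending. */
    boolean tryBeginScan() {
        if (closing) {
            return false;
        }
        activeScans.incrementAndGet();
        if (closing) { // a close request raced with us: back out
            activeScans.decrementAndGet();
            return false;
        }
        return true;
    }

    void endScan() {
        activeScans.decrementAndGet();
    }

    /** Called from the preSplit/preClose path: stop admitting new scans. */
    void requestClose() {
        closing = true;
    }

    /** True once a close was requested and all admitted scans have finished. */
    boolean canClose() {
        return closing && activeScans.get() == 0;
    }

    /** Runs the work under the gate; returns false if the scan was rejected. */
    static boolean runWritingScan(ScanGate gate, Runnable work) {
        boolean acquired = false;
        try {
            acquired = gate.tryBeginScan();
            if (!acquired) {
                return false; // caller (or the client retry logic) tries again later
            }
            work.run();
            return true;
        } finally {
            if (acquired) {
                gate.endScan(); // release only what was actually taken
            }
        }
    }

    public static void main(String[] args) {
        ScanGate gate = new ScanGate();
        System.out.println("before close: " + runWritingScan(gate, () -> { }));
        gate.requestClose();
        System.out.println("after close:  " + runWritingScan(gate, () -> { }));
        System.out.println("can close:    " + gate.canClose());
    }
}
```

A rejected scan surfaces to the client as a retriable failure, which is why 
the client retry settings matter for this approach.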

> Scans which write should not block region split or close
> 
>
> Key: PHOENIX-4214
> URL: https://issues.apache.org/jira/browse/PHOENIX-4214
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.12.0
>Reporter: Vincent Poon
>Assignee: Vincent Poon
> Attachments: PHOENIX-4214-0.98-v2.patch, 
> PHOENIX-4214-4.x-HBase-0.98_v1.patch, PHOENIX-4214.master.v1.patch, 
> PHOENIX-4214.master.v2.patch, splitDuringUpsertSelect_wip.patch
>
>
> PHOENIX-3111 introduced a scan reference counter which is checked during 
> region preSplit and preClose.  However, a steady stream of UPSERT SELECT or 
> DELETE can keep the count above 0 indefinitely, preventing or greatly 
> delaying a region split or close.
> We should try to avoid starvation of the split / close request, and 
> fail/reject queries where appropriate.




