[jira] [Created] (HBASE-28567) Race condition causes MetaRegionLocationCache to never set watcher to populate meta location

2024-05-05 Thread Vincent Poon (Jira)
Vincent Poon created HBASE-28567:


 Summary: Race condition causes MetaRegionLocationCache to never 
set watcher to populate meta location
 Key: HBASE-28567
 URL: https://issues.apache.org/jira/browse/HBASE-28567
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.5.8, 3.0.0
Reporter: Vincent Poon
Assignee: Vincent Poon


{{ZKWatcher#getMetaReplicaNodesAndWatchChildren()}} attempts to set a watch 
on the children of the base /hbase znode using 
{{ZKUtil.listChildrenAndWatchForNewChildren()}}, but if the node does not 
exist, no watch gets set.

We've seen this in the test container Trino uses over at 
[trino/21569|https://github.com/trinodb/trino/pull/21569] , where ZK, master, 
and RS are all run in the same container.
The fix is to throw if the node does not exist so that 
{{MetaRegionLocationCache}} can retry until the node gets created.
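
A minimal sketch of the retry idea described above, not the actual patch. Every name in it ({{FakeZk}}, {{NodeMissingException}}, {{watchWithRetry}}) is a hypothetical stand-in; no HBase or ZooKeeper classes are used. It assumes the listing call throws when the znode is absent, so the caller knows no watch was set and can try again.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins only; none of these are HBase or ZooKeeper classes.
class WatchRetrySketch {
  /** Signals that the znode is absent, which means no watch was registered. */
  static class NodeMissingException extends RuntimeException {
    NodeMissingException(String path) {
      super("znode does not exist: " + path);
    }
  }

  /** Toy in-memory "ZooKeeper": a set of existing paths and a set of watched paths. */
  static class FakeZk {
    final Set<String> nodes = ConcurrentHashMap.newKeySet();
    final Set<String> watched = ConcurrentHashMap.newKeySet();

    /** Throws instead of silently returning when the node is missing. */
    List<String> listChildrenAndWatch(String path) {
      if (!nodes.contains(path)) {
        throw new NodeMissingException(path);
      }
      watched.add(path);
      return List.of();
    }
  }

  /** Retries until the watch is set or attempts run out; returns the attempts used. */
  static int watchWithRetry(FakeZk zk, String path, int maxAttempts, Runnable betweenAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    NodeMissingException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        zk.listChildrenAndWatch(path);
        return attempt;
      } catch (NodeMissingException e) {
        last = e;
        betweenAttempts.run(); // e.g. sleep or back off before the next try
      }
    }
    throw last;
  }
}
```

With the silent-return behaviour, the first call would appear to succeed and the cache would never learn that the watch is missing; throwing makes the failure visible to the retry loop.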



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-20034) Make periodic flusher delay configurable

2018-02-20 Thread Vincent Poon (JIRA)
Vincent Poon created HBASE-20034:


 Summary: Make periodic flusher delay configurable
 Key: HBASE-20034
 URL: https://issues.apache.org/jira/browse/HBASE-20034
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 3.0.0
Reporter: Vincent Poon
Assignee: Vincent Poon


PeriodicMemstoreFlusher is currently configured to flush with a random delay of 
up to 5 minutes.  Make this configurable.
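
A rough sketch of what a configurable delay could look like, using plain {{java.util.Properties}} rather than the HBase configuration API. The key name {{example.periodic.flush.max.delay.ms}} and the class are made up for illustration; only the 5 minute default comes from the description above.

```java
import java.util.Properties;
import java.util.Random;

// Illustrative only: the key name below is hypothetical, not an HBase property.
class FlushDelaySketch {
  static final String MAX_DELAY_KEY = "example.periodic.flush.max.delay.ms";
  static final long DEFAULT_MAX_DELAY_MS = 5 * 60 * 1000L; // the current fixed 5 minutes

  /** Reads the upper bound for the delay, falling back to the 5 minute default. */
  static long maxDelayMs(Properties conf) {
    String raw = conf.getProperty(MAX_DELAY_KEY);
    return raw == null ? DEFAULT_MAX_DELAY_MS : Long.parseLong(raw.trim());
  }

  /** Picks a random delay in [0, max); a non-positive max disables the delay. */
  static long randomDelayMs(Properties conf, Random rng) {
    long max = maxDelayMs(conf);
    if (max <= 0) {
      return 0L;
    }
    return (long) (rng.nextDouble() * max);
  }
}
```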





[jira] [Created] (HBASE-18060) Backport to branch-1 HBASE-9774 HBase native metrics and metric collection for coprocessors

2017-05-16 Thread Vincent Poon (JIRA)
Vincent Poon created HBASE-18060:


 Summary: Backport to branch-1 HBASE-9774 HBase native metrics and 
metric collection for coprocessors
 Key: HBASE-18060
 URL: https://issues.apache.org/jira/browse/HBASE-18060
 Project: HBase
  Issue Type: New Feature
Affects Versions: 1.4.0, 1.3.2, 1.5.0
Reporter: Vincent Poon
Assignee: Vincent Poon


I'd like to explore backporting HBASE-9774 to branch-1, as the ability for 
coprocessors to report custom metrics through HBase is useful for us, and if we 
have coprocessors use the native API, a re-write won't be necessary after an 
upgrade to 2.0.

The main issues I see so far are:
- the usage of Java 8 language features.  Seems we can work around this as most 
of it is syntactic sugar
- dropwizard 3.1.2 in Master.  branch-1 is still on yammer metrics 2.2.  Not 
sure if these can coexist just for this feature





[jira] [Created] (HBASE-18026) ProtobufUtil seems to do extra array copying

2017-05-10 Thread Vincent Poon (JIRA)
Vincent Poon created HBASE-18026:


 Summary: ProtobufUtil seems to do extra array copying
 Key: HBASE-18026
 URL: https://issues.apache.org/jira/browse/HBASE-18026
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0, 1.3.2
Reporter: Vincent Poon
Priority: Minor


In ProtobufUtil, the protobuf fields are copied into an array using 
toByteArray().  These are then passed into the KeyValue constructor which does 
another copy.

It seems we can avoid a copy here by using 
HBaseZeroCopyByteString#zeroCopyGetBytes()?





[jira] [Created] (HBASE-17341) Add a timeout during replication endpoint termination

2016-12-19 Thread Vincent Poon (JIRA)
Vincent Poon created HBASE-17341:


 Summary: Add a timeout during replication endpoint termination
 Key: HBASE-17341
 URL: https://issues.apache.org/jira/browse/HBASE-17341
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.2.4, 0.98.23, 1.1.7, 2.0.0, 1.3.0, 1.4.0
Reporter: Vincent Poon
Priority: Critical


In ReplicationSource#terminate(), a Future is obtained from 
ReplicationEndpoint#stop().  Future.get() is then called without a timeout, so 
it can hang indefinitely if something goes wrong in the endpoint's stop().

Hanging there has serious implications, because the thread could potentially be 
the ZK event thread (e.g. watcher calls ReplicationSourceManager#removePeer() 
-> ReplicationSource#terminate() -> blocked).  This means no other events in 
the ZK event queue will get processed, which for HBase means other ZK watches 
such as replication watch notifications, snapshot watch notifications, even 
RegionServer shutdown will all get blocked.

The short term fix addressed here is to simply add a timeout for Future.get().  
But the severe consequences seen here perhaps suggest a broader refactoring of 
the ZKWatcher usage in HBase is in order, to protect against situations like 
this.
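
As an illustration of the short term fix, here is a small sketch of a bounded wait on a Future using the standard {{Future.get(long, TimeUnit)}} overload. The class and method names ({{BoundedStopSketch}}, {{awaitStop}}) are hypothetical and this is not the committed patch.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; shows the bounded-wait pattern only.
class BoundedStopSketch {
  /** Returns true if the stop finished in time, false if it timed out or failed. */
  static boolean awaitStop(Future<?> stopFuture, long timeoutMs) {
    try {
      stopFuture.get(timeoutMs, TimeUnit.MILLISECONDS);
      return true;
    } catch (TimeoutException e) {
      // Give up rather than block the calling (possibly ZK event) thread forever.
      stopFuture.cancel(true);
      return false;
    } catch (ExecutionException e) {
      // The stop itself failed; the caller can log and move on.
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return false;
    }
  }
}
```

Returning a boolean instead of rethrowing keeps the caller's thread moving in every case, which is the property that matters when that thread is shared with other event handlers.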





[jira] [Created] (HBASE-17328) Properly dispose of looped replication peers

2016-12-16 Thread Vincent Poon (JIRA)
Vincent Poon created HBASE-17328:


 Summary: Properly dispose of looped replication peers
 Key: HBASE-17328
 URL: https://issues.apache.org/jira/browse/HBASE-17328
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.98.23, 2.0.0, 1.4.0
Reporter: Vincent Poon


When adding a looped replication peer (clusterId == peerClusterId), the 
following code terminates the replication source thread, but since the source 
manager still holds a reference, WALs continue to get enqueued, and never get 
cleaned because they're stuck in the queue, leading to an unsustainable 
buildup.  Furthermore, the replication statistics thread will continue to print 
statistics for the terminated source.

{code}
if (clusterId.equals(peerClusterId)
    && !replicationEndpoint.canReplicateToSameCluster()) {
  this.terminate("ClusterId " + clusterId + " is replicating to itself: peerClusterId "
      + peerClusterId + " which is not allowed by ReplicationEndpoint:"
      + replicationEndpoint.getClass().getName(), null, false);
}
{code}
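
One possible shape for the cleanup, sketched with toy classes. {{Source}}, {{Manager}} and their methods are hypothetical stand-ins rather than the real ReplicationSource and ReplicationSourceManager, and the {{canReplicateToSameCluster()}} check is left out for brevity. The point is only that the manager drops its reference when the source is terminated, so nothing further is enqueued for it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model; these are not the HBase replication classes.
class LoopedPeerSketch {
  static class Source {
    final String peerClusterId;
    final Deque<String> walQueue = new ArrayDeque<>();
    boolean terminated;

    Source(String peerClusterId) {
      this.peerClusterId = peerClusterId;
    }
  }

  static class Manager {
    final String clusterId;
    final List<Source> sources = new ArrayList<>();

    Manager(String clusterId) {
      this.clusterId = clusterId;
    }

    /** Adds a peer; a looped peer is terminated and removed. Returns whether it was kept. */
    boolean addPeer(String peerClusterId) {
      Source source = new Source(peerClusterId);
      sources.add(source);
      if (clusterId.equals(peerClusterId)) {
        source.terminated = true;
        sources.remove(source); // drop the reference so no WALs pile up for it
        return false;
      }
      return true;
    }

    /** Enqueues a new WAL for every source the manager still tracks. */
    void enqueueWal(String walName) {
      for (Source s : sources) {
        s.walQueue.add(walName);
      }
    }

    /** Total number of WALs waiting across all tracked sources. */
    int queuedWals() {
      int total = 0;
      for (Source s : sources) {
        total += s.walQueue.size();
      }
      return total;
    }
  }
}
```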





[jira] [Created] (HBASE-15995) Separate replication WAL reading from shipping

2016-06-08 Thread Vincent Poon (JIRA)
Vincent Poon created HBASE-15995:


 Summary: Separate replication WAL reading from shipping
 Key: HBASE-15995
 URL: https://issues.apache.org/jira/browse/HBASE-15995
 Project: HBase
  Issue Type: Sub-task
  Components: Replication
Affects Versions: 2.0.0
Reporter: Vincent Poon


Currently ReplicationSource reads edits from the WAL and ships them in the same 
thread.

By breaking out the reading from the shipping, we can introduce greater 
parallelism and lay the foundation for further refactoring to a pipelined, 
streaming model.
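
To make the proposed split concrete, here is a small producer/consumer sketch: one thread "reads" batches of edits and hands them to a second thread that "ships" them through a bounded queue. The class {{ReadShipSketch}}, the string-based edits and the sentinel are all assumptions made for illustration; the real design may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative reader/shipper split; edits are plain strings here.
class ReadShipSketch {
  // Sentinel batch, compared by identity, that tells the shipper to stop.
  private static final List<String> END = new ArrayList<>();

  /** Runs a reader and a shipper thread and returns the edits in shipped order. */
  static List<String> replicate(List<List<String>> batches, int queueCapacity) {
    BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(queueCapacity);
    List<String> shipped = Collections.synchronizedList(new ArrayList<>());

    Thread reader = new Thread(() -> {
      try {
        for (List<String> batch : batches) {
          queue.put(batch); // blocks when the shipper falls behind (backpressure)
        }
        queue.put(END);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "reader");

    Thread shipper = new Thread(() -> {
      try {
        for (List<String> batch = queue.take(); batch != END; batch = queue.take()) {
          shipped.addAll(batch); // stand-in for sending the batch to the peer
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "shipper");

    reader.start();
    shipper.start();
    try {
      reader.join();
      shipper.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for threads", e);
    }
    return new ArrayList<>(shipped);
  }
}
```

The bounded queue lets reading run ahead of shipping by a fixed number of batches while keeping memory use capped, and a single reader with a single shipper preserves edit order.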


