[jira] [Commented] (HBASE-9855) evictBlocksByHfileName improvement for bucket cache

2013-10-29 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808166#comment-13808166
 ] 

Alex Feinberg commented on HBASE-9855:
--

Some comments from me (original author of this code in 89-fb):

1) This should be annotated as Threadsafe
2) Nit-pick (this is my own typo): "comparator specified when the class 
instance was constructor" -> "when the class instance was _constructed_"

Addressing Ted's comments:

1) [~te...@apache.org] - re: making DefaultValueSetFactory private -- Yes, 
since it's a static inner class it might as well be private.

2) Depends -- you need ImmutableList.copyOf() for iteration. This is generally 
the contract of most other collections in java.util, which would otherwise 
throw ConcurrentModificationException. Returning the results as a set keeps 
membership tests efficient, and ImmutableList.copyOf is the cheapest way to 
take a snapshot for iteration.
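The copy-for-iteration contract can be sketched like so, with 
java.util.List.copyOf standing in for Guava's ImmutableList.copyOf (the class 
and method names here are illustrative, not the actual patch):

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Membership tests hit the live set directly, while iteration walks an
// immutable snapshot, so concurrent mutation cannot trigger
// ConcurrentModificationException in callers.
class SnapshotIteration {
    private final Set<String> blocks = ConcurrentHashMap.newKeySet();

    public boolean add(String name)      { return blocks.add(name); }
    public boolean contains(String name) { return blocks.contains(name); } // O(1)

    // Cheap immutable copy for callers that need to iterate safely.
    public List<String> snapshot() {
        return List.copyOf(blocks);
    }
}
```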

Other:

[~xieliang007] Can you look through the findbugs warnings? I'd think they are 
mostly red herrings, but I'd double-check whether the equals()/hashCode() ones 
are relevant. 

Thanks for porting this over!

> evictBlocksByHfileName improvement for bucket cache
> ---
>
> Key: HBASE-9855
> URL: https://issues.apache.org/jira/browse/HBASE-9855
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HBase-9855.txt, HBase-9855-v2.txt
>
>
> Indeed, it comes from fb's L2 cache, [~avf]'s nice work; I just did a 
> simple backport here. It improves eviction from a linear-time search through 
> the whole cache map to a log-time map lookup.
> I ran a small benchmark; it showed a bit of GC overhead, but considering the 
> evict-on-close triggered by frequent compaction activity, that seems 
> reasonable.
> I also thought about adding an "evictOnClose" config to the BucketCache ctor 
> and only updating the new index map while evictOnClose is true. That value 
> can be set per family schema, but BucketCache is a global instance rather 
> than per-family, so I'm ignoring it for now...



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-9855) evictBlocksByHfileName improvement for bucket cache

2013-10-29 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13807741#comment-13807741
 ] 

Alex Feinberg commented on HBASE-9855:
--

Yes, this was a big improvement. [~xieliang007] -- I am also still working on 
digging up the JVM GC settings that I used. 

Feel free to put me on code review for this.

- af

> evictBlocksByHfileName improvement for bucket cache
> ---
>
> Key: HBASE-9855
> URL: https://issues.apache.org/jira/browse/HBASE-9855
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HBase-9855.txt
>





[jira] [Commented] (HBASE-8894) Forward port compressed l2 cache from 0.89fb

2013-10-23 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803923#comment-13803923
 ] 

Alex Feinberg commented on HBASE-8894:
--

Hi [~xieliang007]

This configuration creates way too much memory pressure. I'd also suggest 
using Java 7 (this is the setup that I used at FB). 

I'll try to come up with the actual JVM options I used, but roughly:

- Total JVM heap size: ~14gb (Xmx and Xms)
- Fraction of heap available to the L1 cache (the regular block cache): 
0.4-0.5 (we never went above 0.5, as that caused too many problems)
- New gen size: 4gb (I *think*, not too sure)
- Direct memory: 10gb (I am roughly scaling down to your machine -- you want 
to leave some memory available to the OS)
- I gave 0.9 of direct memory to the L2 cache

I did use CMS, but I don't remember the CMS initiating occupancy fraction. G1 
might also work.
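For concreteness, a hypothetical reconstruction of the flags implied by those 
numbers -- every value here is a placeholder scaled from my rough recollection, 
not the actual FB configuration:

```shell
# Hypothetical JVM options matching the rough numbers above; treat every
# value as a placeholder, not production settings.
#   -Xms/-Xmx: total heap (~14g)      -Xmn: new gen (~4g, uncertain)
#   MaxDirectMemorySize: off-heap room for the L2 cache (~10g)
#   CMSInitiatingOccupancyFraction: not recalled; 70 is a guess
HBASE_REGIONSERVER_OPTS="-Xms14g -Xmx14g -Xmn4g \
  -XX:MaxDirectMemorySize=10g \
  -XX:+UseConcMarkSweepGC \
  -XX:CMSInitiatingOccupancyFraction=70"
echo "$HBASE_REGIONSERVER_OPTS"
```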

I will try to find the exact JVM configuration.

How much memory did you give to memstore?

Thanks!
- af


> Forward port compressed l2 cache from 0.89fb
> 
>
> Key: HBASE-8894
> URL: https://issues.apache.org/jira/browse/HBASE-8894
> Project: HBase
>  Issue Type: New Feature
>Reporter: stack
>Assignee: Liang Xie
>Priority: Critical
> Attachments: HBASE-8894-0.94-v1.txt, HBASE-8894-0.94-v2.txt
>
>
> Forward port Alex's improvement on hbase-7407 from 0.89-fb branch:
> {code}
> r1492797 | liyin | 2013-06-13 11:18:20 -0700 (Thu, 13 Jun 2013) | 43 lines
>
> [master] Implements a secondary compressed cache (L2 cache)
>
> Author: avf
>
> Summary:
> This revision implements a compressed and encoded second-level cache with
> off-heap (and optionally on-heap) storage and a bucket allocator based on
> HBASE-7404.
>
> BucketCache from HBASE-7404 is extensively modified to:
>
> * Only handle byte arrays (i.e., no more serialization/deserialization within)
> * Remove persistence support for the time being
> * Keep an index of hfilename to blocks for efficient eviction on close
>
> A new interface (L2Cache) is introduced in order to separate it from the
> current implementation. The L2 cache is then integrated into the classes that
> handle reading from and writing to HFiles to allow cache-on-write as well as
> cache-on-read. Metrics for the L2 cache are integrated into
> RegionServerMetrics much in the same fashion as metrics for the existing (L1)
> BlockCache.
>
> Additionally, the CacheConfig class is refactored to configure the L2 cache,
> replace multiple constructors with a Builder, and replace static methods
> for instantiating the caches with abstract factories (with singleton
> implementations for both the existing LruBlockCache and the newly introduced
> BucketCache-based L2 cache)
>
> Test Plan:
> 1) Additional unit tests
> 2) Stress test on a single devserver
> 3) Test on a single node in a shadow cluster
> 4) Test on a whole shadow cluster
>
> Revert Plan:
>
> Reviewers: liyintang, aaiyer, rshroff, manukranthk, adela
>
> Reviewed By: liyintang
>
> CC: gqchen, hbase-eng@
>
> Differential Revision: https://phabricator.fb.com/D837264
>
> Task ID: 2325295
>
> r1492340 | liyin | 2013-06-12 11:36:03 -0700 (Wed, 12 Jun 2013) | 21 lines
> {code}





[jira] [Commented] (HBASE-8894) Forward port compressed l2 cache from 0.89fb

2013-10-23 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803249#comment-13803249
 ] 

Alex Feinberg commented on HBASE-8894:
--

Hi [~saint@gmail.com] [~xieliang007],

I was going to reply but didn't have a chance.

1) What JVM settings are you using? These are very important. I do not recall 
seeing many full GCs. Let me know what parameters you pass to the JVM in 
regards to memory and GC.

2) What settings are you using for the L2 cache as well as the normal L1 cache? 
Can you paste the settings from the needed config files?

3) Are you using JDK 6 or JDK 7?

4) What about writes? This should change write performance quite a bit -- as 
serialization costs are also incurred on writes.

Thanks!
- af

> Forward port compressed l2 cache from 0.89fb
> 
>
> Key: HBASE-8894
> URL: https://issues.apache.org/jira/browse/HBASE-8894
> Project: HBase
>  Issue Type: New Feature
>Reporter: stack
>Assignee: Liang Xie
>Priority: Critical
> Attachments: HBASE-8894-0.94-v1.txt, HBASE-8894-0.94-v2.txt
>





[jira] [Commented] (HBASE-8894) Forward port compressed l2 cache from 0.89fb

2013-10-01 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783441#comment-13783441
 ] 

Alex Feinberg commented on HBASE-8894:
--

Vladimir,

Those are very legitimate issues:

1) One approach to the on-heap keys (not an issue in my setup, as I was not 
using the file-based cache, but certainly an issue with Fusion-io) could be to 
use a hash table (with an extension array) or, in cases where the block index 
is not expected to fit in RAM, a b-tree over direct/memory-mapped byte 
buffers. This would be tricky to implement, but it has been done:
https://github.com/jankotek/MapDB/tree/master/src/main/java/org/mapdb

2) The eviction algorithm is indeed primitive (and also high on the priority 
list of things to fix), but as far as I recall, eviction ( freeSpace() here -- 
https://github.com/apache/hbase/blob/0.89-fb/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java#L488-625
 ) only blocks draining to the ioEngine -- in other words, while cache space 
is being freed you can still read from the cache (this uses striped locking), 
and writes will enter RAMCache and be queued for the ioEngine ( 
https://github.com/apache/hbase/blob/0.89-fb/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java#L648-660
 ). All ioEngine draining threads will be blocked during eviction, however -- 
that may be more problematic for file-based caches, since a long drain can 
cause a lot of entries to build up in RAMCache. If the queue is full the 
writing threads will block, but you can configure a maximum wait -- and this 
doesn't affect actual writes to HBase, as with L2Cache writes only happen 
during flushes (i.e., flushes will take longer if they happen during eviction).
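A toy model of that bounded write path -- the class, the queue capacity, and 
the wait bound are all illustrative, not taken from BucketCache:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Cached writes queue up for the drain threads; when the queue is full the
// writer waits up to a configurable maximum instead of blocking forever.
class WriteQueueSketch {
    private final BlockingQueue<byte[]> ramQueue;
    private final long maxWaitMs;

    WriteQueueSketch(int capacity, long maxWaitMs) {
        this.ramQueue = new ArrayBlockingQueue<>(capacity);
        this.maxWaitMs = maxWaitMs;
    }

    // Returns false if the drain threads are stalled (e.g. during a long
    // eviction) and the bounded wait expired -- the caller can then skip
    // caching the block rather than stall the flush indefinitely.
    public boolean cacheBlock(byte[] block) {
        try {
            return ramQueue.offer(block, maxWaitMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Called by a drain thread; null when nothing is queued.
    public byte[] drainOne() {
        return ramQueue.poll();
    }
}
```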

Thanks,
- af

> Forward port compressed l2 cache from 0.89fb
> 
>
> Key: HBASE-8894
> URL: https://issues.apache.org/jira/browse/HBASE-8894
> Project: HBase
>  Issue Type: New Feature
>Reporter: stack
>Assignee: Liang Xie
>Priority: Critical
> Fix For: 0.98.0
>





[jira] [Commented] (HBASE-8894) Forward port compressed l2 cache from 0.89fb

2013-10-01 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783370#comment-13783370
 ] 

Alex Feinberg commented on HBASE-8894:
--

Keep in mind that this is itself based on HBASE-7404. Since I wanted to get 
this out the door quickly, I kept some of the package/class names similar to 
the original HBASE-7404 ones -- so we'd want to rename them. 

The big difference is that I've removed any kind of SerDe and changed the 
flow. The flow is now:

Read:

1. Check if item is in smaller L1 block cache (traditional BlockCache in JVM 
heap)
2. If not, check if it's in L2 cache
3. Otherwise, go to disk

Flush:

1. Write the already compressed and serialized data to L2 cache along with disk.
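A minimal sketch of this read/flush flow -- the String keys, byte[] values, 
and method names are illustrative stand-ins, not the actual HBase interfaces:

```java
import java.util.HashMap;
import java.util.Map;

// Read path: on-heap L1 first, then the compressed L2, then disk.
// Flush path: already-compressed bytes go to L2 alongside disk.
class TieredReadSketch {
    private final Map<String, byte[]> l1 = new HashMap<>(); // deserialized blocks
    private final Map<String, byte[]> l2 = new HashMap<>(); // compressed blocks

    public byte[] read(String blockKey) {
        byte[] block = l1.get(blockKey);      // 1. small on-heap L1
        if (block != null) return block;
        block = l2.get(blockKey);             // 2. larger compressed L2
        if (block != null) return decompress(block);
        return readFromDisk(blockKey);        // 3. fall back to HDFS
    }

    public void flush(String blockKey, byte[] compressed) {
        l2.put(blockKey, compressed);         // cache-on-write
        writeToDisk(blockKey, compressed);
    }

    private byte[] decompress(byte[] b) { return b; }      // stub
    private byte[] readFromDisk(String k) { return null; } // stub
    private void writeToDisk(String k, byte[] b) { }       // stub
}
```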


Basically, here the L2 cache replaces the OS page cache and allows for a 
smaller L1 cache. This should help performance in several ways:

1. Compared to the page cache, there's the ability (using a map I keep) to 
evict all the blocks associated with a given file when it's compacted. There's 
also no page-cache pollution from compaction reads or HDFS replication traffic 
(already a 3X gain in efficiency). The latter, however, is also true of 
HBASE-7404.

2. Compared to HBASE-7404, there's the ability to keep very hot blocks (both 
data and meta blocks) in the regular BlockCache, which becomes the L1 cache. 
That avoids serialization costs for those blocks, unlike keeping only meta 
blocks (or all blocks) in the compressed/serialized cache.

Basically this gives you a "better page cache" (potentially persistent if 
other IO engines are introduced, finer-grained eviction/control than fadvise, 
etc.). The proper ratio of L1 to L2 cache (including the direct memory 
available for the JVM's use vs. the GC'd heap size) is still to be determined, 
but some math can be done here based on expected cache hit ratios and the 
costs of hits/misses to the different caches.

There are also a few other low-hanging fruit that could be addressed in my diff:

* Sending blocks evicted from L1 directly to L2
* Evicting blocks from the L2 cache upon promotion to the L1 cache
* Porting and testing the file based IO engine (e.g., for fusionIO cards)

Thanks!
- af

> Forward port compressed l2 cache from 0.89fb
> 
>
> Key: HBASE-8894
> URL: https://issues.apache.org/jira/browse/HBASE-8894
> Project: HBase
>  Issue Type: New Feature
>Reporter: stack
>Assignee: Liang Xie
>Priority: Critical
> Fix For: 0.98.0
>





[jira] [Updated] (HBASE-8237) Integrate HDFS request profiling with HBase request profiling

2013-07-01 Thread Alex Feinberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Feinberg updated HBASE-8237:
-

Assignee: Liyin Tang  (was: Alex Feinberg)

> Integrate HDFS request profiling with HBase request profiling
> -
>
> Key: HBASE-8237
> URL: https://issues.apache.org/jira/browse/HBASE-8237
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.89-fb
>Reporter: Alex Feinberg
>Assignee: Liyin Tang
> Fix For: 0.89-fb
>
>
> Since the building blocks to retrieve the RegionServer/DataNode profiling 
> data are done (in Facebook's HDFS branch -- the changes are/will be posted 
> to Github soon), it would be great to integrate them together, so that the 
> HBase client can get not only the RegionServer metrics but also the DataNode 
> status. It will offer the client a much clearer end-to-end view, including 
> the disk/network-level details for each request.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-8237) Integrate HDFS request profiling with HBase request profiling

2013-04-01 Thread Alex Feinberg (JIRA)
Alex Feinberg created HBASE-8237:


 Summary: Integrate HDFS request profiling with HBase request 
profiling
 Key: HBASE-8237
 URL: https://issues.apache.org/jira/browse/HBASE-8237
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.89-fb
Reporter: Alex Feinberg
Assignee: Alex Feinberg
 Fix For: 0.89-fb


Since the building blocks to retrieve the RegionServer/DataNode profiling data 
are done (in Facebook's HDFS branch -- the changes are/will be posted to 
Github soon), it would be great to integrate them together, so that the HBase 
client can get not only the RegionServer metrics but also the DataNode status. 
It will offer the client a much clearer end-to-end view, including the 
disk/network-level details for each request.




[jira] [Commented] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-12-04 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510158#comment-13510158
 ] 

Alex Feinberg commented on HBASE-5991:
--

Oh great! Didn't know there was a hackathon going on.

I actually looked at the Curator code, but found it a bit of an overkill for 
this specific use case (particularly because we already had an implementation 
of the recovery logic in RecoveringZooKeeper -- so we'd either have to migrate 
wholesale or keep two implementations of the same code). I did borrow a few 
ideas from there (even if I didn't follow the exact logic, I believe), so it 
wasn't purely from the wiki plus scratch. 

After I wrote this patch, we also open-sourced a library that Puma and several 
other apps use to handle ZK. It uses a slightly different version of 
RecoveringZooKeeper, however, that doesn't embed additional information into 
the data (as we do).

https://github.com/facebook/jcommon/tree/master/zookeeper/src/main/java/com/facebook/zookeeper

There are implementations of different recipes there as well. I have no strong 
preference on which is better; there's a lot I like about Curator (I would 
seriously consider using it for something I started from scratch). I'd just 
avoid having multiple implementations of the same ZK abstraction in the 
codebase.

One approach could be to just implement the interfaces with Curator and then 
run them through the unit tests. 

Good luck! Feel free to put me on the diff(s). I am even more excited about 
what could now be done on top of these abstractions.

- af

> Introduce sequential ZNode based read/write locks 
> --
>
> Key: HBASE-5991
> URL: https://issues.apache.org/jira/browse/HBASE-5991
> Project: HBase
>  Issue Type: Improvement
>Reporter: Alex Feinberg
>Assignee: Alex Feinberg
> Fix For: 0.89-fb
>
>
> This is a continuation of HBASE-5494:
> Table-level write locks have currently been implemented using non-sequential 
> ZNodes as part of HBASE-5494 and committed to the 89-fb branch. This issue 
> tracks converting the table-level locks to sequential ZNodes and supporting 
> read/write locks, so as to prevent schema changes during region splits or 
> merges.



[jira] [Commented] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-12-04 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510103#comment-13510103
 ] 

Alex Feinberg commented on HBASE-5991:
--

[~enis] Nice, thanks for jumping into this! Make sure to sync up with 
[~saint@gmail.com] -- he also wanted to work on protobuf conversion/trunk 
patch.





[jira] [Commented] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-11-07 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492584#comment-13492584
 ] 

Alex Feinberg commented on HBASE-5991:
--

Hi Matteo,

It's committed to 89-fb. Stack is working on porting this to trunk. 

Re: asynchronous -- I'll have to take a look at trunk first, but couldn't 
unlock be done using a callback/ListenableFuture (in which case the unlock 
would happen in both .onFailure() and .onSuccess())?
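A sketch of that idea, with CompletableFuture.whenComplete standing in for a 
Guava ListenableFuture callback. Names are illustrative, and a real ZK unlock 
is an RPC rather than a thread-owned lock; ReentrantLock here just models the 
held lock:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.locks.Lock;

// whenComplete fires on both normal and exceptional completion, playing the
// role of onSuccess() and onFailure() together, so the lock is never leaked.
class AsyncUnlockSketch {
    static <T> CompletableFuture<T> withLock(Lock lock,
                                             CompletableFuture<T> op) {
        lock.lock(); // acquire before the async operation starts
        return op.whenComplete((result, error) -> lock.unlock());
    }
}
```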




[jira] [Resolved] (HBASE-6508) Filter out edits at log split time

2012-08-08 Thread Alex Feinberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Feinberg resolved HBASE-6508.
--

Resolution: Fixed

Done. Will be merged to 89-fb overnight. 

> Filter out edits at log split time
> --
>
> Key: HBASE-6508
> URL: https://issues.apache.org/jira/browse/HBASE-6508
> Project: HBase
>  Issue Type: Improvement
>  Components: master, regionserver, wal
>Affects Versions: 0.89-fb
>Reporter: Alex Feinberg
>Assignee: Alex Feinberg
> Fix For: 0.89-fb
>
>
> At log splitting time, we can filter out many edits if we have a 
> conservative estimate of what was last saved in each region.
> This patch does the following:
> 1) When a region server flushes a MemStore to an HFile, store the last 
> flushed sequence id for the region in a map.
> 2) Send the map to the master as part of the region server report.
> 3) Add an RPC call in HMasterRegionInterface to allow a region server to 
> query the last flushed sequence id for a region.
> 4) Skip any log entry with a sequence id lower than the last flushed 
> sequence id for the region during log split.
> 5) When a region is removed from a region server, remove the entry for that 
> region from the map, so that it isn't sent during the next report.
> This can reduce the downtime quite a bit when a regionserver goes down.





[jira] [Commented] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-08-03 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428382#comment-13428382
 ] 

Alex Feinberg commented on HBASE-5991:
--

(Please disregard last comment, wrong diff linked.)






[jira] [Created] (HBASE-6508) Filter out edits at log split time

2012-08-03 Thread Alex Feinberg (JIRA)
Alex Feinberg created HBASE-6508:


 Summary: Filter out edits at log split time
 Key: HBASE-6508
 URL: https://issues.apache.org/jira/browse/HBASE-6508
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver, wal
Affects Versions: 0.89-fb
Reporter: Alex Feinberg
Assignee: Alex Feinberg
 Fix For: 0.89-fb


At log splitting time, we can filter out many edits if we have a conservative 
estimate of what was last saved in each region.

This patch does the following:

1) When a region server flushes a MemStore to an HFile, store the last flushed 
sequence id for the region in a map.

2) Send the map to the master as part of the region server report.

3) Add an RPC call in HMasterRegionInterface to allow a region server to query 
the last flushed sequence id for a region.

4) Skip any log entry with a sequence id lower than the last flushed sequence 
id for the region during log split.

5) When a region is removed from a region server, remove the entry for that 
region from the map, so that it isn't sent during the next report.

This can reduce the downtime quite a bit when a regionserver goes down.
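The filtering rule in steps 1-5 can be sketched like this (class and method 
names are illustrative, not the actual 89-fb code):

```java
import java.util.HashMap;
import java.util.Map;

// Replay an edit during log split only if its sequence id is above the
// region's last-flushed sequence id reported to the master.
class LogSplitFilterSketch {
    // region name -> last sequence id known to be flushed to an HFile
    private final Map<String, Long> lastFlushedSeqId = new HashMap<>();

    public void reportFlush(String region, long seqId) {        // steps 1-2
        lastFlushedSeqId.merge(region, seqId, Math::max);
    }

    public boolean shouldReplay(String region, long editSeqId) { // steps 3-4
        long flushed = lastFlushedSeqId.getOrDefault(region, -1L);
        return editSeqId > flushed;
    }

    public void regionClosed(String region) {                    // step 5
        lastFlushedSeqId.remove(region);
    }
}
```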





[jira] [Commented] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-08-02 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427796#comment-13427796
 ] 

Alex Feinberg commented on HBASE-5991:
--

Integrated and fully working. Will add Javadoc and put up a diff shortly.

> Introduce sequential ZNode based read/write locks 
> --
>
> Key: HBASE-5991
> URL: https://issues.apache.org/jira/browse/HBASE-5991
> Project: HBase
>  Issue Type: Improvement
>Reporter: Alex Feinberg
>Assignee: Alex Feinberg
>
> This is a continuation of HBASE-5494:
> Currently table-level write locks have been implemented using non-sequential 
> ZNodes as part of HBASE-5494 and committed to 89-fb branch. This issue is to 
> track converting the table-level locks to sequential ZNodes and supporting 
> read-write locks, as to solve the issue of preventing schema changes during 
> region splits or merges.





[jira] [Commented] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-08-01 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427143#comment-13427143
 ] 

Alex Feinberg commented on HBASE-5991:
--

The unit test with a custom timeout is passing. Now working to integrate this 
and prepare a diff.






[jira] [Commented] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-07-27 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424190#comment-13424190
 ] 

Alex Feinberg commented on HBASE-5991:
--

All unit tests for the exclusion functionality are passing. A few issues 
remain in handling custom-specified timeouts; I'll iron those out and post a 
diff early next week.






[jira] [Commented] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-07-26 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423686#comment-13423686
 ] 

Alex Feinberg commented on HBASE-5991:
--

Got the write-lock tests passing (verifying that a write lock excludes other 
writers). Now writing tests for the read locks (verifying that write locks 
exclude readers, but that readers do not exclude other readers).

After that, the remaining tasks are to integrate miscellaneous functionality 
(printing information on lock owners) into the code, clean up, replace 
DistributedLock with WriteLock, and run full end-to-end tests. I'll put up a 
diff once this is done.
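The semantics under test follow the standard ZooKeeper shared-lock recipe: 
with sequential znodes, a write request waits on the next-lowest child of any 
kind, while a read request waits only on the next-lowest *write* child. A 
minimal in-memory sketch of that contention rule (hypothetical names; the 
real implementation watches live znodes rather than a static list):

```java
import java.util.List;

// In-memory sketch of the ZooKeeper shared-lock recipe's contention rule.
// Child names follow the recipe convention "read-%010d" / "write-%010d".
public class RwLockRecipeSketch {

    static int seq(String node) {
        return Integer.parseInt(node.substring(node.lastIndexOf('-') + 1));
    }

    // Returns the child this request must watch before it holds the lock,
    // or null if the lock is acquired immediately. 'children' is the list
    // of sequential children of the lock znode.
    static String nodeToWatch(List<String> children, String me) {
        boolean isWrite = me.startsWith("write-");
        String watch = null;
        for (String c : children) {
            if (seq(c) >= seq(me)) continue;          // only lower seqs can block us
            if (isWrite || c.startsWith("write-")) {  // readers ignore other readers
                if (watch == null || seq(c) > seq(watch)) watch = c;
            }
        }
        return watch;
    }

    public static void main(String[] args) {
        List<String> kids = List.of("read-0000000001", "read-0000000002", "write-0000000003");
        // Both readers acquire immediately; readers do not exclude readers.
        if (nodeToWatch(kids, "read-0000000001") != null) throw new AssertionError();
        if (nodeToWatch(kids, "read-0000000002") != null) throw new AssertionError();
        // The writer waits on the highest node below it, of any kind.
        if (!"read-0000000002".equals(nodeToWatch(kids, "write-0000000003")))
            throw new AssertionError();
        // A reader arriving after a writer waits on that writer.
        List<String> kids2 = List.of("write-0000000001", "read-0000000002");
        if (!"write-0000000001".equals(nodeToWatch(kids2, "read-0000000002")))
            throw new AssertionError();
        System.out.println("ok");
    }
}
```

Watching only the single next-lowest blocking node (rather than the lock 
parent) is what avoids the herd effect the recipe is designed around.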







[jira] [Commented] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-07-26 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422909#comment-13422909
 ] 

Alex Feinberg commented on HBASE-5991:
--

Mostly done implementing the locks themselves based on the recipe (with 
recoverable ZooKeeper). Should have this integrated into HMaster (in place of 
my DistributedLock code) and a diff ready soon.






[jira] [Commented] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-07-25 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422437#comment-13422437
 ] 

Alex Feinberg commented on HBASE-5991:
--

Working on this right now.






[jira] [Commented] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-07-23 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421072#comment-13421072
 ] 

Alex Feinberg commented on HBASE-5991:
--

Just spoke to Liyin about this -- I'll work on it this week and will post an 
update (and hopefully a diff) by Thursday. 







[jira] [Commented] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-07-23 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421070#comment-13421070
 ] 

Alex Feinberg commented on HBASE-5991:
--

Hi Jesse,

1) Re: progress -- I have another issue I'm working on this week (related to 
log splitting), but let me see if I can shuffle things around and finish this 
one as well. I'll give you an ETA by sometime tomorrow at the latest 
(hopefully earlier); if I can't get it done this week, I'll let you know so 
you can work on it.

2) Re: locking the table to read-only -- Sequential locks let us introduce 
read-write locks for metadata, so I think it will also be possible to 
introduce a write lock for the data itself. Good suggestion.

Thanks,
- Alex






[jira] [Commented] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-07-07 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408832#comment-13408832
 ] 

Alex Feinberg commented on HBASE-5991:
--

Hi Ted,

Sorry, I haven't followed up on this -- I have been busy.

Yes, I still intend to work on this. Unless you've started working on it, I 
can finish it: I've already started and have a design in mind.






[jira] [Assigned] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-07-07 Thread Alex Feinberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Feinberg reassigned HBASE-5991:


Assignee: Alex Feinberg






[jira] [Assigned] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-05-12 Thread Alex Feinberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Feinberg reassigned HBASE-5991:


Assignee: Alex Feinberg






[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time

2012-05-11 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273756#comment-13273756
 ] 

Alex Feinberg commented on HBASE-5494:
--

Yeah, you can go ahead and close out.

> Introduce a zk hosted table-wide read/write lock so only one table operation 
> at a time
> --
>
> Key: HBASE-5494
> URL: https://issues.apache.org/jira/browse/HBASE-5494
> Project: HBase
>  Issue Type: Improvement
>Reporter: stack
> Attachments: D2997.3.patch, D2997.4.patch, D2997.5.patch, 
> D2997.6.patch
>
>
> I saw this facility over in the accumulo code base.
> Currently we just try to sort out the mess when splits come in during an 
> online schema edit; somehow we figure we can figure all possible region 
> transition combinations and make the right call.
> We could try and narrow the number of combinations by taking out a zk table 
> lock when doing table operations.
> For example, on split or merge, we could take a read-only lock meaning the 
> table can't be disabled while these are running.
> We could then take a write only lock if we want to ensure the table doesn't 
> change while disabling or enabling process is happening.
> Shouldn't be too hard to add.





[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time

2012-05-11 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273558#comment-13273558
 ] 

Alex Feinberg commented on HBASE-5494:
--

Created HBASE-5991 for implementation of sequential znode based read/write 
locks.






[jira] [Created] (HBASE-5991) Introduce sequential ZNode based read/write locks

2012-05-11 Thread Alex Feinberg (JIRA)
Alex Feinberg created HBASE-5991:


 Summary: Introduce sequential ZNode based read/write locks 
 Key: HBASE-5991
 URL: https://issues.apache.org/jira/browse/HBASE-5991
 Project: HBase
  Issue Type: Improvement
Reporter: Alex Feinberg


This is a continuation of HBASE-5494:

Currently table-level write locks have been implemented using non-sequential 
ZNodes as part of HBASE-5494 and committed to 89-fb branch. This issue is to 
track converting the table-level locks to sequential ZNodes and supporting 
read-write locks, as to solve the issue of preventing schema changes during 
region splits or merges.





[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time

2012-05-03 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268107#comment-13268107
 ] 

Alex Feinberg commented on HBASE-5494:
--

bq. You thought this overkill for your case? 
http://zookeeper.apache.org/doc/r3.1.2/recipes.html#Shared+Locks That is fine. 
Do you think we could backfill it later underneath the patch attached here?

I went down the non-sequential route (as you said, thinking the recipe was 
overkill and that a simple "create if not exists" approach would work), 
although I later realized that some of the potential race conditions would 
likely not happen if I went with their approach. I think we could backfill it 
later once we create read-write locks. 

I do like the idea of a new master coming up to finish previous work. If we 
make the ZNode data more machine-parseable (e.g., convert it to protobuf in 
trunk) then this would be feasible: when a new master is brought up, it scans 
the locks to see if there were any operations in progress when the previous 
master died. 

I agree that lock and unlock shouldn't really be public APIs (in the sense of 
being directly accessible to end developers) -- to that end, I'll make 
lockTable() and unlockTable() package-local methods. 






[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time

2012-05-03 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268042#comment-13268042
 ] 

Alex Feinberg commented on HBASE-5494:
--

Re: "One thing we'd like to prevent is a table being disabled while splits (or 
merges) are going on. How hard would it be to add this facility (in another 
jira?). One way of doing it would be that a regionserver before splitting, it'd 
take out the table lock. That would prevent the table from being disabled. But 
what about the case if two regionservers try to split a region from the same 
table at the one time? Or, what if the regionserver dies mid-split; the lock 
will be stuck in place."

This is an interesting question. I think one approach may be to create a 
region-level lock manager, and to convert the table-level lock manager to 
support read-write locks. Schema modifications (create/disable/alter/delete) 
would acquire a table-wide _write lock_. For splits and merges, region servers 
would acquire a table-wide _read lock_ (to allow two regionservers to split 
different regions of a table at the same time, but prevent schema 
modifications during a split/merge), plus a write lock (i.e., a usual 
exclusive lock) over the regions being split (I'm not sure this step is even 
needed at this point).

We also need a way to handle stuck locks after crashes (currently 
DistributedLock uses persistent ZNodes) with minimal, if any, manual 
intervention -- the key thing being that whatever schema modification was 
started prior to the crash is safely rolled back. That may be non-trivial, as 
I'd guess it is more complex than just keeping a txn id in the log and then 
reading through the HLog for META. 
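The proposed table-level semantics behave like a per-table ReadWriteLock: 
splits and merges share a read lock, while schema changes require the 
exclusive write lock. The sketch below is an analogy only, using 
java.util.concurrent in place of the znode layer -- it is not HBase's 
TableLockManager API.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Analogy for the proposed table lock semantics: splits/merges are shared
// (read) holders, schema changes are exclusive (write) holders.
public class TableLockSketch {

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Splits/merges: shared, so multiple region servers can proceed at once.
    boolean tryStartSplit() { return lock.readLock().tryLock(); }
    void finishSplit()      { lock.readLock().unlock(); }

    // Schema changes: exclusive, blocked while any split/merge is running.
    boolean tryAlterTable() { return lock.writeLock().tryLock(); }
    void finishAlter()      { lock.writeLock().unlock(); }

    public static void main(String[] args) {
        TableLockSketch t = new TableLockSketch();
        if (!t.tryStartSplit()) throw new AssertionError();  // first split proceeds
        if (!t.tryStartSplit()) throw new AssertionError();  // concurrent split proceeds
        if (t.tryAlterTable()) throw new AssertionError();   // alter blocked mid-split
        t.finishSplit();
        t.finishSplit();
        if (!t.tryAlterTable()) throw new AssertionError();  // alter ok once splits finish
        System.out.println("ok");
    }
}
```

Unlike this in-process analogy, the ZK-hosted version must also survive 
holder crashes, which is exactly the stuck-lock problem discussed above.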






[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time

2012-05-03 Thread Alex Feinberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267681#comment-13267681
 ] 

Alex Feinberg commented on HBASE-5494:
--

This patch implements a ZK-hosted mutual exclusion lock (DistributedLock) and 
table-level locks (TableLockManager), and ensures that all schema-changing 
operations are serialized. Further work is needed to add read-write locks to 
handle region splits and merges.

