[jira] [Commented] (HBASE-5776) HTableMultiplexer

2012-04-13 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253768#comment-13253768
 ] 

Liyin Tang commented on HBASE-5776:
---

@Todd, the HTableMultiplexer is designed to process put requests across 
different tables. 
All puts, whatever their table, are sharded into separate queues based on 
their destination region server. This helps batch more puts for each region 
server before the RPC request is sent.

 HTableMultiplexer 
 --

 Key: HBASE-5776
 URL: https://issues.apache.org/jira/browse/HBASE-5776
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D2775.1.patch, D2775.1.patch, D2775.2.patch, 
 D2775.2.patch


 There is a known issue in the HBase client: a single slow or dead region 
 server can slow down multiput operations across all the region servers, so 
 the HBase client is only as fast as the slowest region server in the cluster.
  
 To solve this problem, HTableMultiplexer separates the threads that submit 
 multiputs from the threads that flush them, which makes the multiput 
 operation nonblocking. 
 The submitting thread shards the puts into different queues based on their 
 destination region server and returns immediately. The flush threads then 
 flush the puts from each queue to its destination region server. 
 Currently the HTableMultiplexer only supports the put operation.
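
The queueing scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the class `PutMultiplexerSketch`, the `PendingPut` type, the `regionServerOf` lookup and the bounded queue capacity are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

/** Illustrative only; this is not the HTableMultiplexer class from the patch. */
final class PutMultiplexerSketch {

    /** A put reduced to the fields the sketch needs (hypothetical type). */
    static final class PendingPut {
        final String table;
        final String rowKey;

        PendingPut(String table, String rowKey) {
            this.table = table;
            this.rowKey = rowKey;
        }
    }

    // One bounded queue per destination region server, shared by all tables.
    private final Map<String, BlockingQueue<PendingPut>> queues = new ConcurrentHashMap<>();
    // Assumed lookup from a put to the name of the region server hosting its row.
    private final Function<PendingPut, String> regionServerOf;
    private final int queueCapacity;

    PutMultiplexerSketch(Function<PendingPut, String> regionServerOf, int queueCapacity) {
        this.regionServerOf = regionServerOf;
        this.queueCapacity = queueCapacity;
    }

    /** Submitting side: never blocks; returns false when the server's queue is full. */
    boolean put(PendingPut put) {
        String server = regionServerOf.apply(put);
        BlockingQueue<PendingPut> queue =
            queues.computeIfAbsent(server, s -> new LinkedBlockingQueue<>(queueCapacity));
        return queue.offer(put);
    }

    /** Flush side: removes up to maxBatch queued puts for one region server. */
    List<PendingPut> drainBatch(String server, int maxBatch) {
        List<PendingPut> batch = new ArrayList<>();
        BlockingQueue<PendingPut> queue = queues.get(server);
        if (queue != null) {
            queue.drainTo(batch, maxBatch);
        }
        return batch;
    }
}
```

A flush thread would call `drainBatch` in a loop and send each batch in one RPC; because a slow server only backs up its own queue, submitters for other servers are unaffected.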

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5407) Show the per-region level request/sec count in the web ui

2012-02-17 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210455#comment-13210455
 ] 

Liyin Tang commented on HBASE-5407:
---

Hi Stack. 
This patch adds the total read/write request count and the read/write requests 
per second for each region in the 89-fb branch.
For Apache trunk, I only need to add the read/write requests per second.
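
As a rough illustration of how a per-second figure can be derived from a running total, the sketch below samples a request counter at two points in time. The class name `RequestRateSketch` and its methods are hypothetical and are not taken from the patch.

```java
/** Hypothetical helper: turns a monotonically growing request total into requests per second. */
final class RequestRateSketch {
    private long lastCount;
    private long lastTimeMs;

    RequestRateSketch(long startTimeMs) {
        this.lastTimeMs = startTimeMs;
    }

    /** Returns the average requests/sec since the previous sample, then remembers this sample. */
    double sample(long totalCount, long nowMs) {
        long elapsedMs = nowMs - lastTimeMs;
        double rate = elapsedMs <= 0 ? 0.0 : (totalCount - lastCount) * 1000.0 / elapsedMs;
        lastCount = totalCount;
        lastTimeMs = nowMs;
        return rate;
    }
}
```

One such tracker per region for reads and one for writes would be enough to show both figures side by side.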

 Show the per-region level request/sec count in the web ui
 -

 Key: HBASE-5407
 URL: https://issues.apache.org/jira/browse/HBASE-5407
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D1779.1.patch, D1779.1.patch, D1779.1.patch


 It would be nice to show the per-region level request/sec count in the web 
 ui, especially when debugging the hot region problem.





[jira] [Commented] (HBASE-5407) Show the per-region level request count in the web ui

2012-02-15 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208742#comment-13208742
 ] 

Liyin Tang commented on HBASE-5407:
---

Awesome! Thanks Jean. I think I just need to port this patch.

 Show the per-region level request count in the web ui
 -

 Key: HBASE-5407
 URL: https://issues.apache.org/jira/browse/HBASE-5407
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 It would be nice to show the per-region level request count in the web ui, 
 especially when debugging the hot region problem.





[jira] [Commented] (HBASE-5403) Checkpoint the compressed HLog

2012-02-15 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208831#comment-13208831
 ] 

Liyin Tang commented on HBASE-5403:
---

@Nicolas, the DFS block size is usually set quite large, say 256M, and it is 
inefficient to write small log files that are less than one DFS block. I 
assume this is the main benefit of checkpointing over rolling the log.


 Checkpoint the compressed HLog
 --

 Key: HBASE-5403
 URL: https://issues.apache.org/jira/browse/HBASE-5403
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 Let's assume that HBase replication can be based on replaying the HLog in the 
 replica cluster.
 The replica process could crash during the replay, so it needs a way to 
 restart from the latest checkpoint in the HLog, even if the HLog is 
 compressed.
 So the proposal is to write a series of checkpoints within the HLog. 
 Each checkpoint will have a header with some special sequence of bytes.
 And between consecutive checkpoints, the HLog should compress with a fresh 
 dictionary.
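
To make the proposal concrete, here is a small sketch of a log writer that emits a marker at each checkpoint and restarts its dictionary, plus a reader-side search for the latest checkpoint. Everything in it is an assumption for illustration: the `MAGIC` bytes, the one-byte tags and lengths, and the class name are invented, and a real format would also have to rule out the marker appearing inside entry data.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Illustrative only; not the HLog format. Lengths and indexes are limited to one byte. */
final class CheckpointedLogSketch {
    // Hypothetical marker; the issue only asks for "some special sequence of bytes".
    static final byte[] MAGIC = {(byte) 0xCA, (byte) 0xFE, (byte) 0xD0, (byte) 0x0D};

    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private final Map<String, Integer> dictionary = new HashMap<>();

    /** Starts a new checkpoint: writes the marker and discards the old dictionary. */
    void checkpoint() {
        out.write(MAGIC, 0, MAGIC.length);
        dictionary.clear();
    }

    /** Appends one value; a value already seen since the last checkpoint is written as its index. */
    void append(String value) {
        Integer index = dictionary.get(value);
        if (index != null) {
            out.write(1);                 // tag 1: dictionary reference
            out.write(index.intValue());
        } else {
            byte[] raw = value.getBytes(StandardCharsets.UTF_8);
            dictionary.put(value, dictionary.size());
            out.write(0);                 // tag 0: literal
            out.write(raw.length);
            out.write(raw, 0, raw.length);
        }
    }

    byte[] toBytes() {
        return out.toByteArray();
    }

    /** Returns the offset just past the last marker, or -1 if the log has no checkpoint. */
    static int latestCheckpoint(byte[] log) {
        for (int i = log.length - MAGIC.length; i >= 0; i--) {
            boolean match = true;
            for (int j = 0; j < MAGIC.length; j++) {
                if (log[i + j] != MAGIC[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i + MAGIC.length;
            }
        }
        return -1;
    }
}
```

Because the dictionary is emptied at every checkpoint, a reader that starts at `latestCheckpoint` never needs dictionary entries written before that point.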





[jira] [Commented] (HBASE-5407) Show the per-region level request/sec count in the web ui

2012-02-15 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208995#comment-13208995
 ] 

Liyin Tang commented on HBASE-5407:
---

The total number is very useful and it would be nice to add the request/sec on 
the web UI as well. I have updated the title and description for the jira. 
Thanks Jean for the heads up.

 Show the per-region level request/sec count in the web ui
 -

 Key: HBASE-5407
 URL: https://issues.apache.org/jira/browse/HBASE-5407
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 It would be nice to show the per-region level request/sec count in the web 
 ui, especially when debugging the hot region problem.





[jira] [Commented] (HBASE-5381) Make memstore.flush.size as a table level configuration

2012-02-10 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205689#comment-13205689
 ] 

Liyin Tang commented on HBASE-5381:
---

Thanks Jean and Ted. I missed something before. Please close this jira for me.
Thanks a lot

 Make memstore.flush.size as a table level configuration
 ---

 Key: HBASE-5381
 URL: https://issues.apache.org/jira/browse/HBASE-5381
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 Currently the region server flushes a region's memstore based on the global 
 memstore flush size and the global low water mark. 
 However, this causes hot tables, which serve more write traffic, to flush too 
 frequently even though the overall memstore heap usage is quite low. 
 Flushing too frequently also leads to too many minor compactions. 
 So if we make memstore.flush.size a table-level configuration, it becomes 
 more flexible to configure different tables with different memstore flush 
 sizes based on compaction ratio, recovery time and put ops.
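
The idea of a table-level setting that overrides the global one can be sketched as a simple lookup with a fallback. The class `FlushSizeResolverSketch` and its method names are hypothetical and do not correspond to any HBase class; the byte values in the usage are arbitrary.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical resolver: a per-table flush size wins over the global default. */
final class FlushSizeResolverSketch {
    private final long globalFlushSizeBytes;
    private final Map<String, Long> perTableFlushSizeBytes = new HashMap<>();

    FlushSizeResolverSketch(long globalFlushSizeBytes) {
        this.globalFlushSizeBytes = globalFlushSizeBytes;
    }

    void setTableFlushSize(String table, long bytes) {
        perTableFlushSizeBytes.put(table, bytes);
    }

    /** Table-level value if one was set, otherwise the global value. */
    long flushSizeFor(String table) {
        return perTableFlushSizeBytes.getOrDefault(table, globalFlushSizeBytes);
    }

    boolean shouldFlush(String table, long memstoreBytes) {
        return memstoreBytes >= flushSizeFor(table);
    }
}
```

With a larger value set for a hot table, that table flushes less often while every other table keeps the global behaviour.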





[jira] [Commented] (HBASE-5199) Delete out of TTL store files before compaction selection

2012-02-09 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204926#comment-13204926
 ] 

Liyin Tang commented on HBASE-5199:
---

Ping committers !

 Delete out of TTL store files before compaction selection
 -

 Key: HBASE-5199
 URL: https://issues.apache.org/jira/browse/HBASE-5199
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D1311.1.patch, D1311.2.patch, D1311.3.patch, 
 D1311.4.patch, D1311.5.patch, D1311.5.patch, HBASE-5199.patch


 Currently, HBase deletes the out-of-TTL store files after compaction. We can 
 change the sequence to delete the out-of-TTL store files before selecting 
 store files for compaction. 
 In this way, HBase can keep deleting old, invalid store files without 
 compaction, and also avoid unnecessary compactions, since the out-of-TTL 
 store files are deleted before compaction selection.
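
A minimal sketch of the reordering follows, assuming each store file records the timestamp of its newest cell. The `StoreFileInfo` type and the `removeExpired` method are hypothetical stand-ins, not the code in the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: drop fully expired files before choosing compaction candidates. */
final class ExpiredStoreFileSketch {

    /** Hypothetical summary of a store file: its name and the timestamp of its newest cell. */
    static final class StoreFileInfo {
        final String name;
        final long maxTimestampMs;

        StoreFileInfo(String name, long maxTimestampMs) {
            this.name = name;
            this.maxTimestampMs = maxTimestampMs;
        }
    }

    /**
     * Removes from candidates every file whose newest cell is older than now minus the TTL,
     * and returns the removed files so the caller can delete them.
     */
    static List<StoreFileInfo> removeExpired(List<StoreFileInfo> candidates, long nowMs, long ttlMs) {
        long cutoffMs = nowMs - ttlMs;
        List<StoreFileInfo> expired = new ArrayList<>();
        for (StoreFileInfo file : candidates) {
            if (file.maxTimestampMs < cutoffMs) {
                expired.add(file);
            }
        }
        candidates.removeAll(expired);
        return expired;
    }
}
```

Compaction selection would then run on what is left in `candidates`, so a file that holds only expired data never triggers a compaction by itself.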





[jira] [Commented] (HBASE-5373) Table level lock to prevent the race of multiple table level operation

2012-02-09 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205120#comment-13205120
 ] 

Liyin Tang commented on HBASE-5373:
---

Cool! That sounds like what I am trying to do here. I will take a look at Accumulo.

 Table level lock to prevent the race of multiple table level operation
 --

 Key: HBASE-5373
 URL: https://issues.apache.org/jira/browse/HBASE-5373
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 A table level lock can guarantee that only one table operation happens at a 
 time for each table. The master should acquire and release these table 
 locks correctly during failover. One proposal is to keep track of 
 each lock and its corresponding operation in ZooKeeper. If there is a 
 master failover, the secondary master should have a way to check whether 
 these operations have succeeded or not before releasing the lock.
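
The bookkeeping this proposal needs can be sketched with an in-memory map that stands in for the state the issue suggests keeping in ZooKeeper. The class `TableLockSketch` and its methods are hypothetical; a real implementation would have to persist the entries so that they survive a master failover.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: one lock per table, each remembering which operation holds it. */
final class TableLockSketch {
    // table name -> operation currently holding the lock (in-memory stand-in for ZooKeeper state)
    private final Map<String, String> holders = new ConcurrentHashMap<>();

    /** Succeeds only if no other operation currently holds the table's lock. */
    boolean tryAcquire(String table, String operation) {
        return holders.putIfAbsent(table, operation) == null;
    }

    /** Releases the lock only if it is held by the given operation. */
    boolean release(String table, String operation) {
        return holders.remove(table, operation);
    }

    /** What a new master would inspect after failover before deciding to release a lock. */
    String pendingOperation(String table) {
        return holders.get(table);
    }
}
```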





[jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-30 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196315#comment-13196315
 ] 

Liyin Tang commented on HBASE-5259:
---

@Ted, the TableInputFormatBase in the mapred package is already deprecated, as 
marked in the code, so there is no need to update the patch.

@Deprecated
public abstract class TableInputFormatBase {}

 Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
 ---

 Key: HBASE-5259
 URL: https://issues.apache.org/jira/browse/HBASE-5259
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
 D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, 
 D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch, HBASE-5259.patch


 Assuming HBase and MapReduce run in the same cluster, the 
 TableInputFormat overrides the split function, which divides all the 
 regions of one particular table into a series of mapper tasks, so each 
 mapper task processes a region or one part of a region. Ideally, the mapper 
 task should run on the same machine as the region server that hosts the 
 corresponding region. That is why the TableInputFormat sets 
 the RegionLocation: so that the MapReduce framework can respect node 
 locality. 
 The code simply sets the host name of the region server as the 
 HRegionLocation. However, the host name of the region server may have a 
 different format from the host name of the task tracker (mapper task). The 
 task tracker always gets its host name by reverse DNS lookup, and the DNS 
 service may return a different host name format. For example, the host name 
 of the region server is correctly set as a.b.c.d while the reverse DNS 
 lookup may return a.b.c.d. (with an additional dot at the end).
 So the solution is to set the RegionLocation by reverse DNS lookup as 
 well. No matter what host name format the DNS system uses, the 
 TableInputFormat is responsible for keeping its host name format 
 consistent with the MapReduce framework.
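
The normalization step can be sketched as below. To keep the example testable without a DNS server, the reverse lookup is passed in as a function; in practice one candidate would be `java.net.InetAddress.getByName(host).getCanonicalHostName()`, although the patch may do it differently. The class name, the cache, and the fallback to the original name when the lookup fails are all assumptions of this sketch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

/** Illustrative only: report region locations in whatever format reverse DNS returns. */
final class ReverseDnsNormalizerSketch {
    // Assumed lookup: region server host name -> name returned by reverse DNS.
    private final UnaryOperator<String> reverseLookup;
    // Remembers results so each region server is resolved once per job.
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    ReverseDnsNormalizerSketch(UnaryOperator<String> reverseLookup) {
        this.reverseLookup = reverseLookup;
    }

    /** Returns the DNS-reported name, or the original name if the lookup fails. */
    String normalize(String regionServerHost) {
        return cache.computeIfAbsent(regionServerHost, host -> {
            try {
                return reverseLookup.apply(host);
            } catch (RuntimeException e) {
                return host;
            }
        });
    }
}
```

The split's location would then be set to `normalize(regionServerHost)`, so it matches the name the task tracker obtained for itself from the same DNS service.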





[jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-27 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195061#comment-13195061
 ] 

Liyin Tang commented on HBASE-5259:
---

Hi Ted, 
I totally understand your concern and appreciate your feedback.
It would be nice to tolerate all kinds of DNS server failures, which 
could be transient failures, loss of PTR records or a DNS service crash. The 
tradeoff is to pick the most frequent failure case and try to tolerate it 
gracefully. From my perspective, for high-impact failures such as a DNS 
server crash, it is sometimes better to fire an alarm and fix the problem as 
soon as possible. For minor-impact failures, it would be great to recover 
from them naturally. For the others, it is fine to pay some cost. 

If you believe the loss of a PTR record is a normal failure case in your 
systems, I would encourage you to open a new jira to handle it properly across 
the whole code base of HBase, DFS and MapReduce. I do believe we need a better 
fault tolerance policy across all these dependent components.

 Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
 ---

 Key: HBASE-5259
 URL: https://issues.apache.org/jira/browse/HBASE-5259
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
 D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, 
 D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch







[jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-27 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195345#comment-13195345
 ] 

Liyin Tang commented on HBASE-5259:
---

Has that package been deprecated ?
Two similar packages look confusing to me.

 Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
 ---

 Key: HBASE-5259
 URL: https://issues.apache.org/jira/browse/HBASE-5259
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
 D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, 
 D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch, HBASE-5259.patch







[jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-27 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195360#comment-13195360
 ] 

Liyin Tang commented on HBASE-5259:
---

I see. I will generate another patch including /mapred.
Thanks Ted.

 Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
 ---

 Key: HBASE-5259
 URL: https://issues.apache.org/jira/browse/HBASE-5259
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
 D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, 
 D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch, HBASE-5259.patch







[jira] [Commented] (HBASE-5274) Filter out the expired store file scanner during the compaction

2012-01-24 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192730#comment-13192730
 ] 

Liyin Tang commented on HBASE-5274:
---

I am not sure why Phabricator added so many duplicate comments. Sorry about 
the spam.
@Todd, HBASE-5274 tries to avoid scanning any data from the expired store file 
scanners, so compacting an expired store file will be very cheap. And 
HBASE-5199 is actually related to HBASE-4717, which performs the age-out 
compaction.

 Filter out the expired store file scanner during the compaction
 ---

 Key: HBASE-5274
 URL: https://issues.apache.org/jira/browse/HBASE-5274
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D1407.1.patch, D1407.1.patch, D1407.1.patch, 
 D1407.1.patch, D1407.1.patch


 During compaction, HBase generates a store scanner which scans a list of 
 store files. It would be more efficient to filter out the expired store 
 files, since there is no need to read any key values from them.
 This optimization has already been implemented on 89-fb, and it is the 
 building block for HBASE-5199 as well. Compacting the expired store files 
 is supposed to be a no-op.





[jira] [Commented] (HBASE-5199) Delete out of TTL store files before compaction selection

2012-01-13 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186059#comment-13186059
 ] 

Liyin Tang commented on HBASE-5199:
---

Thanks Lars and Kannan. I will double check this.

 Delete out of TTL store files before compaction selection
 -

 Key: HBASE-5199
 URL: https://issues.apache.org/jira/browse/HBASE-5199
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 Currently, HBase deletes the out-of-TTL store files after major compaction. 
 We can change the sequence to delete the out-of-TTL store files before 
 selecting store files for compaction. 
 In this way, HBase can keep deleting old, invalid store files without 
 major compaction, and also avoid unnecessary major compactions, since 
 the out-of-TTL store files are deleted before compaction selection.





[jira] [Commented] (HBASE-5033) Opening/Closing store in parallel to reduce region open/close time

2012-01-11 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184603#comment-13184603
 ] 

Liyin Tang commented on HBASE-5033:
---

Thanks Ted. BTW, I do use --no-prefix for this recently submitted patch. 

 Opening/Closing store in parallel to reduce region open/close time
 --

 Key: HBASE-5033
 URL: https://issues.apache.org/jira/browse/HBASE-5033
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Fix For: 0.94.0

 Attachments: 5033-trunk.txt, 5033.txt, D933.1.patch, D933.2.patch, 
 D933.3.patch, D933.4.patch, D933.5.patch, HBASE-5033-apach-trunk.patch


 Region servers open and close each store, and each store file within every 
 store, sequentially, which makes opening and closing regions 
 inefficient. 
 So this diff opens and closes each store in parallel in order to reduce 
 region open/close time. It also helps to reduce the cluster restart time.
 1) Opening each store in parallel
 2) Loading each store file for every store in parallel
 3) Closing each store in parallel
 4) Closing each store file for every store in parallel.
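
The four steps above share one pattern: submit one task per store (or per store file) to a thread pool and wait for all of them. A minimal sketch of that pattern follows; the class `ParallelStoreOpenSketch`, the `runAll` helper and the choice of a fixed-size pool are assumptions for illustration, not the code in the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only: run one open (or close) task per store concurrently. */
final class ParallelStoreOpenSketch {

    /** Runs all tasks on a pool of the given size and returns their results in task order. */
    static <T> List<T> runAll(List<Callable<T>> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every task has finished or failed.
            for (Future<T> done : pool.invokeAll(tasks)) {
                results.add(done.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for stores", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a store task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The same helper could be applied at both levels, once across the stores of a region and once across the store files of each store, with the pool sizes tuned separately.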





[jira] [Commented] (HBASE-5033) Opening/Closing store in parallel to reduce region open/close time

2012-01-05 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181134#comment-13181134
 ] 

Liyin Tang commented on HBASE-5033:
---

ping committers !

 Opening/Closing store in parallel to reduce region open/close time
 --

 Key: HBASE-5033
 URL: https://issues.apache.org/jira/browse/HBASE-5033
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D933.1.patch, D933.2.patch, D933.3.patch, D933.4.patch, 
 D933.5.patch, HBASE-5033-apach-trunk.patch







[jira] [Commented] (HBASE-4742) Split dead server's log in parallel

2011-12-19 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172723#comment-13172723
 ] 

Liyin Tang commented on HBASE-4742:
---

@Nicolas, this is only for 89-fb. Trunk already splits the dead servers' 
logs in parallel in another way.

 Split dead server's log in parallel
 ---

 Key: HBASE-4742
 URL: https://issues.apache.org/jira/browse/HBASE-4742
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D237.1.patch, D237.10.patch, D237.2.patch, D237.3.patch, 
 D237.4.patch, D237.5.patch, D237.6.patch, D237.7.patch, D237.8.patch, 
 D237.9.patch


 When a region server goes down, the master shuts down the region server 
 and splits its log.
 However, splitting the log is a blocking call and takes some time.
 If more than one region server goes down, the master splits their logs one 
 by one, which is not efficient.
 Since we have distributed log splitting, we could split the logs from the 
 dead servers in parallel.





[jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter

2011-11-01 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141360#comment-13141360
 ] 

Liyin Tang commented on HBASE-4532:
---

Shall we add an incompatible flag for this jira?
Because adding a new block type is not backward compatible.

 Avoid top row seek by dedicated bloom filter for delete family bloom filter
 ---

 Key: HBASE-4532
 URL: https://issues.apache.org/jira/browse/HBASE-4532
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Fix For: 0.94.0

 Attachments: D27.1.patch, D27.1.patch, HBASE-4532-apache-trunk.patch, 
 hbase-4532-89-fb.patch, hbase-4532-remove-system.out.println.patch


 The previous jira, HBASE-4469, avoids the top row seek operation if the 
 row-col bloom filter is enabled. 
 This jira tries to avoid the top row seek in all cases by creating a 
 dedicated bloom filter only for delete family.
 The only subtle use case is when we are interested in the top row with an 
 empty column.
 For example, 
 we are interested in row1/cf1:/1/put.
 So we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. And the delete family 
 bloom filter will say there is NO delete family.
 Then it will avoid the top row seek and return a fake kv, which is the last 
 kv for this row (createLastOnRowCol).
 In this way, we have already missed the real kv we are interested in.
 The solution for the above problem is to disable this optimization if we are 
 trying to GET/SCAN a row with empty column.
 Evaluation from TestSeekOptimization:
 Previously:
 For bloom=NONE, compr=NONE total seeks without optimization: 2506, with 
 optimization: 1714 (68.40%), savings: 31.60%
 For bloom=ROW, compr=NONE total seeks without optimization: 2506, with 
 optimization: 1714 (68.40%), savings: 31.60%
 For bloom=ROWCOL, compr=NONE total seeks without optimization: 2506, with 
 optimization: 1458 (58.18%), savings: 41.82%
 For bloom=NONE, compr=GZ total seeks without optimization: 2506, with 
 optimization: 1714 (68.40%), savings: 31.60%
 For bloom=ROW, compr=GZ total seeks without optimization: 2506, with 
 optimization: 1714 (68.40%), savings: 31.60%
 For bloom=ROWCOL, compr=GZ total seeks without optimization: 2506, with 
 optimization: 1458 (58.18%), savings: 41.82%
 So we get about 10 percentage points more seek savings (41.82% vs. 31.60%) 
 ONLY if the ROWCOL bloom filter is enabled. [HBASE-4469]
 
 After this change:
 For bloom=NONE, compr=NONE total seeks without optimization: 2506, with 
 optimization: 1458 (58.18%), savings: 41.82%
 For bloom=ROW, compr=NONE total seeks without optimization: 2506, with 
 optimization: 1458 (58.18%), savings: 41.82%
 For bloom=ROWCOL, compr=NONE total seeks without optimization: 2506, with 
 optimization: 1458 (58.18%), savings: 41.82%
 For bloom=NONE, compr=GZ total seeks without optimization: 2506, with 
 optimization: 1458 (58.18%), savings: 41.82%
 For bloom=ROW, compr=GZ total seeks without optimization: 2506, with 
 optimization: 1458 (58.18%), savings: 41.82%
 For bloom=ROWCOL, compr=GZ total seeks without optimization: 2506, with 
 optimization: 1458 (58.18%), savings: 41.82%
 So we get about 10 percentage points more seek savings for ALL kinds of bloom filter.
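The savings figures in the evaluation follow directly from the seek counts. A small Java sketch of the arithmetic (the class and method names are hypothetical, chosen only for this example):

```java
import java.util.Locale;

/** Hypothetical helper that reproduces the savings arithmetic from the evaluation. */
final class SeekSavings {
    /** Percentage of seeks saved relative to the run without the optimization. */
    static double savingsPercent(long seeksWithout, long seeksWith) {
        return 100.0 * (seeksWithout - seeksWith) / seeksWithout;
    }

    /** Formats a percentage with two decimals, as in the test output. */
    static String format(double percent) {
        return String.format(Locale.ROOT, "%.2f%%", percent);
    }
}
```

For the counts quoted above, `SeekSavings.format(SeekSavings.savingsPercent(2506, 1714))` gives 31.60% and `SeekSavings.format(SeekSavings.savingsPercent(2506, 1458))` gives 41.82%, so the gain is roughly 10 percentage points.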





[jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter

2011-10-28 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13138799#comment-13138799
 ] 

Liyin Tang commented on HBASE-4532:
---

Thanks, Jonathan, for the patch. I should have removed this line.

 Avoid top row seek by dedicated bloom filter for delete family bloom filter
 ---

 Key: HBASE-4532
 URL: https://issues.apache.org/jira/browse/HBASE-4532
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D27.1.patch, D27.1.patch, HBASE-4532-apache-trunk.patch, 
 hbase-4532-89-fb.patch, hbase-4532-remove-system.out.println.patch






[jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter

2011-10-28 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13139016#comment-13139016
 ] 

Liyin Tang commented on HBASE-4532:
---

Thanks, Ted and Jonathan Gray, for committing this. 
I will double-check submitted patches to avoid this problem in the future.

Nice catch, Jonathan Hsieh. Thank you for the patch :)

 Avoid top row seek by dedicated bloom filter for delete family bloom filter
 ---

 Key: HBASE-4532
 URL: https://issues.apache.org/jira/browse/HBASE-4532
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Fix For: 0.94.0

 Attachments: D27.1.patch, D27.1.patch, HBASE-4532-apache-trunk.patch, 
 hbase-4532-89-fb.patch, hbase-4532-remove-system.out.println.patch






[jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter

2011-10-23 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13133585#comment-13133585
 ] 

Liyin Tang commented on HBASE-4532:
---

Thanks Ted :)
Here are the test results I got.
So has testConnectionUniqueness in TestHCM been fixed now?

==
Results :

Tests in error: 
  testConnectionUniqueness(org.apache.hadoop.hbase.client.TestHCM)
  testOrphanLogCreation(org.apache.hadoop.hbase.master.TestDistributedLogSplitting): Unexpected exception, expected<org.apache.hadoop.hbase.regionserver.wal.OrphanHLogAfterSplitException> but was<java.lang.NullPointerException>
  testOrphanLogCreation(org.apache.hadoop.hbase.master.TestDistributedLogSplitting)
  testRecoveredEdits(org.apache.hadoop.hbase.master.TestDistributedLogSplitting): /data/users/liyintang/hbase-os-trunk/target/test-data/3d058c80-266a-4164-8143-925d514f016e/09d560d3-254e-4986-abe1-22b876d299f1/4758e332-2ae7-4194-bfea-900ee4a2e3ab/dfs/name1/current/fsimage (Too many open files)
  testRecoveredEdits(org.apache.hadoop.hbase.master.TestDistributedLogSplitting)
  testWorkerAbort(org.apache.hadoop.hbase.master.TestDistributedLogSplitting): /data/users/liyintang/hbase-os-trunk/target/test-data/3d058c80-266a-4164-8143-925d514f016e/09d560d3-254e-4986-abe1-22b876d299f1/4758e332-2ae7-4194-bfea-900ee4a2e3ab/3949c75c-8c23-4513-b1cc-e94b1bba640b/dfs/name1/current/fsimage (Too many open files)
  testWorkerAbort(org.apache.hadoop.hbase.master.TestDistributedLogSplitting)

Tests run: 1056, Failures: 0, Errors: 7, Skipped: 9

 Avoid top row seek by dedicated bloom filter for delete family bloom filter
 ---

 Key: HBASE-4532
 URL: https://issues.apache.org/jira/browse/HBASE-4532
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D27.1.patch, D27.1.patch, HBASE-4532-apache-trunk.patch, 
 hbase-4532-89-fb.patch






[jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter

2011-10-22 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13133449#comment-13133449
 ] 

Liyin Tang commented on HBASE-4532:
---

For 89-fb, all the unit tests pass.
For apache-trunk, 2 unit tests fail both with and without my change:
TestHCM and TestDistributedLogSplitting.

 Avoid top row seek by dedicated bloom filter for delete family bloom filter
 ---

 Key: HBASE-4532
 URL: https://issues.apache.org/jira/browse/HBASE-4532
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D27.1.patch, D27.1.patch, HBASE-4532-apache-trunk.patch, 
 hbase-4532-89-fb.patch






[jira] [Commented] (HBASE-4191) Utilize getTopBlockLocations in load balancer

2011-10-20 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13132200#comment-13132200
 ] 

Liyin Tang commented on HBASE-4191:
---

Hi Ted, have you started working on this?
I have a similar feature to implement :)

 Utilize getTopBlockLocations in load balancer
 -

 Key: HBASE-4191
 URL: https://issues.apache.org/jira/browse/HBASE-4191
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu

 HBASE-4114 implemented getTopBlockLocations().
 Load balancer should utilize this method and assign the region to be moved to 
 the region server with the highest block affinity.
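The selection step proposed here can be sketched as follows. This is a minimal illustration, assuming block affinity has already been summarized as a per-server weight (for example, bytes of the region's blocks held locally); the class and method names are hypothetical, and the sketch does not call getTopBlockLocations() or any other HBase API.

```java
import java.util.Map;

/** Hypothetical sketch: pick the region server with the highest block affinity. */
final class BlockAffinitySketch {
    /**
     * Returns the server whose weight is highest, or null if the map is empty.
     * Ties are resolved by map iteration order.
     */
    static String pickServer(Map<String, Long> blockWeightByServer) {
        String best = null;
        long bestWeight = -1L;
        for (Map.Entry<String, Long> entry : blockWeightByServer.entrySet()) {
            if (entry.getValue() > bestWeight) {
                bestWeight = entry.getValue();
                best = entry.getKey();
            }
        }
        return best;
    }
}
```

A real balancer would also have to weigh server load against locality; the sketch covers only the affinity comparison.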





[jira] [Commented] (HBASE-4633) Potential memory leak in client RPC timeout mechanism

2011-10-19 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13131347#comment-13131347
 ] 

Liyin Tang commented on HBASE-4633:
---

I have also noticed some memory leak problems in the HBase client.
Our symptom is that the memory footprint increases over time, but the actual 
heap size of the client does not increase.
So the leak should come from non-heap memory.
But I am not sure whether the leak comes from the HBase client jar itself or from our own client 
code.
So I am very interested to know: once you kept the heap size under control, was 
the memory leak solved?
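One way to compare heap and non-heap usage from inside the client JVM is the standard java.lang.management API. The sketch below is only a starting point: the class name is hypothetical, and the non-heap figure reported by the JVM may not include every native allocation, so it is worth comparing against the process footprint reported by the operating system.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

/** Hypothetical helper that reports JVM heap and non-heap usage. */
final class MemoryReportSketch {
    /** Bytes currently used on the Java heap. */
    static long heapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    /** Bytes currently used in JVM-managed non-heap memory. */
    static long nonHeapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getNonHeapMemoryUsage().getUsed();
    }

    /** One-line summary suitable for periodic logging. */
    static String summary() {
        return "heap=" + heapUsedBytes() + "B nonHeap=" + nonHeapUsedBytes() + "B";
    }
}
```

Logging `MemoryReportSketch.summary()` periodically next to the process size would show whether the growth is in the heap or outside it.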


 Potential memory leak in client RPC timeout mechanism
 -

 Key: HBASE-4633
 URL: https://issues.apache.org/jira/browse/HBASE-4633
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.3
 Environment: HBase version: 0.90.3 + Patches , Hadoop version: CDH3u0
Reporter: Shrijeet Paliwal

 Relevant Jiras: https://issues.apache.org/jira/browse/HBASE-2937,
 https://issues.apache.org/jira/browse/HBASE-4003
 We have been using the 'hbase.client.operation.timeout' knob
 introduced in 2937 for quite some time now. It helps us enforce SLA.
 We have two HBase clusters and two HBase client clusters. One of them
 is much busier than the other.
 We have seen a deterministic behavior in clients running in the busy
 cluster. Their memory footprint increases consistently
 after they have been up for roughly 24 hours.
 This memory footprint almost doubles from its usual value (usual case
 == RPC timeout disabled). After much investigation nothing concrete
 came out, and we had to put in a hack
 which keeps heap size in control even when RPC timeout is enabled. Also
 note that the same behavior is not observed in the 'not so busy'
 cluster.
 The patch is here: https://gist.github.com/1288023





[jira] [Commented] (HBASE-4611) Add support for Phabricator/Differential as an alternative code review tool

2011-10-18 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13130137#comment-13130137
 ] 

Liyin Tang commented on HBASE-4611:
---

Awesome. I have tried creating a review on Phabricator :)

 Add support for Phabricator/Differential as an alternative code review tool
 ---

 Key: HBASE-4611
 URL: https://issues.apache.org/jira/browse/HBASE-4611
 Project: HBase
  Issue Type: Task
Reporter: Jonathan Gray
 Attachments: D21.1.patch, D21.1.patch


 From http://phabricator.org/ : Phabricator is a open source collection of 
 web applications which make it easier to write, review, and share source 
 code. It is currently available as an early release. Phabricator was 
 developed at Facebook.
 It's open source so pretty much anyone could host an instance of this 
 software.
 To begin with, there will be a public-facing instance located at 
 http://reviews.facebook.net (sponsored by Facebook and hosted by the OSUOSL 
 http://osuosl.org).
 We will use this JIRA to deal with adding (and ensuring) Apache-friendly 
 support that will allow us to do code reviews with Phabricator for HBase.





[jira] [Commented] (HBASE-4585) Avoid seek operation when current kv is deleted

2011-10-17 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129511#comment-13129511
 ] 

Liyin Tang commented on HBASE-4585:
---

I have run all the unit tests. The following unit tests fail both with and without 
the patch for this jira:
TestAvroServer and TestDistributedLogSplitting.


 Avoid seek operation when current kv is deleted
 ---

 Key: HBASE-4585
 URL: https://issues.apache.org/jira/browse/HBASE-4585
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Fix For: 0.94.0

 Attachments: hbase-4585-89.patch, hbase-4585-apache-trunk.patch


 When the current kv is found to be deleted during matching in the ScanQueryMatcher, 
 the matcher currently returns skip and continues to seek.
 Instead, if the current kv is deleted by a family delete or a column 
 delete, the matcher should seek to the next column.
 If the current kv is deleted by a version delete, the matcher should 
 just return skip.
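The proposed behavior amounts to a small mapping from the kind of delete to the matcher's return code. A minimal sketch follows; the enum and method names are illustrative only and are not claimed to match the identifiers in the patch.

```java
/** Hypothetical sketch of the proposed matcher decision for a deleted kv. */
final class DeletedKvSketch {
    /** Why the current kv counts as deleted. */
    enum DeleteReason { FAMILY_DELETED, COLUMN_DELETED, VERSION_DELETED }

    /** What the matcher tells the scanner to do next. */
    enum MatchCode { SKIP, SEEK_NEXT_COL }

    static MatchCode onDeletedKv(DeleteReason reason) {
        switch (reason) {
            case FAMILY_DELETED:
            case COLUMN_DELETED:
                // Every remaining version of this column is covered by the
                // delete, so jump straight to the next column.
                return MatchCode.SEEK_NEXT_COL;
            case VERSION_DELETED:
            default:
                // Only this version is deleted; other versions may still be live.
                return MatchCode.SKIP;
        }
    }
}
```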





[jira] [Commented] (HBASE-4469) Avoid top row seek by looking up bloomfilter

2011-10-13 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126760#comment-13126760
 ] 

Liyin Tang commented on HBASE-4469:
---

@stack, HBASE-4469 optimizes the top row seek if the ROWCOL Bloom filter is 
enabled, 
and HBASE-4532 will optimize the top row seek if the ROW or NONE Bloom filter is 
enabled.
So HBASE-4469 + HBASE-4532 will optimize all the cases.
 
And it is necessary to commit this one first :)


 Avoid top row seek by looking up bloomfilter
 

 Key: HBASE-4469
 URL: https://issues.apache.org/jira/browse/HBASE-4469
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 The problem is that when seeking to a row/col in the hfile, we go to the 
 top of the row in order to check for a row delete marker (delete family). 
 However, if the bloomfilter is enabled for the column family and a 
 delete family operation has been done on a row, the row has already been added to 
 the bloomfilter. We can take advantage of this fact to avoid seeking to the top 
 of the row.
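The idea can be illustrated with a toy row/col Bloom filter. This is a sketch under assumptions, not the HBase implementation: the class name, the key encoding, and the hashing scheme are made up for the example, and a delete family marker is assumed to be indexed under the row with an empty qualifier. A Bloom filter never gives a false negative, so a negative lookup is enough to skip the seek.

```java
import java.util.BitSet;

/** Hypothetical toy row/col Bloom filter; not the HBase implementation. */
final class RowColBloomSketch {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    RowColBloomSketch(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Assumed key encoding: row, a separator, then the qualifier.
    private static String key(String row, String qualifier) {
        return row + '\u0000' + qualifier;
    }

    // Double hashing: derive the i-th bit index from two base hashes.
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 * 0x9E3779B9) | 1;
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String row, String qualifier) {
        String k = key(row, qualifier);
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(k, i));
        }
    }

    boolean mightContain(String row, String qualifier) {
        String k = key(row, qualifier);
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(k, i))) {
                return false; // definitely never added
            }
        }
        return true; // possibly added (false positives are allowed)
    }

    /** The top row seek can be skipped when no delete family entry can exist for the row. */
    boolean canSkipTopRowSeek(String row) {
        return !mightContain(row, "");
    }
}
```

A false positive only costs an unnecessary seek; it never hides a delete family marker.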





[jira] [Commented] (HBASE-4418) Show all the hbase configuration in the web ui

2011-10-13 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126979#comment-13126979
 ] 

Liyin Tang commented on HBASE-4418:
---

@stack, it is pretty safe to commit HBASE-4418_1.patch :)

 Show all the hbase configuration in the web ui
 --

 Key: HBASE-4418
 URL: https://issues.apache.org/jira/browse/HBASE-4418
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: HBASE-4418_1.patch, HBASE-4418_2.patch


 The motivation is to show ALL the HBase configuration that takes effect at 
 run time in one global place, 
 so we can easily see which configuration takes effect and what its value is.
 The page shows all the HBase and DFS configuration entries in the 
 configuration file and also includes all the HBase default settings in the 
 code, which are not in the config file.
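Conceptually, the effective configuration is the set of defaults from the code overlaid with the entries from the configuration file. A minimal sketch with plain maps follows; the class and method names are hypothetical, and it does not use the Hadoop Configuration class.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: merge code defaults with config file entries. */
final class EffectiveConfigSketch {
    /**
     * Returns the settings that take effect at run time, sorted by key.
     * Entries from the file override defaults from the code.
     */
    static Map<String, String> effective(Map<String, String> codeDefaults,
                                         Map<String, String> fileEntries) {
        Map<String, String> merged = new TreeMap<>(codeDefaults);
        merged.putAll(fileEntries);
        return merged;
    }
}
```

Sorting by key keeps the listing easy to scan when it is rendered on a single page.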





[jira] [Commented] (HBASE-4469) Avoid top row seek by looking up bloomfilter

2011-10-13 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127066#comment-13127066
 ] 

Liyin Tang commented on HBASE-4469:
---

Cool, I just downloaded the patch from the review board 
(https://reviews.apache.org/r/2235/) and attached it here :)
Thanks Jonathan.


 Avoid top row seek by looking up bloomfilter
 

 Key: HBASE-4469
 URL: https://issues.apache.org/jira/browse/HBASE-4469
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: HBASE-4469_1.patch






[jira] [Commented] (HBASE-4469) Avoid top row seek by looking up bloomfilter

2011-10-13 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127114#comment-13127114
 ] 

Liyin Tang commented on HBASE-4469:
---

@Jonathan, 
This jira specifically was committed to the internal 89-fb branch 
before the public 89-fb branch was cut.
So it should already be in the public 89-fb branch right now.




 Avoid top row seek by looking up bloomfilter
 

 Key: HBASE-4469
 URL: https://issues.apache.org/jira/browse/HBASE-4469
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Fix For: 0.94.0

 Attachments: HBASE-4469_1.patch






[jira] [Commented] (HBASE-4585) Avoid seek operation when current kv is deleted

2011-10-13 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127272#comment-13127272
 ] 

Liyin Tang commented on HBASE-4585:
---

Patches for 89-fb and apache trunk are both available now.

 Avoid seek operation when current kv is deleted
 ---

 Key: HBASE-4585
 URL: https://issues.apache.org/jira/browse/HBASE-4585
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: hbase-4585-89.patch, hbase-4585-trunk.patch






[jira] [Commented] (HBASE-4585) Avoid seek operation when current kv is deleted

2011-10-13 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127273#comment-13127273
 ] 

Liyin Tang commented on HBASE-4585:
---

Patches for 89-fb and apache trunk are both available now.

 Avoid seek operation when current kv is deleted
 ---

 Key: HBASE-4585
 URL: https://issues.apache.org/jira/browse/HBASE-4585
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: hbase-4585-89.patch, hbase-4585-trunk.patch






[jira] [Commented] (HBASE-4469) Avoid top row seek by looking up bloomfilter

2011-10-12 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126336#comment-13126336
 ] 

Liyin Tang commented on HBASE-4469:
---

HBASE-4532 will enable the delete family Bloom filter only when the Row Bloom 
filter or no Bloom filter (None) is configured, because if there is a delete 
family in the store file, the RowCol Bloom filter already has this 
information.


 Avoid top row seek by looking up bloomfilter
 

 Key: HBASE-4469
 URL: https://issues.apache.org/jira/browse/HBASE-4469
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 The problem is that when seeking for the row/col in the hfile, we go to the 
 top of the row in order to check for a row delete marker (delete family). 
 However, if the bloomfilter is enabled for the column family and a delete 
 family operation is done on a row, the row has already been added to the 
 bloomfilter. We can take advantage of this fact to avoid seeking to the top 
 of the row.
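
 A minimal sketch of the check, assuming a toy row-level Bloom filter. The 
 class and method names are made up for this example and the hashing is not 
 what HBase uses. The point is only that a negative Bloom filter answer 
 proves the row was never added, so the file cannot hold a delete family 
 marker for it and the seek to the top of the row can be skipped.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

// Toy row-level Bloom filter; names and hashing are illustrative only.
class RowBloomSketch {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    RowBloomSketch(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive the i-th bit position from two simple hashes (double hashing).
    private int index(byte[] key, int i) {
        int h1 = Arrays.hashCode(key);
        int h2 = (h1 >>> 16) | 1;
        return Math.floorMod(h1 + i * h2, size);
    }

    // Called for every row written to the file, including delete family markers.
    void add(String row) {
        byte[] key = row.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    boolean mightContain(String row) {
        byte[] key = row.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false; // definitely never added
            }
        }
        return true; // possibly added (false positives are allowed)
    }

    // Seek to the top of the row only when the filter cannot rule the row out.
    boolean needsTopOfRowSeek(String row) {
        return mightContain(row);
    }
}
```

 A false positive only costs one unnecessary seek, so the result stays 
 correct either way.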





[jira] [Commented] (HBASE-4469) Avoid top row seek by looking up bloomfilter

2011-10-06 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122161#comment-13122161
 ] 

Liyin Tang commented on HBASE-4469:
---

Yes, I didn't change the unit test TestBlocksRead, and it passes 
successfully. 


 Avoid top row seek by looking up bloomfilter
 

 Key: HBASE-4469
 URL: https://issues.apache.org/jira/browse/HBASE-4469
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang





[jira] [Commented] (HBASE-4241) Optimize flushing of the Store cache for max versions and (new) min versions

2011-10-03 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13119629#comment-13119629
 ] 

Liyin Tang commented on HBASE-4241:
---

Hi Lars, thanks for your patch. I am trying to backport this feature to 
hbase-89 and I have a quick question :) 

Why do we use CollectionBackedScanner instead of reusing the memstore scanner?

Thanks a lot

 Optimize flushing of the Store cache for max versions and (new) min versions
 

 Key: HBASE-4241
 URL: https://issues.apache.org/jira/browse/HBASE-4241
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.92.0

 Attachments: 4241-v2.txt, 4241-v8.txt, 4241.txt


 As discussed with Jon, there is room for improvement in how the memstore 
 is flushed to disk.
 Currently only expired KVs are pruned before flushing, but we can also prune 
 versions if we find at least maxVersions versions in the memstore.
 The same holds for the new minversion feature: If we find at least minVersion 
 versions in the store we can remove all further versions that are expired.
 Generally we should use the same mechanism here that is used for Compaction. 
 I.e. StoreScanner. We only need to add a scanner to Memstore that can scan 
 along the current snapshot.
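
 The pruning rule described above could look roughly like this for the 
 versions of a single column. This is a sketch under assumptions: the method 
 name and its parameters are invented for illustration, versions are 
 represented only by their timestamps sorted newest first, and 
 oldestUnexpiredTs stands in for the TTL cutoff. It is not the StoreScanner 
 code.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative version pruning for one column at flush time.
class FlushPruneSketch {
    /**
     * @param timestampsNewestFirst versions of one column, newest first
     * @param maxVersions           never keep more than this many versions
     * @param minVersions           keep at least this many even if expired
     * @param oldestUnexpiredTs     timestamps below this value count as expired
     */
    static List<Long> prune(List<Long> timestampsNewestFirst, int maxVersions,
                            int minVersions, long oldestUnexpiredTs) {
        List<Long> kept = new ArrayList<>();
        for (long ts : timestampsNewestFirst) {
            if (kept.size() >= maxVersions) {
                break; // already have maxVersions, drop everything older
            }
            boolean expired = ts < oldestUnexpiredTs;
            if (expired && kept.size() >= minVersions) {
                break; // minVersions satisfied; all older versions are expired too
            }
            kept.add(ts);
        }
        return kept;
    }
}
```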





[jira] [Commented] (HBASE-4522) Make hbase-site-custom.xml override the hbase-site.xml

2011-09-30 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13118284#comment-13118284
 ] 

Liyin Tang commented on HBASE-4522:
---

@Jonathon: It can :) 
That's why I am wondering whether I should open source this change :)

For us, hbase-site.xml works as hbase-default.xml and hbase-site-custom.xml 
works as hbase-site.xml.
That's why we need to make hbase-site-custom.xml override hbase-site.xml.
But in the open source trunk, we don't have hbase-site-custom.xml at all.

 Make hbase-site-custom.xml override the hbase-site.xml
 --

 Key: HBASE-4522
 URL: https://issues.apache.org/jira/browse/HBASE-4522
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Liyin Tang
Priority: Minor
 Fix For: 0.94.0


 The motivation for this diff is that we want to override config settings for 
 a specific cluster easily, just by adding the config entries to the 
 hbase-site-custom.xml for that cluster. This change adds the 
 hbase-site-custom.xml configuration file to HBaseConfiguration.
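
 The override order can be sketched with plain maps. This does not use the 
 Hadoop Configuration class; it only mimics its behaviour that a resource 
 added later overrides values from resources added earlier (unless a property 
 is marked final, which the sketch ignores). The property values below are 
 made up.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Mimics layered config files: later layers win over earlier ones.
class LayeredConfigSketch {
    // layers are given in load order, e.g. default, site, site-custom
    static Map<String, String> resolve(List<Map<String, String>> layers) {
        Map<String, String> effective = new LinkedHashMap<>();
        for (Map<String, String> layer : layers) {
            effective.putAll(layer); // a later layer overwrites earlier values
        }
        return effective;
    }
}
```

 With hbase-site.xml setting hbase.regionserver.handler.count to 10 and 
 hbase-site-custom.xml setting it to 50, the effective value is 50, while 
 keys that appear only in hbase-site.xml keep their values.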





[jira] [Commented] (HBASE-4418) Show all the hbase configuration in the web ui

2011-09-30 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13118542#comment-13118542
 ] 

Liyin Tang commented on HBASE-4418:
---

Thanks stack and Todd.

I have attached a very simple patch here. If the user runs hadoop with 
HADOOP-6408, /conf will show all the hbase configuration.
If the user runs hadoop without HADOOP-6408, nothing will change :)

What do you guys think?


 Show all the hbase configuration in the web ui
 --

 Key: HBASE-4418
 URL: https://issues.apache.org/jira/browse/HBASE-4418
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: HBASE-4418_1.patch


 The motivation is to show ALL the HBase configuration that takes effect at 
 run time in one global place, so we can easily see which configuration is in 
 effect and what each value is.
 The page shows all the HBase and DFS configuration entries from the 
 configuration files and also includes all the HBase default settings that 
 are defined in the code rather than in a config file.





[jira] [Commented] (HBASE-4418) Show all the hbase configuration in the web ui

2011-09-30 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13118544#comment-13118544
 ] 

Liyin Tang commented on HBASE-4418:
---

BTW, I have tested it with hadoop-22 and it works.

 Show all the hbase configuration in the web ui
 --

 Key: HBASE-4418
 URL: https://issues.apache.org/jira/browse/HBASE-4418
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: HBASE-4418_1.patch






[jira] [Commented] (HBASE-4418) Show all the hbase configuration in the web ui

2011-09-30 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13118585#comment-13118585
 ] 

Liyin Tang commented on HBASE-4418:
---

@stack, HBASE-4418_2.patch adds 2 links from the master and region server web 
ui to the configuration page.
If you are going to base hbase-trunk on hadoop-0.23, then HBASE-4418_2.patch 
is better.
Otherwise HBASE-4418_1.patch is better.
Thanks


 Show all the hbase configuration in the web ui
 --

 Key: HBASE-4418
 URL: https://issues.apache.org/jira/browse/HBASE-4418
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: HBASE-4418_1.patch, HBASE-4418_2.patch






[jira] [Commented] (HBASE-4418) Show all the hbase configuration in the web ui

2011-09-29 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13117573#comment-13117573
 ] 

Liyin Tang commented on HBASE-4418:
---

@stack, is it true that all the patches for hbase trunk should be rebased on 
hadoop trunk or hadoop-0.24 (the latest release)? 


 Show all the hbase configuration in the web ui
 --

 Key: HBASE-4418
 URL: https://issues.apache.org/jira/browse/HBASE-4418
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang





[jira] [Commented] (HBASE-4418) Show all the hbase configuration in the web ui

2011-09-29 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13117874#comment-13117874
 ] 

Liyin Tang commented on HBASE-4418:
---

@Todd, I have created HADOOP-7702, which will show all the default 
configuration values in the /conf servlet.
So HBase can reuse them for free. 
Can you assign that jira to me? 
Thanks 

 Show all the hbase configuration in the web ui
 --

 Key: HBASE-4418
 URL: https://issues.apache.org/jira/browse/HBASE-4418
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang





[jira] [Commented] (HBASE-4491) HBase Locality Checker

2011-09-26 Thread Liyin Tang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13115025#comment-13115025
 ] 

Liyin Tang commented on HBASE-4491:
---

@Ted: Yes, it looks like it is covered by HBASE-4191. I can follow up on HBASE-4191.

 HBase Locality Checker
 --

 Key: HBASE-4491
 URL: https://issues.apache.org/jira/browse/HBASE-4491
 Project: HBase
  Issue Type: New Feature
Reporter: Liyin Tang
Assignee: Liyin Tang

 If we run the data node and the region server on the same physical machine, 
 the region server benefits when the store files for the regions it serves 
 have a local replica in the data node process.
 So for each region, there exists a best locality region server, which has 
 the most local blocks for this region.
 The HBase Locality Checker will show how many regions are running on their 
 best locality region server. 
 The higher the number is, the more performance benefit HBase can get from 
 data locality.
 There would also be a follow-up task to use this region locality information 
 for region assignment. The assignment manager will prefer to assign each 
 region to its best locality region server.
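
 A rough sketch of the number the checker would report, assuming the local 
 block counts per region and server have already been collected. The class 
 name, the map layout, and the tie handling (a region counts as well placed 
 when its current server ties for the most local blocks) are choices made for 
 this example only.

```java
import java.util.Collections;
import java.util.Map;

// Counts regions that currently run on a server holding the most local blocks.
class LocalityCheckerSketch {
    /**
     * @param localBlocks region name -> (region server -> local block count)
     * @param assignment  region name -> region server currently serving it
     */
    static int regionsOnBestLocalityServer(
            Map<String, Map<String, Integer>> localBlocks,
            Map<String, String> assignment) {
        int wellPlaced = 0;
        for (Map.Entry<String, String> entry : assignment.entrySet()) {
            Map<String, Integer> perServer = localBlocks.getOrDefault(
                    entry.getKey(), Collections.<String, Integer>emptyMap());
            int best = 0;
            for (int blocks : perServer.values()) {
                best = Math.max(best, blocks);
            }
            int current = perServer.getOrDefault(entry.getValue(), 0);
            // A region with no local blocks anywhere is not counted.
            if (best > 0 && current == best) {
                wellPlaced++;
            }
        }
        return wellPlaced;
    }
}
```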
