[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams

2012-01-26 Thread Teruyoshi Zenmyo (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194526#comment-13194526
 ] 

Teruyoshi Zenmyo commented on HBASE-3134:
-

Thanks for the feedback.
{quote}
The difference to \{add|remove\}Peer is that the ReplicationSource still keeps 
track of the logs to be transferred and just temporarily disables shipment?
{quote}
Yes.

{quote}
Maybe we can show the state via ReplicationZookeeper.listPeers, which is used 
by the shell to show all peers.
{quote}

Should I do this in this ticket or file another issue?

{quote}
Personally I like enablePeer() and disablePeer(), because it is entirely clear 
what they are doing.
{quote}

I think so, too. I'd like to extract a method to remove the duplicated parts.



> [replication] Add the ability to enable/disable streams
> ---
>
> Key: HBASE-3134
> URL: https://issues.apache.org/jira/browse/HBASE-3134
> Project: HBase
>  Issue Type: New Feature
>  Components: replication
>Reporter: Jean-Daniel Cryans
>Assignee: Teruyoshi Zenmyo
>Priority: Minor
>  Labels: replication
> Fix For: 0.94.0
>
> Attachments: HBASE-3134.patch
>
>
> This jira was initially in the scope of HBASE-2201, but was pushed out since 
> it has low value compared to the required effort (and we want to ship 
> 0.90.0 rather soonish).
> We need to design a way to enable/disable replication streams in a 
> determinate fashion.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Jieshan Bean (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194519#comment-13194519
 ] 

Jieshan Bean commented on HBASE-5153:
-

@Lars:
During the retry, if we get any exception, ZooKeeper and the trackers also 
need to be closed.
{noformat}
+  try {
+    setupZookeeperTrackers();
+    break;
+  } catch (ZooKeeperConnectionException zkce) {
+    if (tries >= this.numRetries) {
+      throw zkce;
+    }
+  }
{noformat}
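
One way to read the point above: every failed attempt should release whatever it opened before the next try. The following is only a minimal, self-contained sketch of that pattern. `Trackers`, `setup` and `connectWithRetries` are hypothetical stand-ins invented for illustration, not HBase APIs, and the actual patch may look different.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.function.Supplier;

public class RetryWithCleanup {
  /** Hypothetical stand-in for the ZooKeeper watcher plus its trackers. */
  interface Trackers extends Closeable {
    void setup() throws IOException;
  }

  /**
   * Tries to set up a fresh resource up to numRetries times. A resource whose
   * setup failed is closed before the next attempt, so nothing is left open.
   */
  static <T extends Trackers> T connectWithRetries(Supplier<T> factory, int numRetries)
      throws IOException {
    IOException last = null;
    for (int tries = 1; tries <= numRetries; tries++) {
      T candidate = factory.get();
      try {
        candidate.setup();
        return candidate;
      } catch (IOException e) {
        last = e;
        try {
          candidate.close(); // release the half-initialized resource
        } catch (IOException closeFailure) {
          e.addSuppressed(closeFailure); // keep the original failure as the cause
        }
      }
    }
    throw last != null ? last : new IOException("numRetries must be positive");
  }

  public static void main(String[] args) throws IOException {
    final int[] attempts = {0};
    // Simulated resource that fails on the first two attempts.
    Trackers t = RetryWithCleanup.<Trackers>connectWithRetries(() -> new Trackers() {
      final int id = ++attempts[0];
      @Override public void setup() throws IOException {
        if (id < 3) throw new IOException("simulated connection loss, attempt " + id);
      }
      @Override public void close() {
        System.out.println("closed attempt " + id);
      }
    }, 5);
    System.out.println("connected on attempt " + attempts[0]);
    t.close();
  }
}
```

The sketch rethrows the last failure once the retries are exhausted, which mirrors the `tries >= this.numRetries` check in the quoted diff.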

> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> ---
>
> Key: HBASE-5153
> URL: https://issues.apache.org/jira/browse/HBASE-5153
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5153-92.txt, 5153-trunk-v2.txt, 5153-trunk.txt, 
> 5153-trunk.txt, HBASE-5153-V2.patch, HBASE-5153-V3.patch, 
> HBASE-5153-V4-90.patch, HBASE-5153-V5-90.patch, 
> HBASE-5153-V6-90-minorchange.patch, HBASE-5153-V6-90.txt, 
> HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, HBASE-5153.patch, 
> TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. From that issue we know that if multiple 
> threads share the same connection and the connection is aborted in one thread, 
> the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> HBASE-4893 solves the "stale connection can't be removed" problem, but the 
> original HTable instance can no longer be used. The connection in HTable 
> should be recreated.
> There are two approaches to solving this:
> 1. In user code, on catching an IOE, close the connection and re-create the 
> HTable instance. This can be used as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.





[jira] [Commented] (HBASE-4991) Provide capability to delete named region

2012-01-26 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194518#comment-13194518
 ] 

Zhihong Yu commented on HBASE-4991:
---

Good point.
As I outlined @ 26/Jan/12 23:38, we don't need to create an empty region. 
Therefore we don't need to merge regions.
For successive regions R1, R2 and R3, if we delete R2, we can change the end 
key of R1 to be the original end key of R2 and drop region R2 directly.
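
To make the key-range bookkeeping concrete, here is a small, self-contained sketch. The `Region` class and `dropRegion` method are hypothetical simplifications invented for illustration (string keys, an in-memory list); they do not touch META or any HBase API.

```java
import java.util.ArrayList;
import java.util.List;

public class DropRegionSketch {
  /** Hypothetical, simplified region: a name and a [startKey, endKey) range. */
  static final class Region {
    final String name;
    final String startKey;
    final String endKey;

    Region(String name, String startKey, String endKey) {
      this.name = name;
      this.startKey = startKey;
      this.endKey = endKey;
    }
  }

  /**
   * Returns a new list without the region at index i. The predecessor's end key
   * is extended to the dropped region's end key, so the key space stays contiguous.
   */
  static List<Region> dropRegion(List<Region> regions, int i) {
    if (i <= 0 || i >= regions.size()) {
      throw new IllegalArgumentException("sketch only handles a region with a predecessor");
    }
    List<Region> out = new ArrayList<>(regions);
    Region prev = out.get(i - 1);
    Region victim = out.get(i);
    out.set(i - 1, new Region(prev.name, prev.startKey, victim.endKey));
    out.remove(i);
    return out;
  }

  public static void main(String[] args) {
    List<Region> regions = new ArrayList<>();
    regions.add(new Region("R1", "a", "h"));
    regions.add(new Region("R2", "h", "p"));
    regions.add(new Region("R3", "p", "z"));
    for (Region r : dropRegion(regions, 1)) {
      System.out.println(r.name + " [" + r.startKey + ", " + r.endKey + ")");
    }
  }
}
```

After dropping R2, R1 covers the range that R1 and R2 covered together, and R3 is untouched. The first region of a table has no predecessor; the sketch simply rejects that case.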

> Provide capability to delete named region
> -
>
> Key: HBASE-4991
> URL: https://issues.apache.org/jira/browse/HBASE-4991
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>
> See discussion titled 'Able to control routing to Solr shards or not' on 
> lily-discuss
> Users may want to quickly dispose of out-of-date records by deleting specific 
> regions. 





[jira] [Commented] (HBASE-4991) Provide capability to delete named region

2012-01-26 Thread Mubarak Seyed (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194517#comment-13194517
 ] 

Mubarak Seyed commented on HBASE-4991:
--

I think we can make use of the following:

{code}
protected boolean merge(final HRegionInfo[] info) throws IOException {
  if ((currentSize + nextSize) <= (maxFilesize / 2)) {
    // We merge two adjacent regions if their total size is less than
    // one half of the desired maximum size
    LOG.info("Merging regions " + currentRegion.getRegionNameAsString() +
        " and " + nextRegion.getRegionNameAsString());
    HRegion mergedRegion =
        HRegion.mergeAdjacent(currentRegion, nextRegion);
    updateMeta(currentRegion.getRegionName(), nextRegion.getRegionName(),
        mergedRegion);
    break;
  }
}
{code}

What happens if previous_region_size + next_region_size > maxFilesize 
when we try to merge adjacent regions (to bridge the hole)?

> Provide capability to delete named region
> -
>
> Key: HBASE-4991
> URL: https://issues.apache.org/jira/browse/HBASE-4991
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>
> See discussion titled 'Able to control routing to Solr shards or not' on 
> lily-discuss
> Users may want to quickly dispose of out-of-date records by deleting specific 
> regions. 





[jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194513#comment-13194513
 ] 

Phabricator commented on HBASE-5259:


gqchen has commented on the revision "[jira][HBASE-5259] Normalize the 
RegionLocation in TableInputFormat by the reverse DNS lookup.".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:89 
I guess this is more like a personal preference. For non-trivial data 
structures, I personally find it helpful to have the type in the variable name, 
so that when you read the code where the variable is being used, you don't have 
to guess.

REVISION DETAIL
  https://reviews.facebook.net/D1413


> Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
> ---
>
> Key: HBASE-5259
> URL: https://issues.apache.org/jira/browse/HBASE-5259
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
> D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, 
> D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch
>
>
> Assuming HBase and MapReduce run in the same cluster, 
> TableInputFormat overrides the split function, which divides all the 
> regions of a particular table into a series of mapper tasks, so that each 
> mapper task processes a region or part of a region. Ideally, a mapper 
> task should run on the same machine as the region server hosting the 
> corresponding region. That is why TableInputFormat sets the RegionLocation: 
> so that the MapReduce framework can respect node locality. 
> The code simply sets the host name of the region server as the 
> HRegionLocation. However, the host name of the region server may have a 
> different format from the host name of the task tracker (mapper task). The 
> task tracker always gets its host name by reverse DNS lookup, and the DNS 
> service may return a different host name format. For example, the host name of 
> the region server is correctly set as a.b.c.d while the reverse DNS lookup 
> may return a.b.c.d. (with an additional dot at the end).
> So the solution is to set the RegionLocation by reverse DNS lookup as 
> well. No matter what host name format the DNS system uses, 
> TableInputFormat is responsible for keeping the host name format 
> consistent with the MapReduce framework.
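
The idea in the description can be sketched roughly as follows. This is an assumption-laden illustration, not the D1413 patch: `regionLocation` is a hypothetical helper, and the reverse lookup is injected as a function so the example runs without network access. In real code the lookup could be backed by something like `java.net.InetAddress#getCanonicalHostName`.

```java
import java.util.function.Function;

public class HostNameNormalizer {
  /**
   * Returns the host name to report as the split location. The name is taken
   * from the same kind of reverse DNS lookup the task tracker uses; if the
   * lookup fails or returns nothing, the configured host name is used instead.
   */
  static String regionLocation(String ipAddress, String configuredHost,
      Function<String, String> reverseDns) {
    try {
      String resolved = reverseDns.apply(ipAddress);
      if (resolved != null && !resolved.isEmpty()) {
        return resolved;
      }
    } catch (RuntimeException e) {
      // Lookup failed: fall through to the configured host name.
    }
    return configuredHost;
  }

  public static void main(String[] args) {
    // Simulated DNS that answers with a trailing dot, as in the description.
    Function<String, String> fakeDns = ip -> "a.b.c.d.";
    System.out.println(regionLocation("10.0.0.1", "a.b.c.d", fakeDns));
  }
}
```

With the simulated lookup, the reported location carries the trailing dot, so it matches what a task tracker using the same DNS would report about itself.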





[jira] [Commented] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194502#comment-13194502
 ] 

Hudson commented on HBASE-5271:
---

Integrated in HBase-TRUNK-security #92 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/92/])
HBASE-5271  Result.getValue and Result.getColumnLatest return the wrong 
column (Ghais Issa)

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java


> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
>Assignee: Ghais Issa
> Fix For: 0.94.0, 0.90.7, 0.92.1
>
> Attachments: 5271-90.txt, 5271-v2.txt, 
> fixKeyValueMatchingColumn.diff, testGetValue.diff
>
>
> In the following example, result.getValue returns the wrong column:
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2")))); // prints 7





[jira] [Commented] (HBASE-5274) Filter out the expired store file scanner during the compaction

2012-01-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194501#comment-13194501
 ] 

Hudson commented on HBASE-5274:
---

Integrated in HBase-TRUNK-security #92 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/92/])
[jira] [HBASE-5274] Filter out expired scanners on compaction as well

Summary: This is a followup for D1017 to make it similar to D909 (89-fb). The
fix for 89-fb used the TTL-based scanner filtering logic on both normal scanners
and compactions, while the trunk fix D1017 did not. This is just the delta
between the two diffs that brings filtering expired store files on compaction to
trunk.

Test Plan: Unit tests

Reviewers: Liyin, JIRA, lhofhansl, Kannan

Reviewed By: Liyin

CC: Liyin, tedyu, Kannan, mbautin, lhofhansl

Differential Revision: https://reviews.facebook.net/D1473

mbautin : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestScannerSelectionUsingTTL.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java


> Filter out the expired store file scanner during the compaction
> ---
>
> Key: HBASE-5274
> URL: https://issues.apache.org/jira/browse/HBASE-5274
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Mikhail Bautin
> Attachments: D1407.1.patch, D1407.1.patch, D1407.1.patch, 
> D1407.1.patch, D1407.1.patch, D1473.1.patch
>
>
> During compaction, HBase generates a store scanner that scans a list of 
> store files. It would be more efficient to filter out the expired store 
> files, since there is no need to read any key values from them.
> This optimization has already been implemented on 89-fb and is a 
> building block for HBASE-5199 as well. Compacting expired store files is 
> supposed to be a no-op.
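
As a rough illustration of the filtering step described above: a file can be skipped when even its newest cell is older than the TTL cutoff. The sketch below is not the D1473 change. `FileInfo` and `selectUnexpired` are hypothetical names, and a store file is reduced to its name and newest cell timestamp.

```java
import java.util.ArrayList;
import java.util.List;

public class ExpiredFileFilter {
  /** Hypothetical stand-in for a store file; only its newest cell timestamp matters here. */
  static final class FileInfo {
    final String name;
    final long maxTimestamp;

    FileInfo(String name, long maxTimestamp) {
      this.name = name;
      this.maxTimestamp = maxTimestamp;
    }
  }

  /** Keeps only files that may still hold unexpired cells (maxTimestamp >= now - ttl). */
  static List<FileInfo> selectUnexpired(List<FileInfo> files, long nowMs, long ttlMs) {
    long oldestAllowed = nowMs - ttlMs;
    List<FileInfo> kept = new ArrayList<>();
    for (FileInfo f : files) {
      if (f.maxTimestamp >= oldestAllowed) {
        kept.add(f);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<FileInfo> files = new ArrayList<>();
    files.add(new FileInfo("old", 100L));
    files.add(new FileInfo("recent", 900L));
    for (FileInfo f : selectUnexpired(files, 1000L, 500L)) {
      System.out.println("will scan " + f.name);
    }
  }
}
```

Only the files that survive the filter would get a scanner, for normal reads and for compactions alike.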





[jira] [Commented] (HBASE-5282) Possible file handle leak with truncated HLog file.

2012-01-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194500#comment-13194500
 ] 

Hudson commented on HBASE-5282:
---

Integrated in HBase-TRUNK-security #92 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/92/])
HBASE-5282 Possible file handle leak with truncated HLog file

jmhsieh : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java


> Possible file handle leak with truncated HLog file.
> ---
>
> Key: HBASE-5282
> URL: https://issues.apache.org/jira/browse/HBASE-5282
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0, 0.90.5, 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.94.0, 0.92.1
>
> Attachments: hbase-5282.patch, hbase-5282.v2.patch
>
>
> While debugging hbck, I found that the code responsible for this exception 
> can leak open file handles.
> {code}
> 12/01/15 05:58:11 INFO regionserver.HRegion: Replaying edits from hdfs://haus01.sf.cloudera.com:56020/hbase-jon/test5/98a1e7255731aae44b3836641840113e/recovered.edits/3211315; minSequenceid=3214658
> 12/01/15 05:58:11 ERROR handler.OpenRegionHandler: Failed open of region=test5,8\x90\x00\x00\x00\x00\x00\x00/05_0,1326597390073.98a1e7255731aae44b3836641840113e.
> java.io.EOFException
>     at java.io.DataInputStream.readByte(DataInputStream.java:250)
>     at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:299)
>     at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:320)
>     at org.apache.hadoop.io.Text.readString(Text.java:400)
>     at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1486)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1437)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:57)
>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:158)
>     at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:572)
>     at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:1940)
>     at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:1896)
>     at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:366)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2661)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2647)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:312)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:99)
>     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:619)
> {code}
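
The trace shows an EOFException thrown while a reader is still being constructed, after the underlying stream has already been opened. A generic way to avoid leaking the handle in such a case is to close the stream before propagating the failure. The sketch below only illustrates that pattern with plain `java.io` classes. `openAndReadHeader` is a hypothetical helper and is not taken from the attached patches.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CloseOnFailedInit {
  /**
   * Wraps the raw stream and reads one header byte. If the header cannot be
   * read (for example because the file is truncated), the stream is closed
   * before the exception is rethrown, so the handle is not leaked.
   */
  static DataInputStream openAndReadHeader(InputStream raw) throws IOException {
    DataInputStream in = new DataInputStream(raw);
    try {
      in.readByte(); // throws EOFException on an empty or truncated stream
      return in;
    } catch (IOException e) {
      try {
        in.close(); // also closes the wrapped raw stream
      } catch (IOException closeFailure) {
        e.addSuppressed(closeFailure);
      }
      throw e;
    }
  }

  public static void main(String[] args) {
    try {
      openAndReadHeader(new ByteArrayInputStream(new byte[0]));
    } catch (IOException e) {
      System.out.println("header read failed, stream already closed: " + e);
    }
  }
}
```

On success the caller owns the returned stream and is responsible for closing it; only the failure path cleans up inside the helper.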





[jira] [Commented] (HBASE-4909) Detailed Block Cache Metrics

2012-01-26 Thread Otis Gospodnetic (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194499#comment-13194499
 ] 

Otis Gospodnetic commented on HBASE-4909:
-

Yes please, +1 fo mo metrix!

> Detailed Block Cache Metrics
> 
>
> Key: HBASE-4909
> URL: https://issues.apache.org/jira/browse/HBASE-4909
> Project: HBase
>  Issue Type: Sub-task
>  Components: client, regionserver
>Reporter: Nicolas Spiegelberg
> Fix For: 0.94.0
>
>






[jira] [Commented] (HBASE-4422) Move block cache parameters and references into single CacheConf class

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194497#comment-13194497
 ] 

Phabricator commented on HBASE-4422:


Kannan has commented on the revision "[jira] [HBASE-4422] [89-fb] Move block 
cache parameters and references into single CacheConfig class".

  looks great. Only one question/comment inline..

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java:294 here, 
we have gone from using "cacheBlock" to cacheConf.shouldCacheDataOnRead() based 
check, but in line 330 below, we are using both. Why is that?

REVISION DETAIL
  https://reviews.facebook.net/D1341


> Move block cache parameters and references into single CacheConf class
> --
>
> Key: HBASE-4422
> URL: https://issues.apache.org/jira/browse/HBASE-4422
> Project: HBase
>  Issue Type: Improvement
>  Components: io
>Reporter: Jonathan Gray
>Assignee: Jonathan Gray
> Fix For: 0.92.0
>
> Attachments: CacheConfig92-v8.patch, D1341.1.patch, 
> HBASE-4422-FINAL-branch92.patch, HBASE-4422-FINAL-trunk.patch
>
>
> From StoreFile down to HFile, we currently use a boolean argument for each of 
> the various block cache configuration parameters that exist.  The number of 
> parameters is going to continue to increase as we look at compressed cache, 
> delta encoding, and more specific L1/L2 configuration.  Every new config 
> currently requires changing many constructors because it introduces a new 
> boolean.
> We should move everything into a single class so that modifications are much 
> less disruptive.





[jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194494#comment-13194494
 ] 

Phabricator commented on HBASE-5259:


Kannan has commented on the revision "[jira][HBASE-5259] Normalize the 
RegionLocation in TableInputFormat by the reverse DNS lookup.".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:198 
Ted: If you are trying to optimize for the performance of this error case which 
isn't supposed to happen, I don't think it is really worth it. Furthermore, the 
defaulting logic of falling back to the old style hostname is in the caller.

REVISION DETAIL
  https://reviews.facebook.net/D1413


> Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
> ---
>
> Key: HBASE-5259
> URL: https://issues.apache.org/jira/browse/HBASE-5259
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
> D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, 
> D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch
>
>
> Assuming HBase and MapReduce run in the same cluster, the 
> TableInputFormat overrides the split function, which divides all the 
> regions of one particular table into a series of mapper tasks, so that each 
> mapper task can process a region or one part of a region. Ideally, a mapper 
> task should run on the same machine as the region server that hosts the 
> corresponding region. That is why the TableInputFormat sets 
> the RegionLocation: so that the MapReduce framework can respect node 
> locality. 
> The code simply sets the host name of the region server as the 
> HRegionLocation. However, the host name of the region server may have a 
> different format from the host name of the task tracker (mapper task). The 
> task tracker always gets its host name by reverse DNS lookup, and the DNS 
> service may return a different host name format. For example, the host name of 
> the region server is correctly set as a.b.c.d while the reverse DNS lookup 
> may return a.b.c.d. (with an additional dot at the end).
> So the solution is to set the RegionLocation by reverse DNS lookup as 
> well. No matter what host name format the DNS system uses, the 
> TableInputFormat is responsible for keeping its host name 
> format consistent with the MapReduce framework.
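
The idea in the description above can be sketched roughly as follows. This is
only an illustration, not the code from D1413: the class RegionLocationNormalizer
and the ReverseLookup interface are hypothetical names, and the JDK call
InetAddress.getCanonicalHostName() is used as a stand-in for whatever reverse
DNS utility the patch actually calls, so the returned format may differ from
what the task tracker sees. When the lookup fails, the sketch falls back to the
old-style host name.

```java
import java.net.InetAddress;

/** Hypothetical helper, not part of HBase: picks the host name to use as the split location. */
final class RegionLocationNormalizer {

    /** Hypothetical abstraction so the DNS call can be swapped out or faked. */
    interface ReverseLookup {
        String reverse(String hostName) throws Exception;
    }

    /** Stand-in lookup using the JDK resolver (an assumption, not the patch's call). */
    static final ReverseLookup JDK_LOOKUP =
        hostName -> InetAddress.getByName(hostName).getCanonicalHostName();

    /**
     * Returns the name reported by the reverse lookup, or the region server's
     * own host name if the lookup fails or returns nothing.
     */
    static String normalize(String regionServerHost, ReverseLookup lookup) {
        try {
            String resolved = lookup.reverse(regionServerHost);
            if (resolved == null || resolved.isEmpty()) {
                return regionServerHost;
            }
            return resolved;
        } catch (Exception e) {
            // Fall back to the old-style host name when the lookup fails.
            return regionServerHost;
        }
    }
}
```

A caller would pass RegionLocationNormalizer.JDK_LOOKUP in production and a
fake lookup in tests, which keeps the sketch independent of the network.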

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194491#comment-13194491
 ] 

Phabricator commented on HBASE-5259:


tedyu has commented on the revision "[jira][HBASE-5259] Normalize the 
RegionLocation in TableInputFormat by the reverse DNS lookup.".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:89 
Map is reflected in the type of this field.
  My suggestion was only for your reference.
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:198 
If NamingException is thrown out of line 201, line 202 would be skipped.
  Line 169 might be executed multiple times because regionServerAddress across 
multiple iterations may carry the same (unresolvable) value.

  Correct me if I am wrong.
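
  To make the concern concrete, here is a minimal sketch of a cache that
remembers failed lookups as well as successful ones, so the same unresolvable
address is looked up only once. ReverseDnsCache and ReverseLookup are
hypothetical names, not code from the revision, and whether the optimization
is worth having is a separate question.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical cache, not part of HBase: one reverse lookup per distinct host. */
final class ReverseDnsCache {

    /** Hypothetical abstraction over the reverse DNS call. */
    interface ReverseLookup {
        String reverse(String hostName) throws Exception;
    }

    private final Map<String, String> cache = new HashMap<>();
    private final ReverseLookup lookup;
    private int lookups = 0;

    ReverseDnsCache(ReverseLookup lookup) {
        this.lookup = lookup;
    }

    /** Returns the cached name, performing the lookup only on the first request. */
    String resolve(String regionServerHost) {
        String cached = cache.get(regionServerHost);
        if (cached != null) {
            return cached;
        }
        String result;
        try {
            lookups++;
            result = lookup.reverse(regionServerHost);
        } catch (Exception e) {
            result = null;
        }
        if (result == null) {
            // Cache the fallback too, so a failing host is not retried.
            result = regionServerHost;
        }
        cache.put(regionServerHost, result);
        return result;
    }

    /** Number of lookups actually sent to the resolver. */
    int lookupCount() {
        return lookups;
    }
}
```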

REVISION DETAIL
  https://reviews.facebook.net/D1413






[jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194482#comment-13194482
 ] 

Phabricator commented on HBASE-5259:


Kannan has accepted the revision "[jira][HBASE-5259] Normalize the 
RegionLocation in TableInputFormat by the reverse DNS lookup.".

  Liyin -- looks good to me. One minor suggestion inlined.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:171 
Logging here is unnecessary because of the logging at line 191. The "split" 
(TableSplit's toString() method) will already print the regionLocation along 
with the start/stop keys for each map task.

REVISION DETAIL
  https://reviews.facebook.net/D1413






[jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194481#comment-13194481
 ] 

Phabricator commented on HBASE-5259:


Kannan has accepted the revision "[jira][HBASE-5259] Normalize the 
RegionLocation in TableInputFormat by the reverse DNS lookup.".

  Liyin -- looks good to me. One minor suggestion inlined.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:171 
logging here is unnecessary because of logging in line 191. The "split" 
(TableSplit's toString() method already will print the regionLocation along 
with the start/stop keys for each map task.

REVISION DETAIL
  https://reviews.facebook.net/D1413


> Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
> ---
>
> Key: HBASE-5259
> URL: https://issues.apache.org/jira/browse/HBASE-5259
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
> D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, 
> D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch
>
>
> Assuming the HBase and MapReduce running in the same cluster, the 
> TableInputFormat is to override the split function which divides all the 
> regions from one particular table into a series of mapper tasks. So each 
> mapper task can process a region or one part of a region. Ideally, the mapper 
> task should run on the same machine on which the region server hosts the 
> corresponding region. That's the motivation that the TableInputFormat sets 
> the RegionLocation so that the MapReduce framework can respect the node 
> locality. 
> The code simply set the host name of the region server as the 
> HRegionLocation. However, the host name of the region server may have 
> different format with the host name of the task tracker (Mapper task). The 
> task tracker always gets its hostname by the reverse DNS lookup. And the DNS 
> service may return different host name format. For example, the host name of 
> the region server is correctly set as a.b.c.d while the reverse DNS lookup 
> may return a.b.c.d. (With an additional doc in the end).
> So the solution is to set the RegionLocation by the reverse DNS lookup as 
> well. No matter what host name format the DNS system is using, the 
> TableInputFormat has the responsibility to keep the consistent host name 
> format with the MapReduce framework.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194483#comment-13194483
 ] 

Phabricator commented on HBASE-5259:


Kannan has accepted the revision "[jira][HBASE-5259] Normalize the 
RegionLocation in TableInputFormat by the reverse DNS lookup.".

  Liyin -- looks good to me. One minor suggestion inlined.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:171 
logging here is unnecessary because of logging in line 191. The "split" 
(TableSplit's toString() method already will print the regionLocation along 
with the start/stop keys for each map task.

REVISION DETAIL
  https://reviews.facebook.net/D1413


> Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
> ---
>
> Key: HBASE-5259
> URL: https://issues.apache.org/jira/browse/HBASE-5259
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
> D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, 
> D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch
>
>
> Assuming the HBase and MapReduce running in the same cluster, the 
> TableInputFormat is to override the split function which divides all the 
> regions from one particular table into a series of mapper tasks. So each 
> mapper task can process a region or one part of a region. Ideally, the mapper 
> task should run on the same machine on which the region server hosts the 
> corresponding region. That's the motivation that the TableInputFormat sets 
> the RegionLocation so that the MapReduce framework can respect the node 
> locality. 
> The code simply set the host name of the region server as the 
> HRegionLocation. However, the host name of the region server may have 
> different format with the host name of the task tracker (Mapper task). The 
> task tracker always gets its hostname by the reverse DNS lookup. And the DNS 
> service may return different host name format. For example, the host name of 
> the region server is correctly set as a.b.c.d while the reverse DNS lookup 
> may return a.b.c.d. (With an additional doc in the end).
> So the solution is to set the RegionLocation by the reverse DNS lookup as 
> well. No matter what host name format the DNS system is using, the 
> TableInputFormat has the responsibility to keep the consistent host name 
> format with the MapReduce framework.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194480#comment-13194480
 ] 

Phabricator commented on HBASE-5259:


Kannan has accepted the revision "[jira][HBASE-5259] Normalize the 
RegionLocation in TableInputFormat by the reverse DNS lookup.".

  Liyin -- looks good to me. One minor suggestion inlined.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:171 
logging here is unnecessary because of logging in line 191. The "split" 
(TableSplit's toString() method already will print the regionLocation along 
with the start/stop keys for each map task.

REVISION DETAIL
  https://reviews.facebook.net/D1413


> Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
> ---
>
> Key: HBASE-5259
> URL: https://issues.apache.org/jira/browse/HBASE-5259
> Project: HBase
> Issue Type: Improvement
> Reporter: Liyin Tang
> Assignee: Liyin Tang
> Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
> D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, 
> D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch
>
>
> Assuming HBase and MapReduce are running in the same cluster, the
> TableInputFormat overrides the split function, which divides all the
> regions of one particular table into a series of mapper tasks, so that each
> mapper task processes a region or one part of a region. Ideally, the mapper
> task should run on the same machine as the region server that hosts the
> corresponding region. That is why the TableInputFormat sets the
> RegionLocation: so that the MapReduce framework can respect node
> locality.
> The code simply sets the host name of the region server as the
> HRegionLocation. However, the host name of the region server may have
> a different format from the host name of the task tracker (Mapper task). The
> task tracker always gets its host name by reverse DNS lookup, and the DNS
> service may return a different host name format. For example, the host name of
> the region server is correctly set as a.b.c.d while the reverse DNS lookup
> may return a.b.c.d. (with an additional dot at the end).
> So the solution is to set the RegionLocation by reverse DNS lookup as
> well. No matter what host name format the DNS system uses, the
> TableInputFormat is responsible for keeping its host name format
> consistent with the MapReduce framework.
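
The mismatch described in the issue can be illustrated with a small sketch. 
This is not code from the patch: the patch resolves names through reverse DNS, 
whereas the hypothetical helper stripTrailingDot below only shows why a plain 
string comparison of the two host name formats fails. The class and method 
names are assumptions made for illustration.

```java
import java.util.Locale;

public class HostNameFormats {
    // Hypothetical helper: lower-cases the name and drops one trailing dot
    // (the DNS root label), so both formats compare equal.
    static String stripTrailingDot(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        return h.endsWith(".") ? h.substring(0, h.length() - 1) : h;
    }

    public static void main(String[] args) {
        String regionServerHost = "a.b.c.d";   // name as set by the region server
        String taskTrackerHost = "a.b.c.d.";   // name as returned by reverse DNS

        // A locality check based on plain equality misses the match.
        System.out.println(regionServerHost.equals(taskTrackerHost)); // prints false

        // Once both sides use the same format, the match is found.
        System.out.println(stripTrailingDot(regionServerHost)
                .equals(stripTrailingDot(taskTrackerHost))); // prints true
    }
}
```

Resolving both names through the same reverse DNS lookup, as the issue 
proposes, avoids having to guess which formats a DNS setup may produce.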





[jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194475#comment-13194475
 ] 

Phabricator commented on HBASE-5259:


Liyin has commented on the revision "[jira][HBASE-5259] Normalize the 
RegionLocation in TableInputFormat by the reverse DNS lookup.".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:89 
@tedyu, I prefer to name this variable reverseDNSCacheMap. Do you have any 
specific reason for changing it to reverseDNSCache?
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:198 
@tedyu, This function is a wrapper around DNS.reverseDNS that also 
provides caching, as you suggested.
  However, I believe this function should keep the same behavior as 
DNS.reverseDNS, including throwing NamingException.
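
A minimal sketch of such a caching wrapper follows, assuming a HashMap keyed 
by InetAddress. It is not the code under review: the ReverseResolver interface 
is a hypothetical stand-in for the real lookup so the example runs without 
Hadoop on the classpath, and the class name is invented for illustration. Only 
the reverseDNSCacheMap name comes from the comment above.

```java
import java.net.InetAddress;
import java.util.HashMap;
import java.util.Map;
import javax.naming.NamingException;

public class CachedReverseDns {
    /** Hypothetical stand-in for the real reverse DNS lookup. */
    interface ReverseResolver {
        String resolve(InetAddress address) throws NamingException;
    }

    private final Map<InetAddress, String> reverseDNSCacheMap = new HashMap<>();
    private final ReverseResolver resolver;

    CachedReverseDns(ReverseResolver resolver) {
        this.resolver = resolver;
    }

    /** Same contract as the wrapped lookup: NamingException is propagated. */
    String reverseDNS(InetAddress address) throws NamingException {
        String host = reverseDNSCacheMap.get(address);
        if (host == null) {
            host = resolver.resolve(address); // nothing is cached if this throws
            reverseDNSCacheMap.put(address, host);
        }
        return host;
    }

    /** Demo: two lookups of the same address reach the resolver once. */
    static int countResolverCalls() {
        int[] calls = {0};
        CachedReverseDns dns = new CachedReverseDns(addr -> {
            calls[0]++;
            return "a.b.c.d.";
        });
        try {
            // getByAddress builds the address from raw bytes without a DNS query.
            InetAddress addr = InetAddress.getByAddress(new byte[] {10, 0, 0, 1});
            dns.reverseDNS(addr);
            dns.reverseDNS(addr);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return calls[0];
    }

    public static void main(String[] args) {
        System.out.println("resolver calls: " + countResolverCalls());
    }
}
```

Because the exception is rethrown rather than swallowed, callers see the same 
failure they would see from the uncached lookup, which is the behavior argued 
for in the comment.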

REVISION DETAIL
  https://reviews.facebook.net/D1413


> Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
> ---
>
> Key: HBASE-5259
> URL: https://issues.apache.org/jira/browse/HBASE-5259
> Project: HBase
> Issue Type: Improvement
> Reporter: Liyin Tang
> Assignee: Liyin Tang
> Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
> D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, 
> D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch
>
>
> Assuming HBase and MapReduce are running in the same cluster, the
> TableInputFormat overrides the split function, which divides all the
> regions of one particular table into a series of mapper tasks, so that each
> mapper task processes a region or one part of a region. Ideally, the mapper
> task should run on the same machine as the region server that hosts the
> corresponding region. That is why the TableInputFormat sets the
> RegionLocation: so that the MapReduce framework can respect node
> locality.
> The code simply sets the host name of the region server as the
> HRegionLocation. However, the host name of the region server may have
> a different format from the host name of the task tracker (Mapper task). The
> task tracker always gets its host name by reverse DNS lookup, and the DNS
> service may return a different host name format. For example, the host name of
> the region server is correctly set as a.b.c.d while the reverse DNS lookup
> may return a.b.c.d. (with an additional dot at the end).
> So the solution is to set the RegionLocation by reverse DNS lookup as
> well. No matter what host name format the DNS system uses, the
> TableInputFormat is responsible for keeping its host name format
> consistent with the MapReduce framework.





[jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194470#comment-13194470
 ] 

Phabricator commented on HBASE-5259:


tedyu has commented on the revision "[jira][HBASE-5259] Normalize the 
RegionLocation in TableInputFormat by the reverse DNS lookup.".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:202 
I think NamingException handling @ line 166 should be moved here.
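
One possible shape of this suggestion is sketched below. The thread does not 
say what the handler at line 166 does, so the fallback to a caller-supplied 
host name is purely an assumption, as are the ReverseResolver interface and 
the class and method names.

```java
import java.net.InetAddress;
import javax.naming.NamingException;

public class ReverseDnsWithFallback {
    /** Hypothetical stand-in for the real reverse DNS lookup. */
    interface ReverseResolver {
        String resolve(InetAddress address) throws NamingException;
    }

    // Handles NamingException inside the wrapper so callers need no try/catch.
    // Assumption: on failure, fall back to the host name the caller already has.
    static String reverseDnsOrDefault(ReverseResolver resolver,
                                      InetAddress address,
                                      String fallbackHost) {
        try {
            return resolver.resolve(address);
        } catch (NamingException e) {
            return fallbackHost;
        }
    }

    /** Demo with a resolver that either succeeds or fails on demand. */
    static String demo(boolean fail) {
        InetAddress addr = InetAddress.getLoopbackAddress();
        ReverseResolver resolver = a -> {
            if (fail) {
                throw new NamingException("lookup failed");
            }
            return "a.b.c.d.";
        };
        return reverseDnsOrDefault(resolver, addr, "a.b.c.d");
    }

    public static void main(String[] args) {
        System.out.println(demo(false));
        System.out.println(demo(true));
    }
}
```

The trade-off is that a wrapper shaped like this no longer behaves like the 
lookup it wraps, since lookup failures become invisible to the caller.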

REVISION DETAIL
  https://reviews.facebook.net/D1413


> Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
> ---
>
> Key: HBASE-5259
> URL: https://issues.apache.org/jira/browse/HBASE-5259
> Project: HBase
> Issue Type: Improvement
> Reporter: Liyin Tang
> Assignee: Liyin Tang
> Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
> D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, 
> D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch
>
>
> Assuming HBase and MapReduce are running in the same cluster, the
> TableInputFormat overrides the split function, which divides all the
> regions of one particular table into a series of mapper tasks, so that each
> mapper task processes a region or one part of a region. Ideally, the mapper
> task should run on the same machine as the region server that hosts the
> corresponding region. That is why the TableInputFormat sets the
> RegionLocation: so that the MapReduce framework can respect node
> locality.
> The code simply sets the host name of the region server as the
> HRegionLocation. However, the host name of the region server may have
> a different format from the host name of the task tracker (Mapper task). The
> task tracker always gets its host name by reverse DNS lookup, and the DNS
> service may return a different host name format. For example, the host name of
> the region server is correctly set as a.b.c.d while the reverse DNS lookup
> may return a.b.c.d. (with an additional dot at the end).
> So the solution is to set the RegionLocation by reverse DNS lookup as
> well. No matter what host name format the DNS system uses, the
> TableInputFormat is responsible for keeping its host name format
> consistent with the MapReduce framework.





[jira] [Created] (HBASE-5294) Make sure javadoc is included in tarball bundle when we release

2012-01-26 Thread stack (Created) (JIRA)
Make sure javadoc is included in tarball bundle when we release
---

Key: HBASE-5294
URL: https://issues.apache.org/jira/browse/HBASE-5294
Project: HBase
Issue Type: Task
Reporter: stack
Priority: Critical
Fix For: 0.92.1


0.92.0 doesn't have javadoc in the tarball.  Fix.





[jira] [Updated] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-26 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5259:
---

Attachment: D1413.3.patch

Liyin updated the revision "[jira][HBASE-5259] Normalize the RegionLocation in 
TableInputFormat by the reverse DNS lookup.".
Reviewers: Kannan, Karthik, mbautin

  refactoring the code.

REVISION DETAIL
  https://reviews.facebook.net/D1413

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java


> Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
> ---
>
> Key: HBASE-5259
> URL: https://issues.apache.org/jira/browse/HBASE-5259
> Project: HBase
> Issue Type: Improvement
> Reporter: Liyin Tang
> Assignee: Liyin Tang
> Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
> D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, 
> D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch
>
>
> Assuming HBase and MapReduce are running in the same cluster, the
> TableInputFormat overrides the split function, which divides all the
> regions of one particular table into a series of mapper tasks, so that each
> mapper task processes a region or one part of a region. Ideally, the mapper
> task should run on the same machine as the region server that hosts the
> corresponding region. That is why the TableInputFormat sets the
> RegionLocation: so that the MapReduce framework can respect node
> locality.
> The code simply sets the host name of the region server as the
> HRegionLocation. However, the host name of the region server may have
> a different format from the host name of the task tracker (Mapper task). The
> task tracker always gets its host name by reverse DNS lookup, and the DNS
> service may return a different host name format. For example, the host name of
> the region server is correctly set as a.b.c.d while the reverse DNS lookup
> may return a.b.c.d. (with an additional dot at the end).
> So the solution is to set the RegionLocation by reverse DNS lookup as
> well. No matter what host name format the DNS system uses, the
> TableInputFormat is responsible for keeping its host name format
> consistent with the MapReduce framework.





[jira] [Updated] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-26 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5259:
---

Attachment: D1413.3.patch

Liyin updated the revision "[jira][HBASE-5259] Normalize the RegionLocation in 
TableInputFormat by the reverse DNS lookup.".
Reviewers: Kannan, Karthik, mbautin

  refactoring the code.

REVISION DETAIL
  https://reviews.facebook.net/D1413

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java


> Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
> ---
>
> Key: HBASE-5259
> URL: https://issues.apache.org/jira/browse/HBASE-5259
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
> D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, 
> D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch
>
>
> Assuming HBase and MapReduce run in the same cluster, TableInputFormat 
> overrides the split function, which divides all the regions of one 
> particular table into a series of mapper tasks, so each mapper task can 
> process a region or part of a region. Ideally, a mapper task should run on 
> the same machine as the region server that hosts the corresponding region. 
> That is why TableInputFormat sets the RegionLocation: so that the MapReduce 
> framework can respect node locality. 
> The code simply sets the host name of the region server as the 
> HRegionLocation. However, the host name of the region server may have a 
> different format from the host name of the task tracker (mapper task). The 
> task tracker always gets its host name by reverse DNS lookup, and the DNS 
> service may return a different host name format. For example, the host name 
> of the region server is correctly set as a.b.c.d while the reverse DNS lookup 
> may return a.b.c.d. (with an additional dot at the end).
> So the solution is to set the RegionLocation by reverse DNS lookup as 
> well. Whatever host name format the DNS system uses, TableInputFormat is 
> responsible for keeping its host name format consistent with the MapReduce 
> framework.
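
To illustrate the host name mismatch described above, here is a minimal sketch in Java. It is not the code from D1413; the class and method names (HostNameNormalizer, stripTrailingDot, reverseLookup) are hypothetical, and it assumes the only difference to reconcile is the trailing dot that a reverse lookup may append.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of HBase: shows one way to make a region
// server host name comparable with a name obtained by reverse DNS lookup.
class HostNameNormalizer {

    // Removes a single trailing dot (the DNS root label), if present,
    // so that "a.b.c.d." and "a.b.c.d" compare as equal strings.
    static String stripTrailingDot(String host) {
        if (host == null || host.isEmpty()) {
            return host;
        }
        return host.endsWith(".") ? host.substring(0, host.length() - 1) : host;
    }

    // Resolves the host to an address and asks the resolver for its
    // canonical name, then strips the trailing dot. The result depends on
    // the local DNS configuration, so it is not asserted anywhere here.
    static String reverseLookup(String host) throws UnknownHostException {
        InetAddress address = InetAddress.getByName(host);
        return stripTrailingDot(address.getCanonicalHostName());
    }
}
```

With this sketch, stripTrailingDot("a.b.c.d.") and stripTrailingDot("a.b.c.d") yield the same string, which is the property the MapReduce locality matching needs.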





[jira] [Created] (HBASE-5293) Purge hfile v1 from code base

2012-01-26 Thread stack (Created) (JIRA)
Purge hfile v1 from code base
-

 Key: HBASE-5293
 URL: https://issues.apache.org/jira/browse/HBASE-5293
 Project: HBase
  Issue Type: Task
Reporter: stack


Remove all hfile v1 references from code base.

If we do this though, as Matt Corgan suggests on the mailing list, we will need 
to make sure all hfile v1s in an hbase.rootdir have been compacted out of 
existence.  We'll probably need to bump the hbase.version to indicate the check 
for hfile v1s has been run.  A migration script will need to be run that checks 
the hbase.rootdir for hfile v1s and runs a major compaction if any are found.

I've not put a version on this issue.





[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams

2012-01-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194458#comment-13194458
 ] 

Lars Hofhansl commented on HBASE-3134:
--

Just so I understand... The difference to {add|remove}Peer is that the 
ReplicationSource still keeps track of the logs to be transferred but just 
temporarily disables shipment?

Maybe we can show the state via ReplicationZookeeper.listPeers, which is used 
by the shell to show all peers.

Personally I like enablePeer() and disablePeer(), because it is entirely clear 
what they are doing.
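
A minimal sketch of the semantics being discussed, assuming a disabled peer keeps accumulating logs and only shipment is paused. PeerStreamSketch and all of its methods are hypothetical names for illustration; this is not the ReplicationSource or ReplicationZookeeper API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of one replication stream to a peer.
class PeerStreamSketch {
    private final Deque<String> pendingLogs = new ArrayDeque<>();
    private volatile boolean enabled = true;

    // Logs are always tracked, whether or not the peer is enabled.
    void enqueueLog(String logName) {
        pendingLogs.addLast(logName);
    }

    void enablePeer() {
        enabled = true;
    }

    void disablePeer() {
        enabled = false;
    }

    int pendingCount() {
        return pendingLogs.size();
    }

    // Ships everything queued so far, or nothing at all while disabled.
    List<String> shipPending() {
        List<String> shipped = new ArrayList<>();
        if (!enabled) {
            return shipped;
        }
        while (!pendingLogs.isEmpty()) {
            shipped.add(pendingLogs.pollFirst());
        }
        return shipped;
    }
}
```

In contrast, removing a peer would drop the queue entirely, so nothing queued while the peer was absent could be shipped later.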

> [replication] Add the ability to enable/disable streams
> ---
>
> Key: HBASE-3134
> URL: https://issues.apache.org/jira/browse/HBASE-3134
> Project: HBase
>  Issue Type: New Feature
>  Components: replication
>Reporter: Jean-Daniel Cryans
>Assignee: Teruyoshi Zenmyo
>Priority: Minor
>  Labels: replication
> Fix For: 0.94.0
>
> Attachments: HBASE-3134.patch
>
>
> This jira was initially in the scope of HBASE-2201, but was pushed out since 
> it has low value compared to the required effort (and we want to ship 
> 0.90.0 rather soonish).
> We need to design a way to enable/disable replication streams in a 
> determinate fashion.





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194457#comment-13194457
 ] 

Hadoop QA commented on HBASE-5153:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12512077/5153-trunk-v2.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 162 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.io.hfile.TestHFileBlock
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/857//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/857//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/857//console

This message is automatically generated.

> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> ---
>
> Key: HBASE-5153
> URL: https://issues.apache.org/jira/browse/HBASE-5153
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5153-92.txt, 5153-trunk-v2.txt, 5153-trunk.txt, 
> 5153-trunk.txt, HBASE-5153-V2.patch, HBASE-5153-V3.patch, 
> HBASE-5153-V4-90.patch, HBASE-5153-V5-90.patch, 
> HBASE-5153-V6-90-minorchange.patch, HBASE-5153-V6-90.txt, 
> HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, HBASE-5153.patch, 
> TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. As that issue showed, if multiple 
> threads share the same connection and the connection is aborted in one 
> thread, the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> It solves the problem of "stale connection can't be removed", but the 
> original HTable instance can no longer be used. The connection in HTable 
> should be recreated.
> There are two approaches to solving this:
> 1. In user code, on catching an IOE, close the connection and re-create the 
> HTable instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.





[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL

2012-01-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194455#comment-13194455
 ] 

Lars Hofhansl commented on HBASE-5010:
--

Thanks for clarifying Prakash.

> Filter HFiles based on TTL
> --
>
> Key: HBASE-5010
> URL: https://issues.apache.org/jira/browse/HBASE-5010
> Project: HBase
>  Issue Type: Bug
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Fix For: 0.94.0
>
> Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, 
> D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch, D909.6.patch
>
>
> In ScanWildcardColumnTracker we have
> {code:java}
>  
>   this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
>   ...
>   private boolean isExpired(long timestamp) {
>     return timestamp < oldestStamp;
>   }
> {code}
> but this time range filtering does not participate in HFile selection. In one 
> real case this caused next() calls to time out because all KVs in a table got 
> expired, but next() had to iterate over the whole table to find that out. We 
> should be able to filter out those HFiles right away. I think a reasonable 
> approach is to add a "default timerange filter" to every scan for a CF with a 
> finite TTL and utilize existing filtering in 
> StoreFile.Reader.passesTimerangeFilter.
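
The idea in the description can be sketched as a per-file check, assuming each HFile records the maximum timestamp of the cells it contains. TtlFileFilterSketch and fileMayContainLiveCells are hypothetical names; this is not StoreFile.Reader.passesTimerangeFilter, only an illustration of the same cut-off that isExpired applies per cell.

```java
// Hypothetical sketch: decide whether a whole file can be skipped by a scan
// because every cell in it is already older than the column family TTL.
class TtlFileFilterSketch {

    // Returns false only when the newest cell in the file is expired, which
    // implies all older cells in the file are expired too.
    static boolean fileMayContainLiveCells(long fileMaxTimestamp, long now, long ttlMillis) {
        if (ttlMillis == Long.MAX_VALUE) {
            return true; // no finite TTL: nothing ever expires
        }
        long oldestStamp = now - ttlMillis;
        // Same comparison as isExpired(timestamp): timestamp < oldestStamp.
        return fileMaxTimestamp >= oldestStamp;
    }
}
```

A scan would then open only the files for which this check returns true, instead of iterating over every expired cell to discover there is nothing to return.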





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194453#comment-13194453
 ] 

Lars Hofhansl commented on HBASE-5153:
--

@Jieshan: Hmm... I do see the endless loop in the debugger. It happens from 
HBaseAdmin.checkHBaseAvailable when HBase is actually down. We get into 
resetZooKeeperTrackersWithRetries, which calls setupZookeeperTrackers, which 
causes abort to be called, which calls resetZooKeeperTrackersWithRetries.
I am not sure getZooKeeperWatcher() would throw if ZK is not available.

@Ted: calling this.zooKeeper.close() (if it is not null) first seems prudent.

@Jieshan: Good catch, yes it should be volatile.
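
The loop described above (abort calls the reset, the reset calls setup, setup calls abort again) and the proposed flag can be sketched as follows. This is a simplified model, not the HConnectionImplementation code: the class name and the counters are hypothetical, and setupZooKeeperTrackers here always fails to simulate ZooKeeper being down.

```java
// Hypothetical model of the re-entrancy guard discussed in this thread.
class ReentrancyGuardSketch {
    // volatile so that a change made by one thread is visible to others.
    // Note: check-then-set on a volatile is not atomic; guarding against
    // concurrent entry from several threads would need more than this.
    private volatile boolean isResettingZKTrackers = false;

    int abortCalls = 0;
    int resetAttempts = 0;

    void abort(String msg) {
        abortCalls++;
        if (isResettingZKTrackers) {
            return; // already inside a reset: do not start another one
        }
        resetZooKeeperTrackersWithRetries();
    }

    void resetZooKeeperTrackersWithRetries() {
        isResettingZKTrackers = true;
        try {
            resetAttempts++;
            setupZooKeeperTrackers();
        } finally {
            isResettingZKTrackers = false;
        }
    }

    // Stand-in for the real setup: simulates ZooKeeper being unavailable,
    // which is the case where abort gets called again.
    void setupZooKeeperTrackers() {
        abort("ZooKeeper unavailable");
    }
}
```

Without the flag check in abort, the three methods would call each other until the stack overflows; with it, the nested abort returns immediately and the outer reset finishes once.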


> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> ---
>
> Key: HBASE-5153
> URL: https://issues.apache.org/jira/browse/HBASE-5153
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5153-92.txt, 5153-trunk-v2.txt, 5153-trunk.txt, 
> 5153-trunk.txt, HBASE-5153-V2.patch, HBASE-5153-V3.patch, 
> HBASE-5153-V4-90.patch, HBASE-5153-V5-90.patch, 
> HBASE-5153-V6-90-minorchange.patch, HBASE-5153-V6-90.txt, 
> HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, HBASE-5153.patch, 
> TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. As that issue showed, if multiple 
> threads share the same connection and the connection is aborted in one 
> thread, the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> It solves the problem of "stale connection can't be removed", but the 
> original HTable instance can no longer be used. The connection in HTable 
> should be recreated.
> There are two approaches to solving this:
> 1. In user code, on catching an IOE, close the connection and re-create the 
> HTable instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.





[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL

2012-01-26 Thread Prakash Khemani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194454#comment-13194454
 ] 

Prakash Khemani commented on HBASE-5010:


This change doesn't break HBASE-4721.

HBASE-4721 introduced another parameter called 
hbase.hstore.time.to.purge.deletes to keep deletes even after major 
compactions. But hbase.hstore.time.to.purge.deletes doesn't override the TTL of 
the store.

Pasting the comment from code which hopefully makes it clear that this diff 
works with HBASE-4721

  // By default, when hbase.hstore.time.to.purge.deletes is 0ms, a delete
  // marker is always removed during a major compaction. If set to non-zero
  // value then major compaction will try to keep a delete marker around for
  // the given number of milliseconds. We want to keep the delete markers
  // around a bit longer because old puts might appear out-of-order. For
  // example, during log replication between two clusters.
  //
  // If the delete marker has lived longer than its column-family's TTL then
  // the delete marker will be removed even if time.to.purge.deletes has not
  // passed. This is because all the Puts that this delete marker can influence
  // would have also expired. (Removing of delete markers on col family TTL will
  // not happen if min-versions is set to non-zero)
  //
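
One possible reading of the comment above, written as a single predicate. This is a sketch under stated assumptions, not the compaction code: DeleteMarkerRetentionSketch and shouldPurge are hypothetical names, and all ages and limits are taken to be in milliseconds.

```java
// Hypothetical sketch of when a major compaction drops a delete marker,
// following the rules stated in the pasted code comment.
class DeleteMarkerRetentionSketch {

    static boolean shouldPurge(long markerAgeMs, long timeToPurgeDeletesMs,
                               long ttlMs, int minVersions) {
        // With the default of 0 ms this is always true, so the marker is
        // always removed during a major compaction.
        if (markerAgeMs >= timeToPurgeDeletesMs) {
            return true;
        }
        // The marker has outlived the column family TTL, so every Put it
        // could mask has expired as well. Skipped when min-versions is set.
        return minVersions == 0 && markerAgeMs > ttlMs;
    }
}
```

The second rule is why hbase.hstore.time.to.purge.deletes does not override the store's TTL: a marker older than the TTL goes away even if the purge delay has not yet elapsed.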

> Filter HFiles based on TTL
> --
>
> Key: HBASE-5010
> URL: https://issues.apache.org/jira/browse/HBASE-5010
> Project: HBase
>  Issue Type: Bug
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Fix For: 0.94.0
>
> Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, 
> D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch, D909.6.patch
>
>
> In ScanWildcardColumnTracker we have
> {code:java}
>  
>   this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
>   ...
>   private boolean isExpired(long timestamp) {
>     return timestamp < oldestStamp;
>   }
> {code}
> but this time range filtering does not participate in HFile selection. In one 
> real case this caused next() calls to time out because all KVs in a table got 
> expired, but next() had to iterate over the whole table to find that out. We 
> should be able to filter out those HFiles right away. I think a reasonable 
> approach is to add a "default timerange filter" to every scan for a CF with a 
> finite TTL and utilize existing filtering in 
> StoreFile.Reader.passesTimerangeFilter.





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Jieshan Bean (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194451#comment-13194451
 ] 

Jieshan Bean commented on HBASE-5153:
-

@Lars:
Adding an "isResettingZKTrackers" flag sounds good to me. One question: is it 
necessary to declare it "volatile"?

> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> ---
>
> Key: HBASE-5153
> URL: https://issues.apache.org/jira/browse/HBASE-5153
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5153-92.txt, 5153-trunk-v2.txt, 5153-trunk.txt, 
> 5153-trunk.txt, HBASE-5153-V2.patch, HBASE-5153-V3.patch, 
> HBASE-5153-V4-90.patch, HBASE-5153-V5-90.patch, 
> HBASE-5153-V6-90-minorchange.patch, HBASE-5153-V6-90.txt, 
> HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, HBASE-5153.patch, 
> TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. As that issue showed, if multiple 
> threads share the same connection and the connection is aborted in one 
> thread, the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> It solves the problem of "stale connection can't be removed", but the 
> original HTable instance can no longer be used. The connection in HTable 
> should be recreated.
> There are two approaches to solving this:
> 1. In user code, on catching an IOE, close the connection and re-create the 
> HTable instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194450#comment-13194450
 ] 

Zhihong Yu commented on HBASE-5153:
---

{code}
+if (isResettingZKTrackers) {
+  return;
+}
{code}
I was thinking about something similar to the above.

In resetZooKeeperTrackersWithRetries(), shall we call zooKeeper.close() before 
resetting zooKeeper?
{code}
this.zooKeeper.close();
this.zooKeeper = null;
{code}

Thanks for the help.

I think we need an addendum for 0.90

> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> ---
>
> Key: HBASE-5153
> URL: https://issues.apache.org/jira/browse/HBASE-5153
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5153-92.txt, 5153-trunk-v2.txt, 5153-trunk.txt, 
> 5153-trunk.txt, HBASE-5153-V2.patch, HBASE-5153-V3.patch, 
> HBASE-5153-V4-90.patch, HBASE-5153-V5-90.patch, 
> HBASE-5153-V6-90-minorchange.patch, HBASE-5153-V6-90.txt, 
> HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, HBASE-5153.patch, 
> TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. As that issue showed, if multiple 
> threads share the same connection and the connection is aborted in one 
> thread, the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> It solves the problem of "stale connection can't be removed", but the 
> original HTable instance can no longer be used. The connection in HTable 
> should be recreated.
> There are two approaches to solving this:
> 1. In user code, on catching an IOE, close the connection and re-create the 
> HTable instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Jieshan Bean (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194447#comment-13194447
 ] 

Jieshan Bean commented on HBASE-5153:
-

bq. Are you guys saying we do not need this in 0.92+?

We need this, but the patch for 0.92+ should take notice of this:)

> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> ---
>
> Key: HBASE-5153
> URL: https://issues.apache.org/jira/browse/HBASE-5153
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5153-92.txt, 5153-trunk-v2.txt, 5153-trunk.txt, 
> 5153-trunk.txt, HBASE-5153-V2.patch, HBASE-5153-V3.patch, 
> HBASE-5153-V4-90.patch, HBASE-5153-V5-90.patch, 
> HBASE-5153-V6-90-minorchange.patch, HBASE-5153-V6-90.txt, 
> HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, HBASE-5153.patch, 
> TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. As that issue showed, if multiple 
> threads share the same connection and the connection is aborted in one 
> thread, the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> It solves the problem of "stale connection can't be removed", but the 
> original HTable instance can no longer be used. The connection in HTable 
> should be recreated.
> There are two approaches to solving this:
> 1. In user code, on catching an IOE, close the connection and re-create the 
> HTable instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Jieshan Bean (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194445#comment-13194445
 ] 

Jieshan Bean commented on HBASE-5153:
-

"The endless loop happens when ZK is actually down."
If ZK is actually down, the code below will throw an exception:
 this.zooKeeper = getZooKeeperWatcher();
which is then caught by the code below:
{noformat}
  try {
    LOG.info("This client just lost it's session with ZooKeeper, trying" +
        " to reconnect.");
    resetZooKeeperTrackersWithRetries();
    LOG.info("Reconnected successfully. This disconnect could have been" +
        " caused by a network partition or a long-running GC pause," +
        " either way it's recommended that you verify your environment.");
    return;
  } catch (ZooKeeperConnectionException e) {
    LOG.error("Could not reconnect to ZooKeeper after session" +
        " expiration, aborting");
    t = e;
  }
  if (t != null) LOG.fatal(msg, t);
  else LOG.fatal(msg);
  HConnectionManager.deleteStaleConnection(this);
{noformat}

It should not be an endless loop. Does that make sense?


> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> ---
>
> Key: HBASE-5153
> URL: https://issues.apache.org/jira/browse/HBASE-5153
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5153-92.txt, 5153-trunk-v2.txt, 5153-trunk.txt, 
> 5153-trunk.txt, HBASE-5153-V2.patch, HBASE-5153-V3.patch, 
> HBASE-5153-V4-90.patch, HBASE-5153-V5-90.patch, 
> HBASE-5153-V6-90-minorchange.patch, HBASE-5153-V6-90.txt, 
> HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, HBASE-5153.patch, 
> TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. As that issue showed, if multiple 
> threads share the same connection and the connection is aborted in one 
> thread, the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> It solves the problem of "stale connection can't be removed", but the 
> original HTable instance can no longer be used. The connection in HTable 
> should be recreated.
> There are two approaches to solving this:
> 1. In user code, on catching an IOE, close the connection and re-create the 
> HTable instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194443#comment-13194443
 ] 

Lars Hofhansl commented on HBASE-5153:
--

bq. As discussed with Ted, trunk and 92 already include retry logic in 
RecoverableZooKeeper. So that makes the retry logic in 
resetZooKeeperTrackersWithRetries less important.

Are you guys saying we do not need this in 0.92+?

> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> ---
>
> Key: HBASE-5153
> URL: https://issues.apache.org/jira/browse/HBASE-5153
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5153-92.txt, 5153-trunk-v2.txt, 5153-trunk.txt, 
> 5153-trunk.txt, HBASE-5153-V2.patch, HBASE-5153-V3.patch, 
> HBASE-5153-V4-90.patch, HBASE-5153-V5-90.patch, 
> HBASE-5153-V6-90-minorchange.patch, HBASE-5153-V6-90.txt, 
> HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, HBASE-5153.patch, 
> TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. As that issue showed, if multiple 
> threads share the same connection and the connection is aborted in one 
> thread, the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> It solves the problem of "stale connection can't be removed", but the 
> original HTable instance can no longer be used. The connection in HTable 
> should be recreated.
> There are two approaches to solving this:
> 1. In user code, on catching an IOE, close the connection and re-create the 
> HTable instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Jieshan Bean (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1319#comment-1319
 ] 

Jieshan Bean commented on HBASE-5153:
-

Thanks Lars. It's nice of you:)

> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> ---
>
> Key: HBASE-5153
> URL: https://issues.apache.org/jira/browse/HBASE-5153
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5153-92.txt, 5153-trunk-v2.txt, 5153-trunk.txt, 
> 5153-trunk.txt, HBASE-5153-V2.patch, HBASE-5153-V3.patch, 
> HBASE-5153-V4-90.patch, HBASE-5153-V5-90.patch, 
> HBASE-5153-V6-90-minorchange.patch, HBASE-5153-V6-90.txt, 
> HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, HBASE-5153.patch, 
> TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. As that issue showed, if multiple 
> threads share the same connection and the connection is aborted in one 
> thread, the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> It solves the problem of "stale connection can't be removed", but the 
> original HTable instance can no longer be used. The connection in HTable 
> should be recreated.
> There are two approaches to solving this:
> 1. In user code, on catching an IOE, close the connection and re-create the 
> HTable instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194441#comment-13194441
 ] 

Lars Hofhansl commented on HBASE-5153:
--

@Jieshan: The endless loop happens when ZK is actually down.






[jira] [Updated] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5153:
-

Attachment: 5153-trunk-v2.txt

Please let me know what you think of 5153-trunk-v2.txt... Thanks






[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194437#comment-13194437
 ] 

Lars Hofhansl commented on HBASE-5153:
--

There was also some odd behavior in HBaseAdmin.checkHBaseAvailable. It set 
the client retry count to 1 but left retries enabled in RecoverableZookeeper, 
which leads to long, unnecessary waits.






[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194435#comment-13194435
 ] 

Lars Hofhansl commented on HBASE-5153:
--

I have a new patch... Testing it now.
It is mostly what you have, Jieshan, but it does not need all the changes to 
ZookeeperNodeTracker and its subclasses.
I'll attach it soon... Then please let me know what you think.






[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Jieshan Bean (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194426#comment-13194426
 ] 

Jieshan Bean commented on HBASE-5153:
-

Thanks, Ted. I will take a look at that stack trace.

If a ZooKeeperConnectionException is thrown by the code below:
{noformat}
try {
  if (setupZookeeperTrackers(isLastTime)) {
    break;
  }
} catch (ZooKeeperConnectionException zkce) {
  if (isLastTime) {
    throw zkce;
  }
}
{noformat}

it will be caught in the abort method, which then calls LOG.fatal(msg, t);

No problem here. I'm not sure whether I understood you correctly :(
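
For readers following the thread, the pattern in the snippet above (swallow the exception on every attempt except the last) can be sketched in isolation. This is only a minimal, self-contained illustration and not the code from any attached patch: `RetrySketch`, `Attempt`, and `runWithRetries` are hypothetical names, and a plain `IOException` stands in for `ZooKeeperConnectionException`.

```java
import java.io.IOException;

/** Hypothetical sketch of a bounded retry loop; not taken from the HBASE-5153 patches. */
final class RetrySketch {

  /** One attempt; returns true on success. isLastTime mirrors the flag in the snippet above. */
  interface Attempt {
    boolean run(boolean isLastTime) throws IOException;
  }

  /**
   * Runs the attempt up to maxTries times. Failures are swallowed on every
   * try except the last one, where the exception is rethrown to the caller.
   * Returns the number of tries used on success, or -1 if every try
   * returned false without throwing.
   */
  static int runWithRetries(Attempt attempt, int maxTries) throws IOException {
    for (int tries = 1; tries <= maxTries; tries++) {
      boolean isLastTime = (tries == maxTries);
      try {
        if (attempt.run(isLastTime)) {
          return tries;
        }
      } catch (IOException e) {
        if (isLastTime) {
          throw e; // only the final failure is surfaced
        }
      }
    }
    return -1;
  }
}
```

Because the loop is bounded by `maxTries`, it terminates on its own; whether the caller's handling of the rethrown exception can re-enter the loop is a separate question.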






[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194414#comment-13194414
 ] 

Zhihong Yu commented on HBASE-5153:
---

See the stack trace I pasted here:
https://issues.apache.org/jira/browse/HBASE-5153?focusedCommentId=13187774&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13187774

bq. If this exception happened for long time, Zookeeper must has some problem.
We should be prepared when the above indeed happens. I am sure this scenario is 
possible.

See also this part of the code:
{code}
+    } catch (ZooKeeperConnectionException zkce) {
+      if (isLastTime) {
+        throw zkce;
+      }
+    }
{code}







[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Jieshan Bean (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194412#comment-13194412
 ] 

Jieshan Bean commented on HBASE-5153:
-

As discussed with Ted, trunk and 92 already include retry logic in 
RecoverableZooKeeper, which makes the retry logic in 
resetZooKeeperTrackersWithRetries less important.






[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-26 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194408#comment-13194408
 ] 

Zhihong Yu commented on HBASE-4218:
---

TestHFileBlock was reported as failing by Hadoop QA (@26/Jan/12 02:58) before 
the checkin.

Now the test failure appears in every TRUNK build and every Hadoop QA report.

> Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
> ---
>
> Key: HBASE-4218
> URL: https://issues.apache.org/jira/browse/HBASE-4218
> Project: HBase
>  Issue Type: Improvement
>  Components: io
>Affects Versions: 0.94.0
>Reporter: Jacek Migdal
>Assignee: Mikhail Bautin
>  Labels: compression
> Fix For: 0.94.0
>
> Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
> 0001-Delta-encoding.patch, 4218-2012-01-14.txt, 4218-v16.txt, 4218.txt, 
> D447.1.patch, D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, 
> D447.14.patch, D447.15.patch, D447.16.patch, D447.17.patch, D447.18.patch, 
> D447.19.patch, D447.2.patch, D447.20.patch, D447.21.patch, D447.22.patch, 
> D447.23.patch, D447.24.patch, D447.25.patch, D447.26.patch, D447.3.patch, 
> D447.4.patch, D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, 
> D447.9.patch, Data-block-encoding-2011-12-23.patch, 
> Delta-encoding-2012-01-17_11_09_09.patch, 
> Delta-encoding-2012-01-25_00_45_29.patch, 
> Delta-encoding-2012-01-25_16_32_14.patch, 
> Delta-encoding.patch-2011-12-22_11_52_07.patch, 
> Delta-encoding.patch-2012-01-05_15_16_43.patch, 
> Delta-encoding.patch-2012-01-05_16_31_44.patch, 
> Delta-encoding.patch-2012-01-05_16_31_44_copy.patch, 
> Delta-encoding.patch-2012-01-05_18_50_47.patch, 
> Delta-encoding.patch-2012-01-07_14_12_48.patch, 
> Delta-encoding.patch-2012-01-13_12_20_07.patch, 
> Delta_encoding_with_memstore_TS.patch, open-source.diff
>
>
> A compression for keys. Keys are sorted in HFile and they are usually very 
> similar. Because of that, it is possible to design better compression than 
> general purpose algorithms.
> It is an additional step designed to be used in memory. It aims to save 
> memory in cache as well as to speed up seeks within HFileBlocks. It should 
> improve performance a lot if key lengths are larger than value lengths. For 
> example, it makes a lot of sense to use it when the value is a counter.
> Initial tests on real data (key length = ~90 bytes, value length = 8 bytes) 
> show that I could achieve a decent level of compression:
>  key compression ratio: 92%
>  total compression ratio: 85%
>  LZO on the same data: 85%
>  LZO after delta encoding: 91%
> It also has much better performance (20-80% faster decompression than 
> LZO). Moreover, it should allow far more efficient seeking, which should 
> improve performance a bit.
> It seems that simple compression algorithms are good enough. Most of the 
> savings are due to prefix compression, int128 encoding, timestamp diffs and 
> bitfields to avoid duplication. That way, comparisons of compressed data can 
> be much faster than a byte comparator (thanks to prefix compression and 
> bitfields).
> In order to implement it in HBase two important changes in design will be 
> needed:
> - solidify the interface to HFileBlock / HFileReader Scanner to provide seeking 
> and iterating; access to the uncompressed buffer in HFileBlock will have bad 
> performance
> - extend comparators to support comparison assuming that the first N bytes are 
> equal (or some fields are equal)
> Link to a discussion about something similar:
> http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windows&subj=Re+prefix+compression
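
As a rough illustration of the prefix-compression idea in the description above, the sketch below stores, for each key in a sorted list, only the number of leading bytes it shares with the previous key plus the remaining suffix. It is not the encoder from any of the attached patches and covers none of the other techniques mentioned (timestamp diffs, bitfields); `PrefixSketch`, `Entry`, `encode`, and `decode` are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of prefix compression over sorted keys. */
final class PrefixSketch {

  /** One encoded key: byte count shared with the previous key, plus the differing suffix. */
  static final class Entry {
    final int shared;
    final byte[] suffix;

    Entry(int shared, byte[] suffix) {
      this.shared = shared;
      this.suffix = suffix;
    }
  }

  /** Encodes each key relative to the key before it (the first key shares nothing). */
  static List<Entry> encode(List<byte[]> sortedKeys) {
    List<Entry> out = new ArrayList<>();
    byte[] prev = new byte[0];
    for (byte[] key : sortedKeys) {
      int max = Math.min(prev.length, key.length);
      int shared = 0;
      while (shared < max && prev[shared] == key[shared]) {
        shared++;
      }
      out.add(new Entry(shared, Arrays.copyOfRange(key, shared, key.length)));
      prev = key;
    }
    return out;
  }

  /** Rebuilds the full keys by copying the shared prefix from the previously decoded key. */
  static List<byte[]> decode(List<Entry> entries) {
    List<byte[]> out = new ArrayList<>();
    byte[] prev = new byte[0];
    for (Entry e : entries) {
      byte[] key = new byte[e.shared + e.suffix.length];
      System.arraycopy(prev, 0, key, 0, e.shared);
      System.arraycopy(e.suffix, 0, key, e.shared, e.suffix.length);
      out.add(key);
      prev = key;
    }
    return out;
  }
}
```

With similar neighbouring keys such as `row0001/cf:a` and `row0001/cf:b`, the second entry keeps a one-byte suffix and a shared length of 11, which is where the savings described above would come from. Note that decoding is sequential: a key can only be rebuilt after the one before it.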





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Jieshan Bean (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194406#comment-13194406
 ] 

Jieshan Bean commented on HBASE-5153:
-

It should not lead to an endless loop unless every retry gets a 
ZookeeperLossException. If this exception keeps happening for a long time, 
ZooKeeper must have a problem, so creating a new ZooKeeper instance will 
already throw an exception. So it won't be an endless loop:
{noformat}
if ((t instanceof KeeperException.SessionExpiredException)
    || (t instanceof KeeperException.ConnectionLossException)) {
  try {
    LOG.info("This client just lost it's session with ZooKeeper, trying" +
        " to reconnect.");
    resetZooKeeperTrackersWithRetries();
    LOG.info("Reconnected successfully. This disconnect could have been" +
        " caused by a network partition or a long-running GC pause," +
        " either way it's recommended that you verify your environment.");
    return;
  } catch (ZooKeeperConnectionException e) {
    LOG.error("Could not reconnect to ZooKeeper after session" +
        " expiration, aborting");
    t = e;
  }
}
if (t != null) LOG.fatal(msg, t);
else LOG.fatal(msg);
HConnectionManager.deleteStaleConnection(this);
{noformat}






[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-26 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194401#comment-13194401
 ] 

Hadoop QA commented on HBASE-4720:
--

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12512066/HBASE-4720.trunk.v7.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 161 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.io.hfile.TestHFileBlock
  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/856//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/856//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/856//console

This message is automatically generated.

> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
> HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
> HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, 
> HBASE-4720.trunk.v7.patch, HBASE-4720.v1.patch, HBASE-4720.v3.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.
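
To make the requested semantics concrete, here is a small in-memory model of check-and-put and check-and-delete: the mutation is applied only if the current value matches the expected one, and the check and the mutation happen as one atomic step. It is a sketch of the semantics only, not the REST implementation in the attached patches; `CheckAndMutateSketch` and its methods are hypothetical, and treating a `null` expected value as "cell must be absent" is a simplifying assumption of this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory model of check-and-put / check-and-delete semantics. */
final class CheckAndMutateSketch {
  private final ConcurrentMap<String, String> cells = new ConcurrentHashMap<>();

  /** Writes newValue only if the current value equals expected (null means the cell must be absent). */
  boolean checkAndPut(String key, String expected, String newValue) {
    if (expected == null) {
      return cells.putIfAbsent(key, newValue) == null;
    }
    return cells.replace(key, expected, newValue);
  }

  /** Deletes the cell only if its current value equals expected; null is not supported in this sketch. */
  boolean checkAndDelete(String key, String expected) {
    return expected != null && cells.remove(key, expected);
  }

  String get(String key) {
    return cells.get(key);
  }
}
```

A sentinel-table update like the one described above would call `checkAndPut` with the value it last read; a `false` result means another writer got there first, which is the race that a plain put cannot detect.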





[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-26 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194399#comment-13194399
 ] 

Zhihong Yu commented on HBASE-4720:
---

@Mubarak:
Thanks for your persistence.

Please also describe the scenarios that you tested in your cluster.






[jira] [Updated] (HBASE-5186) Add metrics to ThriftServer

2012-01-26 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5186:
---

Attachment: HBASE-5186.D1461.5.patch

sc updated the revision "HBASE-5186 [jira] Add metrics to ThriftServer".
Reviewers: dhruba, tedyu, JIRA, heyongqiang

  Remove unnecessary locking in ThriftMetrics

REVISION DETAIL
  https://reviews.facebook.net/D1461

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java
  src/main/java/org/apache/hadoop/hbase/thrift/CallQueue.java
  src/main/java/org/apache/hadoop/hbase/thrift/HbaseHandlerMetricsProxy.java
  src/main/java/org/apache/hadoop/hbase/thrift/TBoundedThreadPoolServer.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftMetrics.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java
  src/test/java/org/apache/hadoop/hbase/thrift/TestCallQueue.java
  src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServer.java


> Add metrics to ThriftServer
> ---
>
> Key: HBASE-5186
> URL: https://issues.apache.org/jira/browse/HBASE-5186
> Project: HBase
>  Issue Type: Improvement
>Reporter: Scott Chen
>Assignee: Scott Chen
> Attachments: HBASE-5186.D1461.1.patch, HBASE-5186.D1461.2.patch, 
> HBASE-5186.D1461.3.patch, HBASE-5186.D1461.4.patch, HBASE-5186.D1461.5.patch
>
>
> It will be useful to have some metrics (queue length, waiting time, 
> processing time ...) similar to the Hadoop RPC server. This allows us to monitor 
> system health and also provides a tool to diagnose problems where thrift calls 
> are slow.
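
The three metrics named in the description (queue length, waiting time, processing time) could be collected by wrapping each queued call, roughly as in the sketch below. This is an assumption-laden illustration and not the `ThriftMetrics` or `CallQueue` code from the patch: `CallMetricsSketch` and `instrument` are hypothetical names, and atomic counters are used here simply to keep the example lock-free.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of per-call metrics; not the ThriftMetrics class from the patch. */
final class CallMetricsSketch {
  final AtomicLong queued = new AtomicLong();            // calls currently waiting to run
  final AtomicLong completed = new AtomicLong();         // calls that have finished
  final AtomicLong totalWaitNanos = new AtomicLong();    // summed time spent in the queue
  final AtomicLong totalProcessNanos = new AtomicLong(); // summed time spent running

  /** Wraps a task at enqueue time so its queue wait and run time are recorded when it runs. */
  Runnable instrument(Runnable task) {
    final long enqueuedAt = System.nanoTime();
    queued.incrementAndGet();
    return () -> {
      long startedAt = System.nanoTime();
      queued.decrementAndGet();
      totalWaitNanos.addAndGet(startedAt - enqueuedAt);
      try {
        task.run();
      } finally {
        totalProcessNanos.addAndGet(System.nanoTime() - startedAt);
        completed.incrementAndGet();
      }
    };
  }
}
```

The wrapped `Runnable` would be what gets handed to the executor, so `queued` approximates the queue length and the two totals divided by `completed` give average waiting and processing times.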





[jira] [Updated] (HBASE-3134) [replication] Add the ability to enable/disable streams

2012-01-26 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-3134:
--

Fix Version/s: 0.94.0

> [replication] Add the ability to enable/disable streams
> ---
>
> Key: HBASE-3134
> URL: https://issues.apache.org/jira/browse/HBASE-3134
> Project: HBase
>  Issue Type: New Feature
>  Components: replication
>Reporter: Jean-Daniel Cryans
>Assignee: Teruyoshi Zenmyo
>Priority: Minor
>  Labels: replication
> Fix For: 0.94.0
>
> Attachments: HBASE-3134.patch
>
>
> This jira was initially in the scope of HBASE-2201, but was pushed out since 
> it has low value compared to the required effort (and we want to ship 
> 0.90.0 rather soonish).
> We need to design a way to enable/disable replication streams in a 
> determinate fashion.





[jira] [Assigned] (HBASE-3134) [replication] Add the ability to enable/disable streams

2012-01-26 Thread Zhihong Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu reassigned HBASE-3134:
-

Assignee: Teruyoshi Zenmyo






[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194388#comment-13194388
 ] 

Lars Hofhansl commented on HBASE-5153:
--

Sure... There's a bit more to this too. On its last try, 
resetZooKeeperTrackersWithRetries calls setupZookeeperTrackers with aborts 
allowed, which calls resetZooKeeperTrackersWithRetries again, leading to an 
endless loop. Need to think about how to refactor this.
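
One way to avoid the re-entrant call described above is to keep the retry loop flat, so the setup step never calls back into the retry method. The following is a minimal sketch under stated assumptions: BoundedRetrySketch, TrackerSetup and resetWithRetries are hypothetical names for illustration and are not the HBase client code or the attached patch.

```java
import java.io.IOException;

/** Hypothetical sketch; none of these names are HBase APIs. */
class BoundedRetrySketch {

    /** Stand-in for the tracker setup step, which may fail. */
    interface TrackerSetup {
        void setup() throws IOException;
    }

    /**
     * Calls setup at most maxAttempts times and never re-enters itself.
     * Returns the number of attempts used. When every attempt fails, the
     * last failure is rethrown so the caller decides whether to abort.
     */
    static int resetWithRetries(TrackerSetup setup, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                setup.setup();
                return attempt;
            } catch (IOException e) {
                last = e; // remember the failure and loop; no recursive call
            }
        }
        throw last;
    }

    public static void main(String[] args) throws IOException {
        int[] calls = {0};
        // Simulated setup that fails twice and then succeeds.
        int used = resetWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new IOException("ZooKeeper not reachable");
            }
        }, 5);
        System.out.println("succeeded after " + used + " attempts");
    }
}
```

Because the abort decision is left to the caller, the final attempt cannot trigger another round of retries.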

> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> ---
>
> Key: HBASE-5153
> URL: https://issues.apache.org/jira/browse/HBASE-5153
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5153-92.txt, 5153-trunk.txt, 5153-trunk.txt, 
> HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, 
> HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, 
> HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, 
> HBASE-5153.patch, TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. In that issue, we learned that if multiple 
> threads share the same connection, once this connection is aborted in one 
> thread, the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> It solves the problem of "stale connection can't be removed", but the original 
> HTable instance can't continue to be used. The connection in HTable should be 
> recreated.
> Actually, there are two approaches to solve this:
> 1. In user code, once an IOE is caught, close the connection and re-create the 
> HTable instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.





[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams

2012-01-26 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194386#comment-13194386
 ] 

Hadoop QA commented on HBASE-3134:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511955/HBASE-3134.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 161 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.client.TestFromClientSide
  org.apache.hadoop.hbase.replication.TestReplicationPeer
  org.apache.hadoop.hbase.io.hfile.TestHFileBlock
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/855//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/855//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/855//console

This message is automatically generated.

> [replication] Add the ability to enable/disable streams
> ---
>
> Key: HBASE-3134
> URL: https://issues.apache.org/jira/browse/HBASE-3134
> Project: HBase
>  Issue Type: New Feature
>  Components: replication
>Reporter: Jean-Daniel Cryans
>Priority: Minor
>  Labels: replication
> Attachments: HBASE-3134.patch
>
>
> This jira was initially in the scope of HBASE-2201, but was pushed out since 
> it has low value compared to the required effort (and we want to ship 
> 0.90.0 rather soonish).
> We need to design a way to enable/disable replication streams in a 
> determinate fashion.





[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams

2012-01-26 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194385#comment-13194385
 ] 

Zhihong Yu commented on HBASE-3134:
---

{code}
+  ZKUtil.deleteNode(this.zookeeper, getPeerStateZNode(id));
{code}
There might be confusion because whether the peer is enabled or disabled is 
represented only by the presence of the peer state znode. A better way is to 
store the state as data in the corresponding peer state znode.

I also see similarity between enablePeer() and disablePeer(). Is it possible to 
create a single method, changePeerState(String id, ChangeType ct), where 
ChangeType is an enum indicating what to change?
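
The two suggestions above (state stored as data, and a single changePeerState method) can be sketched together. This is only an illustration: an in-memory map stands in for ZooKeeper, the znode path and the ENABLED/DISABLED payloads are assumptions, and only the names changePeerState and ChangeType come from the comment.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; an in-memory map replaces real znodes. */
class PeerStateSketch {

    /** The kind of change requested, as proposed in the comment. */
    enum ChangeType { ENABLE, DISABLE }

    /** Stand-in for the peer state znodes: path -> data. */
    private final Map<String, byte[]> znodes = new HashMap<>();

    private static String peerStateZNode(String id) {
        // Assumed path layout, for illustration only.
        return "/hbase/replication/peers/" + id + "/peer-state";
    }

    /** Single entry point replacing separate enable and disable methods. */
    void changePeerState(String id, ChangeType ct) {
        String state = (ct == ChangeType.ENABLE) ? "ENABLED" : "DISABLED";
        znodes.put(peerStateZNode(id), state.getBytes(StandardCharsets.UTF_8));
    }

    /** A peer without a state node counts as enabled here (an assumption). */
    boolean isEnabled(String id) {
        byte[] data = znodes.get(peerStateZNode(id));
        return data == null
            || "ENABLED".equals(new String(data, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        PeerStateSketch zk = new PeerStateSketch();
        zk.changePeerState("1", ChangeType.DISABLE);
        System.out.println("peer 1 enabled: " + zk.isEnabled("1"));
        zk.changePeerState("1", ChangeType.ENABLE);
        System.out.println("peer 1 enabled: " + zk.isEnabled("1"));
    }
}
```

Storing the state as data means a missing node and a disabled peer are no longer the same observation, which is the confusion the comment points at.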

Uploading the patch onto reviewboard would allow other people to give more 
precise reviews.

Thanks

> [replication] Add the ability to enable/disable streams
> ---
>
> Key: HBASE-3134
> URL: https://issues.apache.org/jira/browse/HBASE-3134
> Project: HBase
>  Issue Type: New Feature
>  Components: replication
>Reporter: Jean-Daniel Cryans
>Priority: Minor
>  Labels: replication
> Attachments: HBASE-3134.patch
>
>
> This jira was initially in the scope of HBASE-2201, but was pushed out since 
> it has low value compared to the required effort (and we want to ship 
> 0.90.0 rather soonish).
> We need to design a way to enable/disable replication streams in a 
> determinate fashion.





[jira] [Commented] (HBASE-5282) Possible file handle leak with truncated HLog file.

2012-01-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194381#comment-13194381
 ] 

Hudson commented on HBASE-5282:
---

Integrated in HBase-0.92 #265 (See 
[https://builds.apache.org/job/HBase-0.92/265/])
HBASE-5282 Possible file handle leak with truncated HLog file

jmhsieh : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java


> Possible file handle leak with truncated HLog file.
> ---
>
> Key: HBASE-5282
> URL: https://issues.apache.org/jira/browse/HBASE-5282
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0, 0.90.5, 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.94.0, 0.92.1
>
> Attachments: hbase-5282.patch, hbase-5282.v2.patch
>
>
> When debugging hbck, found that the code responsible for this exception can 
> leak open file handles.
> {code}
> 12/01/15 05:58:11 INFO regionserver.HRegion: Replaying edits from hdfs://haus01.sf.cloudera.com:56020/hbase-jon/test5/98a1e7255731aae44b3836641840113e/recovered.edits/3211315; minSequenceid=3214658
> 12/01/15 05:58:11 ERROR handler.OpenRegionHandler: Failed open of region=test5,8\x90\x00\x00\x00\x00\x00\x00/05_0,1326597390073.98a1e7255731aae44b3836641840113e.
> java.io.EOFException
>     at java.io.DataInputStream.readByte(DataInputStream.java:250)
>     at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:299)
>     at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:320)
>     at org.apache.hadoop.io.Text.readString(Text.java:400)
>     at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1486)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1437)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:57)
>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:158)
>     at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:572)
>     at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:1940)
>     at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:1896)
>     at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:366)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2661)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2647)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:312)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:99)
>     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:619)
> {code}
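
The trace above shows the EOFException escaping from reader initialization on a truncated file. A generic way to avoid leaking the handle in that situation is to close the stream whenever initialization does not complete. The sketch below uses only java.io; ReaderOpenSketch, TrackedStream and openReader are hypothetical names, and this is not the change made in the attached patches.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch of close-on-failed-initialization. */
class ReaderOpenSketch {

    /** Input stream that records whether close() was called. */
    static class TrackedStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackedStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /**
     * Wraps the raw stream and reads a one-byte header. If the header read
     * fails (for example on a truncated file), the stream is closed before
     * the exception propagates, so no handle is left open.
     */
    static DataInputStream openReader(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        boolean initialized = false;
        try {
            in.readByte(); // throws EOFException on an empty stream
            initialized = true;
            return in;
        } finally {
            if (!initialized) {
                in.close(); // also closes the wrapped raw stream
            }
        }
    }

    public static void main(String[] args) throws IOException {
        TrackedStream truncated = new TrackedStream(new byte[0]);
        try {
            openReader(truncated);
        } catch (EOFException e) {
            System.out.println("closed after failure: " + truncated.closed);
        }
    }
}
```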





[jira] [Updated] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-26 Thread Mubarak Seyed (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mubarak Seyed updated HBASE-4720:
-

Attachment: HBASE-4720.trunk.v7.patch

The attached file (HBASE-4720.trunk.v7.patch) addresses option #1: add a 
query param, /table/row?check=put or /table/row?check=delete

@Andrew
Can you please review the changes?

> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
> HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
> HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, 
> HBASE-4720.trunk.v7.patch, HBASE-4720.v1.patch, HBASE-4720.v3.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.
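
For readers unfamiliar with the operations named in the description, the compare-then-write semantics can be sketched with a plain concurrent map. This is an illustration of the semantics only: CheckAndPutSketch and its methods are hypothetical and are neither the HBase client API nor the REST implementation in the patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch of check-and-put / check-and-delete semantics. */
class CheckAndPutSketch {

    /** Stand-in for a sentinel table: row key -> cell value. */
    private final ConcurrentMap<String, String> cells = new ConcurrentHashMap<>();

    /**
     * Writes newValue only if the current value equals expected.
     * A null expected value means "the cell must not exist yet".
     * The comparison and the write happen as one atomic step.
     */
    boolean checkAndPut(String row, String expected, String newValue) {
        if (expected == null) {
            return cells.putIfAbsent(row, newValue) == null;
        }
        return cells.replace(row, expected, newValue);
    }

    /** Deletes the cell only if its current value equals expected (non-null). */
    boolean checkAndDelete(String row, String expected) {
        return cells.remove(row, expected);
    }

    public static void main(String[] args) {
        CheckAndPutSketch table = new CheckAndPutSketch();
        System.out.println(table.checkAndPut("lock", null, "owner-a"));      // first writer wins
        System.out.println(table.checkAndPut("lock", null, "owner-b"));      // second writer loses
        System.out.println(table.checkAndDelete("lock", "owner-b"));         // wrong owner, no delete
        System.out.println(table.checkAndDelete("lock", "owner-a"));         // right owner, deleted
    }
}
```

Without an atomic check, two callers could both read "absent" and both write, which is the kind of race the description says has to be worked around.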





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194373#comment-13194373
 ] 

Zhihong Yu commented on HBASE-5153:
---

Thanks for tracking down the issue, Lars.
If you can upload the latest 5153-trunk.txt to reviewboard first, followed by 
your new patch, that would help us see your changes easily.

> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> ---
>
> Key: HBASE-5153
> URL: https://issues.apache.org/jira/browse/HBASE-5153
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5153-92.txt, 5153-trunk.txt, 5153-trunk.txt, 
> HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, 
> HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, 
> HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, 
> HBASE-5153.patch, TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. In that issue, we learned that if multiple 
> threads share the same connection, once this connection is aborted in one 
> thread, the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> It solves the problem of "stale connection can't be removed", but the original 
> HTable instance can't continue to be used. The connection in HTable should be 
> recreated.
> Actually, there are two approaches to solve this:
> 1. In user code, once an IOE is caught, close the connection and re-create the 
> HTable instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.





[jira] [Updated] (HBASE-3134) [replication] Add the ability to enable/disable streams

2012-01-26 Thread Teruyoshi Zenmyo (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teruyoshi Zenmyo updated HBASE-3134:


Labels: replication  (was: )
Status: Patch Available  (was: Open)

> [replication] Add the ability to enable/disable streams
> ---
>
> Key: HBASE-3134
> URL: https://issues.apache.org/jira/browse/HBASE-3134
> Project: HBase
>  Issue Type: New Feature
>  Components: replication
>Reporter: Jean-Daniel Cryans
>Priority: Minor
>  Labels: replication
> Attachments: HBASE-3134.patch
>
>
> This jira was initially in the scope of HBASE-2201, but was pushed out since 
> it has low value compared to the required effort (and we want to ship 
> 0.90.0 rather soonish).
> We need to design a way to enable/disable replication streams in a 
> determinate fashion.





[jira] [Resolved] (HBASE-5274) Filter out the expired store file scanner during the compaction

2012-01-26 Thread Mikhail Bautin (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin resolved HBASE-5274.
---

Resolution: Fixed
  Assignee: Mikhail Bautin  (was: Liyin Tang)

Fix committed to trunk.

> Filter out the expired store file scanner during the compaction
> ---
>
> Key: HBASE-5274
> URL: https://issues.apache.org/jira/browse/HBASE-5274
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Mikhail Bautin
> Attachments: D1407.1.patch, D1407.1.patch, D1407.1.patch, 
> D1407.1.patch, D1407.1.patch, D1473.1.patch
>
>
> During compaction, HBase generates a store scanner which scans a list of 
> store files. It would be more efficient to filter out the expired store 
> files, since there is no need to read any key values from them.
> This optimization has already been implemented on 89-fb and is the 
> building block for HBASE-5199 as well. Compacting the expired store files is 
> supposed to be a no-op.





[jira] [Updated] (HBASE-5010) Filter HFiles based on TTL

2012-01-26 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5010:
--

   Resolution: Fixed
Fix Version/s: 0.94.0
 Assignee: Mikhail Bautin  (was: Zhihong Yu)
   Status: Resolved  (was: Patch Available)

A follow-up fix was submitted as part of HBASE-5274 to bring the trunk fix for 
this issue to parity with the 89-fb fix. Resolving.

> Filter HFiles based on TTL
> --
>
> Key: HBASE-5010
> URL: https://issues.apache.org/jira/browse/HBASE-5010
> Project: HBase
>  Issue Type: Bug
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Fix For: 0.94.0
>
> Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, 
> D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch, D909.6.patch
>
>
> In ScanWildcardColumnTracker we have
> {code:java}
>  
>   this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
>   ...
>   private boolean isExpired(long timestamp) {
>     return timestamp < oldestStamp;
>   }
> {code}
> but this time range filtering does not participate in HFile selection. In one 
> real case this caused next() calls to time out because all KVs in a table got 
> expired, but next() had to iterate over the whole table to find that out. We 
> should be able to filter out those HFiles right away. I think a reasonable 
> approach is to add a "default timerange filter" to every scan for a CF with a 
> finite TTL and utilize existing filtering in 
> StoreFile.Reader.passesTimerangeFilter.
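
The idea in the description can be illustrated at the file level: if the newest timestamp in a file is already older than the TTL cutoff, every KV in that file is expired and the file can be skipped without reading it. The sketch below reuses the oldestStamp comparison from the quoted snippet; TtlFileFilterSketch, FileInfo and selectFiles are hypothetical names, not HBase classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of TTL-based file selection. */
class TtlFileFilterSketch {

    /** Minimal per-file metadata: a name and the newest KV timestamp it holds. */
    static final class FileInfo {
        final String name;
        final long maxTimestamp;

        FileInfo(String name, long maxTimestamp) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /**
     * Keeps only files that may still contain unexpired KVs. A file is
     * dropped when its newest timestamp is below oldestStamp = now - ttl,
     * the same cutoff used by isExpired in the quoted code.
     */
    static List<FileInfo> selectFiles(List<FileInfo> files, long now, long ttl) {
        long oldestStamp = now - ttl;
        List<FileInfo> kept = new ArrayList<>();
        for (FileInfo f : files) {
            if (f.maxTimestamp >= oldestStamp) {
                kept.add(f);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<FileInfo> files = Arrays.asList(
            new FileInfo("old", 1_000L),
            new FileInfo("recent", 9_500L));
        // With now = 10000 and ttl = 1000, the cutoff is 9000.
        for (FileInfo f : selectFiles(files, 10_000L, 1_000L)) {
            System.out.println("scan " + f.name);
        }
    }
}
```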





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194357#comment-13194357
 ] 

Lars Hofhansl commented on HBASE-5153:
--

So here's the problem. This hangs while validating that HBase is not 
running via HBaseAdmin.checkHBaseAvailable, which just attempts to create a new 
HBaseAdmin after it sets hbase.client.retries.number to 1. However, 
HConnectionImpl caches hbase.client.retries.number in numRetries, and hence if 
ZK is not running, resetZooKeeperTrackersWithRetries will retry for a while.
The simplest fix would be for resetZooKeeperTrackersWithRetries to ignore the 
cached setting and retrieve the value again from the configuration. While I am 
at it, I'll also add another option to set a different number of retries here.
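
The difference between a value cached at construction time and one read at call time can be shown in a few lines. This is a simplified sketch: a plain map stands in for the HBase configuration object, RetryConfigSketch and its methods are hypothetical, and the fallback of 10 is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of cached versus freshly read configuration. */
final class RetryConfigSketch {

    static final String RETRIES_KEY = "hbase.client.retries.number";
    static final int FALLBACK = 10; // assumed default, for this sketch only

    private final Map<String, String> conf;
    private final int cachedRetries; // captured once, like numRetries

    RetryConfigSketch(Map<String, String> conf) {
        this.conf = conf;
        this.cachedRetries = lookup(conf);
    }

    private static int lookup(Map<String, String> conf) {
        String value = conf.get(RETRIES_KEY);
        return value == null ? FALLBACK : Integer.parseInt(value);
    }

    /** Value seen when the object was created; later changes are ignored. */
    int cachedRetries() {
        return cachedRetries;
    }

    /** Value read at call time; reflects later changes to the settings. */
    int currentRetries() {
        return lookup(conf);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(RETRIES_KEY, "10");
        RetryConfigSketch connection = new RetryConfigSketch(conf);
        conf.put(RETRIES_KEY, "1"); // a caller lowers the retry count afterwards
        System.out.println("cached=" + connection.cachedRetries()
            + " current=" + connection.currentRetries());
    }
}
```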

> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> ---
>
> Key: HBASE-5153
> URL: https://issues.apache.org/jira/browse/HBASE-5153
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5153-92.txt, 5153-trunk.txt, 5153-trunk.txt, 
> HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, 
> HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, 
> HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, 
> HBASE-5153.patch, TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. In that issue, we learned that if multiple 
> threads share the same connection, once this connection is aborted in one 
> thread, the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> It solves the problem of "stale connection can't be removed", but the original 
> HTable instance can't continue to be used. The connection in HTable should be 
> recreated.
> Actually, there are two approaches to solve this:
> 1. In user code, once an IOE is caught, close the connection and re-create the 
> HTable instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194347#comment-13194347
 ] 

Lars Hofhansl commented on HBASE-5153:
--

So is this change in 0.90 now? I'm confused. Should revert it from there too, I 
guess.
I will see what's up with TestMergeTool in trunk now.

> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> ---
>
> Key: HBASE-5153
> URL: https://issues.apache.org/jira/browse/HBASE-5153
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5153-92.txt, 5153-trunk.txt, 5153-trunk.txt, 
> HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, 
> HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, 
> HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, 
> HBASE-5153.patch, TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. In that issue, we learned that if multiple 
> threads share the same connection, once this connection is aborted in one 
> thread, the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> It solves the problem of "stale connection can't be removed", but the original 
> HTable instance can't continue to be used. The connection in HTable should be 
> recreated.
> Actually, there are two approaches to solve this:
> 1. In user code, once an IOE is caught, close the connection and re-create the 
> HTable instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.





[jira] [Commented] (HBASE-5274) Filter out the expired store file scanner during the compaction

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194344#comment-13194344
 ] 

Phabricator commented on HBASE-5274:


mbautin has committed the revision "[jira] [HBASE-5274] Filter out expired 
scanners on compaction as well".

REVISION DETAIL
  https://reviews.facebook.net/D1473

COMMIT
  https://reviews.facebook.net/rHBASE1236483


> Filter out the expired store file scanner during the compaction
> ---
>
> Key: HBASE-5274
> URL: https://issues.apache.org/jira/browse/HBASE-5274
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D1407.1.patch, D1407.1.patch, D1407.1.patch, 
> D1407.1.patch, D1407.1.patch, D1473.1.patch
>
>
> During compaction, HBase generates a store scanner which scans a list of 
> store files. It would be more efficient to filter out the expired store 
> files, since there is no need to read any key values from them.
> This optimization has already been implemented on 89-fb and is the 
> building block for HBASE-5199 as well. Compacting the expired store files is 
> supposed to be a no-op.





[jira] [Updated] (HBASE-5292) getsize per-CF metric incorrectly counts compaction related reads as well

2012-01-26 Thread Kannan Muthukkaruppan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kannan Muthukkaruppan updated HBASE-5292:
-

Description: 
The per-CF "getsize" metric's intent was to track bytes returned (to HBase 
clients) per-CF. [Note: We already have metrics to track # of HFileBlock's read 
for compaction vs. non-compaction cases -- e.g., compactionblockreadcnt vs. 
fsblockreadcnt.]

Currently, the "getsize" metric gets updated for both client-initiated Get/Scan 
operations and compaction-related reads. The metric is updated in 
StoreScanner.java:next() when the Scan query matcher returns an INCLUDE* code 
via a:

 HRegion.incrNumericMetric(this.metricNameGetsize, copyKv.getLength());

We should not do the above in case of compactions.


  was:
The per-CF "getsize" metric's intent was to track bytes returned (to HBase 
clients) per-CF. [Note: We already have metrics to track # of HFileBlock's read 
for compaction vs. non-compaction cases -- e.g., compactionblockreadcnt vs. 
fsblockreadcnt.]

However, currently, the metric gets updated for both client-initiated Get/Scan 
operations and compaction-related reads. The metric is updated in 
StoreScanner.java:next() when the Scan query matcher returns an INCLUDE* code 
via a:

 HRegion.incrNumericMetric(this.metricNameGetsize, copyKv.getLength());

We should not do the above in case of compactions.



> getsize per-CF metric incorrectly counts compaction related reads as well 
> --
>
> Key: HBASE-5292
> URL: https://issues.apache.org/jira/browse/HBASE-5292
> Project: HBase
>  Issue Type: Bug
>Reporter: Kannan Muthukkaruppan
>
> The per-CF "getsize" metric's intent was to track bytes returned (to HBase 
> clients) per-CF. [Note: We already have metrics to track # of HFileBlock's 
> read for compaction vs. non-compaction cases -- e.g., compactionblockreadcnt 
> vs. fsblockreadcnt.]
> Currently, the "getsize" metric gets updated for both client-initiated 
> Get/Scan operations and compaction-related reads. The metric is 
> updated in StoreScanner.java:next() when the Scan query matcher returns an 
> INCLUDE* code via a:
>  HRegion.incrNumericMetric(this.metricNameGetsize, copyKv.getLength());
> We should not do the above in case of compactions.





[jira] [Updated] (HBASE-5292) getsize per-CF metric incorrectly counts compaction related reads as well

2012-01-26 Thread Kannan Muthukkaruppan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kannan Muthukkaruppan updated HBASE-5292:
-

Description: 
The per-CF "getsize" metric's intent was to track bytes returned (to HBase 
clients) per-CF. [Note: We already have metrics to track # of HFileBlock's read 
for compaction vs. non-compaction cases -- e.g., compactionblockreadcnt vs. 
fsblockreadcnt.]

However, currently, the metric gets updated for both client-initiated Get/Scan 
operations and compaction-related reads. The metric is updated in 
StoreScanner.java:next() when the Scan query matcher returns an INCLUDE* code 
via a:

 HRegion.incrNumericMetric(this.metricNameGetsize, copyKv.getLength());

We should not do the above in case of compactions.


  was:
The per-CF "getsize" metric's intent was to track bytes returned per-CF. [Note: 
We already have metrics to track # of HFileBlock's read for compaction vs. 
non-compaction cases -- e.g., compactionblockreadcnt vs. fsblockreadcnt.]

However, currently, the metric gets updated for both client-initiated Get/Scan 
operations and compaction-related reads. The metric is updated in 
StoreScanner.java:next() when the Scan query matcher returns an INCLUDE* code 
via a:

 HRegion.incrNumericMetric(this.metricNameGetsize, copyKv.getLength());

We should not do the above in case of compactions.



> getsize per-CF metric incorrectly counts compaction related reads as well 
> --
>
> Key: HBASE-5292
> URL: https://issues.apache.org/jira/browse/HBASE-5292
> Project: HBase
>  Issue Type: Bug
>Reporter: Kannan Muthukkaruppan
>
> The per-CF "getsize" metric's intent was to track bytes returned (to HBase 
> clients) per-CF. [Note: We already have metrics to track # of HFileBlock's 
> read for compaction vs. non-compaction cases -- e.g., compactionblockreadcnt 
> vs. fsblockreadcnt.]
> However, currently, the metric gets updated for both client-initiated 
> Get/Scan operations and compaction-related reads. The metric is 
> updated in StoreScanner.java:next() when the Scan query matcher returns an 
> INCLUDE* code via a:
>  HRegion.incrNumericMetric(this.metricNameGetsize, copyKv.getLength());
> We should not do the above in case of compactions.





[jira] [Created] (HBASE-5292) getsize per-CF metric incorrectly counts compaction related reads as well

2012-01-26 Thread Kannan Muthukkaruppan (Created) (JIRA)

getsize per-CF metric incorrectly counts compaction related reads as well 
--

 Key: HBASE-5292
 URL: https://issues.apache.org/jira/browse/HBASE-5292
 Project: HBase
  Issue Type: Bug
Reporter: Kannan Muthukkaruppan


The per-CF "getsize" metric's intent was to track bytes returned per-CF. [Note: 
We already have metrics to track # of HFileBlock's read for compaction vs. 
non-compaction cases -- e.g., compactionblockreadcnt vs. fsblockreadcnt.]

However, currently, the metric gets updated for both client-initiated Get/Scan 
operations and compaction-related reads. The metric is updated in 
StoreScanner.java:next() when the Scan query matcher returns an INCLUDE* code 
via a:

 HRegion.incrNumericMetric(this.metricNameGetsize, copyKv.getLength());

We should not do the above in case of compactions.
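
The requested behavior amounts to guarding the metric update with a flag that says whether the scanner serves a compaction. The sketch below shows that guard in isolation; GetSizeMetricSketch, onIncluded and the isCompaction flag are hypothetical and do not reflect how StoreScanner or HRegion.incrNumericMetric are actually structured.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of skipping a read metric for compaction scans. */
class GetSizeMetricSketch {

    /** Stand-in for the per-CF "getsize" counter. */
    private final AtomicLong getSize = new AtomicLong();

    /** True when this scanner was opened for a compaction (assumed flag). */
    private final boolean isCompaction;

    GetSizeMetricSketch(boolean isCompaction) {
        this.isCompaction = isCompaction;
    }

    /** Called for each included KV; only client reads count toward getsize. */
    void onIncluded(int kvLength) {
        if (!isCompaction) {
            getSize.addAndGet(kvLength);
        }
    }

    long getSize() {
        return getSize.get();
    }

    public static void main(String[] args) {
        GetSizeMetricSketch clientScan = new GetSizeMetricSketch(false);
        GetSizeMetricSketch compactionScan = new GetSizeMetricSketch(true);
        clientScan.onIncluded(128);
        compactionScan.onIncluded(128);
        System.out.println("client getsize=" + clientScan.getSize());
        System.out.println("compaction getsize=" + compactionScan.getSize());
    }
}
```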






[jira] [Commented] (HBASE-5291) Add Kerberos HTTP SPNEGO authentication support to HBase web consoles

2012-01-26 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194317#comment-13194317
 ] 

Alejandro Abdelnur commented on HBASE-5291:
---

You could copy the approach of the hadoop-httpfs AuthFilter (this would enable 
reading the security-related configuration from the HBase config files).







[jira] [Created] (HBASE-5291) Add Kerberos HTTP SPNEGO authentication support to HBase web consoles

2012-01-26 Thread Andrew Purtell (Created) (JIRA)
Add Kerberos HTTP SPNEGO authentication support to HBase web consoles
-

 Key: HBASE-5291
 URL: https://issues.apache.org/jira/browse/HBASE-5291
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver, security
Reporter: Andrew Purtell


Like HADOOP-7119, the same motivations:

{quote}
Hadoop RPC already supports Kerberos authentication. 
{quote}

As does the HBase secure RPC engine.

{quote}
Kerberos enables single sign-on.

Popular browsers (Firefox and Internet Explorer) have support for Kerberos HTTP 
SPNEGO.

Adding support for Kerberos HTTP SPNEGO to [HBase] web consoles would provide a 
unified authentication mechanism and single sign-on for web UI and RPC.
{quote}

Also like HADOOP-7119, the same solution:

A servlet filter is configured in front of all Hadoop web consoles for 
authentication.

This filter verifies whether the incoming request is already authenticated by 
the presence of a signed HTTP cookie. If the cookie is present, its signature 
is valid, and its value has not expired, the request continues on its way to 
the page it invoked. If the cookie is not present, is invalid, or has expired, 
the request is delegated to an authenticator handler. The authenticator 
handler is then responsible for requesting the user credentials from the 
user-agent and validating them. This may require one or more additional 
interactions between the authenticator handler and the user-agent (which will 
be multiple HTTP requests). Once the authenticator handler verifies the 
credentials and generates an authentication token, a signed cookie is returned 
to the user-agent for all subsequent invocations.
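
The cookie check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the hadoop-auth implementation: the cookie layout (user|expiry|signature), the HMAC-SHA256 signature, and the class and method names are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical signed-cookie helper; the cookie format is an assumption. */
class SignedCookieSketch {
    private final byte[] secret;

    SignedCookieSketch(byte[] secret) {
        this.secret = secret.clone();
    }

    private byte[] hmac(String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds "user|expiresAtMillis|signature" once the handler has verified credentials. */
    String issue(String user, long expiresAtMillis) {
        String payload = user + "|" + expiresAtMillis;
        String sig = Base64.getUrlEncoder().withoutPadding().encodeToString(hmac(payload));
        return payload + "|" + sig;
    }

    /**
     * Returns the user name if the cookie is present, correctly signed, and
     * unexpired; returns null otherwise, in which case the filter would
     * delegate the request to the authenticator handler.
     */
    String authenticatedUser(String cookie, long nowMillis) {
        if (cookie == null) return null;
        int sigAt = cookie.lastIndexOf('|');
        if (sigAt < 0) return null;
        String payload = cookie.substring(0, sigAt);
        byte[] presented;
        try {
            presented = Base64.getUrlDecoder().decode(cookie.substring(sigAt + 1));
        } catch (IllegalArgumentException e) {
            return null; // signature is not valid Base64
        }
        // Constant-time comparison of the presented and expected signatures.
        if (!MessageDigest.isEqual(presented, hmac(payload))) return null;
        int expAt = payload.lastIndexOf('|');
        if (expAt < 0) return null;
        long expires;
        try {
            expires = Long.parseLong(payload.substring(expAt + 1));
        } catch (NumberFormatException e) {
            return null;
        }
        return nowMillis < expires ? payload.substring(0, expAt) : null;
    }
}
```

A servlet filter would call authenticatedUser() on each request and fall back to the pluggable handler only when it returns null.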

The authenticator handler is pluggable and 2 implementations are provided out 
of the box: pseudo/simple and kerberos.

1. The pseudo/simple authenticator handler is equivalent to the Hadoop 
pseudo/simple authentication. It trusts the value of the user.name query string 
parameter. The pseudo/simple authenticator handler supports an anonymous mode 
which accepts any request without requiring the user.name query string 
parameter to create the token. This is the default behavior, preserving the 
behavior of the HBase web consoles before this patch.

2. The kerberos authenticator handler implements Kerberos HTTP SPNEGO. This 
authenticator handler will generate a token only if a successful Kerberos HTTP 
SPNEGO interaction is performed between the user-agent and the authenticator. 
Browsers like Firefox and Internet Explorer support Kerberos HTTP SPNEGO.

We can build on the support added to Hadoop via HADOOP-7119. It should just be 
a matter of wiring up the filter to our infoservers in a similar manner. 

And from 
https://issues.apache.org/jira/browse/HBASE-5050?focusedCommentId=13171086&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13171086

{quote}
Hadoop 0.23 onwards has a hadoop-auth artifact that provides SPNEGO/Kerberos 
authentication for webapps via a filter. You should consider using it. You 
don't have to move HBase to 0.23 for that; just consume the hadoop-auth 
artifact, which has no dependencies on the rest of the Hadoop 0.23 artifacts.
{quote}





[jira] [Updated] (HBASE-5282) Possible file handle leak with truncated HLog file.

2012-01-26 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5282:
--

   Resolution: Fixed
Fix Version/s: 0.92.1, 0.94.0
 Hadoop Flags: Reviewed
       Status: Resolved  (was: Patch Available)

> Possible file handle leak with truncated HLog file.
> ---
>
> Key: HBASE-5282
> URL: https://issues.apache.org/jira/browse/HBASE-5282
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0, 0.90.5, 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.94.0, 0.92.1
>
> Attachments: hbase-5282.patch, hbase-5282.v2.patch
>
>
> While debugging hbck, I found that the code responsible for this exception 
> can leak open file handles.
> {code}
> 12/01/15 05:58:11 INFO regionserver.HRegion: Replaying edits from hdfs://haus01.sf.cloudera.com:56020/hbase-jon/test5/98a1e7255731aae44b3836641840113e/recovered.edits/3211315; minSequenceid=3214658
> 12/01/15 05:58:11 ERROR handler.OpenRegionHandler: Failed open of region=test5,8\x90\x00\x00\x00\x00\x00\x00/05_0,1326597390073.98a1e7255731aae44b3836641840113e.
> java.io.EOFException
>     at java.io.DataInputStream.readByte(DataInputStream.java:250)
>     at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:299)
>     at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:320)
>     at org.apache.hadoop.io.Text.readString(Text.java:400)
>     at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1486)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1437)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:57)
>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:158)
>     at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:572)
>     at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:1940)
>     at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:1896)
>     at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:366)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2661)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2647)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:312)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:99)
>     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:619)
> {code}
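
The general shape of this kind of leak, and one way to avoid it, can be sketched as follows. This is a hypothetical illustration of "close the stream if reading the header fails", not the attached patch; HeaderReaderSketch and its one-byte header are assumptions made for the example.

```java
import java.io.Closeable;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical reader that parses a one-byte header when it is opened. */
class HeaderReaderSketch implements Closeable {
    private final DataInputStream in;
    final byte version;

    private HeaderReaderSketch(DataInputStream in, byte version) {
        this.in = in;
        this.version = version;
    }

    /**
     * Opens a reader over the raw stream. If the header cannot be read (for
     * example EOFException on a truncated file), the stream is closed before
     * the exception propagates, so the caller never ends up with an open
     * handle it has no reference to.
     */
    static HeaderReaderSketch open(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        try {
            byte version = in.readByte(); // throws EOFException on an empty stream
            return new HeaderReaderSketch(in, version);
        } catch (IOException | RuntimeException e) {
            try {
                in.close();
            } catch (IOException closeFailure) {
                e.addSuppressed(closeFailure);
            }
            throw e;
        }
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

The key point is that a constructor or factory that throws after acquiring the stream must release it itself, because no caller ever receives the object to close.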





[jira] [Commented] (HBASE-5274) Filter out the expired store file scanner during the compaction

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194286#comment-13194286
 ] 

Phabricator commented on HBASE-5274:


lhofhansl has commented on the revision "[jira] [HBASE-5274] Filter out expired 
scanners on compaction as well".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:975 Fair 
enough. +1 :)

REVISION DETAIL
  https://reviews.facebook.net/D1473


> Filter out the expired store file scanner during the compaction
> ---
>
> Key: HBASE-5274
> URL: https://issues.apache.org/jira/browse/HBASE-5274
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D1407.1.patch, D1407.1.patch, D1407.1.patch, 
> D1407.1.patch, D1407.1.patch, D1473.1.patch
>
>
> During compaction, HBase generates a store scanner that scans a list of 
> store files. It would be more efficient to filter out the expired store 
> files, since there is no need to read any key values from them.
> This optimization has already been implemented on 89-fb, and it is the 
> building block for HBASE-5199 as well. Compacting the expired store files is 
> supposed to be a no-op.
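
A minimal sketch of the filtering idea, assuming each store file records the maximum timestamp of its cells and the column family has a TTL. StoreFileSketch and selectUnexpired are hypothetical names for illustration; the real patch works on store file scanners inside the region server.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical summary of a store file: its name and newest cell timestamp. */
class StoreFileSketch {
    final String name;
    final long maxTimestampMillis;

    StoreFileSketch(String name, long maxTimestampMillis) {
        this.name = name;
        this.maxTimestampMillis = maxTimestampMillis;
    }
}

class ExpiredFileFilterSketch {
    /**
     * Keeps only files that can still contain live cells. A file whose newest
     * cell is older than (now - ttl) holds nothing but expired data, so a
     * compaction does not need to open a scanner on it.
     */
    static List<StoreFileSketch> selectUnexpired(
            List<StoreFileSketch> files, long ttlMillis, long nowMillis) {
        long oldestUnexpired = nowMillis - ttlMillis;
        List<StoreFileSketch> keep = new ArrayList<>();
        for (StoreFileSketch f : files) {
            if (f.maxTimestampMillis >= oldestUnexpired) {
                keep.add(f);
            }
        }
        return keep;
    }
}
```

If every input file is filtered out, the compaction has nothing to read, which matches the expectation that compacting expired files is a no-op.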





[jira] [Commented] (HBASE-4991) Provide capability to delete named region

2012-01-26 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194282#comment-13194282
 ] 

Zhihong Yu commented on HBASE-4991:
---

BTW OnlineMerger is in src/main/java/org/apache/hadoop/hbase/util/HMerge.java

I think for this case we don't need to create an empty region because we would 
end up closing at least two regions. That may increase the downtime for the 
underlying table.

> Provide capability to delete named region
> -
>
> Key: HBASE-4991
> URL: https://issues.apache.org/jira/browse/HBASE-4991
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>
> See discussion titled 'Able to control routing to Solr shards or not' on 
> lily-discuss
> Users may want to quickly dispose of out-of-date records by deleting specific 
> regions. 





[jira] [Commented] (HBASE-5282) Possible file handle leak with truncated HLog file.

2012-01-26 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194278#comment-13194278
 ] 

Jonathan Hsieh commented on HBASE-5282:
---

First code commit! Thanks for the review Ted, Lars!





[jira] [Commented] (HBASE-4991) Provide capability to delete named region

2012-01-26 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194267#comment-13194267
 ] 

Jonathan Hsieh commented on HBASE-4991:
---

Oops -- wasn't looking at the comment tab.

There is similar code in OnlineMerge and uber hbck.

The code in uber hbck creates a new empty region, closes the old regions, moves 
their data into the new empty region, and then activates the now-populated 
region.

Beware -- I found that just closing a region seems to leave data around in the 
HMaster's memory, which caused problems with disabling in the 0.90.x version. 
I'm currently porting to trunk/0.92 and am finding out whether there are 
similar or different problems. I think I recently saw something else in 
closeRegion that I need to try out -- I don't remember which version that is, 
however.






[jira] [Commented] (HBASE-5274) Filter out the expired store file scanner during the compaction

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194268#comment-13194268
 ] 

Phabricator commented on HBASE-5274:


mbautin has commented on the revision "[jira] [HBASE-5274] Filter out expired 
scanners on compaction as well".

  Lars: please see my response inline.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:975 In that 
case I would have to make LruBlockCache.getCachedFileNamesForTest public. In 
addition, this patch makes the HBASE-5010 implementation consistent between 
89-fb and trunk, and moving the unit test around might create confusion.

  Please let me know if this is OK to commit.

REVISION DETAIL
  https://reviews.facebook.net/D1473






[jira] [Commented] (HBASE-4991) Provide capability to delete named region

2012-01-26 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194265#comment-13194265
 ] 

Zhihong Yu commented on HBASE-4991:
---

I think both of them should be done.





[jira] [Commented] (HBASE-4991) Provide capability to delete named region

2012-01-26 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194263#comment-13194263
 ] 

Jonathan Hsieh commented on HBASE-4991:
---

When you delete regions, do you intend to just get rid of all the data in the 
region, or do you mean to create a hole in a region and then merge with a 
preceding or succeeding region?







[jira] [Commented] (HBASE-5274) Filter out the expired store file scanner during the compaction

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194249#comment-13194249
 ] 

Phabricator commented on HBASE-5274:


lhofhansl has commented on the revision "[jira] [HBASE-5274] Filter out expired 
scanners on compaction as well".

  +1

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:975 Minor 
comment: Is there no way to move the tests into the same package and leave this 
protected?

REVISION DETAIL
  https://reviews.facebook.net/D1473






[jira] [Commented] (HBASE-5274) Filter out the expired store file scanner during the compaction

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194246#comment-13194246
 ] 

Phabricator commented on HBASE-5274:


tedyu has commented on the revision "[jira] [HBASE-5274] Filter out expired 
scanners on compaction as well".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java:218 That 
should be fine.
  Another approach is to use a comment directly.

REVISION DETAIL
  https://reviews.facebook.net/D1473






[jira] [Commented] (HBASE-5274) Filter out the expired store file scanner during the compaction

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194238#comment-13194238
 ] 

Phabricator commented on HBASE-5274:


mbautin has commented on the revision "[jira] [HBASE-5274] Filter out expired 
scanners on compaction as well".

  Ted: replying to your comment inline. Please let me know if this is OK to be 
committed.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java:218 
@tedyu: this is just for clarity. Boolean parameters are inherently confusing, 
and the named variable is the equivalent of a comment saying that "false" is 
the value passed for "isCompaction".
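
For illustration, the two styles discussed in this review (a named local versus an inline comment) might look like the sketch below. selectScanners is a hypothetical callee, not the method at StoreScanner.java:218.

```java
class BooleanArgClaritySketch {
    // Hypothetical callee standing in for the real method under review.
    static String selectScanners(boolean isCompaction) {
        return isCompaction ? "compaction" : "user-scan";
    }

    // Option 1: a named local documents what the literal means at the call site.
    static String withNamedLocal() {
        final boolean isCompaction = false;
        return selectScanners(isCompaction);
    }

    // Option 2: pass the literal directly and annotate it with a comment.
    static String withInlineComment() {
        return selectScanners(/* isCompaction= */ false);
    }
}
```

Both forms behave identically; the choice is purely about readability at the call site.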

REVISION DETAIL
  https://reviews.facebook.net/D1473






[jira] [Created] (HBASE-5290) [FindBugs] Synchronization on boxed primitive

2012-01-26 Thread Liyin Tang (Created) (JIRA)
[FindBugs] Synchronization on boxed primitive
-

 Key: HBASE-5290
 URL: https://issues.apache.org/jira/browse/HBASE-5290
 Project: HBase
  Issue Type: Bug
Reporter: Liyin Tang
Assignee: Liyin Tang
Priority: Minor


This bug was reported by FindBugs, a static analysis tool.

Bug: Synchronization on Integer in 
org.apache.hadoop.hbase.regionserver.compactions.CompactSelection.emptyFileList()
The code synchronizes on a boxed primitive constant, such as an Integer.

private static Integer count = 0;
...
synchronized(count) {
  count++;
}
...
Since Integer objects can be cached and shared, this code could be 
synchronizing on the same object as other, unrelated code, leading to 
unresponsiveness and possible deadlock.

See CERT CON08-J, "Do not synchronize on objects that may be reused", for more 
information.
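
Two common ways to avoid this pattern are sketched below. Note that count++ on an Integer field also rebinds the field to a different object, so successive callers may lock different monitors. The class names are hypothetical and this is not the fix committed for CompactSelection; it only illustrates the alternatives.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Option 1: an AtomicInteger gives an atomic increment with no lock object at all. */
class AtomicCounterSketch {
    private static final AtomicInteger COUNT = new AtomicInteger();

    static int increment() { return COUNT.incrementAndGet(); }

    static int get() { return COUNT.get(); }
}

/** Option 2: keep a primitive counter and guard it with a dedicated private lock. */
class LockedCounterSketch {
    // A private final Object is never shared with unrelated code and never rebound.
    private static final Object LOCK = new Object();
    private static int count = 0;

    static int increment() {
        synchronized (LOCK) {
            return ++count;
        }
    }

    static int get() {
        synchronized (LOCK) {
            return count;
        }
    }
}
```

The atomic variant is usually the simpler choice for a plain counter; the explicit lock is useful when the counter must be updated together with other state.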

Confidence: Normal, Rank: Troubling (14)
Pattern: DL_SYNCHRONIZATION_ON_BOXED_PRIMITIVE 
Type: DL, Category: MT_CORRECTNESS (Multithreaded correctness)





[jira] [Commented] (HBASE-5274) Filter out the expired store file scanner during the compaction

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194224#comment-13194224
 ] 

Phabricator commented on HBASE-5274:


Kannan has commented on the revision "[jira] [HBASE-5274] Filter out expired 
scanners on compaction as well".

  +1

REVISION DETAIL
  https://reviews.facebook.net/D1473






[jira] [Commented] (HBASE-5274) Filter out the expired store file scanner during the compaction

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194220#comment-13194220
 ] 

Phabricator commented on HBASE-5274:


tedyu has commented on the revision "[jira] [HBASE-5274] Filter out expired 
scanners on compaction as well".

  Looks good.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java:218 
isCompaction is not needed, can pass false directly.

REVISION DETAIL
  https://reviews.facebook.net/D1473






[jira] [Commented] (HBASE-5274) Filter out the expired store file scanner during the compaction

2012-01-26 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194213#comment-13194213
 ] 

Phabricator commented on HBASE-5274:


Liyin has accepted the revision "[jira] [HBASE-5274] Filter out expired 
scanners on compaction as well".

  LGTM. Thanks Mikhail !

REVISION DETAIL
  https://reviews.facebook.net/D1473






[jira] [Updated] (HBASE-5274) Filter out the expired store file scanner during the compaction

2012-01-26 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5274:
---

Attachment: D1473.1.patch

mbautin requested code review of "[jira] [HBASE-5274] Filter out expired 
scanners on compaction as well".
Reviewers: Liyin, JIRA, lhofhansl, Kannan

  This is a followup for D1017 to make it similar to D909 (89-fb). The fix for 
89-fb used the TTL-based scanner filtering logic on both normal scanners and 
compactions, while the trunk fix D1017 did not. This is just the delta between 
the two diffs that brings filtering expired store files on compaction to trunk.

TEST PLAN
  Unit tests

REVISION DETAIL
  https://reviews.facebook.net/D1473

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestScannerSelectionUsingTTL.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java







[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194193#comment-13194193
 ] 

Zhihong Yu commented on HBASE-5153:
---

There were two failed tests:
https://builds.apache.org/job/HBase-0.92-security/81/

If you can resolve the hanging TestMergeTool, that would be great.

I am on-call this week, FYI

> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> ---
>
> Key: HBASE-5153
> URL: https://issues.apache.org/jira/browse/HBASE-5153
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5153-92.txt, 5153-trunk.txt, 5153-trunk.txt, 
> HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, 
> HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, 
> HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, 
> HBASE-5153.patch, TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. From that issue we know that if multiple 
> threads share the same connection and the connection is aborted in one 
> thread, the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> It solves the problem of "stale connection can't be removed", but the original 
> HTable instance can no longer be used. The connection in HTable should be 
> recreated.
> There are two approaches to solve this:
> 1. In user code, on catching an IOE, close the connection and re-create the 
> HTable instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.
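
Approach 1 above (the user-side workaround) might look roughly like the sketch below. It is deliberately independent of the HBase client API: TableHandle, RecreatingClient and the factory argument are hypothetical stand-ins for an HTable and for whatever code opens a fresh connection, and the attempt limit is an assumption rather than anything from the patches attached to this issue.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical handle standing in for an HTable bound to a connection. */
interface TableHandle extends AutoCloseable {
    String get(String row) throws IOException;

    @Override
    void close();
}

/** Hypothetical wrapper: re-creates the handle after an IOException and retries. */
final class RecreatingClient {
    private final Callable<TableHandle> factory; // opens a fresh connection and table
    private final int maxAttempts;
    private TableHandle table;

    RecreatingClient(Callable<TableHandle> factory, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        this.factory = factory;
        this.maxAttempts = maxAttempts;
    }

    /** Runs a read; on IOException, discards the handle and retries on a new one. */
    String get(String row) throws Exception {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (table == null) {
                    table = factory.call();
                }
                return table.get(row);
            } catch (IOException e) {
                last = e;
                if (table != null) {
                    table.close(); // drop the handle tied to the closed connection
                    table = null;
                }
            }
        }
        throw last; // every attempt failed; surface the most recent error
    }
}
```

The drawback of this workaround is that every caller has to carry such a wrapper, which is part of why approach 2 moves the retry into the client itself.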





[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers

2012-01-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194189#comment-13194189
 ] 

Lars Hofhansl commented on HBASE-5153:
--

Also it occurred to me that another nice change would be to be able to specify 
the retry count for resetZooKeeperTrackersWithRetries separately from the other 
operations. 
The thinking is this:
While ZK is not reachable, the HConnection (and any other HConnection) is 
essentially not usable. In some settings it might be good to have the 
connection just sit there and keep retrying for as long as the connection is 
bad. Maybe for another jira.

Where are we with this generally?
Is it just TestMergeTool hanging? If so I'll have a look at it today.





[jira] [Commented] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194186#comment-13194186
 ] 

Hudson commented on HBASE-5271:
---

Integrated in HBase-0.92 #263 (See 
[https://builds.apache.org/job/HBase-0.92/263/])
HBASE-5271  Result.getValue and Result.getColumnLatest return the wrong 
column (Ghais Issa)

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java


> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
>Assignee: Ghais Issa
> Fix For: 0.94.0, 0.90.7, 0.92.1
>
> Attachments: 5271-90.txt, 5271-v2.txt, 
> fixKeyValueMatchingColumn.diff, testGetValue.diff
>
>
> In the following example, result.getValue returns the wrong column:
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2")))); //prints 7.
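
The report does not show the faulty comparison itself. As an illustration only, and assuming the mismatch comes from comparing just the first bytes of the stored family, the sketch below shows how a lookup for family "2" could match a cell stored under family "24". ColumnMatch and its methods are hypothetical and are not taken from KeyValue.java or from the attached diffs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper contrasting a prefix comparison with an exact one. */
final class ColumnMatch {
    /** Faulty variant: looks only at the first family.length bytes of the stored family. */
    static boolean prefixMatch(byte[] storedFamily, byte[] family) {
        if (storedFamily.length < family.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(storedFamily, family.length), family);
    }

    /** Stricter variant: lengths and contents must both agree. */
    static boolean exactMatch(byte[] storedFamily, byte[] family) {
        return Arrays.equals(storedFamily, family);
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

With the stored family "24" and the requested family "2", prefixMatch accepts the cell while exactMatch rejects it, which matches the symptom in the example above.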




