[jira] [Updated] (HBASE-10451) Enable back Tag compression on HFiles

2014-02-24 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-10451:
---

Attachment: HBASE-10451_V6.patch

Tests are passing with this.  IntegrationTestIngestWithVisibilityLabels run 
also looks fine.

The change in TestEncodedSeekers is to remove the encodeOnDisk parameter.  We don't have 
any such setting available now in HCD.

 Enable back Tag compression on HFiles
 -

 Key: HBASE-10451
 URL: https://issues.apache.org/jira/browse/HBASE-10451
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Critical
 Fix For: 0.98.1, 0.99.0

 Attachments: HBASE-10451.patch, HBASE-10451_V2.patch, 
 HBASE-10451_V3.patch, HBASE-10451_V4.patch, HBASE-10451_V5.patch, 
 HBASE-10451_V6.patch


 HBASE-10443 disables tag compression on HFiles. This Jira is to fix the 
 issues we found in HBASE-10443 and enable it back.
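 For illustration only, a minimal sketch of how a column family could opt in to tag
 compression once it is re-enabled; it assumes HColumnDescriptor#setCompressTags and an
 HFile v3 setup (hfile.format.version = 3) are available in this release, so treat the
 exact knobs as assumptions rather than the API of this patch:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.io.encoding.DataBlockEncoding;

public class TagCompressionExample {
  public static void main(String[] args) {
    // Assumption: tags are only written with HFile v3, toggled by hfile.format.version.
    Configuration conf = HBaseConfiguration.create();
    conf.setInt("hfile.format.version", 3);

    HColumnDescriptor family = new HColumnDescriptor("f");
    family.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF); // encoding used together with tags
    family.setCompressTags(true);                             // assumed per-family switch for tag compression

    HTableDescriptor table = new HTableDescriptor(TableName.valueOf("labelled_table"));
    table.addFamily(family);
  }
}
{code}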



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10594) Speed up TestRestoreSnapshotFromClient a bit

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910100#comment-13910100
 ] 

Hudson commented on HBASE-10594:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #166 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/166/])
HBASE-10594 Speed up TestRestoreSnapshotFromClient a bit. (larsh: rev 1571142)
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestRestoreSnapshotFromClient.java


 Speed up TestRestoreSnapshotFromClient a bit
 

 Key: HBASE-10594
 URL: https://issues.apache.org/jira/browse/HBASE-10594
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.18

 Attachments: 10594-0.94.txt, 10594-trunk.txt


 Looking through the longest-running tests in 0.94, I noticed that 
 TestRestoreSnapshotFromClient runs for over 10 minutes on the jenkins boxes 
 (264s on my local box).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-8304) Bulkload fail to remove files if fs.default.name / fs.defaultFS is configured without default port.

2014-02-24 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated HBASE-8304:


Attachment: (was: HBASE-9537.patch)

 Bulkload fail to remove files if fs.default.name / fs.defaultFS is configured 
 without default port.
 ---

 Key: HBASE-8304
 URL: https://issues.apache.org/jira/browse/HBASE-8304
 Project: HBase
  Issue Type: Bug
  Components: HFile, regionserver
Affects Versions: 0.94.5
Reporter: Raymond Liu
  Labels: bulkloader
 Attachments: HBASE-8304.patch


 When fs.default.name or fs.defaultFS in the hadoop core-site.xml is configured as 
 hdfs://ip, and hbase.rootdir is configured as hdfs://ip:port/hbaserootdir 
 where port is the hdfs namenode's default port, the bulkload operation will 
 not remove the files in the bulk output dir. Store::bulkLoadHfile will treat 
 hdfs://ip and hdfs://ip:port as different filesystems and go with the copy 
 approach instead of rename.
 The root cause is that the hbase master rewrites fs.default.name/fs.defaultFS 
 according to hbase.rootdir when the regionserver starts; thus, the dest fs uri from 
 the hregion will not match the src fs uri passed from the client.
 Any suggestion what the best approach to fix this issue would be? 
 I kind of think that we could check for the default port if the src uri comes 
 without port info.
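 A minimal sketch of the default-port check suggested above; the helper name and the
 hard-coded NameNode default port (8020) are assumptions for illustration, not the
 actual patch:
{code}
import java.net.URI;

public final class FsUriMatcher {
  private static final int HDFS_DEFAULT_PORT = 8020; // assumption: NameNode default RPC port

  /** Treat hdfs://ip and hdfs://ip:8020 as the same filesystem. */
  static boolean sameFileSystem(URI src, URI dest) {
    if (!eq(src.getScheme(), dest.getScheme()) || !eq(src.getHost(), dest.getHost())) {
      return false;
    }
    int srcPort = src.getPort() == -1 ? HDFS_DEFAULT_PORT : src.getPort();
    int destPort = dest.getPort() == -1 ? HDFS_DEFAULT_PORT : dest.getPort();
    return srcPort == destPort;
  }

  private static boolean eq(String a, String b) {
    return a == null ? b == null : a.equalsIgnoreCase(b);
  }

  public static void main(String[] args) {
    // true -> the bulkload path could rename instead of copying
    System.out.println(sameFileSystem(URI.create("hdfs://10.0.0.1"),
        URI.create("hdfs://10.0.0.1:8020")));
  }
}
{code}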



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10451) Enable back Tag compression on HFiles

2014-02-24 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-10451:
---

Status: Open  (was: Patch Available)

 Enable back Tag compression on HFiles
 -

 Key: HBASE-10451
 URL: https://issues.apache.org/jira/browse/HBASE-10451
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Critical
 Fix For: 0.98.1, 0.99.0

 Attachments: HBASE-10451.patch, HBASE-10451_V2.patch, 
 HBASE-10451_V3.patch, HBASE-10451_V4.patch, HBASE-10451_V5.patch


 HBASE-10443 disables tag compression on HFiles. This Jira is to fix the 
 issues we found in HBASE-10443 and enable it back.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-8304) Bulkload fail to remove files if fs.default.name / fs.defaultFS is configured without default port.

2014-02-24 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated HBASE-8304:


Attachment: HBASE-8304.patch

 Bulkload fail to remove files if fs.default.name / fs.defaultFS is configured 
 without default port.
 ---

 Key: HBASE-8304
 URL: https://issues.apache.org/jira/browse/HBASE-8304
 Project: HBase
  Issue Type: Bug
  Components: HFile, regionserver
Affects Versions: 0.94.5
Reporter: Raymond Liu
  Labels: bulkloader
 Attachments: HBASE-8304.patch


 When fs.default.name or fs.defaultFS in the hadoop core-site.xml is configured as 
 hdfs://ip, and hbase.rootdir is configured as hdfs://ip:port/hbaserootdir 
 where port is the hdfs namenode's default port, the bulkload operation will 
 not remove the files in the bulk output dir. Store::bulkLoadHfile will treat 
 hdfs://ip and hdfs://ip:port as different filesystems and go with the copy 
 approach instead of rename.
 The root cause is that the hbase master rewrites fs.default.name/fs.defaultFS 
 according to hbase.rootdir when the regionserver starts; thus, the dest fs uri from 
 the hregion will not match the src fs uri passed from the client.
 Any suggestion what the best approach to fix this issue would be? 
 I kind of think that we could check for the default port if the src uri comes 
 without port info.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10451) Enable back Tag compression on HFiles

2014-02-24 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-10451:
---

Status: Patch Available  (was: Open)

 Enable back Tag compression on HFiles
 -

 Key: HBASE-10451
 URL: https://issues.apache.org/jira/browse/HBASE-10451
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Critical
 Fix For: 0.98.1, 0.99.0

 Attachments: HBASE-10451.patch, HBASE-10451_V2.patch, 
 HBASE-10451_V3.patch, HBASE-10451_V4.patch, HBASE-10451_V5.patch, 
 HBASE-10451_V6.patch


 HBASE-10443 disables tag compression on HFiles. This Jira is to fix the 
 issues we found in HBASE-10443 and enable it back.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10594) Speed up TestRestoreSnapshotFromClient a bit

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910093#comment-13910093
 ] 

Hudson commented on HBASE-10594:


FAILURE: Integrated in HBase-0.94-security #420 (See 
[https://builds.apache.org/job/HBase-0.94-security/420/])
HBASE-10594 Speed up TestRestoreSnapshotFromClient a bit. (larsh: rev 1571144)
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestRestoreSnapshotFromClient.java


 Speed up TestRestoreSnapshotFromClient a bit
 

 Key: HBASE-10594
 URL: https://issues.apache.org/jira/browse/HBASE-10594
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.18

 Attachments: 10594-0.94.txt, 10594-trunk.txt


 Looking through the longest-running tests in 0.94, I noticed that 
 TestRestoreSnapshotFromClient runs for over 10 minutes on the jenkins boxes 
 (264s on my local box).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10595) HBaseAdmin.getTableDescriptor can wrongly get the previous table's TableDescriptor even after the table dir in hdfs is removed

2014-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910114#comment-13910114
 ] 

Hadoop QA commented on HBASE-10595:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12630609/HBASE-10595-trunk_v2.patch
  against trunk revision .
  ATTACHMENT ID: 12630609

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.hadoop.hbase.util.TestHBaseFsck.testSplitDaughtersNotInMeta(TestHBaseFsck.java:1477)
at 
org.apache.hadoop.hbase.util.TestHBaseFsck.testOverlapAndOrphan(TestHBaseFsck.java:859)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8781//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8781//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8781//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8781//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8781//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8781//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8781//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8781//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8781//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8781//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8781//console

This message is automatically generated.

 HBaseAdmin.getTableDescriptor can wrongly get the previous table's 
 TableDescriptor even after the table dir in hdfs is removed
 --

 Key: HBASE-10595
 URL: https://issues.apache.org/jira/browse/HBASE-10595
 Project: HBase
  Issue Type: Bug
  Components: master, util
Reporter: Feng Honghua
Assignee: Feng Honghua
 Attachments: HBASE-10595-trunk_v1.patch, HBASE-10595-trunk_v2.patch


 When a table dir (in hdfs) is removed (from outside), HMaster will still return 
 the cached TableDescriptor to the client for a getTableDescriptor request.
 By contrast, HBaseAdmin.listTables() is handled correctly in the current 
 implementation. For a table whose table dir in hdfs has been removed from outside, 
 getTableDescriptor can still retrieve a valid (old) table descriptor, 
 while listTables says it doesn't exist; this is inconsistent.
 The reason for this bug is that HMaster (via FSTableDescriptors) doesn't 
 check whether the table dir exists for a getTableDescriptor() request (while it 
 lists all existing table dirs, not consulting the cache first, and returns 
 accordingly for a listTables() request).
 When a table is deleted via deleteTable, the cache is cleared after the 
 table dir and tableInfo file are removed, so the listTables/getTableDescriptor 
 inconsistency should be transient (though it still exists when the table dir is 
 removed while the cache is not cleared) and harder to expose.
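 A minimal sketch of the direction hinted at above, i.e. checking that the table dir
 still exists before trusting the descriptor cache; the class, field names, and the
 simplified directory layout are illustrative assumptions, not the FSTableDescriptors
 internals:
{code}
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class CachedTableDescriptors {
  private final FileSystem fs;
  private final Path rootDir;
  private final Map<String, String> cache = new ConcurrentHashMap<String, String>(); // table -> descriptor (simplified)

  CachedTableDescriptors(FileSystem fs, Path rootDir) {
    this.fs = fs;
    this.rootDir = rootDir;
  }

  /** Return the cached descriptor only if the table dir still exists in HDFS. */
  String get(String tableName) throws IOException {
    Path tableDir = new Path(rootDir, tableName); // simplified layout; real code uses FSUtils helpers
    if (!fs.exists(tableDir)) {
      cache.remove(tableName); // drop the stale entry so getTableDescriptor and listTables agree
      return null;
    }
    return cache.get(tableName);
  }
}
{code}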



--
This 

[jira] [Commented] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt

2014-02-24 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910125#comment-13910125
 ] 

Nicolas Liochon commented on HBASE-10525:
-

bq. when the connection.interrupt() is invoked the reader thread gets it.. What 
happens to the writer thread if it was waiting in the callsToWrite.take() call.
In #markClosed, we put a Call onto the queue (the DEATH_PILL); this way the 
writer exits the 'take' method. The reader thread calls #markClosed on any 
exception, interruptions included.
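A minimal sketch of the poison-pill pattern described above (a blocking queue, a writer
thread, and a DEATH_PILL sentinel); the names are illustrative, not the actual RpcClient
code:
{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class PoisonPillWriter {
  static final Object DEATH_PILL = new Object();              // sentinel pushed by markClosed()
  static final BlockingQueue<Object> callsToWrite = new LinkedBlockingQueue<Object>();

  public static void main(String[] args) throws InterruptedException {
    Thread writer = new Thread(new Runnable() {
      public void run() {
        try {
          while (true) {
            Object call = callsToWrite.take();                // blocks until a call or the pill arrives
            if (call == DEATH_PILL) {
              return;                                         // connection closed: exit cleanly
            }
            System.out.println("writing " + call);
          }
        } catch (InterruptedException ie) {
          // interrupted while waiting: also exit
        }
      }
    });
    writer.start();

    callsToWrite.put("call-1");
    callsToWrite.put(DEATH_PILL);                             // what markClosed() does, conceptually
    writer.join();
  }
}
{code}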

 Allow the client to use a different thread for writing to ease interrupt
 

 Key: HBASE-10525
 URL: https://issues.apache.org/jira/browse/HBASE-10525
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.99.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10525.v1.patch, 10525.v2.patch, 10525.v3.patch, 
 10525.v4.patch, 10525.v5.patch, 10525.v6.patch, 10525.v7.patch, 
 HBaseclient-EventualConsistency.pdf


 This is an issue in the HBASE-10070 context, but also more generally if 
 you want to interrupt an operation at a limited cost. 
 I will attach a doc with a more detailed explanation.
 This adds a thread per region server, so it's optional. The first patch 
 activates it by default to see how it behaves on a full hadoop-qa run. The 
 target is for it to be unset by default.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-8304) Bulkload fail to remove files if fs.default.name / fs.defaultFS is configured without default port.

2014-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910126#comment-13910126
 ] 

Hadoop QA commented on HBASE-8304:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12630630/HBASE-8304.patch
  against trunk revision .
  ATTACHMENT ID: 12630630

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 hadoop1.0{color}.  The patch failed to compile against the 
hadoop 1.0 profile.
Here is snippet of errors:
{code}[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) 
on project hbase-server: Compilation failure
[ERROR] 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionFileSystem.java:[430,70]
 cannot find symbol
[ERROR] symbol  : method 
getNNServiceRpcAddresses(org.apache.hadoop.conf.Configuration)
[ERROR] location: class org.apache.hadoop.hdfs.DFSUtil
[ERROR] - [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) 
on project hbase-server: Compilation failure
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionFileSystem.java:[430,70]
 cannot find symbol
symbol  : method getNNServiceRpcAddresses(org.apache.hadoop.conf.Configuration)
location: class org.apache.hadoop.hdfs.DFSUtil

at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:213)
--
Caused by: org.apache.maven.plugin.CompilationFailureException: Compilation 
failure
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionFileSystem.java:[430,70]
 cannot find symbol
symbol  : method getNNServiceRpcAddresses(org.apache.hadoop.conf.Configuration)
location: class org.apache.hadoop.hdfs.DFSUtil

at 
org.apache.maven.plugin.AbstractCompilerMojo.execute(AbstractCompilerMojo.java:729){code}

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8783//console

This message is automatically generated.

 Bulkload fail to remove files if fs.default.name / fs.defaultFS is configured 
 without default port.
 ---

 Key: HBASE-8304
 URL: https://issues.apache.org/jira/browse/HBASE-8304
 Project: HBase
  Issue Type: Bug
  Components: HFile, regionserver
Affects Versions: 0.94.5
Reporter: Raymond Liu
  Labels: bulkloader
 Attachments: HBASE-8304.patch


 When fs.default.name or fs.defaultFS in the hadoop core-site.xml is configured as 
 hdfs://ip, and hbase.rootdir is configured as hdfs://ip:port/hbaserootdir 
 where port is the hdfs namenode's default port, the bulkload operation will 
 not remove the files in the bulk output dir. Store::bulkLoadHfile will treat 
 hdfs://ip and hdfs://ip:port as different filesystems and go with the copy 
 approach instead of rename.
 The root cause is that the hbase master rewrites fs.default.name/fs.defaultFS 
 according to hbase.rootdir when the regionserver starts; thus, the dest fs uri from 
 the hregion will not match the src fs uri passed from the client.
 Any suggestion what the best approach to fix this issue would be? 
 I kind of think that we could check for the default port if the src uri comes 
 without port info.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10355) Failover RPC's from client using region replicas

2014-02-24 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910141#comment-13910141
 ] 

Nicolas Liochon commented on HBASE-10355:
-

bq. The following fragment solves the issue for me. Basically we just rethrow 
InterruptedIOEx. Can you take a look:
This would work. We also need to exclude the SocketTimeoutException; that is 
done by a utility class. So it would become:
{code}
  private RegionLocations getRegionLocations(boolean useCache)
      throws RetriesExhaustedException, DoNotRetryIOException, InterruptedIOException {
    RegionLocations rl;
    try {
      rl = cConnection.locateRegion(tableName, get.getRow(), useCache, true);
    } catch (DoNotRetryIOException e) {
      throw e;
    } catch (RetriesExhaustedException e) {
      throw e;
    } catch (IOException e) {
      ExceptionUtil.rethrowIfInterrupt(e);
      throw new RetriesExhaustedException("Can't get the location", e);
    }
    if (rl == null) {
      throw new RetriesExhaustedException("Can't get the locations");
    }

    return rl;
  }
{code}

 Failover RPC's from client using region replicas
 

 Key: HBASE-10355
 URL: https://issues.apache.org/jira/browse/HBASE-10355
 Project: HBase
  Issue Type: Sub-task
  Components: Client
Reporter: Enis Soztutar
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10355.v1.patch, 10355.v2.patch, 10355.v3.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10579) [Documentation]: ExportSnapshot tool package incorrectly documented

2014-02-24 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-10579:


Assignee: Aleksandr Shulman

 [Documentation]: ExportSnapshot tool package incorrectly documented
 ---

 Key: HBASE-10579
 URL: https://issues.apache.org/jira/browse/HBASE-10579
 Project: HBase
  Issue Type: Bug
  Components: documentation, snapshots
Affects Versions: 0.98.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.2, 0.98.1

 Attachments: HBASE-10579-v0.patch


 Documentation Page: http://hbase.apache.org/book/ops.snapshots.html
 Expected documentation:
 The class should be specified as 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot
 Current documentation:
 Specified as: org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot
 This makes sense because the class is located in the 
 org.apache.hadoop.hbase.snapshot package:
 https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#19



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10566) cleanup rpcTimeout in the client

2014-02-24 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910145#comment-13910145
 ] 

Nicolas Liochon commented on HBASE-10566:
-

bq. I suppose it is ok. Maybe rename the class so it is not confused with 
Callable.
Actually, we never use the fact that it's a Java Callable. But changing the 
name can impact a lot of code. I will try (IntelliJ will do the change for me, 
but it can make the patch much bigger, I don't know).

bq. Is TimeLimitedRpcController left as an exercise to the reader
I forgot it (usual stuff: not added to git, so not included in git diff). The 
patch compiles globally, but it does not set the timeout all the time.

bq. Doesn't callTimeout make more sense for this parameter name?
Often a timeout indicates a duration, while here I used something more like a cutoff 
time. That's what I wanted to express. There is an implication, however: the 
client and the server clocks must be in sync. Even if that's a common requirement, 
I'm not sure I won't change my mind.

Thanks a lot for the feedback, I'm going to try to write the full patch.




 cleanup rpcTimeout in the client
 

 Key: HBASE-10566
 URL: https://issues.apache.org/jira/browse/HBASE-10566
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.99.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10566.sample.patch


 There are two issues:
 1) A confusion between the socket timeout and the call timeout.
 Socket timeouts should be minimal: a default like 20 seconds, which could be 
 lowered to single-digit timeouts for some apps; if we cannot write to the 
 socket in 10 seconds, we have an issue. This is different from the total 
 duration (send query + do query + receive query), which can be longer, as it 
 can include remote calls on the server and so on. Today, we have a single 
 value, which does not allow us to have low socket read timeouts.
 2) The timeout can differ between the calls. Typically, if the total 
 time, retries included, is 60 seconds but the first attempt failed after 2 seconds, 
 then the remaining time is 58s. HBase does this today, but by hacking with a 
 thread-local storage variable. It's a hack (it should have been a parameter of the 
 methods; the TLS allows bypassing all the layers. Maybe protobuf makes this 
 complicated, to be confirmed), and it also does not really work, because 
 we can have multithreading issues (we use the updated rpc timeout of someone 
 else, or we create a new BlockingRpcChannelImplementation with a random 
 default timeout).
 Ideally, we could send the call timeout to the server as well: it would then be 
 able to dismiss on its own the calls that it received but that got stuck in the 
 request queue or in the internal retries (on hdfs for example).
 This will make the system more reactive to failure.
 I think we can solve this now, especially after HBASE-10525. The main issue is to 
 find something that fits well with protobuf...
 Then it should be easy to have a pool of threads for writers and readers, w/o 
 a single thread per region server as today. 
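 A minimal sketch of the "cutoff time" idea discussed here: carry an absolute deadline
 per call and recompute the remaining time at each retry, instead of mutating a shared
 rpc timeout through thread-local storage. The names are illustrative assumptions, not
 the HBase client API:
{code}
import java.io.IOException;

public class DeadlineExample {

  /** Remaining milliseconds before the call's cutoff time, or an exception if it already passed. */
  static int remainingMs(long deadlineMs) throws IOException {
    long remaining = deadlineMs - System.currentTimeMillis();
    if (remaining <= 0) {
      throw new IOException("operation deadline exceeded");
    }
    return (int) Math.min(remaining, Integer.MAX_VALUE);
  }

  /** Each retry recomputes what is left instead of reusing the fixed total value. */
  static void callWithRetries(long totalTimeoutMs) throws IOException, InterruptedException {
    long deadline = System.currentTimeMillis() + totalTimeoutMs;
    while (true) {
      int callTimeout = remainingMs(deadline);                // e.g. 58s left after a 2s failed attempt
      try {
        doRpc(callTimeout);
        return;
      } catch (IOException e) {
        Thread.sleep(Math.min(1000, remainingMs(deadline)));  // back off, still bounded by the deadline
      }
    }
  }

  private static void doRpc(int timeoutMs) throws IOException {
    // placeholder for the actual RPC; would use timeoutMs as the per-call limit
  }

  public static void main(String[] args) throws Exception {
    callWithRetries(60000);
  }
}
{code}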



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HBASE-10579) [Documentation]: ExportSnapshot tool package incorrectly documented

2014-02-24 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi resolved HBASE-10579.
-

   Resolution: Fixed
Fix Version/s: 0.99.0

committed, thanks for the patch

 [Documentation]: ExportSnapshot tool package incorrectly documented
 ---

 Key: HBASE-10579
 URL: https://issues.apache.org/jira/browse/HBASE-10579
 Project: HBase
  Issue Type: Bug
  Components: documentation, snapshots
Affects Versions: 0.98.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0

 Attachments: HBASE-10579-v0.patch


 Documentation Page: http://hbase.apache.org/book/ops.snapshots.html
 Expected documentation:
 The class should be specified as 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot
 Current documentation:
 Specified as: org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot
 This makes sense because the class is located in the 
 org.apache.hadoop.hbase.snapshot package:
 https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#19



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10598) Written data can not be read out because MemStore#timeRangeTracker might be updated concurrently

2014-02-24 Thread cuijianwei (JIRA)
cuijianwei created HBASE-10598:
--

 Summary: Written data can not be read out because 
MemStore#timeRangeTracker might be updated concurrently
 Key: HBASE-10598
 URL: https://issues.apache.org/jira/browse/HBASE-10598
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.16
Reporter: cuijianwei


In our test environment, we found that written data can't be read out 
occasionally. After debugging, we found that the maximumTimestamp/minimumTimestamp 
of MemStore#timeRangeTracker might decrease/increase when 
MemStore#timeRangeTracker is updated concurrently, which might cause the 
MemStore/StoreFile to be filtered incorrectly when reading data out. Let's see 
how the concurrent updating of timeRangeTracker#maximumTimestamp causes this 
problem. 
Imagine there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
kv1 and kv2 belong to the same Store (the same region), but contain different 
rowkeys. Consequently, kv1 and kv2 could be updated concurrently. Looking at 
the implementation of HRegionServer#multi, kv1 and kv2 will be added to the MemStore 
by HRegion#doMiniBatchMutation#applyFamilyMapToMemstore. Then, 
MemStore#internalAdd will be invoked and MemStore#timeRangeTracker will be 
updated by TimeRangeTracker#includeTimestamp as follows:
{code}
  private void includeTimestamp(final long timestamp) {
    ...
    else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
    return;
  }
{code}
Imagine the current maximumTimestamp is t0 before includeTimestamp is invoked, 
kv1.timestamp=t1, kv2.timestamp=t2, t1 and t2 are both set by the user (so the user 
knows the timestamps of kv1 and kv2), and t1 > t2. T1 and T2 are executed 
concurrently; therefore, the two threads might both find the current 
maximumTimestamp is less than the timestamp of their kv. After that, T1 and T2 
will both set maximumTimestamp to the timestamp of their kv. If T1 sets 
maximumTimestamp before T2 does, the maximumTimestamp will be set to t2. 
Then, before any new update with a bigger timestamp has been applied to the 
MemStore, if we try to read out kv1 by HTable#get and set the timestamp of the 
'Get' to t1, the StoreScanner will decide whether the MemStoreScanner (imagining 
kv1 has not been flushed) should be selected as a candidate scanner by the method 
MemStoreScanner#shouldUseScanner. The MemStore won't be selected because the 
maximumTimestamp of the MemStore has been set to t2 (t2 < t1). Consequently, 
the written kv1 can't be read out and kv1 is lost from the user's perspective.
If the above analysis is right, after the maximumTimestamp of 
MemStore#timeRangeTracker has been set to t2, the user will experience data loss in 
the following situations:
1. Before any new write with kv.timestamp >= t1 has been added to the MemStore, 
a read request of kv1 with timestamp=t1 can not read kv1 out.
2. Before any new put with kv.timestamp >= t1 has been added to the MemStore, if a 
flush happens, the data of the MemStore will be flushed to a StoreFile with 
StoreFile#maximumTimestamp set to t2. After that, any read request with 
timestamp > t2 can not read kv1 before the next compaction (the content of the StoreFile 
won't change, and kv1.timestamp might also not be included even after 
compaction).
The second situation is much more serious because the incorrect timeRange of the 
MemStore has been persisted to the file. Similarly, the concurrent update 
of TimeRangeTracker#minimumTimestamp may also cause this problem.
As a simple way to fix the problem, we could add synchronized to 
TimeRangeTracker#includeTimestamp so that this method won't be invoked 
concurrently.
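A minimal sketch of the proposed fix, i.e. making the include method synchronized so
concurrent writers cannot interleave between the comparison and the assignment. This is
a simplified stand-alone version for illustration, not the actual TimeRangeTracker class:
{code}
public class SimpleTimeRangeTracker {
  private long minimumTimestamp = Long.MAX_VALUE;
  private long maximumTimestamp = Long.MIN_VALUE;

  // synchronized makes the check-then-set atomic, so a smaller timestamp written
  // last can no longer overwrite a larger maximumTimestamp (the race described above).
  synchronized void includeTimestamp(final long timestamp) {
    if (minimumTimestamp > timestamp) {
      minimumTimestamp = timestamp;
    }
    if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
  }

  synchronized boolean includesTimeRange(long min, long max) {
    return minimumTimestamp <= max && maximumTimestamp >= min;
  }
}
{code}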



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10598) Written data can not be read out because MemStore#timeRangeTracker might be updated concurrently

2014-02-24 Thread cuijianwei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cuijianwei updated HBASE-10598:
---

Description: 
In our test environment, we find written data can't be read out occasionally. 
After debugging, we find that the maximumTimestamp/minimumTimestamp of 
MemStore#timeRangeTracker might decrease/increase when 
MemStore#timeRangeTracker is updated concurrently, which might cause the 
MemStore/StoreFile to be filtered incorrectly when reading data out. Let's see 
how the concurrent updating of timeRangeTracker#maximumTimestamp causes this 
problem. 
Imagine there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
kv1 and kv2 belong to the same Store (so belong to the same region), but contain 
different rowkeys. Consequently, kv1 and kv2 could be updated concurrently. 
Looking at the implementation of HRegionServer#multi, kv1 and kv2 will be added 
to the MemStore by HRegion#doMiniBatchMutation#applyFamilyMapToMemstore. Then, 
MemStore#internalAdd will be invoked and MemStore#timeRangeTracker will be 
updated by TimeRangeTracker#includeTimestamp as follows:
{code}
  private void includeTimestamp(final long timestamp) {
    ...
    else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
    return;
  }
{code}
Imagine the current maximumTimestamp of TimeRangeTracker is t0 before 
includeTimestamp is invoked, kv1.timestamp=t1, kv2.timestamp=t2, t1 and t2 are 
both set by the user (so the user knows the timestamps of kv1 and kv2), and t1 > t2 > 
t0. T1 and T2 are executed concurrently; therefore, the two threads might 
both find the current maximumTimestamp is less than the timestamp of their kv. 
After that, T1 and T2 will both set maximumTimestamp to the timestamp of their kv. If 
T1 sets maximumTimestamp before T2 does, the maximumTimestamp will be set 
to t2. Then, before any new update with a bigger timestamp has been applied to 
the MemStore, if we try to read out kv1 by HTable#get and set the timestamp of the 
'Get' to t1, the StoreScanner will decide whether the MemStoreScanner (imagining 
kv1 has not been flushed) should be selected as a candidate scanner by 
MemStoreScanner#shouldUseScanner. Then, the MemStore won't be selected because the 
maximumTimestamp of the MemStore has been set to t2 (t2 < t1). Consequently, 
the written kv1 can't be read out and kv1 is lost from the user's perspective.
If the above analysis is right, after the maximumTimestamp of 
MemStore#timeRangeTracker has been set to t2, the user will experience data loss in 
the following situations:
1. Before any new write with kv.timestamp >= t1 has been added to the MemStore, 
a read request of kv1 with timestamp=t1 can not read kv1 out.
2. Before any new put with kv.timestamp >= t1 has been added to the MemStore, if a 
flush happens, the data of the MemStore will be flushed to a StoreFile with 
StoreFile#maximumTimestamp set to t2. After that, any read request with 
timestamp > t2 can not read kv1 before the next compaction (the content of the StoreFile 
won't change, and kv1.timestamp might also not be included even after 
compaction).
The second situation is much more serious because the incorrect timeRange of the 
MemStore has been persisted to the file. Similarly, the concurrent update 
of TimeRangeTracker#minimumTimestamp may also cause this problem.
As a simple way to fix the problem, we could add synchronized to 
TimeRangeTracker#includeTimestamp so that this method won't be invoked 
concurrently.

  was:
In our test environment, we found that written data can't be read out 
occasionally. After debugging, we found that the maximumTimestamp/minimumTimestamp 
of MemStore#timeRangeTracker might decrease/increase when 
MemStore#timeRangeTracker is updated concurrently, which might cause the 
MemStore/StoreFile to be filtered incorrectly when reading data out. Let's see 
how the concurrent updating of timeRangeTracker#maximumTimestamp causes this 
problem. 
Imagine there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
kv1 and kv2 belong to the same Store (the same region), but contain different 
rowkeys. Consequently, kv1 and kv2 could be updated concurrently. Looking at 
the implementation of HRegionServer#multi, kv1 and kv2 will be added to the MemStore 
by HRegion#doMiniBatchMutation#applyFamilyMapToMemstore. Then, 
MemStore#internalAdd will be invoked and MemStore#timeRangeTracker will be 
updated by TimeRangeTracker#includeTimestamp as follows:
{code}
  private void includeTimestamp(final long timestamp) {
    ...
    else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
    return;
  }
{code}
Imagine the current maximumTimestamp is t0 before includeTimestamp is invoked, 
kv1.timestamp=t1, kv2.timestamp=t2, t1 and t2 are both set by the user (so the user 
knows the timestamps of kv1 and kv2), and t1 > t2. T1 and T2 are executed 
concurrently; therefore, the two threads might both find the current 
maximumTimestamp is less 

[jira] [Updated] (HBASE-10598) Written data can not be read out because MemStore#timeRangeTracker might be updated concurrently

2014-02-24 Thread cuijianwei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cuijianwei updated HBASE-10598:
---

Description: 
In our test environment, we find written data can't be read out occasionally. 
After debugging, we find that the maximumTimestamp/minimumTimestamp of 
MemStore#timeRangeTracker might decrease/increase when 
MemStore#timeRangeTracker is updated concurrently, which might cause the 
MemStore/StoreFile to be filtered incorrectly when reading data out. Let's see 
how the concurrent updating of timeRangeTracker#maximumTimestamp causes this 
problem. 
Imagine there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
kv1 and kv2 belong to the same Store (so belong to the same region), but contain 
different rowkeys. Consequently, kv1 and kv2 could be updated concurrently. 
Looking at the implementation of HRegionServer#multi, kv1 and kv2 will be added 
to the MemStore by HRegion#doMiniBatchMutation#applyFamilyMapToMemstore. Then, 
MemStore#internalAdd will be invoked and MemStore#timeRangeTracker will be 
updated by TimeRangeTracker#includeTimestamp as follows:
{code}
  private void includeTimestamp(final long timestamp) {
    ...
    else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
    return;
  }
{code}
Imagine the current maximumTimestamp of TimeRangeTracker is t0 before 
includeTimestamp is invoked, kv1.timestamp=t1, kv2.timestamp=t2, t1 and t2 are 
both set by the user (so the user knows the timestamps of kv1 and kv2), and t1 > t2 > 
t0. T1 and T2 are executed concurrently; therefore, the two threads might 
both find the current maximumTimestamp is less than the timestamp of their kv. 
After that, T1 and T2 will both set maximumTimestamp to the timestamp of their kv. If 
T1 sets maximumTimestamp before T2 does, the maximumTimestamp will be set 
to t2. Then, before any new update with a bigger timestamp has been applied to 
the MemStore, if we try to read out kv1 by HTable#get and set the timestamp of the 
'Get' to t1, the StoreScanner will decide whether the MemStoreScanner (imagining 
kv1 has not been flushed) should be selected as a candidate scanner by 
MemStoreScanner#shouldUseScanner. Then, the MemStore won't be selected because the 
maximumTimestamp of the MemStore has been set to t2 (t2 < t1). Consequently, 
the written kv1 can't be read out and kv1 is lost from the user's perspective.
If the above analysis is right, after the maximumTimestamp of 
MemStore#timeRangeTracker has been set to t2, the user will experience data loss in 
the following situations:
1. Before any new write with kv.timestamp >= t1 has been added to the MemStore, 
a read request of kv1 with timestamp=t1 can not read kv1 out.
2. Before any new write with kv.timestamp >= t1 has been added to the MemStore, if 
a flush happens, the data of the MemStore will be flushed to a StoreFile with 
StoreFile#maximumTimestamp set to t2. After that, any read request with 
timestamp=t1 can not read kv1 before the next compaction (kv1.timestamp might also 
not be included in the timeRange of the StoreFile even after compaction).
The second situation is much more serious because the incorrect timeRange of the 
MemStore has been persisted to the file. 
Similarly, the concurrent update of TimeRangeTracker#minimumTimestamp may also 
cause this problem.
As a simple way to fix the problem, we could add synchronized to 
TimeRangeTracker#includeTimestamp so that this method won't be invoked 
concurrently.

  was:
In our test environment, we find written data can't be read out occasionally. 
After debugging, we find that the maximumTimestamp/minimumTimestamp of 
MemStore#timeRangeTracker might decrease/increase when 
MemStore#timeRangeTracker is updated concurrently, which might cause the 
MemStore/StoreFile to be filtered incorrectly when reading data out. Let's see 
how the concurrent updating of timeRangeTracker#maximumTimestamp causes this 
problem. 
Imagine there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
kv1 and kv2 belong to the same Store (so belong to the same region), but contain 
different rowkeys. Consequently, kv1 and kv2 could be updated concurrently. 
Looking at the implementation of HRegionServer#multi, kv1 and kv2 will be added 
to the MemStore by HRegion#doMiniBatchMutation#applyFamilyMapToMemstore. Then, 
MemStore#internalAdd will be invoked and MemStore#timeRangeTracker will be 
updated by TimeRangeTracker#includeTimestamp as follows:
{code}
  private void includeTimestamp(final long timestamp) {
    ...
    else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
    return;
  }
{code}
Imagine the current maximumTimestamp of TimeRangeTracker is t0 before 
includeTimestamp is invoked, kv1.timestamp=t1, kv2.timestamp=t2, t1 and t2 are 
both set by the user (so the user knows the timestamps of kv1 and kv2), and t1 > t2 > 
t0. T1 and T2 are executed concurrently; therefore, the two threads might 
both find the current 

[jira] [Updated] (HBASE-10598) Written data can not be read out because MemStore#timeRangeTracker might be updated concurrently

2014-02-24 Thread cuijianwei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cuijianwei updated HBASE-10598:
---

Description: 
In our test environment, we find written data can't be read out occasionally. 
After debugging, we find that the maximumTimestamp/minimumTimestamp of 
MemStore#timeRangeTracker might decrease/increase when 
MemStore#timeRangeTracker is updated concurrently, which might cause the 
MemStore/StoreFile to be filtered incorrectly when reading data out. Let's see 
how the concurrent updating of timeRangeTracker#maximumTimestamp causes this 
problem. 
Imagine there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
kv1 and kv2 belong to the same Store (so belong to the same region), but contain 
different rowkeys. Consequently, kv1 and kv2 could be updated concurrently. 
Looking at the implementation of HRegionServer#multi, kv1 and kv2 will be added 
to the MemStore by HRegion#applyFamilyMapToMemstore in HRegion#doMiniBatchMutation. 
Then, MemStore#internalAdd will be invoked and MemStore#timeRangeTracker will 
be updated by TimeRangeTracker#includeTimestamp as follows:
{code}
  private void includeTimestamp(final long timestamp) {
    ...
    else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
    return;
  }
{code}
Imagine the current maximumTimestamp of TimeRangeTracker is t0 before 
includeTimestamp(...) is invoked, kv1.timestamp=t1, kv2.timestamp=t2, t1 and t2 
are both set by the user (so the user knows the timestamps of kv1 and kv2), and t1 > 
t2 > t0. T1 and T2 are executed concurrently; therefore, the two threads 
might both find the current maximumTimestamp is less than the timestamp of their 
kv. After that, T1 and T2 will both set maximumTimestamp to the timestamp of their 
kv. If T1 sets maximumTimestamp before T2 does, the maximumTimestamp will 
be set to t2. Then, before any new update with a bigger timestamp has been 
applied to the MemStore, if we try to read out kv1 by HTable#get and set the 
timestamp of the 'Get' to t1, the StoreScanner will decide whether the 
MemStoreScanner (imagining kv1 has not been flushed) should be selected as a 
candidate scanner by MemStoreScanner#shouldUseScanner. Then, the MemStore won't 
be selected because the maximumTimestamp of the MemStore has been set to t2 (t2 < 
t1). Consequently, the written kv1 can't be read out and kv1 is lost from 
the user's perspective.
If the above analysis is right, after the maximumTimestamp of 
MemStore#timeRangeTracker has been set to t2, the user will experience data loss in 
the following situations:
1. Before any new write with kv.timestamp >= t1 has been added to the MemStore, 
a read request of kv1 with timestamp=t1 can not read kv1 out.
2. Before any new write with kv.timestamp >= t1 has been added to the MemStore, if 
a flush happens, the data of the MemStore will be flushed to a StoreFile with 
StoreFile#maximumTimestamp set to t2. After that, any read request with 
timestamp=t1 can not read kv1 before the next compaction (kv1.timestamp might also 
not be included in the timeRange of the StoreFile even after compaction).
The second situation is much more serious because the incorrect timeRange of the 
MemStore has been persisted to the file. 
Similarly, the concurrent update of TimeRangeTracker#minimumTimestamp may also 
cause this problem.
As a simple way to fix the problem, we could add synchronized to 
TimeRangeTracker#includeTimestamp so that this method won't be invoked 
concurrently.

  was:
In our test environment, we find written data can't be read out occasionally. 
After debugging, we find that the maximumTimestamp/minimumTimestamp of 
MemStore#timeRangeTracker might decrease/increase when 
MemStore#timeRangeTracker is updated concurrently, which might cause the 
MemStore/StoreFile to be filtered incorrectly when reading data out. Let's see 
how the concurrent updating of timeRangeTracker#maximumTimestamp causes this 
problem. 
Imagine there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
kv1 and kv2 belong to the same Store (so belong to the same region), but contain 
different rowkeys. Consequently, kv1 and kv2 could be updated concurrently. 
Looking at the implementation of HRegionServer#multi, kv1 and kv2 will be added 
to the MemStore by HRegion#applyFamilyMapToMemstore in HRegion#doMiniBatchMutation. 
Then, MemStore#internalAdd will be invoked and MemStore#timeRangeTracker will 
be updated by TimeRangeTracker#includeTimestamp as follows:
{code}
  private void includeTimestamp(final long timestamp) {
    ...
    else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
    return;
  }
{code}
Imagine the current maximumTimestamp of TimeRangeTracker is t0 before 
includeTimestamp is invoked, kv1.timestamp=t1, kv2.timestamp=t2, t1 and t2 are 
both set by the user (so the user knows the timestamps of kv1 and kv2), and t1 > t2 > 
t0. T1 and T2 are executed concurrently; therefore, the two threads might 
both 

[jira] [Updated] (HBASE-10598) Written data can not be read out because MemStore#timeRangeTracker might be updated concurrently

2014-02-24 Thread cuijianwei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cuijianwei updated HBASE-10598:
---

Description: 
In our test environment, we find written data can't be read out occasionally. 
After debugging, we find that the maximumTimestamp/minimumTimestamp of 
MemStore#timeRangeTracker might decrease/increase when 
MemStore#timeRangeTracker is updated concurrently, which might cause the 
MemStore/StoreFile to be filtered incorrectly when reading data out. Let's see 
how the concurrent updating of timeRangeTracker#maximumTimestamp causes this 
problem. 
Imagine there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
kv1 and kv2 belong to the same Store (so belong to the same region), but contain 
different rowkeys. Consequently, kv1 and kv2 could be updated concurrently. 
Looking at the implementation of HRegionServer#multi, kv1 and kv2 will be added 
to the MemStore by HRegion#applyFamilyMapToMemstore in HRegion#doMiniBatchMutation. 
Then, MemStore#internalAdd will be invoked and MemStore#timeRangeTracker will 
be updated by TimeRangeTracker#includeTimestamp as follows:
{code}
  private void includeTimestamp(final long timestamp) {
    ...
    else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
    return;
  }
{code}
Imagine the current maximumTimestamp of TimeRangeTracker is t0 before 
includeTimestamp is invoked, kv1.timestamp=t1, kv2.timestamp=t2, t1 and t2 are 
both set by the user (so the user knows the timestamps of kv1 and kv2), and t1 > t2 > 
t0. T1 and T2 are executed concurrently; therefore, the two threads might 
both find the current maximumTimestamp is less than the timestamp of their kv. 
After that, T1 and T2 will both set maximumTimestamp to the timestamp of their kv. If 
T1 sets maximumTimestamp before T2 does, the maximumTimestamp will be set 
to t2. Then, before any new update with a bigger timestamp has been applied to 
the MemStore, if we try to read out kv1 by HTable#get and set the timestamp of the 
'Get' to t1, the StoreScanner will decide whether the MemStoreScanner (imagining 
kv1 has not been flushed) should be selected as a candidate scanner by 
MemStoreScanner#shouldUseScanner. Then, the MemStore won't be selected because the 
maximumTimestamp of the MemStore has been set to t2 (t2 < t1). Consequently, 
the written kv1 can't be read out and kv1 is lost from the user's perspective.
If the above analysis is right, after the maximumTimestamp of 
MemStore#timeRangeTracker has been set to t2, the user will experience data loss in 
the following situations:
1. Before any new write with kv.timestamp >= t1 has been added to the MemStore, 
a read request of kv1 with timestamp=t1 can not read kv1 out.
2. Before any new write with kv.timestamp >= t1 has been added to the MemStore, if 
a flush happens, the data of the MemStore will be flushed to a StoreFile with 
StoreFile#maximumTimestamp set to t2. After that, any read request with 
timestamp=t1 can not read kv1 before the next compaction (kv1.timestamp might also 
not be included in the timeRange of the StoreFile even after compaction).
The second situation is much more serious because the incorrect timeRange of the 
MemStore has been persisted to the file. 
Similarly, the concurrent update of TimeRangeTracker#minimumTimestamp may also 
cause this problem.
As a simple way to fix the problem, we could add synchronized to 
TimeRangeTracker#includeTimestamp so that this method won't be invoked 
concurrently.

  was:
In our test environment, we find written data can't be read out occasionally. 
After debugging, we find that the maximumTimestamp/minimumTimestamp of 
MemStore#timeRangeTracker might decrease/increase when 
MemStore#timeRangeTracker is updated concurrently, which might cause the 
MemStore/StoreFile to be filtered incorrectly when reading data out. Let's see 
how the concurrent updating of timeRangeTracker#maximumTimestamp causes this 
problem. 
Imagine there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
kv1 and kv2 belong to the same Store (so belong to the same region), but contain 
different rowkeys. Consequently, kv1 and kv2 could be updated concurrently. 
Looking at the implementation of HRegionServer#multi, kv1 and kv2 will be added 
to the MemStore by HRegion#doMiniBatchMutation#applyFamilyMapToMemstore. Then, 
MemStore#internalAdd will be invoked and MemStore#timeRangeTracker will be 
updated by TimeRangeTracker#includeTimestamp as follows:
{code}
  private void includeTimestamp(final long timestamp) {
    ...
    else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
    return;
  }
{code}
Imagine the current maximumTimestamp of TimeRangeTracker is t0 before 
includeTimestamp is invoked, kv1.timestamp=t1, kv2.timestamp=t2, t1 and t2 are 
both set by the user (so the user knows the timestamps of kv1 and kv2), and t1 > t2 > 
t0. T1 and T2 are executed concurrently; therefore, the two threads might 
both find the current 

[jira] [Updated] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt

2014-02-24 Thread Nicolas Liochon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Liochon updated HBASE-10525:


  Resolution: Fixed
Release Note: If hbase.ipc.client.allowsInterrupt is set to true (the default 
being false), the writes are performed in a different thread. This works around 
a Java limitation with interruptions and I/O, and limits the impact of 
interrupting a client call. It's strongly recommended to activate this 
parameter when using tables with multiple replicas.
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to trunk, thanks for the reviews, all. (Devaraj, I understood your 
comment as 'ok if', I can obviously revert / amend if you want more time for 
the review.) Same goes for everyone who wants to chime in: this code is 
obviously critical & complex.
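A minimal sketch of turning the new option on from client code, per the release note
above; the property name comes from the note, and everything else is just standard
Hadoop/HBase configuration usage shown for illustration:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class EnableInterruptibleWrites {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Off by default; recommended when using tables with multiple replicas.
    conf.setBoolean("hbase.ipc.client.allowsInterrupt", true);
    // Pass 'conf' to the connection / HTable as usual.
    System.out.println(conf.getBoolean("hbase.ipc.client.allowsInterrupt", false));
  }
}
{code}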

 Allow the client to use a different thread for writing to ease interrupt
 

 Key: HBASE-10525
 URL: https://issues.apache.org/jira/browse/HBASE-10525
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.99.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10525.v1.patch, 10525.v2.patch, 10525.v3.patch, 
 10525.v4.patch, 10525.v5.patch, 10525.v6.patch, 10525.v7.patch, 
 HBaseclient-EventualConsistency.pdf


 This is an issue in the HBASE-10070 context, but also more generally if 
 you want to interrupt an operation at a limited cost. 
 I will attach a doc with a more detailed explanation.
 This adds a thread per region server, so it's optional. The first patch 
 activates it by default to see how it behaves on a full hadoop-qa run. The 
 target is for it to be unset by default.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10598) Written data can not be read out because MemStore#timeRangeTracker might be updated concurrently

2014-02-24 Thread cuijianwei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cuijianwei updated HBASE-10598:
---

Description: 
In our test environment, we find written data can't be read out occasionally. 
After debugging, we find that the maximumTimestamp/minimumTimestamp of 
MemStore#timeRangeTracker might decrease/increase when 
MemStore#timeRangeTracker is updated concurrently, which might cause the 
MemStore/StoreFile to be filtered incorrectly when reading data out. Let's see 
how the concurrent updating of timeRangeTracker#maximumTimestamp causes this 
problem. 
Imagine there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
kv1 and kv2 belong to the same Store (so belong to the same region), but contain 
different rowkeys. Consequently, kv1 and kv2 could be updated concurrently. 
Looking at the implementation of HRegionServer#multi, kv1 and kv2 will be added 
to the MemStore by HRegion#applyFamilyMapToMemstore in HRegion#doMiniBatchMutation. 
Then, MemStore#internalAdd will be invoked and MemStore#timeRangeTracker will 
be updated by TimeRangeTracker#includeTimestamp as follows:
{code}
  private void includeTimestamp(final long timestamp) {
    ...
    else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
    return;
  }
{code}
Imagine the current maximumTimestamp of TimeRangeTracker is t0 before 
includeTimestamp(...) is invoked, kv1.timestamp=t1, kv2.timestamp=t2, t1 and t2 
are both set by the user (so the user knows the timestamps of kv1 and kv2), and t1 > 
t2 > t0. T1 and T2 are executed concurrently; therefore, the two threads 
might both find the current maximumTimestamp is less than the timestamp of their 
kv. After that, T1 and T2 will both set maximumTimestamp to the timestamp of their 
kv. If T1 sets maximumTimestamp before T2 does, the maximumTimestamp will 
be set to t2. Then, before any new update with a bigger timestamp has been 
applied to the MemStore, if we try to read out kv1 by HTable#get and set the 
timestamp of the 'Get' to t1, the StoreScanner will decide whether the 
MemStoreScanner (imagining kv1 has not been flushed) should be selected as a 
candidate scanner by MemStoreScanner#shouldUseScanner. Then, the MemStore won't 
be selected in MemStoreScanner#shouldUseScanner because the maximumTimestamp of the 
MemStore has been set to t2 (t2 < t1). Consequently, the written kv1 can't be 
read out and kv1 is lost from the user's perspective.
If the above analysis is right, after the maximumTimestamp of 
MemStore#timeRangeTracker has been set to t2, the user will experience data loss in 
the following situations:
1. Before any new write with kv.timestamp >= t1 has been added to the MemStore, 
a read request of kv1 with timestamp=t1 can not read kv1 out.
2. Before any new write with kv.timestamp >= t1 has been added to the MemStore, if 
a flush happens, the data of the MemStore will be flushed to a StoreFile with 
StoreFile#maximumTimestamp set to t2. After that, any read request with 
timestamp=t1 can not read kv1 before the next compaction (kv1.timestamp might also 
not be included in the timeRange of the StoreFile even after compaction).
The second situation is much more serious because the incorrect timeRange of the 
MemStore has been persisted to the file. 
Similarly, the concurrent update of TimeRangeTracker#minimumTimestamp may also 
cause this problem.
As a simple way to fix the problem, we could add synchronized to 
TimeRangeTracker#includeTimestamp so that this method won't be invoked 
concurrently.

  was:
In our test environment, we find written data can't be read out occasionally. 
After debugging, we find that maximumTimestamp/minimumTimestamp of 
MemStore#timeRangeTracker might decrease/increase when 
MemStore#timeRangeTracker is updated concurrently, which might make the 
MemStore/StoreFile to be filtered incorrectly when reading data out. Let's see 
how the concurrent updating of timeRangeTracker#maximumTimestamp cause this 
problem. 
Imagining there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
kv1 and kv2 belong to the same Store(so belong to the same region), but contain 
different rowkeys. Consequently, kv1 and kv2 could be updated concurrently. 
When we see the implementation of HRegionServer#multi, kv1 and kv2 will be add 
to MemStore by HRegion#applyFamilyMapToMemstore in HRegion#doMiniBatchMutation. 
Then, MemStore#internalAdd will be invoked and MemStore#timeRangeTracker will 
be updated by TimeRangeTracker#includeTimestamp as follows:
{code}
  private void includeTimestamp(final long timestamp) {
    ...
    else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
    return;
  }
{code}
Imagining the current maximumTimestamp of TimeRangeTracker is t0 before 
includeTimestamp(...) invoked, kv1.timestamp=t1,  kv2.timestamp=t2, t1 and t2 
are both set by user(then, user knows the timestamps of kv1 and kv2), and t1 > 
t2 > t0. T1 and T2 will be executed concurrently, 

[jira] [Commented] (HBASE-10451) Enable back Tag compression on HFiles

2014-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910166#comment-13910166
 ] 

Hadoop QA commented on HBASE-10451:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12630629/HBASE-10451_V6.patch
  against trunk revision .
  ATTACHMENT ID: 12630629

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8782//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8782//console

This message is automatically generated.

 Enable back Tag compression on HFiles
 -

 Key: HBASE-10451
 URL: https://issues.apache.org/jira/browse/HBASE-10451
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Critical
 Fix For: 0.98.1, 0.99.0

 Attachments: HBASE-10451.patch, HBASE-10451_V2.patch, 
 HBASE-10451_V3.patch, HBASE-10451_V4.patch, HBASE-10451_V5.patch, 
 HBASE-10451_V6.patch


 HBASE-10443 disables tag compression on HFiles. This Jira is to fix the 
 issues we have found out in HBASE-10443 and enable it back.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10598) Written data can not be read out because MemStore#timeRangeTracker might be updated concurrently

2014-02-24 Thread cuijianwei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cuijianwei updated HBASE-10598:
---

Description: 
In our test environment, we find written data can't be read out occasionally. 
After debugging, we find that maximumTimestamp/minimumTimestamp of 
MemStore#timeRangeTracker might decrease/increase when 
MemStore#timeRangeTracker is updated concurrently, which might make the 
MemStore/StoreFile to be filtered incorrectly when reading data out. Let's see 
how the concurrent updating of timeRangeTracker#maximumTimestamp cause this 
problem. 
Imagining there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
kv1 and kv2 belong to the same Store(so belong to the same region), but contain 
different rowkeys. Consequently, kv1 and kv2 could be updated concurrently. 
Looking at the implementation of HRegionServer#multi, kv1 and kv2 will be added 
to MemStore by HRegion#applyFamilyMapToMemstore in HRegion#doMiniBatchMutation. 
Then, MemStore#internalAdd will be invoked and MemStore#timeRangeTracker will 
be updated by TimeRangeTracker#includeTimestamp as follows:
{code}
  private void includeTimestamp(final long timestamp) {
    ...
    else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
    return;
  }
{code}
Imagining the current maximumTimestamp of TimeRangeTracker is t0 before 
includeTimestamp(...) is invoked, kv1.timestamp=t1, kv2.timestamp=t2, t1 and t2 
are both set by the user (so the user knows the timestamps of kv1 and kv2), and 
t1 > t2 > t0. T1 and T2 will be executed concurrently, therefore the two threads 
might both find that the current maximumTimestamp is less than the timestamp of 
its kv. After that, T1 and T2 will both set maximumTimestamp to the timestamp of 
its kv. If T1 sets maximumTimestamp before T2 does, the maximumTimestamp will 
end up set to t2. Then, before any new update with a bigger timestamp has been 
applied to the MemStore, if we try to read out kv1 by HTable#get and set the 
timestamp of 'Get' to t1, the StoreScanner will decide whether the 
MemStoreScanner(imagining kv1 has not been flushed) should be selected as 
candidate scanner by MemStoreScanner#shouldUseScanner. Then, the MemStore won't 
be selected in MemStoreScanner#shouldUseScanner because maximumTimestamp of the 
MemStore has been set to t2 (t2 < t1). Consequently, the written kv1 can't be 
read out, and kv1 is lost from the user's perspective.
If the above analysis is right, after maximumTimestamp of 
MemStore#timeRangeTracker has been set to t2, the user will experience data loss 
in the following situations:
1. Before any new write with kv.timestamp > t1 has been added to the MemStore, a 
read request for kv1 with timestamp=t1 can not read kv1 out.
2. Before any new write with kv.timestamp > t1 has been added to the MemStore, if 
a flush happens, the data of the MemStore will be flushed to a StoreFile with 
StoreFile#maximumTimestamp set to t2. After that, any read request with 
timestamp=t1 can not read kv1 before the next compaction (actually, kv1.timestamp 
might not be included in the timeRange of the StoreFile even after compaction).
The second situation is much more serious because the incorrect timeRange of 
MemStore has been persisted to the file. 
Similarly, the concurrent update of TimeRangeTracker#minimumTimestamp may also 
cause this problem.
As a simple way to fix the problem, we could add synchronized to 
TimeRangeTracker#includeTimestamp so that this method won't be invoked 
concurrently.

  was:
In our test environment, we find written data can't be read out occasionally. 
After debugging, we find that maximumTimestamp/minimumTimestamp of 
MemStore#timeRangeTracker might decrease/increase when 
MemStore#timeRangeTracker is updated concurrently, which might make the 
MemStore/StoreFile to be filtered incorrectly when reading data out. Let's see 
how the concurrent updating of timeRangeTracker#maximumTimestamp cause this 
problem. 
Imagining there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
kv1 and kv2 belong to the same Store(so belong to the same region), but contain 
different rowkeys. Consequently, kv1 and kv2 could be updated concurrently. 
When we see the implementation of HRegionServer#multi, kv1 and kv2 will be add 
to MemStore by HRegion#applyFamilyMapToMemstore in HRegion#doMiniBatchMutation. 
Then, MemStore#internalAdd will be invoked and MemStore#timeRangeTracker will 
be updated by TimeRangeTracker#includeTimestamp as follows:
{code}
  private void includeTimestamp(final long timestamp) {
    ...
    else if (maximumTimestamp < timestamp) {
      maximumTimestamp = timestamp;
    }
    return;
  }
{code}
Imagining the current maximumTimestamp of TimeRangeTracker is t0 before 
includeTimestamp(...) invoked, kv1.timestamp=t1,  kv2.timestamp=t2, t1 and t2 
are both set by user(then, user knows the timestamps of kv1 and kv2), and t1  
t2  t0. T1 and T2 will be executed 

[jira] [Updated] (HBASE-10598) Written data can not be read out because MemStore#timeRangeTracker might be updated concurrently

2014-02-24 Thread cuijianwei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cuijianwei updated HBASE-10598:
---

Attachment: HBASE-10598-0.94.v1.patch

 Written data can not be read out because MemStore#timeRangeTracker might be 
 updated concurrently
 

 Key: HBASE-10598
 URL: https://issues.apache.org/jira/browse/HBASE-10598
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.16
Reporter: cuijianwei
 Attachments: HBASE-10598-0.94.v1.patch


 In our test environment, we find written data can't be read out occasionally. 
 After debugging, we find that maximumTimestamp/minimumTimestamp of 
 MemStore#timeRangeTracker might decrease/increase when 
 MemStore#timeRangeTracker is updated concurrently, which might make the 
 MemStore/StoreFile to be filtered incorrectly when reading data out. Let's 
 see how the concurrent updating of timeRangeTracker#maximumTimestamp cause 
 this problem. 
 Imagining there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
 kv1 and kv2 belong to the same Store(so belong to the same region), but 
 contain different rowkeys. Consequently, kv1 and kv2 could be updated 
 concurrently. When we see the implementation of HRegionServer#multi, kv1 and 
 kv2 will be added to MemStore by HRegion#applyFamilyMapToMemstore in 
 HRegion#doMiniBatchMutation. Then, MemStore#internalAdd will be invoked and 
 MemStore#timeRangeTracker will be updated by 
 TimeRangeTracker#includeTimestamp as follows:
 {code}
   private void includeTimestamp(final long timestamp) {
     ...
     else if (maximumTimestamp < timestamp) {
       maximumTimestamp = timestamp;
     }
     return;
   }
 {code}
 Imagining the current maximumTimestamp of TimeRangeTracker is t0 before 
 includeTimestamp(...) invoked, kv1.timestamp=t1,  kv2.timestamp=t2, t1 and t2 
 are both set by user(then, user knows the timestamps of kv1 and kv2), and t1 > 
 t2 > t0. T1 and T2 will be executed concurrently, therefore, the two 
 threads might both find the current maximumTimestamp is less than the 
 timestamp of its kv. After that, T1 and T2 will both set maximumTimestamp to 
 timestamp of its kv. If T1 set maximumTimestamp before T2 doing that, the 
 maximumTimestamp will be set to t2. Then, before any new update with bigger 
 timestamp has been applied to the MemStore, if we try to read out kv1 by 
 HTable#get and set the timestamp of 'Get' to t1, the StoreScanner will decide 
 whether the MemStoreScanner(imagining kv1 has not been flushed) should be 
 selected as candidate scanner by MemStoreScanner#shouldUseScanner. Then, the 
 MemStore won't be selected in MemStoreScanner#shouldUseScanner because 
 maximumTimestamp of the MemStore has been set to t2 (t2 < t1). Consequently, 
 the written kv1 can't be read out and kv1 is lost from user's perspective.
 If the above analysis is right, after maximumTimestamp of 
 MemStore#timeRangeTracker has been set to t2, the user will experience data loss 
 in the following situations:
 1. Before any new write with kv.timestamp > t1 has been added to the MemStore, a 
 read request for kv1 with timestamp=t1 can not read kv1 out.
 2. Before any new write with kv.timestamp > t1 has been added to the MemStore, 
 if a flush happened, the data of MemStore will be flushed to StoreFile with 
 StoreFile#maximumTimestamp set to t2. After that, any read request with 
 timestamp=t1 can not read kv1 before next compaction(Actually, kv1.timestamp 
 might not be included in timeRange of the StoreFile even after compaction).
 The second situation is much more serious because the incorrect timeRange of 
 MemStore has been persisted to the file. 
 Similarly, the concurrent update of TimeRangeTracker#minimumTimestamp may 
 also cause this problem.
 As a simple way to fix the problem, we could add synchronized to 
 TimeRangeTracker#includeTimestamp so that this method won't be invoked 
 concurrently.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10575) ReplicationSource thread can't be terminated if it runs into the loop to contact peer's zk ensemble and fails continuously

2014-02-24 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910170#comment-13910170
 ] 

Feng Honghua commented on HBASE-10575:
--

[~lhofhansl], thanks for the review! :-)

Can it be committed, or any further feedback? Thanks

 ReplicationSource thread can't be terminated if it runs into the loop to 
 contact peer's zk ensemble and fails continuously
 --

 Key: HBASE-10575
 URL: https://issues.apache.org/jira/browse/HBASE-10575
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.98.1, 0.99.0, 0.94.17
Reporter: Feng Honghua
Assignee: Feng Honghua
Priority: Critical
 Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.18

 Attachments: HBASE-10575-trunk_v1.patch


 When ReplicationSource thread runs into the loop to contact peer's zk 
 ensemble, it doesn't check isActive() before each retry, so if the given 
 peer's zk ensemble is not reachable due to some reason, this 
 ReplicationSource thread just can't be terminated by outside such as 
 removePeer etc.
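For illustration, a hedged sketch of the shape of change described above; the
class, method, and field names are made up for the sketch and are not the actual
ReplicationSource code:
{code}
// Illustrative sketch: re-check a stop flag before every retry so an external
// terminate() (e.g. triggered by removePeer) can break the connection loop.
public class PeerConnectLoop {
  private volatile boolean active = true;

  public void terminate() {
    this.active = false;
  }

  public void connectToPeerQuorum() throws InterruptedException {
    while (active) {                 // checked before each attempt, not just once
      try {
        contactPeerZkEnsemble();     // stand-in for the real ZK connection call
        return;
      } catch (Exception e) {
        Thread.sleep(1000);          // simplified retry backoff
      }
    }
    // active == false: the source was asked to stop, so give up retrying.
  }

  private void contactPeerZkEnsemble() throws Exception {
    throw new Exception("peer zk ensemble unreachable");  // stand-in failure
  }
}
{code}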



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10594) Speed up TestRestoreSnapshotFromClient a bit

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910185#comment-13910185
 ] 

Hudson commented on HBASE-10594:


ABORTED: Integrated in hbase-0.96 #309 (See 
[https://builds.apache.org/job/hbase-0.96/309/])
HBASE-10594 Speed up TestRestoreSnapshotFromClient a bit. (larsh: rev 1571143)
* 
/hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestRestoreSnapshotFromClient.java


 Speed up TestRestoreSnapshotFromClient a bit
 

 Key: HBASE-10594
 URL: https://issues.apache.org/jira/browse/HBASE-10594
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.18

 Attachments: 10594-0.94.txt, 10594-trunk.txt


 Looking through the longest running test in 0.94 I noticed that 
 TestRestoreSnapshotFromClient runs for over 10 minutes on the jenkins boxes 
 (264s on my local box).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10579) [Documentation]: ExportSnapshot tool package incorrectly documented

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910186#comment-13910186
 ] 

Hudson commented on HBASE-10579:


ABORTED: Integrated in HBase-0.98 #179 (See 
[https://builds.apache.org/job/HBase-0.98/179/])
HBASE-10579 ExportSnapshot tool package incorrectly documented (Aleksandr 
Shulman) (mbertozzi: rev 1571201)
* /hbase/branches/0.98/src/main/docbkx/ops_mgt.xml


 [Documentation]: ExportSnapshot tool package incorrectly documented
 ---

 Key: HBASE-10579
 URL: https://issues.apache.org/jira/browse/HBASE-10579
 Project: HBase
  Issue Type: Bug
  Components: documentation, snapshots
Affects Versions: 0.98.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0

 Attachments: HBASE-10579-v0.patch


 Documentation Page: http://hbase.apache.org/book/ops.snapshots.html
 Expected documentation:
 The class should be specified as 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot
 Current documentation:
 Specified as: org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot
 This makes sense because the class is located in the 
 org.apache.hadoop.hbase.snapshot package:
 https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#19



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10594) Speed up TestRestoreSnapshotFromClient a bit

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910189#comment-13910189
 ] 

Hudson commented on HBASE-10594:


ABORTED: Integrated in HBase-0.98 #179 (See 
[https://builds.apache.org/job/HBase-0.98/179/])
HBASE-10594 Speed up TestRestoreSnapshotFromClient a bit. (larsh: rev 1571142)
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestRestoreSnapshotFromClient.java


 Speed up TestRestoreSnapshotFromClient a bit
 

 Key: HBASE-10594
 URL: https://issues.apache.org/jira/browse/HBASE-10594
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.18

 Attachments: 10594-0.94.txt, 10594-trunk.txt


 Looking through the longest running test in 0.94 I noticed that 
 TestRestoreSnapshotFromClient runs for over 10 minutes on the jenkins boxes 
 (264s on my local box).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910208#comment-13910208
 ] 

Hudson commented on HBASE-10525:


FAILURE: Integrated in HBase-TRUNK #4947 (See 
[https://builds.apache.org/job/HBase-TRUNK/4947/])
HBASE-10525 Allow the client to use a different thread for writing to ease 
interrupt (nkeywal: rev 1571210)
* /hbase/trunk/dev-support/findbugs-exclude.xml
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionManager.java
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClient.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java


 Allow the client to use a different thread for writing to ease interrupt
 

 Key: HBASE-10525
 URL: https://issues.apache.org/jira/browse/HBASE-10525
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.99.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10525.v1.patch, 10525.v2.patch, 10525.v3.patch, 
 10525.v4.patch, 10525.v5.patch, 10525.v6.patch, 10525.v7.patch, 
 HBaseclient-EventualConsistency.pdf


  This is an issue in the HBASE-10070 context, but also more generally if 
  you want to interrupt an operation at a limited cost. 
  I will attach a doc with a more detailed explanation.
  This adds a thread per region server, so it's optional. The first patch 
  activates it by default to see how it behaves on a full hadoop-qa run. The 
  target is for it to be off by default.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10579) [Documentation]: ExportSnapshot tool package incorrectly documented

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910206#comment-13910206
 ] 

Hudson commented on HBASE-10579:


FAILURE: Integrated in HBase-TRUNK #4947 (See 
[https://builds.apache.org/job/HBase-TRUNK/4947/])
HBASE-10579 ExportSnapshot tool package incorrectly documented (Aleksandr 
Shulman) (mbertozzi: rev 1571200)
* /hbase/trunk/src/main/docbkx/ops_mgt.xml


 [Documentation]: ExportSnapshot tool package incorrectly documented
 ---

 Key: HBASE-10579
 URL: https://issues.apache.org/jira/browse/HBASE-10579
 Project: HBase
  Issue Type: Bug
  Components: documentation, snapshots
Affects Versions: 0.98.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0

 Attachments: HBASE-10579-v0.patch


 Documentation Page: http://hbase.apache.org/book/ops.snapshots.html
 Expected documentation:
 The class should be specified as 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot
 Current documentation:
 Specified as: org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot
 This makes sense because the class is located in the 
 org.apache.hadoop.hbase.snapshot package:
 https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#19



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10594) Speed up TestRestoreSnapshotFromClient a bit

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910207#comment-13910207
 ] 

Hudson commented on HBASE-10594:


FAILURE: Integrated in HBase-TRUNK #4947 (See 
[https://builds.apache.org/job/HBase-TRUNK/4947/])
HBASE-10594 Speed up TestRestoreSnapshotFromClient a bit. (larsh: rev 1571141)
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestRestoreSnapshotFromClient.java


 Speed up TestRestoreSnapshotFromClient a bit
 

 Key: HBASE-10594
 URL: https://issues.apache.org/jira/browse/HBASE-10594
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.18

 Attachments: 10594-0.94.txt, 10594-trunk.txt


 Looking through the longest running test in 0.94 I noticed that 
 TestRestoreSnapshotFromClient runs for over 10 minutes on the jenkins boxes 
 (264s on my local box).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10499) In write heavy scenario one of the regions does not get flushed causing RegionTooBusyException

2014-02-24 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910245#comment-13910245
 ] 

ramkrishna.s.vasudevan commented on HBASE-10499:


[~fenghh]
The number of flushers should be the default value.  I did not change that.  
Sorry for the late reply.
bq.But if you want to raise a JIRA to replace System.currentMillis with 
EnvironmentEdge.currentMillis
Yes, better to change.  I am not saying this JIRA is because of this, but I just 
wanted to ensure we change it.  Will raise one.

 In write heavy scenario one of the regions does not get flushed causing 
 RegionTooBusyException
 --

 Key: HBASE-10499
 URL: https://issues.apache.org/jira/browse/HBASE-10499
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.98.1, 0.99.0

 Attachments: HBASE-10499.patch, 
 hbase-root-master-ip-10-157-0-229.zip, 
 hbase-root-regionserver-ip-10-93-128-92.zip, master_4e39.log, 
 master_576f.log, rs_4e39.log, rs_576f.log, t1.dump, t2.dump, 
 workloada_0.98.dat


 I got this while testing 0.98RC.  But am not sure if it is specific to this 
 version.  Doesn't seem so to me.  
 Also it is something similar to HBASE-5312 and HBASE-5568.
 Using 10 threads I do writes to 4 RS using YCSB. The table created has 200 
 regions.  In one of the runs with a 0.98 server and a 0.98 client I faced this 
 problem: the hlogs kept growing and the system requested flushes for that 
 many regions.
 One by one everything was flushed except one, and that one region remained 
 unflushed.  The ripple effect of this on the client side:
 {code}
 com.yahoo.ycsb.DBException: 
 org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
 54 actions: RegionTooBusyException: 54 times,
 at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:245)
 at com.yahoo.ycsb.DBWrapper.cleanup(DBWrapper.java:73)
 at com.yahoo.ycsb.ClientThread.run(Client.java:307)
 Caused by: 
 org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
 54 actions: RegionTooBusyException: 54 times,
 at 
 org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:187)
 at 
 org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$500(AsyncProcess.java:171)
 at 
 org.apache.hadoop.hbase.client.AsyncProcess.getErrors(AsyncProcess.java:897)
 at 
 org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:961)
 at 
 org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1225)
 at com.yahoo.ycsb.db.HBaseClient.cleanup(HBaseClient.java:232)
 ... 2 more
 {code}
 On one of the RS
 {code}
 2014-02-11 08:45:58,714 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
 many hlogs: logs=38, maxlogs=32; forcing flush of 23 regions(s): 
 97d8ae2f78910cc5ded5fbb1ddad8492, d396b8a1da05c871edcb68a15608fdf2, 
 01a68742a1be3a9705d574ad68fec1d7, 1250381046301e7465b6cf398759378e, 
 127c133f47d0419bd5ab66675aff76d4, 9f01c5d25ddc6675f750968873721253, 
 29c055b5690839c2fa357cd8e871741e, ca4e33e3eb0d5f8314ff9a870fc43463, 
 acfc6ae756e193b58d956cb71ccf0aa3, 187ea304069bc2a3c825bc10a59c7e84, 
 0ea411edc32d5c924d04bf126fa52d1e, e2f9331fc7208b1b230a24045f3c869e, 
 d9309ca864055eddf766a330352efc7a, 1a71bdf457288d449050141b5ff00c69, 
 0ba9089db28e977f86a27f90bbab9717, fdbb3242d3b673bbe4790a47bc30576f, 
 bbadaa1f0e62d8a8650080b824187850, b1a5de30d8603bd5d9022e09c574501b, 
 cc6a9fabe44347ed65e7c325faa72030, 313b17dbff2497f5041b57fe13fa651e, 
 6b788c498503ddd3e1433a4cd3fb4e39, 3d71274fe4f815882e9626e1cfa050d1, 
 acc43e4b42c1a041078774f4f20a3ff5
 ..
 2014-02-11 08:47:49,580 INFO  [regionserver60020.logRoller] wal.FSHLog: Too 
 many hlogs: logs=53, maxlogs=32; forcing flush of 2 regions(s): 
 fdbb3242d3b673bbe4790a47bc30576f, 6b788c498503ddd3e1433a4cd3fb4e39
 {code}
 {code}
 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
 regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
 flush for region 
 usertable,user3654,1392107806977.fdbb3242d3b673bbe4790a47bc30576f. after a 
 delay of 16689
 2014-02-11 09:42:44,237 INFO  [regionserver60020.periodicFlusher] 
 regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
 flush for region 
 usertable,user6264,1392107806983.6b788c498503ddd3e1433a4cd3fb4e39. after a 
 delay of 15868
 2014-02-11 09:42:54,238 INFO  [regionserver60020.periodicFlusher] 
 regionserver.HRegionServer: regionserver60020.periodicFlusher requesting 
 flush for region 
 

[jira] [Created] (HBASE-10599) Replace System.currentMillis() with EnvironmentEdge.currentTimeMillis in memstore flusher and related places

2014-02-24 Thread ramkrishna.s.vasudevan (JIRA)
ramkrishna.s.vasudevan created HBASE-10599:
--

 Summary: Replace System.currentMillis() with 
EnvironmentEdge.currentTimeMillis in memstore flusher and related places
 Key: HBASE-10599
 URL: https://issues.apache.org/jira/browse/HBASE-10599
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.1.1, 0.98.0, 0.99.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.96.2, 0.98.1, 0.99.0


MemStoreFlusher still uses System.currentTimeMillis().  Better to replace it with 
EnvironmentEdge.currentTimeMillis().
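The usual motivation is testability: callers that go through an edge abstraction
can have time injected in tests. A minimal sketch of that pattern follows, with
simplified names rather than the real org.apache.hadoop.hbase.util classes:
{code}
// Simplified illustration of the injectable-clock idea; not the actual
// EnvironmentEdge/EnvironmentEdgeManager implementation.
interface Clock {
  long currentTimeMillis();
}

class ClockManager {
  private static volatile Clock clock = new Clock() {
    public long currentTimeMillis() {
      return System.currentTimeMillis();   // default: the real wall clock
    }
  };

  static long currentTimeMillis() {
    return clock.currentTimeMillis();
  }

  // Tests can inject a fake clock and advance time deterministically,
  // which direct System.currentTimeMillis() calls do not allow.
  static void inject(Clock c) {
    clock = c;
  }
}
{code}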



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10599) Replace System.currentMillis() with EnvironmentEdge.currentTimeMillis in memstore flusher and related places

2014-02-24 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910259#comment-13910259
 ] 

Jean-Marc Spaggiari commented on HBASE-10599:
-

Hi Ramkrishna,

For my knowledge, why should EnvironmentEdge.currentMillis() be preferred to 
System.currentMillis()?

 Replace System.currentMillis() with EnvironmentEdge.currentTimeMillis in 
 memstore flusher and related places
 

 Key: HBASE-10599
 URL: https://issues.apache.org/jira/browse/HBASE-10599
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.98.0, 0.99.0, 0.96.1.1
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.96.2, 0.98.1, 0.99.0


 MemStoreFlusher still uses System.currentTimeMillis().  Better to replace it with 
 EnvironmentEdge.currentTimeMillis().



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10595) HBaseAdmin.getTableDescriptor can wrongly get the previous table's TableDescriptor even after the table dir in hdfs is removed

2014-02-24 Thread Feng Honghua (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Honghua updated HBASE-10595:
-

Attachment: HBASE-10595-trunk_v3.patch

 HBaseAdmin.getTableDescriptor can wrongly get the previous table's 
 TableDescriptor even after the table dir in hdfs is removed
 --

 Key: HBASE-10595
 URL: https://issues.apache.org/jira/browse/HBASE-10595
 Project: HBase
  Issue Type: Bug
  Components: master, util
Reporter: Feng Honghua
Assignee: Feng Honghua
 Attachments: HBASE-10595-trunk_v1.patch, HBASE-10595-trunk_v2.patch, 
 HBASE-10595-trunk_v3.patch


 When a table dir (in hdfs) is removed from outside, HMaster will still return 
 the cached TableDescriptor to the client for a getTableDescriptor request.
 By contrast, HBaseAdmin.listTables() is handled correctly in the current 
 implementation: for a table whose table dir in hdfs has been removed from 
 outside, getTableDescriptor can still retrieve a valid (old) table descriptor 
 while listTables says it doesn't exist, which is inconsistent.
 The reason for this bug is that HMaster (via FSTableDescriptors) doesn't check 
 whether the table dir exists for a getTableDescriptor() request (while it lists 
 all existing table dirs, rather than first consulting the cache, and returns 
 accordingly for a listTables() request).
 When a table is deleted via deleteTable, the cache is cleared after the table 
 dir and tableInfo file are removed, so the listTables/getTableDescriptor 
 inconsistency should be transient (though it still exists while the table dir 
 is removed but the cache is not yet cleared) and harder to expose.
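A hedged sketch of the kind of check the description asks for; the class and
helper names are simplified for illustration and are not the actual
FSTableDescriptors code:
{code}
// Illustrative only: serve the cached descriptor only if the table dir still
// exists on the filesystem, so getTableDescriptor stays consistent with
// listTables when the dir is removed out-of-band.
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HTableDescriptor;

public class CachedTableDescriptors {
  private final FileSystem fs;
  private final Path rootDir;
  private final Map<String, HTableDescriptor> cache =
      new ConcurrentHashMap<String, HTableDescriptor>();

  public CachedTableDescriptors(FileSystem fs, Path rootDir) {
    this.fs = fs;
    this.rootDir = rootDir;
  }

  public HTableDescriptor get(String tableName) throws IOException {
    Path tableDir = new Path(rootDir, tableName);
    if (!fs.exists(tableDir)) {
      cache.remove(tableName);   // table dir removed out-of-band: drop stale entry
      return null;               // matches what listTables reports
    }
    return cache.get(tableName); // re-reading the descriptor file on a miss is omitted
  }
}
{code}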



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-8803) region_mover.rb should move multiple regions at a time

2014-02-24 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910276#comment-13910276
 ] 

Jean-Marc Spaggiari commented on HBASE-8803:


Thanks. But you (or any other committer) will have to do it, since I can not 
(yet). I have been able to apply the current patch without any modification on 
0.94. I can rebase if required.

 region_mover.rb should move multiple regions at a time
 --

 Key: HBASE-8803
 URL: https://issues.apache.org/jira/browse/HBASE-8803
 Project: HBase
  Issue Type: Bug
  Components: Usability
Affects Versions: 0.98.0, 0.94.8, 0.95.1
Reporter: Jean-Marc Spaggiari
Assignee: Jean-Marc Spaggiari
 Fix For: 0.99.0

 Attachments: 8803v5.txt, HBASE-8803-v0-trunk.patch, 
 HBASE-8803-v1-0.94.patch, HBASE-8803-v1-trunk.patch, 
 HBASE-8803-v2-0.94.patch, HBASE-8803-v2-0.94.patch, HBASE-8803-v3-0.94.patch, 
 HBASE-8803-v4-0.94.patch, HBASE-8803-v4-trunk.patch, 
 HBASE-8803-v5-0.94.patch, HBASE-8803-v6-0.94.patch, HBASE-8803-v6-trunk.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

  When there are many regions in a cluster, a rolling restart can take hours 
  because region_mover moves the regions one by one.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10594) Speed up TestRestoreSnapshotFromClient a bit

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910288#comment-13910288
 ] 

Hudson commented on HBASE-10594:


SUCCESS: Integrated in hbase-0.96-hadoop2 #213 (See 
[https://builds.apache.org/job/hbase-0.96-hadoop2/213/])
HBASE-10594 Speed up TestRestoreSnapshotFromClient a bit. (larsh: rev 1571143)
* 
/hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestRestoreSnapshotFromClient.java


 Speed up TestRestoreSnapshotFromClient a bit
 

 Key: HBASE-10594
 URL: https://issues.apache.org/jira/browse/HBASE-10594
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.18

 Attachments: 10594-0.94.txt, 10594-trunk.txt


 Looking through the longest running test in 0.94 I noticed that 
 TestRestoreSnapshotFromClient runs for over 10 minutes on the jenkins boxes 
 (264s on my local box).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10579) [Documentation]: ExportSnapshot tool package incorrectly documented

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910287#comment-13910287
 ] 

Hudson commented on HBASE-10579:


SUCCESS: Integrated in hbase-0.96-hadoop2 #213 (See 
[https://builds.apache.org/job/hbase-0.96-hadoop2/213/])
HBASE-10579 ExportSnapshot tool package incorrectly documented (Aleksandr 
Shulman) (mbertozzi: rev 1571202)
* /hbase/branches/0.96/src/main/docbkx/ops_mgt.xml


 [Documentation]: ExportSnapshot tool package incorrectly documented
 ---

 Key: HBASE-10579
 URL: https://issues.apache.org/jira/browse/HBASE-10579
 Project: HBase
  Issue Type: Bug
  Components: documentation, snapshots
Affects Versions: 0.98.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0

 Attachments: HBASE-10579-v0.patch


 Documentation Page: http://hbase.apache.org/book/ops.snapshots.html
 Expected documentation:
 The class should be specified as 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot
 Current documentation:
 Specified as: org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot
 This makes sense because the class is located in the 
 org.apache.hadoop.hbase.snapshot package:
 https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#19



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10579) [Documentation]: ExportSnapshot tool package incorrectly documented

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910290#comment-13910290
 ] 

Hudson commented on HBASE-10579:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #167 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/167/])
HBASE-10579 ExportSnapshot tool package incorrectly documented (Aleksandr 
Shulman) (mbertozzi: rev 1571201)
* /hbase/branches/0.98/src/main/docbkx/ops_mgt.xml


 [Documentation]: ExportSnapshot tool package incorrectly documented
 ---

 Key: HBASE-10579
 URL: https://issues.apache.org/jira/browse/HBASE-10579
 Project: HBase
  Issue Type: Bug
  Components: documentation, snapshots
Affects Versions: 0.98.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0

 Attachments: HBASE-10579-v0.patch


 Documentation Page: http://hbase.apache.org/book/ops.snapshots.html
 Expected documentation:
 The class should be specified as 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot
 Current documentation:
 Specified as: org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot
 This makes sense because the class is located in the 
 org.apache.hadoop.hbase.snapshot package:
 https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#19



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10591) Sanity check table configuration in createTable

2014-02-24 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910295#comment-13910295
 ] 

Jean-Marc Spaggiari commented on HBASE-10591:
-

Can we add a force parameter in case anyone really wants to have a value 
outside of those bounds and knows what they are doing?

As an example, I have a table with MAX_FILESIZE = '1638400' (less than 2MB). 
This table handles VERY small keys/values. Values are 1 byte. Keys are less 
than 32 bytes. But I want this table to be spread over all my servers, so I 
have to put a small MAX_FILESIZE value.

With the current patch, I will not be able to do that anymore, which is bad for 
me. So I would really prefer to have a force option. Yes, I can use 
hbase.hregion.max.filesize.limit and set it to 1MB, but since I think this is 
still a good idea, I want to have this check for my other tables :)

My 2¢.
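For illustration, a hedged sketch of what such a check plus an opt-out could
look like; the method name, the per-table skip flag, and the default limit are
assumptions for this sketch, not the actual patch:
{code}
// Illustrative only: reject an implausibly small MAX_FILESIZE unless the
// table explicitly opts out of the sanity check.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HTableDescriptor;

public class TableSanityCheck {
  public static void checkMaxFileSize(HTableDescriptor htd, Configuration conf)
      throws IOException {
    if ("true".equalsIgnoreCase(htd.getValue("SANITY_CHECKS_DISABLED"))) {
      return;                                  // hypothetical "force" escape hatch
    }
    long limit = conf.getLong("hbase.hregion.max.filesize.limit", 2 * 1024 * 1024L);
    long maxFileSize = htd.getMaxFileSize();   // -1 means "not set in the descriptor"
    if (maxFileSize > 0 && maxFileSize < limit) {
      throw new IOException("MAX_FILESIZE " + maxFileSize
          + " is smaller than the configured minimum " + limit);
    }
  }
}
{code}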

 Sanity check table configuration in createTable
 ---

 Key: HBASE-10591
 URL: https://issues.apache.org/jira/browse/HBASE-10591
 Project: HBase
  Issue Type: Improvement
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.99.0

 Attachments: hbase-10591_v1.patch, hbase-10591_v2.patch


  We had a cluster become completely inoperable because a couple of tables 
  were erroneously created with MAX_FILESIZE set to 4K, which resulted in 180K 
  regions in a short interval and brought the master down due to HBASE-4246.
  We can do some sanity checking in master.createTable() and reject such 
  requests. We already check the compression there, so it seems a good place. 
  Alter table should check for this as well.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HBASE-10574) IllegalArgumentException Hadoop Hbase

2014-02-24 Thread Jean-Marc Spaggiari (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Marc Spaggiari resolved HBASE-10574.
-

Resolution: Not A Problem

Closing as per request.

 IllegalArgumentException Hadoop Hbase
 -

 Key: HBASE-10574
 URL: https://issues.apache.org/jira/browse/HBASE-10574
 Project: HBase
  Issue Type: Test
  Components: hadoop2
Affects Versions: 0.96.0
 Environment: Windows 
Reporter: SSR
Priority: Critical
   Original Estimate: 96h
  Remaining Estimate: 96h

 Hi All,
  We are trying to load data into HBase. We are able to connect to HBase from 
  Eclipse.
 We are following the tutorial at:
 http://courses.coreservlets.com/Course-Materials/pdf/hadoop/04-MapRed-4-InputAndOutput.pdf
 When we run the program we are getting the below exception.
 2014-02-20 10:28:04,099 INFO  [main] mapreduce.JobSubmitter 
 (JobSubmitter.java:submitJobInternal(439)) - Cleaning up the staging area 
 file:/tmp/hadoop-yarakanaboinas/mapred/staging/yarakanaboinas1524547448/.staging/job_local1524547448_0001
 Exception in thread main java.lang.IllegalArgumentException: Pathname 
 /C:/hdp/hbase-0.96.0.2.0.6.0-0009-hadoop2/lib/hbase-client-0.96.0.2.0.6.0-0009-hadoop2.jar
  from 
 hdfs://HBADGX7900016:8020/C:/hdp/hbase-0.96.0.2.0.6.0-0009-hadoop2/lib/hbase-client-0.96.0.2.0.6.0-0009-hadoop2.jar
  is not a valid DFS filename.
   at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:184)
   at 
 org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:92)
   at 
 org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
   at 
 org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
   at 
 org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
   at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
   at 
 org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
   at 
 org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
   at 
 org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
   at 
 org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:264)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:300)
   at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:387)
   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1286)
   at 
 WordCountMapper.StartWithCountJob_HBase.run(StartWithCountJob_HBase.java:41)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
   at 
 WordCountMapper.StartWithCountJob_HBase.main(StartWithCountJob_HBase.java:44)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10579) [Documentation]: ExportSnapshot tool package incorrectly documented

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910356#comment-13910356
 ] 

Hudson commented on HBASE-10579:


FAILURE: Integrated in hbase-0.96 #310 (See 
[https://builds.apache.org/job/hbase-0.96/310/])
HBASE-10579 ExportSnapshot tool package incorrectly documented (Aleksandr 
Shulman) (mbertozzi: rev 1571202)
* /hbase/branches/0.96/src/main/docbkx/ops_mgt.xml


 [Documentation]: ExportSnapshot tool package incorrectly documented
 ---

 Key: HBASE-10579
 URL: https://issues.apache.org/jira/browse/HBASE-10579
 Project: HBase
  Issue Type: Bug
  Components: documentation, snapshots
Affects Versions: 0.98.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0

 Attachments: HBASE-10579-v0.patch


 Documentation Page: http://hbase.apache.org/book/ops.snapshots.html
 Expected documentation:
 The class should be specified as 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot
 Current documentation:
 Specified as: org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot
 This makes sense because the class is located in the 
 org.apache.hadoop.hbase.snapshot package:
 https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#19



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10595) HBaseAdmin.getTableDescriptor can wrongly get the previous table's TableDescriptor even after the table dir in hdfs is removed

2014-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910362#comment-13910362
 ] 

Hadoop QA commented on HBASE-10595:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12630659/HBASE-10595-trunk_v3.patch
  against trunk revision .
  ATTACHMENT ID: 12630659

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8784//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8784//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8784//console

This message is automatically generated.

 HBaseAdmin.getTableDescriptor can wrongly get the previous table's 
 TableDescriptor even after the table dir in hdfs is removed
 --

 Key: HBASE-10595
 URL: https://issues.apache.org/jira/browse/HBASE-10595
 Project: HBase
  Issue Type: Bug
  Components: master, util
Reporter: Feng Honghua
Assignee: Feng Honghua
 Attachments: HBASE-10595-trunk_v1.patch, HBASE-10595-trunk_v2.patch, 
 HBASE-10595-trunk_v3.patch


 When a table dir (in hdfs) is removed(by outside), HMaster will still return 
 the cached TableDescriptor to client for getTableDescriptor request.
 On the contrary, HBaseAdmin.listTables() is handled correctly in current 
 implementation, for a table whose table dir in hdfs is removed by outside, 
 getTableDescriptor can still retrieve back a valid (old) table descriptor, 
 while listTables says it doesn't exist, this is inconsistent
 The reason for this bug is because HMaster (via FSTableDescriptors) doesn't 
 check if the table dir exists for getTableDescriptor() request, (while it 
 lists all existing table dirs(not firstly respects cache) and returns 
 accordingly for listTables() request)
 When a table is deleted via deleteTable, the cache will be cleared after the 
 table dir and tableInfo file is removed, listTables/getTableDescriptor 
 inconsistency should be transient(though still exists, when table dir is 
 removed while cache is not cleared) and harder to expose



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt

2014-02-24 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910386#comment-13910386
 ] 

Devaraj Das commented on HBASE-10525:
-

Yes, [~nkeywal], I just had that question. Thanks for the clarification.

 Allow the client to use a different thread for writing to ease interrupt
 

 Key: HBASE-10525
 URL: https://issues.apache.org/jira/browse/HBASE-10525
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.99.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10525.v1.patch, 10525.v2.patch, 10525.v3.patch, 
 10525.v4.patch, 10525.v5.patch, 10525.v6.patch, 10525.v7.patch, 
 HBaseclient-EventualConsistency.pdf


  This is an issue in the HBASE-10070 context, but also more generally if 
  you want to interrupt an operation at a limited cost. 
  I will attach a doc with a more detailed explanation.
  This adds a thread per region server, so it's optional. The first patch 
  activates it by default to see how it behaves on a full hadoop-qa run. The 
  target is for it to be off by default.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10566) cleanup rpcTimeout in the client

2014-02-24 Thread Nicolas Liochon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Liochon updated HBASE-10566:


Status: Patch Available  (was: Open)

 cleanup rpcTimeout in the client
 

 Key: HBASE-10566
 URL: https://issues.apache.org/jira/browse/HBASE-10566
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.99.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10566.sample.patch, 10566.v1.patch


  There are two issues:
  1) A confusion between the socket timeout and the call timeout.
  Socket timeouts should be minimal: a default like 20 seconds, which could be 
  lowered to single-digit timeouts for some apps: if we can not write to the 
  socket in 10 seconds, we have an issue. This is different from the total 
  duration (send query + do query + receive query), which can be longer, as it 
  can include remote calls on the server and so on. Today we have a single 
  value, so it does not allow us to have low socket read timeouts.
  2) The timeout can be different between the calls. Typically, if the total 
  time, retries included, is 60 seconds but the call failed after 2 seconds, 
  then the remaining is 58s. HBase does this today, but by hacking with a 
  thread-local storage variable. It's a hack (it should have been a parameter 
  of the methods; the TLS allowed bypassing all the layers. Maybe protobuf 
  makes this complicated, to be confirmed), but it also does not really work, 
  because we can have multithreading issues (we use the updated rpc timeout of 
  someone else, or we create a new BlockingRpcChannelImplementation with a 
  random default timeout).
  Ideally, we could send the call timeout to the server as well: it will be 
  able to dismiss on its own the calls that it received but that got stuck in 
  the request queue or in the internal retries (on hdfs for example).
  This will make the system more reactive to failure.
  I think we can solve this now, especially after 10525. The main issue is to 
  find something that fits well with protobuf...
  Then it should be easy to have a pool of threads for writers and readers, 
  instead of a single thread per region server as today. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10566) cleanup rpcTimeout in the client

2014-02-24 Thread Nicolas Liochon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Liochon updated HBASE-10566:


Attachment: 10566.v1.patch

 cleanup rpcTimeout in the client
 

 Key: HBASE-10566
 URL: https://issues.apache.org/jira/browse/HBASE-10566
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.99.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10566.sample.patch, 10566.v1.patch


 There are two issues:
 1) A confusion between the socket timeout and the call timeout
 Socket timeouts should be minimal: a default like 20 seconds, that could be 
 lowered to single digits timeouts for some apps: if we can not write to the 
 socket in 10 second, we have an issue. This is different from the total 
 duration (send query + do query + receive query), that can be longer, as it 
 can include remotes calls on the server and so on. Today, we have a single 
 value, it does not allow us to have low socket read timeouts.
 2) The timeout can be different between the calls. Typically, if the total 
 time, retries included is 60 seconds but failed after 2 seconds, then the 
 remaining is 58s. HBase does this today, but by hacking with a thread local 
 storage variable. It's a hack (it should have been a parameter of the 
 methods, the TLS allowed to bypass all the layers. May be protobuf makes this 
 complicated, to be confirmed), but as well it does not really work, because 
 we can have multithreading issues (we use the updated rpc timeout of someone 
 else, or we create a new BlockingRpcChannelImplementation with a random 
 default timeout).
 Ideally, we could send the call timeout to the server as well: it will be 
 able to dismiss alone the calls that it received but git stick in the request 
 queue or in the internal retries (on hdfs for example).
 This will make the system more reactive to failure.
 I think we can solve this now, especially after 10525. The main issue is to 
 something that fits well with protobuf...
 Then it should be easy to have a pool of thread for writers and readers, w/o 
 a single thread per region server as today. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10451) Enable back Tag compression on HFiles

2014-02-24 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910450#comment-13910450
 ] 

Anoop Sam John commented on HBASE-10451:


Not able to see the test result to check for zombie test !!

 Enable back Tag compression on HFiles
 -

 Key: HBASE-10451
 URL: https://issues.apache.org/jira/browse/HBASE-10451
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Critical
 Fix For: 0.98.1, 0.99.0

 Attachments: HBASE-10451.patch, HBASE-10451_V2.patch, 
 HBASE-10451_V3.patch, HBASE-10451_V4.patch, HBASE-10451_V5.patch, 
 HBASE-10451_V6.patch


 HBASE-10443 disables tag compression on HFiles. This Jira is to fix the 
 issues we have found out in HBASE-10443 and enable it back.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10566) cleanup rpcTimeout in the client

2014-02-24 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910452#comment-13910452
 ] 

Nicolas Liochon commented on HBASE-10566:
-

v1 is a first attempt. I haven't run all the tests locally, but I had no errors 
after a 30-minute run.

3 different socket timeouts
- connect
- read
- write

For all of them, we should be able to set them to low value, something like 2 / 
5 / 5, without any impact. Likely I will need to write a test for this. The 
existing timeout of 60s is a global timeout for the operation. I need to double 
check how we were using the existing operationTimout, my feeling is that it was 
buggy, and that it was overriding the individual timeout. If it's the case, 
it's still buggy.
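
(For clarity, a minimal sketch of the separation described above, in plain Java 
and with made-up names rather than the actual RpcClient API: low, fixed socket 
timeouts for connect/read, plus a per-call budget derived from the overall 
operation timeout minus the time already spent on earlier attempts.)

{code}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class TimeoutSketch {
  // Assumed low socket-level defaults, independent of the call timeout.
  static final int CONNECT_TIMEOUT_MS = 2000;
  static final int READ_TIMEOUT_MS = 5000;

  /** Remaining budget for the next attempt, given the overall operation timeout. */
  static int remainingCallTimeout(long operationStartMs, int operationTimeoutMs) {
    long elapsed = System.currentTimeMillis() - operationStartMs;
    return (int) Math.max(0, operationTimeoutMs - elapsed);
  }

  static Socket openSocket(InetSocketAddress server) throws IOException {
    Socket s = new Socket();
    s.connect(server, CONNECT_TIMEOUT_MS); // connect timeout
    s.setSoTimeout(READ_TIMEOUT_MS);       // read timeout, separate from the call budget
    return s;
  }

  public static void main(String[] args) {
    long start = System.currentTimeMillis();
    int operationTimeoutMs = 60000; // e.g. 60s total, retries included
    // After a first attempt that burned 2s, the next attempt would get ~58s.
    System.out.println("budget left: " + remainingCallTimeout(start, operationTimeoutMs) + " ms");
  }
}
{code}

The point being that the per-call budget is computed and passed explicitly to 
each attempt instead of being stashed in a thread local.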




 cleanup rpcTimeout in the client
 

 Key: HBASE-10566
 URL: https://issues.apache.org/jira/browse/HBASE-10566
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.99.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10566.sample.patch, 10566.v1.patch


 There are two issues:
 1) A confusion between the socket timeout and the call timeout.
 Socket timeouts should be minimal: a default like 20 seconds, which could be 
 lowered to single-digit timeouts for some apps: if we cannot write to the 
 socket in 10 seconds, we have an issue. This is different from the total 
 duration (send query + do query + receive query), which can be longer, as it 
 can include remote calls on the server and so on. Today we have a single 
 value, and it does not allow us to have low socket read timeouts.
 2) The timeout can be different between the calls. Typically, if the total 
 time, retries included, is 60 seconds and the first attempt failed after 2 
 seconds, then the remaining budget is 58s. HBase does this today, but by 
 hacking with a thread-local storage variable. It's a hack (it should have been 
 a parameter of the methods; the TLS allowed bypassing all the layers. Maybe 
 protobuf makes this complicated, to be confirmed), and it does not really work 
 either, because we can have multithreading issues (we use the updated rpc 
 timeout of someone else, or we create a new BlockingRpcChannelImplementation 
 with a random default timeout).
 Ideally, we could send the call timeout to the server as well: it would then be 
 able to dismiss on its own the calls that it received but got stuck in the 
 request queue or in the internal retries (on hdfs for example).
 This will make the system more reactive to failure.
 I think we can solve this now, especially after 10525. The main issue is to 
 find something that fits well with protobuf...
 Then it should be easy to have a pool of threads for writers and readers, 
 instead of a single thread per region server as today.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10451) Enable back Tag compression on HFiles

2014-02-24 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910513#comment-13910513
 ] 

ramkrishna.s.vasudevan commented on HBASE-10451:


+1.  You can check for zombie tests once more. If your testing is satisfactory 
then commit the patch.

 Enable back Tag compression on HFiles
 -

 Key: HBASE-10451
 URL: https://issues.apache.org/jira/browse/HBASE-10451
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Critical
 Fix For: 0.98.1, 0.99.0

 Attachments: HBASE-10451.patch, HBASE-10451_V2.patch, 
 HBASE-10451_V3.patch, HBASE-10451_V4.patch, HBASE-10451_V5.patch, 
 HBASE-10451_V6.patch


 HBASE-10443 disables tag compression on HFiles. This Jira is to fix the 
 issues we have found out in HBASE-10443 and enable it back.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10587) Master metrics clusterRequests is wrong

2014-02-24 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-10587:


Priority: Minor  (was: Major)

 Master metrics clusterRequests is wrong
 ---

 Key: HBASE-10587
 URL: https://issues.apache.org/jira/browse/HBASE-10587
 Project: HBase
  Issue Type: Bug
  Components: master, metrics
Affects Versions: 0.96.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0

 Attachments: hbase-10587.patch


 In the master JMX, the clusterRequests metric increases too fast. Looking into 
 the code, the calculation is a little bit wrong. It's a counter; however, for 
 each region server report, the region server's total number of requests is 
 added to clusterRequests, so the same requests are counted multiple times.
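
(A minimal sketch of one way to keep clusterRequests a true counter, with 
hypothetical field names rather than the actual HMaster code: remember each 
server's last reported total and add only the delta.)

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class ClusterRequestsCounter {
  private final AtomicLong clusterRequests = new AtomicLong();
  // Last total reported by each region server.
  private final Map<String, Long> lastReported = new ConcurrentHashMap<String, Long>();

  /** Called on each region server report carrying that server's running total. */
  void onServerReport(String serverName, long totalRequests) {
    Long previous = lastReported.put(serverName, totalRequests);
    long delta = totalRequests - (previous == null ? 0L : previous.longValue());
    clusterRequests.addAndGet(delta); // add the delta, not the whole total again
  }

  long get() {
    return clusterRequests.get();
  }
}
{code}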



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10587) Master metrics clusterRequests is wrong

2014-02-24 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-10587:


   Resolution: Fixed
Fix Version/s: 0.99.0
   0.98.1
   0.96.2
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Integrated into 0.96, 0.98, and trunk. Thanks Enis for reviewing it.

 Master metrics clusterRequests is wrong
 ---

 Key: HBASE-10587
 URL: https://issues.apache.org/jira/browse/HBASE-10587
 Project: HBase
  Issue Type: Bug
  Components: master, metrics
Affects Versions: 0.96.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.2, 0.98.1, 0.99.0

 Attachments: hbase-10587.patch


 In the master JMX, the clusterRequests metric increases too fast. Looking into 
 the code, the calculation is a little bit wrong. It's a counter; however, for 
 each region server report, the region server's total number of requests is 
 added to clusterRequests, so the same requests are counted multiple times.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10597) IOEngine#read() should return the number of bytes transferred

2014-02-24 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10597:
---

Attachment: 10597-v2.txt

Patch v2 addresses Anoop's comments.

 IOEngine#read() should return the number of bytes transferred
 -

 Key: HBASE-10597
 URL: https://issues.apache.org/jira/browse/HBASE-10597
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 10597-v1.txt, 10597-v2.txt


 IOEngine#read() is called by BucketCache#getBlock().
 IOEngine#read() should return the number of bytes transferred so that 
 BucketCache#getBlock() can check this return value against the length 
 obtained from bucketEntry.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10597) IOEngine#read() should return the number of bytes transferred

2014-02-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910580#comment-13910580
 ] 

Andrew Purtell commented on HBASE-10597:


Checking return values is good.

Why only a log message here? Is this an error? How should it be handled? 

{code}
Index: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java
===
--- 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java
 (revision 1571351)
+++ 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java
 (working copy)
@@ -367,7 +367,10 @@
 if (bucketEntry.equals(backingMap.get(key))) {
   int len = bucketEntry.getLength();
   ByteBuffer bb = ByteBuffer.allocate(len);
-  ioEngine.read(bb, bucketEntry.offset());
+  int lenRead = ioEngine.read(bb, bucketEntry.offset());
+  if (lenRead != len) {
+LOG.warn("Only " + lenRead + " bytes read, " + len + " expected");
+  }
   Cacheable cachedBlock = bucketEntry.deserializerReference(
   deserialiserMap).deserialize(bb, true);
   long timeTaken = System.nanoTime() - start;
{code}


 IOEngine#read() should return the number of bytes transferred
 -

 Key: HBASE-10597
 URL: https://issues.apache.org/jira/browse/HBASE-10597
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 10597-v1.txt, 10597-v2.txt


 IOEngine#read() is called by BucketCache#getBlock().
 IOEngine#read() should return the number of bytes transferred so that 
 BucketCache#getBlock() can check this return value against the length 
 obtained from bucketEntry.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10593) FileInputStream in JenkinsHash#main() is never closed

2014-02-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910585#comment-13910585
 ] 

Andrew Purtell commented on HBASE-10593:


bq. Why not work on removing this unused class instead of 'fixing' it?

+1 to that

 FileInputStream in JenkinsHash#main() is never closed
 -

 Key: HBASE-10593
 URL: https://issues.apache.org/jira/browse/HBASE-10593
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Priority: Trivial

 {code}
 FileInputStream in = new FileInputStream(args[0]);
 {code}
 The above FileInputStream is not closed.
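
 (A minimal sketch of how such a main() could close the stream with 
 try-with-resources; illustrative only, not a patch to the actual JenkinsHash 
 class.)
 {code}
 import java.io.FileInputStream;
 import java.io.IOException;

 public class HashFileMain {
   public static void main(String[] args) throws IOException {
     try (FileInputStream in = new FileInputStream(args[0])) {
       byte[] buf = new byte[4096];
       while (in.read(buf) > 0) {
         // hash the bytes here; omitted, since the point is only the resource handling
       }
     } // the stream is closed even if reading throws
   }
 }
 {code}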



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10566) cleanup rpcTimeout in the client

2014-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910592#comment-13910592
 ] 

Hadoop QA commented on HBASE-10566:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12630699/10566.v1.patch
  against trunk revision .
  ATTACHMENT ID: 12630699

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 3 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.client.TestClientOperationInterrupt

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8785//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8785//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8785//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8785//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8785//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8785//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8785//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8785//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8785//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8785//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8785//console

This message is automatically generated.

 cleanup rpcTimeout in the client
 

 Key: HBASE-10566
 URL: https://issues.apache.org/jira/browse/HBASE-10566
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.99.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10566.sample.patch, 10566.v1.patch


 There are two issues:
 1) A confusion between the socket timeout and the call timeout.
 Socket timeouts should be minimal: a default like 20 seconds, which could be 
 lowered to single-digit timeouts for some apps: if we cannot write to the 
 socket in 10 seconds, we have an issue. This is different from the total 
 duration (send query + do query + receive query), which can be longer, as it 
 can include remote calls on the server and so on. Today we have a single 
 value, and it does not allow us to have low socket read timeouts.
 2) The timeout can be different between the calls. Typically, if the total 
 time, retries included, is 60 seconds and the first attempt failed after 2 
 seconds, then the remaining budget is 58s. HBase does this today, but by 
 hacking with a thread-local storage variable. It's a hack (it should have been 
 a parameter of the methods; the TLS allowed bypassing all the layers. Maybe 
 protobuf makes this complicated, to be confirmed), and it does not really work 
 either, because we can have multithreading issues (we use the updated rpc 
 timeout of someone else, or we create a new BlockingRpcChannelImplementation 
 with a random default timeout).
 Ideally, we could send the call timeout to the server as well: it would then be 
 able to dismiss on its own the calls that it received but got stuck in the 
 request queue or in the internal retries (on 

[jira] [Commented] (HBASE-10597) IOEngine#read() should return the number of bytes transferred

2014-02-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910597#comment-13910597
 ] 

Ted Yu commented on HBASE-10597:


I thought about throwing an exception when there is a mismatch in the length read.
Here is the method signature for BlockCache#getBlock():
{code}
  Cacheable getBlock(BlockCacheKey cacheKey, boolean caching, boolean repeat);
{code}
If the above signature is kept, some RuntimeException would be thrown.
Is that okay?
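
(A minimal sketch of that approach, against a simplified IOEngine-like interface 
rather than the real HBase types: compare the bytes actually read with the 
expected length and surface a RuntimeException on a short read, since getBlock() 
does not declare IOException.)

{code}
import java.nio.ByteBuffer;

interface SimpleIOEngine {
  /** Reads into dst starting at the given offset; returns the number of bytes transferred. */
  int read(ByteBuffer dst, long offset);
}

class ShortReadCheck {
  static ByteBuffer readBlock(SimpleIOEngine engine, long offset, int expectedLen) {
    ByteBuffer bb = ByteBuffer.allocate(expectedLen);
    int lenRead = engine.read(bb, offset);
    if (lenRead != expectedLen) {
      throw new RuntimeException("Only " + lenRead + " bytes read, " + expectedLen + " expected");
    }
    return bb;
  }
}
{code}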

 IOEngine#read() should return the number of bytes transferred
 -

 Key: HBASE-10597
 URL: https://issues.apache.org/jira/browse/HBASE-10597
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 10597-v1.txt, 10597-v2.txt


 IOEngine#read() is called by BucketCache#getBlock().
 IOEngine#read() should return the number of bytes transferred so that 
 BucketCache#getBlock() can check this return value against the length 
 obtained from bucketEntry.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10597) IOEngine#read() should return the number of bytes transferred

2014-02-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910599#comment-13910599
 ] 

Andrew Purtell commented on HBASE-10597:


bq. If the above signature is kept, some RuntimeException would be thrown. Is 
that Okay ?

Yes, I think so.

 IOEngine#read() should return the number of bytes transferred
 -

 Key: HBASE-10597
 URL: https://issues.apache.org/jira/browse/HBASE-10597
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 10597-v1.txt, 10597-v2.txt


 IOEngine#read() is called by BucketCache#getBlock().
 IOEngine#read() should return the number of bytes transferred so that 
 BucketCache#getBlock() can check this return value against the length 
 obtained from bucketEntry.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10566) cleanup rpcTimeout in the client

2014-02-24 Thread Nicolas Liochon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Liochon updated HBASE-10566:


Status: Open  (was: Patch Available)

 cleanup rpcTimeout in the client
 

 Key: HBASE-10566
 URL: https://issues.apache.org/jira/browse/HBASE-10566
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.99.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10566.sample.patch, 10566.v1.patch, 10566.v2.patch


 There are two issues:
 1) A confusion between the socket timeout and the call timeout.
 Socket timeouts should be minimal: a default like 20 seconds, which could be 
 lowered to single-digit timeouts for some apps: if we cannot write to the 
 socket in 10 seconds, we have an issue. This is different from the total 
 duration (send query + do query + receive query), which can be longer, as it 
 can include remote calls on the server and so on. Today we have a single 
 value, and it does not allow us to have low socket read timeouts.
 2) The timeout can be different between the calls. Typically, if the total 
 time, retries included, is 60 seconds and the first attempt failed after 2 
 seconds, then the remaining budget is 58s. HBase does this today, but by 
 hacking with a thread-local storage variable. It's a hack (it should have been 
 a parameter of the methods; the TLS allowed bypassing all the layers. Maybe 
 protobuf makes this complicated, to be confirmed), and it does not really work 
 either, because we can have multithreading issues (we use the updated rpc 
 timeout of someone else, or we create a new BlockingRpcChannelImplementation 
 with a random default timeout).
 Ideally, we could send the call timeout to the server as well: it would then be 
 able to dismiss on its own the calls that it received but got stuck in the 
 request queue or in the internal retries (on hdfs for example).
 This will make the system more reactive to failure.
 I think we can solve this now, especially after 10525. The main issue is to 
 find something that fits well with protobuf...
 Then it should be easy to have a pool of threads for writers and readers, 
 instead of a single thread per region server as today.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10566) cleanup rpcTimeout in the client

2014-02-24 Thread Nicolas Liochon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Liochon updated HBASE-10566:


Status: Patch Available  (was: Open)

 cleanup rpcTimeout in the client
 

 Key: HBASE-10566
 URL: https://issues.apache.org/jira/browse/HBASE-10566
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.99.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10566.sample.patch, 10566.v1.patch, 10566.v2.patch


 There are two issues:
 1) A confusion between the socket timeout and the call timeout.
 Socket timeouts should be minimal: a default like 20 seconds, which could be 
 lowered to single-digit timeouts for some apps: if we cannot write to the 
 socket in 10 seconds, we have an issue. This is different from the total 
 duration (send query + do query + receive query), which can be longer, as it 
 can include remote calls on the server and so on. Today we have a single 
 value, and it does not allow us to have low socket read timeouts.
 2) The timeout can be different between the calls. Typically, if the total 
 time, retries included, is 60 seconds and the first attempt failed after 2 
 seconds, then the remaining budget is 58s. HBase does this today, but by 
 hacking with a thread-local storage variable. It's a hack (it should have been 
 a parameter of the methods; the TLS allowed bypassing all the layers. Maybe 
 protobuf makes this complicated, to be confirmed), and it does not really work 
 either, because we can have multithreading issues (we use the updated rpc 
 timeout of someone else, or we create a new BlockingRpcChannelImplementation 
 with a random default timeout).
 Ideally, we could send the call timeout to the server as well: it would then be 
 able to dismiss on its own the calls that it received but got stuck in the 
 request queue or in the internal retries (on hdfs for example).
 This will make the system more reactive to failure.
 I think we can solve this now, especially after 10525. The main issue is to 
 find something that fits well with protobuf...
 Then it should be easy to have a pool of threads for writers and readers, 
 instead of a single thread per region server as today.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10566) cleanup rpcTimeout in the client

2014-02-24 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910602#comment-13910602
 ] 

Nicolas Liochon commented on HBASE-10566:
-

v2 fixes the test error. I'm not sure we shouldn't get rid of 'wrapException', 
however. We're spending a lot of time wrapping the exceptions and then 
unwrapping them to discover what really happened.

 cleanup rpcTimeout in the client
 

 Key: HBASE-10566
 URL: https://issues.apache.org/jira/browse/HBASE-10566
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.99.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10566.sample.patch, 10566.v1.patch, 10566.v2.patch


 There are two issues:
 1) A confusion between the socket timeout and the call timeout.
 Socket timeouts should be minimal: a default like 20 seconds, which could be 
 lowered to single-digit timeouts for some apps: if we cannot write to the 
 socket in 10 seconds, we have an issue. This is different from the total 
 duration (send query + do query + receive query), which can be longer, as it 
 can include remote calls on the server and so on. Today we have a single 
 value, and it does not allow us to have low socket read timeouts.
 2) The timeout can be different between the calls. Typically, if the total 
 time, retries included, is 60 seconds and the first attempt failed after 2 
 seconds, then the remaining budget is 58s. HBase does this today, but by 
 hacking with a thread-local storage variable. It's a hack (it should have been 
 a parameter of the methods; the TLS allowed bypassing all the layers. Maybe 
 protobuf makes this complicated, to be confirmed), and it does not really work 
 either, because we can have multithreading issues (we use the updated rpc 
 timeout of someone else, or we create a new BlockingRpcChannelImplementation 
 with a random default timeout).
 Ideally, we could send the call timeout to the server as well: it would then be 
 able to dismiss on its own the calls that it received but got stuck in the 
 request queue or in the internal retries (on hdfs for example).
 This will make the system more reactive to failure.
 I think we can solve this now, especially after 10525. The main issue is to 
 find something that fits well with protobuf...
 Then it should be easy to have a pool of threads for writers and readers, 
 instead of a single thread per region server as today.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10597) IOEngine#read() should return the number of bytes transferred

2014-02-24 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10597:
---

Attachment: 10597-v3.txt

Thanks for the confirmation, Andy.

Here is patch v3.

 IOEngine#read() should return the number of bytes transferred
 -

 Key: HBASE-10597
 URL: https://issues.apache.org/jira/browse/HBASE-10597
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 10597-v1.txt, 10597-v2.txt, 10597-v3.txt


 IOEngine#read() is called by BucketCache#getBlock().
 IOEngine#read() should return the number of bytes transferred so that 
 BucketCache#getBlock() can check this return value against the length 
 obtained from bucketEntry.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10597) IOEngine#read() should return the number of bytes transferred

2014-02-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910624#comment-13910624
 ] 

Andrew Purtell commented on HBASE-10597:


+1 on v3 if HadoopQA is happy

 IOEngine#read() should return the number of bytes transferred
 -

 Key: HBASE-10597
 URL: https://issues.apache.org/jira/browse/HBASE-10597
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 10597-v1.txt, 10597-v2.txt, 10597-v3.txt


 IOEngine#read() is called by BucketCache#getBlock().
 IOEngine#read() should return the number of bytes transferred so that 
 BucketCache#getBlock() can check this return value against the length 
 obtained from bucketEntry.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10591) Sanity check table configuration in createTable

2014-02-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910635#comment-13910635
 ] 

Andrew Purtell commented on HBASE-10591:


Funny, on HBASE-10571 I suggested the JIRA be re-scoped to basically this. Can 
we also add TTL checks here and close HBASE-10571 as a dup?

 Sanity check table configuration in createTable
 ---

 Key: HBASE-10591
 URL: https://issues.apache.org/jira/browse/HBASE-10591
 Project: HBase
  Issue Type: Improvement
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.99.0

 Attachments: hbase-10591_v1.patch, hbase-10591_v2.patch


 We had a cluster become completely inoperable because a couple of tables were 
 erroneously created with MAX_FILESIZE set to 4K, which resulted in 180K 
 regions in a short interval and brought the master down due to HBASE-4246.
 We can do some sanity checking in master.createTable() and reject such 
 requests. We already check the compression there, so it seems a good place. 
 Alter table should also check for this as well.
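
 (A minimal sketch of the kind of check described above, with a made-up 
 threshold and error message rather than the actual patch: reject descriptors 
 whose MAX_FILESIZE is so small that the table would split into an enormous 
 number of regions.)
 {code}
 import org.apache.hadoop.hbase.DoNotRetryIOException;
 import org.apache.hadoop.hbase.HTableDescriptor;

 class TableSanityChecker {
   // Assumed lower bound; a real check would read the limit from configuration.
   private static final long MIN_MAX_FILESIZE = 2L * 1024 * 1024;

   static void sanityCheck(HTableDescriptor htd) throws DoNotRetryIOException {
     long maxFileSize = htd.getMaxFileSize(); // -1 means "not set", so skip the check then
     if (maxFileSize > 0 && maxFileSize < MIN_MAX_FILESIZE) {
       throw new DoNotRetryIOException("MAX_FILESIZE " + maxFileSize
           + " is too small; such aggressive splitting can overwhelm the master");
     }
   }
 }
 {code}
 Running the same check from modifyTable would cover alter table as well.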



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10591) Sanity check table configuration in createTable

2014-02-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910638#comment-13910638
 ] 

Andrew Purtell commented on HBASE-10591:


I also think sanity checking needs to be done for table schema modifications as 
well as the initial create.

 Sanity check table configuration in createTable
 ---

 Key: HBASE-10591
 URL: https://issues.apache.org/jira/browse/HBASE-10591
 Project: HBase
  Issue Type: Improvement
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.99.0

 Attachments: hbase-10591_v1.patch, hbase-10591_v2.patch


 We had a cluster become completely inoperable because a couple of tables were 
 erroneously created with MAX_FILESIZE set to 4K, which resulted in 180K 
 regions in a short interval and brought the master down due to HBASE-4246.
 We can do some sanity checking in master.createTable() and reject such 
 requests. We already check the compression there, so it seems a good place. 
 Alter table should also check for this as well.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10600) HTable#batch() should perform validation on empty Put

2014-02-24 Thread Ted Yu (JIRA)
Ted Yu created HBASE-10600:
--

 Summary: HTable#batch() should perform validation on empty Put
 Key: HBASE-10600
 URL: https://issues.apache.org/jira/browse/HBASE-10600
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


Raised by java8964 in this thread:
http://osdir.com/ml/general/2014-02/msg44384.html

When an empty Put is passed in the List to HTable#batch(), there is no 
validation performed, whereas an IllegalArgumentException would have been thrown 
if this empty Put had been passed to the simple Put API call.

Validation on empty Put should be carried out in HTable#batch().
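
(A minimal sketch of that validation, independent of the actual HTable#batch() 
internals: walk the action list and reject empty Puts up front, mirroring what 
the single-Put path does.)

{code}
import java.util.List;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Row;

class BatchValidation {
  static void validatePuts(List<? extends Row> actions) {
    for (Row action : actions) {
      if (action instanceof Put && ((Put) action).isEmpty()) {
        throw new IllegalArgumentException("No columns to insert for " + action);
      }
    }
  }
}
{code}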



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10355) Failover RPC's from client using region replicas

2014-02-24 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910643#comment-13910643
 ] 

Nicolas Liochon commented on HBASE-10355:
-

bq. We should either document this very well, or auto-enable interrupts if this 
jira is used. 
It's not easy to do that, because the RpcClient does not really know about the 
replicas.
Something that we could do, however, is a single check in HTable: if we have a 
get with Consistency != Strong, we check the value of allowsInterrupt. If it is 
false, we log a warning message. The other option would be to throw an 
IllegalStateException, if we want to say that we support this option only with 
replicas (and it would make sense).
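
(A minimal sketch of that single check, with hypothetical parameter names rather 
than the actual HTable code: warn, or optionally fail fast, when a non-strong 
get is issued while interruptible RPCs are disabled.)

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

class ReplicaGetCheck {
  private static final Log LOG = LogFactory.getLog(ReplicaGetCheck.class);

  /**
   * strongConsistency and allowsInterrupt are stand-ins for the Get's consistency
   * setting and the client's interrupt configuration.
   */
  static void checkInterruptsForReplicaRead(boolean strongConsistency, boolean allowsInterrupt) {
    if (!strongConsistency && !allowsInterrupt) {
      // Or: throw new IllegalStateException(...) to make the requirement hard.
      LOG.warn("Get with non-strong consistency but client interrupts are disabled; "
          + "replica failover may be slow");
    }
  }
}
{code}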

 Failover RPC's from client using region replicas
 

 Key: HBASE-10355
 URL: https://issues.apache.org/jira/browse/HBASE-10355
 Project: HBase
  Issue Type: Sub-task
  Components: Client
Reporter: Enis Soztutar
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10355.v1.patch, 10355.v2.patch, 10355.v3.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10591) Sanity check table configuration in createTable

2014-02-24 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910652#comment-13910652
 ] 

Enis Soztutar commented on HBASE-10591:
---

bq. As an example, I have a table with MAX_FILESIZE = '1638400' (less than 
2MB). This region handles VERY small keys/values. Value is 1 byte. Keys are less 
than 32 bytes. But I want this table to be spread over all my servers. So I 
have to put a small MAX_FILESIZE value.
Let me make it a per-table configuration. 
bq. Can we also add TTL checks here and close HBASE-10571 as a dup?
Makes sense. 
bq. I also think sanity checking needs to be done for table schema 
modifications as well as the initial create.
The patch does the checks in modify table as well. 

 Sanity check table configuration in createTable
 ---

 Key: HBASE-10591
 URL: https://issues.apache.org/jira/browse/HBASE-10591
 Project: HBase
  Issue Type: Improvement
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.99.0

 Attachments: hbase-10591_v1.patch, hbase-10591_v2.patch


 We had a cluster become completely inoperable because a couple of tables were 
 erroneously created with MAX_FILESIZE set to 4K, which resulted in 180K 
 regions in a short interval and brought the master down due to HBASE-4246.
 We can do some sanity checking in master.createTable() and reject such 
 requests. We already check the compression there, so it seems a good place. 
 Alter table should also check for this as well.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10598) Written data can not be read out because MemStore#timeRangeTracker might be updated concurrently

2014-02-24 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910669#comment-13910669
 ] 

Enis Soztutar commented on HBASE-10598:
---

Nice finding! Can we do this with two AtomicLongs with compare and set? 
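
(A minimal sketch of that compare-and-set idea, not the actual TimeRangeTracker 
patch: each bound lives in an AtomicLong and is only moved in the right 
direction, so concurrent includeTimestamp() calls cannot lower the maximum or 
raise the minimum.)

{code}
import java.util.concurrent.atomic.AtomicLong;

class ConcurrentTimeRange {
  private final AtomicLong minimumTimestamp = new AtomicLong(Long.MAX_VALUE);
  private final AtomicLong maximumTimestamp = new AtomicLong(Long.MIN_VALUE);

  void includeTimestamp(final long timestamp) {
    long curMax;
    while ((curMax = maximumTimestamp.get()) < timestamp) {
      if (maximumTimestamp.compareAndSet(curMax, timestamp)) {
        break; // CAS succeeded; a failed CAS re-reads and retries
      }
    }
    long curMin;
    while ((curMin = minimumTimestamp.get()) > timestamp) {
      if (minimumTimestamp.compareAndSet(curMin, timestamp)) {
        break;
      }
    }
  }
}
{code}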

 Written data can not be read out because MemStore#timeRangeTracker might be 
 updated concurrently
 

 Key: HBASE-10598
 URL: https://issues.apache.org/jira/browse/HBASE-10598
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.16
Reporter: cuijianwei
 Attachments: HBASE-10598-0.94.v1.patch


 In our test environment, we find that written data occasionally can't be read 
 out. After debugging, we find that the maximumTimestamp/minimumTimestamp of 
 MemStore#timeRangeTracker might decrease/increase when 
 MemStore#timeRangeTracker is updated concurrently, which might make the 
 MemStore/StoreFile be filtered out incorrectly when reading data. Let's 
 see how the concurrent updating of timeRangeTracker#maximumTimestamp causes 
 this problem. 
 Imagine there are two threads T1 and T2 putting two KeyValues kv1 and kv2. 
 kv1 and kv2 belong to the same Store (so belong to the same region), but 
 contain different rowkeys. Consequently, kv1 and kv2 could be updated 
 concurrently. Looking at the implementation of HRegionServer#multi, kv1 and 
 kv2 will be added to the MemStore by HRegion#applyFamilyMapToMemstore in 
 HRegion#doMiniBatchMutation. Then, MemStore#internalAdd will be invoked and 
 MemStore#timeRangeTracker will be updated by 
 TimeRangeTracker#includeTimestamp as follows:
 {code}
   private void includeTimestamp(final long timestamp) {
     ...
     else if (maximumTimestamp < timestamp) {
       maximumTimestamp = timestamp;
     }
     return;
   }
 {code}
 Imagine the current maximumTimestamp of TimeRangeTracker is t0 before 
 includeTimestamp(...) is invoked, kv1.timestamp=t1, kv2.timestamp=t2, t1 and t2 
 are both set by the user (so the user knows the timestamps of kv1 and kv2), and 
 t1 > t2 > t0. T1 and T2 are executed concurrently; therefore, the two 
 threads might both find the current maximumTimestamp is less than the 
 timestamp of its kv. After that, T1 and T2 will both set maximumTimestamp to 
 the timestamp of its kv. If T1 sets maximumTimestamp before T2 does, the 
 maximumTimestamp ends up set to t2. Then, before any new update with a bigger 
 timestamp has been applied to the MemStore, if we try to read kv1 back by 
 HTable#get with the timestamp of the 'Get' set to t1, the StoreScanner will 
 decide whether the MemStoreScanner (imagining kv1 has not been flushed) should 
 be selected as a candidate scanner via MemStoreScanner#shouldUseScanner. The 
 MemStore won't be selected in MemStoreScanner#shouldUseScanner because the 
 maximumTimestamp of the MemStore has been set to t2 (t2 < t1). Consequently, 
 the written kv1 can't be read out and kv1 is lost from the user's perspective.
 If the above analysis is right, after the maximumTimestamp of 
 MemStore#timeRangeTracker has been set to t2, the user will experience data 
 loss in the following situations:
 1. Before any new write with kv.timestamp > t1 has been added to the MemStore, 
 a read request of kv1 with timestamp=t1 can not read kv1 out.
 2. Before any new write with kv.timestamp > t1 has been added to the MemStore, 
 if a flush happens, the data of the MemStore will be flushed to a StoreFile 
 with StoreFile#maximumTimestamp set to t2. After that, any read request with 
 timestamp=t1 can not read kv1 before the next compaction (actually, 
 kv1.timestamp might not be included in the timeRange of the StoreFile even 
 after compaction).
 The second situation is much more serious because the incorrect timeRange of 
 the MemStore has been persisted to the file. 
 Similarly, the concurrent update of TimeRangeTracker#minimumTimestamp may 
 also cause this problem.
 As a simple way to fix the problem, we could add synchronized to 
 TimeRangeTracker#includeTimestamp so that this method won't be invoked 
 concurrently.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10600) HTable#batch() should perform validation on empty Put

2014-02-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910680#comment-13910680
 ] 

Andrew Purtell commented on HBASE-10600:


Do you have a patch for this Ted?

 HTable#batch() should perform validation on empty Put
 -

 Key: HBASE-10600
 URL: https://issues.apache.org/jira/browse/HBASE-10600
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu

 Raised by java8964 in this thread:
 http://osdir.com/ml/general/2014-02/msg44384.html
 When an empty Put is passed in the List to HTable#batch(), there is no 
 validation performed, whereas an IllegalArgumentException would have been thrown 
 if this empty Put had been passed to the simple Put API call.
 Validation on empty Put should be carried out in HTable#batch().



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10355) Failover RPC's from client using region replicas

2014-02-24 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910688#comment-13910688
 ] 

Devaraj Das commented on HBASE-10355:
-

bq. Something that we could do however, is to do a single check in HTable: if 
we have a get with Consistency != Strong,
[~nkeywal], wondering if it is possible to set the configuration to have 
interrupts enabled from the HTable layer and pass it down to the RPC layer.

 Failover RPC's from client using region replicas
 

 Key: HBASE-10355
 URL: https://issues.apache.org/jira/browse/HBASE-10355
 Project: HBase
  Issue Type: Sub-task
  Components: Client
Reporter: Enis Soztutar
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10355.v1.patch, 10355.v2.patch, 10355.v3.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10595) HBaseAdmin.getTableDescriptor can wrongly get the previous table's TableDescriptor even after the table dir in hdfs is removed

2014-02-24 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910694#comment-13910694
 ] 

Enis Soztutar commented on HBASE-10595:
---

Going to the NN to check whether the table dir exists basically means that we 
should not be using the cache at all. Users are expected not to delete the 
table directory from the file system, as that will cause further 
inconsistencies. Why do you think this is a problem? 

 HBaseAdmin.getTableDescriptor can wrongly get the previous table's 
 TableDescriptor even after the table dir in hdfs is removed
 --

 Key: HBASE-10595
 URL: https://issues.apache.org/jira/browse/HBASE-10595
 Project: HBase
  Issue Type: Bug
  Components: master, util
Reporter: Feng Honghua
Assignee: Feng Honghua
 Attachments: HBASE-10595-trunk_v1.patch, HBASE-10595-trunk_v2.patch, 
 HBASE-10595-trunk_v3.patch


 When a table dir (in HDFS) is removed externally, HMaster will still return 
 the cached TableDescriptor to the client for a getTableDescriptor request.
 On the contrary, HBaseAdmin.listTables() is handled correctly in the current 
 implementation: for a table whose table dir in HDFS was removed externally, 
 getTableDescriptor can still retrieve a valid (old) table descriptor, 
 while listTables says the table doesn't exist. This is inconsistent.
 The reason for this bug is that HMaster (via FSTableDescriptors) doesn't 
 check whether the table dir exists for a getTableDescriptor() request (while it 
 lists all existing table dirs, not first consulting the cache, and answers 
 listTables() requests accordingly).
 When a table is deleted via deleteTable, the cache is cleared after the 
 table dir and tableInfo file are removed, so the listTables/getTableDescriptor 
 inconsistency should be transient (though it still exists when the table dir is 
 removed while the cache is not yet cleared) and harder to expose.
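
 (A minimal sketch of the check the description proposes, with made-up field 
 and class names rather than the real FSTableDescriptors internals: confirm the 
 table directory still exists before trusting the cache.)
 {code}
 import java.io.IOException;
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.hbase.HTableDescriptor;
 import org.apache.hadoop.hbase.TableNotFoundException;

 class CachedTableDescriptors {
   private final FileSystem fs;
   private final Path rootDir;
   private final Map<String, HTableDescriptor> cache =
       new ConcurrentHashMap<String, HTableDescriptor>();

   CachedTableDescriptors(FileSystem fs, Path rootDir) {
     this.fs = fs;
     this.rootDir = rootDir;
   }

   HTableDescriptor get(String tableName) throws IOException {
     Path tableDir = new Path(rootDir, tableName); // simplified layout, for illustration only
     if (!fs.exists(tableDir)) {
       cache.remove(tableName); // drop the stale entry
       throw new TableNotFoundException(tableName);
     }
     return cache.get(tableName); // real code would (re)load from the tableinfo file
   }
 }
 {code}
 Whether such a check is worth an extra NN round trip per call is exactly the 
 trade-off debated in the comments above.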



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10600) HTable#batch() should perform validation on empty Put

2014-02-24 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10600:
---

Attachment: 10600-v1.txt

Here is patch v1.

 HTable#batch() should perform validation on empty Put
 -

 Key: HBASE-10600
 URL: https://issues.apache.org/jira/browse/HBASE-10600
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
 Attachments: 10600-v1.txt


 Raised by java8964 in this thread:
 http://osdir.com/ml/general/2014-02/msg44384.html
 When an empty Put is passed in the List to HTable#batch(), there is no 
 validation performed, whereas an IllegalArgumentException would have been thrown 
 if this empty Put had been passed to the simple Put API call.
 Validation on empty Put should be carried out in HTable#batch().



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10590) Update contents about tracing in the Reference Guide

2014-02-24 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HBASE-10590:
-

Attachment: HBASE-10590-0.patch

Attaching a patch including fixes such as:
* moved contents to a newly created file,
* fixed formatting of the XML,
* added usage of the trace command in the HBase shell.

 Update contents about tracing in the Reference Guide
 

 Key: HBASE-10590
 URL: https://issues.apache.org/jira/browse/HBASE-10590
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Masatake Iwasaki
Priority: Minor
 Attachments: HBASE-10590-0.patch


 Adding explanation about client-side settings and the shell command for tracing.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10355) Failover RPC's from client using region replicas

2014-02-24 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910708#comment-13910708
 ] 

Nicolas Liochon commented on HBASE-10355:
-

The connection is shared between the tables, so you don't really know: if the 
first get is on a table that doesn't have replicas, then the connection will be 
w/o the separate writer. HTable knows very little about replicas today. It only 
sees something when it receives a get with consistency != strong.

Note that HBASE-10566 is about being able to have a single path (once the 
socket timeout is out of the way, we can have a thread pool for the readers and 
the writers).

 Failover RPC's from client using region replicas
 

 Key: HBASE-10355
 URL: https://issues.apache.org/jira/browse/HBASE-10355
 Project: HBase
  Issue Type: Sub-task
  Components: Client
Reporter: Enis Soztutar
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10355.v1.patch, 10355.v2.patch, 10355.v3.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10590) Update contents about tracing in the Reference Guide

2014-02-24 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HBASE-10590:
-

Labels: documentaion  (was: )
Status: Patch Available  (was: Open)

 Update contents about tracing in the Reference Guide
 

 Key: HBASE-10590
 URL: https://issues.apache.org/jira/browse/HBASE-10590
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Masatake Iwasaki
Priority: Minor
  Labels: documentaion
 Attachments: HBASE-10590-0.patch


 Adding explanation about client-side settings and the shell command for tracing.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10587) Master metrics clusterRequests is wrong

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910726#comment-13910726
 ] 

Hudson commented on HBASE-10587:


SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #168 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/168/])
HBASE-10587 Master metrics clusterRequests is wrong (jxiang: rev 1571357)
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterMetrics.java


 Master metrics clusterRequests is wrong
 ---

 Key: HBASE-10587
 URL: https://issues.apache.org/jira/browse/HBASE-10587
 Project: HBase
  Issue Type: Bug
  Components: master, metrics
Affects Versions: 0.96.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0

 Attachments: hbase-10587.patch


 In the master JMX, the clusterRequests metric increases too fast. Looking into 
 the code, the calculation is a little bit wrong. It's a counter; however, for 
 each region server report, the region server's total number of requests is 
 added to clusterRequests, so the same requests are counted multiple times.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10566) cleanup rpcTimeout in the client

2014-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910746#comment-13910746
 ] 

Hadoop QA commented on HBASE-10566:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12630733/10566.v2.patch
  against trunk revision .
  ATTACHMENT ID: 12630733

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 3 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8787//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8787//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8787//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8787//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8787//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8787//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8787//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8787//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8787//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8787//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8787//console

This message is automatically generated.

 cleanup rpcTimeout in the client
 

 Key: HBASE-10566
 URL: https://issues.apache.org/jira/browse/HBASE-10566
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.99.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10566.sample.patch, 10566.v1.patch, 10566.v2.patch


 There are two issues:
 1) A confusion between the socket timeout and the call timeout.
 Socket timeouts should be minimal: a default like 20 seconds, which could be 
 lowered to single-digit timeouts for some apps: if we cannot write to the 
 socket in 10 seconds, we have an issue. This is different from the total 
 duration (send query + do query + receive query), which can be longer, as it 
 can include remote calls on the server and so on. Today we have a single 
 value, and it does not allow us to have low socket read timeouts.
 2) The timeout can be different between the calls. Typically, if the total 
 time, retries included, is 60 seconds and the first attempt failed after 2 
 seconds, then the remaining budget is 58s. HBase does this today, but by 
 hacking with a thread-local storage variable. It's a hack (it should have been 
 a parameter of the methods; the TLS allowed bypassing all the layers. Maybe 
 protobuf makes this complicated, to be confirmed), and it does not really work 
 either, because we can have multithreading issues (we use the updated rpc 
 timeout of someone else, or we create a new BlockingRpcChannelImplementation 
 with a random default timeout).
 Ideally, we could send the call timeout to the server as well: it would then be 
 able to dismiss on its own the calls that it received but got stuck in the 
 request queue or in the internal retries (on hdfs for example).
 This will make the system more reactive to 

[jira] [Commented] (HBASE-10597) IOEngine#read() should return the number of bytes transferred

2014-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910754#comment-13910754
 ] 

Hadoop QA commented on HBASE-10597:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12630734/10597-v3.txt
  against trunk revision .
  ATTACHMENT ID: 12630734

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8788//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8788//console

This message is automatically generated.

 IOEngine#read() should return the number of bytes transferred
 -

 Key: HBASE-10597
 URL: https://issues.apache.org/jira/browse/HBASE-10597
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 10597-v1.txt, 10597-v2.txt, 10597-v3.txt


 IOEngine#read() is called by BucketCache#getBlock().
 IOEngine#read() should return the number of bytes transferred so that 
 BucketCache#getBlock() can check this return value against the length 
 obtained from bucketEntry.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10587) Master metrics clusterRequests is wrong

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910769#comment-13910769
 ] 

Hudson commented on HBASE-10587:


SUCCESS: Integrated in hbase-0.96 #311 (See 
[https://builds.apache.org/job/hbase-0.96/311/])
HBASE-10587 Master metrics clusterRequests is wrong (jxiang: rev 1571358)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterMetrics.java


 Master metrics clusterRequests is wrong
 ---

 Key: HBASE-10587
 URL: https://issues.apache.org/jira/browse/HBASE-10587
 Project: HBase
  Issue Type: Bug
  Components: master, metrics
Affects Versions: 0.96.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0

 Attachments: hbase-10587.patch


 In the master JMX, the clusterRequests metric increases too fast. Looking into 
 the code, the calculation is a little bit wrong. It's a counter; however, for 
 each region server report, the region server's total number of requests is 
 added to clusterRequests, so the same requests are counted multiple times.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10587) Master metrics clusterRequests is wrong

2014-02-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910809#comment-13910809
 ] 

Hudson commented on HBASE-10587:


FAILURE: Integrated in HBase-TRUNK #4948 (See 
[https://builds.apache.org/job/HBase-TRUNK/4948/])
HBASE-10587 Master metrics clusterRequests is wrong (jxiang: rev 1571354)
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterMetrics.java


 Master metrics clusterRequests is wrong
 ---

 Key: HBASE-10587
 URL: https://issues.apache.org/jira/browse/HBASE-10587
 Project: HBase
  Issue Type: Bug
  Components: master, metrics
Affects Versions: 0.96.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0

 Attachments: hbase-10587.patch


 In the master JMX, the clusterRequests metric increases too fast. Looking into 
 the code, the calculation is a little bit wrong. It's a counter; however, for 
 each region server report, the region server's total number of requests is 
 added to clusterRequests, so the same requests are counted multiple times.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10597) IOEngine#read() should return the number of bytes transferred

2014-02-24 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10597:
---

Fix Version/s: 0.99.0
   0.98.1

 IOEngine#read() should return the number of bytes transferred
 -

 Key: HBASE-10597
 URL: https://issues.apache.org/jira/browse/HBASE-10597
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.98.1, 0.99.0

 Attachments: 10597-v1.txt, 10597-v2.txt, 10597-v3.txt


 IOEngine#read() is called by BucketCache#getBlock().
 IOEngine#read() should return the number of bytes transferred so that 
 BucketCache#getBlock() can check this return value against the length 
 obtained from bucketEntry.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10597) IOEngine#read() should return the number of bytes transferred

2014-02-24 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10597:
---

Hadoop Flags: Reviewed

 IOEngine#read() should return the number of bytes transferred
 -

 Key: HBASE-10597
 URL: https://issues.apache.org/jira/browse/HBASE-10597
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.98.1, 0.99.0

 Attachments: 10597-v1.txt, 10597-v2.txt, 10597-v3.txt


 IOEngine#read() is called by BucketCache#getBlock().
 IOEngine#read() should return the number of bytes transferred so that 
 BucketCache#getBlock() can check this return value against the length 
 obtained from bucketEntry.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10601) Upgrade hadoop to 2.3.0 release

2014-02-24 Thread Ted Yu (JIRA)
Ted Yu created HBASE-10601:
--

 Summary: Upgrade hadoop to 2.3.0 release
 Key: HBASE-10601
 URL: https://issues.apache.org/jira/browse/HBASE-10601
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.99.0


Apache Hadoop 2.3.0 has been released.

This issue is to upgrade hadoop dependency to 2.3.0



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10601) Upgrade hadoop to 2.3.0 release

2014-02-24 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10601:
---

Status: Patch Available  (was: Open)

 Upgrade hadoop to 2.3.0 release
 ---

 Key: HBASE-10601
 URL: https://issues.apache.org/jira/browse/HBASE-10601
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.99.0

 Attachments: 10601-v1.txt


 Apache Hadoop 2.3.0 has been released.
 This issue is to upgrade hadoop dependency to 2.3.0



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10601) Upgrade hadoop to 2.3.0 release

2014-02-24 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10601:
---

Attachment: 10601-v1.txt

 Upgrade hadoop to 2.3.0 release
 ---

 Key: HBASE-10601
 URL: https://issues.apache.org/jira/browse/HBASE-10601
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.99.0

 Attachments: 10601-v1.txt


 Apache Hadoop 2.3.0 has been released.
 This issue is to upgrade hadoop dependency to 2.3.0



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10590) Update contents about tracing in the Reference Guide

2014-02-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910868#comment-13910868
 ] 

Hadoop QA commented on HBASE-10590:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12630779/HBASE-10590-0.patch
  against trunk revision .
  ATTACHMENT ID: 12630779

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8789//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8789//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8789//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8789//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8789//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8789//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8789//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8789//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8789//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8789//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8789//console

This message is automatically generated.

 Update contents about tracing in the Reference Guide
 

 Key: HBASE-10590
 URL: https://issues.apache.org/jira/browse/HBASE-10590
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Masatake Iwasaki
Priority: Minor
  Labels: documentaion
 Attachments: HBASE-10590-0.patch


 Adding explanation about client side settings and shell command for tracing.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10590) Update contents about tracing in the Reference Guide

2014-02-24 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10590:
--

   Resolution: Fixed
Fix Version/s: 0.99.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk.  Thank you for the excellent addition to our doc.

 Update contents about tracing in the Reference Guide
 

 Key: HBASE-10590
 URL: https://issues.apache.org/jira/browse/HBASE-10590
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
  Labels: documentaion
 Fix For: 0.99.0

 Attachments: HBASE-10590-0.patch


 Adding explanation about client side settings and shell command for tracing.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Assigned] (HBASE-10590) Update contents about tracing in the Reference Guide

2014-02-24 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack reassigned HBASE-10590:
-

Assignee: Masatake Iwasaki

Made you a contributor Masatake.

 Update contents about tracing in the Reference Guide
 

 Key: HBASE-10590
 URL: https://issues.apache.org/jira/browse/HBASE-10590
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
  Labels: documentaion
 Fix For: 0.99.0

 Attachments: HBASE-10590-0.patch


 Adding explanation about client side settings and shell command for tracing.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10600) HTable#batch() should perform validation on empty Put

2014-02-24 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10600:
---

Attachment: 10600-v2.txt

Patch v2 adds a test.

 HTable#batch() should perform validation on empty Put
 -

 Key: HBASE-10600
 URL: https://issues.apache.org/jira/browse/HBASE-10600
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
 Attachments: 10600-v1.txt, 10600-v2.txt


 Raised by java8964 in this thread:
 http://osdir.com/ml/general/2014-02/msg44384.html
 When an empty Put is passed in the List given to HTable#batch(), no 
 validation is performed, whereas an IllegalArgumentException would have been 
 thrown if the same empty Put had been used in the simple Put API call.
 Validation of empty Puts should be carried out in HTable#batch() as well.
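
A small sketch of the batch-side check being proposed, assuming it mirrors the 
single-Put path (which rejects a Put carrying no cells); the class and method 
names here are illustrative, not the actual patch:

{code}
import java.util.List;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Row;

final class BatchValidationSketch {
  // Mutation#isEmpty() is true when the Put carries no cells; the simple
  // put() call already rejects that case with an IllegalArgumentException.
  static void validateActions(List<? extends Row> actions) {
    for (Row action : actions) {
      if (action instanceof Put && ((Put) action).isEmpty()) {
        throw new IllegalArgumentException("No columns to insert for an empty Put");
      }
    }
  }
}
{code}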



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10600) HTable#batch() should perform validation on empty Put

2014-02-24 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10600:
---

Assignee: Ted Yu
  Status: Patch Available  (was: Open)

 HTable#batch() should perform validation on empty Put
 -

 Key: HBASE-10600
 URL: https://issues.apache.org/jira/browse/HBASE-10600
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 10600-v1.txt, 10600-v2.txt


 Raised by java8964 in this thread:
 http://osdir.com/ml/general/2014-02/msg44384.html
 When an empty Put is passed in the List given to HTable#batch(), no 
 validation is performed, whereas an IllegalArgumentException would have been 
 thrown if the same empty Put had been used in the simple Put API call.
 Validation of empty Puts should be carried out in HTable#batch() as well.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10601) Upgrade hadoop to 2.3.0 release

2014-02-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910914#comment-13910914
 ] 

Lars Hofhansl commented on HBASE-10601:
---

Should we rather have 2.2 and 2.3 as an option, and default to 2.3?

 Upgrade hadoop to 2.3.0 release
 ---

 Key: HBASE-10601
 URL: https://issues.apache.org/jira/browse/HBASE-10601
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.99.0

 Attachments: 10601-v1.txt


 Apache Hadoop 2.3.0 has been released.
 This issue is to upgrade hadoop dependency to 2.3.0



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10601) Upgrade hadoop to 2.3.0 release

2014-02-24 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-10601:
--

Attachment: 10601-0.94.txt

Here's what I had in mind for 0.94. I was planning to do that in a separate 
jira, but might as well do it here. Note the change for protobuf specific to 
the Hadoop version.

 Upgrade hadoop to 2.3.0 release
 ---

 Key: HBASE-10601
 URL: https://issues.apache.org/jira/browse/HBASE-10601
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.99.0

 Attachments: 10601-0.94.txt, 10601-v1.txt


 Apache Hadoop 2.3.0 has been released.
 This issue is to upgrade hadoop dependency to 2.3.0



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10600) HTable#batch() should perform validation on empty Put

2014-02-24 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10600:
--

Priority: Trivial  (was: Major)

 HTable#batch() should perform validation on empty Put
 -

 Key: HBASE-10600
 URL: https://issues.apache.org/jira/browse/HBASE-10600
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Trivial
 Attachments: 10600-v1.txt, 10600-v2.txt


 Raised by java8964 in this thread:
 http://osdir.com/ml/general/2014-02/msg44384.html
 When an empty Put is passed in the List given to HTable#batch(), no 
 validation is performed, whereas an IllegalArgumentException would have been 
 thrown if the same empty Put had been used in the simple Put API call.
 Validation of empty Puts should be carried out in HTable#batch() as well.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10600) HTable#batch() should perform validation on empty Put

2014-02-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910917#comment-13910917
 ] 

stack commented on HBASE-10600:
---

Why would we allow an empty Put in the first place?

 HTable#batch() should perform validation on empty Put
 -

 Key: HBASE-10600
 URL: https://issues.apache.org/jira/browse/HBASE-10600
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 10600-v1.txt, 10600-v2.txt


 Raised by java8964 in this thread:
 http://osdir.com/ml/general/2014-02/msg44384.html
 When an empty Put is passed in the List given to HTable#batch(), no 
 validation is performed, whereas an IllegalArgumentException would have been 
 thrown if the same empty Put had been used in the simple Put API call.
 Validation of empty Puts should be carried out in HTable#batch() as well.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10601) Upgrade hadoop to 2.3.0 release

2014-02-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910919#comment-13910919
 ] 

Ted Yu commented on HBASE-10601:


Patch v1 is for trunk (0.99).

For 0.94, I am fine with Lars' patch.

 Upgrade hadoop to 2.3.0 release
 ---

 Key: HBASE-10601
 URL: https://issues.apache.org/jira/browse/HBASE-10601
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.99.0

 Attachments: 10601-0.94.txt, 10601-v1.txt


 Apache Hadoop 2.3.0 has been released.
 This issue is to upgrade hadoop dependency to 2.3.0



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10590) Update contents about tracing in the Reference Guide

2014-02-24 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910924#comment-13910924
 ] 

Masatake Iwasaki commented on HBASE-10590:
--

Thanks [~stack]!

 Update contents about tracing in the Reference Guide
 

 Key: HBASE-10590
 URL: https://issues.apache.org/jira/browse/HBASE-10590
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
  Labels: documentaion
 Fix For: 0.99.0

 Attachments: HBASE-10590-0.patch


 Adding explanation about client side settings and shell command for tracing.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10601) Upgrade hadoop to 2.3.0 release

2014-02-24 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10601:
---

Status: Open  (was: Patch Available)

 Upgrade hadoop to 2.3.0 release
 ---

 Key: HBASE-10601
 URL: https://issues.apache.org/jira/browse/HBASE-10601
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.99.0

 Attachments: 10601-0.94.txt, 10601-v1.txt


 Apache Hadoop 2.3.0 has been released.
 This issue is to upgrade hadoop dependency to 2.3.0



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10566) cleanup rpcTimeout in the client

2014-02-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910954#comment-13910954
 ] 

stack commented on HBASE-10566:
---

Fix javadoc warning on commit?

This is a great comment: We're spending a lot of time wrapping the 
exceptions, and then unwrapping them to discover what really happened. File 
an issue for this one when you get a chance.

Patch looks great to me.  Commit.



 cleanup rpcTimeout in the client
 

 Key: HBASE-10566
 URL: https://issues.apache.org/jira/browse/HBASE-10566
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.99.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
 Fix For: 0.99.0

 Attachments: 10566.sample.patch, 10566.v1.patch, 10566.v2.patch


 There are two issues:
 1) A confusion between the socket timeout and the call timeout.
 Socket timeouts should be minimal: a default like 20 seconds, which could be 
 lowered to single-digit timeouts for some apps: if we cannot write to the 
 socket in 10 seconds, we have an issue. This is different from the total 
 duration (send query + do query + receive query), which can be longer, as it 
 can include remote calls on the server and so on. Today we have a single 
 value, so it does not allow us to have low socket read timeouts.
 2) The timeout can be different between calls. Typically, if the total time, 
 retries included, is 60 seconds but the first attempt failed after 2 seconds, 
 then the remaining budget is 58s. HBase does this today, but by hacking with 
 a thread-local storage variable. It's a hack (it should have been a parameter 
 of the methods; the TLS allowed bypassing all the layers. Maybe protobuf 
 makes this complicated, to be confirmed), but it also does not really work, 
 because we can have multithreading issues (we use the updated rpc timeout of 
 someone else, or we create a new BlockingRpcChannelImplementation with a 
 random default timeout).
 Ideally, we could send the call timeout to the server as well: it would then 
 be able to dismiss on its own the calls that it received but that got stuck 
 in the request queue or in internal retries (on hdfs for example).
 This will make the system more reactive to failure.
 I think we can solve this now, especially after 10525. The main issue is to 
 find something that fits well with protobuf...
 Then it should be easy to have a pool of threads for writers and readers, 
 instead of a single thread per region server as today.
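
As a rough illustration of point 2), a per-operation deadline passed along 
with each call (instead of a thread-local rpc timeout) could carry the 
remaining budget across retries; the sketch below is hypothetical, not the 
actual patch:

{code}
import java.util.concurrent.TimeUnit;

class CallDeadlineSketch {
  private final long deadlineNanos;

  CallDeadlineSketch(long totalOperationTimeoutMs) {
    this.deadlineNanos = System.nanoTime()
        + TimeUnit.MILLISECONDS.toNanos(totalOperationTimeoutMs);
  }

  // Remaining budget for the next attempt, e.g. 58s left of a 60s operation
  // after the first try failed 2s in; handed to each call explicitly rather
  // than stashed in thread-local storage.
  int remainingCallTimeoutMs() {
    long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadlineNanos - System.nanoTime());
    if (remainingMs <= 0) {
      throw new RuntimeException("operation timeout reached before the next attempt");
    }
    return (int) Math.min(Integer.MAX_VALUE, remainingMs);
  }
}
{code}

Each attempt could then take its socket timeout from configuration and its 
call timeout from the remaining budget, keeping the two values independent.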



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10451) Enable back Tag compression on HFiles

2014-02-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910956#comment-13910956
 ] 

Ted Yu commented on HBASE-10451:


{code}
-TagCompressionContext context = new TagCompressionContext(LRUDictionary.class);
+TagCompressionContext context = new TagCompressionContext(LRUDictionary.class, Byte.MAX_VALUE);
{code}
There are some calls to TagCompressionContext ctor where Short.MAX_VALUE is 
used.
Is the different capacity intended?

 Enable back Tag compression on HFiles
 -

 Key: HBASE-10451
 URL: https://issues.apache.org/jira/browse/HBASE-10451
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Critical
 Fix For: 0.98.1, 0.99.0

 Attachments: HBASE-10451.patch, HBASE-10451_V2.patch, 
 HBASE-10451_V3.patch, HBASE-10451_V4.patch, HBASE-10451_V5.patch, 
 HBASE-10451_V6.patch


 HBASE-10443 disables tag compression on HFiles. This Jira is to fix the 
 issues we have found out in HBASE-10443 and enable it back.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10601) Upgrade hadoop to 2.3.0 release

2014-02-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910974#comment-13910974
 ] 

Lars Hofhansl commented on HBASE-10601:
---

[~apurtell], [~stack], any opinions for 0.96 and 0.98?

 Upgrade hadoop to 2.3.0 release
 ---

 Key: HBASE-10601
 URL: https://issues.apache.org/jira/browse/HBASE-10601
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.99.0

 Attachments: 10601-0.94.txt, 10601-v1.txt


 Apache Hadoop 2.3.0 has been released.
 This issue is to upgrade hadoop dependency to 2.3.0



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

